parsing error - Page 89

Parsing Data in XML and Storing to DB in Python

- by Rakesh

Hi guys, I have a problem parsing an XML file and entering the data into SQLite. I need to store the text of each TOKEN element, such as 111, AAA, BBB, etc. The file looks like this:

    <DOCUMENT>
      <PAGE width="544.252" height="634.961" number="1" id="p1">
        <MEDIABOX x1="0" y1="0" x2="544.252" y2="634.961"/>
        <BLOCK id="p1_b1">
          <TEXT width="37.7" height="74.124" id="p1_t1" x="51.1" y="20.8652">
            <TOKEN sid="p1_s11" id="p1_w1" font-name="Verdanae" bold="yes" italic="no">111</TOKEN>
          </TEXT>
        </BLOCK>
        <BLOCK id="p1_b3">
          <TEXT width="151.267" height="10.725" id="p1_t6" x="24.099" y="572.096">
            <TOKEN sid="p1_s35" id="p1_w22" font-name="Verdanae" bold="yes" italic="yes">AAA</TOKEN>
            <TOKEN sid="p1_s36" id="p1_w23" font-name="verdanae" bold="yes" italic="no">BBB</TOKEN>
            <TOKEN sid="p1_s37" id="p1_w24" font-name="verdanae" bold="yes" italic="no">CCC</TOKEN>
          </TEXT>
        </BLOCK>
        <BLOCK id="p1_b4">
          <TEXT width="82.72" height="26" id="p1_t7" x="55.426" y="138.026">
            <TOKEN sid="p1_s42" id="p1_w29" font-name="verdanae" bold="yes" italic="no">DDD</TOKEN>
            <TOKEN sid="p1_s43" id="p1_w30" font-name="verdanae" bold="yes" italic="no">EEE</TOKEN>
          </TEXT>
          <TEXT width="101.74" height="26" id="p1_t8" x="55.406" y="162.026">
            <TOKEN sid="p1_s45" id="p1_w31" font-name="verdanae" bold="yes" italic="no">FFF</TOKEN>
          </TEXT>
          <TEXT width="152.96" height="26" id="p1_t9" x="55.406" y="186.026">
            <TOKEN sid="p1_s47" id="p1_w32" font-name="verdanae" bold="yes" italic="no">GGG</TOKEN>
            <TOKEN sid="p1_s48" id="p1_w33" font-name="verdanae" bold="yes" italic="no">HHH</TOKEN>
          </TEXT>
        </BLOCK>
      </PAGE>
    </DOCUMENT>

In .NET this is done with three foreach loops (1. over "DOCUMENT/PAGE/BLOCK", 2. over "TEXT", 3. over "TOKEN"), and then the data is entered into the DB. I don't get how to do it in Python; I am trying it with the lxml module.
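
The three nested loops described above can be sketched in Python roughly as follows. This is only a minimal sketch under some assumptions: it uses the standard-library xml.etree.ElementTree instead of lxml (lxml.etree offers a largely compatible findall), the sample XML is a shortened version of the question's file, and the table name `tokens`, its columns, and the helper names `extract_tokens` and `store_tokens` are made up for illustration.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Shortened sample in the same shape as the question's file.
SAMPLE = """<DOCUMENT>
  <PAGE number="1" id="p1">
    <BLOCK id="p1_b1">
      <TEXT id="p1_t1"><TOKEN id="p1_w1">111</TOKEN></TEXT>
    </BLOCK>
    <BLOCK id="p1_b3">
      <TEXT id="p1_t6">
        <TOKEN id="p1_w22">AAA</TOKEN>
        <TOKEN id="p1_w23">BBB</TOKEN>
      </TEXT>
    </BLOCK>
  </PAGE>
</DOCUMENT>"""


def extract_tokens(xml_text):
    """Walk BLOCK -> TEXT -> TOKEN and collect one row per token."""
    root = ET.fromstring(xml_text)  # root is the DOCUMENT element
    rows = []
    for block in root.findall("PAGE/BLOCK"):
        for text in block.findall("TEXT"):
            for token in text.findall("TOKEN"):
                rows.append(
                    (block.get("id"), text.get("id"), token.get("id"), token.text)
                )
    return rows


def store_tokens(conn, rows):
    """Insert the rows into a hypothetical 'tokens' table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tokens "
        "(block_id TEXT, text_id TEXT, token_id TEXT, value TEXT)"
    )
    conn.executemany("INSERT INTO tokens VALUES (?, ?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")  # use a file path for a real database
store_tokens(conn, extract_tokens(SAMPLE))
stored = [row[0] for row in conn.execute("SELECT value FROM tokens ORDER BY rowid")]
print(stored)
```

For a file on disk, ET.parse(path).getroot() would take the place of ET.fromstring; the loops stay the same.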

Error while creating a VM using KVM

- by Karan Gurnani

I am trying to set up a VM on my Ubuntu 13.04 desktop, and I get an error when I try to start the VM:

    virsh # start vm1
    error: Failed to start domain vm1
    error: internal error process exited while connecting to monitor: W: kvm binary is deprecated, please use qemu-system-x86_64 instead
    char device redirected to /dev/pts/2 (label charserial0)
    qemu: at most 2047 MB RAM can be simulated

What is the workaround for this, if any?

Show table gives - ERROR 2002 (HY000): Can't connect to local MySQL server through socket

- by arn

I have InnoDB tables, and show tables gives the following error:

    mysql (mydb) > show tables;
    ERROR 2006 (HY000): MySQL server has gone away
    No connection. Trying to reconnect...
    Connection id: 1
    Current database: mydb
    ERROR 2006 (HY000): MySQL server has gone away
    No connection. Trying to reconnect...
    ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/lib/mysql/mysql.sock1' (111)
    ERROR: Can't connect to the server

How to install GIT on an offline RHEL?

- by Stijn Vanpoucke

I'm using the following commands from the manual to install GIT:

    $ tar -zxf git-1.7.2.2.tar.gz
    $ cd git-1.7.2.2
    $ make prefix=/usr/local all
    $ sudo make prefix=/usr/local install

but I'm receiving the following errors:

    ...
    cache.h: At top level:
    cache.h:746: error: expected declaration specifiers or ‘...’ before ‘time_t’
    cache.h:889: warning: ‘struct timeval’ declared inside parameter list
    cache.h:895: warning: ‘struct timeval’ declared inside parameter list
    cache.h:970: error: expected specifier-qualifier-list before ‘off_t’
    cache.h:979: error: expected specifier-qualifier-list before ‘off_t’
    cache.h:997: error: expected specifier-qualifier-list before ‘off_t’
    cache.h:1057: error: expected declaration specifiers or ‘...’ before ‘off_t’
    cache.h:1063: error: expected declaration specifiers or ‘...’ before ‘uint32_t’
    cache.h:1064: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘nth_packed_object_offset’
    cache.h:1065: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘find_pack_entry_one’
    cache.h:1067: error: expected declaration specifiers or ‘...’ before ‘off_t’
    cache.h:1069: error: expected declaration specifiers or ‘...’ before ‘off_t’
    cache.h:1070: error: expected declaration specifiers or ‘...’ before ‘off_t’
    cache.h:1094: error: expected specifier-qualifier-list before ‘off_t’
    cache.h:1168: error: expected ‘)’ before ‘*’ token
    cache.h:1177: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘read_in_full’
    cache.h:1178: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘write_in_full’
    cache.h:1179: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘write_str_in_full’
    cache.h:1252: error: expected declaration specifiers or ‘...’ before ‘FILE’
    In file included from credential-store.c:2:
    credential.h:28: error: expected declaration specifiers or ‘...’ before ‘FILE’
    credential.h:29: error: expected declaration specifiers or ‘...’ before ‘FILE’
    In file included from credential-store.c:4:
    parse-options.h:115: error: expected specifier-qualifier-list before ‘intptr_t’
    credential-store.c: In function ‘parse_credential_file’:
    credential-store.c:13: error: ‘FILE’ undeclared (first use in this function)
    credential-store.c:13: error: ‘fh’ undeclared (first use in this function)
    credential-store.c:17: warning: implicit declaration of function ‘fopen’
    credential-store.c:19: error: ‘errno’ undeclared (first use in this function)
    credential-store.c:19: error: ‘ENOENT’ undeclared (first use in this function)
    credential-store.c:24: error: too many arguments to function ‘strbuf_getline’
    credential-store.c:24: error: ‘EOF’ undeclared (first use in this function)
    credential-store.c:39: warning: implicit declaration of function ‘fclose’
    credential-store.c: In function ‘print_entry’:
    credential-store.c:44: warning: implicit declaration of function ‘printf’
    credential-store.c:44: warning: incompatible implicit declaration of built-in function ‘printf’
    credential-store.c: In function ‘main’:
    credential-store.c:132: warning: implicit declaration of function ‘umask’
    credential-store.c:144: error: ‘stdin’ undeclared (first use in this function)
    credential-store.c:144: error: too many arguments to function ‘credential_read’
    credential-store.c:147: warning: implicit declaration of function ‘strcmp’

Is this because I didn't install the dependencies?

    apt-get install libcurl4-gnutls-dev libexpat1-dev gettext libz-dev libssl-dev

How do I install them offline?

apt-get is broken

- by Amol Shinde

I cannot install any package on the server, and I am a newbie with servers. This morning I found that I can no longer install any package from the command line; every package now has to be downloaded manually and then installed on the server. Can anyone please tell me what the issue is and how it could be resolved?

OS: Ubuntu 10.04.4 LTS (64-bit)

Below is the error:

    iam@ubuntu$ sudo apt-get install pidgin
    Reading package lists... Done
    Building dependency tree
    Reading state information... Done
    pidgin is already the newest version.
    0 upgraded, 0 newly installed, 0 to remove and 102 not upgraded.
    32 not fully installed or removed.
    After this operation, 0B of additional disk space will be used.
    Traceback (most recent call last):
      File "/usr/bin/apt-listchanges", line 33, in <module>
        from ALChacks import *
      File "/usr/share/apt-listchanges/ALChacks.py", line 32, in <module>
        sys.stderr.write(_("Can't set locale; make sure $LC_* and $LANG are correct!\n"))
    NameError: name '_' is not defined
    perl: warning: Setting locale failed.
    perl: warning: Please check that your locale settings:
        LANGUAGE = (unset),
        LC_ALL = (unset),
        LC_CTYPE = "UTF-8",
        LANG = "en_IN"
      are supported and installed on your system.
    perl: warning: Falling back to the standard locale ("C").
    locale: Cannot set LC_CTYPE to default locale: No such file or directory
    locale: Cannot set LC_ALL to default locale: No such file or directory
    Setting up shared-mime-info (0.71-1ubuntu2) ...
    /var/lib/dpkg/info/shared-mime-info.postinst: line 13: 21935 Segmentation fault update-mime-database.real /usr/share/mime
    dpkg: error processing shared-mime-info (--configure):
     subprocess installed post-installation script returned error exit status 139
    dpkg: dependency problems prevent configuration of libgtk2.0-0:
     libgtk2.0-0 depends on shared-mime-info; however:
      Package shared-mime-info is not configured yet.
dpkg: error processing libgtk2.0-0 (--configure): dependency problems - leaving unconfigured dpkg: dependency problems prevent configuration of chromium-browser: chromium-browser depends on libgtk2.0-0 (>= 2.20.0); however: Package libgtk2.0-0 is not configured yet. dpkg: error processing chromium-browser (--configure): dependency problems - leaving unconfigured dpkg: dependency problems prevent configuration of chromium-codecs-ffmpeg: chromium-codecs-ffmpeg depends on chromium-browser (>= 4.0.203.0~); however: Package chromium-browser is not configured yet. dpkg: error processing chromium-codecs-ffmpeg (--configure): dependency problems - leaving unconfigured dpkg: dependency problems prevent configuration of chromium-browser-l10n: chromium-browser-l10n depends on chromium-browser (= 18.0.1025.151~r130497-0ubuntu0.10.04.No apport report written because the error message indicates its a followup error from a previous failure. No apport report written because the error message indicates its a followup error from a previous failure. No apport report written because MaxReports is reached already No apport report written because MaxReports is reached already No apport report written because MaxReports is reached already No apport report written because MaxReports is reached already No apport report written because MaxReports is reached already No apport report written because MaxReports is reached already No apport report written because MaxReports is reached already No apport report written because MaxReports is reached already No apport report written because MaxReports is reached already No apport report written because MaxReports is reached already No apport report written because MaxReports is reached already 1); however: Package chromium-browser is not configured yet. 
dpkg: error processing chromium-browser-l10n (--configure): dependency problems - leaving unconfigured dpkg: dependency problems prevent configuration of libevdocument2: libevdocument2 depends on libgtk2.0-0 (>= 2.14.0); however: Package libgtk2.0-0 is not configured yet. dpkg: error processing libevdocument2 (--configure): dependency problems - leaving unconfigured dpkg: dependency problems prevent configuration of libevview2: libevview2 depends on libevdocument2 (>= 2.29.5); however: Package libevdocument2 is not configured yet. libevview2 depends on libgtk2.0-0 (>= 2.20.0); however: Package libgtk2.0-0 is not configured yet. dpkg: error processing libevview2 (--configure): dependency problems - leaving unconfigured dpkg: dependency problems prevent configuration of evince: evince depends on libevdocument2 (>= 2.29.5); however: Package libevdocument2 is not configured yet. evince depends on libevview2 (>= 2.29.No apport report written because MaxReports is reached already No apport report written because MaxReports is reached already No apport report written because MaxReports is reached already No apport report written because MaxReports is reached already No apport report written because MaxReports is reached already No apport report written because MaxReports is reached already No apport report written because MaxReports is reached already No apport report written because MaxReports is reached already 5); however: Package libevview2 is not configured yet. evince depends on libgtk2.0-0 (>= 2.16.0); however: Package libgtk2.0-0 is not configured yet. evince depends on shared-mime-info; however: Package shared-mime-info is not configured yet. dpkg: error processing evince (--configure): dependency problems - leaving unconfigured dpkg: dependency problems prevent configuration of firefox: firefox depends on libgtk2.0-0 (>= 2.20.0); however: Package libgtk2.0-0 is not configured yet. 
dpkg: error processing firefox (--configure): dependency problems - leaving unconfigured dpkg: dependency problems prevent configuration of gcalctool: gcalctool depends on libgtk2.0-0 (>= 2.18.0); however: Package libgtk2.0-0 is not configured yet. dpkg: error processing gcalctool (--configure): dependency problems - leaving unconfigured dpkg: dependency problems prevent configuration of libgdict-1.0-6: libgdict-1.0-6 depends on libgtk2.0-0 (>= 2.18.0); however: Package libgtk2.0-0 is not configured yet. dpkg: error processing libgdict-1.0-6 (--configure): dependency problems - leaving unconfigured dpkg: dependency problems prevent configuration of gnome-utils: gnome-utils depends on libgdict-1.0-6 (>= 2.23.90); however: Package libgdict-1.0-6 is not configured yet. gnome-utils depends on libgtk2.0-0 (>= 2.18.0); however: Package libgtk2.0-0 is not configured yet. dpkg: error processing gnome-utils (--configure): dependency problems - leaving unconfigured dpkg: dependency problems prevent configuration of gtk2-engines-pixbuf: gtk2-engines-pixbuf depends on gtk2.0-binver-2.10.0; however: Package gtk2.0-binver-2.10.0 is not installed. Package libgtk2.0-0 which provides gtk2.0-binver-2.10.0 is not configured yet. gtk2-engines-pixbuf depends on libgtk2.0-0 (= 2.20.1-0ubuntu2.1); however: Package libgtk2.0-0 is not configured yet. dpkg: error processing gtk2-engines-pixbuf (--configure): dependency problems - leaving unconfigured dpkg: dependency problems prevent configuration of libedataserverui1.2-8: libedataserverui1.2-8 depends on libgtk2.0-0 (>= 2.14.0); however: Package libgtk2.0-0 is not configured yet. dpkg: error processing libedataserverui1.2-8 (--configure): dependency problems - leaving unconfigured dpkg: dependency problems prevent configuration of libgail18: libgail18 depends on libgtk2.0-0 (= 2.20.1-0ubuntu2.1); however: Package libgtk2.0-0 is not configured yet. 
dpkg: error processing libgail18 (--configure): dependency problems - leaving unconfigured dpkg: dependency problems prevent configuration of libgtk2.0-bin: libgtk2.0-bin depends on libgtk2.0-0 (>= 2.20.1-0ubuntu2.1); however: Package libgtk2.0-0 is not configured yet. dpkg: error processing libgtk2.0-bin (--configure): dependency problems - leaving unconfigured dpkg: dependency problems prevent configuration of libgtk2.0-dev: libgtk2.0-dev depends on libgtk2.0-0 (= 2.20.1-0ubuntu2.1); however: Package libgtk2.0-0 is not configured yet. dpkg: error processing libgtk2.0-dev (--configure): dependency problems - leaving unconfigured dpkg: dependency problems prevent configuration of libnotify-dev: libnotify-dev depends on libgtk2.0-dev (>= 2.10); however: Package libgtk2.0-dev is not configured yet. dpkg: error processing libnotify-dev (--configure): dependency problems - leaving unconfigured dpkg: dependency problems prevent configuration of network-manager-gnome: network-manager-gnome depends on libgtk2.0-0 (>= 2.16.0); however: Package libgtk2.0-0 is not configured yet. dpkg: error processing network-manager-gnome (--configure): dependency problems - leaving unconfigured dpkg: dependency problems prevent configuration of openoffice.org-core: openoffice.org-core depends on libgtk2.0-0 (>= 2.10); however: Package libgtk2.0-0 is not configured yet. dpkg: error processing openoffice.org-core (--configure): dependency problems - leaving unconfigured dpkg: dependency problems prevent configuration of openoffice.org-draw: openoffice.org-draw depends on openoffice.org-core (= 1:3.2.0-7ubuntu4.4); however: Package openoffice.org-core is not configured yet. dpkg: error processing openoffice.org-draw (--configure): dependency problems - leaving unconfigured dpkg: dependency problems prevent configuration of openoffice.org-impress: openoffice.org-impress depends on openoffice.org-core (= 1:3.2.0-7ubuntu4.4); however: Package openoffice.org-core is not configured yet. 
openoffice.org-impress depends on openoffice.org-draw (= 1:3.2.0-7ubuntu4.4); however: Package openoffice.org-draw is not configured yet. dpkg: error processing openoffice.org-impress (--configure): dependency problems - leaving unconfigured dpkg: dependency problems prevent configuration of pidgin: pidgin depends on libgtk2.0-0 (>= 2.18.0); however: Package libgtk2.0-0 is not configured yet. dpkg: error processing pidgin (--configure): dependency problems - leaving unconfigured No apport report written because MaxReports is reached already Setting up update-manager (1:0.134.12.1) ... locale: Cannot set LC_CTYPE to default locale: No such file or directory dpkg: error processing update-manager (--configure): subprocess installed post-installation script returned error exit status 245 No apport report written because MaxReports is reached already dpkg: dependency problems prevent configuration of update-notifier: update-notifier depends on libgtk2.0-0 (>= 2.14.0); however: Package libgtk2.0-0 is not configured yet. update-notifier depends on update-manager; however: Package update-manager is not configured yet. dpkg: error processing update-notifier (--configure): dependency problems - leaving unconfigured No apport report written because MaxReports is reached already dpkg: dependency problems prevent configuration of xulrunner-1.9.2: xulrunner-1.9.2 depends on libgtk2.0-0 (>= 2.18.0); however: Package libgtk2.0-0 is not configured yet. dpkg: error processing xulrunner-1.9.2 (--configure): dependency problems - leaving unconfigured dpkg: dependency problems prevent configuration of xulrunner-1.9.2-dev: xulrunner-1.9.2-dev depends on xulrunner-1.9.2 (= 1.9.2.28+build1+nobinonly-0ubuntu0.10.04.1); however: Package xulrunner-1.9.2 is not configured yet. xulrunner-1.9.2-dev depends on libnotify-dev; however: Package libnotify-dev is not configured yet. 
dpkg: error processing xulrunner-1.9.2-dev (--configure): dependency problems - leaving unconfigured No apport report written because MaxReports is reached already No apport report written because MaxReports is reached already dpkg: dependency problems prevent configuration of icedtea6-plugin: icedtea6-plugin depends on xulrunner-1.9.2; however: Package xulrunner-1.9.2 is not configured yet. icedtea6-plugin depends on libgtk2.0-0 (>= 2.8.0); however: Package libgtk2.0-0 is not configured yet. dpkg: error processing icedtea6-plugin (--configure): dependency problems - leaving unconfigured Setting up libgweather-common (2.30.0-0ubuntu1.1) ... No apport report written because MaxReports is reached already locale: Cannot set LC_CTYPE to default locale: No such file or directory dpkg: error processing libgweather-common (--configure): subprocess installed post-installation script returned error exit status 245 No apport report written because MaxReports is reached already dpkg: dependency problems prevent configuration of libgweather1: libgweather1 depends on libgtk2.0-0 (>= 2.11.0); however: Package libgtk2.0-0 is not configured yet. libgweather1 depends on libgweather-common (>= 2.24.0); however: Package libgweather-common is not configured yet. dpkg: error processing libgweather1 (--configure): dependency problems - leaving unconfigured dpkg: dependency problems prevent configuration of openoffice.org-style-galaxy: openoffice.org-style-galaxy depends on openoffice.org-core (>= 1:3.2.0~beta); however: Package openoffice.org-core is not configured yet. No apport report written because MaxReports is reached already dpkg: error processing openoffice.org-style-galaxy (--configure): dependency problems - leaving unconfigured dpkg: dependency problems prevent configuration of openoffice.org-common: openoffice.org-common depends on openoffice.org-style-default | openoffice.org-style; however: Package openoffice.org-style-default is not installed. 
      Package openoffice.org-style-galaxy which provides openoffice.org-style-default is not configured yet.
      Package openoffice.org-style is not installed.
      Package openoffice.org-style-galaxy which provides openoffice.org-style is not configured yet.
    No apport report written because MaxReports is reached already
    dpkg: error processing openoffice.org-common (--configure):
     dependency problems - leaving unconfigured
    No apport report written because MaxReports is reached already
    Errors were encountered while processing:
     shared-mime-info libgtk2.0-0 chromium-browser chromium-codecs-ffmpeg chromium-browser-l10n
     libevdocument2 libevview2 evince firefox gcalctool libgdict-1.0-6 gnome-utils
     gtk2-engines-pixbuf libedataserverui1.2-8 libgail18 libgtk2.0-bin libgtk2.0-dev
     libnotify-dev network-manager-gnome openoffice.org-core openoffice.org-draw
     openoffice.org-impress pidgin update-manager update-notifier xulrunner-1.9.2
     xulrunner-1.9.2-dev icedtea6-plugin libgweather-common libgweather1
     openoffice.org-style-galaxy openoffice.org-common
    E: Sub-process /usr/bin/dpkg returned an error code (1)

Also, commands no longer auto-complete when I type them in the terminal.

parsed xml file: skip creation if blank?

- by GoodGets

This could be a HappyMapper specific question, but I don't think so. In my app, users can upload their blog subscriptions (via an OPML file), which I parse and add to their profile. The only problem is during the parsing, or more specifically the creation of each subscription, I can't figure out how to skip over entries that are just "labels". Since OPML files allow you to label your blogs, or organize them into folders, this is my problem. The actual blog subscriptions and their labels both have "outline" tags.

    <outline text="Rails" >
      <outline title="Katz Got Your Tongue?" text="Katz Got Your Tongue?" htmlUrl="http://yehudakatz.com" type="rss" xmlUrl="http://feeds.feedburner.com/KatzGotYourTongue" />

After parsing, I create each feed via a method call inside of the HappyMapper module

    def create_feed
      Feed.new(
        :feed_htmlUrl => self.htmlUrl,
        :feed_title => self.title,
        ...

But how do I prevent it from creating new "feeds" for those outline tags that are just tags? (i.e. those that don't have an htmlUrl?)
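
The filtering idea in the question (skip any outline without an htmlUrl) can be sketched as below. This is not HappyMapper or Ruby: it is a Python illustration with the standard-library ElementTree, and the surrounding `<opml><body>` wrapper, the closing `</outline>` tag and the helper name `feed_outlines` are assumptions added so the sample parses.

```python
import xml.etree.ElementTree as ET

# The question's snippet, wrapped so that it is well-formed (assumed wrapper).
OPML = """<opml version="1.0"><body>
<outline text="Rails">
  <outline title="Katz Got Your Tongue?" text="Katz Got Your Tongue?"
           htmlUrl="http://yehudakatz.com" type="rss"
           xmlUrl="http://feeds.feedburner.com/KatzGotYourTongue" />
</outline>
</body></opml>"""


def feed_outlines(opml_text):
    """Return one dict per real subscription, skipping label-only outlines."""
    root = ET.fromstring(opml_text)
    feeds = []
    for outline in root.iter("outline"):  # visits nested outlines too
        html_url = outline.get("htmlUrl")
        if not html_url:
            continue  # a folder/label: it has no htmlUrl, so no feed is created
        feeds.append({"feed_htmlUrl": html_url, "feed_title": outline.get("title")})
    return feeds


feeds = feed_outlines(OPML)
print(feeds)
```

The same guard would presumably translate to the Ruby side as an early return from create_feed when htmlUrl is blank, but that depends on how the HappyMapper class is set up.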

Shift-reduce: when to stop reducing?

- by Joey Adams

I'm trying to learn about shift-reduce parsing. Suppose we have the following grammar, using recursive rules that enforce order of operations, inspired by the ANSI C Yacc grammar:

    S: A;

    P : NUMBER
      | '(' S ')'
      ;

    M : P
      | M '*' P
      | M '/' P
      ;

    A : M
      | A '+' M
      | A '-' M
      ;

And we want to parse 1+2 using shift-reduce parsing. First, the 1 is shifted as a NUMBER. My question is, is it then reduced to P, then M, then A, then finally S? How does it know where to stop?

Suppose it does reduce all the way to S, then shifts '+'. We'd now have a stack containing:

    S '+'

If we shift '2', the reductions might be:

    S '+' NUMBER
    S '+' P
    S '+' M
    S '+' A
    S '+' S

Now, on either side of the last line, S could be P, M, A, or NUMBER, and it would still be valid in the sense that any combination would be a correct representation of the text. How does the parser "know" to make it

    A '+' M

So that it can reduce the whole expression to A, then S? In other words, how does it know to stop reducing before shifting the next token? Is this a key difficulty in LR parser generation?
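
To make the "when to stop" decision concrete, here is a small hand-written shift-reduce loop for this grammar. It is only a sketch: a generated LR parser takes these decisions from a state table plus the next token, whereas this version hard-codes them by looking at the top of the stack and one lookahead token. The names `tokenize` and `parse` and the "$" end marker are made up for the example.

```python
import re


def tokenize(src):
    """Turn '1+2' into ['NUMBER', '+', 'NUMBER', '$']; '$' marks end of input."""
    found = re.findall(r"\d+|[-+*/()]", src)
    return ["NUMBER" if tok.isdigit() else tok for tok in found] + ["$"]


def parse(src):
    """Return the list of shift/reduce actions taken while parsing src."""
    tokens = tokenize(src)
    stack, pos, trace = [], 0, []

    def reduce(count, lhs):
        rhs = stack[-count:]
        del stack[-count:]
        stack.append(lhs)
        trace.append(f"reduce {lhs} <- {' '.join(rhs)}")

    while True:
        look = tokens[pos]
        top = stack[-1] if stack else None
        if top == "NUMBER":
            reduce(1, "P")
        elif stack[-3:] == ["(", "S", ")"]:
            reduce(3, "P")
        elif top == "P":
            if len(stack) >= 3 and stack[-3] == "M" and stack[-2] in ("*", "/"):
                reduce(3, "M")  # M : M '*' P | M '/' P
            else:
                reduce(1, "M")  # M : P
        elif top == "M" and look not in ("*", "/"):
            # Only leave M once no '*' or '/' follows.
            if len(stack) >= 3 and stack[-3] == "A" and stack[-2] in ("+", "-"):
                reduce(3, "A")  # A : A '+' M | A '-' M
            else:
                reduce(1, "A")  # A : M
        elif top == "A" and look not in ("+", "-"):
            reduce(1, "S")  # only when the sum cannot continue
        elif top == "S" and look == "$" and len(stack) == 1:
            trace.append("accept")
            return trace
        elif look == "$" or (top == "S" and look != ")"):
            raise SyntaxError(f"unexpected {look!r} with stack {stack}")
        else:
            stack.append(look)  # shift
            pos += 1
            trace.append(f"shift {look}")


for action in parse("1+2"):
    print(action)
```

With the input 1+2 the first operand stops at A because the lookahead is '+': reducing A to S is only attempted when the next token is neither '+' nor '-'. That lookahead check is what keeps the stack at A '+' M instead of S '+' S.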

int.Parse of "8" fails. int.Parse always requires CultureInfo.InvariantCulture?

- by Henrik Carlsson

We develop an established software product which works fine on all known computers except one. The problem is parsing strings that begin with "8". It seems like "8" at the beginning of a string is a reserved character.

Parsing:

    int.Parse("8") -> Exception message: Input string was not in a correct format.
    int.Parse("80") -> 0
    int.Parse("88") -> 8
    int.Parse("8100") -> 100

    CurrentCulture: sv-SE
    CurrentUICulture: en-US

The problem is solved using int.Parse("8", CultureInfo.InvariantCulture). However, it would be nice to know the source of the problem.

Question: Why do we get this behaviour of "8" if we don't specify invariant culture?

Additional information: I sent a small program to my client to obtain the results above:

    private int ParseInt(string s)
    {
        int parsedInt = -1000;
        try
        {
            parsedInt = int.Parse(s);
            textBoxMessage.Text = "Success: " + parsedInt;
        }
        catch (Exception ex)
        {
            textBoxMessage.Text = string.Format("Error parsing string: '{0}'", s) + Environment.NewLine + "Exception message: " + ex.Message;
        }
        textBoxMessage.Text += Environment.NewLine + Environment.NewLine + "CurrentCulture: " + Thread.CurrentThread.CurrentCulture.Name + "\r\n" + "CurrentUICulture: " + Thread.CurrentThread.CurrentUICulture.Name + "\r\n";
        return parsedInt;
    }

How do I efficiently parse a CSV file in Perl?

- by Mike

I'm working on a project that involves parsing a large CSV-formatted file in Perl and am looking to make things more efficient. My approach has been to split() the file by lines first, and then split() each line again by commas to get the fields. But this is suboptimal, since at least two passes over the data are required (once to split by lines, then once again for each line). This is a very large file, so cutting processing in half would be a significant improvement to the entire application.

My question is, what is the most time-efficient means of parsing a large CSV file using only built-in tools?

Note: Each line has a varying number of tokens, so we can't just ignore lines and split by commas only. Also, we can assume fields will contain only alphanumeric ASCII data (no special characters or other tricks). Also, I don't want to get into parallel processing, although it might work effectively.

Edit: It can only involve built-in tools that ship with Perl 5.8. For bureaucratic reasons, I cannot use any third-party modules (even if hosted on CPAN).

Another edit: Let's assume that our solution is only allowed to deal with the file data once it is entirely loaded into memory.

Yet another edit: I just grasped how stupid this question is. Sorry for wasting your time. Voting to close.

How to parse a string (by a "new" markup) with R ?

- by Tal Galili

Hi all, I want to use R to do string parsing that (I think) is like a simplistic HTML parsing. For example, let's say we have the following two variables:

    Seq <- "GCCTCGATAGCTCAGTTGGGAGAGCGTACGACTGAAGATCGTAAGGtCACCAGTTCGATCCTGGTTCGGGGCA"
    Str <- ">>>>>>>..>>>>........<<<<.>>>>>.......<<<<<.....>>>>>.......<<<<<<<<<<<<."

Say that I want to parse "Seq" according to "Str", by using the legend here:

    Seq: GCCTCGATAGCTCAGTTGGGAGAGCGTACGACTGAAGATCGTAAGGtCACCAGTTCGATCCTGGTTCGGGGCA
    Str: >>>>>>>..>>>>........<<<<.>>>>>.......<<<<<.....>>>>>.......<<<<<<<<<<<<.
         |     |  |              | |               |     |               ||     |
         +-----+  +--------------+ +---------------+     +---------------++-----+
            |          Stem 1           Stem 2                Stem 3         |
            |                                                                |
            +----------------------------------------------------------------+
                                          Stem 0

Assume that we always have 4 stems (0 to 3), but that the number of letters before and after each of them can vary. The output should be something like the following list structure:

    list(
      "Stem 0 opening" = "GCCTCGA",
      "before Stem 1" = "TA",
      "Stem 1" = list(opening = "GCTC", inside = "AGTTGGGA", closing = "GAGC"),
      "between Stem 1 and 2" = "G",
      "Stem 2" = list(opening = "TACGA", inside = "CTGAAGA", closing = "TCGTA"),
      "between Stem 2 and 3" = "AGGtC",
      "Stem 3" = list(opening = "ACCAG", inside = "TTCGATC", closing = "CTGGT"),
      "After Stem 3" = "",
      "Stem 0 closing" = "TCGGGGC"
    )

I don't have any experience with programming a parser, and would like advice on what strategy to use when programming something like this (and any recommended R commands to use). What I was thinking of is to first get rid of "Stem 0", then go through the inner string with a recursive function (let's call it "seperate.stem") that each time will split the string into:

1. before stem
2. opening stem
3. inside stem
4. closing stem
5. after stem

where the "after stem" part will then be recursively entered into the same function ("seperate.stem"). The thing is that I am not sure how to do this without using a loop. Any advice will be most welcome.
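
One possible strategy, sketched here in Python rather than R, is to treat the structure string like nested brackets and match runs of '>' to '<' with a stack instead of recursion. The sketch assumes that each run of '>' opens exactly one stem and that a stem's closing part has the same width as its opening part (which holds for the example above); the function name `find_stems` and the short demo strings are made up for illustration.

```python
import re


def find_stems(seq, structure):
    """Pair each run of '>' with the '<' characters that close it."""
    if len(seq) != len(structure):
        raise ValueError("sequence and structure must have the same length")
    open_runs = []  # stack of (start, end) spans of '>' runs not yet closed
    stems = []
    for match in re.finditer(r">+|<+", structure):
        start, end = match.span()
        if structure[start] == ">":
            open_runs.append((start, end))
            continue
        # One run of '<' may close several stems, innermost first.
        while start < end:
            if not open_runs:
                raise ValueError("closing run without a matching opening run")
            o_start, o_end = open_runs.pop()
            width = o_end - o_start  # assumed: closing width == opening width
            if end - start < width:
                raise ValueError("closing run is shorter than its opening run")
            stems.append({
                "start": o_start,
                "opening": seq[o_start:o_end],
                "inside": seq[o_end:start],
                "closing": seq[start:start + width],
            })
            start += width
    if open_runs:
        raise ValueError("opening run was never closed")
    return sorted(stems, key=lambda stem: stem["start"])


# Tiny made-up example: an outer stem (AB ... JK) around an inner one (DE ... HI).
demo_seq = "ABCDEFGHIJKL"
demo_str = ">>.>>..<<<<."
stems = find_stems(demo_seq, demo_str)
for stem in stems:
    print(stem)
```

The same function can be called with Seq and Str from the question. The pieces between stems ("before Stem 1", "between Stem 1 and 2", and so on) are not returned separately here; they are the dotted stretches inside the "inside" string of the outermost stem.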

Banshee does not start (Ubuntu 12.04)

- by balg

I have installed Banshee, but during the installation something went wrong and now I am experiencing this:

    balg@scorpion:~$ banshee
    Unhandled Exception: System.TypeLoadException: Could not load type 'Banshee.ServiceStack.DBusServiceManager' from assembly 'Banshee.Services, Version=2.4.0.0, Culture=neutral, PublicKeyToken=null'.
    [ERROR] FATAL UNHANDLED EXCEPTION: System.TypeLoadException: Could not load type 'Banshee.ServiceStack.DBusServiceManager' from assembly 'Banshee.Services, Version=2.4.0.0, Culture=neutral, PublicKeyToken=null'.

I have tried to remove and purge Banshee, delete the config files and then reinstall it, but it didn't help. Can anyone help me?

Thanks, balg

Weird SSIS Configuration Error

- by Christopher House

I ran into an interesting SSIS issue that I thought I'd share in hopes that it may save someone from bruising their head after repeatedly banging it on the desk like I did. I was trying to set up what I believe is referred to as "indirect configuration" in SSIS. This is where you store your configuration in some repository like a database or a file, then store the location of that repository in an environment variable and use that to configure the connection to your configuration repository. In my specific situation, I was using a SQL database.

I had this all working, but for reasons I'll not bore you with, I had to move my SSIS development to a new VM last week. When I got my new VM, I set about creating a new package. I finished up development on the package and started setting up configuration. I created an OLE DB connection that pointed to my configuration table, then went through the configuration wizard to have the connection string for this connection set through my environment variable. I then went through the wizard to set another property through a value stored in the configuration table. When I got to the point where you select the connection, my connection wasn't in the list:

As you can see in the screen capture above, the ConfigurationDb connection isn't in the list of available SQL connections in the configuration wizard. Strange. I canceled out of the wizard, went to the properties for ConfigurationDb, tested the connection and it was successful. I went back to the wizard again and this time ConfigurationDb was there. I completed the wizard, then went to test my package. Unfortunately the package wouldn't run; I got the following error:

Unfortunately, googling for this error code didn't help much, as none of the results appeared related to package configuration. I did notice that when I went back through the package configuration and tried to edit a previously saved config entry, I was getting the following error:

I checked the connection string I had stored in my environment variable and noticed that indeed, it did not have a provider name. I didn't recall having included one on my previous VM, but I figured I'd include it just to see what happened. That made no difference at all.

After a day and a half of trying to figure out what the problem was, I'm pleased to report that through extensive trial and error, I have resolved the error. As it turns out, the person who set up this new VM for me named the server SQLSERVER2008. This meant my configuration connection string was:

    Initial Catalog=SSISConfigDb;Data Source=SQLSERVER2008;Integrated Security=SSPI;

Just for the heck of it, I tried changing it to:

    Initial Catalog=SSISConfigDb;Data Source=(local);Integrated Security=SSPI;

That did the trick! As soon as I restarted BIDS, I was able to run the package with no errors at all. Crazy. So, the moral of the story is, don't name your server SQLSERVER2008 if you want SSIS configuration to work when using SQL as your config store.

Dependency not satisfiable - Offline deb package install

- by catia

I have a new installation that has no chance of an internet connection. Since I want to add a few development software packages, I downloaded a few *.deb files. The problem is that for every package I try to install I get the same error: "Dependency not satisfiable...." I also downloaded other versions of that software (the deb files), but that didn't work either. I've researched other questions here and on Google, and I haven't been able to solve this yet.

Read the article

A problem with installing skype

- by Arnas

I recently got version 9.04 of Ubuntu, and for the past day I have been trying to install Skype (sounds funny, I know). When I run 'sudo apt-get install skype' I get this error in the terminal:
The following packages have unmet dependencies:
skype: Depends: libqt4-dbus (>= 4.4.3) but 4.4.0-1ubuntu5~hardy1 is to be installed
Depends: libqt4-network (>= 4.4.3) but 4.4.0-1ubuntu5~hardy1 is to be installed
Depends: libqtcore4 (>= 4.4.3) but 4.4.0-1ubuntu5~hardy1 is to be installed
Depends: libqtgui4 (>= 4.4.3) but it is not installable
How can I fix this problem? Thanks

Read the article

Baidu spider is hammering my server and bloating my error_log file

- by Gravy

I am getting the following errors in my /etc/httpd/logs/error_log file:
[Sun Oct 20 00:04:15 2013] [error] [client 180.76.5.16] File does not exist: /usr/local/apache/htdocs/homes
[Sun Oct 20 00:08:31 2013] [error] [client 180.76.5.113] File does not exist: /usr/local/apache/htdocs/homes
[Sun Oct 20 00:12:47 2013] [error] [client 180.76.5.88] File does not exist: /usr/local/apache/htdocs/homes
[Sun Oct 20 00:17:07 2013] [error] [client 180.76.5.138] File does not exist: /usr/local/apache/htdocs/homes
Errors like these are so frequent that my error log files are over 500MB! I ran an IP trace on the client addresses and found that they belong to Baidu (Beijing Baidu Netcom Science and Technology Co) in China. Is there a way that I can just get Apache to deny any incoming requests from some crummy spider that is repeatedly hitting my site? Is there a better way of dealing with the problem? I am happy to completely block out China if it means that I can actually track real errors.
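
Before deciding what to block, it can help to measure how much of the log one network really accounts for. The following is only a rough sketch, assuming the log lines have exactly the shape quoted above; the function name `summarize_missing_files` and the choice to group by the first three octets are illustrative assumptions, and the blocking itself would still be done in the Apache configuration or a firewall.

```python
import re
from collections import Counter

# Matches the Apache 2.2-style error_log lines quoted in the question.
LINE_RE = re.compile(
    r"^\[(?P<when>[^\]]+)\] \[error\] \[client (?P<ip>[\d.]+)\] "
    r"File does not exist: (?P<path>\S+)"
)


def summarize_missing_files(lines, prefix_octets=3):
    """Count 'File does not exist' entries per client network prefix."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # some other kind of entry; leave it for manual review
        octets = match.group("ip").split(".")[:prefix_octets]
        counts[".".join(octets)] += 1
    return counts


# The four entries from the question, used as sample input.
sample = [
    "[Sun Oct 20 00:04:15 2013] [error] [client 180.76.5.16] File does not exist: /usr/local/apache/htdocs/homes",
    "[Sun Oct 20 00:08:31 2013] [error] [client 180.76.5.113] File does not exist: /usr/local/apache/htdocs/homes",
    "[Sun Oct 20 00:12:47 2013] [error] [client 180.76.5.88] File does not exist: /usr/local/apache/htdocs/homes",
    "[Sun Oct 20 00:17:07 2013] [error] [client 180.76.5.138] File does not exist: /usr/local/apache/htdocs/homes",
]
counts = summarize_missing_files(sample)
```

On a real log the same function can be fed an open file object, since it only iterates over lines.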

Read the article

httpd memory could not be written on winxp

- by Shawn

I have an Apache server on a WinXP box, and occasionally I get an "httpd error, memory could not be written" error. Here is what I found in the Apache error log: `[Sat Sep 12 10:58:34 2009] [error] [client 113.68.84.79] Invalid URI in request ;\xece\r\xd5m\xed{\xbcf\xbf\xffq\bZNB\xf0a\xf9\x13\xf3[\x06Y\x02G\xca\xc5\xf3\x9ft\x89b\xed\xb5m\x9f\x1c\xa6\x03\x10\xee\xe9G\xb5\xe0glLf\xd4eFT\x8f.{Ysl\x89\x05\x18\x0f\x0fp\xdd\xaf\x11G\xbe\xbf\x96/Pr\x9e\xf4\x89\xf2\xd4^mA\x13y2\xe3\x95\xaeD\x02\xa7*G\xe4\x1d\x07r^\xaf_J\xf7\xbc\x90\x17\xda\x90\x17\xec\xd4\xe8\xe4\xfcU\x04\xbc2V\xe1\x170\xeb Error in my_thread_global_end(): 66 threads didn't exit [Sat Sep 12 11:08:43 2009] [notice] Parent: child process exited with status 3221225477 -- Restarting. [Sat Sep 12 11:08:51 2009] [notice] Apache/2.2.4 (Win32) PHP/5.2.3 configured -- resuming normal operations` Can anybody tell me what this means and where the problem is? Thanks.

Read the article

SCO UNIX problem: "Cannot create /var/adm/utmp or /var/adm/utmpx"

- by Maktouch

Hey everyone, I have an old server that doesn't boot. I don't know the version of Unix installed, but I see SCO UNIX. It stops with this error:
UX:init: ERROR: Cannot create /var/adm/utmp or /var/adm/utmpx
UX:init: ERROR: failed write of utmpx entry: " "
UX:init: ERROR: failed write of utmpx entry: " "
UX:init: INFO: SINGLE USER MODE
After that message, it just stops. I cannot type anything. Even CTRL + ALT + DEL does not work. I cannot get into the system. I have tried booting with a DamnSmallLinux LiveCD, but it does not recognize the file system on HDA. Is there a way to either log in as root or bypass this error? Thanks.

Read the article

Parsing GeoRSS Feed with jQuery

- by senfo

I'm attempting to use the jQuery jFeed plugin for parsing an Atom, GeoRSS feed and I'm running into issues extracting the information I need. For example, I need to extract the summary element and I would like to render the contents in a div on my HTML page. Additionally, I'd like to extract the contents from the georss:point elements and pass them into Google Maps to render them as points on a map. The problem is that it seems jFeed is stripping out the GeoRSS-related information. For example, I can extract the title element without issues, but it seems it doesn't extract the summary or georss:point elements, at all. Following is a snippet of the XML I'm working with: <feed xmlns="http://www.w3.org/2005/Atom" xmlns:georss="http://www.georss.org/georss"> <title>Search Results from DataWarehouse.HRSA.gov</title> <link rel="self" href="http://datawarehouse.hrsa.gov/HGDWDataWebService/HGDWDataService.aspx?service=HC&zip=20002&radius=10"/> <link rel="alternate" href="http://datawarehouse.hrsa.gov/"/> <author> <name>HRSA Geospatial Data Warehouse</name> </author> <id>tag:datawarehouse.hrsa.gov,2010-04-05:/</id> <updated>2010-04-05T19:25:28-05:00</updated> <entry> <title>Christ House</title> <link href="http://www.christhouse.org" /> <id>tag:datawarehouse.hrsa.gov,2010-04-05:/D388C4C6-FFA4-4091-819B-64D67DC64931</id> <summary type="xhtml"> <div xmlns="http://www.w3.org/1999/xhtml"> <div class="vcard"> <div class="fn org">Christ House</div> <div class="adr"> <div class="street-address">1717 Columbia Rd. 
N.W.</div> <span class="locality">Washington</span>, <span class="region">District of Columbia</span>, <span class="postal-code">20009-2803</span> </div> <div class="tel">202-328-1100</div> </div> <div> Categories: <span class="category">Service Delivery Site</span> </div> </div> </summary> <georss:point>38.9243636363636 -77.0395636363637</georss:point> <updated>2010-04-04T00:00:00-05:00</updated> </entry> </feed> Following is the jQuery code that I'm using: $(document).ready(function() { $.getFeed({ //url: 'http://datawarehouse.hrsa.gov/HGDWDataWebService/HGDWDataService.aspx?service=HC&zip=20002&radius=10', url: 'test.xml', success: function(feed) { $.each(feed.items, function(index, value) { $('#rssContent').append(value.title); // Set breakpoint here }); } }); }); I set a breakpoint on the line that appends to the rssContent div and noticed the objects in feed.items don't have the properties I'm after. Am I doing something wrong or was jFeed simply not designed to work the way I want it to?
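
The underlying issue is that `summary` and `georss:point` live in XML namespaces, so they have to be looked up with namespace-aware queries rather than by bare tag name. The sketch below is not a jFeed fix; it is a small illustration in Python (standard library only) of that lookup against a trimmed copy of the feed above. The function name `extract_entries` and the dictionary keys are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

# Prefix-to-URI map used for namespace-aware lookups.
NS = {
    "atom": "http://www.w3.org/2005/Atom",
    "georss": "http://www.georss.org/georss",
}

# Trimmed copy of the feed from the question.
FEED = """<feed xmlns="http://www.w3.org/2005/Atom" xmlns:georss="http://www.georss.org/georss">
  <title>Search Results from DataWarehouse.HRSA.gov</title>
  <entry>
    <title>Christ House</title>
    <summary type="xhtml">
      <div xmlns="http://www.w3.org/1999/xhtml">
        <div class="tel">202-328-1100</div>
      </div>
    </summary>
    <georss:point>38.9243636363636 -77.0395636363637</georss:point>
  </entry>
</feed>"""


def extract_entries(feed_xml):
    """Return title, flattened summary text and coordinates for each entry."""
    root = ET.fromstring(feed_xml)
    entries = []
    for entry in root.findall("atom:entry", NS):
        title = entry.findtext("atom:title", default="", namespaces=NS)
        point = entry.findtext("georss:point", default="", namespaces=NS)
        parts = point.split()
        lat, lon = (float(parts[0]), float(parts[1])) if len(parts) == 2 else (None, None)
        summary = entry.find("atom:summary", NS)
        # Collapse the XHTML summary to plain text with single spaces.
        text = " ".join("".join(summary.itertext()).split()) if summary is not None else ""
        entries.append({"title": title, "summary": text, "lat": lat, "lon": lon})
    return entries


entries = extract_entries(FEED)
```

The same idea applies in the browser: query the raw XML document by namespace URI instead of relying on the properties a feed plugin chooses to expose.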

Read the article

Linker error when compiling boost.asio example

- by Alon

Hi, I'm trying to learn a little bit C++ and Boost.Asio. I'm trying to compile the following code example: #include <iostream> #include <boost/array.hpp> #include <boost/asio.hpp> using boost::asio::ip::tcp; int main(int argc, char* argv[]) { try { if (argc != 2) { std::cerr << "Usage: client <host>" << std::endl; return 1; } boost::asio::io_service io_service; tcp::resolver resolver(io_service); tcp::resolver::query query(argv[1], "daytime"); tcp::resolver::iterator endpoint_iterator = resolver.resolve(query); tcp::resolver::iterator end; tcp::socket socket(io_service); boost::system::error_code error = boost::asio::error::host_not_found; while (error && endpoint_iterator != end) { socket.close(); socket.connect(*endpoint_iterator++, error); } if (error) throw boost::system::system_error(error); for (;;) { boost::array<char, 128> buf; boost::system::error_code error; size_t len = socket.read_some(boost::asio::buffer(buf), error); if (error == boost::asio::error::eof) break; // Connection closed cleanly by peer. else if (error) throw boost::system::system_error(error); // Some other error. 
std::cout.write(buf.data(), len); } } catch (std::exception& e) { std::cerr << e.what() << std::endl; } return 0; } With the following command line: g++ -I /usr/local/boost_1_42_0 a.cpp and it throws an unclear error: /tmp/ccCv9ZJA.o: In function `__static_initialization_and_destruction_0(int, int)': a.cpp:(.text+0x654): undefined reference to `boost::system::get_system_category()' a.cpp:(.text+0x65e): undefined reference to `boost::system::get_generic_category()' a.cpp:(.text+0x668): undefined reference to `boost::system::get_generic_category()' a.cpp:(.text+0x672): undefined reference to `boost::system::get_generic_category()' a.cpp:(.text+0x67c): undefined reference to `boost::system::get_system_category()' /tmp/ccCv9ZJA.o: In function `boost::system::error_code::error_code()': a.cpp:(.text._ZN5boost6system10error_codeC2Ev[_ZN5boost6system10error_codeC5Ev]+0x10): undefined reference to `boost::system::get_system_category()' /tmp/ccCv9ZJA.o: In function `boost::asio::error::get_system_category()': a.cpp:(.text._ZN5boost4asio5error19get_system_categoryEv[boost::asio::error::get_system_category()]+0x7): undefined reference to `boost::system::get_system_category()' /tmp/ccCv9ZJA.o: In function `boost::asio::detail::posix_thread::~posix_thread()': a.cpp:(.text._ZN5boost4asio6detail12posix_threadD2Ev[_ZN5boost4asio6detail12posix_threadD5Ev]+0x1d): undefined reference to `pthread_detach' /tmp/ccCv9ZJA.o: In function `boost::asio::detail::posix_thread::join()': a.cpp:(.text._ZN5boost4asio6detail12posix_thread4joinEv[boost::asio::detail::posix_thread::join()]+0x25): undefined reference to `pthread_join' /tmp/ccCv9ZJA.o: In function `boost::asio::detail::posix_tss_ptr<boost::asio::detail::call_stack<boost::asio::detail::task_io_service<boost::asio::detail::epoll_reactor<false> > >::context>::~posix_tss_ptr()': 
a.cpp:(.text._ZN5boost4asio6detail13posix_tss_ptrINS1_10call_stackINS1_15task_io_serviceINS1_13epoll_reactorILb0EEEEEE7contextEED2Ev[_ZN5boost4asio6detail13posix_tss_ptrINS1_10call_stackINS1_15task_io_serviceINS1_13epoll_reactorILb0EEEEEE7contextEED5Ev]+0xf): undefined reference to `pthread_key_delete' /tmp/ccCv9ZJA.o: In function `boost::asio::detail::posix_tss_ptr<boost::asio::detail::call_stack<boost::asio::detail::task_io_service<boost::asio::detail::epoll_reactor<false> > >::context>::posix_tss_ptr()': a.cpp:(.text._ZN5boost4asio6detail13posix_tss_ptrINS1_10call_stackINS1_15task_io_serviceINS1_13epoll_reactorILb0EEEEEE7contextEEC2Ev[_ZN5boost4asio6detail13posix_tss_ptrINS1_10call_stackINS1_15task_io_serviceINS1_13epoll_reactorILb0EEEEEE7contextEEC5Ev]+0x22): undefined reference to `pthread_key_create' collect2: ld returned 1 exit status How can I fix it? Thank you.

Read the article

Parsing HTML using HtmlParser

- by Blankman

My html has 20 or so rows of the following HTML pattern. So the below is considered a single instance of the pattern. Each instance of this pattern represents a product. Again the below is a single instance, it spans multiple rows in the HTML table. <table> ..  <tr> <td rowspan="5" class="product" valign="top"><nobr> ????????????</td> </tr> <tr> <td class="title" ??????????>?????????</td> <td class="title" ??????????>?????????</td> <td class="title" ??????????>?????????</td> <td class="title" ??????????>?????????</td> <td class="title" ??????????>?????????</td> <td class="title" ??????????>?????????</td> </tr> <tr> <td class="data" ?????? </td> <td class="data" ?????? </td> <td class="data" ?????? </td> <td class="data" ?????? </td> <td class="data" ?????? </td> <td class="data" ?????? </td> </tr> </tr> <tr> <td colspan="5" ????????</td> </tr> <tr> <td colspan="6" width="100%"> <hr></td> </tr>   .. <table> I am trying to use HtmlParser for this. Parser rowParser = new Parser(); rowParser.setInputHtml(page.getHtml()); // page object represents a html page rowParser.setEncoding("UTF-8"); NodeFilter productRowFilter = new AndFilter( new TagNameFilter("tr"), new HasChildFilter( new AndFilter( new TagNameFilter("td"), new HasAttributeFilter("class", "product"))) The above filter doesn't work, just showing you what I have so far. I need to somehow combine these filters, and use the last td to mark the end of the pattern i.e. the td with the colspan=6 and width=100% with child element hr. I have been struggling with this, and have resorted to Regex'ing but was told numerous times to NOT use regex for html parsing, so here I am! Your help is much appreciated!

Read the article

Parsing JSON into XML using Windows Phone

- by Henry Edwards

I have this code, but can't get it all working. I am trying to get a json string into xml. So that I can get a list of items when i parse the data. Is there a better way to parse json into xml. If so what's the best way to do it, and if possible could you give me a working example? The URL that is in the code is not the URL that i am using using System; using System.Collections.Generic; using System.Linq; using System.Net; using System.Windows; using System.Windows.Controls; using System.Windows.Documents; using System.Windows.Input; using System.Windows.Media; using System.Windows.Media.Animation; using System.Windows.Shapes; using Microsoft.Phone.Controls; using Newtonsoft.Json; using Newtonsoft.Json.Serialization; using Newtonsoft.Json.Converters; using Newtonsoft.Json.Utilities; using Newtonsoft.Json.Linq; using Newtonsoft.Json.Schema; using Newtonsoft.Json.Bson; using System.Xml; using System.Xml.Serialization; using System.Xml.Linq; using System.Xml.Linq.XDocument; using System.IO; namespace WindowsPhonePanoramaApplication3 { public partial class Page2 : PhoneApplicationPage { public Page2() { InitializeComponent(); } private void Form1_Load(object sender, EventArgs e1) { /* because the origional JSON string has multiple root's this needs to be added */ string json = "{BFBC2_GlobalStats:"; json += DownlodUrl("http://api.bfbcs.com/api/xbox360?globalstats"); json += "}"; XmlDocument doc = (XmlDocument)JsonConvert.DeserializeObject(json); textBox1.Text = GetXmlString(doc); } private string GetXmlString() { throw new NotImplementedException(); } private string DownlodUrl(string url) { string result = null; try { WebClient client = new WebClient(); result = client.DownloadString(url); } catch (Exception ex) { // handle error result = ex.Message; } return result; } private string GetXmlString(XmlDocument xmlDoc) { sw = new StringWriter(); XmlTextWriter xw = new XmlTextWriter(sw); xw.Formatting = System.Xml.Formatting.Indented; xmlDoc.WriteTo(xw); return 
sw.ToString(); } } } The URL outputs the following code: {"StopName":"Race Hill", "stopId":7553, "NaptanCode":"bridwja", "LongName":"Race Hill", "OperatorsCode1":" 5", "OperatorsCode2":" ", "OperatorsCode3":" ", "OperatorsCode4":"bridwja", "Departures":[ { "ServiceName":"", "Destination":"", "DepartureTimeAsString":"", "DepartureTime":"30/01/2012 00:00:00", "Notes":""}` Thanks for your responses. So Should i just leave the data a json and then view the data via that??? Is this a way to show the data from a json string. public void Load() { // form the URI UriBuilder uri = new UriBuilder("http://mysite.com/events.json"); WebClient proxy = new WebClient(); proxy.OpenReadCompleted += new OpenReadCompletedEventHandler(OnReadCompleted); proxy.OpenReadAsync(uri.Uri); } void OnReadCompleted(object sender, OpenReadCompletedEventArgs e) { if (e.Error == null) { var serializer = new DataContractJsonSerializer(typeof(EventList)); var events = (EventList)serializer.ReadObject(e.Result); foreach (var ev in events) { Items.Add(ev); } } } public ObservableCollection<EventDetails> Items { get; private set; } Edit: Have now kept the url as json and have now got it working by using the json way.
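
As the edit above concludes, the JSON can be consumed directly without a detour through XML. A minimal sketch of that idea in Python follows. It assumes the payload looks like the output quoted above; that output is cut off, so the closing `]}` in `SAMPLE` is an assumption added here so the sample parses, and `list_departures` is a hypothetical helper name.

```python
import json

# Shortened copy of the quoted output. The closing "]}" is an assumption,
# because the output shown in the question is truncated.
SAMPLE = (
    '{"StopName":"Race Hill", "stopId":7553, "NaptanCode":"bridwja", '
    '"Departures":[ { "ServiceName":"", "Destination":"", '
    '"DepartureTimeAsString":"", "DepartureTime":"30/01/2012 00:00:00", '
    '"Notes":""}]}'
)


def list_departures(payload):
    """Return (stop name, departure time) pairs straight from the JSON text."""
    data = json.loads(payload)
    stop = data["StopName"]
    return [(stop, dep["DepartureTime"]) for dep in data.get("Departures", [])]


departures = list_departures(SAMPLE)
```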

Read the article

Proxy Issues with Javascript Cross Domain RSS Feed Parsing

- by Amir

This is my Javascript function which grabs an rss feed via the proxy script and then spits out the 5 latest rss items from the feed along with a link to my stylesheet: function getWidget (feed,limit) { if (window.XMLHttpRequest) { xhttp=new XMLHttpRequest() } else { xhttp=new ActiveXObject("Microsoft.XMLHTTP") } xhttp.open("GET","http://MYSITE/proxy.php?url="+feed,false); xhttp.send(""); xmlDoc=xhttp.responseXML; var x = 1; var div = document.getElementById("div"); srdiv.innerHTML = '<link type="text/css" href="http://MYSITE/css/widget.css" rel="stylesheet" /><div id="rss-title"></div></h3><div id="items"></div><br /><br /><a href="http://MYSITE">Powered by MYSITE</a>'; document.body.appendChild(div); content=xmlDoc.getElementsByTagName("title"); thelink=xmlDoc.getElementsByTagName("link"); document.getElementByTagName("rss-title").innerHTML += content[0].childNodes[0].nodeValue; for (x=1;x<=limit;srx++) { y=x; y--; var shout = '<div class="item"><a href="'+thelink[y].childNodes[0].nodeValue+'">'+content[x].childNodes[0].nodeValue+'</a></div>'; document.getElementById("items").innerHTML += shout; } } Here is the the code from proxy.php: $session = curl_init($_GET['url']); // Open the Curl session curl_setopt($session, CURLOPT_HEADER, false); // Don't return HTTP headers curl_setopt($session, CURLOPT_RETURNTRANSFER, true); // Do return the contents of the call $xml = curl_exec($session); // Make the call header("Content-Type: text/xml"); // Set the content type appropriately echo $xml; // Spit out the xml curl_close($session); // And close the session Now when I try to load this on any domain that's not my site nothing loads. I get no JS errors, but I in the Console tab in firebug I get "407 Proxy Authentication Required" So I'm not really sure how to make this work. The goal is to be able to grab the RSS feed, parse it to grab the titles and links and spit it out into some HTML on any website on the web. 
I'm basically making a simple RSS widget for my site's various RSS feeds.

My Javascript is wack

Also, I'm really a beginner with Javascript. I know jQuery pretty well, but I wasn't able to use it in this case, because this script will be embedded on any site and I can't really rely on the jQuery library. So I decided to write some basic Javascript relying on the default XML parsing options available. Any suggestions here would be cool. Thanks!

What's with the x and y

The way my site creates RSS feeds is that the first title is actually the RSS feed title. The second title is the title of the first item. The first link is the link to the first item. So when using the Javascript to get the titles, I had to first grab the first title (which is the RSS title) and then start with the second title, that being the title of the first item. Sorry for the confusion, but I don't think this is related to my issue. Just wanted to clarify my code.

Read the article

Parsing xml file that comes in as one object per line

- by Casey

I haven't been here in so long, I forgot my prior account! Anyways, I am working on parsing an xml document that comes in ugly. It is for banking statements. Each line is a <statement>all tags</statement>. Now, what I need to do is read this file in, and parse the XML document at the same time, while formatting it more human readable too. Point beeing, Original input looks like this: <statement><accountHeader><fiAddress></fiAddress><accountNumber></accountNumber><startDate>20140101</startDate><endDate>20140228</endDate><statementGroup>1</statementGroup><sortOption>0</sortOption><memberBranchCode>1</memberBranchCode><memberName></memberName><jointOwner1Name></jointOwner1Name><jointOwner2Name></jointOwner2Name></summary></statement> <statement><accountHeader><fiAddress></fiAddress><accountNumber></accountNumber><startDate>20140101</startDate><endDate>20140228</endDate><statementGroup>1</statementGroup><sortOption>0</sortOption><memberBranchCode>1</memberBranchCode><memberName></memberName><jointOwner1Name></jointOwner1Name><jointOwner2Name></jointOwner2Name></summary></statement> <statement><accountHeader><fiAddress></fiAddress><accountNumber></accountNumber><startDate>20140101</startDate><endDate>20140228</endDate><statementGroup>1</statementGroup><sortOption>0</sortOption><memberBranchCode>1</memberBranchCode><memberName></memberName><jointOwner1Name></jointOwner1Name><jointOwner2Name></jointOwner2Name></summary></statement> I need the final output to be as follows: <statement> <name></name> <address></address> </statement> This is fine and dandy. I am using the following "very slow considering 5.1 million lines, 254k data file, and about 60k statements takes around 8 minutes". foreach(String item in lines) { XElement xElement = XElement.Parse(item); sr.WriteLine(xElement.ToString().Trim()); } Then when the file is formatted this is what sucks. 
I need to check every single tag in transaction elements, and if a tag is missing that could be there, I have to fill it in. Our designer software will default prior values in if a tag is possible, and the current objects does not have. It defaults in the value of a prior one that was not Null. "I know, and they swear up and down it is not a bug... ok?" So, that is also taking about 5 to 10 minutes. I need to break all this down, and find a faster method for working with the initial XML. This is a preprocess action, and cannot take that long if not necessary. It just seems redundant. Is there a better way to parse the XML, or is this the best I can do? I parse the XML, write to a temp file, and then read that file in, to the output file inserting the missing tags. 2 IO runs for one process. Yuck.
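
One way to avoid the second IO run is to do both steps while each line is still in memory: parse it, add whichever tags are missing, pretty-print it, and write it once. The sketch below shows that single pass in Python rather than C#. It assumes each line is a well-formed `<statement>` (the sample in the question is not), and the `transaction` layout and `REQUIRED_TAGS` list are hypothetical stand-ins for the real schema.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical list of tags every <transaction> must carry.
REQUIRED_TAGS = ("amount", "description")


def process_statements(lines, out):
    """Parse, complete and pretty-print one statement per line in one pass."""
    written = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        statement = ET.fromstring(line)
        for txn in statement.iter("transaction"):
            present = {child.tag for child in txn}
            for tag in REQUIRED_TAGS:
                if tag not in present:
                    # Explicit empty element, so no earlier value gets defaulted in.
                    ET.SubElement(txn, tag)
        ET.indent(statement)  # requires Python 3.9 or newer
        out.write(ET.tostring(statement, encoding="unicode") + "\n")
        written += 1
    return written


# Tiny demonstration with an in-memory "file".
buffer = io.StringIO()
written = process_statements(
    ["<statement><transaction><amount>5</amount></transaction></statement>"],
    buffer,
)
```

Because the function only iterates over `lines`, it can be handed an open file object, which keeps memory use flat however many statements the input holds.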

Read the article

Need help with regex parsing (in perl)

- by Charlie

Hi all, need some help parsing an html file in perl. I used the LWP module to retrieve a webpage into $_ with $/ undefined so there are no newline issues. Then I'm trying to find all strings matching a pattern. How do I do that? I know how to find 1 instance of it, but how do I match all instances? and what data structure would the results go to? a multi dimensional array? my text (excerpt) looks like the following: <TR> <TD BGCOLOR=EEEEEE><A HREF="/program.cgi?pid=1233"><FONT FACE="ARIAL,HELVETICA,SANS-SERIF" SIZE=2>Title 1</A></FONT></TD> <TD BGCOLOR=EEEEEE nowrap><FONT FACE="ARIAL,HELVETICA" SIZE=2>Jun 27 2010 3:00PM</FONT></TD> <TD BGCOLOR=EEEEEE> </TD> </TR> <TR><TD BGCOLOR=EEEEEE COLSPAN=3><IMG SRC="http://images.domain.com/images/spacer.gif" WIDTH=1 HEIGHT=2><BR></TD></TR> <TR><TD COLSPAN=3 BGCOLOR=999999><IMG SRC="http://images.domain.com/images/spacer.gif" HEIGHT=1 WIDTH=1></TD></TR> <TR><TD COLSPAN=3 ><IMG SRC="http://images.domain.com/images/spacer.gif" WIDTH=1 HEIGHT=2><BR></TD></TR> <TR> <TD><A HREF="/program.cgi?pid=1234"><FONT FACE="ARIAL,HELVETICA,SANS-SERIF" SIZE=2>Title 2</A></FONT></TD> <TD nowrap><FONT FACE="ARIAL,HELVETICA" SIZE=2>Jun 29 2010 7:00PM</FONT></TD> <TD> </TD> </TR> <TR><TD COLSPAN=3><IMG SRC="http://images.domain.com/images/spacer.gif" WIDTH=1 HEIGHT=2><BR></TD></TR> <TR><TD COLSPAN=3 BGCOLOR=999999><IMG SRC="http://images.domain.com/images/spacer.gif" HEIGHT=1 WIDTH=1></TD></TR> <TR><TD COLSPAN=3 BGCOLOR=EEEEEE><IMG SRC="http://images.domain.com/images/spacer.gif" WIDTH=1 HEIGHT=2><BR></TD></TR> <TR> <TD BGCOLOR=EEEEEE><A HREF="/program.cgi?pid=1235"><FONT FACE="ARIAL,HELVETICA,SANS-SERIF" SIZE=2>Title 3</A></FONT></TD> <TD BGCOLOR=EEEEEE nowrap><FONT FACE="ARIAL,HELVETICA" SIZE=2>Jul 3 2010 7:00PM</FONT></TD> <TD BGCOLOR=EEEEEE> </TD> </TR> I want to get the following into an array (or any structure): { ["/program.cgi?pdi=1233", "Title 1"], ["/program.cgi?pdi=1234", "Title 2"], ["/program.cgi?pdi=1235", "Title 3"] } Thanks
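
For the "match all instances" part, the usual pattern is a global match that returns every capture pair, collected into a list of pairs. The sketch below shows that in Python with `re.findall`; in Perl the corresponding tool is a `/g` match in list context. The pattern assumes the markup always looks like the excerpt above (an `A` tag wrapping a `FONT` tag), so it is a narrow illustration, and a real HTML parser would be sturdier against markup changes.

```python
import re

# One capture for the href, one for the link text, as in the excerpt above.
LINK_RE = re.compile(
    r'<A\s+HREF="(/program\.cgi\?pid=\d+)">\s*<FONT[^>]*>([^<]+)</A>',
    re.IGNORECASE,
)


def find_programs(html):
    """Return every (href, title) pair found in the page, in document order."""
    return [(href, title.strip()) for href, title in LINK_RE.findall(html)]


# Two rows copied from the excerpt in the question.
sample = (
    '<TR> <TD BGCOLOR=EEEEEE><A HREF="/program.cgi?pid=1233">'
    '<FONT FACE="ARIAL,HELVETICA,SANS-SERIF" SIZE=2>Title 1</A></FONT></TD> </TR> '
    '<TR> <TD><A HREF="/program.cgi?pid=1234">'
    '<FONT FACE="ARIAL,HELVETICA,SANS-SERIF" SIZE=2>Title 2</A></FONT></TD> </TR>'
)
programs = find_programs(sample)
```

The result is a flat list of two-element tuples, which answers the data-structure question: a list of pairs is enough, with no need for anything more deeply nested.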

Read the article

Getting text position while parsing pdf with Quartz 2D

- by Koteg

Hi guys, another question regarding pdf parsing... Just read PDF Reference version 1.7 "5.3.1 Text-Positioning Operators" and I am a little bit confused. I wrote some code to get transformation matrix and initial text position. CGPDFOperatorTableSetCallback (table, "MP", &op_MP);//Define marked-content point CGPDFOperatorTableSetCallback (table, "DP", &op_DP);//Define marked-content point with property list CGPDFOperatorTableSetCallback (table, "BMC", &op_BMC);//Begin marked-content sequence CGPDFOperatorTableSetCallback (table, "BDC", &op_BDC);//Begin marked-content sequence with property list CGPDFOperatorTableSetCallback (table, "EMC", &op_EMC);//End marked-content sequence //Text State operators CGPDFOperatorTableSetCallback(table, "Tc", &op_Tc); CGPDFOperatorTableSetCallback(table, "Tw", &op_Tw); CGPDFOperatorTableSetCallback(table, "Tz", &op_Tz); CGPDFOperatorTableSetCallback(table, "TL", &op_TL); CGPDFOperatorTableSetCallback(table, "Tf", &op_Tf); CGPDFOperatorTableSetCallback(table, "Tr", &op_Tr); CGPDFOperatorTableSetCallback(table, "Ts", &op_Ts); //text showing operators CGPDFOperatorTableSetCallback(table, "TJ", &op_TJ); CGPDFOperatorTableSetCallback(table, "Tj", &op_Tj); CGPDFOperatorTableSetCallback(table, "'", &op_apostrof); CGPDFOperatorTableSetCallback(table, "\"", &op_double_apostrof); //text positioning operators CGPDFOperatorTableSetCallback(table, "Td", &op_Td); CGPDFOperatorTableSetCallback(table, "TD", &op_TD); CGPDFOperatorTableSetCallback(table, "Tm", &op_Tm); CGPDFOperatorTableSetCallback(table, "T*", &op_T); //text object operators CGPDFOperatorTableSetCallback(table, "BT", &op_BT);//Begin text object CGPDFOperatorTableSetCallback(table, "ET", &op_ET);//End text object So this is the output after application lunch: 2010-09-02 15:09:23.041 testSearch[8251:207] op_BT begin Integer value: 0 2010-09-02 15:09:23.043 testSearch[8251:207] op_BT end 2010-09-02 15:09:23.043 testSearch[8251:207] op_Tf begin Integer value: 1 2010-09-02 15:09:23.044 
testSearch[8251:207] op_Tf end 2010-09-02 15:09:23.044 testSearch[8251:207] op_Tm begin Float value: 557.364197 2010-09-02 15:09:23.045 testSearch[8251:207] op_Tm end 2010-09-02 15:09:23.045 testSearch[8251:207] op_TJ begin 2010-09-02 15:09:23.046 testSearch[8251:207] Array string value [0]: F 2010-09-02 15:09:23.046 testSearch[8251:207] Array integer value [1]: 94985208 2010-09-02 15:09:23.047 testSearch[8251:207] Array string value [2]: r 2010-09-02 15:09:23.047 testSearch[8251:207] Array integer value [3]: 94985208 2010-09-02 15:09:23.048 testSearch[8251:207] Array string value [4]: o 2010-09-02 15:09:23.048 testSearch[8251:207] Array integer value [5]: 94985208 2010-09-02 15:09:23.049 testSearch[8251:207] Array string value [6]: m s 2010-09-02 15:09:23.049 testSearch[8251:207] Array integer value [7]: 94985208 2010-09-02 15:09:23.049 testSearch[8251:207] Array string value [8]: a 2010-09-02 15:09:23.050 testSearch[8251:207] Array integer value [9]: 94985208 2010-09-02 15:09:23.050 testSearch[8251:207] Array string value [10]: m 2010-09-02 15:09:23.051 testSearch[8251:207] Array integer value [11]: 94985208 2010-09-02 15:09:23.051 testSearch[8251:207] Array string value [12]: p 2010-09-02 15:09:23.052 testSearch[8251:207] Array integer value [13]: 94985208 2010-09-02 15:09:23.053 testSearch[8251:207] Array string value [14]: l 2010-09-02 15:09:23.054 testSearch[8251:207] Array integer value [15]: 94985208 2010-09-02 15:09:23.055 testSearch[8251:207] Array string value [16]: e t 2010-09-02 15:09:23.055 testSearch[8251:207] Array integer value [17]: 94985208 2010-09-02 15:09:23.057 testSearch[8251:207] Array string value [18]: o r 2010-09-02 15:09:23.057 testSearch[8251:207] Array integer value [19]: 94985208 2010-09-02 15:09:23.058 testSearch[8251:207] Array string value [20]: e 2010-09-02 15:09:23.058 testSearch[8251:207] Array integer value [21]: 94985208 2010-09-02 15:09:23.059 testSearch[8251:207] Array string value [22]: s 2010-09-02 15:09:23.059 
testSearch[8251:207] Array integer value [23]: 94985208 2010-09-02 15:09:23.060 testSearch[8251:207] Array string value [24]: u 2010-09-02 15:09:23.061 testSearch[8251:207] Array integer value [25]: 94985208 2010-09-02 15:09:23.061 testSearch[8251:207] Array string value [26]: l 2010-09-02 15:09:23.062 testSearch[8251:207] Array integer value [27]: 94985208 2010-09-02 15:09:23.062 testSearch[8251:207] Array string value [28]: t 2010-09-02 15:09:23.063 testSearch[8251:207] op_TJ end If someone is familiar with text matrix and text positioning operators it would be nice to explain how all those thing work. How to calculate text position (or glyph?) using Tm (transformation matrix and other data)?
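
As a rough illustration of how the text matrix works according to the PDF Reference: `a b c d e f Tm` sets the text matrix to [[a, b, 0], [c, d, 0], [e, f, 1]], points are row vectors multiplied on the left, the origin of the next glyph in user space is (0, Trise) transformed by Tm and then by the CTM, and after each glyph Tm is translated by tx = ((w0 - Tj/1000) * Tfs + Tc + Tw) * Th. The sketch below applies those formulas in Python; the operand values are made up for the example and do not come from the log above.

```python
def mat_mul(m, n):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(m[i][k] * n[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def from_operands(a, b, c, d, e, f):
    """Build the matrix described by the six operands of Tm (or cm)."""
    return [[a, b, 0.0], [c, d, 0.0], [e, f, 1.0]]


def apply(point, m):
    """Transform the row vector [x, y, 1] by matrix m."""
    x, y = point
    return (x * m[0][0] + y * m[1][0] + m[2][0],
            x * m[0][1] + y * m[1][1] + m[2][1])


def glyph_origin(tm, ctm, rise=0.0):
    """User-space position where the next glyph will be drawn."""
    return apply((0.0, rise), mat_mul(tm, ctm))


def advance(tm, w0, tj=0.0, font_size=1.0, char_spacing=0.0,
            word_spacing=0.0, h_scale=1.0):
    """Return the text matrix after showing one glyph of width w0.

    tj is the number from a TJ array (thousandths of a text-space unit);
    word_spacing should only be passed for the space character.
    """
    tx = ((w0 - tj / 1000.0) * font_size + char_spacing + word_spacing) * h_scale
    return mat_mul(from_operands(1.0, 0.0, 0.0, 1.0, tx, 0.0), tm)


# Made-up values: "1 0 0 1 100 700 Tm", identity CTM, 12 pt font, glyph width 0.5.
ctm = from_operands(1.0, 0.0, 0.0, 1.0, 0.0, 0.0)
tm = from_operands(1.0, 0.0, 0.0, 1.0, 100.0, 700.0)
first = glyph_origin(tm, ctm)
tm = advance(tm, w0=0.5, font_size=12.0)
second = glyph_origin(tm, ctm)
```

In an operator-table scanner this means tracking Tm, the font size from `Tf`, and the spacing values from `Tc`, `Tw` and `Tz` as state, then updating Tm inside the `Tj` and `TJ` callbacks using glyph widths read from the font dictionary.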

Search Results

Search found 54956 results on 2199 pages for 'parsing error'.
