Search Results

Search found 54956 results on 2199 pages for 'parsing error'.

Page 158/2199 | < Previous Page | 154 155 156 157 158 159 160 161 162 163 164 165 | Next Page >

What's the best way to retrieve two pieces of data from an XML file?

- by Morinar

I've got an XML document that is in either a pre or post FO transformed state that I need to extract some information from. In the pre-case, I need to pull out two tags that represent the pageWidth and pageHeight and in the post case I need to extract the page-height and page-width parameters from a specific tag (I forget which one it is off the top of my head). What I'm looking for is an efficient/easily maintainable way to grab these two elements. I'd like to only read the document a single time fetching the two things I need. I initially started writing something that would use BufferedReader + FileReader, but then I'm doing string searching and it gets messy when the tags span multiple lines. I then looked at the DOMParser, which seems like it would be ideal, but I don't want to have to read the entire file into memory if I could help it as the files could potentially be large and the tags I'm looking for will nearly always be close to the top of the file. I then looked into SAXParser, but that seems like a big pile of complicated overkill for what I'm trying to accomplish. Anybody have any advice? Or simple implementations that would accomplish my goal? Thanks.

Read the article
PyParsing: What does Combine() do?

- by Rosarch

What is the difference between: foo = TOKEN1 + TOKEN2 and foo = Combine(TOKEN1 + TOKEN2) Thanks.

Read the article
PHP Regex to match lines with all-caps with occaisional hyphens.

- by Yaaqov

I'm trying to to convert an existing PHP Regular Expression match case to apply to a slightly different style of document. Here's the original style of the document: **FOODS - TYPE A** ___________________________________ **PRODUCT** 1) Mi Pueblito Queso Fresco Authentic Mexican Style Fresh Cheese; 2) La Fe String Cheese **CODE** Sell by date going back to February 1, 2009 And the successfully-running PHP Regex match code that only returns "true" if the line is surrounded by asterisks, and stores each side of the "-" as $m[1] and $m[2], respectively. if ( preg_match('#^\*\*([^-]+)(?:-(.*))?\*\*$#', $line, $m) ) { // only for **header - subheader** $m[2] is set. if ( isset($m[2]) ) { return array(TYPE_HEADER, array(trim($m[1]), trim($m[2]))); } else { return array(TYPE_KEY, array($m[1])); } } So, for line 1: $m[1] = "FOODS" AND $m[2] = "TYPE A"; Line 2 would be skipped; Line 3: $m[1] = "PRODUCT", etc. The question: How would I re-write the above regex match if the headers did not have the asterisks, but still was all-caps, and was at least 4 characters long? For example: FOODS - TYPE A ___________________________________ PRODUCT 1) Mi Pueblito Queso Fresco Authentic Mexican Style Fresh Cheese; 2) La Fe String Cheese CODE Sell by date going back to February 1, 2009 Thank you.

Read the article
Linux: shell builtin string matching

- by gmatt

I am trying to become more familiar with using the builtin string matching stuff available in shells in linux. I came across this guys posting, and he showed an example a="abc|def" echo ${a#*|} # will yield "def" echo ${a%|*} # will yield "abc" I tried it out and it does what its advertised to do, but I don't understand what the $,{},#,*,| are doing, I tried looking for some reference online or in the manuals but I couldn't find anything. Can anyone explain to me what's going on here?

Read the article
C# - Parse HTML source as XML

- by fonix232

I would like to read in a dynamic URL what contains a HTML file, and read it like an XML file, based on nodes (HTML tags). Is this somehow possible? I mean, there is this HTML code: <table class="bidders" cellpadding="0" cellspacing="0"> <tr class="bidRow4"> <td>kucik (automata)</td> <td class="right">9 374 Ft</td> <td class="bidders_date">2010-06-10 18:19:52</td> </tr> <tr class="bidRow4"> <td>macszaf (automata)</td> <td class="right">9 373 Ft</td> <td class="bidders_date">2010-06-10 18:19:52</td> </tr> <tr class="bidRow2"> <td>kucik (automata)</td> <td class="right">9 372 Ft</td> <td class="bidders_date">2010-06-10 18:19:42</td> </tr> <tr class="bidRow2"> <td>macszaf (automata)</td> <td class="right">9 371 Ft</td> <td class="bidders_date">2010-06-10 18:19:42</td> </tr> <tr class="bidRow0"> <td>kucik (automata)</td> <td class="right">9 370 Ft</td> <td class="bidders_date">2010-06-10 18:19:32</td> </tr> <tr class="bidRow0"> <td>macszaf (automata)</td> <td class="right">9 369 Ft</td> <td class="bidders_date">2010-06-10 18:19:32</td> </tr> <tr class="bidRow8"> <td>kucik (automata)</td> <td class="right">9 368 Ft</td> <td class="bidders_date">2010-06-10 18:19:22</td> </tr> <tr class="bidRow8"> <td>macszaf (automata)</td> <td class="right">9 367 Ft</td> <td class="bidders_date">2010-06-10 18:19:22</td> </tr> <tr class="bidRow6"> <td>kucik (automata)</td> <td class="right">9 366 Ft</td> <td class="bidders_date">2010-06-10 18:19:12</td> </tr> <tr class="bidRow6"> <td>macszaf (automata)</td> <td class="right">9 365 Ft</td> <td class="bidders_date">2010-06-10 18:19:12</td> </tr> </table> I want to parse this into a ListView (or a Grid) to create rows with the data contained. All tr are different row, and all td in a given td is a column in the given row. And also I want it to be as fast as possible, as it would update itself in 5 seconds. Is there any library for this?

Read the article
PHP: What is an efficient way to parse a text file containing very long lines?

- by Shaun

I'm working on a parser in php which is designed to extract MySQL records out of a text file. A particular line might begin with a string corresponding to which table the records (rows) need to be inserted into, followed by the records themselves. The records are delimited by a backslash and the fields (columns) are separated by commas. For the sake of simplicity, let's assume that we have a table representing people in our database, with fields being First Name, Last Name, and Occupation. Thus, one line of the file might be as follows [People] = "\Han,Solo,Smuggler\Luke,Skywalker,Jedi..." Where the ellipses (...) could be additional people. One straightforward approach might be to use fgets() to extract a line from the file, and use preg_match() to extract the table name, records, and fields from that line. However, let's suppose that we have an awful lot of Star Wars characters to track. So many, in fact, that this line ends up being 200,000+ characters/bytes long. In such a case, taking the above approach to extract the database information seems a bit inefficient. You have to first read hundreds of thousands of characters into memory, then read back over those same characters to find regex matches. Is there a way, similar to the Java String next(String pattern) method of the Scanner class constructed using a file, that allows you to match patterns in-line while scanning through the file? The idea is that you don't have to scan through the same text twice (to read it from the file into a string, and then to match patterns) or store the text redundantly in memory (in both the file line string and the matched patterns). Would this even yield a significant increase in performance? It's hard to tell exactly what PHP or Java are doing behind the scenes.

Read the article
How do I ignore unrecognized characters which come back from web service?

- by Jay Askren

I am working on an app which calls a rest web service. Sometimes the xml responses contain characters which the phone can not understand. When displaying these characters, an empty box is displayed instead. I would like to filter out these characters. How can I detect if a character will be able to be displayed on the screen?

Read the article
How do I keep a scanner from throwing exceptions when the wrong type is entered? (java)

- by David

Here's some sample code: import java.util.Scanner; class In { public static void main (String[]arg) { Scanner in = new Scanner (System.in) ; System.out.println ("how many are invading?") ; int a = in.nextInt() ; System.out.println (a) ; } } if i run the program and give it an int like 4then everything goes fine. if, on the other hand, i answer too many it doesn't laugh at my funny joke. instead i get this: (as expected) Exception in thread "main" java.util.InputMismatchException at java.util.Scanner.throwFor(Scanner.java:819) at java.util.Scanner.next(Scanner.java:1431) at java.util.Scanner.nextInt(Scanner.java:2040) at java.util.Scanner.nextInt(Scanner.java:2000) at In.main(In.java:9) is there a way so that i can make it so that it either ignores entries that aren't ints or re prompts with "how many are invading?"? i'd like to know how to do both of these.

Read the article
Python: Is there a way to get HTML that was dynamically created by Javascript?

- by Joschua

As far as I can tell, this is the case for LyricWikia. The lyrics (example) can be accessed from the browser, but can't be found in the source code (can be opened with CTRL + U in most browsers) or reading the contents of the site with Python: from urllib.request import urlopen URL = 'http://lyrics.wikia.com/Billy_Joel:Piano_Man' r = urlopen(URL).read().decode('utf-8') And the test: >>> 'Now John at the bar is a friend of mine' in r False >>> 'John' in r False But when you select and look at the source code of the box in which the lyrics are displayed, you can see that there is: <div class="lyricbox">[...]</div> Is there a way to get the contents of that div-element with Python?

Read the article
Android bluetooth socket error

- by ashwini

I am using backport bluetooth api on android 1.6. I am using Google Bluetooth Chat sample app for testing. The app works fine in normal scenarios. In a scenario, when I try to connect to paired device which is in off state, I get following error. 01-04 09:00:11.629: ERROR/BluetoothEventLoop.cpp(84): onGetRemoteServiceChannelResult: D-Bus error: org.bluez.Error.ConnectionAttemptFailed (Host is down) 01-04 09:00:11.729: DEBUG/dalvikvm(128): GC freed 4535 objects / 256008 bytes in 296ms 01-04 09:00:21.880: ERROR/bluetooth_RfcommSocket.cpp(1433): connect error: Host is down (112) But it sets the state as connected. The app is unable to catch the exception. Why does it happen? Or is it the case with backport api? Any help is appreciated as I am struggling a lot to get things run fine.

Read the article
How can I parse a namespace using the SAX parser?

- by Silvestri

Hello, Using a twitter search URL ie. http://search.twitter.com/search.rss?q=android returns CSS that has an item that looks like: <item> <title>@UberTwiter still waiting for @ubertwitter android app!!!</title> <link>http://twitter.com/meals69/statuses/21158076391</link> <description>still waiting for an app!!!</description> <pubDate>Sat, 14 Aug 2010 15:33:44 +0000</pubDate> <guid>http://twitter.com/meals69/statuses/21158076391</guid> <author>Some Twitter User</author> <media:content type="image/jpg" height="48" width="48" url="http://a1.twimg.com/profile_images/756343289/me2_normal.jpg"/> <google:image_link>http://a1.twimg.com/profile_images/756343289/me2_normal.jpg</google:image_link> <twitter:metadata> <twitter:result_type>recent</twitter:result_type> </twitter:metadata> </item> Pretty simple. My code parses out everything (title, link, description, pubDate, etc.) without any problems. However, I'm getting null on: <google:image_link> I'm using Java to parse the RSS feed. Do I have to handle compound localnames differently than I would a more simple localname? This is the bit of code that parses out Link, Description, pubDate, etc: @Override public void endElement(String uri, String localName, String name) throws SAXException { super.endElement(uri, localName, name); if (this.currentMessage != null){ if (localName.equalsIgnoreCase(TITLE)){ currentMessage.setTitle(builder.toString()); } else if (localName.equalsIgnoreCase(LINK)){ currentMessage.setLink(builder.toString()); } else if (localName.equalsIgnoreCase(DESCRIPTION)){ currentMessage.setDescription(builder.toString()); } else if (localName.equalsIgnoreCase(PUB_DATE)){ currentMessage.setDate(builder.toString()); } else if (localName.equalsIgnoreCase(GUID)){ currentMessage.setGuid(builder.toString()); } else if (uri.equalsIgnoreCase(AVATAR)){ currentMessage.setAvatar(builder.toString()); } else if (localName.equalsIgnoreCase(ITEM)){ messages.add(currentMessage); } builder.setLength(0); } } startDocument looks like: @Override public void startDocument() throws SAXException { super.startDocument(); messages = new ArrayList<Message>(); builder = new StringBuilder(); } startElement looks like: @Override public void startElement(String uri, String localName, String name, Attributes attributes) throws SAXException { super.startElement(uri, localName, name, attributes); if (localName.equalsIgnoreCase(ITEM)){ this.currentMessage = new Message(); } } Tony

Read the article
How to retrieve a numbered sequence range from a List of filenames?

- by glenneroo

I would like to automatically parse the entire numbered sequence range of a List<FileData> of filenames (sans extensions) by checking which part of the filename changes. Here is an example (file extension already removed): First filename: IMG_0000 Last filename: IMG_1000 Numbered Range I need: 0000 to 1000 Except I need to deal with every possible type of file naming convention such as: 0000 ... 9999 20080312_0000 ... 20080312_9999 IMG_0000 - Copy ... IMG_9999 - Copy 8er_green3_00001 .. 8er_green3_09999 etc. I need the entire 0-padded range e.g. 0001 not just 1 The sequence number is 0-padded e.g. 0001 The sequence number can be located anywhere e.g. IMG_0000 - Copy The range can start and end with anything i.e. doesn't have to start with 1 and end with 9999 Whenever I get something working for 8 random test cases, the 9th test breaks everything and I end up re-starting from scratch. I've currently been comparing only the first and last filenames (as opposed to iterating through all filenames): void FindRange(List<FileData> files, out string startRange, out string endRange) { string firstFile = files.First().ShortName; string lastFile = files.Last().ShortName; ... } Does anyone have any clever ideas?

Read the article
How Do I Pull Info from String

- by Russ Bradberry

I am trying to pull dynamics from a load that I run using bash. I have gotten to a point where I get the string I want, now from this I want to pull certain information that can vary. The string that gets returned is as follows: Records: 2910 Deleted: 0 Skipped: 0 Warnings: 0 Each of the number can and will vary in length, but the overall structure will remain the same. What I want to do is be able to get these numbers and load them into some bash variables ie: RECORDS=?? DELETED=?? SKIPPED=?? WARNING=?? In regex I would do it like this: Records: (\d*?) Deleted: (\d*?) Skipped (\d*?) Warnings (\d*?) and use the 4 groups in my variables.

Read the article
how to dispaly image in grid view reading imageUrl from xml using sax parser in android

- by Pramod kuamr

thanks for answer but i am able to read xml file from url but i need if in xml imageUrl is there so show in grid view ..this is my xml file and read URL <?xml version="1.0" encoding="UTF-8"?> <channels> <channel> <name>ndtv</name> <logo>http://a3.twimg.com/profile_images/670625317/aam-logo--twitter.png</logo> <description>this is a news Channel</description> <rssfeed>ndtv.com</rssfeed> </channel> <channel> <name>star news</name> <logo>http://a3.twimg.com/profile_images/740897825/AndroidCast-350_normal.png</logo> <description>this is a newsChannel</description> <rssfeed>starnews.com</rssfeed> </channel> </channels>

Read the article
Cant get description rss tag data with javascript

- by AdamB

I'm currently making a widget to take and display items from a feed. I have this working for the most part, but for some reason the data within the tag within the item comes back as empty, but I get the data in the and tags no problem. feed is and xmlhttp.responseXML object. var items = feed.getElementsByTagName("item"); for (var i=0; i<10; i++){ container = document.getElementById('list'); new_element = document.createElement('li'); title = items[i].getElementsByTagName("title")[0].firstChild.nodeValue; link = items[i].getElementsByTagName("link")[0].firstChild.nodeValue; alert(items[i].getElementsByTagName("description")[0].firstChild.nodeValue); new_element.innerHTML = "<a href=\""+link+"\">"+title+"</a> "; container.insertBefore(new_element, container.firstChild); } I have no idea why it wouldn't be working for the tag and would be for the other tags. Here is an example of the rss feed its trying to parse: <rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/"> <channel> <title>A title</title> <link>http://linksomehwere</link> <description>The title of the feed</description> <language>en-us</language> <item> <pubDate>Fri, 10 Jul 2009 11:34:49 -0500</pubDate> <title>Awesome Title</title> <link>http://link/to/thing</link> <guid>http://link/to/thing</guid> <description> <![CDATA[ <p>some html crap</p> blah blah balh ]]> </description> </item> </channel> </rss>

Read the article
Evaluating mathematical expressions in Python

- by vander

Hi, I want to tokenize a given mathematical expression into a binary tree like this: ((3 + 4 - 1) * 5 + 6 * -7) / 2 '/' / \ + 2 / \ * * / \ / \ - 5 6 -7 / \ + 1 / \ 3 4 Is there any pure Python way to do this? Like passing as a string to Python and then get back as a tree like mentioned above. Thanks.

Read the article
How to convert a string with escaped characters into a "normal" string

- by M28

Let's say that I have like this: "\\n", I need to convert it to a string like if the interpreter would do it: \n. A simple replace like this wouldn't work: function aaa(s){ return s.replace(/\\n/gm,'\n').replace(/\\\\/gm,'\\'); } I need this behaviour: "Line 1\nLine 2" = Line 1<Line break>Line 2 "Line 1\\nLine 1" = Line 1\nLine1

Read the article
Combine two numbers into one. Example: 123 and 456 become 123456

- by Bei337

In C++, how do I combine (note: not add) two integers into one big integer? For example: int1 = 123; int2 = 456; Is there a function to take the two numbers and turn intCombined into 123456?

Read the article
Extract information from javascript counter via PHP

- by Jennifer Weinberg

Hi, I'm looking for a way to extract some information from this site via PHP: http://www.mycitydeal.co.uk/deals/london There ist a counter where the time left is displayed, but the information is within the JavaScript. Since I'm really a JavaScript rookie, I didn't really know how to get the information. Normally I would extract the information with "preg_match" and some regular expressions. Can someone help me to extract the information (Hrs., Min., Sec.) ? Jennifer

Read the article
Simple way of getting the Last.fm artist image for recently listened songs?

- by animuson

On the Last.fm website, your recently listened track include the 34x34 (or whatever size) image at the left of each song. However, in the RSS feed that they give you, no image URLs are provided for the songs. I was wondering if there was a good way of figuring out the ID for the image that needs to be used for that artist and displaying it based on the data that we're given. I know it is possible to load the artist page from their website and then grab the image values from JavaScript, but that seems overly complicated and would probably take quite some time to do. What we're given: <item> <title>Owl City – Rainbow Veins</title> <link>http://www.last.fm/music/Owl+City/_/Rainbow+Veins</link> <pubDate>Thu, 20 May 2010 18:15:29 +0000</pubDate> <guid>http://www.last.fm/user/animuson#1274379329</guid> <description>http://www.last.fm/music/Owl+City</description> </item> and the 34x34 image for this song would be here (ID# 37056785). Does anything like this exist? I've considered storing the ID number in a cache of some sort once it has been checked once, but what if the image changes?

Read the article
Why does Joda time change the PM in my input string to AM?

- by Tree

My input string is a PM time: log(start); // Sunday, January 09, 2011 6:30:00 PM I'm using Joda Time's pattern syntax as follows to parse the DateTime: DateTimeFormatter parser1 = DateTimeFormat.forPattern("EEEE, MMMM dd, yyyy H:mm:ss aa"); DateTime startTime = parser1.parseDateTime(start); So, why is my output string AM? log(parser1.print(startTime)); // Sunday, January 09, 2011 6:30:00 AM

Read the article
JSoup - Select only one listobject

- by Zyril

I'm trying to extract some certain data from a website using JSoup and Java. So far I've been successful in what I'm trying to achieve. <ul class="beverageFacts"> <li><span>Årgång</span><strong>**2009** </strong></li> I want to extract what is inside the ** in the above HTML. I can do this by using the code that follows in JSoup: doc.select("ul.beverageFacts li:lt(1) strong"); I'm using the lt(1) because there are several more list items following that I want to omit. Now to my problem; there's an optional information tab on the site I'm extracting data from, and it also has a class called "beverageFacts". My code will at the moment extract that data too, which I don't want it to do. The code is further down in the source of the website, and I've tried to use the indexer :lt(1) here aswell, but it wont work. <div id="beverageMoreFacts" style="display: block"> <ul class="beverageFacts"><li class="half"> <span> Färg</span><strong> Ljusgul färg.</strong> My overall result is that I extract "2009 Ljusgul färg." instead of only "2009". How can I write my code so it will only extract the first part, which it succesfully does, and omits the rest? EDIT: I get the same result using: doc.select("ul.beverageFacts li:eq(0) strong"); Thanks, Z

Read the article
remove parent xml tag

- by cru3l

For example, we have xml file with this format: <A> <B> <C></C> <D></D> <D></D> </B> </A> i need that: if all "D"-tags elements are empty, then we need to delete whole "A"-tag element and, of course, we need to do this with all "A"-tags in xml.

Read the article
Is it valid to have more than one question mark in a URL?

- by Bungle

I came across the following URL today: http://www.sfgate.com/cgi-bin/blogs/inmarin/detail??blogid=122&entry_id=64497 Notice the doubled question mark at the beginning of the query string: ??blogid=122&entry_id=64497 My browser didn't seem to have any trouble with it, and running a quick bookmarklet: javascript:alert(document.location.search); just gave me the query string shown above. Is this a valid URL? The reason I'm being so pedantic (assuming that I am) is because I need to parse URLs like this for query parameters, and supporting doubled question marks would require some changes to my code. Obviously if they're in the wild, I'll need to support them; I'm mainly curious if it's my fault for not adhering to URL standards exactly, or if it's in fact a non-standard URL.

Read the article
Code/Approach Golf: Find row in text file with too many columns

- by awshepard

Given a text file that is supposed to contain 10 tab-delimited columns (i.e. 9 tabs), I'd like to find all rows that have more than 10 columns (more than 9 tabs). Each row ends with CR-LF. Assume nothing about the data, field widths, etc, other than the above. Comments regarding approach, and/or working code would be extremely appreciated. Bonus for printing line numbers of offending lines as well. Thanks in advance!

Read the article

< Previous Page | 154 155 156 157 158 159 160 161 162 163 164 165 | Next Page >