Search Results

Search found 7251 results on 291 pages for 'pdf parsing'.


How can we write HTML Tidy code to insert missing closing tags?

- by Harikrishna

How can we use HTML Tidy only to insert closing tags where they are missing in an HTML file? I am parsing tabular HTML using Html Agility Pack, but where closing tags are missing, the extraction does not work well. If we add the closing tags manually, Html Agility Pack extracts the information perfectly. So I want to insert the missing closing tags automatically so that Html Agility Pack can extract the information correctly.

Best way to parse RSS/Atom feeds with PHP

- by carson

I'm currently using Magpie RSS but it sometimes falls over when the RSS or Atom feed isn't well formed. Are there any other options for parsing RSS and Atom feeds with PHP?

Insert unicode strings into CleverCSS

- by Brian M. Hunt

How can one insert a Unicode string into CleverCSS? In particular, how could one produce the following CSS using CleverCSS:

    li:after { content: "\00BB \0020"; }

I've figured out CleverCSS's parsing rules, but suffice to say that the permutations I've thought sensible have failed, for example:

    li:
      content: "\\00BB \\0020" // becomes content: 'BB 0'

EDIT: My other examples and the rest of my post weren't saved. Suffice to say that I had a longer list of examples that also failed, as did my closing, which was something like: I'd be grateful for any thoughts and input.

Brian

Simple regex question?

- by Joan Venge

In the streams I am parsing, I need to parse something in this pattern:

    <b>PaintTitle</b></td><td class=detail valign="top" align=left><div align=left><font size=small><b>The new great album by Pet Shop Boys</b>

How would I get the string "The new great album by Pet Shop Boys", where <b>PaintTitle</b> is guaranteed to appear once per album?
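One way to approach this, sketched in Python since the question does not name a language: anchor the pattern on the `<b>PaintTitle</b>` marker, lazily skip the markup that follows, and capture the text of the next `<b>` element. The names `TITLE_RE` and `extract_title` are made up for this illustration, and the sketch assumes the title itself contains no nested tags.

```python
import re

# Sample input copied from the question.
stream = (
    '<b>PaintTitle</b></td><td class=detail valign="top" align=left>'
    '<div align=left><font size=small>'
    '<b>The new great album by Pet Shop Boys</b>'
)

# After the PaintTitle marker, lazily skip everything (tags, attributes)
# up to the next <b>...</b> pair and capture the text inside it.
TITLE_RE = re.compile(r"<b>PaintTitle</b>.*?<b>(.*?)</b>", re.DOTALL)


def extract_title(text):
    """Return the text of the first <b> element after the marker, or None."""
    match = TITLE_RE.search(text)
    return match.group(1) if match else None


print(extract_title(stream))  # The new great album by Pet Shop Boys
```

If the surrounding markup varies a lot between albums, an HTML parser is likely to be more robust than a regular expression.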

Fetch excerpt from Wikipedia article?

- by Felix

I've been up and down the Wikipedia API, but I can't figure out if there's a nice way to fetch the excerpt of an article (usually the first paragraph). It would be nice to get the HTML formatting of that paragraph, too. The only way I currently see of getting something that resembles a snippet is by performing a fulltext search (example), but that's not really what I want (too short). Is there any other way to fetch the first paragraph of a Wikipedia article than barbarically parsing HTML/WikiText?

Shift / reduce conflicts in grammar of arithmetic expression with n-ary sums / products

- by aioobe

Parsing binary sums / products is easy, but I'm having trouble defining a grammar that parses a + b * c + d + e as sum(a, prod(b, c), d, e). My initial (naive) attempt generated 61 shift / reduce conflicts. I'm using Java CUP (but I suppose a solution for any other parser generator would be easily translated).
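One common way to get n-ary nodes is to describe a sum as a list of products and a product as a list of operands, rather than as nested binary operations. The sketch below shows that shape as a small hand-written recursive-descent parser in Python, not as a CUP grammar; the tuple node layout and all function names are made up for illustration. In an LALR generator the two loops would usually become left-recursive list rules, which is untested here against CUP itself.

```python
import re

# An identifier or one of the single-character operators/parentheses.
TOKEN_RE = re.compile(r"\s*([A-Za-z_]\w*|[+*()])")


def tokenize(text):
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if not match:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse(text):
    """Parse into nested tuples such as ("sum", "a", ("prod", "b", "c"))."""
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def atom():
        token = peek()
        if token == "(":
            take()
            node = sum_()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            take()
            return node
        if token is None or token in {"+", "*", ")"}:
            raise SyntaxError("expected an operand")
        return take()

    def prod():
        # prod := atom ('*' atom)*  -> one flat node for the whole chain
        factors = [atom()]
        while peek() == "*":
            take()
            factors.append(atom())
        return factors[0] if len(factors) == 1 else ("prod", *factors)

    def sum_():
        # sum := prod ('+' prod)*  -> one flat node for the whole chain
        terms = [prod()]
        while peek() == "+":
            take()
            terms.append(prod())
        return terms[0] if len(terms) == 1 else ("sum", *terms)

    tree = sum_()
    if peek() is not None:
        raise SyntaxError("trailing input")
    return tree


print(parse("a + b * c + d + e"))
```

Because each chain is collected in a loop, a + b * c + d + e yields a single sum node with four children instead of three nested binary sums.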

Parser problem - Else-If and a Function Declaration

- by Amar Ravikumar

A quick, fun question: what is the difference between a function declaration in C/C++ and an else-if statement block from a purely parsing standpoint?

    void function_name(arguments) { [statement-block] }

    else if(arguments) { [statement-block] }

Looking for the best solution! =)

MalformedByteSequenceException while trying to parse XML

- by poeschlorn

Hey guys, maybe someone can help. I have the following .gpx data from Wikipedia:

    <?xml version="1.0" encoding="UTF-8" standalone="no" ?>
    <gpx xmlns="http://www.topografix.com/GPX/1/1" creator="byHand" version="1.1"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://www.topografix.com/GPX/1/1 http://www.topografix.com/GPX/1/1/gpx.xsd">
      <wpt lat="39.921055008" lon="3.054223107">
        <ele>12.863281</ele>
        <time>2005-05-16T11:49:06Z</time>
        <name>Cala Sant Vicenç - Mallorca</name>
        <sym>City</sym>
      </wpt>
    </gpx>

When I call my parsing method, I get an exception (see below). The call looks like this:

    Document tmpDoc = getParsedXML(currentGPX);

My method to parse looks like this (standard parsing code, nothing exciting):

    public static Document getParsedXML(String fileWithPath) {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        DocumentBuilder db;
        Document doc = null;
        try {
            db = dbf.newDocumentBuilder();
            doc = db.parse(new File(fileWithPath));
        } catch (ParserConfigurationException e) {
            e.printStackTrace();
        } catch (SAXException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
        return doc;
    }

This simple code throws the following exception:

    com.sun.org.apache.xerces.internal.impl.io.MalformedByteSequenceException: Invalid byte 2 of 3-byte UTF-8 sequence.
        at com.sun.org.apache.xerces.internal.impl.io.UTF8Reader.invalidByte(Unknown Source)
        at com.sun.org.apache.xerces.internal.impl.io.UTF8Reader.read(Unknown Source)
        at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.load(Unknown Source)
        at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.skipChar(Unknown Source)
        at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(Unknown Source)
        at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(Unknown Source)
        at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
        at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
        at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
        at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(Unknown Source)
        at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(Unknown Source)
        at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(Unknown Source)
        at javax.xml.parsers.DocumentBuilder.parse(Unknown Source)
        at Zeugs.getParsedXML(Zeugs.java:38)
        at Zeugs.main(Zeugs.java:25)

I guess the error lies within the format of the first file, but I don't know where exactly. Can you please give me a hint?
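A likely cause, though the question does not confirm it, is that the file declares encoding="UTF-8" but was saved in a single-byte encoding such as ISO-8859-1. In that encoding the "ç" in the waypoint name is the single byte 0xE7, which a UTF-8 decoder reads as the start of a 3-byte sequence; the space that follows is not a valid continuation byte, which would match the message "Invalid byte 2 of 3-byte UTF-8 sequence". The Python sketch below only demonstrates that byte-level effect; it is not the asker's Java code, and the helper names are made up.

```python
# The waypoint name from the question's GPX file.
name = "Cala Sant Vicenç - Mallorca"

utf8_bytes = name.encode("utf-8")      # ç becomes the two bytes 0xC3 0xA7
latin1_bytes = name.encode("latin-1")  # ç becomes the single byte 0xE7


def is_valid_utf8(data):
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def reencode_as_utf8(data, source_encoding="latin-1"):
    """Convert bytes from an assumed source encoding to UTF-8."""
    return data.decode(source_encoding).encode("utf-8")


print(is_valid_utf8(utf8_bytes))    # True
print(is_valid_utf8(latin1_bytes))  # False
print(is_valid_utf8(reencode_as_utf8(latin1_bytes)))  # True
```

If this is the cause, either re-saving the file as real UTF-8 or changing the declaration to the encoding the file actually uses should let the parser read it.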

HTML or Alternate markup for wiki site?

- by at

In choosing an editor for my wiki-like site, I'm debating whether to allow HTML or a custom alternate markup (maybe like Wikipedia/MediaWiki's or BBCode).

HTML benefits:
- Easy for users to deal with (copying and pasting, learning)
- Somewhat future proof
- Many more editing tools available, usually WYSIWYG too

Alternate markup benefits:
- On the server side I don't have to worry about parsing malicious JavaScript, styles, or HTML that I don't allow
- Can be easy to learn
- Can be easier to decipher if not HTML-savvy

Am I missing something? What's the best solution?

How to parse XHTML while ignoring the DOCTYPE declaration using a DOM parser

- by Rachel

Hi, I am facing an issue parsing XHTML with a DOCTYPE declaration using a DOM parser. Error: java.io.IOException: Server returned HTTP response code: 503 for URL: http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd%20 Is there a way to parse the XHTML into a Document object while ignoring the DOCTYPE?

Does XML::LibXML::Reader read html?

- by sid_com

I didn't find anything about parsing HTML in the XML::LibXML::Reader documentation. I tried to parse an HTML site and it didn't work. Is my conclusion that XML::LibXML::Reader doesn't work with HTML correct?

How to parse an HTML file at a URL?

- by Warrior

I am new to iPhone development. I am able to parse an XML file at a URL and retrieve its contents from particular nodes. For parsing at the URL I use:

    NSString * path = @"xxxxxxxxxxxxxxxxxxxxxx";
    [self parseXMLFileAtURL:path];

For retrieving the data I use NSXMLParser. How can I achieve the same thing if I have an HTML file at my URL (the source of the web page is HTML)? Please help me out. Thanks.

What grammar based parser-generator tools exist for ruby?

- by cartoonfox

What open source (preferably gem-based) parser-generator options do I have in Ruby? I've used (flex&bison)|(lex&yacc) from C in the past, and I'm comfortable with BNF-style specifications. I've heard of treetop, but it looks a bit alien and verbose compared to yacc... Purpose: I want to convert my text markup language to a BNF and generate the parsing code. I think it's a better strategy than my first-order solution: http://github.com/dafydd/semantictext/blob/master/lib/semantictext/rich_text_parser.rb

Solve math question in PHP

- by Koning WWWWWWWWWWWWWWWWWWWWWWW

The user can enter a math problem like 5 + 654, 6 ^ 24, 2!, sqrt(543), log(54), sin 5, sin(50). After some reformatting (e.g. changing sin 5 into sin(5)) and doing an eval, PHP gives me the right result. However, this is quite unsafe. Can anyone point me in the right direction for parsing and solving math questions like the examples above in a safe way? Thanks.
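The usual alternative to eval is to parse the input into a syntax tree and evaluate only an explicit whitelist of node types and functions. The sketch below illustrates that idea in Python using its standard `ast` module rather than PHP; the helper names (`normalize`, `evaluate`), the whitelist contents, and the exponent cap are all choices made for this example. It rewrites `^` to `**` and `n!` to `factorial(n)` before parsing, and it does not bound every expensive computation (a very large factorial is still slow).

```python
import ast
import math
import operator
import re

# Only these operators and functions may appear in an expression.
BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
}
UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}
FUNCTIONS = {
    "sqrt": math.sqrt,
    "log": math.log,
    "sin": math.sin,
    "factorial": math.factorial,
}


def normalize(text):
    """Rewrite '5!' to 'factorial(5)' and '^' to '**' before parsing."""
    text = re.sub(r"(\d+)\s*!", r"factorial(\1)", text)
    return text.replace("^", "**")


def evaluate(text):
    """Evaluate a whitelisted arithmetic expression without eval()."""
    tree = ast.parse(normalize(text), mode="eval")
    return _eval(tree.body)


def _eval(node):
    if isinstance(node, ast.Constant) and type(node.value) in (int, float):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in BINARY_OPS:
        left, right = _eval(node.left), _eval(node.right)
        # Arbitrary cap so inputs like 9 ^ 9 ^ 9 cannot exhaust memory.
        if isinstance(node.op, ast.Pow) and abs(right) > 1000:
            raise ValueError("exponent too large")
        return BINARY_OPS[type(node.op)](left, right)
    if isinstance(node, ast.UnaryOp) and type(node.op) in UNARY_OPS:
        return UNARY_OPS[type(node.op)](_eval(node.operand))
    if (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
            and node.func.id in FUNCTIONS and not node.keywords):
        return FUNCTIONS[node.func.id](*[_eval(arg) for arg in node.args])
    # Names, attribute access, subscripts, strings, etc. are all rejected.
    raise ValueError("unsupported expression")


print(evaluate("5 + 654"))  # 659
print(evaluate("2!"))       # 2
print(evaluate("sqrt(543)"))
```

Anything outside the whitelist raises ValueError instead of being executed, which is the property that eval cannot give.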

HTML parser for GAE

- by Richard

Generally I use lxml for my HTML parsing needs, but that isn't available on Google App Engine. The obvious alternative is BeautifulSoup, but I find it chokes too easily on malformed HTML. Currently I am testing libxml2dom and have been getting better results. Which pure Python HTML parser have you found performs best? My priority is the ability to handle bad HTML over speed.

What quality, parser-generator options exist for ruby?

- by cartoonfox

What open source (preferably gem-based) parser-generator options do I have in Ruby? I've used (flex&bison)|(lex&yacc) from C in the past, and I'm comfortable with BNF-style specifications. I've heard of treetop, but it looks a bit alien and verbose compared to yacc... Purpose: I want to convert my text markup language to a BNF and generate the parsing code. I think it's a better strategy than my first-order solution: http://github.com/dafydd/semantictext/blob/master/lib/semantictext/rich_text_parser.rb

Best 3rd Party Resume Parser Tool

- by Krishna Kumar

We are working on a hiring application and need the ability to easily parse resumes. Before trying to build one, was wondering what resume parsing tools are available out there and what is the best one, in your opinion? We need to be able to parse both Word and TXT files.

Are there faster XML parsers in Java than Xalan/Xerces

- by jm04469

I haven't found many ways to increase the performance of a Java application that does intensive XML processing other than to leverage hardware such as Tarari or Datapower. Does anyone know of any open source ways to accelerate XML parsing?

PHP - Read TXT from specific position

- by user1466766

I'm having trouble with PHP text parsing. I have a txt file which contains this kind of information:

    sometext::sometext.0 = INTEGER: 254

What I need is to get the last value, 254, as a variable in PHP. In this txt file the last value can change from 0 to 255; the "sometext::sometext.0 = INTEGER: " part doesn't change at all. It has a length of 36 symbols, so I need PHP to read whatever comes after the 36th symbol into a variable. Thank you.
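Since the prefix is fixed, one option is to check for it and slice by its length instead of hard-coding the number 36; the placeholder prefix shown in the question is presumably not the real one, so a counted offset is easy to get wrong. The sketch below shows the idea in Python rather than PHP; `PREFIX` and `read_value` are names invented for this example.

```python
# Placeholder prefix copied from the question; the real file presumably
# uses different text of a different length.
PREFIX = "sometext::sometext.0 = INTEGER: "


def read_value(line, prefix=PREFIX):
    """Return the integer that follows the fixed prefix on the line."""
    line = line.strip()
    if not line.startswith(prefix):
        raise ValueError("unexpected line format")
    value = int(line[len(prefix):])
    if not 0 <= value <= 255:
        raise ValueError("value out of the expected 0..255 range")
    return value


print(read_value("sometext::sometext.0 = INTEGER: 254"))  # 254
```

Validating the prefix first also means a changed file format fails loudly instead of silently producing a wrong number.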

How to parse responses from a Django server in android?

- by primal

Hi, in the Android application I am building, I want to be able to communicate with a local server developed in Django (basically a login page and a home page populated with posts and images from users). So do I need to use XML parsers for parsing the response from the Django server, or is it possible for the server to respond with strings which can be used directly? Also, what about images? Regards, Primal

Objective C - Parse NSData

- by EZFrag

I have the following data inside an NSData object:

    <00000000 6f2d840e 31504159 2e535953 2e444446 3031a51b 8801015f 2d02656e 9f110101 bf0c0cc5 0affff3f 00000003 ffff03

I'm having issues parsing this data. It contains information which is marked by tags:

- Tag 1 is from byte value 0x84 to 0xa5
- Tag 2 is from byte value 0xa5 to 0x88
- Tag 3 is from byte value 0x88 to 0x5f0x2d
- Tag 4 is from byte value 0x5f0x2d to 0x9f0x11

How would I go about getting those values from the NSData object?

Regards, EZFrag
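The dump looks like BER-TLV data of the kind used on EMV smart cards: each object is a tag, then a length, then that many value bytes, and some tags contain further objects. That reading is inferred from the bytes and is not stated in the question. Under that assumption, a parser reads the length after each tag instead of scanning for the next tag value. The sketch below is in Python rather than Objective-C, and `parse_tlv` and `find` are hypothetical helper names.

```python
# Hex dump copied from the question, without the leading '<'.
HEX_DUMP = (
    "00000000 6f2d840e 31504159 2e535953 2e444446 3031a51b "
    "8801015f 2d02656e 9f110101 bf0c0cc5 0affff3f 00000003 ffff03"
)


def parse_tlv(data):
    """Parse BER-TLV bytes into a list of (tag, value) pairs.

    Constructed tags get a nested list of pairs as their value.
    """
    items, pos = [], 0
    while pos < len(data):
        if data[pos] == 0x00:  # skip zero padding between objects
            pos += 1
            continue
        # Tag: one byte, or several when the low five bits are all set.
        first = data[pos]
        tag = first
        pos += 1
        if first & 0x1F == 0x1F:
            while True:
                byte = data[pos]
                tag = (tag << 8) | byte
                pos += 1
                if not byte & 0x80:
                    break
        # Length: short form (< 0x80) or long form (0x8n, then n bytes).
        length = data[pos]
        pos += 1
        if length & 0x80:
            count = length & 0x7F
            length = int.from_bytes(data[pos:pos + count], "big")
            pos += count
        value = data[pos:pos + length]
        pos += length
        # Bit 0x20 of the first tag byte marks a constructed object.
        items.append((tag, parse_tlv(value) if first & 0x20 else value))
    return items


def find(items, tag):
    """Depth-first search for the first value stored under the given tag."""
    for current, value in items:
        if current == tag:
            return value
        if isinstance(value, list):
            found = find(value, tag)
            if found is not None:
                return found
    return None


tree = parse_tlv(bytes.fromhex(HEX_DUMP))
print(find(tree, 0x84))    # b'1PAY.SYS.DDF01'
print(find(tree, 0x5F2D))  # b'en'
```

Read this way, the value of tag 0x84 is the 14 bytes that follow its length byte 0x0e, and it ends where it does because of that length, not because 0xa5 happens to come next.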

What parameter parser libraries are there for C++?

- by Jim

I'd like to pass parameters to my C++ program in the following manner: ./myprog --setting=value Are there any libraries which will help me to do this easily? See also http://stackoverflow.com/questions/189972/argument-parsing-helpers-for-c-unix/191821

Parse all RSS items into a C# class

- by user285677

What's the best way of parsing the following RSS feed item into a C# class?

    <item>
      <fh:FlightHistory FlightHistoryId="189895136" >
        <fh:Airline AirlineCode="EI" Name="Aer Lingus" />
      </fh:FlightHistory>
    </item>
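The general pattern is to resolve the `fh:` prefix to its namespace URI, look up the namespaced elements, and copy their attributes into a typed object. The sketch below shows that pattern in Python rather than C#. The namespace URI is a placeholder, because the snippet in the question does not include the `xmlns:fh` declaration, and the `FlightHistory` class and its field names are invented for the example.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

# Hypothetical URI: the real feed declares xmlns:fh somewhere above <item>.
FH_NS = "http://example.com/flighthistory"

ITEM_XML = f"""
<item xmlns:fh="{FH_NS}">
  <fh:FlightHistory FlightHistoryId="189895136">
    <fh:Airline AirlineCode="EI" Name="Aer Lingus" />
  </fh:FlightHistory>
</item>
"""


@dataclass
class FlightHistory:
    flight_history_id: int
    airline_code: str
    airline_name: str


def parse_item(xml_text):
    """Map one <item> element onto a FlightHistory object."""
    ns = {"fh": FH_NS}
    item = ET.fromstring(xml_text)
    history = item.find("fh:FlightHistory", ns)
    airline = history.find("fh:Airline", ns)
    return FlightHistory(
        flight_history_id=int(history.get("FlightHistoryId")),
        airline_code=airline.get("AirlineCode"),
        airline_name=airline.get("Name"),
    )


print(parse_item(ITEM_XML))
```

For a whole feed, the same function would be applied to each item element found under the channel.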

Is there any faster way to parse than walking each byte?

- by uray

Is there any faster way to parse text than walking each byte of it? I wonder if there is any special CPU (x86/x64) instruction for string operations, used by string libraries, that could somehow be used to optimize a parsing routine. For example, an instruction for finding a token in a string that runs in hardware instead of looping over each byte until the token is found.

What is the best way to modify a few fields in an XML using Java

- by Kailas J C

I have a big XML document which contains around 300 elements. I need to modify 2 or 3 elements in this XML using Java. I don't want to go for conventional marshalling and unmarshalling, as it involves parsing the whole XML. How would XPath/XSLT manipulation compare? I know that I can easily read the data, but I need to modify it and put it back in the same XML. The primary concern here is performance. Kindly advise.
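The path-based approach can be sketched as: load the document into a tree, locate the few target elements by path, change their text, and serialize the tree again. The example below uses Python's `xml.etree.ElementTree` rather than Java, and the sample document, the paths, and the `update_fields` helper are all made up. Note that it still parses the whole document once; for roughly 300 elements that is usually cheap, but this sketch does not measure it.

```python
import xml.etree.ElementTree as ET

# Made-up sample document standing in for the real 300-element XML.
SAMPLE = (
    "<order><customer><name>Old</name></customer>"
    "<status>NEW</status></order>"
)


def update_fields(xml_text, changes):
    """Set the text of each element addressed by a path in `changes`."""
    root = ET.fromstring(xml_text)
    for path, new_text in changes.items():
        element = root.find(path)
        if element is None:
            raise KeyError(path)
        element.text = new_text
    return ET.tostring(root, encoding="unicode")


updated = update_fields(SAMPLE, {"customer/name": "New", "status": "SHIPPED"})
print(updated)
```

Only the addressed elements change; everything else is written back as it was parsed, without mapping the document onto a set of classes.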

