pdf parsing - Page 102 - Developer IT

Unable to Parse Date using NSDateFormatter

- by Ansari

Hi, I am fetching an RSS feed in which I receive the following date stamp:

    2010-05-10T06:11:14.000Z

I am using NSDateFormatter to parse this timestamp:

    [parseFormatter setDateFormat:@"yyyy-MM-dTH:m:s.z"];

But it's not working. If I remove the time part, it works for the date:

    [parseFormatter setDateFormat:@"yyyy-MM-d"];

But if I add the rest of the pattern, it returns nil. Any idea? Thanks in advance.
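One likely cause, sketched in Python rather than Objective-C: a format pattern has to account for every character of the input, and the pattern in the question leaves the literal T unquoted and does not describe the fractional seconds or the trailing Z. For NSDateFormatter the usual suggestion is a pattern along the lines of yyyy-MM-dd'T'HH:mm:ss.SSS'Z' with the formatter's time zone set to UTC; treat that as something to verify, not a confirmed fix. The same idea with Python's strptime, where the function name parse_rss_timestamp is made up for illustration:

```python
from datetime import datetime, timezone


def parse_rss_timestamp(stamp):
    """Parse a stamp such as 2010-05-10T06:11:14.000Z as a UTC datetime."""
    # 'T' and 'Z' are literal characters in the pattern; %f consumes the fractional seconds.
    parsed = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S.%fZ")
    # Assumption: the trailing Z means UTC, so the zone is attached explicitly.
    return parsed.replace(tzinfo=timezone.utc)


print(parse_rss_timestamp("2010-05-10T06:11:14.000Z").isoformat())
```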

Recognize Dates In A String

- by Tim Scott

I want a class something like this:

    public interface IDateRecognizer
    {
        DateTime[] Recognize(string s);
    }

The dates might exist anywhere in the string and might be in any format. For now, I could limit it to U.S. culture formats. The dates would not be delimited in any way. They might have arbitrary amounts of whitespace between parts of the date.

The ideas I have are: ANTLR, Regex, or hand rolled. I have never used ANTLR, so I would be learning from scratch. I wonder if there are libraries or code samples out there that do something similar that could jump start me. Is ANTLR too heavy for such a narrow use? I have used Regex a lot before, but I hate it for all the reasons that most people hate it. I could certainly hand roll it, but I'd rather not re-solve a solved problem. Suggestions?

UPDATE: Here is an example. Given this input:

    This is a date 11/3/63. Here is another one: November 03, 1963; and another one Nov 03, 63 and some more (11/03/1963). The dates could be in any U.S. format. They might have dashes like 11-2-1963 or weird extra whitespace inside like this: Nov 3, 1963, and even maybe the comma is missing like [Nov 3 63] but that's an edge case.

The output should be an array of seven DateTimes. Each date would be the same: 11/03/1963 00:00:00.
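A minimal regex-based sketch of the "Regex" option, written in Python instead of C#. It covers only the shapes shown in the example (numeric month/day/year with slashes or dashes, and a month name followed by day and year). The name recognize, the pattern names, and the rule that two-digit years mean 19xx are all assumptions made for illustration, not a general answer.

```python
import re
from datetime import date

MONTH_NAMES = ["jan", "feb", "mar", "apr", "may", "jun",
               "jul", "aug", "sep", "oct", "nov", "dec"]
MONTH_NUMBER = {name: index + 1 for index, name in enumerate(MONTH_NAMES)}

# 11/3/63, 11-2-1963, 11 / 03 / 1963
NUMERIC = r"(?P<m>\d{1,2})\s*[/-]\s*(?P<d>\d{1,2})\s*[/-]\s*(?P<y>\d{4}|\d{2})"
# November 03, 1963 / Nov 3 63 (a comma or whitespace must separate day and year)
NAMED = (r"(?P<mon>" + "|".join(MONTH_NAMES) + r")[a-z]*\.?\s+"
         r"(?P<d2>\d{1,2})(?:\s*,\s*|\s+)(?P<y2>\d{4}|\d{2})")
DATE_RE = re.compile(r"\b(?:" + NUMERIC + "|" + NAMED + r")\b", re.IGNORECASE)


def to_year(text):
    # Assumption: a two-digit year belongs to the 1900s, as in the example.
    return int(text) if len(text) == 4 else 1900 + int(text)


def recognize(s):
    """Return every date found in s, in order of appearance."""
    found = []
    for match in DATE_RE.finditer(s):
        groups = match.groupdict()
        if groups["m"] is not None:
            month, day, year = int(groups["m"]), int(groups["d"]), to_year(groups["y"])
        else:
            month = MONTH_NUMBER[groups["mon"].lower()]
            day, year = int(groups["d2"]), to_year(groups["y2"])
        try:
            found.append(date(year, month, day))
        except ValueError:
            pass  # digits that look like a date but are not one, e.g. 13/45/99
    return found
```

Note that 11-2-1963 in the sample input is November 2, so a literal reading yields six identical dates and one that differs by a day.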

Changing href attributes with Nokogiri and Ruby on Rails

- by fool

Hi, I have an HTML document with links, for example:

    <html>
    <body>
    <ul>
    <li><a href="http://someurl.com/etc/etc">teste1</a></li>
    <li><a href="http://someurl.com/etc/etc">teste2</a></li>
    <li><a href="http://someurl.com/etc/etc">teste3</a></li>
    </ul>
    </body>
    </html>

Using Ruby on Rails, with Nokogiri or some other method, I want to end up with a document like this:

    <html>
    <body>
    <ul>
    <li><a href="http://myproxy.com/?url=http://someurl.com/etc/etc">teste1</a></li>
    <li><a href="http://myproxy.com/?url=http://someurl.com/etc/etc">teste2</a></li>
    <li><a href="http://myproxy.com/?url=http://someurl.com/etc/etc">teste3</a></li>
    </ul>
    </body>
    </html>

What's the best strategy to achieve this?

Sanitize Content: removing markup from Amazon's content

- by StackOverflowNewbie

I'm using Amazon Web Services to get product descriptions of various items. The problem is that Amazon's content contains markup that is sometimes destructive to the layout of my web page (e.g. unclosed DIVs). I want to sanitize the content I get from Amazon. My solution would be to do the following (my initial list so far):

- Remove unnecessary tags such as div and span, while keeping tags like p, ul and ol.
- Remove all attributes from all the tags (e.g. there seem to be style attributes in some of the tags).
- Remove excess white space (e.g. multiple spaces, carriage returns, new lines, tabs).
- Etc.

Before I go off trying to build my solution, I'm wondering if anyone has a better idea (or an already existing solution). Thanks.
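The three steps listed in the question can be sketched with Python's standard html.parser module. The whitelist ALLOWED_TAGS and the names Sanitizer and sanitize are assumptions chosen for illustration. The sketch keeps whitelisted tags without their attributes, drops every other tag but keeps its text, and collapses whitespace. It does not balance unclosed tags that are on the whitelist, and the text inside script or style elements would pass through as plain text, so it is a starting point and not a security-grade sanitizer.

```python
import re
from html import escape
from html.parser import HTMLParser

# Hypothetical whitelist; adjust to the tags the page layout can tolerate.
ALLOWED_TAGS = {"p", "ul", "ol", "li", "b", "i", "em", "strong", "br"}


class Sanitizer(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # Keep the tag name only; every attribute (style, class, ...) is discarded.
        if tag in ALLOWED_TAGS:
            self.parts.append("<" + tag + ">")

    def handle_endtag(self, tag):
        if tag in ALLOWED_TAGS and tag != "br":
            self.parts.append("</" + tag + ">")

    def handle_data(self, data):
        # Re-escape text so stray < or & characters cannot form new markup.
        self.parts.append(escape(data, quote=False))


def sanitize(markup):
    parser = Sanitizer()
    parser.feed(markup)
    parser.close()
    # Collapse runs of spaces, tabs and line breaks into single spaces.
    return re.sub(r"\s+", " ", "".join(parser.parts)).strip()


print(sanitize('<div style="x"><p class="a">Hello   <span>world</span></p>\n\n<ul><li>one</li></ul>'))
```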

How would I use HTMLAgilityPack to extract the value I want

- by Nai

For the given HTML I want the value of id:

    <div class="name" id="john-5745844">
    <div class="name" id="james-6940673">

UPDATE: This is what I have at the moment:

    HtmlDocument htmlDoc = new HtmlDocument();
    htmlDoc.Load(new StringReader(pageResponse));
    HtmlNode root = htmlDoc.DocumentNode;
    List<string> anchorTags = new List<string>();
    foreach (HtmlNode div in root.SelectNodes("//div[@class='name' and @id]"))
    {
        HtmlAttribute att = div.Attributes["id"];
        Console.WriteLine(att.Value);
    }

The error I am getting is at the foreach line, stating: Object reference not set to an instance of an object.
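A possible explanation, to be verified against the library's documentation: HtmlAgilityPack's SelectNodes is commonly reported to return null instead of an empty collection when nothing matches, which would raise exactly this exception at the foreach. A null check before the loop, and a look at what pageResponse actually contains, are worth trying. The selection logic itself is sketched below in Python with the standard html.parser module; the class name NameIdCollector is made up for illustration.

```python
from html.parser import HTMLParser


class NameIdCollector(HTMLParser):
    """Collect the id of every <div class="name"> that has an id attribute."""

    def __init__(self):
        super().__init__()
        self.ids = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "div" and attributes.get("class") == "name" and "id" in attributes:
            self.ids.append(attributes["id"])


collector = NameIdCollector()
collector.feed('<div class="name" id="john-5745844"> <div class="name" id="james-6940673">')
print(collector.ids)
```

An empty result here is simply an empty list, so the loop over it never fails; that is the behaviour the null check restores on the C# side.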

Using Python's ConfigParser to read a file without section name

- by Arrieta

Hello: I am using ConfigParser to read the runtime configuration of a script. I would like to have the flexibility of not providing a section name (some scripts are simple enough that they don't need a 'section'). ConfigParser throws the NoSectionError exception and will not accept the file. How can I make ConfigParser simply retrieve the (key, value) tuples of a config file without section names? For instance:

    key1=val1
    key2:val2

I would rather not write to the config file.
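One common workaround, sketched for Python 3's configparser: read the file's text and prepend a dummy section header in memory before handing it to the parser, so the file on disk is never modified. The section name "root" and the function name read_sectionless are arbitrary choices made for this sketch.

```python
from configparser import ConfigParser


def read_sectionless(text, section="root"):
    """Parse key/value lines that have no [section] header."""
    parser = ConfigParser()
    # The header exists only in this in-memory string, not in the file.
    parser.read_string("[" + section + "]\n" + text)
    return parser.items(section)


print(read_sectionless("key1=val1\nkey2:val2\n"))
```

For a real file, the text would come from something like open(path).read(). Both = and : are accepted as delimiters by default, and keys are lowercased unless optionxform is overridden.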

How can I parse a C header file with Perl?

- by Alphaneo

Hi, I have a header file in which there is a large struct. I need to read this structure using some program, perform some operations on each member of the structure, and write them back. For example, I have a structure like:

    const BYTE Some_Idx[] = {
    4,7,10,15,17,19,24,29,
    31,32,35,45,49,51,52,54,
    55,58,60,64,65,66,67,69,
    70,72,76,77,81,82,83,85,
    88,93,94,95,97,99,102,103,
    105,106,113,115,122,124,125,126,
    129,131,137,139,140,149,151,152,
    153,155,158,159,160,163,165,169,
    174,175,181,182,183,189,190,193,
    197,201,204,206,208,210,211,212,
    213,214,215,217,218,219,220,223,
    225,228,230,234,236,237,240,241,
    242,247,249};

Now I need to read this, apply some operation to each member, and create a new structure with a different order, something like:

    const BYTE Some_Idx_Mod_mul_2[] = {
    8,14,20, ...
    ...
    484,494,498};

Is there any Perl library already available for this? If not Perl, something else like Python is also OK. Can somebody please help?
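Since Python is acceptable, here is a minimal sketch that handles exactly the shape shown above: one constant array of decimal integer literals. It is a regular expression, not a C parser, so comments, macros or hex literals inside the initializer would need more work. The names ARRAY_RE and rewrite_array are made up for illustration.

```python
import re

# Matches: const <type> <name>[] = { ... };
ARRAY_RE = re.compile(r"const\s+(\w+)\s+(\w+)\s*\[\s*\]\s*=\s*\{([^}]*)\}\s*;")


def rewrite_array(header_text, suffix, operation):
    """Apply operation to every element and emit a new array declaration."""
    match = ARRAY_RE.search(header_text)
    if match is None:
        raise ValueError("no constant array initializer found")
    c_type, name, body = match.groups()
    # Assumption: every element is a plain decimal literal.
    values = [operation(int(token)) for token in body.split(",") if token.strip()]
    joined = ", ".join(str(value) for value in values)
    return "const %s %s%s[] = { %s };" % (c_type, name, suffix, joined)


source = "const BYTE Some_Idx[] = { 4,7,10,\n 242,247,249};"
print(rewrite_array(source, "_Mod_mul_2", lambda value: value * 2))
```

If BYTE is an 8-bit type, results such as 484 or 498 will not fit, so the element type of the generated array may need to change as well.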

Why does this JSON.parse code not work?

- by SuZi

I am trying to pass JSON-encoded values from a PHP script to a JavaScript file, GnuBookTest.js, that instantiates a BookReader object and uses the values I passed in via the variable I named "result". The PHP script sends the values like this:

    <div id="bookreader">
    <div id="BookReader" style="left:10px; right:10px; top:30px; bottom:30px;">x</div>
    <script type="text/javascript">var result = {"istack":"zi94sm65\/BUCY\/BUCY200707170530PM","leafCount":"14","wArr":"[893,893,893,893,893,893,893,893,893,893,893,893,893,893]","hArr":"[1155,1155,1155,1155,1155,1155,1155,1155,1155,1155,1155,1155,1155,1155]","leafArr":"[0,1,2,3,4,5,6,7,8,9,10,11,12,13]","sd":"[\"RIGHT\",\"LEFT\",\"RIGHT\",\"LEFT\",\"RIGHT\",\"LEFT\",\"RIGHT\",\"LEFT\",\"RIGHT\",\"LEFT\",\"RIGHT\",\"LEFT\",\"RIGHT\",\"LEFT\"]"}</script>
    <script type="text/javascript" src="http://localhost:8080/application/js/GnuBookTest.js"></script>
    </div>
    </div>

and in the GnuBookTest.js file I am trying to use the values like this:

    br = new BookReader();

    // Return the width of a given page.
    br.getPageWidth = function(index) {
        return this.pageW[index];
    }

    // Return the height of a given page.
    br.getPageHeight = function(index) {
        return this.pageH[index];
    }

    br.pageW = JSON.parse(result.wArr);
    br.pageH = JSON.parse(result.hArr);
    br.leafMap = JSON.parse(result.leafArr);

    // istack is a URL fragment for the location of the image files
    var istack = result.istack;
    . . .

Using JSON.parse as written above loads the BookReader and uses my values correctly in a few web browsers (Firefox, IE8, and desktop Safari), but it does not work at all in Chrome on the Mac, mobile Safari, or older versions of IE. Mobile Safari keeps giving me a reference error: can't find variable: JSON. The other browsers just do not load the BookReader and show the "x" instead, as if they did not get the values from the PHP script. Where is the problem?
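Two observations, offered as things to check and not as a confirmed diagnosis. First, "can't find variable: JSON" suggests that browser has no native JSON object, so a fallback parser script such as json2.js would be needed there. Second, the payload above carries the arrays as strings ("[893,893,...]"), which is what happens when a value is JSON-encoded twice. If the arrays are encoded only once, the browser receives real arrays and JSON.parse is not needed at all. The effect is sketched below with Python's json module standing in for the PHP side; the variable names are illustrative.

```python
import json

widths = [893, 893, 893]

# Encoding the array first and then the whole object produces a string inside the object.
double_encoded = json.dumps({"leafCount": "3", "wArr": json.dumps(widths)})

# Encoding once keeps the array an array.
encoded_once = json.dumps({"leafCount": 3, "wArr": widths})

print(double_encoded)
print(encoded_once)
print(type(json.loads(double_encoded)["wArr"]).__name__)
print(type(json.loads(encoded_once)["wArr"]).__name__)
```

With the second form, the script could assign result.wArr to br.pageW directly.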

How do Relational Databases Work Under the Hood?

- by Pierreten

I've always been interested in how you can throw some SQL at a database and it nearly instantaneously returns your results in an orderly manner, without ever thinking about it as anything other than a black box. What is really going on? I'm pretty sure it has something to do with how values are laid out regularly in memory, similar to an array, but aside from that I don't know much. How is SQL parsed in a manner that facilitates all of this?

A database of questions with unambiguous numeric answers.

- by dreeves

My co-hackers and I are building a sort of trivia game inspired by this blog post: http://messymatters.com/calibration. The idea is to give confidence intervals and learn how to be calibrated (when you're "90% sure" you should be right 90% of the time). We're thus looking for, ideally, thousands of questions with unambiguous numerical answers. Also, they shouldn't be too boring. There are a lot of random statistics out there (e.g. enclosed water area in different countries) that would make the game mind-numbing. Things like release dates of classic movies are more interesting (to most people). Other interesting ones we've found include Olympic records, median incomes for different professions, dates of famous inventions, and celebrity ages. Scraping things like the above, by the way, was my reason for asking this question: http://stackoverflow.com/questions/2611418/scrape-html-tables So, if you know of other sources of interesting numerical facts (in a parsable form), I'm eager for pointers to them. Thanks!

java version of python-dateutil

- by elhefe

Python has a very handy package that can parse nearly any unambiguous date and provides helpful error messages on a parse failure, python-dateutil. Comparison to the SimpleDateFormat class is not favorable - AFAICT SimpleDateFormat can only handle one exact date format and the error messages have no granularity. I've looked through the Joda API but it appears Joda is the same way - only one explicit format can be parsed at a time. Is there any package or library that reproduces the python-dateutil behavior? Or am I missing something WRT Joda/SimpleDateFormat?
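One workaround that applies to any parser limited to a single explicit pattern is to try a list of candidate patterns in turn and collect the per-pattern errors for the final message. The sketch below shows that loop in Python with strptime; the same structure could be written around SimpleDateFormat. CANDIDATE_FORMATS and parse_any are hypothetical names, and this is far less flexible than python-dateutil's own parser.

```python
from datetime import datetime

# Hypothetical list; the order matters when an input could satisfy two patterns.
CANDIDATE_FORMATS = ["%Y-%m-%d", "%m/%d/%Y", "%d %B %Y", "%b %d, %Y"]


def parse_any(text, formats=CANDIDATE_FORMATS):
    """Return the first successful parse, or raise with every failure listed."""
    errors = []
    for fmt in formats:
        try:
            return datetime.strptime(text, fmt)
        except ValueError as exc:
            errors.append(fmt + ": " + str(exc))
    raise ValueError("no format matched " + repr(text) + "; tried " + "; ".join(errors))


print(parse_any("Nov 03, 1963"))
```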

Extracting a table row with a particular attribute, using HtmlAgilityPack

- by Soham

Consider this piece of code:

    <tr>
    <td valign=top class="tim_new"><a href="/stocks/company_info/pricechart.php?sc_did=MI42" class="tim_new">3M India</a></td>
    <td class="tim_new" valign=top><a href='/stocks/marketstats/indcomp.php?optex=NSE&indcode=Diversified' class=tim>Diversified</a></td>

I want to write a piece of code using HtmlAgilityPack which would extract the link in the first line.

How to use Wordpress' http.php in external projects?

- by NJTechGuy

I am trying to parse data from a pipe-delimited text file hosted on another server, which in turn will be inserted into a database. I guess my host (1and1) disabled allow_url_fopen in php.ini.

Error message:

    Warning: fopen() [function.fopen]: URL file-access is disabled in the server configuration in

Code:

    <?
    // make sure curl is installed
    if (function_exists('curl_init')) {
        // initialize a new curl resource
        $ch = curl_init();
        // set the url to fetch
        curl_setopt($ch, CURLOPT_URL, 'http://abc.com/data/output.txt');
        // don't give me the headers just the content
        curl_setopt($ch, CURLOPT_HEADER, 0);
        // return the value instead of printing the response to browser
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
        // use a user agent to mimic a browser
        curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.5) Gecko/20041107 Firefox/1.0');
        $content = curl_exec($ch);
        // remember to always close the session and free all resources
        curl_close($ch);
    } else {
        // curl library is not installed so we better use something else
    }
    //$contents = fread ($fd,filesize ($filename));
    //fclose ($fd);
    $delimiter = "|";
    $splitcontents = explode($delimiter, $contents);
    $counter = "";
    ?>
    <font color="blue" face="arial" size="4">Complete File Contents</font>
    <hr>
    <? echo $contents; ?>
    <br><br>
    <font color="blue" face="arial" size="4">Split File Contents</font>
    <hr>
    <?
    foreach ( $splitcontents as $color ) {
        $counter = $counter+1;
        echo "<b>Split $counter: </b> $colorn<br>";
    }
    ?>

WordPress has this cool http.php file. Is there a better way of doing it? If not, how do I use http.php for this task? Thank you guys.
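One detail in the posted code is worth checking before bringing in http.php: the curl result is stored in $content, while the later explode and echo calls read $contents, so the split would operate on an empty value even when the download succeeds. The splitting step itself is sketched below in Python. The assumption that the file holds one record per line is not stated in the question, and split_records is a made-up name.

```python
def split_records(text, delimiter="|"):
    """Split pipe-delimited text into one list of fields per non-empty line."""
    records = []
    for line in text.splitlines():
        if line.strip():
            records.append([field.strip() for field in line.split(delimiter)])
    return records


downloaded = "red|green|blue\n\ncyan|magenta\n"  # stand-in for the fetched file body
for number, fields in enumerate(split_records(downloaded), start=1):
    print(number, fields)
```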

Code editor with autocomplete

- by Andrey

I need to create a code editor for my own simple language:

    className.MethodName(parameterName = 2, ... )

I've created the appropriate grammar and auto-generated the parser using the ANTLR tool. Now I would like to have autocomplete for class, method, variable and parameter names. The list should be context dependent, e.g. for "class." it should display methods, and for "class.Method(" it should display parameters. I was going to parse the text and display the list depending on which node the cursor is in. The problem is that for incomplete code like "aaa.bbb(" the parser produces an error instead of a syntax tree. Any idea how to solve this problem? Maybe I'm on the wrong track and I shouldn't parse the code to display autocomplete?
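One alternative to a full parse, sketched in Python: for a language this small, the completion context can often be read from the text immediately before the cursor with a few patterns, which tolerates incomplete input such as "aaa.bbb(". The function completion_context and the three context labels are hypothetical. A grammar with error recovery is the more general route, and this sketch does not try to handle nested calls or the value side of "name = value".

```python
import re

CALL_RE = re.compile(r"(\w+)\.(\w+)\(([^()]*)$")   # inside an open argument list
MEMBER_RE = re.compile(r"(\w+)\.(\w*)$")           # just after "class."
WORD_RE = re.compile(r"(\w*)$")                    # a bare name being typed


def completion_context(text_before_cursor):
    """Classify what kind of name should be offered at the cursor."""
    match = CALL_RE.search(text_before_cursor)
    if match:
        typed = match.group(3).split(",")[-1].strip()
        return ("parameter", match.group(1), match.group(2), typed)
    match = MEMBER_RE.search(text_before_cursor)
    if match:
        return ("method", match.group(1), match.group(2))
    return ("class", WORD_RE.search(text_before_cursor).group(1))


print(completion_context("aaa.bbb("))
print(completion_context("aaa."))
```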

Python/YACC: Resolving a shift/reduce conflict

- by Rosarch

I'm using PLY. Here is one of my states from parser.out:

    state 3

        (5) course_data -> course .
        (6) course_data -> course . course_list_tail
        (3) or_phrase -> course . OR_CONJ COURSE_NUMBER
        (7) course_list_tail -> . , COURSE_NUMBER
        (8) course_list_tail -> . , COURSE_NUMBER course_list_tail

      ! shift/reduce conflict for OR_CONJ resolved as shift
        $end            reduce using rule 5 (course_data -> course .)
        OR_CONJ         shift and go to state 7
        ,               shift and go to state 8

      ! OR_CONJ         [ reduce using rule 5 (course_data -> course .) ]

        course_list_tail               shift and go to state 9

I want to resolve this as:

    if OR_CONJ is followed by COURSE_NUMBER:
        shift and go to state 7
    else:
        reduce using rule 5 (course_data -> course .)

How can I fix my parser file to reflect this? Do I need to handle a syntax error by backtracking and trying a different rule?

PHP Formatting Regex - BBCode

- by Wayne

To be honest, I suck at regex so much. I would use RegexBuddy, but I'm working on my Mac and sometimes it doesn't help much (for me). What I need is a function in PHP:

    function replaceTags($n) {
        $n = str_replace("[[", "<b>", $n);
        $n = str_replace("]]", "</b>", $n);
    }

This is a bad example, though, in case someone doesn't close the tag with ]] or [[. Anyway, could you help with a regex for:

    [[ ]] = Bold format
    ** ** = Italic format
    (( )) = h2 heading

Those are all I need, thanks :)

P.S. Is there any software like RegexBuddy available for Mac (Snow Leopard)?
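A minimal sketch of the three replacements, written in Python; the patterns themselves use only syntax that PCRE also accepts, so they should carry over to PHP's preg_replace. Each pattern requires both the opening and the closing marker, so an unclosed marker is left untouched, which addresses the weakness mentioned in the question. The names RULES and replace_tags are illustrative. Note also that the posted PHP function never returns $n.

```python
import re

# (pattern, replacement); .+? is lazy so each pair of markers is matched separately.
RULES = [
    (re.compile(r"\[\[(.+?)\]\]"), r"<b>\1</b>"),
    (re.compile(r"\*\*(.+?)\*\*"), r"<i>\1</i>"),
    (re.compile(r"\(\((.+?)\)\)"), r"<h2>\1</h2>"),
]


def replace_tags(text):
    for pattern, replacement in RULES:
        text = pattern.sub(replacement, text)
    return text


print(replace_tags("[[bold]] **italic** ((Heading)) [[unclosed"))
```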

Python: Is there a way to get HTML that was dynamically created by Javascript?

- by Joschua

As far as I can tell, this is the case for LyricWikia. The lyrics (example) can be accessed from the browser, but can't be found in the source code (which can be opened with CTRL + U in most browsers) or by reading the contents of the site with Python:

    from urllib.request import urlopen
    URL = 'http://lyrics.wikia.com/Billy_Joel:Piano_Man'
    r = urlopen(URL).read().decode('utf-8')

And the test:

    >>> 'Now John at the bar is a friend of mine' in r
    False
    >>> 'John' in r
    False

But when you select and look at the source code of the box in which the lyrics are displayed, you can see that there is:

    <div class="lyricbox">[...]</div>

Is there a way to get the contents of that div element with Python?

MalformedURLException with file URI

- by Paul Reiners

While executing the following code:

    doc = builder.parse(file);

where doc is an instance of org.w3c.dom.Document and builder is an instance of javax.xml.parsers.DocumentBuilder, I'm getting the following exception:

    Exception in thread "main" java.net.MalformedURLException: unknown protocol: c
        at java.net.URL.<init>(Unknown Source)
        at java.net.URL.<init>(Unknown Source)
        at java.net.URL.<init>(Unknown Source)
        at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.setupCurrentEntity(Unknown Source)
        at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.startEntity(Unknown Source)
        at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.startDTDEntity(Unknown Source)
        at com.sun.org.apache.xerces.internal.impl.XMLDTDScannerImpl.setInputSource(Unknown Source)
        at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$DTDDriver.dispatch(Unknown Source)
        at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$DTDDriver.next(Unknown Source)
        at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$PrologDriver.next(Unknown Source)
        at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(Unknown Source)
        at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
        at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
        at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
        at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(Unknown Source)
        at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(Unknown Source)
        at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(Unknown Source)
        at javax.xml.parsers.DocumentBuilder.parse(Unknown Source)
        at com.acme.ItemToThetaValues.createFiles(ItemToThetaValues.java:47)

It's choking on this line of the file:

    <!DOCTYPE questestinterop SYSTEM "C:\Program Files\Acme\parsers\acme_full.dtd">

I am not getting this error on my machine, while a user is getting it on his machine. We are both using version 6 of the Sun JRE. This error also occurs when he uses double backslashes in the path instead of single backslashes, and when he uses forward slashes instead of backslashes. First of all, is the XML correct? Is the path expressed correctly? Second of all, why is this error occurring on one computer but not on another?
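On the first question: the XML specification treats a system identifier as a URI reference, so a bare Windows path such as C:\... can be read as a URI whose scheme is "c", which is consistent with "unknown protocol: c". Writing the identifier as a file: URI is the usual remedy. The conversion is sketched below in Python; the function name windows_path_to_file_uri is made up, and the sketch does not explain why the two machines behave differently.

```python
from urllib.parse import quote


def windows_path_to_file_uri(path):
    """Turn an absolute drive-letter path into a file: URI."""
    forward = path.replace("\\", "/")
    # Keep '/' and the drive colon; percent-encode spaces and other reserved characters.
    return "file:///" + quote(forward, safe="/:")


print(windows_path_to_file_uri(r"C:\Program Files\Acme\parsers\acme_full.dtd"))
```

The DOCTYPE line would then carry the resulting file:///C:/Program%20Files/... value as its SYSTEM identifier.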

Why does double.Parse throw an error on the live server, and how can I track it?

- by Kovu

Hi, I built a website that:

- reads data from a website via HttpWebRequest
- sorts all the data
- parses values from the data and outputs them again

On my local server it works perfectly, but when I push it to my live server, double.Parse fails with an error. So:

- How can I track what double.Parse is trying to parse?
- How can I debug the live server?

The language is ASP.NET / C# (.NET 2.0).

Lexing partial SQL in C#

- by Chris T

I need to parse partial SQL queries (it's for a SQL injection auditing tool). For example:

    '1' AND 1=1--

should break down into tokens like:

    [0] => [SQL_STRING, '1']
    [1] => [SQL_AND]
    [2] => [SQL_INT, 1]
    [3] => [SQL_AND]
    [4] => [SQL_INT, 1]
    [5] => [SQL_COMMENT]
    [6] => [SQL_QUERY_END]

Are there any lexers for SQL that I could base mine on, or any good tools like bison for C#? (I'd rather not write my own grammar, as I need to support most if not all of the grammar of MySQL 5.)
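A minimal regex-driven lexer for fragments like the one above, sketched in Python and not in C#. The token names follow the question where possible; SQL_OP, SQL_OR, SQL_IDENT and SQL_UNKNOWN are additions made for this sketch, and the rules cover only a small part of MySQL's lexical grammar. Because every unrecognized character becomes an SQL_UNKNOWN token, the lexer never fails on partial or malformed input, which suits an auditing tool.

```python
import re

# Order matters: earlier rules win when two could match at the same position.
TOKEN_SPEC = [
    ("SQL_COMMENT", r"--[^\n]*|#[^\n]*|/\*.*?\*/"),
    ("SQL_STRING", r"'(?:[^'\\]|\\.|'')*'"),
    ("SQL_INT", r"\d+"),
    ("SQL_AND", r"(?i:and)\b"),
    ("SQL_OR", r"(?i:or)\b"),
    ("SQL_OP", r"[=<>!]+"),
    ("SQL_IDENT", r"[A-Za-z_]\w*"),
    ("SKIP", r"\s+"),
    ("SQL_UNKNOWN", r"."),
]
TOKEN_RE = re.compile("|".join("(?P<%s>%s)" % pair for pair in TOKEN_SPEC), re.DOTALL)


def tokenize(fragment):
    """Return (kind, text) pairs, ending with an SQL_QUERY_END marker."""
    tokens = []
    for match in TOKEN_RE.finditer(fragment):
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
    tokens.append(("SQL_QUERY_END", ""))
    return tokens


for token in tokenize("'1' AND 1=1--"):
    print(token)
```

Unlike the list in the question, this version emits a separate token for the = sign between the two integers.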

How to display images in a grid view, reading imageUrl from XML using a SAX parser on Android

- by Pramod kuamr

Thanks for the answer. I am able to read the XML file from a URL, but I need to show the image in a grid view whenever an imageUrl is present in the XML. This is my XML file, read from the URL:

    <?xml version="1.0" encoding="UTF-8"?>
    <channels>
      <channel>
        <name>ndtv</name>
        <logo>http://a3.twimg.com/profile_images/670625317/aam-logo--twitter.png</logo>
        <description>this is a news Channel</description>
        <rssfeed>ndtv.com</rssfeed>
      </channel>
      <channel>
        <name>star news</name>
        <logo>http://a3.twimg.com/profile_images/740897825/AndroidCast-350_normal.png</logo>
        <description>this is a newsChannel</description>
        <rssfeed>starnews.com</rssfeed>
      </channel>
    </channels>
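The SAX side of this can be sketched with Python's xml.sax, whose startElement, characters and endElement callbacks have the same shape as the Java ones. The handler below collects one dictionary per channel, so the logo URL ends up next to the channel name; on Android, each logo URL would then be handed to whatever adapter fills the grid, which is outside this sketch. ChannelHandler and the shortened sample document are assumptions made for illustration.

```python
import xml.sax


class ChannelHandler(xml.sax.ContentHandler):
    """Collect the child elements of every <channel> into a dict."""

    def __init__(self):
        super().__init__()
        self.channels = []
        self._current = None
        self._text = []

    def startElement(self, name, attrs):
        if name == "channel":
            self._current = {}
        self._text = []

    def characters(self, content):
        # SAX may deliver one text node in several pieces, so accumulate them.
        self._text.append(content)

    def endElement(self, name):
        if name == "channel":
            self.channels.append(self._current)
            self._current = None
        elif self._current is not None:
            self._current[name] = "".join(self._text).strip()
        self._text = []


SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<channels>
  <channel><name>ndtv</name><logo>http://a3.twimg.com/profile_images/670625317/aam-logo--twitter.png</logo></channel>
  <channel><name>star news</name><logo>http://a3.twimg.com/profile_images/740897825/AndroidCast-350_normal.png</logo></channel>
</channels>"""

handler = ChannelHandler()
xml.sax.parseString(SAMPLE, handler)
for channel in handler.channels:
    print(channel["name"], channel["logo"])
```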

ICalendar parser in PHP that supports timezones

- by Vincent Robert

I am looking for a PHP class that can parse an iCalendar (ICS) file and correctly handle timezones. I already created an ICS parser myself, but it can only handle timezones known to PHP (like 'Europe/Paris'). Unfortunately, ICS files generated by Evolution (the default calendar software of Ubuntu) do not use default timezone IDs. Evolution exports events with its own specific timezone ID and also exports the full definition of the timezone: daylight saving dates, recurrence rules and all the hard-to-understand stuff about timezones. This is too much for me. Since it was only a small utility for my girlfriend, I won't have time to investigate the iCalendar specification further and create a full-blown iCalendar parser myself. So is there any known PHP implementation of the iCalendar file format that can parse timezone definitions?

How do I extract a substring from a string until the second space is encountered?

- by gbprithvi

I have a string like this: "o1 1232.5467 1232.5467 1232.5467 1232.5467 1232.5467 1232.5467". How do I extract only "o1 1232.5467"? The number of characters to be extracted is not always the same, hence I want to extract until the second space is encountered.
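A minimal sketch in Python, since the question does not name a language: split on spaces at most twice, then join the first two pieces again. The function name up_to_second_space is made up. If the string contains fewer than two spaces, the whole string comes back unchanged.

```python
def up_to_second_space(text):
    """Return everything before the second space in text."""
    # maxsplit=2 stops after two splits, so the rest of the string is never scanned piecewise.
    return " ".join(text.split(" ", 2)[:2])


line = "o1 1232.5467 1232.5467 1232.5467 1232.5467 1232.5467 1232.5467"
print(up_to_second_space(line))
```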

How do I get Bison/YACC to not recognize a command until it parses the whole string?

- by chucknelson

I have some bison grammar:

    input: /* empty */
         | input command
    ;

    command: builtin
           | external
    ;

    builtin: CD { printf("Changing to home directory...\n"); }
           | CD WORD { printf("Changing to directroy %s\n", $2); }
    ;

I'm wondering how I get Bison to not accept (YYACCEPT?) something as a command until it reads ALL of the input, so that I can have all these rules below that use recursion or whatever to build things up, resulting either in a valid command or in something that's not going to work.

One simple test I'm doing with the code above is just entering "cd mydir mydir". Bison parses CD and WORD and goes "hey! this is a command, put it to the top!". Then the next token it finds is just WORD, which has no rule, and it reports an error. I want it to read the whole line, realize CD WORD WORD is not a rule, and then report an error. I think I'm missing something obvious and would greatly appreciate any help. Thanks!

Also, I've tried using input command NEWLINE or something similar, but it still pushes CD WORD to the top as a command and then parses the extra WORD separately.

how to detect an escape sequence in a string

- by mix

Given a string named line whose raw version has this value:

    \rRAWSTRING

how can I detect if it has the escape character \r? What I've tried is:

    if repr(line).startswith('\r'): blah...

but it doesn't catch it. I also tried find, such as:

    if repr(line).find('\r') != -1: blah

which doesn't work either. What am I missing? Thanks!

EDIT: Thanks for all the replies and the corrections regarding terminology, and sorry for the confusion. OK, if I do this:

    print repr(line)

then what it prints is:

    '\rSET ENABLE ACK\n'

(including the single quotes). I have tried all the suggestions, including:

    line.startswith(r'\r')
    line.startswith('\\r')

each of which returns False. I also tried:

    line.find(r'\r')
    line.find('\\r')

each of which returns -1.
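The repr output in the edit suggests that line begins with a real carriage-return character, not with a backslash followed by the letter r. If so, the check belongs on line itself with the ordinary literal '\r': repr(line) is a different string that starts with a quote character, and r'\r' or '\\r' look for a literal backslash that is not in the data. A short demonstration, written in Python 3 syntax and assuming line holds exactly the value shown:

```python
line = '\rSET ENABLE ACK\n'  # assumption: the value whose repr is shown in the question

# The carriage return is the first character of the string itself.
print(line.startswith('\r'))

# repr(line) starts with a quote, then a backslash, then the letter r.
print(repr(line)[:3])
print(repr(line).startswith('\r'))

# A raw literal is two characters, backslash and r, which the data does not contain.
print(len(r'\r'), line.find(r'\r'))
```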

Search Results

Search found 7251 results on 291 pages for 'pdf parsing'.
