Search Results

Search found 17966 results on 719 pages for 'xml parsing'.

Page 156/719 | < Previous Page | 152 153 154 155 156 157 158 159 160 161 162 163  | Next Page >

  • SAX parser does not resolve filename

    - by phantom-99w
    Another day, another strange error with SAX, Java, and friends. I need to iterate over a list of File objects and pass them to a SAX parser. However, the parser fails because of an IOException. However, the various File object methods confirm that the file does indeed exist. The output which I get: 11:53:57.838 [MainThread] DEBUG DefaultReactionFinder - C:\project\trunk\application\config\reactions\TestReactions.xml 11:53:57.841 [MainThread] ERROR DefaultReactionFinder - C:\project\trunk\application\config\reactions\null (The system cannot find the file specified) So the problem is obviously that null in the second line. I've tried nearly all variations of passing the file as a parameter to the parser, including as a String (both from getAbsolutePath() and entered by hand), as a URI and, even more weirdly, as a FileInputStream (for this I get the same error, except that the entire relative path gets reported as null, so C:\project\trunk\null). All that I can think of is that the SAXParserFactory is incorrectly configured. I have no idea what is wrong, though. Here is the code concerned: SAXParserFactory factory = SAXParserFactory.newInstance(); factory.setValidating(true); try { parser = factory.newSAXParser(); } catch (ParserConfigurationException e) { throw new InstantiationException("Error configuring an XML parser. Given error message: \"" + e.getMessage() + "\"."); } catch (SAXException e) { throw new InstantiationException("Error creating a SAX parser. Given error message: \"" + e.getMessage() + "\"."); } ... for (File f : fileLister.getFileList()) { logger.debug(f.getAbsolutePath()); try { parser.parse(f, new ReactionHandler(input)); //FileInputStream fs = new FileInputStream(f); //parser.parse(fs, new ReactionHandler(input)); //fs.close(); } catch (IOException e) { logger.error(e.getMessage()); throw new ReactionNotFoundException("An error occurred processing file \"" + f + "\"."); } ... } I have made no special provisions to provide a custom SAX parser implementation: I use the system default. Any help would be greatly appreciated!

    Read the article

  • Iterating through json object doesn't seem to work for me...

    - by Pandiya Chendur
    From a previous question on Stackoverflow Iterating through/Parsing JSON Object via JavaScript.... My json object doesn't seem get parsed.... here is my function function Iteratejsondata(HfJsonValue) { var jsonObj = eval('(' + HfJsonValue + ')'); for (var i = 0, len = HfJsonValue.length; i < len; ++i) { var employee = HfJsonValue[i]; document.write(employee.Emp_Name); } } employee.Emp_Name is undefined but when i give document.write(employee); i get this {"Table" : [{"Emp_Id" : "3","Identity_No" : "","Emp_Name" : "Jerome","Address" : "Madurai","Date_Of_Birth" : "","Desig_Name" : "Supervisior","Desig_Description" : "Supervisior of the Construction","SalaryBasis" : "Monthly","FixedSalary" : "25000.00"},{"Emp_Id" : "4","Identity_No" : "","Emp_Name" : "Mohan","Address" : "Madurai","Date_Of_Birth" : "","Desig_Name" : "Acc ","Desig_Description" : "Accountant","SalaryBasis" : "Monthly","FixedSalary" : "200.00"},{"Emp_Id" : "5","Identity_No" : "","Emp_Name" : "Murugan","Address" : "Madurai","Date_Of_Birth" : "","Desig_Name" : "Mason","Desig_Description" : "Mason","SalaryBasis" : "Weekly","FixedSalary" : "150.00"},{"Emp_Id" : "6","Identity_No" : "","Emp_Name" : "Ram","Address" : "Madurai","Date_Of_Birth" : "","Desig_Name" : "Mason","Desig_Description" : "Mason","SalaryBasis" : "Weekly","FixedSalary" : "120.00"},{"Emp_Id" : "7","Identity_No" : "","Emp_Name" : "Raja","Address" : "Madurai","Date_Of_Birth" : "","Desig_Name" : "Mason","Desig_Description" : "Mason","SalaryBasis" : "Weekly","FixedSalary" : "135.00"},{"Emp_Id" : "8","Identity_No" : "","Emp_Name" : "Raja kumar","Address" : "Madurai","Date_Of_Birth" : "","Desig_Name" : "Mason Helper","Desig_Description" : "Mason Helper","SalaryBasis" : "Weekly","FixedSalary" : "105.00"},{"Emp_Id" : "9","Identity_No" : "","Emp_Name" : "Lakshmi","Address" : "Madurai","Date_Of_Birth" : "","Desig_Name" : "Mason Helper","Desig_Description" : "Mason Helper","SalaryBasis" : "Weekly","FixedSalary" : "100.00"},{"Emp_Id" : "10","Identity_No" : "","Emp_Name" : "Palani","Address" : "Madurai","Date_Of_Birth" : "","Desig_Name" : "Carpenter","Desig_Description" : "Carpenter","SalaryBasis" : "Weekly","FixedSalary" : "200.00"},{"Emp_Id" : "11","Identity_No" : "","Emp_Name" : "Annamalai","Address" : "Madurai","Date_Of_Birth" : "","Desig_Name" : "Carpenter","Desig_Description" : "Carpenter","SalaryBasis" : "Weekly","FixedSalary" : "220.00"},{"Emp_Id" : "12","Identity_No" : "","Emp_Name" : "David","Address" : "Madurai","Date_Of_Birth" : "","Desig_Name" : "Steel Fixer","Desig_Description" : "Steel Fixer","SalaryBasis" : "Weekly","FixedSalary" : "220.00"},{"Emp_Id" : "13","Identity_No" : "","Emp_Name" : "Chandru","Address" : "Madurai","Date_Of_Birth" : "","Desig_Name" : "Steel Fixer","Desig_Description" : "Steel Fixer","SalaryBasis" : "Weekly","FixedSalary" : "220.00"},{"Emp_Id" : "14","Identity_No" : "","Emp_Name" : "Mani","Address" : "Madurai","Date_Of_Birth" : "","Desig_Name" : "Steel Helper","Desig_Description" : "Steel Helper","SalaryBasis" : "Weekly","FixedSalary" : "175.00"},{"Emp_Id" : "15","Identity_No" : "","Emp_Name" : "Karthik","Address" : "Madurai","Date_Of_Birth" : "","Desig_Name" : "Wood Fixer","Desig_Description" : "Wood Fixer","SalaryBasis" : "Weekly","FixedSalary" : "195.00"},{"Emp_Id" : "16","Identity_No" : "","Emp_Name" : "Bala","Address" : "Madurai","Date_Of_Birth" : "","Desig_Name" : "Wood Fixer","Desig_Description" : "Wood Fixer","SalaryBasis" : "Weekly","FixedSalary" : "185.00"},{"Emp_Id" : "17","Identity_No" : "","Emp_Name" : "Tamil arasi","Address" : "Madurai","Date_Of_Birth" : "","Desig_Name" : "Wood Helper","Desig_Description" : "Wood Helper","SalaryBasis" : "Weekly","FixedSalary" : "185.00"},{"Emp_Id" : "18","Identity_No" : "","Emp_Name" : "Perumal","Address" : "Madurai","Date_Of_Birth" : "","Desig_Name" : "Cook","Desig_Description" : "Cook","SalaryBasis" : "Weekly","FixedSalary" : "105.00"},{"Emp_Id" : "19","Identity_No" : "","Emp_Name" : "Andiappan","Address" : "Madurai","Date_Of_Birth" : "","Desig_Name" : "Watchman","Desig_Description" : "Watchman","SalaryBasis" : "Weekly","FixedSalary" : "150.00"}]} Any suggestion how to get this done...

    Read the article

  • How can I use Convert.ChangeType to convert string into numerics with group separator?

    - by Loic
    Hello, I want to make a generic string to numeric converter, and provide it as a string extension, so I wrote the following code: public static bool TryParse<T>( this string text, out T result, IFormatProvider formatProvider ) where T : struct try { result = (T)Convert.ChangeType( text, typeof( T ), formatProvider ); return true; } catch(... I call it like this: int value; var ok = "123".TryParse(out value, NumberFormatInfo.CurrentInfo) It works fine until I want to use a group separator: As I live in France, where the thousand separator is a space and the decimal separator is a comma, the string "1 234 567,89" should be equals to 1234567.89 (in Invariant culture). But, the function crashes! When a try to perform a non generic conversion, like double.Parse(...), I can use an overload which accepts a NumberStyles parameter. I specify NumberStyles.Number and this time it works! So, the questions are : Why the parsing does not respect my NumberFormatInfo (where the NumberGroupSeparator is well specified to a space, as I specified in my OS) How could I make work the generic version with Convert.ChangeTime, as it has no overload wich accepts a NumberStyles parameter ?

    Read the article

  • Extract known pattern substring from NSString (without regex)

    - by d11wtq
    I'm really tempted to drop RegexKit (or my own libpcre wrapper) into my project in order to do this, but before I do that I want to know how Cocoa developers manage to do half of this basic stuff without really convoluted code or without linking with RegexKit or another regular expression library. I find it gobsmacking that Cocoa does not include any regular expression matching features. I've so accustomed to using regular expressions for all kinds of things that I'm lost without them. I can do what I need without them, but the code would be rather convoluted. So, Cocoa devs, I ask you, what's the "Cocoa way" to do this... The problem is an everyday problem in programming as far as I'm concerned. Cocoa must have ways of doing this with the built-in features. Note that the position of the elements I want to match changes, and sometimes "quotes" are present. Whitespace is variable. Take the following strings: Content-Type: application/xml; charset=utf-8 Content-Type: text/html; charset="iso-8859-1" Content-Type: text/plain; charset=us-ascii Content-Type: text/plain; name="example.txt"; charset=utf-8 From all of these strings, how would you go about determining the mime type (e.g. text/plain) and the charset (e.g. utf-8) using just the built-in Cocoa classes? I'd end up performing a series of -rangeOfString: and substring calls, with conditional checks to deal with the optional quotes etc. Is there a way to do this with NSScanner? The NSScanner class seems to have a pretty naive API to me. Something like C's sscanf() that works for NSString objects would be an ideal fit. Most of my string parsing needs are simple such as this example so maybe regular expressions, while I'm accustomed to them, are overkill?

    Read the article

  • Is there any open source tool that automatically 'detects' email threading like Gmail?

    - by Chris W.
    For instance, if the original message (message 1) is... Hey Jon, Want to go get some pizza? -Bill And the reply (message 2) is... Bill, Sorry, I can't make lunch today. Jonathon Parks, CTO Acme Systems On Wed, Feb 24, 2010 at 4:43 PM, Bill Waters wrote: Hey John, Want to go get some pizza? -Bill In Gmail, the system (a) detects that message 2 is a reply to message 1 and turns this into a 'thread' of sorts and (b) detects where the replied portion of the message actually is and hides it from the user. (In this case the hidden portion would start at "On Wed, Feb..." and continue to the end of the message.) Obviously, in this simple example it would be easy to detect the "On <Date, <Name wrote:" or the "" character prefixes. But many email systems have many different style of marking replies (not to mention HTML emails). I get the feeling that you would have to have some damn smart string parsing algorithms to get anywhere near how good GMail's is. Does this technology already exist in an open source project somewhere? Either in some library devoted to this exclusively or perhaps in some open source email client that does similar message threading? Thanks.

    Read the article

  • Extracting images from a PDF

    - by sagar
    My Query I want to extract only images from a PDF document, using Objective-C in an iPhone Application. My Efforts I have gone through the info on this link, which has details regarding different operators on PDF documents. I also studied this document from Apple about PDF parsing with Quartz. I also went through the entire PDF reference document from the Adobe site. According to that document, for each image there are the following operators: q Q BI EI I have created a table to get the image: myTable = CGPDFOperatorTableCreate(); CGPDFOperatorTableSetCallback(myTable, "q", arrayCallback2); CGPDFOperatorTableSetCallback(myTable, "TJ", arrayCallback); CGPDFOperatorTableSetCallback(myTable, "Tj", stringCallback); I use this method to get the image: void arrayCallback2(CGPDFScannerRef inScanner, void *userInfo) { // THIS DOESN'T WORK // CGPDFStreamRef stream; // represents a sequence of bytes // if (CGPDFDictionaryGetStream (d, "BI", &stream)){ // CGPDFDataFormat t=CGPDFDataFormatJPEG2000; // CFDataRef data = CGPDFStreamCopyData (stream, &t); // } } This method is called for the operator "q", but I don't know how to extract an image from it. What should be the solution for extracting the images from the PDF documents? Thanks in advance for your kind help.

    Read the article

  • How to parse text fragments located outside tags (inbetween tags) by simplehtmldom?

    - by moogeek
    Hello! I'm using simplehtmldom to parse html and I'm stuck in parsing plaintext located outside of any tag (but between two different tags): <div class="text_small"> <b>?dress:</b> 7 Hange Road<br> <b>Phone:</b> 415641587484<br> <b>Contact:</b> Alex<br> <b>Meeting Time:</b> 12:00-13:00<br> </div> Is it possible to get these values of Adress, Phone, Contact, Meeting Time? I wonder if there is a opportunity to pass CSS Selectors into nextSibling/previousSibling functions... foreach($html->find('div.text_small') as $div_descr) { foreach($div_descr->find('b') as $b) { if ($b->innertext=="?dress:") {//someaction } if ($b->innertext=="Phone:") { //someaction } if ($b->innertext=="Contact:") { //someaction } if ($b->innertext=="Meeting Time:") { //someaction } } } What I should use instead "someaction" ? upd. Yes, I don't have an access for editing the target page. Otherwise, would it be worth to? :)

    Read the article

  • Can't read some attributes with SAX

    - by akappa
    Hi all, I'm trying to parse that document with SAX: <scxml version="1.0" initialstate="start" name="calc"> <datamodel> <data id="expr" expr="0" /> <data id="res" expr="0" /> </datamodel> <state id="start"> <transition event="OPER" target="opEntered" /> <transition event="DIGIT" target="operand" /> </state> <state id="operand"> <transition event="OPER" target="opEntered" /> <transition event="DIGIT" /> </state> </scxml> I read all the attributes well, except "initialstate" and "name"... I get the attributes with the startElement handler, but the size of the attribute list for scxml is zero. Why? How I can overcome that problem? Edit: public void startElement(String uri, String localName, String qName, Attributes attributes){ System.out.println(attributes.getValue("initialstate")); System.out.println(attributes.getValue("name")); } that, when parsing the first tag, doesn't work (prints "null" two times). In fact, attributes.getLength(); evaluates to zero. Thanks

    Read the article

  • Why does 12:20 PM parse to 0:20 on the next day?

    - by Hanno Fietz
    I'm using java.text.SimpleDateFormat to parse string representations of date/time values inside an XML document. I'm seeing all times that have an hour value of 12 shifted by 12 hours into the future, i. e. 20 minutes past noon gets parsed to mean 20 minutes past midnight the following day. I wrote a unit test which seems to confirm that the error is made upon parsing (I checked the return values from getTime() with the linux shell command date). Now I'm wondering: is there a bug in the parse() method? is there something wrong with the input string? am I using the wrong format string for the input? The input data is taken from Yahoo's YWeather service. Here's the test and its output: public class YWeatherReaderTest { public static final String[] rgDateSamples = { "Thu, 08 Apr 2010 12:20 PM CEST", "Thu, 08 Apr 2010 12:20 AM CEST" }; public void dateParsing() throws ParseException { DateFormat formatter = new SimpleDateFormat("EEE, dd MMM yyyy K:m a z", Locale.US); for (String dtsSrc : YWeatherReaderTest.rgDateSamples) { Date dt = formatter.parse(dtsSrc); String dtsDst = formatter.format(dt); System.out.println(dtsSrc); System.out.println(dtsDst); System.out.println(); } } } Thu, 08 Apr 2010 12:20 PM CEST Fri, 09 Apr 2010 0:20 AM CEST Thu, 08 Apr 2010 12:20 AM CEST Thu, 08 Apr 2010 0:20 PM CEST The second output line of the second iteration is slightly weird, because 00:20 isn't PM. The milliseconds value of the Date object, however, corresponds to the (wrong) time of 20 minutes past noon.

    Read the article

  • Can Haskell's Parsec library be used to implement a recursive descent parser with backup?

    - by Thor Thurn
    I've been considering using Haskell's Parsec parsing library to parse a subset of Java as a recursive descent parser as an alternative to more traditional parser-generator solutions like Happy. Parsec seems very easy to use, and parse speed is definitely not a factor for me. I'm wondering, though, if it's possible to implement "backup" with Parsec, a technique which finds the correct production to use by trying each one in turn. For a simple example, consider the very start of the JLS Java grammar: Literal: IntegerLiteral FloatingPointLiteral I'd like a way to not have to figure out how I should order these two rules to get the parse to succeed. As it stands, a naive implementation like this: literal = do { x <- try (do { v <- integer; return (IntLiteral v)}) <|> (do { v <- float; return (FPLiteral v)}); return(Literal x) } Will not work... inputs like "15.2" will cause the integer parser to succeed first, and then the whole thing will choke on the "." symbol. In this case, of course, it's obvious that you can solve the problem by re-ordering the two productions. In the general case, though, finding things like this is going to be a nightmare, and it's very likely that I'll miss some cases. Ideally, I'd like a way to have Parsec figure out stuff like this for me. Is this possible, or am I simply trying to do too much with the library? The Parsec documentation claims that it can "parse context-sensitive, infinite look-ahead grammars", so it seems like something like I should be able to do something here.

    Read the article

  • How to make a small engine like Wolfram|Alpha?

    - by Koning WWWWWWWWWWWWWWWWWWWWWWW
    Lets say I have three models/tables: operating_systems, words, and programming_languages: # operating_systems name:string created_by:string family:string Windows Microsoft MS-DOS Mac OS X Apple UNIX Linux Linus Torvalds UNIX UNIX AT&T UNIX # words word:string defenitions:string window (serialized hash of defenitions) hello (serialized hash of defenitions) UNIX (serialized hash of defenitions) # programming_languages name:string created_by:string example_code:text C++ Bjarne Stroustrup #include <iostream> etc... HelloWorld Jeff Skeet h AnotherOne Jon Atwood imports 'SORULEZ.cs' etc... When a user searches hello, the system shows the defenitions of 'hello'. This is relatively easy to implement. However, when a user searches UNIX, the engine must choose: word or operating_system. Also, when a user searches windows (small letter 'w'), the engine chooses word, but should also show Assuming 'windows' is a word. Use as an <a href="etc..">operating system</a> instead. Can anyone point me in the right direction with parsing and choosing the topic of the search query? Thanks. Note: it doesn't need to be able to perform calculations as WA can do.

    Read the article

  • How could I parse this HTML file?

    - by Sergio Tapia
    <div id="main"> <style type="text/css"> </style> <script language="JavaScript"> </script> <p style="margin: 0pt 0pt 0.5em;"><b>Media from&nbsp;<a onclick="(new Image()).src='/rg/find-media-title/media_strip/images/b.gif?link=/title/tt0087538/';" href="/title/tt0087538/">The Karate Kid</a> (1984)</b></p> <style type="text/css"> </style> <table style="border-collapse: collapse;"> </table> </div> I need to somehow extract the href value of the (new Image()). How exactly would I accomplish this with HtmlAgilityPack? I'm new to it, and so far I haven't found a useful tutorial on how to effectively use it for parsing. Thanks for the help!

    Read the article

  • BeautifulSoup can't parse a webpage?

    - by JLTChiu
    I am using beautiful soup for parsing webpage now, I've heard it's very famous and good, but it doesn't seems works properly. Here's what I did import urllib2 from bs4 import BeautifulSoup page = urllib2.urlopen("http://www.cnn.com/2012/10/14/us/skydiver-record-attempt/index.html?hpt=hp_t1") soup = BeautifulSoup(page) print soup.prettify() I think this is kind of straightforward. I open the webpage and pass it to the beautifulsoup. But here's what I got: Warning (from warnings module): File "C:\Python27\lib\site-packages\bs4\builder\_htmlparser.py", line 149 "Python's built-in HTMLParser cannot parse the given document. This is not a bug in Beautiful Soup. The best solution is to install an external parser (lxml or html5lib), and use Beautiful Soup with that parser. See http://www.crummy.com/software/BeautifulSoup/bs4/doc/#installing-a-parser for help.")) ... HTMLParseError: bad end tag: u'</"+"script>', at line 634, column 94 I thought CNN website should be well designed, so I am not very sure what's going on though. Does anyone has idea about this?

    Read the article

  • Using JavaCC to infer semantics from a Composite tree

    - by Skice
    Hi all, I am programming (in Java) a very limited symbolic calculus library that manages polynomials, exponentials and expolinomials (sums of elements like "x^n * e^(c x)"). I want the library to be extensible in the sense of new analytic forms (trigonometric, etc.) or new kinds of operations (logarithm, domain transformations, etc.), so a Composite pattern that represent the syntactic structure of an expression, together with a bunch of Visitors for the operations, does the job quite well. My problem arise when I try to implement operations that depends on the semantics more than on the syntax of the Expression (like integrals, for instance: there are a lot of resolution methods for specific classes of functions, but these same classes can be represented with more than a single syntax). So I thought I need something to "parse" the Composite tree to infer its semantics in order to invoke the right integration method (if any). Someone pointed me to JavaCC, but all the examples I've seen deal only with string parsing; so, I don't know if I'm digging in the right direction. Some suggestions? (I hope to have been clear enough!)

    Read the article

  • Best and simple way to handle JSON in Django

    - by primal
    Hi, As part of the application we are developing (with android client and Django server) a json object which contains user name and pass word is sent to server from android client as follows HttpPost post = new HttpPost(URL); /*Adding key value pairs */ json.put("username", un); json.put("password", pwd); StringEntity se = new StringEntity(json.toString()); post.setEntity(se); response = client.execute(post); The response is parsed like this result = responsetoString(response.getEntity().getContent()); //Converts response to String jObject = new JSONObject(result); JSONObject post = jObject.getJSONObject("post"); username = post.getString("username"); message = post.getString("message"); Hope upto this everything is fine. The problem comes when parsing or sending JSON responses in Django server. Whats the best way to do this? We tried using SimpleJSON and it turned out not to be so simple as we didn't find any good tutorials or sample code for the same? Are there any python functions similiar to get,put and opt in java for JSON? Any help would be much appreciated..

    Read the article

  • Technique to remove common words(and their plural versions) from a string

    - by Jake M
    I am attempting to find tags(keywords) for a recipe by parsing a long string of text. The text contains the recipe ingredients, directions and a short blurb. What do you think would be the most efficient way to remove common words from the tag list? By common words, I mean words like: 'the', 'at', 'there', 'their' etc. I have 2 methodologies I can use, which do you think is more efficient in terms of speed and do you know of a more efficient way I could do this? Methodology 1: - Determine the number of times each word occurs(using the library Collections) - Have a list of common words and remove all 'Common Words' from the Collection object by attempting to delete that key from the Collection object if it exists. - Therefore the speed will be determined by the length of the variable delims import collections from Counter delim = ['there','there\'s','theres','they','they\'re'] # the above will end up being a really long list! word_freq = Counter(recipe_str.lower().split()) for delim in set(delims): del word_freq[delim] return freq.most_common() Methodology 2: - For common words that can be plural, look at each word in the recipe string, and check if it partially contains the non-plural version of a common word. Eg; For the string "There's a test" check each word to see if it contains "there" and delete it if it does. delim = ['this','at','them'] # words that cant be plural partial_delim = ['there','they',] # words that could occur in many forms word_freq = Counter(recipe_str.lower().split()) for delim in set(delims): del word_freq[delim] # really slow for delim in set(partial_delims): for word in word_freq: if word.find(delim) != -1: del word_freq[delim] return freq.most_common()

    Read the article

  • Why am I getting a ParseException when using SimpleDateFormat to format a date and then parse it?

    - by Greg
    I have been debugging some existing code for which unit tests are failing on my system, but not on colleagues' systems. The root cause is that SimpleDateFormat is throwing ParseExceptions when parsing dates that should be parseable. I created a unit test that demonstrates the code that is failing on my system: import java.text.DateFormat; import java.text.ParseException; import java.text.SimpleDateFormat; import java.util.Date; import java.util.TimeZone; import junit.framework.TestCase; public class FormatsTest extends TestCase { public void testParse() throws ParseException { DateFormat formatter = new SimpleDateFormat("yyyyMMddHHmmss.SSS Z"); formatter.setTimeZone(TimeZone.getDefault()); formatter.setLenient(false); formatter.parse(formatter.format(new Date())); } } This test throws a ParseException on my system, but runs successfully on other systems. java.text.ParseException: Unparseable date: "20100603100243.118 -0600" at java.text.DateFormat.parse(DateFormat.java:352) at FormatsTest.testParse(FormatsTest.java:16) I have found that I can setLenient(true) and the test will succeed. The setLenient(false) is what is used in the production code that this test mimics, so I don't want to change it.

    Read the article

  • PyParsing: Not all tokens passed to setParseAction()

    - by Rosarch
    I'm parsing sentences like "CS 2110 or INFO 3300". I would like to output a format like: [[("CS" 2110)], [("INFO", 3300)]] To do this, I thought I could use setParseAction(). However, the print statements in statementParse() suggest that only the last tokens are actually passed: >>> statement.parseString("CS 2110 or INFO 3300") Match [{Suppress:("or") Re:('[A-Z]{2,}') Re:('[0-9]{4}')}] at loc 7(1,8) string CS 2110 or INFO 3300 loc: 7 tokens: ['INFO', 3300] Matched [{Suppress:("or") Re:('[A-Z]{2,}') Re:('[0-9]{4}')}] -> ['INFO', 3300] (['CS', 2110, 'INFO', 3300], {'Course': [(2110, 1), (3300, 3)], 'DeptCode': [('CS', 0), ('INFO', 2)]}) I expected all the tokens to be passed, but it's only ['INFO', 3300]. Am I doing something wrong? Or is there another way that I can produce the desired output? Here is the pyparsing code: from pyparsing import * def statementParse(str, location, tokens): print "string %s" % str print "loc: %s " % location print "tokens: %s" % tokens DEPT_CODE = Regex(r'[A-Z]{2,}').setResultsName("DeptCode") COURSE_NUMBER = Regex(r'[0-9]{4}').setResultsName("CourseNumber") OR_CONJ = Suppress("or") COURSE_NUMBER.setParseAction(lambda s, l, toks : int(toks[0])) course = DEPT_CODE + COURSE_NUMBER.setResultsName("Course") statement = course + Optional(OR_CONJ + course).setParseAction(statementParse).setDebug()

    Read the article

  • How to transform a production to LL(1) for a list separated by a semicolon?

    - by Subb
    Hi, I'm reading this introductory book on parsing (which is pretty good btw) and one of the exercice is to "build a parser for your favorite language." Since I don't want to die today, I thought I could do a parser for something relatively simple, ie a simplified CSS. Note: This book teach you how to right a LL(1) parser using the recursive-descent algorithm. So, as a sub-exercice, I am building the grammar from what I know of CSS. But I'm stuck on a production that I can't transform in LL(1) : //EBNF block = "{", declaration, {";", declaration}, [";"], "}" //BNF <block> =:: "{" <declaration> "}" <declaration> =:: <single-declaration> <opt-end> | <single-declaration> ";" <declaration> <opt-end> =:: "" | ";" This describe a CSS block. Valid block can have the form : { property : value } { property : value; } { property : value; property : value } { property : value; property : value; } ... The problem is with the optional ";" at the end, because it overlap with the starting character of {";", declaration}, so when my parser meet a semicolon in this context, it doesn't know what to do. The book talk about this problem, but in its example, the semicolon is obligatory, so the rule can be modified like this : block = "{", declaration, ";", {declaration, ";"}, "}" So, Is it possible to achieve what I'm trying to do using a LL(1) parser?

    Read the article

  • How to parse phpDoc style comment block with php?

    - by Reveller
    Please consider the following code with which I'm trying to parse only the first phpDoc style comment (noy using any other libraries) in a file (file contents put in $data variable for testing purposes): $data = " /** * @file A lot of info about this file * Could even continue on the next line * @author [email protected] * @version 2010-05-01 * @todo do stuff... */ /** * Comment bij functie bar() * @param Array met dingen */ function bar($baz) { echo $baz; } "; $data = trim(preg_replace('/\r?\n *\* */', ' ', $data)); preg_match_all('/@([a-z]+)\s+(.*?)\s*(?=$|@[a-z]+\s)/s', $data, $matches); $info = array_combine($matches[1], $matches[2]); print_r($info) This almose works, except for the fact that everything after @todo (including the bar() comment block and code) is considered the value of @todo: Array ( [file] => A lot of info about this file Could even continue on the next line [author] => [email protected] [version] => 2010-05-01 [todo] => do stuff... / /** Comment bij functie bar() [param] => Array met dingen / function bar() { echo ; } ) How does my code need to be altered so that only the first comment block is being parsed (in other words: parsing should stop after the first "*/" encountered?

    Read the article

  • Pull specific information from a long list with Perl

    - by melignus
    The file that I've got to work with here is the result of an LDAP extraction but I need to ultimately get the information formatted over to something that a spreadsheet can use. So, the data is as follows: DataDataDataDataDataDataDataDataDataDataDataDataDataDataDataData DataDataDataDataDataDataDataDataDataDataDataDataDataDataDataData displayName: John Doe name: ##userName DataDataDataDataDataDataDataDataDataDataDataDataDataDataDataData DataDataDataDataDataDataDataDataDataDataDataDataDataDataDataData displayName: Jane Doe name: ##userName DataDataDataDataDataDataDataDataDataDataDataDataDataDataDataData DataDataDataDataDataDataDataDataDataDataDataDataDataDataDataData displayName: Ted Doe name: ##userName The format that I need to export to is: firstName lastName userName firstName lastName userName firstName lastName userName Where the spaces are tabs so I can then impor that file into a database. I have experience doing this in VBScript but I'm trying to switch over to using Perl for as much server administration as possible. I'm not sure on the syntax for what I want which is basically while not endoffile{ detect "displayName: " & $firstName & " " & $lastName detect "name: ##" & $userName write $firstName tab $lastName tab $userName to file } Also if someone could point me to a resource specifically on the text parsing syntax that Perl uses, I'd be very grateful. Most of the resources that I've come across haven't been very helpful.

    Read the article

  • Which technology should I use to transform my latex documents into html documents

    - by Matthias Günther
    Hey, I want to write a little program that transforms my TeX files into HTML. I want to parse the documents and turn the macros (the build-in and of course my own) into HTML pieces. Here are my requirements: predefined rules (e.g. begin{itemize} \item text \end{itemize} = <br> <p>text </p> <br/>) defining own CSS style ability to convert formulars (extract the formulars, load them in an imagecreator and then save the jpg/png) easy to maintain and concise I know there are several technologies out there, but I don't exactly know which is the best for me. Here are the technologies which flow into my mind Ruby (I/O is easy, formular loading via webrat), XML XSLT (I don't think that I need just overhead) perl (there are many libs out there but I'm not quite familiar with it) bash (I worked with sed and was surprised how easy it was to work with regular expressions) latex2html ... (these converters won't work for me and they don't give me freedom in parsing) Any suggestions, hints and comments are welcome. Thanks for your time, folks.

    Read the article

  • Robust DateTime parser library for .NET

    - by Frank Krueger
    Hello, I am writing an RSS and Mail reader app in C# (technically MonoTouch). I have run into the issue of parsing DateTimes. I see a lot of variance in how dates are presented in the wild and have begun writing a function like this: public static DateTime ParseTime(string timeStr) { var formats = new string[] { "ddd, d MMM yyyy H:mm:ss \"GMT+00:00\"", "d MMM yyyy H:mm:ss \"EST\"", "yyyy-MM-dd\"T\"HH:mm:ss\"Z\"", "ddd MMM d HH:mm:ss \"+0000\" yyyy", }; try { return DateTime.Parse(timeStr); } catch (Exception) { } foreach (var f in formats) { try { var t = DateTime.ParseExact(timeStr, f, CultureInfo.InvariantCulture); return t; } catch (Exception) { } } return DateTime.MinValue; } This, well, makes me sick. Three points. (1) It's silly of me to think that I can actually collect a format list that will cover everything out there. (2) It's wrong! Notice that I'm treating an EST date time as UTC (since .NET seems oblivious to time zones). (3) I don't like using exceptions for logic. I am looking for an existing library (source only please) that is known to handle a bunch of these formats. Also, I would like to keep using UTC DateTimes throughout my code so whatever library is suggested should be able to produce DateTimes. Is there anything out there like this?

    Read the article

  • Replace string with incremented value

    - by Andrei
    Hello, I'm trying to write a CSS parser to automatically dispatch URLs in background images to different subdomains in order to parallelize downloads. Basically, I want to replace things like url(/assets/some-background-image.png) with url(http://assets[increment].domain.com/assets/some-background-image.png) I'm using this inside a class that I eventually want to evolve into doing various CSS parsing tasks. Here are the relevant parts of the class : private function parallelizeDownloads(){ static $counter = 1; $newURL = "url(http://assets".$counter.".domain.com"; The counter needs to be reset when it reaches 4 in order to limit to 4 subdomains. if ($counter == 4) { $counter = 1; } $counter ++; return $newURL; } public function replaceURLs() { This is mostly nonsense, but I know the code I'm looking for looks somewhat like this. Note : $this-css contains the CSS string. preg_match("/url/i",$this->css,$match); foreach($match as $URL) { $newURL = self::parallelizeDownloads(); $this->css = str_replace($match, $newURL,$this->css); } }

    Read the article

  • Delphi: Transporting Objects to remote computers

    - by pr0wl
    Hallo. I am writing a tier2 ordering software for network usage. So we have client and server. On the client I create Objects of TBest in which the Product ID, the amount and the user who orders it are saved. (So this is a item of an Order). An order can have multiple items and those are saved in an array to later send the created order to the server. The class that holds the array is called TBestellung. So i created both TBest.toString: string; and TBest.fromString(source: string): TBest; Now, I send the toString result to the server via socket and on the server I create the object using fromString (its parsing the attributes received). This works as intended. Question: Is there a better and more elegant way to do that? Serialisation is a keyword, yes, but isn't that awful / difficult when you serialize an object (TBestellung in this case) that contains an Array of other Objects (TBest in this case)? //Small amendment: Before it gets asked. Yes I should create an extra (static) class for toString and fromString because otherwise the server needs to create an "empty" TBest in order to be able to use fromString.

    Read the article

< Previous Page | 152 153 154 155 156 157 158 159 160 161 162 163  | Next Page >