Search Results

Search found 3176 results on 128 pages for 'parsing'.

Page 47/128 | < Previous Page | 43 44 45 46 47 48 49 50 51 52 53 54  | Next Page >

  • How to write a bison grammer for WDI?

    - by Rizo
    I need some help in bison grammar construction. From my another question: I'm trying to make a meta-language for writing markup code (such as xml and html) wich can be directly embedded into C/C++ code. Here is a simple sample written in this language, I call it WDI (Web Development Interface): /* * Simple wdi/html sample source code */ #include <mySite> string name = "myName"; string toCapital(string str); html { head { title { mySiteTitle; } link(rel="stylesheet", href="style.css"); } body(id="default") { // Page content wrapper div(id="wrapper", class="some_class") { h1 { "Hello, " + toCapital(name) + "!"; } // Lists post ul(id="post_list") { for(post in posts) { li { a(href=post.getID()) { post.tilte; } } } } } } } Basically it is a C source with a user-friendly interface for html. As you can see the traditional tag-based style is substituted by C-like, with blocks delimited by curly braces. I need to build an interpreter to translate this code to html and posteriorly insert it into C, so that it can be compiled. The C part stays intact. Inside the wdi source it is not necessary to use prints, every return statement will be used for output (in printf function). The program's output will be clean html code. So, for example a heading 1 tag would be transformed like this: h1 { "Hello, " + toCapital(name) + "!"; } // would become: printf("<h1>Hello, %s!</h1>", toCapital(name)); My main goal is to create an interpreter to translate wdi source to html like this: tag(attributes) {content} = <tag attributes>content</tag> Secondly, html code returned by the interpreter has to be inserted into C code with printfs. Variables and functions that occur inside wdi should also be sorted in order to use them as printf parameters (the case of toCapital(name) in sample source). Here are my flex/bison files: id [a-zA-Z_]([a-zA-Z0-9_])* number [0-9]+ string \".*\" %% {id} { yylval.string = strdup(yytext); return(ID); } {number} { yylval.number = atoi(yytext); return(NUMBER); } {string} { yylval.string = strdup(yytext); return(STRING); } "(" { return(LPAREN); } ")" { return(RPAREN); } "{" { return(LBRACE); } "}" { return(RBRACE); } "=" { return(ASSIGN); } "," { return(COMMA); } ";" { return(SEMICOLON); } \n|\r|\f { /* ignore EOL */ } [ \t]+ { /* ignore whitespace */ } . { /* return(CCODE); Find C source */ } %% %start wdi %token LPAREN RPAREN LBRACE RBRACE ASSIGN COMMA SEMICOLON CCODE QUOTE %union { int number; char *string; } %token <string> ID STRING %token <number> NUMBER %% wdi : /* empty */ | blocks ; blocks : block | blocks block ; block : head SEMICOLON | head body ; head : ID | ID attributes ; attributes : LPAREN RPAREN | LPAREN attribute_list RPAREN ; attribute_list : attribute | attribute COMMA attribute_list ; attribute : key ASSIGN value ; key : ID {$$=$1} ; value : STRING {$$=$1} /*| NUMBER*/ /*| CCODE*/ ; body : LBRACE content RBRACE ; content : /* */ | blocks | STRING SEMICOLON | NUMBER SEMICOLON | CCODE ; %% I am having difficulties on defining a proper grammar for the language, specially in splitting WDI and C code . I just started learning language processing techniques so I need some orientation. Could someone correct my code or give some examples of what is the right way to solve this problem?

    Read the article

  • Trying to parse xml, but xmldocument.loadxml() is trying to download?

    - by maxp
    I have a string input that i do not know whether or not is valid xml. I think the simplest aprroach is to wrap new XmlDocument().LoadXml(strINPUT); In a try/catch. The problem im facing is, sometimes strINPUT is an html file, if the header of this file contains <!DOCTYPE html PUBLIC ""-//W3C//DTD XHTML 1.0 Transitional//EN"" ""http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd""> <html xml:lang=""en-GB"" xmlns=""http://www.w3.org/1999/xhtml"" lang=""en-GB""> ...like many do, it actually tries to make a connection to the w3.org url, which i really dont want it doing. Anyone know if its possible to just parse the string without trying to be clever and checking external urls? Failing that is there an alternative to xmldocument?

    Read the article

  • java version of python-dateutil

    - by elhefe
    Python has a very handy package that can parse nearly any unambiguous date and provides helpful error messages on a parse failure, python-dateutil. Comparison to the SimpleDateFormat class is not favorable - AFAICT SimpleDateFormat can only handle one exact date format and the error messages have no granularity. I've looked through the Joda API but it appears Joda is the same way - only one explicit format can be parsed at a time. Is there any package or library that reproduces the python-dateutil behavior? Or am I missing something WRT Joda/SimpleDateFormat?

    Read the article

  • Linux: shell builtin string matching

    - by gmatt
    I am trying to become more familiar with using the builtin string matching stuff available in shells in linux. I came across this guys posting, and he showed an example a="abc|def" echo ${a#*|} # will yield "def" echo ${a%|*} # will yield "abc" I tried it out and it does what its advertised to do, but I don't understand what the $,{},#,*,| are doing, I tried looking for some reference online or in the manuals but I couldn't find anything. Can anyone explain to me what's going on here?

    Read the article

  • jQuery $.getJSON - How do I parse a flickr.photos.search REST API call?

    - by Chad
    Trying to adapt the $.getJSON Flickr example: $.getJSON("http://api.flickr.com/services/feeds/photos_public.gne?tags=cat&tagmode=any&format=json&jsoncallback=?", function(data){ $.each(data.items, function(i,item){ $("<img/>").attr("src", item.media.m).appendTo("#images"); if ( i == 3 ) return false; }); }); to read from the flickr.photos.search REST API method, but the JSON response is different for this call. Click here to see the JSON response. This is what I've done so far: var url = "http://api.flickr.com/services/rest/?method=flickr.photos.search&api_key=9322c53dde3b36bda33f79c16bb99104&tags=yokota+air+base&safe_search=1&per_page=20"; var src; $.getJSON(url + "&format=json&jsoncallback=?", function(data){ $.each(data.photos, function(i,item){ src = "http://farm"+ item.photo.farm +".static.flickr.com/"+ item.photo.server +"/"+ item.photo.id +"_"+ item.photo.secret +"_m.jpg"; $("<img/>").attr("src", src).appendTo("#images"); if ( i == 3 ) return false; }); }); I guess I'm not building the image src correctly. Couldn't find any documentation on how to build the image src, based on what the JSON response is. How do you parse a flickr.photos.search REST API call?

    Read the article

  • Parse usable Street Address, City, State, Zip from a string

    - by Rob Allen
    Problem: I have an address field from an Access database which has been converted to Sql Server 2005. This field has everything all in one field. I need to parse out the individual sections of the address into their appropriate fields in a normalized table. I need to do this for approximately 4,000 records and it needs to be repeatable. Here are the rules for this exercise: 1 - no whining about how this should have been separate fields in the first place, we are often confronted with less than ideal situations and have to make the best of them 2- for this post, use any language you want 3- feel free to play code golf 4 - Assume an address in the US (for now) 5 - assume that the input string will sometimes contain an addressee (the person being addressed) and/or a second street address (i.e. Suite B) 6 - states may be abbreviated 7 - zip code could be standard 5 digit or zip+4 8 - there are typos in some instances UPDATE: In response to the questions posed, standards were not universally followed, I need need to store the individual values, not just geocode and errors means typo (corrected above) Sample Data: A. P. Croll & Son 2299 Lewes-Georgetown Hwy, Georgetown, DE 19947 11522 Shawnee Road, Greenwood DE 19950 144 Kings Highway, S.W. Dover, DE 19901 Intergrated Const. Services 2 Penns Way Suite 405 New Castle, DE 19720 Humes Realty 33 Bridle Ridge Court, Lewes, DE 19958 Nichols Excavation 2742 Pulaski Hwy Newark, DE 19711 2284 Bryn Zion Road, Smyrna, DE 19904 VEI Dover Crossroads, LLC 1500 Serpentine Road, Suite 100 Baltimore MD 21 580 North Dupont Highway Dover, DE 19901 P.O. Box 778 Dover, DE 19903

    Read the article

  • Counting total sum of each value in one column w.r.t another in Perl

    - by sfactor
    I have tab delimited data with multiple columns. I have OS names in column 31 and data bytes in columns 6 and 7. What I want to do is count the total volume of each unique OS. So, I did something in Perl like this: #!/usr/bin/perl use warnings; my @hhfilelist = glob "*.txt"; my %count = (); for my $f (@hhfilelist) { open F, $f || die "Cannot open $f: $!"; while (<F>) { chomp; my @line = split /\t/; # counting volumes in col 6 and 7 for 31 $count{$line[30]} = $line[5] + $line[6]; } close (F); } my $w = 0; foreach $w (sort keys %count) { print "$w\t$count{$w}\n"; } So, the result would be something like Windows 100000 Linux 5000 Mac OSX 15000 Android 2000 But there seems to be some error in this code because the resulting values I get aren't as expected. What am I doing wrong?

    Read the article

  • C# - Parse HTML source as XML

    - by fonix232
    I would like to read in a dynamic URL what contains a HTML file, and read it like an XML file, based on nodes (HTML tags). Is this somehow possible? I mean, there is this HTML code: <table class="bidders" cellpadding="0" cellspacing="0"> <tr class="bidRow4"> <td>kucik (automata)</td> <td class="right">9 374 Ft</td> <td class="bidders_date">2010-06-10 18:19:52</td> </tr> <tr class="bidRow4"> <td>macszaf (automata)</td> <td class="right">9 373 Ft</td> <td class="bidders_date">2010-06-10 18:19:52</td> </tr> <tr class="bidRow2"> <td>kucik (automata)</td> <td class="right">9 372 Ft</td> <td class="bidders_date">2010-06-10 18:19:42</td> </tr> <tr class="bidRow2"> <td>macszaf (automata)</td> <td class="right">9 371 Ft</td> <td class="bidders_date">2010-06-10 18:19:42</td> </tr> <tr class="bidRow0"> <td>kucik (automata)</td> <td class="right">9 370 Ft</td> <td class="bidders_date">2010-06-10 18:19:32</td> </tr> <tr class="bidRow0"> <td>macszaf (automata)</td> <td class="right">9 369 Ft</td> <td class="bidders_date">2010-06-10 18:19:32</td> </tr> <tr class="bidRow8"> <td>kucik (automata)</td> <td class="right">9 368 Ft</td> <td class="bidders_date">2010-06-10 18:19:22</td> </tr> <tr class="bidRow8"> <td>macszaf (automata)</td> <td class="right">9 367 Ft</td> <td class="bidders_date">2010-06-10 18:19:22</td> </tr> <tr class="bidRow6"> <td>kucik (automata)</td> <td class="right">9 366 Ft</td> <td class="bidders_date">2010-06-10 18:19:12</td> </tr> <tr class="bidRow6"> <td>macszaf (automata)</td> <td class="right">9 365 Ft</td> <td class="bidders_date">2010-06-10 18:19:12</td> </tr> </table> I want to parse this into a ListView (or a Grid) to create rows with the data contained. All tr are different row, and all td in a given td is a column in the given row. And also I want it to be as fast as possible, as it would update itself in 5 seconds. Is there any library for this?

    Read the article

  • PHP Regex to match lines with all-caps with occaisional hyphens.

    - by Yaaqov
    I'm trying to to convert an existing PHP Regular Expression match case to apply to a slightly different style of document. Here's the original style of the document: **FOODS - TYPE A** ___________________________________ **PRODUCT** 1) Mi Pueblito Queso Fresco Authentic Mexican Style Fresh Cheese; 2) La Fe String Cheese **CODE** Sell by date going back to February 1, 2009 And the successfully-running PHP Regex match code that only returns "true" if the line is surrounded by asterisks, and stores each side of the "-" as $m[1] and $m[2], respectively. if ( preg_match('#^\*\*([^-]+)(?:-(.*))?\*\*$#', $line, $m) ) { // only for **header - subheader** $m[2] is set. if ( isset($m[2]) ) { return array(TYPE_HEADER, array(trim($m[1]), trim($m[2]))); } else { return array(TYPE_KEY, array($m[1])); } } So, for line 1: $m[1] = "FOODS" AND $m[2] = "TYPE A"; Line 2 would be skipped; Line 3: $m[1] = "PRODUCT", etc. The question: How would I re-write the above regex match if the headers did not have the asterisks, but still was all-caps, and was at least 4 characters long? For example: FOODS - TYPE A ___________________________________ PRODUCT 1) Mi Pueblito Queso Fresco Authentic Mexican Style Fresh Cheese; 2) La Fe String Cheese CODE Sell by date going back to February 1, 2009 Thank you.

    Read the article

  • PHP: What is an efficient way to parse a text file containing very long lines?

    - by Shaun
    I'm working on a parser in php which is designed to extract MySQL records out of a text file. A particular line might begin with a string corresponding to which table the records (rows) need to be inserted into, followed by the records themselves. The records are delimited by a backslash and the fields (columns) are separated by commas. For the sake of simplicity, let's assume that we have a table representing people in our database, with fields being First Name, Last Name, and Occupation. Thus, one line of the file might be as follows [People] = "\Han,Solo,Smuggler\Luke,Skywalker,Jedi..." Where the ellipses (...) could be additional people. One straightforward approach might be to use fgets() to extract a line from the file, and use preg_match() to extract the table name, records, and fields from that line. However, let's suppose that we have an awful lot of Star Wars characters to track. So many, in fact, that this line ends up being 200,000+ characters/bytes long. In such a case, taking the above approach to extract the database information seems a bit inefficient. You have to first read hundreds of thousands of characters into memory, then read back over those same characters to find regex matches. Is there a way, similar to the Java String next(String pattern) method of the Scanner class constructed using a file, that allows you to match patterns in-line while scanning through the file? The idea is that you don't have to scan through the same text twice (to read it from the file into a string, and then to match patterns) or store the text redundantly in memory (in both the file line string and the matched patterns). Would this even yield a significant increase in performance? It's hard to tell exactly what PHP or Java are doing behind the scenes.

    Read the article

  • What's the best way to retrieve two pieces of data from an XML file?

    - by Morinar
    I've got an XML document that is in either a pre or post FO transformed state that I need to extract some information from. In the pre-case, I need to pull out two tags that represent the pageWidth and pageHeight and in the post case I need to extract the page-height and page-width parameters from a specific tag (I forget which one it is off the top of my head). What I'm looking for is an efficient/easily maintainable way to grab these two elements. I'd like to only read the document a single time fetching the two things I need. I initially started writing something that would use BufferedReader + FileReader, but then I'm doing string searching and it gets messy when the tags span multiple lines. I then looked at the DOMParser, which seems like it would be ideal, but I don't want to have to read the entire file into memory if I could help it as the files could potentially be large and the tags I'm looking for will nearly always be close to the top of the file. I then looked into SAXParser, but that seems like a big pile of complicated overkill for what I'm trying to accomplish. Anybody have any advice? Or simple implementations that would accomplish my goal? Thanks.

    Read the article

  • Is XMLReader a SAX parser, a DOM parser, or neither?

    - by Renesis
    I am testing various methods to read (possibly large, and very often) XML configuration files in PHP. No writing is ever needed. I have two successful implementations, one using SimpleXML (which I know is a DOM parser) and one using XMLReader. I know that a DOM reader must read the whole tree and therefore uses more memory. My tests reflect that. I also know that A SAX parser is an "event-based" parser that uses less memory because it reads each node from the stream without checking what is next. XMLReader also reads from a stream with the cursor providing data about the node it is currently at. So, it definitely sounds like XMLReader (http://us2.php.net/xmlreader) is not a DOM parser, but my question is, is it a SAX parser, or something else? It seems like XMLReader behaves the way a SAX parser does but does not throw the events themselves (in other words, can you construct a SAX parser with XMLReader?) If it is something else, does the classification it's in have a name?

    Read the article

  • How Do I Pull Info from String

    - by Russ Bradberry
    I am trying to pull dynamics from a load that I run using bash. I have gotten to a point where I get the string I want, now from this I want to pull certain information that can vary. The string that gets returned is as follows: Records: 2910 Deleted: 0 Skipped: 0 Warnings: 0 Each of the number can and will vary in length, but the overall structure will remain the same. What I want to do is be able to get these numbers and load them into some bash variables ie: RECORDS=?? DELETED=?? SKIPPED=?? WARNING=?? In regex I would do it like this: Records: (\d*?) Deleted: (\d*?) Skipped (\d*?) Warnings (\d*?) and use the 4 groups in my variables.

    Read the article

  • Right recursive grammar or left recursive?

    - by user2485710
    I have little to no knowledge of what I'm about to ask, so I would like a suggestion based on the level of skills required to implemented a parser for the given grammar ( since I'm a beginner in this kind of formal approach to parsers and languages ). Just by going back of a couple of years, this situation reminds me a little of Pascal grammar vs C/C++ grammar, this left vs right stuff. But I'm not going to do any of that, my purpose is to implement a simple parser for a markup language for documents like Markdown. So considering that I'm starting with a markup language in mind, I want to keep things simple, which is the easiest one to handle between this 2 options and why . Another kind of grammar could be an easier option for me ? If yes which one do you suggest ?

    Read the article

  • How to retrieve a numbered sequence range from a List of filenames?

    - by glenneroo
    I would like to automatically parse the entire numbered sequence range of a List<FileData> of filenames (sans extensions) by checking which part of the filename changes. Here is an example (file extension already removed): First filename: IMG_0000 Last filename: IMG_1000 Numbered Range I need: 0000 to 1000 Except I need to deal with every possible type of file naming convention such as: 0000 ... 9999 20080312_0000 ... 20080312_9999 IMG_0000 - Copy ... IMG_9999 - Copy 8er_green3_00001 .. 8er_green3_09999 etc. I need the entire 0-padded range e.g. 0001 not just 1 The sequence number is 0-padded e.g. 0001 The sequence number can be located anywhere e.g. IMG_0000 - Copy The range can start and end with anything i.e. doesn't have to start with 1 and end with 9999 Whenever I get something working for 8 random test cases, the 9th test breaks everything and I end up re-starting from scratch. I've currently been comparing only the first and last filenames (as opposed to iterating through all filenames): void FindRange(List<FileData> files, out string startRange, out string endRange) { string firstFile = files.First().ShortName; string lastFile = files.Last().ShortName; ... } Does anyone have any clever ideas?

    Read the article

  • remove parent xml tag

    - by cru3l
    For example, we have xml file with this format: <A> <B> <C></C> <D></D> <D></D> </B> </A> i need that: if all "D"-tags elements are empty, then we need to delete whole "A"-tag element and, of course, we need to do this with all "A"-tags in xml.

    Read the article

  • How to get Nokogiri to ignore HTML elements that doesn't exist

    - by user296507
    any idea how i can get the code below to produce this output? 1 - 2 - B i'm getting this error "undefined method `text' for nil:NilClass (NoMethodError)", because i think table 1 does not have the element 'td class=r2' in it. require 'rubygems' require 'nokogiri' require 'open-uri' doc = Nokogiri::HTML.parse(<<-eohtml) <table class="t1"> <tbody> <tr> <td class="r1">1</td> </tr> </tbody> </table> <table class="t2"> <tbody> <tr> <td class="r1">2</td> <td class="r2">B</td> </tr> </tbody> </table> eohtml doc.css('tbody > tr').each do |n| r1 = n.at_css(".r1").text r2 = n.at_css(".r2").text puts "#{r1} - #{r2}" end

    Read the article

  • How can I parse a namespace using the SAX parser?

    - by Silvestri
    Hello, Using a twitter search URL ie. http://search.twitter.com/search.rss?q=android returns CSS that has an item that looks like: <item> <title>@UberTwiter still waiting for @ubertwitter android app!!!</title> <link>http://twitter.com/meals69/statuses/21158076391</link> <description>still waiting for an app!!!</description> <pubDate>Sat, 14 Aug 2010 15:33:44 +0000</pubDate> <guid>http://twitter.com/meals69/statuses/21158076391</guid> <author>Some Twitter User</author> <media:content type="image/jpg" height="48" width="48" url="http://a1.twimg.com/profile_images/756343289/me2_normal.jpg"/> <google:image_link>http://a1.twimg.com/profile_images/756343289/me2_normal.jpg</google:image_link> <twitter:metadata> <twitter:result_type>recent</twitter:result_type> </twitter:metadata> </item> Pretty simple. My code parses out everything (title, link, description, pubDate, etc.) without any problems. However, I'm getting null on: <google:image_link> I'm using Java to parse the RSS feed. Do I have to handle compound localnames differently than I would a more simple localname? This is the bit of code that parses out Link, Description, pubDate, etc: @Override public void endElement(String uri, String localName, String name) throws SAXException { super.endElement(uri, localName, name); if (this.currentMessage != null){ if (localName.equalsIgnoreCase(TITLE)){ currentMessage.setTitle(builder.toString()); } else if (localName.equalsIgnoreCase(LINK)){ currentMessage.setLink(builder.toString()); } else if (localName.equalsIgnoreCase(DESCRIPTION)){ currentMessage.setDescription(builder.toString()); } else if (localName.equalsIgnoreCase(PUB_DATE)){ currentMessage.setDate(builder.toString()); } else if (localName.equalsIgnoreCase(GUID)){ currentMessage.setGuid(builder.toString()); } else if (uri.equalsIgnoreCase(AVATAR)){ currentMessage.setAvatar(builder.toString()); } else if (localName.equalsIgnoreCase(ITEM)){ messages.add(currentMessage); } builder.setLength(0); } } startDocument looks like: @Override public void startDocument() throws SAXException { super.startDocument(); messages = new ArrayList<Message>(); builder = new StringBuilder(); } startElement looks like: @Override public void startElement(String uri, String localName, String name, Attributes attributes) throws SAXException { super.startElement(uri, localName, name, attributes); if (localName.equalsIgnoreCase(ITEM)){ this.currentMessage = new Message(); } } Tony

    Read the article

  • Evaluating mathematical expressions in Python

    - by vander
    Hi, I want to tokenize a given mathematical expression into a binary tree like this: ((3 + 4 - 1) * 5 + 6 * -7) / 2 '/' / \ + 2 / \ * * / \ / \ - 5 6 -7 / \ + 1 / \ 3 4 Is there any pure Python way to do this? Like passing as a string to Python and then get back as a tree like mentioned above. Thanks.

    Read the article

  • Python: Is there a way to get HTML that was dynamically created by Javascript?

    - by Joschua
    As far as I can tell, this is the case for LyricWikia. The lyrics (example) can be accessed from the browser, but can't be found in the source code (can be opened with CTRL + U in most browsers) or reading the contents of the site with Python: from urllib.request import urlopen URL = 'http://lyrics.wikia.com/Billy_Joel:Piano_Man' r = urlopen(URL).read().decode('utf-8') And the test: >>> 'Now John at the bar is a friend of mine' in r False >>> 'John' in r False But when you select and look at the source code of the box in which the lyrics are displayed, you can see that there is: <div class="lyricbox">[...]</div> Is there a way to get the contents of that div-element with Python?

    Read the article

  • How do I keep a scanner from throwing exceptions when the wrong type is entered? (java)

    - by David
    Here's some sample code: import java.util.Scanner; class In { public static void main (String[]arg) { Scanner in = new Scanner (System.in) ; System.out.println ("how many are invading?") ; int a = in.nextInt() ; System.out.println (a) ; } } if i run the program and give it an int like 4then everything goes fine. if, on the other hand, i answer too many it doesn't laugh at my funny joke. instead i get this: (as expected) Exception in thread "main" java.util.InputMismatchException at java.util.Scanner.throwFor(Scanner.java:819) at java.util.Scanner.next(Scanner.java:1431) at java.util.Scanner.nextInt(Scanner.java:2040) at java.util.Scanner.nextInt(Scanner.java:2000) at In.main(In.java:9) is there a way so that i can make it so that it either ignores entries that aren't ints or re prompts with "how many are invading?"? i'd like to know how to do both of these.

    Read the article

  • Extract information from javascript counter via PHP

    - by Jennifer Weinberg
    Hi, I'm looking for a way to extract some information from this site via PHP: http://www.mycitydeal.co.uk/deals/london There ist a counter where the time left is displayed, but the information is within the JavaScript. Since I'm really a JavaScript rookie, I didn't really know how to get the information. Normally I would extract the information with "preg_match" and some regular expressions. Can someone help me to extract the information (Hrs., Min., Sec.) ? Jennifer

    Read the article

< Previous Page | 43 44 45 46 47 48 49 50 51 52 53 54  | Next Page >