html parsing - Page 186

Regular Expression to parse a string from end until a delimiter is reached(Perl)

- by Zerobu

Hello, I need a regular expression to parse a text, the text is a URL. The URL is http://www.foo.com/bar/hello.txt. I want to get rid of the hello.txt, the delimiter is the slash. I would like to get http://www.foo.com/bar/

Read the article

Python/YACC: Resolving a shift/reduce conflict

- by Rosarch

I'm using PLY. Here is one of my states from parser.out: state 3 (5) course_data -> course . (6) course_data -> course . course_list_tail (3) or_phrase -> course . OR_CONJ COURSE_NUMBER (7) course_list_tail -> . , COURSE_NUMBER (8) course_list_tail -> . , COURSE_NUMBER course_list_tail ! shift/reduce conflict for OR_CONJ resolved as shift $end reduce using rule 5 (course_data -> course .) OR_CONJ shift and go to state 7 , shift and go to state 8 ! OR_CONJ [ reduce using rule 5 (course_data -> course .) ] course_list_tail shift and go to state 9 I want to resolve this as: if OR_CONJ is followed by COURSE_NUMBER: shift and go to state 7 else: reduce using rule 5 (course_data -> course .) How can I fix my parser file to reflect this? Do I need to handle a syntax error by backtracking and trying a different rule?

Read the article

How to read XML using XPath in Java

- by kaibuki

Hi guys, I want to read XML data using XPath in Java, so for the information I have gathered I am not able to parse XML according to my requirement. here is what I want to do: Get XML file from online via its URL, then use XPath to parse it, I want to create two methods in it. One is in which I enter a specific node attribute id, and I get all the child nodes as result, and second is suppose I just want to get a specific child node value only <?xml version="1.0"?><howto> <topic name="Java"> <url>http://www.rgagnonjavahowto.htm</url> <car>taxi</car> </topic> <topic ame="PowerBuilder"> <url>http://www.rgagnon/pbhowto.htm</url> <url>http://www.rgagnon/pbhowtonew.htm</url> </topic> <topic name="Javascript"> <url>http://www.rgagnon/jshowto.htm</url> </topic> <topic name="VBScript"> <url>http://www.rgagnon/vbshowto.htm</url> </topic></howto> In above example I want to read all the elements if I search via @name and also one function in which I just want the url from @name 'Javascript' only return one node element. I hope I cleared my question :) Thanks. Kai

Read the article

How do I get Bison/YACC to not recognize a command until it parses the whole string?

- by chucknelson

I have some bison grammar: input: /* empty */ | input command ; command: builtin | external ; builtin: CD { printf("Changing to home directory...\n"); } | CD WORD printf("Changing to directroy %s\n", $2); } ; I'm wondering how I get Bison to not accept (YYACCEPT?) something as a command until it reads ALL of the input. So I can have all these rules below that use recursion or whatever to build things up, which either results in a valid command or something that's not going to work. One simple test I'm doing with the code above is just entering "cd mydir mydir". Bison parses CD and WORD and goes "hey! this is a command, put it to the top!". Then the next token it finds is just WORD, which has no rule, and then it reports an error. I want it to read the whole line and realize CD WORD WORD is not a rule, and then report an error. I think I'm missing something obvious and would greatly appreciate any help - thanks! Also - I've tried using input command NEWLINE or something similar, but it still pushes CD WORD to the top as a command and then parses the extra WORD separately.

Read the article

ICalendar parser in PHP that supports timezones

- by Vincent Robert

I am looking for a PHP class that can parse an ICalendar (ICS) file and correctly handle timezones. I already created an ICS parser myself but it can only handle timezones known to PHP (like 'Europe/Paris'). Unfortunately, ICS file generated by Evolution (default calendar software of Ubuntu) does not use default timezone IDs. It exports events with its a specific timezone ID exporting also the full definition of the timezone: daylight saving dates, recurrence rule and all the hard stuff to understand about timezones. This is too much for me. Since it was only a small utility for my girlfriend, I won't have time to investigate further the ICalendar specification and create a full blown ICalendar parser myself. So is there any known implementation in PHP of ICalendar file format that can parse timezones definitions?

Read the article

how can I parse and group/do stats on user agent strings?

- by user151841

I have a database that has the various user-agent strings of visitors to our site. I'd like to do a 'survey' of them to see what browsers our users are using, so that I can know what features I can use in future development. Is there a tool to parse and run statistics on user-agent strings, or a bunch of strings like this? Ideally, I'd like to see a hierarchical grouping of the stats. For instance: Opera/9.80 (Windows Mobile; WCE; Opera Mobi/WMD-50301; U; en) Presto/2.4.13 Version/10.00 Opera/9.80 (J2ME/MIDP; Opera Mini/4.2.14912/1280; U; en) Presto/2.2.0 Opera/9.80 (J2ME/MIDP; Opera Mini/4.2.13918/812; U; en) Presto/2.2.0 Opera/9.64 (Macintosh; Intel Mac OS X; U; en) Presto/2.1.1 Opera/9.60 (J2ME/MIDP; Opera Mini/4.2.13918/786; U; en) Presto/2.2.0 Opera/9.60 (J2ME/MIDP; Opera Mini/4.0.10992/432; U; en) Presto/2.2.0 I'd like to see 6 entries for Opera, broken down into 3 for 9.80, 1 for 9.64 and 2 for 9.60, and so forth for all browsers. Other dimensions, such as OS, would cross the boundaries of the browser version hierarchy, but it might be nice to see also.

Read the article

Parse Formulae in C#

- by Cool

Hello All, I am trying to parse formula in C# language like "5*3 + 2" "(3*4 - 2)/5" Is it possible to do in C# or scripts like VBScript, JavaScript (which will be called in c# program). Thanks a lot!.

Read the article

strip action code from bison grammar file

- by Flavius

Hi Is there any existing tool to strip all the action code from bison grammar files, leaving only the {} aound it?

Read the article

Is a switch statement ok for 30 or so conditions?

- by DeanMc

I am in the final stages of creating an MP4 tag parser in .Net. For those who have experience with tagging music you would be aware that there are an average of 30 or so tags. If tested out different types of loops and it seems that a switch statement with Const values seems to be the way to go with regard to catching the tags in binary. The switch allows me to search the binary without the need to know which order the tags are stored or if there are some not present but I wonder if anyone would be against using a switch statement for so many conditionals. Any insight is much appreciated.

Read the article

how to detect an escape sequence in a string

- by mix

Given a string named line whose raw version has this value: \rRAWSTRING how can I detect if it has the escape character \r? What I've tried is: if repr(line).startswith('\r'): blah... but it doesn't catch it. I also tried find, such as: if repr(line).find('\r') != -1: blah doesn't work either. What am I missing? thx! EDIT: thanks for all the replies and the corrections re terminolgy and sorry for the confusion. OK, if i do this print repr(line) then what it prints is: '\rSET ENABLE ACK\n' (including the single quotes). i have tried all the suggestions, including: line.startswith(r'\r') line.startswith('\\r') each of which returns False. also tried: line.find(r'\r') line.find('\\r') each of which returns -1

Read the article

SQL error - Cannot convert nvarchar to decimal

- by jakesankey

I have a C# application that simply parses all of the txt documents within a given network directory and imports the data to a SQL server db. Everything was cruising along just fine until about the 1800th file when it happend to have a few blanks in columns that are called out as DBType.Decimal (and the value is usually zero in the files, not blank). So I got this error, "cannot convert nvarchar to decimal". I am wondering how I could tell the app to simply skip the lines that have this issue?? Perhaps I could even just change the column type to varchar even tho values are numbers (what problems could this create?) Thanks for any help! using System; using System.Data; using System.Data.SQLite; using System.IO; using System.Text.RegularExpressions; using System.Threading; using System.Collections.Generic; using System.Linq; using System.Data.SqlClient; namespace JohnDeereCMMDataParser { internal class Program { public static List<string> GetImportedFileList() { List<string> ImportedFiles = new List<string>(); using (SqlConnection connect = new SqlConnection(@"Server=FRXSQLDEV;Database=RX_CMMData;Integrated Security=YES")) { connect.Open(); using (SqlCommand fmd = connect.CreateCommand()) { fmd.CommandText = @"SELECT FileName FROM CMMData;"; fmd.CommandType = CommandType.Text; SqlDataReader r = fmd.ExecuteReader(); while (r.Read()) { ImportedFiles.Add(Convert.ToString(r["FileName"])); } } } return ImportedFiles; } private static void Main(string[] args) { Console.Title = "John Deere CMM Data Parser"; Console.WriteLine("Preparing CMM Data Parser... done"); Console.WriteLine("Scanning for new CMM data..."); Console.ForegroundColor = ConsoleColor.Gray; using (SqlConnection con = new SqlConnection(@"Server=FRXSQLDEV;Database=RX_CMMData;Integrated Security=YES")) { con.Open(); using (SqlCommand insertCommand = con.CreateCommand()) { Console.WriteLine("Connecting to SQL server..."); SqlCommand cmdd = con.CreateCommand(); string[] files = Directory.GetFiles(@"C:\Documents and Settings\js91162\Desktop\CMM WENZEL\", "*_*_*.txt", SearchOption.AllDirectories); List<string> ImportedFiles = GetImportedFileList(); insertCommand.Parameters.Add(new SqlParameter("@FeatType", DbType.String)); insertCommand.Parameters.Add(new SqlParameter("@FeatName", DbType.String)); insertCommand.Parameters.Add(new SqlParameter("@Axis", DbType.String)); insertCommand.Parameters.Add(new SqlParameter("@Actual", DbType.Decimal)); insertCommand.Parameters.Add(new SqlParameter("@Nominal", DbType.Decimal)); insertCommand.Parameters.Add(new SqlParameter("@Dev", DbType.Decimal)); insertCommand.Parameters.Add(new SqlParameter("@TolMin", DbType.Decimal)); insertCommand.Parameters.Add(new SqlParameter("@TolPlus", DbType.Decimal)); insertCommand.Parameters.Add(new SqlParameter("@OutOfTol", DbType.Decimal)); foreach (string file in files.Except(ImportedFiles)) { var FileNameExt1 = Path.GetFileName(file); cmdd.Parameters.Clear(); cmdd.Parameters.Add(new SqlParameter("@FileExt", FileNameExt1)); cmdd.CommandText = @" IF (EXISTS (SELECT * FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_SCHEMA = 'RX_CMMData' AND TABLE_NAME = 'CMMData')) BEGIN SELECT COUNT(*) FROM CMMData WHERE FileName = @FileExt; END"; int count = Convert.ToInt32(cmdd.ExecuteScalar()); con.Close(); con.Open(); if (count == 0) { Console.WriteLine("Preparing to parse CMM data for SQL import..."); if (file.Count(c => c == '_') > 5) continue; insertCommand.CommandText = @" INSERT INTO CMMData (FeatType, FeatName, Axis, Actual, Nominal, Dev, TolMin, TolPlus, OutOfTol, PartNumber, CMMNumber, Date, FileName) VALUES (@FeatType, @FeatName, @Axis, @Actual, @Nominal, @Dev, @TolMin, @TolPlus, @OutOfTol, @PartNumber, @CMMNumber, @Date, @FileName);"; string FileNameExt = Path.GetFullPath(file); string RNumber = Path.GetFileNameWithoutExtension(file); int index2 = RNumber.IndexOf("~"); Match RNumberE = Regex.Match(RNumber, @"^(R|L)\d{6}(COMP|CRIT|TEST|SU[1-9])(?=_)", RegexOptions.IgnoreCase); Match RNumberD = Regex.Match(RNumber, @"(?<=_)\d{3}[A-Z]\d{4}|\d{3}[A-Z]\d\w\w\d(?=_)", RegexOptions.IgnoreCase); Match RNumberDate = Regex.Match(RNumber, @"(?<=_)\d{8}(?=_)", RegexOptions.IgnoreCase); string RNumE = Convert.ToString(RNumberE); string RNumD = Convert.ToString(RNumberD); if (RNumberD.Value == @"") continue; if (RNumberE.Value == @"") continue; if (RNumberDate.Value == @"") continue; if (index2 != -1) continue; DateTime dateTime = DateTime.ParseExact(RNumberDate.Value, "yyyyMMdd", Thread.CurrentThread.CurrentCulture); string cmmDate = dateTime.ToString("dd-MMM-yyyy"); string[] lines = File.ReadAllLines(file); bool parse = false; foreach (string tmpLine in lines) { string line = tmpLine.Trim(); if (!parse && line.StartsWith("Feat. Type,")) { parse = true; continue; } if (!parse || string.IsNullOrEmpty(line)) { continue; } Console.WriteLine(tmpLine); foreach (SqlParameter parameter in insertCommand.Parameters) { parameter.Value = null; } string[] values = line.Split(new[] { ',' }); for (int i = 0; i < values.Length - 1; i++) { if (i = "" || i = null) continue; SqlParameter param = insertCommand.Parameters[i]; if (param.DbType == DbType.Decimal) { decimal value; param.Value = decimal.TryParse(values[i], out value) ? value : 0; } else { param.Value = values[i]; } } insertCommand.Parameters.Add(new SqlParameter("@PartNumber", RNumE)); insertCommand.Parameters.Add(new SqlParameter("@CMMNumber", RNumD)); insertCommand.Parameters.Add(new SqlParameter("@Date", cmmDate)); insertCommand.Parameters.Add(new SqlParameter("@FileName", FileNameExt)); insertCommand.ExecuteNonQuery(); insertCommand.Parameters.RemoveAt("@PartNumber"); insertCommand.Parameters.RemoveAt("@CMMNumber"); insertCommand.Parameters.RemoveAt("@Date"); insertCommand.Parameters.RemoveAt("@FileName"); } } } Console.WriteLine("CMM data successfully imported to SQL database..."); } con.Close(); } } } }

Read the article

MalformedURLException with file URI

- by Paul Reiners

While executing the following code: doc = builder.parse(file); where doc is an instance of org.w3c.dom.Document and builder is an instance of javax.xml.parsers.DocumentBuilder, I'm getting the following exception: Exception in thread "main" java.net.MalformedURLException: unknown protocol: c at java.net.URL.<init>(Unknown Source) at java.net.URL.<init>(Unknown Source) at java.net.URL.<init>(Unknown Source) at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.setupCurrentEntity(Unknown Source) at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.startEntity(Unknown Source) at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.startDTDEntity(Unknown Source) at com.sun.org.apache.xerces.internal.impl.XMLDTDScannerImpl.setInputSource(Unknown Source) at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$DTDDriver.dispatch(Unknown Source) at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$DTDDriver.next(Unknown Source) at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$PrologDriver.next(Unknown Source) at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(Unknown Source) at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source) at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source) at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source) at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(Unknown Source) at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(Unknown Source) at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(Unknown Source) at javax.xml.parsers.DocumentBuilder.parse(Unknown Source) at com.acme.ItemToThetaValues.createFiles(ItemToThetaValues.java:47) It's choking on this line of the file: <!DOCTYPE questestinterop SYSTEM "C:\Program Files\Acme\parsers\acme_full.dtd"> I am not getting this error on my machine, while a user is getting it on his machine. We are both using version 6 of the Sun JRE. This error also occurs when he's uses double backslashes in the path instead of single backslashes and when he uses forward slashes instead of backslashes. First of all, is the XML correct? Is the path expressed correctly? Second of all, why is this error occurring on one computer but not on another?

Read the article

nokogiri vs hpricot?

- by roshan

Which one would you choose? My important attributes are (not in order) Support & Future enhancements Community & general knowledge base (on the Internet) Comprehensive (i.e proven to parse a wide range of *.*ml pages) Performance Memory Footprint (runtime, not the code-base)

Read the article

Counting total sum of each value in one column w.r.t another in Perl

- by sfactor

I have tab delimited data with multiple columns. I have OS names in column 31 and data bytes in columns 6 and 7. What I want to do is count the total volume of each unique OS. So, I did something in Perl like this: #!/usr/bin/perl use warnings; my @hhfilelist = glob "*.txt"; my %count = (); for my $f (@hhfilelist) { open F, $f || die "Cannot open $f: $!"; while (<F>) { chomp; my @line = split /\t/; # counting volumes in col 6 and 7 for 31 $count{$line[30]} = $line[5] + $line[6]; } close (F); } my $w = 0; foreach $w (sort keys %count) { print "$w\t$count{$w}\n"; } So, the result would be something like Windows 100000 Linux 5000 Mac OSX 15000 Android 2000 But there seems to be some error in this code because the resulting values I get aren't as expected. What am I doing wrong?

Read the article

java version of python-dateutil

- by elhefe

Python has a very handy package that can parse nearly any unambiguous date and provides helpful error messages on a parse failure, python-dateutil. Comparison to the SimpleDateFormat class is not favorable - AFAICT SimpleDateFormat can only handle one exact date format and the error messages have no granularity. I've looked through the Joda API but it appears Joda is the same way - only one explicit format can be parsed at a time. Is there any package or library that reproduces the python-dateutil behavior? Or am I missing something WRT Joda/SimpleDateFormat?

Read the article

jQuery $.getJSON - How do I parse a flickr.photos.search REST API call?

- by Chad

Trying to adapt the $.getJSON Flickr example: $.getJSON("http://api.flickr.com/services/feeds/photos_public.gne?tags=cat&tagmode=any&format=json&jsoncallback=?", function(data){ $.each(data.items, function(i,item){ $("<img/>").attr("src", item.media.m).appendTo("#images"); if ( i == 3 ) return false; }); }); to read from the flickr.photos.search REST API method, but the JSON response is different for this call. Click here to see the JSON response. This is what I've done so far: var url = "http://api.flickr.com/services/rest/?method=flickr.photos.search&api_key=9322c53dde3b36bda33f79c16bb99104&tags=yokota+air+base&safe_search=1&per_page=20"; var src; $.getJSON(url + "&format=json&jsoncallback=?", function(data){ $.each(data.photos, function(i,item){ src = "http://farm"+ item.photo.farm +".static.flickr.com/"+ item.photo.server +"/"+ item.photo.id +"_"+ item.photo.secret +"_m.jpg"; $("<img/>").attr("src", src).appendTo("#images"); if ( i == 3 ) return false; }); }); I guess I'm not building the image src correctly. Couldn't find any documentation on how to build the image src, based on what the JSON response is. How do you parse a flickr.photos.search REST API call?

Read the article

Linux: shell builtin string matching

- by gmatt

I am trying to become more familiar with using the builtin string matching stuff available in shells in linux. I came across this guys posting, and he showed an example a="abc|def" echo ${a#*|} # will yield "def" echo ${a%|*} # will yield "abc" I tried it out and it does what its advertised to do, but I don't understand what the $,{},#,*,| are doing, I tried looking for some reference online or in the manuals but I couldn't find anything. Can anyone explain to me what's going on here?

Read the article

Parse usable Street Address, City, State, Zip from a string

- by Rob Allen

Problem: I have an address field from an Access database which has been converted to Sql Server 2005. This field has everything all in one field. I need to parse out the individual sections of the address into their appropriate fields in a normalized table. I need to do this for approximately 4,000 records and it needs to be repeatable. Here are the rules for this exercise: 1 - no whining about how this should have been separate fields in the first place, we are often confronted with less than ideal situations and have to make the best of them 2- for this post, use any language you want 3- feel free to play code golf 4 - Assume an address in the US (for now) 5 - assume that the input string will sometimes contain an addressee (the person being addressed) and/or a second street address (i.e. Suite B) 6 - states may be abbreviated 7 - zip code could be standard 5 digit or zip+4 8 - there are typos in some instances UPDATE: In response to the questions posed, standards were not universally followed, I need need to store the individual values, not just geocode and errors means typo (corrected above) Sample Data: A. P. Croll & Son 2299 Lewes-Georgetown Hwy, Georgetown, DE 19947 11522 Shawnee Road, Greenwood DE 19950 144 Kings Highway, S.W. Dover, DE 19901 Intergrated Const. Services 2 Penns Way Suite 405 New Castle, DE 19720 Humes Realty 33 Bridle Ridge Court, Lewes, DE 19958 Nichols Excavation 2742 Pulaski Hwy Newark, DE 19711 2284 Bryn Zion Road, Smyrna, DE 19904 VEI Dover Crossroads, LLC 1500 Serpentine Road, Suite 100 Baltimore MD 21 580 North Dupont Highway Dover, DE 19901 P.O. Box 778 Dover, DE 19903

Read the article

Seperate html pages for each screen in Jquery mobile

- by vrs

I am newbie to Jquery Mobile, so far what ever examples i searched contains only one html page for whole application, with multipe div tags where each page/screen is defined as div tag with data-role as page with some header and footers optionally. Based on user actions, we are hiding some div's(pages) and showing only expected page. Also, this multi-page template seems to be standard design, as written by some blogs. Are there any other designing ways? what I would like to have is multipe html pages, for ex one for login, one for home, one for contact etc. Other wise it is difficult to understand/code/debug issues, especially people from Java background like me.So, what I want is some kind of MVC design with JQueryMobile, like each view/screen as sepearate html associated with one js (Controller). Can we have multiple html pages in JqueryMobile app? If possible how to pass data/ maintain session between them? Any samples are most welcome. Thanks In Advance. Note: Also I don't want server side includes, may app contains 10 to 15 screens, each page will make a webservice call and fetch some data and map it to UI.

Read the article

Is XMLReader a SAX parser, a DOM parser, or neither?

- by Renesis

I am testing various methods to read (possibly large, and very often) XML configuration files in PHP. No writing is ever needed. I have two successful implementations, one using SimpleXML (which I know is a DOM parser) and one using XMLReader. I know that a DOM reader must read the whole tree and therefore uses more memory. My tests reflect that. I also know that A SAX parser is an "event-based" parser that uses less memory because it reads each node from the stream without checking what is next. XMLReader also reads from a stream with the cursor providing data about the node it is currently at. So, it definitely sounds like XMLReader (http://us2.php.net/xmlreader) is not a DOM parser, but my question is, is it a SAX parser, or something else? It seems like XMLReader behaves the way a SAX parser does but does not throw the events themselves (in other words, can you construct a SAX parser with XMLReader?) If it is something else, does the classification it's in have a name?

Read the article

PHP Regex to match lines with all-caps with occaisional hyphens.

- by Yaaqov

I'm trying to to convert an existing PHP Regular Expression match case to apply to a slightly different style of document. Here's the original style of the document: **FOODS - TYPE A** ___________________________________ **PRODUCT** 1) Mi Pueblito Queso Fresco Authentic Mexican Style Fresh Cheese; 2) La Fe String Cheese **CODE** Sell by date going back to February 1, 2009 And the successfully-running PHP Regex match code that only returns "true" if the line is surrounded by asterisks, and stores each side of the "-" as $m[1] and $m[2], respectively. if ( preg_match('#^\*\*([^-]+)(?:-(.*))?\*\*$#', $line, $m) ) { // only for **header - subheader** $m[2] is set. if ( isset($m[2]) ) { return array(TYPE_HEADER, array(trim($m[1]), trim($m[2]))); } else { return array(TYPE_KEY, array($m[1])); } } So, for line 1: $m[1] = "FOODS" AND $m[2] = "TYPE A"; Line 2 would be skipped; Line 3: $m[1] = "PRODUCT", etc. The question: How would I re-write the above regex match if the headers did not have the asterisks, but still was all-caps, and was at least 4 characters long? For example: FOODS - TYPE A ___________________________________ PRODUCT 1) Mi Pueblito Queso Fresco Authentic Mexican Style Fresh Cheese; 2) La Fe String Cheese CODE Sell by date going back to February 1, 2009 Thank you.

Read the article

rich:editor ruins html?

- by Ben

Hi, Strange behaviour. I use rich:editor with these attributes: (Irrelevant data removed) HtmlEditor editor = new HtmlEditor(); editor.setValueExpression("value", ve); editor.setTheme("advanced"); editor.setValueExpression("viewMode", viewModeValueExpression); panel.getChildren().add(editor); Now my problem is that whenever I load a ready-made html text such as this (In source mode): <html lang="en" xml:lang="en"> <head> <title>Done</title> </head> <body style="direction: ltr; font-size: medium; color: #0000FF;"> <p>When the menu loads, navigate to and open Image Editor.</p> </body> </html> Change to VisualMode and then back to SourceMode, I see that the editor removed all of my html data and now the source mode is this: <p>When the menu loads, navigate to and open Chul Muzal.</p> Anyone knows why this happens? Thanks!!

Read the article

Right recursive grammar or left recursive?

- by user2485710

I have little to no knowledge of what I'm about to ask, so I would like a suggestion based on the level of skills required to implemented a parser for the given grammar ( since I'm a beginner in this kind of formal approach to parsers and languages ). Just by going back of a couple of years, this situation reminds me a little of Pascal grammar vs C/C++ grammar, this left vs right stuff. But I'm not going to do any of that, my purpose is to implement a simple parser for a markup language for documents like Markdown. So considering that I'm starting with a markup language in mind, I want to keep things simple, which is the easiest one to handle between this 2 options and why . Another kind of grammar could be an easier option for me ? If yes which one do you suggest ?

Read the article

PHP: What is an efficient way to parse a text file containing very long lines?

- by Shaun

I'm working on a parser in php which is designed to extract MySQL records out of a text file. A particular line might begin with a string corresponding to which table the records (rows) need to be inserted into, followed by the records themselves. The records are delimited by a backslash and the fields (columns) are separated by commas. For the sake of simplicity, let's assume that we have a table representing people in our database, with fields being First Name, Last Name, and Occupation. Thus, one line of the file might be as follows [People] = "\Han,Solo,Smuggler\Luke,Skywalker,Jedi..." Where the ellipses (...) could be additional people. One straightforward approach might be to use fgets() to extract a line from the file, and use preg_match() to extract the table name, records, and fields from that line. However, let's suppose that we have an awful lot of Star Wars characters to track. So many, in fact, that this line ends up being 200,000+ characters/bytes long. In such a case, taking the above approach to extract the database information seems a bit inefficient. You have to first read hundreds of thousands of characters into memory, then read back over those same characters to find regex matches. Is there a way, similar to the Java String next(String pattern) method of the Scanner class constructed using a file, that allows you to match patterns in-line while scanning through the file? The idea is that you don't have to scan through the same text twice (to read it from the file into a string, and then to match patterns) or store the text redundantly in memory (in both the file line string and the matched patterns). Would this even yield a significant increase in performance? It's hard to tell exactly what PHP or Java are doing behind the scenes.

Read the article

What's the best way to retrieve two pieces of data from an XML file?

- by Morinar

I've got an XML document that is in either a pre or post FO transformed state that I need to extract some information from. In the pre-case, I need to pull out two tags that represent the pageWidth and pageHeight and in the post case I need to extract the page-height and page-width parameters from a specific tag (I forget which one it is off the top of my head). What I'm looking for is an efficient/easily maintainable way to grab these two elements. I'd like to only read the document a single time fetching the two things I need. I initially started writing something that would use BufferedReader + FileReader, but then I'm doing string searching and it gets messy when the tags span multiple lines. I then looked at the DOMParser, which seems like it would be ideal, but I don't want to have to read the entire file into memory if I could help it as the files could potentially be large and the tags I'm looking for will nearly always be close to the top of the file. I then looked into SAXParser, but that seems like a big pile of complicated overkill for what I'm trying to accomplish. Anybody have any advice? Or simple implementations that would accomplish my goal? Thanks.

Search Results

Search found 32104 results on 1285 pages for 'html parsing'.

Page 186/1285 | < Previous Page | 182 183 184 185 186 187 188 189 190 191 192 193 | Next Page >

- by Zerobu

- by Rosarch

- by kaibuki

- by chucknelson

- by Vincent Robert

- by user151841

- by Cool

- by Flavius

- by DeanMc

- by mix

- by jakesankey

- by Paul Reiners

- by roshan

- by sfactor

- by elhefe

- by Chad

- by gmatt

- by Rob Allen

- by vrs

- by Renesis

- by Yaaqov

- by Ben

- by user2485710

- by Shaun

- by Morinar

< Previous Page | 182 183 184 185 186 187 188 189 190 191 192 193 | Next Page >