text parsing - Page 117

Sanitize Content: removing markup from Amazon's content

- by StackOverflowNewbie

I'm using Amazon Web Service to get product descriptions of various items. The problem is that Amazon's content contains mark up that is sometimes destructive to the layout of my web page (e.g. unclosed DIVs, etc.). I want to sanitize the content I get from Amazon. My solution would be to do the following (my initial list so far): Remove unnecessary tags such as div, span, etc. while keeping tags like p, ul, ol, etc. Remove all attributes from all the tags (e.g. seems like there are style attributes in some of the tags) Remove excess white space (e.g. multiple spaces, carriage returns, new lines, tabs, etc.) Etc. Before I go off trying to build my solution, I'm wondering if anyone has a better idea (or an already existing solution). Thanks.

Read the article

Watin ContainsText method fails to find text in FireFox

- by alonp

The ContainsText method finds the text only in specific area in the html but fails to find id in other parts of the page. The text that located under 'div id="content"' can be found But the text in other area of the html is not found (f.e 'form id="aspnetForm"') Browser b = new FireFox("http://localhost:8668/login.aspx"); b.Button("login.login.button")).Click(); bool blah = b.ContainsText("Hello"); I'm using the latest watin release. The issue is reproduced with FF3.0, FF3.5 and FF3.6 In IE it's working fine for the tested text.

Read the article

Parse usable Street Address, City, State, Zip from a string

- by Rob Allen

Problem: I have an address field from an Access database which has been converted to Sql Server 2005. This field has everything all in one field. I need to parse out the individual sections of the address into their appropriate fields in a normalized table. I need to do this for approximately 4,000 records and it needs to be repeatable. Here are the rules for this exercise: 1 - no whining about how this should have been separate fields in the first place, we are often confronted with less than ideal situations and have to make the best of them 2- for this post, use any language you want 3- feel free to play code golf 4 - Assume an address in the US (for now) 5 - assume that the input string will sometimes contain an addressee (the person being addressed) and/or a second street address (i.e. Suite B) 6 - states may be abbreviated 7 - zip code could be standard 5 digit or zip+4 8 - there are typos in some instances UPDATE: In response to the questions posed, standards were not universally followed, I need need to store the individual values, not just geocode and errors means typo (corrected above) Sample Data: A. P. Croll & Son 2299 Lewes-Georgetown Hwy, Georgetown, DE 19947 11522 Shawnee Road, Greenwood DE 19950 144 Kings Highway, S.W. Dover, DE 19901 Intergrated Const. Services 2 Penns Way Suite 405 New Castle, DE 19720 Humes Realty 33 Bridle Ridge Court, Lewes, DE 19958 Nichols Excavation 2742 Pulaski Hwy Newark, DE 19711 2284 Bryn Zion Road, Smyrna, DE 19904 VEI Dover Crossroads, LLC 1500 Serpentine Road, Suite 100 Baltimore MD 21 580 North Dupont Highway Dover, DE 19901 P.O. Box 778 Dover, DE 19903

Read the article

Is a switch statement ok for 30 or so conditions?

- by DeanMc

I am in the final stages of creating an MP4 tag parser in .Net. For those who have experience with tagging music you would be aware that there are an average of 30 or so tags. If tested out different types of loops and it seems that a switch statement with Const values seems to be the way to go with regard to catching the tags in binary. The switch allows me to search the binary without the need to know which order the tags are stored or if there are some not present but I wonder if anyone would be against using a switch statement for so many conditionals. Any insight is much appreciated.

Read the article

MalformedURLException with file URI

- by Paul Reiners

While executing the following code: doc = builder.parse(file); where doc is an instance of org.w3c.dom.Document and builder is an instance of javax.xml.parsers.DocumentBuilder, I'm getting the following exception: Exception in thread "main" java.net.MalformedURLException: unknown protocol: c at java.net.URL.<init>(Unknown Source) at java.net.URL.<init>(Unknown Source) at java.net.URL.<init>(Unknown Source) at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.setupCurrentEntity(Unknown Source) at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.startEntity(Unknown Source) at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.startDTDEntity(Unknown Source) at com.sun.org.apache.xerces.internal.impl.XMLDTDScannerImpl.setInputSource(Unknown Source) at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$DTDDriver.dispatch(Unknown Source) at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$DTDDriver.next(Unknown Source) at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$PrologDriver.next(Unknown Source) at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(Unknown Source) at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source) at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source) at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source) at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(Unknown Source) at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(Unknown Source) at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(Unknown Source) at javax.xml.parsers.DocumentBuilder.parse(Unknown Source) at com.acme.ItemToThetaValues.createFiles(ItemToThetaValues.java:47) It's choking on this line of the file: <!DOCTYPE questestinterop SYSTEM "C:\Program Files\Acme\parsers\acme_full.dtd"> I am not getting this error on my machine, while a user is getting it on his machine. We are both using version 6 of the Sun JRE. This error also occurs when he's uses double backslashes in the path instead of single backslashes and when he uses forward slashes instead of backslashes. First of all, is the XML correct? Is the path expressed correctly? Second of all, why is this error occurring on one computer but not on another?

Read the article

Trying to parse xml, but xmldocument.loadxml() is trying to download?

- by maxp

I have a string input that i do not know whether or not is valid xml. I think the simplest aprroach is to wrap new XmlDocument().LoadXml(strINPUT); In a try/catch. The problem im facing is, sometimes strINPUT is an html file, if the header of this file contains <!DOCTYPE html PUBLIC ""-//W3C//DTD XHTML 1.0 Transitional//EN"" ""http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd""> <html xml:lang=""en-GB"" xmlns=""http://www.w3.org/1999/xhtml"" lang=""en-GB""> ...like many do, it actually tries to make a connection to the w3.org url, which i really dont want it doing. Anyone know if its possible to just parse the string without trying to be clever and checking external urls? Failing that is there an alternative to xmldocument?

Read the article

Can I fire a Text Changed Event for an asp.net Text Box before it loses focus?

- by Xaisoft

I have an asp.net TextBox in which I want to check if the text entered into the TextBox is > 0. It works once I tab out or click out of the TextBox, but if I keep focus on the TextBox, it won't fire the Text Changed Event, so I have the following scenario, I want to enable something if and only if the TextBox.Text.Length = 0. Now, if I put my caret in the TextBox and delete all the characters and then leave the caret in the TextBox so it still has focus and take my mouse and click a button, it will not do what it was supposed to do because it never fired the Text Changed Event. How would something like this be handled?

Read the article

Is XMLReader a SAX parser, a DOM parser, or neither?

- by Renesis

I am testing various methods to read (possibly large, and very often) XML configuration files in PHP. No writing is ever needed. I have two successful implementations, one using SimpleXML (which I know is a DOM parser) and one using XMLReader. I know that a DOM reader must read the whole tree and therefore uses more memory. My tests reflect that. I also know that A SAX parser is an "event-based" parser that uses less memory because it reads each node from the stream without checking what is next. XMLReader also reads from a stream with the cursor providing data about the node it is currently at. So, it definitely sounds like XMLReader (http://us2.php.net/xmlreader) is not a DOM parser, but my question is, is it a SAX parser, or something else? It seems like XMLReader behaves the way a SAX parser does but does not throw the events themselves (in other words, can you construct a SAX parser with XMLReader?) If it is something else, does the classification it's in have a name?

Read the article

jQuery $.getJSON - How do I parse a flickr.photos.search REST API call?

- by Chad

Trying to adapt the $.getJSON Flickr example: $.getJSON("http://api.flickr.com/services/feeds/photos_public.gne?tags=cat&tagmode=any&format=json&jsoncallback=?", function(data){ $.each(data.items, function(i,item){ $("<img/>").attr("src", item.media.m).appendTo("#images"); if ( i == 3 ) return false; }); }); to read from the flickr.photos.search REST API method, but the JSON response is different for this call. Click here to see the JSON response. This is what I've done so far: var url = "http://api.flickr.com/services/rest/?method=flickr.photos.search&api_key=9322c53dde3b36bda33f79c16bb99104&tags=yokota+air+base&safe_search=1&per_page=20"; var src; $.getJSON(url + "&format=json&jsoncallback=?", function(data){ $.each(data.photos, function(i,item){ src = "http://farm"+ item.photo.farm +".static.flickr.com/"+ item.photo.server +"/"+ item.photo.id +"_"+ item.photo.secret +"_m.jpg"; $("<img/>").attr("src", src).appendTo("#images"); if ( i == 3 ) return false; }); }); I guess I'm not building the image src correctly. Couldn't find any documentation on how to build the image src, based on what the JSON response is. How do you parse a flickr.photos.search REST API call?

Read the article

nokogiri vs hpricot?

- by roshan

Which one would you choose? My important attributes are (not in order) Support & Future enhancements Community & general knowledge base (on the Internet) Comprehensive (i.e proven to parse a wide range of *.*ml pages) Performance Memory Footprint (runtime, not the code-base)

Read the article

text-decoration:underline vs border-bottom....

- by jitendra

What is the difference to use {text-decoration:underline} and {border-bottom:...}? which is easy to style and cross browser compatible? when we should use border-bottom over text-decoration:underline? Would it be good to use border-bottom always in place of text-decoration:underline?

Read the article

Libraries for text animation in Flex / Actionscript ?

- by Simon

Are there any good libraries for cool text animation effects for use in Actionscript (for use in an intro screen or banner). I've given up tryin to use Flash itself because that takes forever, and I dont know which of the many flash text animation tools to choose from. I'd like to be able to dynamically product cool text effects with different messages.

Read the article

Right recursive grammar or left recursive?

- by user2485710

I have little to no knowledge of what I'm about to ask, so I would like a suggestion based on the level of skills required to implemented a parser for the given grammar ( since I'm a beginner in this kind of formal approach to parsers and languages ). Just by going back of a couple of years, this situation reminds me a little of Pascal grammar vs C/C++ grammar, this left vs right stuff. But I'm not going to do any of that, my purpose is to implement a simple parser for a markup language for documents like Markdown. So considering that I'm starting with a markup language in mind, I want to keep things simple, which is the easiest one to handle between this 2 options and why . Another kind of grammar could be an easier option for me ? If yes which one do you suggest ?

Read the article

How to change the default text of Cancel BUtton which appears in the UISearchBar +Iphone

- by Pradeep Reddy Kypa

HI I am developing an Application where i wanted to change the text of Search String in the SearchBar. I wanted to change the text of Cancel Button Also which appears next to the SearchBar. Before entering any string in the search bar we wil get the Search String as the default string. i wanted to change the text of that string and when we click on that searchbar we get a cancel button next to searchbar and i wanted to change the text of that cancel button. PLease help me.

Read the article

PyParsing: What does Combine() do?

- by Rosarch

What is the difference between: foo = TOKEN1 + TOKEN2 and foo = Combine(TOKEN1 + TOKEN2) Thanks.

Read the article

What's the best way to retrieve two pieces of data from an XML file?

- by Morinar

I've got an XML document that is in either a pre or post FO transformed state that I need to extract some information from. In the pre-case, I need to pull out two tags that represent the pageWidth and pageHeight and in the post case I need to extract the page-height and page-width parameters from a specific tag (I forget which one it is off the top of my head). What I'm looking for is an efficient/easily maintainable way to grab these two elements. I'd like to only read the document a single time fetching the two things I need. I initially started writing something that would use BufferedReader + FileReader, but then I'm doing string searching and it gets messy when the tags span multiple lines. I then looked at the DOMParser, which seems like it would be ideal, but I don't want to have to read the entire file into memory if I could help it as the files could potentially be large and the tags I'm looking for will nearly always be close to the top of the file. I then looked into SAXParser, but that seems like a big pile of complicated overkill for what I'm trying to accomplish. Anybody have any advice? Or simple implementations that would accomplish my goal? Thanks.

Read the article

Linux: shell builtin string matching

- by gmatt

I am trying to become more familiar with using the builtin string matching stuff available in shells in linux. I came across this guys posting, and he showed an example a="abc|def" echo ${a#*|} # will yield "def" echo ${a%|*} # will yield "abc" I tried it out and it does what its advertised to do, but I don't understand what the $,{},#,*,| are doing, I tried looking for some reference online or in the manuals but I couldn't find anything. Can anyone explain to me what's going on here?

Read the article

PHP Regex to match lines with all-caps with occaisional hyphens.

- by Yaaqov

I'm trying to to convert an existing PHP Regular Expression match case to apply to a slightly different style of document. Here's the original style of the document: **FOODS - TYPE A** ___________________________________ **PRODUCT** 1) Mi Pueblito Queso Fresco Authentic Mexican Style Fresh Cheese; 2) La Fe String Cheese **CODE** Sell by date going back to February 1, 2009 And the successfully-running PHP Regex match code that only returns "true" if the line is surrounded by asterisks, and stores each side of the "-" as $m[1] and $m[2], respectively. if ( preg_match('#^\*\*([^-]+)(?:-(.*))?\*\*$#', $line, $m) ) { // only for **header - subheader** $m[2] is set. if ( isset($m[2]) ) { return array(TYPE_HEADER, array(trim($m[1]), trim($m[2]))); } else { return array(TYPE_KEY, array($m[1])); } } So, for line 1: $m[1] = "FOODS" AND $m[2] = "TYPE A"; Line 2 would be skipped; Line 3: $m[1] = "PRODUCT", etc. The question: How would I re-write the above regex match if the headers did not have the asterisks, but still was all-caps, and was at least 4 characters long? For example: FOODS - TYPE A ___________________________________ PRODUCT 1) Mi Pueblito Queso Fresco Authentic Mexican Style Fresh Cheese; 2) La Fe String Cheese CODE Sell by date going back to February 1, 2009 Thank you.

Read the article

C# - Parse HTML source as XML

- by fonix232

I would like to read in a dynamic URL what contains a HTML file, and read it like an XML file, based on nodes (HTML tags). Is this somehow possible? I mean, there is this HTML code: <table class="bidders" cellpadding="0" cellspacing="0"> <tr class="bidRow4"> <td>kucik (automata)</td> <td class="right">9 374 Ft</td> <td class="bidders_date">2010-06-10 18:19:52</td> </tr> <tr class="bidRow4"> <td>macszaf (automata)</td> <td class="right">9 373 Ft</td> <td class="bidders_date">2010-06-10 18:19:52</td> </tr> <tr class="bidRow2"> <td>kucik (automata)</td> <td class="right">9 372 Ft</td> <td class="bidders_date">2010-06-10 18:19:42</td> </tr> <tr class="bidRow2"> <td>macszaf (automata)</td> <td class="right">9 371 Ft</td> <td class="bidders_date">2010-06-10 18:19:42</td> </tr> <tr class="bidRow0"> <td>kucik (automata)</td> <td class="right">9 370 Ft</td> <td class="bidders_date">2010-06-10 18:19:32</td> </tr> <tr class="bidRow0"> <td>macszaf (automata)</td> <td class="right">9 369 Ft</td> <td class="bidders_date">2010-06-10 18:19:32</td> </tr> <tr class="bidRow8"> <td>kucik (automata)</td> <td class="right">9 368 Ft</td> <td class="bidders_date">2010-06-10 18:19:22</td> </tr> <tr class="bidRow8"> <td>macszaf (automata)</td> <td class="right">9 367 Ft</td> <td class="bidders_date">2010-06-10 18:19:22</td> </tr> <tr class="bidRow6"> <td>kucik (automata)</td> <td class="right">9 366 Ft</td> <td class="bidders_date">2010-06-10 18:19:12</td> </tr> <tr class="bidRow6"> <td>macszaf (automata)</td> <td class="right">9 365 Ft</td> <td class="bidders_date">2010-06-10 18:19:12</td> </tr> </table> I want to parse this into a ListView (or a Grid) to create rows with the data contained. All tr are different row, and all td in a given td is a column in the given row. And also I want it to be as fast as possible, as it would update itself in 5 seconds. Is there any library for this?

Read the article

How do I send text to a UITextView?

- by Lee

I'm new to iphone development. I'm a VB programmer who is trying to convert a VB application to an ipad app. I need some help with sending text to a UITextView. I want to first have a UIPickerView and then once the user hits a UIButton, a UITextView appears and the text is then generated by my source code code, line by line. The code would constantly be concatenating strings to the text. It would sort of go like this-- 1) User makes selections with UIPickerView. 2) User then hits UIButton. 3) UIPickerView is replaced on the screen with a UITextView. 4) The code does stuff. 5) The code adds the 1st line of text into the UITextView. 6) The code does more stuff. 7) The code then adds the 2nd line of text into the UITextView, retaining the 1st line that was already there. Steps 6 and 7 are repeated until the code is done. Does anyone know of any examples of this that I could look at? I am mostly interested in finding something like a youtube video, a webpage that explains the code or even a good book that covers this particular topic. I am finding that the sample codes that Apple has on this site goes over my head. In fact, I could probably benefit from a good book. But, I am looking for one that I would already know covers this particular topic, since it is so essential to the app that I am trying to build.

Read the article

How to detect Links with out anchor element in a plain text

- by dhee

If user enters his text in the text box and saves it and again what's to add some more text he can edit that text and save it if required. Firstly if user enters that text with some links I, detected them and converted any hyperlinks to linkify in new tab. Secondly if user wants to add some more text and links he clicks on edit and add them and save it at this time I must ignore the links that already hyperlinked with anchor button Please help and advice For example: what = "In the task system, is there a way to automatically have any site / page URL or image URL be hyperlinked in a new window? So If I type or copy http://www.stackoverflow.com/  for example anywhere in the description, in any of the internal messages or messages to clients, it automatically is a hyperlink in a new window. <a href="http://www.stackoverflow.com/">http://www.stackoverflow.com/</a> Or if I input an image URL anywhere in support description, internal messages or messages to cleints, it automatically is a hyperlink in a new window: https://static.doubleclick.net/viewad/4327673/1-728x90.jpg <a href="https://static.doubleclick.net/viewad/4327673/1-728x90.jpg">https://static.doubleclick.net/viewad/4327673/1-728x90.jpg</a> This would save us a lot time in task building, reviewing and creating messages. Test URL's http://www.stackoverflow.com/ http://stackoverflow.com/ https://stackoverflow.com/ www.stackoverflow.com //stackoverflow.com/ <a href='http://stackoverflow.com/'>http://stackoverflow.com/</a>"; I've tried this code function Linkify(what) { str = what; out = ""; url = ""; i = 0; do { url = str.match(/((https?:\/\/)?([a-z\-]+\.)*[\-\w]+(\.[a-z]{2,4})+(\/[\w\_\-\?\=\&\.]*)*(?![a-z]))/i); if(url!=null) { // get href value href = url[0]; if(href.substr(0,7)!="http://") href = "http://"+href; // where the match occured where = str.indexOf(url[0]); // add it to the output out += str.substr(0,where); // link it out += '<a href="'+href+'" target="_blank">'+url[0]+'</a>'; // prepare str for next round str = str.substr((where+url[0].length)); } else { out += str; str = ""; } } while(str.length>0); return out; } Please help Thanks.

Read the article

FLEX: fade in/out, random positioning, text

- by Patrick

hi, I want to fade in / fade out some text placing it randomly on the screen in a flex module. I have 2 questions: 1) What Flex container should I use to display the actionscript text animations into it ? 2) What type should I use for text ? I need to set the textFormat on it. Is UITextField the correct solution ? thanks

Read the article

Moving ordered elements and adding arbitrary text in Deliverance

- by Jon Hadley

Using the following XPath derived rule, I am able to select a list of events from a Plone site (similar to Plone Community events) and display them in my Deliverance themed site: <replace content="//*[@id='parent-fieldname-title' or @class='explain' or @id='parent-fieldname-location' or @id='parent-fieldname-text']" theme="#center-content-to-replace-one" /> However, two separate, but related, problems now rear their heads. I can't work out how to: Re-order elements within the replaced theme, such as moving images originally held within #parent-fieldname-text to before #parent-fieldname-title Put arbitrary text before elements. Such as 'Page Title:' before #parent-fieldname-title I could do both if I was just dealing with one record, as place holders(1) or text before place holders(2) could be specified in the template - but I can't see a away for that template behaviour to repeat across multiple items in one page. (if that is indeed the solution) Is this a job for pyQuery?

Read the article

How do I ignore unrecognized characters which come back from web service?

- by Jay Askren

I am working on an app which calls a rest web service. Sometimes the xml responses contain characters which the phone can not understand. When displaying these characters, an empty box is displayed instead. I would like to filter out these characters. How can I detect if a character will be able to be displayed on the screen?

Read the article

How can I parse a namespace using the SAX parser?

- by Silvestri

Hello, Using a twitter search URL ie. http://search.twitter.com/search.rss?q=android returns CSS that has an item that looks like: <item> <title>@UberTwiter still waiting for @ubertwitter android app!!!</title> <link>http://twitter.com/meals69/statuses/21158076391</link> <description>still waiting for an app!!!</description> <pubDate>Sat, 14 Aug 2010 15:33:44 +0000</pubDate> <guid>http://twitter.com/meals69/statuses/21158076391</guid> <author>Some Twitter User</author> <media:content type="image/jpg" height="48" width="48" url="http://a1.twimg.com/profile_images/756343289/me2_normal.jpg"/> <google:image_link>http://a1.twimg.com/profile_images/756343289/me2_normal.jpg</google:image_link> <twitter:metadata> <twitter:result_type>recent</twitter:result_type> </twitter:metadata> </item> Pretty simple. My code parses out everything (title, link, description, pubDate, etc.) without any problems. However, I'm getting null on: <google:image_link> I'm using Java to parse the RSS feed. Do I have to handle compound localnames differently than I would a more simple localname? This is the bit of code that parses out Link, Description, pubDate, etc: @Override public void endElement(String uri, String localName, String name) throws SAXException { super.endElement(uri, localName, name); if (this.currentMessage != null){ if (localName.equalsIgnoreCase(TITLE)){ currentMessage.setTitle(builder.toString()); } else if (localName.equalsIgnoreCase(LINK)){ currentMessage.setLink(builder.toString()); } else if (localName.equalsIgnoreCase(DESCRIPTION)){ currentMessage.setDescription(builder.toString()); } else if (localName.equalsIgnoreCase(PUB_DATE)){ currentMessage.setDate(builder.toString()); } else if (localName.equalsIgnoreCase(GUID)){ currentMessage.setGuid(builder.toString()); } else if (uri.equalsIgnoreCase(AVATAR)){ currentMessage.setAvatar(builder.toString()); } else if (localName.equalsIgnoreCase(ITEM)){ messages.add(currentMessage); } builder.setLength(0); } } startDocument looks like: @Override public void startDocument() throws SAXException { super.startDocument(); messages = new ArrayList<Message>(); builder = new StringBuilder(); } startElement looks like: @Override public void startElement(String uri, String localName, String name, Attributes attributes) throws SAXException { super.startElement(uri, localName, name, attributes); if (localName.equalsIgnoreCase(ITEM)){ this.currentMessage = new Message(); } } Tony

Search Results

Search found 37274 results on 1491 pages for 'text parsing'.

Page 117/1491 | < Previous Page | 113 114 115 116 117 118 119 120 121 122 123 124 | Next Page >

- by StackOverflowNewbie

- by alonp

- by Rob Allen

- by DeanMc

- by Paul Reiners

- by maxp

- by Xaisoft

- by Renesis

- by Chad

- by roshan

- by jitendra

- by Simon

- by user2485710

- by Pradeep Reddy Kypa

- by Rosarch

- by Morinar

- by gmatt

- by Yaaqov

- by fonix232

- by Lee

- by dhee

- by Patrick

- by Jon Hadley

- by Jay Askren

- by Silvestri

< Previous Page | 113 114 115 116 117 118 119 120 121 122 123 124 | Next Page >