Search Results

Search found 21350 results on 854 pages for 'url parsing'.


  • Loading not-so-well-formed XML into XDocument (multiple DTD)

    - by Gart
    I have a problem handling data that is an almost well-formed XHTML document, except that it has multiple DTD declarations at the beginning: <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN"> <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN"> <html xmlns="http://www.w3.org/1999/xhtml"> <head> ... </head> <body> ... </body> </html> I need to load this data into an XDocument object using only the first DTD and ignoring the remaining declarations. It is not possible to skip DTD processing entirely, because the document may contain character entities such as &acirc; or &euro;. The text is retrieved from an external source and I have no idea why it arrives like this. Obviously, my naive attempt to load this document fails with System.Xml.XmlException: Cannot have multiple DTDs: var xmlReaderSettings = new XmlReaderSettings { DtdProcessing = DtdProcessing.Parse, XmlResolver = new XmlPreloadedResolver(), ConformanceLevel = ConformanceLevel.Document, }; using (var xmlReader = XmlReader.Create(stream, xmlReaderSettings)) { return XDocument.Load(xmlReader); } What would be the best way to handle this kind of data?
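
    One possible approach is to pre-process the text so that only the first DOCTYPE declaration survives, and then hand the cleaned text to the XML loader. The sketch below shows that idea in Python rather than C#; the helper name keep_first_doctype is hypothetical, and the pattern assumes simple declarations without an internal subset in square brackets.

```python
import re

# Matches a simple DOCTYPE declaration (assumption: no internal subset "[...]").
DOCTYPE_RE = re.compile(r"<!DOCTYPE[^>]*>", re.IGNORECASE)


def keep_first_doctype(markup):
    """Return markup with every DOCTYPE declaration after the first removed."""
    seen = False

    def replace(match):
        nonlocal seen
        if seen:
            return ""              # drop the second, third, ... declaration
        seen = True
        return match.group(0)      # keep the first declaration untouched

    return DOCTYPE_RE.sub(replace, markup)
```

    The same substitution could be applied to the string read from the stream before it reaches XmlReader.Create; whether a regular expression is robust enough depends on how irregular the incoming documents are.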

    Read the article

  • boost spirit semantic action parameters

    - by lurscher
    Hi, this article about Boost.Spirit semantic actions mentions that: There are actually 2 more arguments being passed: the parser context and a reference to a boolean ‘hit’ parameter. The parser context is meaningful only if the semantic action is attached somewhere to the right hand side of a rule. We will see more information about this shortly. The boolean value can be set to false inside the semantic action invalidates the match in retrospective, making the parser fail. All fine, but I've been trying to find an example that passes a function object as a semantic action and uses the other parameters (the parser context and the hit boolean), and I haven't found any. I would love to see an example using regular functions or function objects, as I can barely grok the Phoenix voodoo.

    Read the article

  • delete element from xml using LINQ

    - by Shishir
    Hello, I have an XML file like: <starting> <start> <site>mushfiq.com</site> <site>mee.con</site> <site>ttttt.co</site> <site>jkjhkhjkh</site> <site>jhkhjkjhkhjkhjkjhkh</site> <site>dasdasdasdasdasdas</site> </start> </starting> Now I need to delete any ..., and the value will be supplied from a textbox. Here is my code: XDocument doc = XDocument.Load(@"AddedSites.xml"); var deleteQuery = from r in doc.Descendants("start") where r.Element("site").Value == txt.Text.Trim() select r; foreach (var qry in deleteQuery) { qry.Element("site").Remove(); } doc.Save(@"AddedSites.xml"); If I put the value of the first element in the textbox, it is deleted, but if I put the value of any element other than the first, nothing is deleted! I need to be able to delete any element by its value, whether it is the 2nd, 3rd, 4th, and so on. Can anyone help me out? Thanks in advance!
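
    A likely cause is that the query only ever looks at the first site child of start, so later values are never compared. The general fix is to iterate over every site element and remove the ones whose text matches. Below is a minimal sketch of that idea in Python (not C#), using the standard xml.etree.ElementTree module; the function name remove_sites is hypothetical.

```python
import xml.etree.ElementTree as ET


def remove_sites(xml_text, value):
    """Remove every <site> whose text equals value; return the new XML string."""
    root = ET.fromstring(xml_text)
    # Snapshot the elements first so removal does not disturb the traversal.
    for parent in list(root.iter()):
        # Copy the child list too: removing while iterating skips elements.
        for child in list(parent):
            if child.tag == "site" and (child.text or "").strip() == value:
                parent.remove(child)
    return ET.tostring(root, encoding="unicode")
```

    In LINQ to XML the equivalent shape would be to select the matching site descendants themselves and remove those, rather than selecting the start element.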

    Read the article

  • Get the rendered text from HTML (Delphi)

    - by Daisetsu
    I have some HTML and I need to extract the actual written text from the page. So far I have tried rendering the page in a web browser control and then reading the text from its document property. This works, but only where that browser is supported (the IE COM object). The problem is that I also want this to run under Wine, so I need a solution that doesn't use IE COM. There must be a reasonable programmatic way to do this.

    Read the article

  • Alternative for namespaces in xml

    - by mridul4c
    I have a web service that provides an XML feed to a number of our clients, who consume the XML on different types of devices. Our XML also uses some namespaces. However, one of our clients can't detect namespaces because of a limitation on their end, and I can't provide a separate XML feed just for them. Please suggest how I can meet the needs that the namespaces serve without using them, so that I can change my XML to be usable by all clients. Thanks in advance.

    Read the article

  • how to pass the parameters to the urlconnection in java/android?

    - by androidbase
    Hi all, I can establish a connection using HttpURLConnection. My code is below. client = new DefaultHttpClient(); URL action_url = new URL(actionUrl); conn = (HttpURLConnection) action_url.openConnection(); conn.setDoOutput(true); conn.setDoInput(true); conn.setRequestProperty("domain", "bschool.hbs.edu"); conn.setRequestProperty("userType", "2"); conn.setRequestProperty("referer", "http://www.alumni.hbs.edu/"); conn.setRequestProperty("Content-Type", "application/x-www-form-urlencoded"); conn.setRequestMethod(HttpPost.METHOD_NAME); DataOutputStream ds = new DataOutputStream(conn.getOutputStream()); String content = "username=username1&password=password11"; Log.v(TAG, "content: " + content); ds.writeBytes(content); ds.flush(); ds.close(); InputStream in = conn.getInputStream();//**getting filenotfound exception here.** BufferedReader reader = new BufferedReader( new InputStreamReader(in)); StringBuilder str1 = new StringBuilder(); String line = null; while ((line = reader.readLine()) != null) { str1.append(line); Log.v(TAG, "line:" + line); } in.close(); s = str1.toString(); I am getting a FileNotFoundException at the marked line and don't know why. Otherwise, please suggest another way to pass the username and password parameters to the URL in code.
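
    Independently of the exception, form parameters sent with Content-Type application/x-www-form-urlencoded should be percent-encoded before they are written to the request body, otherwise characters such as & or = inside a value corrupt the field boundaries. The sketch below illustrates the encoding step in Python (not Java); the helper name and the sample values are made up for the example.

```python
from urllib.parse import urlencode, parse_qs


def build_form_body(params):
    """Encode a dict of form fields as application/x-www-form-urlencoded bytes."""
    return urlencode(params).encode("ascii")


# Hypothetical credentials containing characters that must be escaped.
body = build_form_body({"username": "user name", "password": "p&ss=word"})
# Decoding the body again recovers the original values.
fields = parse_qs(body.decode("ascii"))
```

    In Java the corresponding step would be to pass each name and value through java.net.URLEncoder.encode before joining them with & and =.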

    Read the article

  • Nokogiri pull parser (Nokogiri::XML::Reader) issue with self closing tag

    - by Vlad Zloteanu
    I have a huge XML file (400 MB) containing products. Using a DOM parser is therefore ruled out, so I tried to parse and process it using a pull parser. Below is a snippet from the each_product(&block) method, where I iterate over the product list. Basically, using a stack, I transform each <product> ... </product> node into a hash and process it. while (reader.read) case reader.node_type #start element when Nokogiri::XML::Node::ELEMENT_NODE elem_name = reader.name.to_s stack.push([elem_name, {}]) #text element when Nokogiri::XML::Node::TEXT_NODE, Nokogiri::XML::Node::CDATA_SECTION_NODE stack.last[1] = reader.value #end element when Nokogiri::XML::Node::ELEMENT_DECL return if stack.empty? elem = stack.pop parent = stack.last if parent.nil? yield(elem[1]) elem = nil next end key = elem[0] parent_childs = parent[1] # ... parent_childs[key] = elem[1] end The issue is with self-closing tags (e.g. <country/>), as I cannot tell the difference between a 'normal' and a 'self-closing' tag. Both are of type Nokogiri::XML::Node::ELEMENT_NODE, and I am not able to find any other discriminator in the documentation. Any ideas on how to solve this issue?
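
    The stack technique itself does not have to distinguish the two tag forms if the parser reports an end event for both. The sketch below shows this in Python (not Ruby) with xml.etree.ElementTree.iterparse, which emits a start and an end event for <country/> just as for <country></country>; the element names are taken from the question and the function mirrors the each_product idea. For Nokogiri specifically, the Reader class appears to offer a self_closing? predicate, which is worth checking against the installed version.

```python
import io
import xml.etree.ElementTree as ET


def each_product(stream):
    """Yield one dict per <product>, built from start/end events with a stack."""
    stack = []
    for event, elem in ET.iterparse(stream, events=("start", "end")):
        if event == "start":
            stack.append((elem.tag, {}))
            continue
        # The "end" event fires for <country></country> and <country/> alike.
        tag, children = stack.pop()
        value = children if children else (elem.text or "").strip()
        if tag == "product":
            yield value
            elem.clear()  # release the finished subtree to keep memory flat
        elif stack:
            stack[-1][1][tag] = value
```

    A self-closing element simply ends up as an empty string in the resulting dict, so no special case is needed in the stack handling.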

    Read the article

  • c# Network Programming - HTTPWebRequest Scraping

    - by masterguru
    Hi, I am building a web scraping application. It should scrape a complex web site with concurrent HttpWebRequests from a single host to a single target web server. The application should run on Windows Server 2008. A single HttpWebRequest for data can take from 1 to 4 minutes to complete (because of long-running database operations). I need at least 100 parallel requests to the target web server, but I have noticed that when I use more than 2-3 long-running requests I get serious performance issues (request timeouts and hanging). How many concurrent requests can I have in this scenario from a single host to a single target web server? Can I use thread pools in the application to run parallel HttpWebRequests to the server? Will I have any issues with the default outbound HTTP connection/request limits? What about request timeouts when I reach the outbound connection limits? What would be the best setup for my scenario? Any help would be appreciated. Thanks

    Read the article

  • Coding a parser for a domain specific language in Java

    - by Bruno Rothgiesser
    We want to design a simple domain-specific language for writing test scripts to automatically test an XML-based interface of one of our applications. A sample test would be: 1) get an input XML file from a network shared folder or Subversion repository; 2) import the XML file using the interface; 3) check whether the import result message was successful; 4) export the XML corresponding to the object that was just imported using the interface and check whether it is correct. If the domain-specific language can be declarative, with statements that look as close as possible to the sentences in the sample above, that would be great, because people won't necessarily have to be programmers to understand, write, and maintain the tests. Something like: newObject = GET FILE "http://svn/repos/template1.xml" responseMessage = IMPORT newObject newObjectID = GET PROPERTY '/object/id/' FROM responseMessage (..) But I'm not sure how to implement a simple parser for that language in Java. Back in school, 10 years ago, I coded a parser for the C language using Lex and Yacc. Maybe one approach would be to use some equivalent for Java? Or I could give up the idea of a declarative language and choose an XML-based language instead, which would possibly be easier to create a parser for. What approach would you recommend?
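
    For a language this small, one option that avoids a parser generator entirely is a line-oriented parser with one regular expression per statement form. The sketch below shows the idea in Python (not Java); the three statement shapes are inferred from the sample script in the question, and the statement kinds (get_file, import, get_property) are hypothetical names. A grammar that grows beyond a handful of forms would probably be better served by a generator such as ANTLR.

```python
import re

# One regex per statement form; this grammar is an assumption based on the sample.
STATEMENTS = [
    ("get_file", re.compile(r'^(\w+)\s*=\s*GET FILE\s+"([^"]+)"$')),
    ("import", re.compile(r"^(\w+)\s*=\s*IMPORT\s+(\w+)$")),
    ("get_property",
     re.compile(r"^(\w+)\s*=\s*GET PROPERTY\s+'([^']+)'\s+FROM\s+(\w+)$")),
]


def parse_script(text):
    """Parse each non-empty line into a (kind, target, *args) tuple."""
    program = []
    for number, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue
        for kind, pattern in STATEMENTS:
            match = pattern.match(line)
            if match:
                program.append((kind,) + match.groups())
                break
        else:
            # No statement form matched: report the offending line.
            raise SyntaxError(f"line {number}: cannot parse {line!r}")
    return program
```

    The resulting list of tuples can then be walked by a small interpreter that keeps the named variables in a dictionary.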

    Read the article

  • Core Data Error "Fetch Request must have an entity"

    - by Graeme
    Hi, I've attempted to add the TopSongs parser and Core Data files to my application, and it now builds successfully, with no errors or warning messages. However, as soon as the app loads, it crashes, giving the following reason: *** Terminating app due to uncaught exception 'NSInvalidArgumentException', reason: 'executeFetchRequest:error: A fetch request must have an entity.' I have renamed all files, including the .xcdatamodel file. Could this be the problem (the renaming of the .xcdatamodel)? I'm assuming this error means that no data can be found. Thanks.

    Read the article

  • C#: Regex to extract portions of file name

    - by jakesankey
    I have text files named like this: R156484COMP_004A7001_20100104_065119.txt I need to consistently extract the R****COMP part, the 004A7001 number, and 20100104 (the date); I don't care about the 065119 number. The problem is that not ALL of the files being parsed follow this exact naming convention. Some may look like this: R168166CRIT_156B2075_SU2_20091223_123456.txt or R285476COMP_SU1_125A6025_20100407_123456.txt So how could I use a regex instead of split to ensure I always get the serial (e.g. 004A7001), the date (e.g. 20100104), and the R****COMP (or CRIT) part? Here is what I do now, but it only handles files formatted like my first example. if (file.Count(c => c == '_') != 3) continue; and further down in the code I have: string RNumber = Path.GetFileNameWithoutExtension(file); string RNumberE = RNumber.Split('_')[0]; string RNumberD = RNumber.Split('_')[1]; string RNumberDate = RNumber.Split('_')[2]; DateTime dateTime = DateTime.ParseExact(RNumberDate, "yyyyMMdd", Thread.CurrentThread.CurrentCulture); string cmmDate = dateTime.ToString("dd-MMM-yyyy");
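
    One way to become independent of field position is to search for each part by its shape instead of its index. The sketch below does this in Python (not C#). The shapes are assumptions inferred from the three sample names only: the serial is three digits, one capital letter, and four digits; the date is an eight-digit field followed by another underscore; the prefix is R, digits, then COMP or CRIT. The pattern syntax used here is also accepted by .NET regular expressions.

```python
import re
from datetime import datetime

# Field shapes inferred from the three sample names (assumptions).
R_NUMBER = re.compile(r"^(R\d+(?:COMP|CRIT))_")
SERIAL = re.compile(r"_(\d{3}[A-Z]\d{4})(?=_)")
DATE = re.compile(r"_(\d{8})(?=_)")


def parse_name(filename):
    """Return (r_number, serial, date) or None if any part is missing."""
    found = [pattern.search(filename) for pattern in (R_NUMBER, SERIAL, DATE)]
    if not all(found):
        return None
    r_number, serial, date_text = (match.group(1) for match in found)
    return r_number, serial, datetime.strptime(date_text, "%Y%m%d").date()
```

    Files that do not contain all three parts return None, which plays the role of the continue in the original loop.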

    Read the article

  • How to handle corrupt messages arriving on a socket?

    - by Pieter
    I've got a working socket handling mechanism, similar to (but a bit more complex than) Qt's Fortune example: http://qt.nokia.com/doc/4.5/network-fortuneclient.html http://qt.nokia.com/doc/4.5/network-fortuneserver.html Now I'm wondering how to handle corrupt messages. Discarding the data is a start, but I need to discard only up to a point where I can start processing messages again. The corrupt message may be lost, but I need to be able to recover from it. I've got the following idea in mind: put a fixed header at the start of each message, e.g. 0xABCDEF01. When recovering, look for this header and restart handling incoming messages. That is: break off readFortune() on a timeout and recover; when encountering an inconsistent header, recover. A huge block size is still going to be a problem. To fix that, I should constantly check whether or not I'm reading gibberish, but this is not always possible. I can also limit the block size for certain message types. Any ideas on this? Any proposals on what to use as the byte word?
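
    The idea described above (a fixed header, a bounded block size, and resynchronising by scanning for the next header) can be sketched as follows. This is Python rather than Qt/C++, and the frame layout is an assumption: the 4-byte header from the question followed by a 4-byte big-endian body length. The MAX_BODY limit and the function names are hypothetical.

```python
import struct

MAGIC = bytes.fromhex("ABCDEF01")   # fixed header suggested in the question
MAX_BODY = 1024                     # assumed upper bound on the body size
HEADER = struct.Struct(">4sI")      # magic + big-endian body length


def frame(body):
    """Wrap a message body in the assumed header."""
    return HEADER.pack(MAGIC, len(body)) + body


def extract_messages(buffer):
    """Return (messages, leftover) from a bytearray of received data."""
    messages = []
    while True:
        start = buffer.find(MAGIC)
        if start < 0:
            # No header in sight: keep only a tail that could begin one.
            del buffer[:max(0, len(buffer) - (len(MAGIC) - 1))]
            break
        del buffer[:start]                  # discard gibberish before header
        if len(buffer) < HEADER.size:
            break                           # wait for the rest of the header
        _, length = HEADER.unpack_from(buffer)
        if length > MAX_BODY:
            del buffer[:1]                  # implausible size: resynchronise
            continue
        if len(buffer) < HEADER.size + length:
            break                           # wait for the rest of the body
        messages.append(bytes(buffer[HEADER.size:HEADER.size + length]))
        del buffer[:HEADER.size + length]
    return messages, buffer
```

    A header alone cannot detect corruption inside the body; adding a checksum to each frame would cover that case, at the cost of a few extra bytes per message.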

    Read the article

  • Parse HTML with PHP's HTML DOMDocument

    - by Mint
    I was trying to do it with "getElementsByTagName", but it wasn't working. I'm new to using DOMDocument to parse HTML; I used regex until yesterday, when some kind folks here told me that DOMDocument would be better for the job, so I'm giving it a try :) I googled around for a while looking for explanations but didn't find anything that helped (not with the class, anyway). So I want to capture "Capture this text 1", "Capture this text 2", and so on. It doesn't look too hard, but I can't figure it out :( <div class="main"> <div class="text"> Capture this text 1 </div> </div> <div class="main"> <div class="text"> Capture this text 2 </div> </div>
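
    The underlying task is to select elements by tag and class and then read their text. The sketch below shows this in Python (not PHP) with the standard html.parser module; the class name TextDivCollector is hypothetical. With PHP's DOM extension, a comparable route would be a DOMXPath query for div elements whose class attribute is "text".

```python
from html.parser import HTMLParser


class TextDivCollector(HTMLParser):
    """Collect the text of every <div class="text"> element."""

    def __init__(self):
        super().__init__()
        self.depth = 0      # greater than 0 while inside a matching div
        self.parts = []
        self.texts = []

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        classes = (dict(attrs).get("class") or "").split()
        # Count nested divs too, so the matching </div> is found correctly.
        if self.depth or "text" in classes:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "div" and self.depth:
            self.depth -= 1
            if self.depth == 0:
                self.texts.append("".join(self.parts).strip())
                self.parts = []

    def handle_data(self, data):
        if self.depth:
            self.parts.append(data)


def capture(html):
    parser = TextDivCollector()
    parser.feed(html)
    return parser.texts
```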

    Read the article

  • Detect the language & django locale-url

    - by mamcx
    I want to deploy a website in English and Spanish, detect the user's browser language, and redirect to the correct locale site. My site is www.elmalabarista.com. I installed django-localeurl, but I discovered that the language is not correctly detected. These are my middlewares: MIDDLEWARE_CLASSES = ( 'django.contrib.sessions.middleware.SessionMiddleware', 'django.middleware.locale.LocaleMiddleware', 'multilingual.middleware.DefaultLanguageMiddleware', 'middleware.feedburner.FeedburnerMiddleware', 'lib.threadlocals.ThreadLocalsMiddleware', 'middleware.url.UrlMiddleware', 'django.contrib.auth.middleware.AuthenticationMiddleware', 'maintenancemode.middleware.MaintenanceModeMiddleware', 'middleware.redirect.RedirectMiddleware', 'openidconsumer.middleware.OpenIDMiddleware', 'django.middleware.doc.XViewMiddleware', 'middleware.ajax_errors.AjaxMiddleware', 'pingback.middleware.PingbackMiddleware', 'localeurl.middleware.LocaleURLMiddleware', 'multilingual.flatpages.middleware.FlatpageFallbackMiddleware', 'django.middleware.common.CommonMiddleware', ) But the site ALWAYS goes to US English, despite the fact that my OS and browser are set up for Spanish. LANGUAGES = ( ('en', ugettext('English')), ('es', ugettext('Spanish')), ) DEFAULT_LANGUAGE = 1 Then I hacked the locale-url middleware and did this: def process_request(self, request): locale, path = self.split_locale_from_request(request) if request.META.has_key('HTTP_ACCEPT_LANGUAGE'): locale = utils.supported_language(request.META['HTTP_ACCEPT_LANGUAGE'].split(',')[0]) locale_path = utils.locale_path(path, locale) if locale_path != request.path_info: if request.META.get("QUERY_STRING", ""): locale_path = "%s?%s" % (locale_path, request.META['QUERY_STRING']) return HttpResponseRedirect(locale_path) request.path_info = path if not locale: locale = settings.LANGUAGE_CODE translation.activate(locale) request.LANGUAGE_CODE = translation.get_language() This detects the language fine, but it redirects the "en" URLs to "es", so it is impossible to navigate in English. UPDATE: This is the final code (after the input from Carl Meyer) with a fix for the case of "/": def process_request(self, request): locale, path = self.split_locale_from_request(request) if (not locale) or (locale==''): if request.META.has_key('HTTP_ACCEPT_LANGUAGE'): locale = utils.supported_language(request.META['HTTP_ACCEPT_LANGUAGE'].split(',')[0]) else: locale = settings.LANGUAGE_CODE locale_path = utils.locale_path(path, locale) if locale_path != request.path_info: if request.META.get("QUERY_STRING", ""): locale_path = "%s?%s" % (locale_path, request.META['QUERY_STRING']) return HttpResponseRedirect(locale_path) request.path_info = path translation.activate(locale) request.LANGUAGE_CODE = translation.get_language()
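
    Both versions above look only at the first entry of the Accept-Language header. A slightly more careful variant weighs every entry by its q-value and falls back to a default when nothing matches. The sketch below is framework-free Python and does not use Django or django-localeurl; the function name best_language and its parameters are hypothetical.

```python
def best_language(accept_language, supported, default):
    """Pick the supported language with the highest q-value in the header."""
    candidates = []
    for item in accept_language.split(","):
        piece, _, params = item.strip().partition(";")
        quality = 1.0                       # entries without q= count as 1.0
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                continue                    # skip malformed entries
        # Compare on the primary subtag, so "es-MX" can match "es".
        candidates.append((quality, piece.strip().lower().split("-")[0]))
    # sorted() is stable, so ties keep the order given in the header.
    for quality, lang in sorted(candidates, key=lambda c: -c[0]):
        if quality > 0 and lang in supported:
            return lang
    return default
```

    As in the updated middleware, such a lookup would only be consulted when the requested URL carries no locale prefix, so that explicit "en" URLs stay reachable for a Spanish browser.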

    Read the article

  • replace double quotes to parse JSON in PHP

    - by hunt
    Hi, I have the following JSON: { "status": "ACTIVE", "result": false, "isworking": false, "margin": 1, "employee": { "111": { "val1": 5.7000000000000002, "val2": "9/2", "val3": 5.7000000000000002 }, "222": { "val1": 31.550000000000001, "val2": "29/1", "val3": 31.550000000000001 } } } Now the problem is that when I decode the above JSON response in PHP using json_decode($res,true) (the true parameter requests an associative array), I get the result below. Some fields look wrong to me: for example, "result":false is not "result":"false", i.e. in many places the double quotes are missing from the JSON values (see also the val1 and val3 fields). The resulting data after decoding in PHP (associative array): Array ( [status] => ACTIVE [result] => [isworking] => [margin] => 1 [employee] => Array ( [111] => Array ( [val1] => 5.7 [val2] => 9/2 [val3] => 5.7 ) [222] => Array ( [val1] => 31.55 [val2] => 29/1 [val3] => 31.55 ) ) ) Please help me: how would I insert double quotes around the values? Thanks
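
    The JSON shown is valid as it stands: false and 5.7000000000000002 are typed literals (a boolean and a number), not strings with missing quotes, so no quotes need to be inserted. The empty output for result is most likely just how print_r renders a boolean false; var_dump shows the types. The sketch below illustrates the typed decoding in Python (not PHP) on a shortened copy of the data.

```python
import json

# Shortened copy of the JSON from the question.
raw = ('{"status": "ACTIVE", "result": false, "margin": 1, '
       '"employee": {"111": {"val1": 5.7000000000000002, "val2": "9/2"}}}')

data = json.loads(raw)

# Each value keeps its JSON type after decoding.
result_type = type(data["result"]).__name__                    # boolean
val1_type = type(data["employee"]["111"]["val1"]).__name__     # number
val2_type = type(data["employee"]["111"]["val2"]).__name__     # string
```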

    Read the article

  • Parse an HTTP request Authorization header with Python

    - by Kris Walker
    I need to take a header like this: Authorization: Digest qop="chap", realm="[email protected]", username="Foobear", response="6629fae49393a05397450978507c4ef1", cnonce="5ccc069c403ebaf9f0171e9517f40e41" And parse it into this using Python: {'protocol':'Digest', 'qop':'chap', 'realm':'[email protected]', 'username':'Foobear', 'response':'6629fae49393a05397450978507c4ef1', 'cnonce':'5ccc069c403ebaf9f0171e9517f40e41'} Is there a library to do this, or something I could look at for inspiration? I'm doing this on Google App Engine, and I'm not sure if the Pyparsing library is available, but maybe I could include it with my app if it is the best solution. Currently I'm creating my own MyHeaderParser object and using it with reduce() on the header string. It's working, but very fragile. Brilliant solution by nadia below: import re reg = re.compile('(\w+)[=] ?"?(\w+)"?') s = """Digest realm="stackoverflow.com", username="kixx" """ print str(dict(reg.findall(s)))
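
    The regular expression quoted above only accepts word characters in values, so a value containing @ or a dot would be cut short. One alternative that stays within the standard library is sketched below for Python 3: urllib.request provides parse_http_list and parse_keqv_list, which split a comma-separated header while respecting quoted strings. The function name parse_authorization and the sample realm are made up for the example.

```python
from urllib.request import parse_http_list, parse_keqv_list


def parse_authorization(value):
    """Split an Authorization header into its scheme and key/value fields."""
    if value.lower().startswith("authorization:"):
        value = value.split(":", 1)[1]      # drop the header name if present
    scheme, _, params = value.strip().partition(" ")
    # parse_http_list keeps quoted commas intact; parse_keqv_list strips quotes.
    fields = parse_keqv_list(parse_http_list(params))
    fields["protocol"] = scheme
    return fields
```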

    Read the article

  • Parse text from a screen grab

    - by Caylem
    Hey guys, I'm not sure of the best way to explain this, but I'll give it a shot. I'm trying to find a way to parse text/numbers from a screen grab in either C# or Java, whichever provides the easiest way, but preferably Java. An example would be as follows. You have a website/document/application with a block of text. You can take a screenshot of the specific area that contains this text. Once the screenshot has been taken, you can extract a string from it containing the relevant characters. Any feedback is appreciated. Thanks

    Read the article

  • Managed (.net) library with html-tidy like functionality?

    - by Eamon Nerbonne
    Does anybody know of an HTML cleaner for .NET that can parse HTML and (for instance) convert it to a more machine-friendly format such as XHTML? I've tried the HTML Agility Pack, but that fails to correctly parse even fairly simple examples. To give an example of HTML that should be parsed correctly: <html><body> <ul><li>TestElem1 <li>TestElem2 <li>TestElem3 List: <ul><li>Nested1 <li>Nested2</li> <li>Nested3 </ul> <li>TestElem4 </ul> <p>paragraph 1 <p>paragraph 2 <p>paragraph 3 </body></html> li tags don't need to be closed (see spec), and neither do p tags. In other words, the above sample should be parsed as: <html><body> <ul><li>TestElem1</li> <li>TestElem2</li> <li>TestElem3 List: <ul><li>Nested1</li> <li>Nested2</li> <li>Nested3</li> </ul></li> <li>TestElem4</li> </ul> <p>paragraph 1</p> <p>paragraph 2</p> <p>paragraph 3</p> </body></html> Since the aim is to use the library on various machines, it's a big disadvantage to need to fall back to native code (such as a wrapper around HTML Tidy), which would require extra deployment hassle and sacrifice platform independence. Any suggestions? To recap, I'm looking for an HTML cleaner à la HTML Tidy that: must be able to deal with real-world HTML, not just XHTML, at the very least correctly reading valid HTML 4; must be able to convert to a more easily processable XML format; should be purely managed code.

    Read the article

  • Extracting "((Adj|Noun)+|((Adj|Noun)(Noun-Prep)?)(Adj|Noun))Noun" from Text (Justeson & Katz, 1995)

    - by ssuhan
    I would like to ask whether it is possible to extract the pattern ((Adj|Noun)+|((Adj|Noun)(Noun-Prep)?)(Adj|Noun))Noun proposed by Justeson and Katz (1995) using the R package openNLP. That is, I would like to use this linguistic filter to extract candidate noun phrases. I do not fully understand its meaning. Could you explain it, or translate this representation into R? Many thanks. Maybe we can start from this sample code: library("openNLP") acq <- "This paper describes a novel optical thread plug gauge (OTPG) for internal thread inspection using machine vision. The OTPG is composed of a rigid industrial endoscope, a charge-coupled device camera, and a two degree-of-freedom motion control unit. A sequence of partial wall images of an internal thread are retrieved and reconstructed into a 2D unwrapped image. Then, a digital image processing and classification procedure is used to normalize, segment, and determine the quality of the internal thread." acqTag <- tagPOS(acq) acqTagSplit = strsplit(acqTag," ")
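
    The pattern is a regular expression over part-of-speech tags rather than characters: a run of adjectives and nouns, optionally containing one noun-preposition pair, that ends in a noun. One way to apply it is to map each token's tag to a single letter and run an ordinary regex over the resulting string. The sketch below does this in Python (not R) on already-tagged input. It assumes Penn Treebank style tags (JJ*, NN*, IN) and uses the commonly cited form of the filter with repetition stars, ((A|N)+|((A|N)*(NP)?)(A|N)*)N, which the version quoted in the question appears to have lost; single-word matches are dropped. All function names are hypothetical.

```python
import re

# One letter per token: A = adjective, N = noun, P = preposition, x = other.
# Assumed form of the filter: ((A|N)+ | (A|N)*(NP)?(A|N)*) N
TERM = re.compile(r"(?:[AN]+|[AN]*(?:NP)?[AN]*)N")


def tag_letter(tag):
    """Map a Penn Treebank tag to the one-letter alphabet used by TERM."""
    if tag.startswith("JJ"):
        return "A"
    if tag.startswith("NN"):
        return "N"
    if tag == "IN":
        return "P"
    return "x"


def candidate_phrases(tagged):
    """tagged: list of (word, tag) pairs; returns the matched multi-word phrases."""
    letters = "".join(tag_letter(tag) for _, tag in tagged)
    return [
        " ".join(word for word, _ in tagged[m.start():m.end()])
        for m in TERM.finditer(letters)
        if m.end() - m.start() >= 2     # keep only multi-word candidates
    ]
```

    In R the same approach would work on the word/tag pairs produced by the tagger: build the letter string, match it with a regular expression, and map the match positions back to the words.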

    Read the article

  • Encoding a email address that can be used as part of a URL in codeigniter

    - by freedayum
    Is there a way to encode an email address so that it can be used as part of a URL in CodeIgniter? I need to be able to decode the email address from the URL again. What I am trying to do is a forgotten-password recovery feature. I send a confirmation link to the user's email address; the link needs to look like ../encodedEmail/forgottenPasswordCode (with the forgottenPasswordCode stored in the database for the user with the submitted email). When the user visits that link, I decode the email and, if the email/forgottenPasswordCode pair is in the table, I allow them to reset their password (and I reset forgottenPasswordCode back to null). I could just loop, checking the table with a select query, until I generate a forgottenPasswordCode that doesn't already exist in the table; or I could make the forgottenPasswordCode column unique and keep generating on insert failure (would that be a lot faster?). But the guy I am doing this for would not accept it this way :). He wants the check to be done with the user's email, because he thinks it's much faster. I am working with CodeIgniter and used its encode() function, but it sometimes produces characters such as slashes that break the encoded email string in the URL. Any other ideas?
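
    If the goal is only to make the address safe to place in a URL path segment (not to hide it), the URL-safe Base64 alphabet avoids the problematic characters, since it uses - and _ instead of + and /. The sketch below shows the round trip in Python (not PHP); the function names are hypothetical. Note that this is an encoding, not encryption: anyone holding the link can recover the address.

```python
import base64


def encode_email(email):
    """Encode an email address with the URL-safe Base64 alphabet, no padding."""
    token = base64.urlsafe_b64encode(email.encode("utf-8")).decode("ascii")
    return token.rstrip("=")


def decode_email(token):
    """Restore the padding removed by encode_email and decode the address."""
    padded = token + "=" * (-len(token) % 4)
    return base64.urlsafe_b64decode(padded).decode("utf-8")
```

    In PHP a comparable result can be had by applying base64_encode and then replacing + and / with - and _ (and reversing that before base64_decode).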

    Read the article
