Search Results

Search found 4222 results on 169 pages for 'dtd parsing'.

  • Loading not-so-well-formed XML into XDocument (multiple DTD)

    - by Gart
    I have a problem handling data that is an almost well-formed XHTML document, except that it has multiple DTD declarations at the beginning:

        <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
        <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
        <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
        <html xmlns="http://www.w3.org/1999/xhtml">
        <head> ... </head>
        <body> ... </body>
        </html>

    I need to load this data into an XDocument object using only the first DTD and ignoring the rest of the declarations. It is not possible to ignore DTD processing completely, because the document may contain unusual characters such as &acirc; or &euro;. The text is retrieved from an external source and I have no idea why it arrives like this. Obviously, my naive attempt to load this document fails with System.Xml.XmlException: Cannot have multiple DTDs:

        var xmlReaderSettings = new XmlReaderSettings
        {
            DtdProcessing = DtdProcessing.Parse,
            XmlResolver = new XmlPreloadedResolver(),
            ConformanceLevel = ConformanceLevel.Document,
        };
        using (var xmlReader = XmlReader.Create(stream, xmlReaderSettings))
        {
            return XDocument.Load(xmlReader);
        }

    What would be the best way to handle this kind of data?
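
    One possible pre-processing step, sketched here in Python rather than C#, is to strip every DOCTYPE declaration after the first before the text reaches the XML parser. The function name and the regular expression are illustrative assumptions; the pattern only covers declarations without an internal subset, which is what the sample above contains.

```python
import re

# Matches a DOCTYPE declaration that has no internal subset ("[ ... ]"),
# plus any whitespace that follows it.
DOCTYPE_RE = re.compile(r"<!DOCTYPE[^>\[]*>\s*", re.IGNORECASE)


def keep_first_doctype(text):
    """Return text with every DOCTYPE declaration after the first removed."""
    seen_first = False

    def replace(match):
        nonlocal seen_first
        if seen_first:
            return ""              # later declarations are dropped
        seen_first = True
        return match.group(0)      # the first declaration is kept unchanged

    return DOCTYPE_RE.sub(replace, text)


sample = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">\n'
    '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">\n'
    '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">\n'
    '<html xmlns="http://www.w3.org/1999/xhtml"><head/><body/></html>'
)
cleaned = keep_first_doctype(sample)
```

    The cleaned text can then be handed to whatever DTD-aware parser is in use; the same idea should carry over to a .NET string or stream wrapper.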

  • Parsing / Extracting Text from String in Rails?

    - by user641116
    I have a string in Rails, e.g. "This is a Twitter message. #books War & Peace by Leo Tolstoy. I love this book!", and I want to parse the text and extract only certain phrases, like "War & Peace by Leo Tolstoy". Is this a matter of using a regex and lifting the text between "#books" and "."?

    What if there's no structure to the message, like "This is a Twitter message #books War & Peace by Leo Tolstoy I love this book!" or "This is a Twitter message. I love the book War & Peace by Leo Tolstoy #books"? How can I reliably pull the phrase "War & Peace by Leo Tolstoy" without knowing the phrase ex ante?

    Are there any gems, methods, etc. that can help me do this? At the very least, what would you call what I'm trying to do? It will help me search for a solution on Google. I've tried a few searches on "parsing" with no luck.
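
    For the structured case only (tag first, phrase terminated by a full stop), the regex idea can be sketched as below. This is written in Python for illustration rather than Ruby, and the function name is hypothetical. It deliberately returns nothing for the two unstructured messages, which is the part a regex alone cannot solve.

```python
import re


def phrase_after_tag(message, tag="#books"):
    """Return the text between `tag` and the next full stop, or None."""
    match = re.search(re.escape(tag) + r"\s+(.+?)\.", message)
    return match.group(1) if match else None


structured = ("This is a Twitter message. #books War & Peace by Leo Tolstoy. "
              "I love this book!")
no_full_stop = ("This is a Twitter message #books War & Peace by Leo Tolstoy "
                "I love this book!")
tag_last = ("This is a Twitter message. I love the book "
            "War & Peace by Leo Tolstoy #books")
```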

  • Why is the XML DTD not found by the browser

    - by hyperuser
    When I load my XML file in a browser, it complains there is 'no style information': "This XML file does not appear to have any style information associated with it. The document tree is shown below." So I wrote an external DTD, then an internal DTD, but keep getting the same 'no style information' error. It doesn't even show the DTD! What am I doing wrong?

        <?xml version="1.0"?>
        <!DOCTYPE fotos [
        <!ELEMENT fotos (titel,auteur)>
        <!ELEMENT titel (#PCDATA)>
        <!ELEMENT auteur (#PCDATA)>
        ]>
        <fotos>
        <titel>titel1</titel>
        <auteur>jan</auteur>
        </fotos>

  • An error has occurred opening extern DTD (w3.org, xhtml1-transitional.dtd). 503 Server Unavailable

    - by Cheeso
    I'm trying to do XPath queries over an XHTML document. The document looks like this:

        <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
        <html lang="en" xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
        <head> .... </head>
        <body> ... </body>
        </html>

    Because the document includes various character entities (&nbsp; and so on), I need to use the DTD in order to load it with an XmlReader. So my code looks like this:

        var reader = XmlReader.Create(sr, new XmlReaderSettings { ProhibitDtd = false });

    But when I run this, it returns:

        An error has occurred while opening external DTD 'http://www.w3.org/TR/xhtml1-transitional.dtd': The remote server returned an error: (503) Server Unavailable.

    Now, I know why I am getting the 503 error. W3C explained it very clearly. But I still want to validate the document. How can I validate with the DTD, and get the entity definitions, without hitting the w3.org website?

    Related: java.io.IOException: Server returned HTTP response code: 503

  • XML Catalog file failing to resolve

    - by newt
    I'm using an OASIS v 1.1 compatible resolver (Norm Walsh's XMLResolver) in conjunction with the catalog below. However, I'm pretty sure I've made some sort of obvious error here (this is the first time I've needed to use v 1.1 features), since attempting to resolve OxChapML.dtd fails. Can anyone see something obviously wrong with this? Or even subtly wrong?

        <?xml version="1.0" encoding="UTF-8"?>
        <!DOCTYPE catalog PUBLIC "-//OASIS//DTD XML Catalogs V1.1//EN" "http://www.oasis-open.org/committees/entity/release/1.1/catalog.dtd">
        <catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
          <group xml:base="file:///Volumes/Ac-EDP/DTG/SP%20DTD%20management/OUP_DTD/">
            <public publicId="-//OXFORD//DTD OXCHAPML//EN" uri="OxChapML.dtd"/>
            <public publicId="-//OXFORD//DTD OXENCYCLML//EN" uri="xEncyclML.dtd"/>
            <public publicId="-//OXFORD//DTD OXLAWML//EN" uri="OxLawML.dtd"/>
            <public publicId="-//OXFORD//DTD OXSTRUCTML//EN" uri="OxStructML.dtd"/>
            <public publicId="-//OXFORD//DTD OXLAWREPML//EN" uri="OxLawRepML.dtd"/>
            <public publicId="-//OXFORD//DTD OXBILINGML//EN" uri="OxBilingML.dtd"/>
            <public publicId="-//OXFORD//DTD OXMONOLINGML//EN" uri="OxMonolingML.dtd"/>
            <public publicId="-//OXFORD//DTD TIMELINES//EN" uri="timelines.dtd"/>
            <systemSuffix OxChapML.dtd" systemIdSuffix="OxChapML.dtd"/>
            <systemSuffix uri="xEncyclML.dtd" systemIdSuffix="xEncyclML.dtd"/>
            <systemSuffix systemIdSuffix="OxLawML.dtd" uri="OxLawML.dtd"/>
            <systemSuffix systemIdSuffix="OxStructML.dtd" uri="OxStructML.dtd"/>
            <systemSuffix systemIdSuffix="OxLawRepML.dtd" uri="OxLawRepML.dtd"/>
            <systemSuffix systemIdSuffix="OxBilingML.dtd" uri="OxBilingML.dtd"/>
            <systemSuffix systemIdSuffix="OxMonolingML.dtd" uri="OxMonolingML.dtd"/>
            <systemSuffix systemIdSuffix="timelines.dtd" uri="timelines.dtd"/>
          </group>
        </catalog>

  • How to validate xml using a .dtd via a proxy and NOT using system.net.defaultproxy

    - by Lanceomagnifico
    Hi, someone else has already asked a somewhat similar question: http://stackoverflow.com/questions/1888887/validate-an-xml-file-against-a-dtd-with-a-proxy-c-2-0/2766197#2766197

    Here's my problem: we have a website application that needs to use both internal and external resources.

    We have a bunch of internal webservices. Requests to them CANNOT go through the proxy. If we try, we get 404 errors, since the proxy DNS doesn't know about our internal webservice domains.

    We generate a few XML files that have to be valid. I'd like to use the provided DTD documents to validate the XML. The DTD URLs are outside our network and MUST go through the proxy.

    Is there any way to validate via DTD through a proxy without using system.net.defaultproxy? If we use defaultproxy, the internal webservices are busted, but the DTD validation works. Here is what I'm doing to validate the XML right now:

        public static XDocument ValidateXmlUsingDtd(string xml)
        {
            var xrSettings = new XmlReaderSettings
            {
                ValidationType = ValidationType.DTD,
                ProhibitDtd = false
            };
            var sr = new StringReader(xml.Trim());
            XmlReader xRead = XmlReader.Create(sr, xrSettings);
            return XDocument.Load(xRead);
        }

    Ideally, there would be some way to assign a proxy to the XmlReader, much like you can assign a proxy to the HttpWebRequest object. Or perhaps there is a way to programmatically turn defaultproxy on or off, so that I can turn it on just for the call that loads the XDocument, then turn it off again?

    FYI, I'm open to ideas on how to tackle this. Note that the proxy is located in another domain, and they don't want to have to set up a DNS lookup to our DNS server for our internal webservice addresses. Cheers, Lance

  • Parsing scripts that use curly braces

    - by Keikoku
    To give an idea of what I'm doing: I am writing a Python parser that will parse DirectX .x text files. The problem I have deals with how the files are formatted. Although I'm writing it in Python, I'm looking for general algorithms for dealing with this sort of parsing.

    .x files define data using templates. The format of a template is

        template_name {
        [some_data]
        }

    My goal is to parse the file line by line and, whenever I come across a template, deal with it accordingly. My initial approach was to check whether a line contains an opening or closing brace. If it's an open brace, I check what the template name is. The catch is that the open brace doesn't have to occur on the same line as the template name. It could just as well be

        template_name
        {
        [some_data]
        }

    So if I were to use my "open brace exists" criterion, it won't work for any files that use the latter format. A lot of languages also use curly braces (though I'm not sure when people would be parsing the scripts themselves), so I was wondering if anyone knows how to accurately get the template name (or in some other languages, it could just as well be a function name, though there aren't any keywords to look for).
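
    One general approach is to stop thinking in lines: split the whole text into tokens (braces and whitespace-separated words) and treat the token just before each opening brace as the name. The sketch below assumes a simplified grammar with no comments or quoted strings, which real .x files may well contain; `find_templates` is a hypothetical helper, not part of any DirectX tooling.

```python
import re

# A token is either a single brace or a run of non-space, non-brace characters.
TOKEN_RE = re.compile(r"[{}]|[^\s{}]+")


def find_templates(text):
    """Return (name, body_tokens) for each top-level `name { ... }` block."""
    tokens = TOKEN_RE.findall(text)
    results = []
    previous = None
    i = 0
    while i < len(tokens):
        token = tokens[i]
        if token == "{" and previous is not None:
            depth = 1
            body = []
            i += 1
            # Collect tokens until the matching closing brace.
            while i < len(tokens) and depth > 0:
                if tokens[i] == "{":
                    depth += 1
                elif tokens[i] == "}":
                    depth -= 1
                if depth > 0:
                    body.append(tokens[i])
                i += 1
            results.append((previous, body))
            previous = None
            continue
        previous = token
        i += 1
    return results


same_line = "template_name {\n[some_data]\n}\n"
next_line = "template_name\n{\n[some_data]\n}\n"
```

    Because line breaks are just whitespace to the tokenizer, both layouts from the question produce the same result.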

  • JiBX binding DTD schema in Eclipse

    - by Trick
    I have warnings in my binding XML files: No grammar constraints (DTD or XML schema) detected for the document. I did what is written in the answer here: http://stackoverflow.com/questions/982263/jibx-how-do-i-keep-using-interfaces-in-my-code (the answer that is not accepted). But now I have an error in the binding XML file: Referenced file contains errors (file:/C:/Amplio/LiveCliq/Work/core/src/main/resources/config/rest/ mappings/binding.dtd). For more information, right click on the message in the Problems View and select "Show Details..." And the details are: The markup in the document preceding the root element must be well-formed. line 20

    I am not familiar with DTD schemas, so I don't know what the problem is. Has anybody found a solution? Also, I do not want to turn off validation in XML files; I would like to have it in the binding files (mainly for code assist and validation).

  • dtd vs xsd, which one to choose?

    - by noname
    I want to use one of these to describe my XML document. I've read that XSD is better than the older DTD since it supports namespaces and data types. Does this mean that I should only use XSD for all future needs, ignore DTD entirely, and not even bother learning its structure?

  • valid children of XmlNode according to DTD?

    - by redoced
    Consider this: I'm inside a (self-built) XML editor and am about to add a child to an XmlNode. How do I know which types of children are valid according to a DTD? The behaviour I want is like IntelliSense. I couldn't find any .NET classes for "parsing" the DTD. How would I go about this?
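
    To illustrate what "parsing the DTD" could yield, here is a sketch in Python (not .NET) that uses the standard library's expat binding, whose ElementDeclHandler reports each element declaration as a nested content-model tuple. The helper name `allowed_children` is an assumption, and the sketch only reads an internal subset. It collects the child names a model mentions and ignores ordering and cardinality, which a real editor would also need.

```python
from xml.parsers import expat


def allowed_children(xml_text):
    """Map each declared element to the set of child names in its content model."""
    allowed = {}

    def names_in(model):
        # A model is (type, quantifier, name, children); name is None for groups.
        _type, _quantifier, name, children = model
        found = {name} if name is not None else set()
        for child in children:
            found |= names_in(child)
        return found

    def on_element_decl(name, model):
        allowed[name] = names_in(model)

    parser = expat.ParserCreate()
    parser.ElementDeclHandler = on_element_decl
    parser.Parse(xml_text, True)
    return allowed


document = """<?xml version="1.0"?>
<!DOCTYPE fotos [
<!ELEMENT fotos (titel,auteur)>
<!ELEMENT titel (#PCDATA)>
<!ELEMENT auteur (#PCDATA)>
]>
<fotos><titel>titel1</titel><auteur>jan</auteur></fotos>"""
children = allowed_children(document)
```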

  • Parsing a string, Grammar file.

    - by defn
    How would I separate the string below into its parts? What I need to separate is each < Word, including the angle brackets, from the rest of the string. So in the case below I would end up with several strings:

        1. "I have to break up with you because "
        2. "< reason " (without the spaces)
        3. " . But Let's still "
        4. "< disclaimer "
        5. " ."

    The input is:

        I have to break up with you because <reason> . But let's still <disclaimer> .

    Below is what I currently have (it's ugly...):

        boolean complete = false;
        int begin = 0;
        int end = 0;
        while (complete == false) {
            if (s.charAt(end) == '<'){
                stack.add(new Terminal(s.substring(begin, end)));
                begin = end;
            } else if (s.charAt(end) == '>') {
                stack.add(new NonTerminal(s.substring(begin, end)));
                begin = end;
                end++;
            } else if (end == s.length()){
                if (isTerminal(getSubstring(s, begin, end))){
                    stack.add(new Terminal(s.substring(begin, end)));
                } else {
                    stack.add(new NonTerminal(s.substring(begin, end)));
                }
                complete = true;
            }
            end++;
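
    The same split can be expressed with a single regular expression that captures each `<...>` token. The sketch below is in Python for brevity (Java's regex classes could do something similar), and `split_grammar` is a hypothetical name.

```python
import re


def split_grammar(text):
    """Split text into literal chunks and <placeholder> tokens, in order."""
    # The capturing group makes re.split keep the delimiters in the result.
    parts = re.split(r"(<[^<>]+>)", text)
    return [part for part in parts if part]  # drop empty strings at the edges


sentence = "I have to break up with you because <reason> . But let's still <disclaimer> ."
pieces = split_grammar(sentence)
```

    A piece that starts with "<" would then be a non-terminal and everything else a terminal.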

  • Parsing tab delimited file with double quotes in Perl

    - by sfactor
    I have a data set that is tab delimited, with the user-agent strings in double quotes. I need to parse each of these columns and, based on the answer to my other post, I used the Text::CSV module.

        94410634 0 GET "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; GTB6.6; .NET CLR 2.0.50727; .NET CLR 3.0.04506.648; .NET CLR 3.5.21022; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; AskTB5.5)" 1

    The code is a simple one:

        #!/usr/bin/perl
        use strict;
        use warnings;
        use Text::CSV;

        my $csv = Text::CSV->new(sep_char => "\t");
        while (<>) {
            if ($csv->parse($_)) {
                my @columns = $csv->fields();
                print "@columns\n";
            } else {
                my $err = $csv->error_input;
                print "Failed to parse line: $err";
            }
        }

    But I get the "Failed to parse line:" error when I try it on this data set. What am I doing wrong? I need to extract the 4th column, containing the user-agent strings, for further processing.
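
    For comparison, this is how the same record could be read with Python's standard csv module, with the tab passed as the delimiter. The sample line is an assumption: it uses explicit tab characters and a shortened user-agent string, since the listing above does not preserve the original whitespace.

```python
import csv
import io

# Hypothetical sample record; "\t" marks the column separators.
line = '94410634\t0\tGET\t"Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1)"\t1\n'

reader = csv.reader(io.StringIO(line), delimiter="\t", quotechar='"')
rows = list(reader)

# The 4th column (index 3) holds the user-agent string without its quotes.
user_agent = rows[0][3]
```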

  • Perl: parsing string enclosed by double quotes

    - by sfactor
    I need to parse tab/space delimited files that have a lot of columns in Perl. Some of the values are large strings enclosed within double quotes. These strings can contain any characters, such as tabs and spaces or anything else. When I try to parse them with the split function, it splits these strings as well. How can I make Perl understand that a string within the " " is a single column entry? A simple example is:

        12 345546.67677 "Hello World!!!" -567.55656 0.5465767 "Hello_Again; "
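
    As an illustration of the quote-aware splitting being asked for, Python's standard shlex module splits on whitespace while keeping double-quoted runs together. This is a sketch in Python rather than Perl, and it assumes the data contains no backslashes or single quotes, which shlex would treat specially.

```python
import shlex

line = '12 345546.67677 "Hello World!!!" -567.55656 0.5465767 "Hello_Again; "'

# Whitespace (spaces or tabs) separates fields; quoted text stays in one field.
fields = shlex.split(line)
```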

  • What options to parse a DTD using PHP

    - by Chadwick
    I need to parse DTDs using PHP and am hoping there's a simple library to help out. Each DTD has numerous <!ENTITY... and <!-- Comment... elements, which I need to act upon. Note that I do not need to validate anything against these DTDs, simply parse them as data files themselves.

    A few options I've looked at:

        - James Clark's SP, which is an option of last resort, but I'd like to avoid the complexity of building/installing/configuring code external to PHP. I'm not sure it's even possible in my situation.
        - PEAR has an XML_DTD_Parser, which requires installing/configuring PEAR and a number of PEAR modules, which I'm also not sure is possible, and would rather avoid. Has anyone used it with success?
        - PHP XML Classes has the class_path_parser, which another site suggested, but it fails to read ENTITY elements. It appears to be using PHP's built-in XML parsing capabilities, which use Expat.
        - PHP's DOMDocument will validate against a DTD, so it must be able to read them, though at first glance I don't see how to get at the DTD parser directly.
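
    If only entity declarations and comments matter, a full DTD parser may not be needed. The sketch below shows the idea with two regular expressions, written in Python for illustration (the patterns should port to PHP's preg functions). It is an assumption-laden shortcut: it only recognises internal entities with a quoted literal value, not SYSTEM or PUBLIC external entities, and `scan_dtd` is a hypothetical name.

```python
import re

# <!ENTITY name "value"> or <!ENTITY % name 'value'> (internal entities only).
ENTITY_RE = re.compile(
    r"""<!ENTITY\s+(?P<param>%\s+)?(?P<name>[^\s%]+)\s+
        (?:"(?P<dq>[^"]*)"|'(?P<sq>[^']*)')\s*>""",
    re.VERBOSE,
)
COMMENT_RE = re.compile(r"<!--(.*?)-->", re.DOTALL)


def scan_dtd(text):
    """Return (entities, comments) found in a DTD given as a string."""
    comments = [m.group(1).strip() for m in COMMENT_RE.finditer(text)]
    # Remove comments first so commented-out declarations are not picked up.
    without_comments = COMMENT_RE.sub("", text)
    entities = {}
    for m in ENTITY_RE.finditer(without_comments):
        value = m.group("dq") if m.group("dq") is not None else m.group("sq")
        entities[m.group("name")] = value
    return entities, comments


dtd_text = """<!-- Latin letters -->
<!ENTITY eacute "&#233;">
<!ENTITY % text "(#PCDATA)">
<!ELEMENT author %text;>
"""
entities, comments = scan_dtd(dtd_text)
```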

  • getting expat to use .dtd for entity replacement in python

    - by nicolas78
    I'm trying to read in an XML file which looks like this:

        <?xml version="1.0" encoding="ISO-8859-1"?>
        <!DOCTYPE dblp SYSTEM "dblp.dtd">
        <dblp>
        <incollection>
        <author>Jos&eacute; A. Blakeley</author>
        </incollection>
        </dblp>

    The part that causes the problem is Jos&eacute; A. Blakeley: the parser calls its character handler twice, once with "Jos" and once with " A. Blakeley". I understand this may be the correct behaviour if it doesn't know the eacute entity. However, this entity is defined in dblp.dtd, which I have. I don't seem to be able to convince expat to use this file, though. All I can say is:

        p = xml.parsers.expat.ParserCreate()
        # tried with and without following line
        p.SetParamEntityParsing(xml.parsers.expat.XML_PARAM_ENTITY_PARSING_ALWAYS)
        p.UseForeignDTD(True)
        f = open(dblp_file, "r")
        p.ParseFile(f)

    but expat still doesn't recognize my entity. Why is there no way to tell expat which DTD to use? I've tried:

        - putting the file into the same directory as the XML
        - putting the file into the program's working directory
        - replacing the reference in the XML file with an absolute path

    What am I missing? Thx.
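
    One workaround that sidesteps the DTD file entirely is to predefine the entity names on ElementTree's XMLParser through its `entity` mapping. The sketch below fills that mapping from the standard HTML entity table, on the assumption that dblp.dtd defines the same Latin-1 names. It relies on the document carrying a DOCTYPE with an external identifier, as the sample does, and it does not read dblp.dtd at all.

```python
import html.entities
import xml.etree.ElementTree as ET


def parse_with_html_entities(data):
    """Parse XML bytes, resolving HTML-style named entities such as &eacute;."""
    parser = ET.XMLParser()
    for name, codepoint in html.entities.name2codepoint.items():
        parser.entity[name] = chr(codepoint)
    return ET.fromstring(data, parser=parser)


sample = (
    b'<?xml version="1.0" encoding="ISO-8859-1"?>\n'
    b'<!DOCTYPE dblp SYSTEM "dblp.dtd">\n'
    b'<dblp><incollection><author>Jos&eacute; A. Blakeley</author>'
    b'</incollection></dblp>'
)
root = parse_with_html_entities(sample)
author = root.find("incollection/author").text
```

    With raw expat, the usual alternative would be an ExternalEntityRefHandler that opens the local dblp.dtd in a sub-parser; that route is not shown here.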

  • Language parsing to find important words

    - by Matt Huggins
    I'm looking for some input and theory on how to approach a lexical topic. Let's say I have a collection of strings, each of which may be one sentence or several. I'd like to parse these strings and rip out the most important words, perhaps with a score that denotes how likely each word is to be important. Let's look at a few examples of what I mean.

    Example #1: "I really want a Keurig, but I can't afford one!"

    This is a very basic example, just one sentence. As a human, I can easily see that "Keurig" is the most important word here. Also, "afford" is relatively important, though it's clearly not the primary point of the sentence. The word "I" appears twice, but it is not important at all since it doesn't really give us any information. I might expect to see a hash of word/scores something like this:

        "Keurig" => 0.9
        "afford" => 0.4
        "want" => 0.2
        "really" => 0.1
        etc...

    Example #2: "Just had one of the best swimming practices of my life. Hopefully I can maintain my times come the competition. If only I had remembered to take off my non-waterproof watch."

    This example has multiple sentences, so there will be more important words throughout. Without repeating the point exercise from example #1, I would probably expect to see two or three really important words come out of this: "swimming" (or "swimming practice"), "competition", and "watch" (or "waterproof watch" or "non-waterproof watch", depending on how the hyphen is handled).

    Given a couple of examples like this, how would you go about doing something similar? Are there any existing (open source) libraries or algorithms in programming that already do this?
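
    One classic baseline for this kind of scoring is TF-IDF: a word scores highly when it is frequent in one string but rare across the collection, so words like "I" that appear everywhere score zero. The sketch below is a bare-bones version with a naive tokenizer. The second and third strings are made-up examples added so that document frequencies differ, and the scores it produces will not match the illustrative numbers in the question.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def tfidf(docs):
    """Return one {word: score} dict per document."""
    tokenized = [tokenize(doc) for doc in docs]
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))   # count each word once per document
    total_docs = len(docs)
    scores = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        scores.append({
            word: (count / len(tokens)) * math.log(total_docs / doc_freq[word])
            for word, count in term_freq.items()
        })
    return scores


docs = [
    "I really want a Keurig, but I can't afford one!",
    "I want a new watch",      # hypothetical extra string
    "I can't swim",            # hypothetical extra string
]
scores = tfidf(docs)
```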

  • Parsing a website's source

    - by Davlog
    I want to create an application and maybe upload it to the Play Store, but I am not sure whether what my app does is legal. I am downloading a page's source from a website to get some information I need. For example, I download a page about the song "Ramble On" by Led Zeppelin and parse the page source to get the song's name, maybe a link to an image, and the lyrics. Would that be illegal, or can I display this information to my users without any problem? Also, the website says it's an "open 'wiki-style' [...].It's completely user built by people like you and used every day by fans and developers alike."

  • Parsing mathematical expressions with two values that have parentheses and minus signs

    - by user45921
    I'm trying to parse equations like these from a text file. Each has only two values, or the square root of a single value:

        100+100
        -100-100
        -(100)+(-100)
        sqrt(100)

    I need to handle the minus signs, the parentheses, the operator symbol in the middle and the square root, and I have no idea how to start. I've got the file part done and the simple calculation parts, except that I couldn't get my program to solve the equations above.

        #include <stdio.h>
        #include <string.h>
        #include <stdlib.h>
        #include <math.h>

        main(){
            FILE *fp;
            char buff[255], sym,sym2,del1,del2,del3,del4;
            double num1, num2;
            int ret;
            fp = fopen("input.txt","r");
            while(fgets(buff,sizeof(buff),fp)!=NULL){
                char *tok = buff;
                sscanf(tok,"%lf%c%lf",&num1,&sym,&num2);
                switch(sym){
                    case '+': printf("%lf\n", num1+num2); break;
                    case '-': printf("%lf\n", num1-num2); break;
                    case '*': printf("%lf\n", num1*num2); break;
                    case '/': printf("%lf\n", num1/num2); break;
                    default: printf("The input value is not correct\n"); break;
                }
            }
            fclose(fp);
        }

    That is what I have written for the other basic operations, without parentheses and without the minus sign on the second value, and it works great for the simple ones. I'm using a switch statement to calculate the add, sub, mul and divide, but I'm not sure how to use the sscanf function properly (if I am not already), or whether there is another way, using a function like strtok, to parse the parentheses and the minus signs properly. Any kind help?
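
    Since the grammar is tiny (one optional sign, optional parentheses around each number, one operator, or a single sqrt call), one way to handle it is a pair of regular expressions rather than a scanf format. The sketch below shows that idea in Python, not C; the pattern and helper names are assumptions, and it does not check that parentheses are balanced.

```python
import math
import re

# An operand: optional outer minus, optional "(", a signed number, optional ")".
_OPERAND = r"(-?)\(?\s*(-?\d+(?:\.\d+)?)\s*\)?"
_BINARY = re.compile(rf"^\s*{_OPERAND}\s*([-+*/])\s*{_OPERAND}\s*$")
_SQRT = re.compile(rf"^\s*sqrt\(\s*{_OPERAND}\s*\)\s*$")


def _value(outer_sign, number):
    """Apply the outer minus sign, if present, to the parsed number."""
    value = float(number)
    return -value if outer_sign else value


def evaluate(line):
    """Evaluate one line such as '-(100)+(-100)' or 'sqrt(100)'."""
    match = _SQRT.match(line)
    if match:
        return math.sqrt(_value(match.group(1), match.group(2)))
    match = _BINARY.match(line)
    if not match:
        raise ValueError(f"The input value is not correct: {line!r}")
    left = _value(match.group(1), match.group(2))
    right = _value(match.group(4), match.group(5))
    operator = match.group(3)
    if operator == "+":
        return left + right
    if operator == "-":
        return left - right
    if operator == "*":
        return left * right
    return left / right
```

    In C, the equivalent would probably be a small hand-written scanner using strtod, which reports where each number ends.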

  • Qt C++ XML, validating against a DTD?

    - by Airjoe
    Is there a way to validate an XML file against a DTD with Qt's XML handling? I've tried googling around but can't seem to get a straight answer. If Qt doesn't include support for validating an XML file, what might be the process of implementing validation myself? Any good reference to start with in regards to validating XML against a spec? Thanks for the help!

  • DTD definition error

    - by Geln Yang
    Hi, I get an error when I define a DTD as follows:

        <!ELEMENT line (property*)>
        <!ATTLIST line showType (1|?|+|*) "1" >

    The error is: The name token is required in the enumerated type list for the "showType" attribute declaration.

    It seems the values can't be special characters such as "?", "+" or "*". Changing the characters to Latin-1 character references, like "& #42;" (a blank is added before '#' here), gives the same error. How can I resolve this problem? Thanks!

  • Writing a DTD: How to achieve this children setup

    - by Boldewyn
    The element tasklist may contain at most one title and at most one description, plus any number (including 0) of task elements, in any order. The naive approach is not applicable, since the order should not matter:

        <!ELEMENT tasklist (title?, description?, task*) >

    Alternatively, I could explicitly name all possible options:

        (title, description?, task*) |
        (title, task+, description?, task*) |
        (task+, title, task*, description?, task*) |
        (description, title?, task*) |
        (description, task+, title?, task*) |
        (task+, description, task*, title?, task*) |
        (task*)

    but then it's quite easy to write a non-deterministic rule, and furthermore it looks like the direct path to darkest madness. Any ideas how this could be done more elegantly? And no, an XSD or RelaxNG is not an option. I need a plain, old DTD.

  • Extending XHTML

    - by Daniel Schaffer
    I'm playing around with writing a jQuery plugin that uses an attribute to define form validation behavior (yes, I'm aware there's already a validation plugin; this is as much a learning exercise as something I'll be using). Ideally, I'd like to have something like this:

    Example 1 - input:

        <input id="name" type="text" v:onvalidate="return this.value.length > 0;" />

    Example 2 - wrapper:

        <div v:onvalidate="return $(this).find('[value]').length > 0;">
            <input id="field1" type="text" />
            <input id="field2" type="text" />
            <input id="field3" type="text" />
        </div>

    Example 3 - predefined:

        <input id="name" type="text" v:validation="not empty" />

    The goal here is to allow my jQuery code to figure out which elements need to be validated (this is already done) and still have the markup be valid XHTML, which is what I'm having a problem with. I'm fairly sure this will require a combination of both DTD and XML Schema, but I'm not really sure how exactly to execute it. Based on this article, I've created the following DTD:

        <!ENTITY % XHTML1-formvalidation1 PUBLIC "-//W3C//DTD XHTML 1.1 +FormValidation 1.0//EN" "http://new.dandoes.net/DTD/FormValidation1.dtd" >
        %XHTML1-formvalidation1;
        <!ENTITY % Inlspecial.extra "%div.qname; " >
        <!ENTITY % xhmtl-model.mod SYSTEM "formvalidation-model-1.mod" >
        <!ENTITY % xhtml11.dtd PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd" >
        %xhtml11.dtd;

    And here is "formvalidation-model-1":

        <!ATTLIST %div.qname; %onvalidation CDATA #IMPLIED %XHTML1-formvalidation1.xmlns.extra.attrib; >

    I've never done DTD before, so I'm not really sure what I'm doing. When I run my page through the W3 XHTML validator, I get 80+ errors because it's getting duplicate definitions of all the XHTML elements. Am I at least on the right track? Any suggestions?

    EDIT: I removed this section from my custom DTD, because it turned out that it was actually self-referencing, and the code I got the template from was really for combining two DTDs into one, not appending specific items to one:

        <!ENTITY % XHTML1-formvalidation1 PUBLIC "-//W3C//DTD XHTML 1.1 +FormValidation 1.0//EN" "http://new.dandoes.net/DTD/FormValidation1.dtd" >
        %XHTML1-formvalidation1;

    I also removed this, because it wasn't validating and didn't seem to be doing anything:

        <!ENTITY % Inlspecial.extra "%div.qname; " >

    Additionally, I decided that since I'm only adding a handful of additional items, the separate files model recommended by W3 doesn't really seem that helpful, so I've put everything into the DTD file, the content of which is now this:

        <!ATTLIST div onvalidate CDATA #IMPLIED>
        <!ENTITY % xhtml11.dtd PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd" >
        %xhtml11.dtd;

    So now I'm not getting any DTD-related validation errors, but the onvalidate attribute still is not valid.

    Update: I've ditched the DTD and added a schema: http://schema.dandoes.net/FormValidation/1.0.xsd

    Using v:onvalidate appears to validate in Visual Studio, but the W3C service still doesn't like it. Here's a page where I'm using it so you can look at the source: http://new.dandoes.net/auth

    And here's the link to the W3C validation result: http://validator.w3.org/check?uri=http://new.dandoes.net/auth&charset=(detect+automatically)&doctype=Inline&group=0

    Is this about as close as I'll be able to get with this, or am I still doing something wrong?
