Search Results

Search found 21350 results on 854 pages for 'url parsing'.

Page 19/854 | < Previous Page | 15 16 17 18 19 20 21 22 23 24 25 26  | Next Page >

  • Parsing a website

    - by Phenom
    I want to make a program that takes as user input a website address. The program then goes to that website, downloads it, and then parses the information inside. It outputs a new html file using the information from the website. Specifically, what this program will do is take certain links from the website, and put the links in the output html file, and it will discard everything else. Right now I just want to make it for websites that don't require a login, but later on I want to make it work for sites where you have to login, so it will have to be able to deal with cookies. I'll also want to later on have the program be able to explore certain links and download information from those other sites. What are the best programming languages or tools to do this?

    Read the article

  • Parsing a String into date with pattern:"dd/MM/yyyy"

    - by kawtousse
    Hi, I want to insert a date having this format MM/dd/YYYY for example:04/29/2010 to 29/04/2010 to be inserted into mysql database in a field typed Date. So i have this code: String dateimput=request.getParameter("datepicker"); DateFormat df = new SimpleDateFormat("dd/MM/yyyy"); Date dt = null; try { dt = df.parse(dateimput); System.out.println("date imput is:" +dt); } catch (ParseException e) { e.printStackTrace(); } but it gives me those error: 1-date imput is:Fri May 04 00:00:00 CEST 2012 (it is not the correct value that have been entered). 2-dismatching with mysql date type. I can not detect the error exactly. Please help.

    Read the article

  • Haskell - Parsec Parsing <p> element

    - by Martin
    I'm using Text.ParserCombinators.Parsec and Text.XHtml to parse an input like this: This is the first paragraph example\n with two lines\n \n And this is the second paragraph\n And my output should be: <p>This is the first paragraph example\n with two lines\n</p> <p>And this is the second paragraph\n</p> I defined: line= do{ ;t<-manyTill (anyChar) newline ;return t } paragraph = do{ t<-many1 (line) ;return ( p << t ) } But it returns: <p>This is the first paragraph example\n with two lines\n\n And this is the second paragraph\n</p> What is wrong? Any ideas? Thanks!

    Read the article

  • Parsing Chunk of Data into Hash of Array With Perl

    - by neversaint
    I have data that looks like this: #info #info2 1:SRX004541 Submitter: UT-MGS, UT-MGS Study: Glossina morsitans transcript sequencing project(SRP000741) Sample: Glossina morsitans(SRS002835) Instrument: Illumina Genome Analyzer Total: 1 run, 8.3M spots, 299.9M bases Run #1: SRR016086, 8330172 spots, 299886192 bases 2:SRX004540 Submitter: UT-MGS Study: Anopheles stephensi transcript sequencing project(SRP000747) Sample: Anopheles stephensi(SRS002864) Instrument: Solexa 1G Genome Analyzer Total: 1 run, 8.4M spots, 401M bases Run #1: SRR017875, 8354743 spots, 401027664 bases 3:SRX002521 Submitter: UT-MGS Study: Massive transcriptional start site mapping of human cells under hypoxic conditions.(SRP000403) Sample: Human DLD-1 tissue culture cell line(SRS001843) Instrument: Solexa 1G Genome Analyzer Total: 6 runs, 27.1M spots, 977M bases Run #1: SRR013356, 4801519 spots, 172854684 bases Run #2: SRR013357, 3603355 spots, 129720780 bases Run #3: SRR013358, 3459692 spots, 124548912 bases Run #4: SRR013360, 5219342 spots, 187896312 bases Run #5: SRR013361, 5140152 spots, 185045472 bases Run #6: SRR013370, 4916054 spots, 176977944 bases What I want to do is to create a hash of array with first line of each chunk as keys and SR## part of lines with "^Run" as its array member: $VAR = { 'SRX004541' => ['SRR016086'], # etc } But why my construct doesn't work. And it must be a better way to do it. use Data::Dumper; my %bighash; my $head = ""; my @temp = (); while ( <> ) { chomp; next if (/^\#/); if ( /^\d{1,2}:(\w+)/ ) { print "$1\n"; $head = $1; } elsif (/^Run \#\d+: (\w+),.*/){ print "\t$1\n"; push @temp, $1; } elsif (/^$/) { push @{$bighash{$head}}, [@temp]; @temp =(); } } print Dumper \%bighash ;

    Read the article

  • Parsing Complex Text File with C#

    - by David
    Hello, I need to parse a text file that has a lot of levels and characters. I've been trying different ways to parse it but I haven't been able to get anything to work. I've included a sample of the text file I'm dealing with. Any suggestions on how I can parse this file? I have denoted the parts of the file I need with TEXTINEED. (bean name: 'TEXTINEED context: (list '/text '/content/home/left-nav/text '/content/home/landing-page) type: '/text/types/text module: '/modules/TEXTINEED source: '|moretext| ((contents (list (list (bean type: '/directory/TEXTINEED ((directives (bean ((chartSize (list 600 400)) (showCorners (list #f)) (showColHeader (list #f)) (showRowHeader (list #f))))))) (bean type: '/directory/TEXTINEED ((directives (bean ((displayName (list "MTD")) (showCorners (list #f)) (showColHeader (list #f)) (showRowLabels (list #f)) (hideDetailedLink (list #t)) (showRowHeader (list #f)) (chartSize (list 600 400))))))) (bean type: '/directory/TEXTINEED ((directives (bean ((displayName (list "QTD")) (showCorners (list #f)) (showColHeader (list #f)) (showRowLabels (list #f)) (hideDetailedLink (list #t)) (showRowHeader (list #f)) (chartSize (list 600 400)))))))) Thanks!

    Read the article

  • IE 8 html parsing error message.

    - by user48408
    I'm experiencing the problem outlined in this kb article. http://support.microsoft.com/kb/927917 . Sorry I can't hyperlink cos i don't have enough points! "This problem occurs because a child container HTML element contains script that tries to modify the parent container element of the child container. The script tries to modify the parent container element by using either the innerHTML method or the appendChild method." The problem I'm having diagnosing the source of my problem is 2 fold: 1) This is only happening on some client machines (All are running IE8) and not others. How/Why only some? 2) I don't have any scripts which modify the innerHTML or call appendChild on any dom elements. I do have server side code which modify properties on asp .net server controls. (Essentially all thats happening is a panel control with some more controls is being made visbile or invisible on a button click), would these in turn then set the innerHTML property of the client rendered control(?)

    Read the article

  • Parsing XML data with Namespaces in PHP

    - by osbmedia
    I'm trying to work with this XML feed that uses namespaces and i'm not able to get past the colon in the tags. Here's how the XML feed looks like: <r25:events pubdate="2010-05-19T13:58:08-04:00"> <r25:event xl:href="event.xml?event_id=328" id="BRJDMzI4" crc="00000022" status="est"> <r25:event_id>328</r25:event_id> <r25:event_name>Testing 09/2005-08/2006</r25:event_name> <r25:alien_uid/> <r25:event_priority>0</r25:event_priority> <r25:event_type_id xl:href="evtype.xml?type_id=105">105</r25:event_type_id> <r25:event_type_name>CABINET</r25:event_type_name> <r25:node_type>C</r25:node_type> <r25:node_type_name>cabinet</r25:node_type_name> <r25:state>1</r25:state> <r25:state_name>Tentative</r25:state_name> <r25:event_locator>2005-AAAAMQ</r25:event_locator> <r25:event_title/> <r25:favorite>F</r25:favorite> <r25:organization_id/> <r25:organization_name/> <r25:parent_id/> <r25:cabinet_id xl:href="event.xml?event_id=328">328</r25:cabinet_id> <r25:cabinet_name>cabinet 09/2005-08/2006</r25:cabinet_name> <r25:start_date>2005-09-01</r25:start_date> <r25:end_date>2006-08-31</r25:end_date> <r25:registration_url/> <r25:last_mod_dt>2008-02-27T14:22:43-05:00</r25:last_mod_dt> <r25:last_mod_user>abc00296004</r25:last_mod_user> </r25:event> </r25:events> And here is what I'm using for code - I'll trying to throw these into a bunch of arrays where I can format the output however I want: <?php $ch = curl_init(); curl_setopt($ch, CURLOPT_URL, "http://somedomain.com/blah.xml"); curl_setopt ($ch, CURLOPT_HTTPHEADER, Array("Content-Type: text/xml")); curl_setopt($ch, CURLOPT_USERPWD, "username:password"); curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); $output = curl_exec($ch); curl_close($ch); $xml = new SimpleXmlElement($output); foreach ($xml->events->event as $entry){ $dc = $entry->children('http://www.collegenet.com/r25'); echo $entry->event_name . "<br />"; echo $entry->event_id . "<br /><br />"; }

    Read the article

  • string parsing help

    - by sprugman
    I've got a string like this: #################### Section One #################### Data A Data B #################### Section Two #################### Data C Data D etc. I want to parse it into something like: $arr( 'Section One' => array('Data A', 'Data B'), 'Section Two' => array('Data C', 'Data D') ) At first I tried this: $sections = preg_split("/(\r?\n)(\r?\n)#/", $file_content); The problem is, the file isn't perfectly clean: sometimes there are different numbers of blank lines between the sections, or blank spaces between data rows. The section head pattern itself seems to be relatively consistent: #################### Section Title #################### The number of #'s is probably consistent, but I don't want to count on it. The white space on the title line is pretty random. Once I have it split into sections, I think it'll be pretty straightforward, but any help writing a killer reg ex to get it there would be appreciated. (Or if there's a better approach than reg ex...)

    Read the article

  • Parsing multiple files at a time in Perl

    - by sfactor
    I have a large data set (around 90GB) to work with. There are data files (tab delimited) for each hour of each day and I need to perform operations in the entire data set. For example, get the share of OSes which are given in one of the columns. I tried merging all the files into one huge file and performing the simple count operation but it was simply too huge for the server memory. So, I guess I need to perform the operation each file at a time and then add up in the end. I am new to perl and am especially naive about the performance issues. How do I do such operations in a case like this. As an example two columns of the file are. ID OS 1 Windows 2 Linux 3 Windows 4 Windows Lets do something simple, counting the share of the OSes in the data set. So, each .txt file has millions of these lines and there are many such files. What would be the most efficient way to operate on the entire files.

    Read the article

  • XML Parsing in Groovy strips attribute new lines

    - by Bill James
    I'm writing code where I retrieve XML from a web api, then parse that XML using Groovy. Unfortunately, it seems that both XmlParser and XmlSlurper for Groovy strip newline characters from the attributes of nodes when .text() is called. How can I get at the text of the attribute including the newlines? Sample code: def xmltest = ''' <snippet> <preSnippet att1="testatt1" code="This is line 1 This is line 2 This is line 3" > <lines count="10" /> </preSnippet> </snippet>''' def parsed = new XmlParser().parseText( xmltest ) println "Parsed" parsed.preSnippet.each { pre -> println pre.attribute('code'); } def slurped = new XmlSlurper().parseText( xmltest ) println "Slurped" slurped.children().each { preSnip -> println [email protected]() } the output of which is: Parsed This is line 1 This is line 2 This is line 3 Slurped This is line 1 This is line 2 This is line 3

    Read the article

  • protocol parsing in c

    - by nomad.alien
    I have been playing around with trying to implement some protocol decoders, but each time I run into a "simple" problem and I feel the way I am solving the problem is not optimal and there must be a better way to do things. I'm using C. Currently I'm using some canned data and reading it in as a file, but later on it would be via TCP or UDP. Here's the problem. I'm currently playing with a binary protocol at work. All fields are 8 bits long. The first field(8bits) is the packet type. So I read in the first 8 bits and using a switch/case I call a function to read in the rest of the packet as I then know the size/structure of it. BUT...some of these packets have nested packets inside them, so when I encounter that specific packet I then have to read another 8-16 bytes have another switch/case to see what the next packet type is and on and on. (Luckily the packets are only nested 2 or 3 deep). Only once I have the whole packet decoded can I handle it over to my state machine for processing. I guess this can be a more general question as well. How much data do you have to read at a time from the socket? As much as possible? As much as what is "similar" in the protocol headers? So even though this protocol is fairly basic, my code is a whole bunch of switch/case statements and I do a lot of reading from the file/socket which I feel is not optimal. My main aim is to make this decoder as fast as possible. To the more experienced people out there, is this the way to go or is there a better way which I just haven't figured out yet? Any elegant solution to this problem?

    Read the article

  • String Parsing in C#

    - by Betamoo
    What is the most efficient way to parse a C# string in the form of "(params (abc 1.3)(sdc 2.0)....)" into a struct in the form struct Params { double abc,sdc....; } Thanks EDIT The structure always have the same parameters (number and names).. but the order is not granted..

    Read the article

  • Parsing a simple file

    - by Mike Graham
    I have a file consisting of lines of the form Foo="Some information" Bar="More" Starting with such a string, what is the best way to extract "Some information" and "More" as strings? Foo and Bar are always exactly those names.

    Read the article

  • nginx: rewrite URL but have original URL stored in access.log as 200

    - by mhambra
    I'm setting up a link tracking system, which (temporarily) involves adding /link/id/ in front of URL (like http://server/data/id/publication/id/). rewrite data/id/(.*) http://server/$1; The request is logged as: ip - - [17/Nov/2011:10:07:19 +0300] "GET /data/id/publication/id.html HTTP/1.1" 302 154 "-" "UA"` For some reason (keeping the compatibility with AWStats) it is wanted to have 200 logged instead of 302. (nginx allows to get 301 code out of box with permanent option, but thats inappropriate too) What are my options here? Will the combination of location { } and rewrite do the job?

    Read the article

  • string parsing and substring in c

    - by Josh
    I'm trying to parse the string below in a good way so I can get the sub-string stringI-wantToGet: const char *str = "Hello \"FOO stringI-wantToGet BAR some other extra text"; str will vary in length but always same pattern - FOO and BAR What I had in mind was something like: const char *str = "Hello \"FOO stringI-wantToGet BAR some other extra text"; char *probe, *pointer; probe = str; while(probe != '\n'){ if(probe = strstr("\"FOO")!=NULL) probe++ else probe = ""; // Nulterm part if(pointer = strchr(probe, ' ')!=NULL) pointer = '\0'; // not sure here, I was planning to separate it with \0's } Any help will be appreciate it.

    Read the article

  • Parsing a text file with a fixed format in Java

    - by EugeneP
    Suppose I know a text file format, say, each line contains 4 fields like this: firstword secondword thirdword fourthword firstword2 secondword2 thirdword2 fourthword2 ... and I need to read it fully into memory I can use this approach: open a text file while not EOF read line by line split each line by a space create a new object with four fields extracted from each line add this object to a Set Ok, but is there anything better, a special 3-rd party Java library? So that we could define the structure of each text line beforehand and parse the file with some function thirdpartylib.setInputTextFileFormat("format.xml"); thirdpartylib.parse(Set, "pathToFile") ?

    Read the article

  • Dealing with infinite loops when constructing states for LR(1) parsing

    - by Bruce
    I'm currently constructing LR(1) states from the following grammar. S->AS S->c A->aA A->b where A,S are nonterminals and a,b,c are terminals. This is the construction of I0 I0: S' -> .S, epsilon --------------- S -> .AS, epsilon S -> .c, epsilon --------------- S -> .AS, a S -> .c, c A -> .aA, a A -> .b, b And I1. From S, I1: S' -> S., epsilon //DONE And so on. But when I get to constructing I4... From a, I4: A -> a.A, a ----------- A -> .aA, a A -> .b, b The problem is A - .aA When I attempt to construct the next state from a, I'm going to once again get the exact same content of I4, and this continues infinitely. A similar loop occurs with S -> .AS So, what am I doing wrong? There has to be some detail that I'm missing, but I've browsed my notes and my book and either can't find or just don't understand what's wrong here. Any help?

    Read the article

  • Parsing HTML with XPath and PHP

    - by Peter
    Is there a way (using XPath and PHP) to do the following (WITHOUT external XSLT files)? Remove all tables and their contents Remove everything after the first h1 tag Keep only paragraphs (INCLUDING their inner HTML (links, lists, etc)) I received an XSLT answer here, but I'm looking for XPATH queries that don't require external files. Currently, I've got the HTML in question loaded into a SimpleXmlElement via: $doc = @DOMDocument::loadHTML($xml); $data = simplexml_import_dom($doc); Now I need help with: $data = $data->xpath('??????'); Been working with this one for several days to no avail. I really appreciate the help. Edit: I don't particularly care what's inside the paragraphs, as I can use strip_tags to eliminate what I don't want. All I need to do is to isolate the paragraphs from the rest of the source. I suppose a more specific, accurate requirement would be this: Return only paragraphs (and their html contents) that aren't contained in tables, and only before the first h1 tag

    Read the article

  • Parsing numbers at PreviewTextInput

    - by Nitin Chaudhari
    I have a WPF application in which I have a hook at PreviewTextInput, through that I get the currently entered character and I have the string already entered. Given this I need to write the following function : bool ShouldAccept(char newChar,string existingText) existingText can be comma seperated valid numbers(including exponential) and it should just return false when invalid characters are pressed. My code(if else based) currently has a lot of flaws, I wanted to know if there is any smart way to do it.

    Read the article

  • Parsing HTML "Visually"

    - by Midhat
    OKay I am at loss how to name this question. I have some HTML files, probably written by lord Lucifier himself, that I need to parse. It consists of many segments like this, among other html tags <p>HeadingNumber</p> <p style="text-indent:number;margin-top:neg_num ">Heading Text</p> <p>Body</p> Notice that the heading number and text are in seperate p tags, aligned in a horizontal line by css. the css may be whatever Lucifier fancies, a mixture of indents, paddings, margins and positions. However that line is a single object in my business model and should be kept as such. So How do I detect whether two p elements are visually in a single line and process them accordingly. I believe the HTML files are well formed if it helps.

    Read the article

  • Android - Parsing XML with XPath

    - by Ruben Deig Ramos
    First of all, thanks to all the people who's going to spend a little time on this question. Second, sorry for my english (not my first language! :D). Well, here is my problem. I'm learning Android and I'm making an app which uses a XML file to store some info. I have no problem creating the file, but trying to read de XML tags with XPath (DOM, XMLPullParser, etc. only gave me problems) I've been able to read, at least, the first one. Let's see the code. Here is the XML file the app generates: <dispositivo> <id>111</id> <nombre>Name</nombre> <intervalo>300</intervalo> </dispositivo> And here is the function which reads the XML file: private void leerXML() { try { XPathFactory factory=XPathFactory.newInstance(); XPath xPath=factory.newXPath(); // Introducimos XML en memoria File xmlDocument = new File("/data/data/com.example.gps/files/devloc_cfg.xml"); InputSource inputSource = new InputSource(new FileInputStream(xmlDocument)); // Definimos expresiones para encontrar valor. XPathExpression tag_id = xPath.compile("/dispositivo/id"); String valor_id = tag_id.evaluate(inputSource); id=valor_id; XPathExpression tag_nombre = xPath.compile("/dispositivo/nombre"); String valor_nombre = tag_nombre.evaluate(inputSource); nombre=valor_nombre; } catch (Exception e) { e.printStackTrace(); } } The app gets correctly the id value and shows it on the screen ("id" and "nombre" variables are assigned to a TextView each one), but the "nombre" is not working. What should I change? :) Thanks for all your time and help. This site is quite helpful! PD: I've been searching for a response on the whole site but didn't found any.

    Read the article

< Previous Page | 15 16 17 18 19 20 21 22 23 24 25 26  | Next Page >