Search Results

Search found 37381 results on 1496 pages for 'string parsing'.

Page 17/1496 | < Previous Page | 13 14 15 16 17 18 19 20 21 22 23 24 | Next Page >

php parsing speed optimization

- by Arnaud

I would like to add tooltip or generate link according to the element available in the database, for exemple if the html page printed is: to reboot your linux host in single-user mode you can ... I will use explode(" ", $row[page]) and the idea is now to lookup for every single word in the page to find out if they have a related referance in this exemple let's say i've got a table referance an one entry for reboot and one for linux reboot: restart a computeur linux: operating system now my output will look like (replaced < and by @) to @a href="ref/reboot"@reboot@/a@ your @a href="ref/linux"@linux@/a@ host in single-user mode you can ... Instead of have a static list generated when I saved the content, if I add more keyword in the future, then the text will become more interactive. My main concerne and question is how can I create a efficient enough process to do it ? Should I store all the db entry in an array and compare them ? Do an sql query for each word (seems to be crazy) Dump the table in a file and use a very long regex or a "grep -f pattern data" way of doing it? Or or or or I'm sure it must be a better way of doing it, just don't have a clue about it, or maybe this will be far too resource un-friendly and I should avoid doing such things. Cheers!

Read the article
Parsing XML data with Namespaces in PHP

- by osbmedia

I'm trying to work with this XML feed that uses namespaces and i'm not able to get past the colon in the tags. Here's how the XML feed looks like: <r25:events pubdate="2010-05-19T13:58:08-04:00"> <r25:event xl:href="event.xml?event_id=328" id="BRJDMzI4" crc="00000022" status="est"> <r25:event_id>328</r25:event_id> <r25:event_name>Testing 09/2005-08/2006</r25:event_name> <r25:alien_uid/> <r25:event_priority>0</r25:event_priority> <r25:event_type_id xl:href="evtype.xml?type_id=105">105</r25:event_type_id> <r25:event_type_name>CABINET</r25:event_type_name> <r25:node_type>C</r25:node_type> <r25:node_type_name>cabinet</r25:node_type_name> <r25:state>1</r25:state> <r25:state_name>Tentative</r25:state_name> <r25:event_locator>2005-AAAAMQ</r25:event_locator> <r25:event_title/> <r25:favorite>F</r25:favorite> <r25:organization_id/> <r25:organization_name/> <r25:parent_id/> <r25:cabinet_id xl:href="event.xml?event_id=328">328</r25:cabinet_id> <r25:cabinet_name>cabinet 09/2005-08/2006</r25:cabinet_name> <r25:start_date>2005-09-01</r25:start_date> <r25:end_date>2006-08-31</r25:end_date> <r25:registration_url/> <r25:last_mod_dt>2008-02-27T14:22:43-05:00</r25:last_mod_dt> <r25:last_mod_user>abc00296004</r25:last_mod_user> </r25:event> </r25:events> And here is what I'm using for code - I'll trying to throw these into a bunch of arrays where I can format the output however I want: <?php $ch = curl_init(); curl_setopt($ch, CURLOPT_URL, "http://somedomain.com/blah.xml"); curl_setopt ($ch, CURLOPT_HTTPHEADER, Array("Content-Type: text/xml")); curl_setopt($ch, CURLOPT_USERPWD, "username:password"); curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); $output = curl_exec($ch); curl_close($ch); $xml = new SimpleXmlElement($output); foreach ($xml->events->event as $entry){ $dc = $entry->children('http://www.collegenet.com/r25'); echo $entry->event_name . "<br />"; echo $entry->event_id . "<br /><br />"; }

Read the article
Parsing Chunk of Data into Hash of Array With Perl

- by neversaint

I have data that looks like this: #info #info2 1:SRX004541 Submitter: UT-MGS, UT-MGS Study: Glossina morsitans transcript sequencing project(SRP000741) Sample: Glossina morsitans(SRS002835) Instrument: Illumina Genome Analyzer Total: 1 run, 8.3M spots, 299.9M bases Run #1: SRR016086, 8330172 spots, 299886192 bases 2:SRX004540 Submitter: UT-MGS Study: Anopheles stephensi transcript sequencing project(SRP000747) Sample: Anopheles stephensi(SRS002864) Instrument: Solexa 1G Genome Analyzer Total: 1 run, 8.4M spots, 401M bases Run #1: SRR017875, 8354743 spots, 401027664 bases 3:SRX002521 Submitter: UT-MGS Study: Massive transcriptional start site mapping of human cells under hypoxic conditions.(SRP000403) Sample: Human DLD-1 tissue culture cell line(SRS001843) Instrument: Solexa 1G Genome Analyzer Total: 6 runs, 27.1M spots, 977M bases Run #1: SRR013356, 4801519 spots, 172854684 bases Run #2: SRR013357, 3603355 spots, 129720780 bases Run #3: SRR013358, 3459692 spots, 124548912 bases Run #4: SRR013360, 5219342 spots, 187896312 bases Run #5: SRR013361, 5140152 spots, 185045472 bases Run #6: SRR013370, 4916054 spots, 176977944 bases What I want to do is to create a hash of array with first line of each chunk as keys and SR## part of lines with "^Run" as its array member: $VAR = { 'SRX004541' => ['SRR016086'], # etc } But why my construct doesn't work. And it must be a better way to do it. use Data::Dumper; my %bighash; my $head = ""; my @temp = (); while ( <> ) { chomp; next if (/^\#/); if ( /^\d{1,2}:(\w+)/ ) { print "$1\n"; $head = $1; } elsif (/^Run \#\d+: (\w+),.*/){ print "\t$1\n"; push @temp, $1; } elsif (/^$/) { push @{$bighash{$head}}, [@temp]; @temp =(); } } print Dumper \%bighash ;

Read the article
Parsing XML in C# from stream

- by Phillip

I've tried several methods, from Linq to loading the data to an XML document, but i can't seem to be able to return the result that i need. here's the example XML: <serv:message xmlns:serv="http://www.webex.com/schemas/2002/06/service" xmlns:com="http://www.webex.com/schemas/2002/06/common" xmlns:event="http://www.webex.com/schemas/2002/06/service/event"><serv:header><serv:response><serv:result>SUCCESS</serv:result><serv:gsbStatus>PRIMARY</serv:gsbStatus></serv:response></serv:header><serv:body><serv:bodyContent xsi:type="event:createEventResponse" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"><event:sessionKey>11111111</event:sessionKey><event:guestToken>111111111111111111111</event:guestToken></serv:bodyContent></serv:body></serv:message> And, here's what i've tried to do: StreamReader reader = new StreamReader(dataStream); XmlDocument doc = new XmlDocument(); doc.LoadXml(reader.ReadToEnd()); XmlNamespaceManager ns = new XmlNamespaceManager(doc.NameTable); XmlNamespaceManager ns2 = new XmlNamespaceManager(doc.NameTable); XmlNamespaceManager ns3 = new XmlNamespaceManager(doc.NameTable); ns.AddNamespace("serv", "http://www.webex.com/schemas/2002/06/service"); ns2.AddNamespace("com", "http://www.webex.com/schemas/2002/06/common"); ns3.AddNamespace("event", "http://www.webex.com/schemas/2002/06/service/event"); XmlNode node = doc.SelectSingleNode("result",ns); Yet, for some reason i cannot ever seem to return the actual result, which should be either 'SUCCESS' or 'FAILURE' based on the actual xml above. How can i do this?

Read the article
IE 8 html parsing error message.

- by user48408

I'm experiencing the problem outlined in this kb article. http://support.microsoft.com/kb/927917 . Sorry I can't hyperlink cos i don't have enough points! "This problem occurs because a child container HTML element contains script that tries to modify the parent container element of the child container. The script tries to modify the parent container element by using either the innerHTML method or the appendChild method." The problem I'm having diagnosing the source of my problem is 2 fold: 1) This is only happening on some client machines (All are running IE8) and not others. How/Why only some? 2) I don't have any scripts which modify the innerHTML or call appendChild on any dom elements. I do have server side code which modify properties on asp .net server controls. (Essentially all thats happening is a panel control with some more controls is being made visbile or invisible on a button click), would these in turn then set the innerHTML property of the client rendered control(?)

Read the article
Parsing a website

- by Phenom

I want to make a program that takes as user input a website address. The program then goes to that website, downloads it, and then parses the information inside. It outputs a new html file using the information from the website. Specifically, what this program will do is take certain links from the website, and put the links in the output html file, and it will discard everything else. Right now I just want to make it for websites that don't require a login, but later on I want to make it work for sites where you have to login, so it will have to be able to deal with cookies. I'll also want to later on have the program be able to explore certain links and download information from those other sites. What are the best programming languages or tools to do this?

Read the article
Haskell - Parsec Parsing <p> element

- by Martin

I'm using Text.ParserCombinators.Parsec and Text.XHtml to parse an input like this: This is the first paragraph example\n with two lines\n \n And this is the second paragraph\n And my output should be: <p>This is the first paragraph example\n with two lines\n</p> <p>And this is the second paragraph\n</p> I defined: line= do{ ;t<-manyTill (anyChar) newline ;return t } paragraph = do{ t<-many1 (line) ;return ( p << t ) } But it returns: <p>This is the first paragraph example\n with two lines\n\n And this is the second paragraph\n</p> What is wrong? Any ideas? Thanks!

Read the article
XML Parsing in Groovy strips attribute new lines

- by Bill James

I'm writing code where I retrieve XML from a web api, then parse that XML using Groovy. Unfortunately, it seems that both XmlParser and XmlSlurper for Groovy strip newline characters from the attributes of nodes when .text() is called. How can I get at the text of the attribute including the newlines? Sample code: def xmltest = ''' <snippet> <preSnippet att1="testatt1" code="This is line 1 This is line 2 This is line 3" > <lines count="10" /> </preSnippet> </snippet>''' def parsed = new XmlParser().parseText( xmltest ) println "Parsed" parsed.preSnippet.each { pre -> println pre.attribute('code'); } def slurped = new XmlSlurper().parseText( xmltest ) println "Slurped" slurped.children().each { preSnip -> println [email protected]() } the output of which is: Parsed This is line 1 This is line 2 This is line 3 Slurped This is line 1 This is line 2 This is line 3

Read the article
protocol parsing in c

- by nomad.alien

I have been playing around with trying to implement some protocol decoders, but each time I run into a "simple" problem and I feel the way I am solving the problem is not optimal and there must be a better way to do things. I'm using C. Currently I'm using some canned data and reading it in as a file, but later on it would be via TCP or UDP. Here's the problem. I'm currently playing with a binary protocol at work. All fields are 8 bits long. The first field(8bits) is the packet type. So I read in the first 8 bits and using a switch/case I call a function to read in the rest of the packet as I then know the size/structure of it. BUT...some of these packets have nested packets inside them, so when I encounter that specific packet I then have to read another 8-16 bytes have another switch/case to see what the next packet type is and on and on. (Luckily the packets are only nested 2 or 3 deep). Only once I have the whole packet decoded can I handle it over to my state machine for processing. I guess this can be a more general question as well. How much data do you have to read at a time from the socket? As much as possible? As much as what is "similar" in the protocol headers? So even though this protocol is fairly basic, my code is a whole bunch of switch/case statements and I do a lot of reading from the file/socket which I feel is not optimal. My main aim is to make this decoder as fast as possible. To the more experienced people out there, is this the way to go or is there a better way which I just haven't figured out yet? Any elegant solution to this problem?

Read the article
Parsing multiple files at a time in Perl

- by sfactor

I have a large data set (around 90GB) to work with. There are data files (tab delimited) for each hour of each day and I need to perform operations in the entire data set. For example, get the share of OSes which are given in one of the columns. I tried merging all the files into one huge file and performing the simple count operation but it was simply too huge for the server memory. So, I guess I need to perform the operation each file at a time and then add up in the end. I am new to perl and am especially naive about the performance issues. How do I do such operations in a case like this. As an example two columns of the file are. ID OS 1 Windows 2 Linux 3 Windows 4 Windows Lets do something simple, counting the share of the OSes in the data set. So, each .txt file has millions of these lines and there are many such files. What would be the most efficient way to operate on the entire files.

Read the article
Parsing the Youtube API with DOM

- by Kirk

I'm using the Youtube API and I can retrieve the date information without a problem, but don't know how to retrieve the description information. My Code: <?php $v = "dQw4w9WgXcQ"; $url = "http://gdata.youtube.com/feeds/api/videos/". $v; $doc = new DOMDocument; $doc->load($url); $pub = $doc->getElementsByTagName("published")->item(0)->nodeValue; $desc = $doc->getElementsByTagName("media:description")->item(0)->nodeValue; echo "<b>Video Uploaded:</b> "; echo date( "F jS, Y", strtotime( $pub ) ); echo '<br>'; if (isset ($desc)) { echo "<b>Description:</b> "; echo $desc; echo '<br>'; } ?> Here's a link to the feed: http://gdata.youtube.com/feeds/api/videos/dQw4w9WgXcQ?prettyprint=true And the excerpt of code I don't know how to retrieve data from: <media:group> <media:description type='plain'>Music video by Rick Astley performing Never Gonna Give You Up. (C) 1987 PWL</media:description> </media:group> Thanks in advance.

Read the article
Parsing HTML: Call to a member function > children() on a non-object

- by sm56d

Hello all, I was just helped with this question but I can't get it to move to the next block of HTML. $html = file_get_html('http://music.banadir24.com/singer/aasha_abdoo/247.html'); $urls = $html->find('table[width=100%] table tr'); foreach($urls as $url){ $song_name = $url->children(2)->plaintext; $url = $url->children(6)->children(0)->href; } It returns the list of the names of the first album (Deesco) but it does not continue to the next album (The Best Of Aasha)? It just gives me this error: Notice: Trying to get property of non-object in C:\wamp\www\test3.php on line 26 Fatal error: Call to a member function children() on a non-object in C:\wamp\www\test3.php on line 28 Why is this and how can I get it to continue to the next table element? I appreciate any help on this! Please note: This is legal as the songs are not bound by copyright and they are available to download freely, its just I need to download a lot of them and I can't sit there clicking a button all day. Having said that, its taken me an hour to get this far.

Read the article
Parsing HTML with XPath and PHP

- by Peter

Is there a way (using XPath and PHP) to do the following (WITHOUT external XSLT files)? Remove all tables and their contents Remove everything after the first h1 tag Keep only paragraphs (INCLUDING their inner HTML (links, lists, etc)) I received an XSLT answer here, but I'm looking for XPATH queries that don't require external files. Currently, I've got the HTML in question loaded into a SimpleXmlElement via: $doc = @DOMDocument::loadHTML($xml); $data = simplexml_import_dom($doc); Now I need help with: $data = $data->xpath('??????'); Been working with this one for several days to no avail. I really appreciate the help. Edit: I don't particularly care what's inside the paragraphs, as I can use strip_tags to eliminate what I don't want. All I need to do is to isolate the paragraphs from the rest of the source. I suppose a more specific, accurate requirement would be this: Return only paragraphs (and their html contents) that aren't contained in tables, and only before the first h1 tag

Read the article
Parsing a file with column data in Python

- by rejinacm

I have a file that contains the symbol table details.Its in the form of rows and columns. I need to extract first and last column. How can I do that?

Read the article
Dealing with infinite loops when constructing states for LR(1) parsing

- by Bruce

I'm currently constructing LR(1) states from the following grammar. S->AS S->c A->aA A->b where A,S are nonterminals and a,b,c are terminals. This is the construction of I0 I0: S' -> .S, epsilon --------------- S -> .AS, epsilon S -> .c, epsilon --------------- S -> .AS, a S -> .c, c A -> .aA, a A -> .b, b And I1. From S, I1: S' -> S., epsilon //DONE And so on. But when I get to constructing I4... From a, I4: A -> a.A, a ----------- A -> .aA, a A -> .b, b The problem is A - .aA When I attempt to construct the next state from a, I'm going to once again get the exact same content of I4, and this continues infinitely. A similar loop occurs with S -> .AS So, what am I doing wrong? There has to be some detail that I'm missing, but I've browsed my notes and my book and either can't find or just don't understand what's wrong here. Any help?

Read the article
module to use when parsing .properties file in Python

- by Tshepang

configparser raises an exception if one parses a simple .properties (key/value) file wich lacks section headers; other than rolling out code myself, is there some module that allows me to do this?

Read the article
Parsing HTML "Visually"

- by Midhat

OKay I am at loss how to name this question. I have some HTML files, probably written by lord Lucifier himself, that I need to parse. It consists of many segments like this, among other html tags <p>HeadingNumber</p> <p style="text-indent:number;margin-top:neg_num ">Heading Text</p> <p>Body</p> Notice that the heading number and text are in seperate p tags, aligned in a horizontal line by css. the css may be whatever Lucifier fancies, a mixture of indents, paddings, margins and positions. However that line is a single object in my business model and should be kept as such. So How do I detect whether two p elements are visually in a single line and process them accordingly. I believe the HTML files are well formed if it helps.

Read the article
Scala: XML Attribute parsing

- by Chris

I'm trying to parse a rss feed that looks like this for the attribute "date": <rss version="2.0"> <channel> <item> <y:c date="AA"></y:c> </item> </channel> </rss> I tried several different versions of this: (rssFeed contains the RSS data) println(((rssFeed \\ "channel" \\ "item" \ "y:c" \"date").toString)) But nothing seems to work. What am I missing? Any help would really be appreciated!

Read the article
Regex for xml parsing

- by ogmios

What is your opinon about following regexes - is it correct? To find element with spcific and required attribute "<(" + elem_name + ")(\s+(?:[^<]?\s+)" + attr_name + "\s*=\s*(['\"])((?:(?!\3).))\3[^<])(.*?)" To find element with spcific but optional attribute "<(" + elem_name + ")(\s*|\s+(?:[^<]?\s+)(?:" + attr_name + "\s*=\s*(['\"])((?:(?!\3).))\3)?[^<])(.*?)" Pleas not another answer "use existing xml parser". Question is - are the regexes proper or not? This is specific situation - C language in embedded system and xml is not well-formed (cannot be fixed - does not depend on me). Xml have specified schema and no problem with namespaces etc. exists.

Read the article
Parsing timestamp with retarded Python

- by jellybean

I want to parse a timestamp from a log file that has been written via datetime.datetime.now().strftime('%Y%m%d%H%M%S') and then compute the number of seconds that have passed since this timestamp. I know I could do it with datetime.datetime.strptime to get back a datetime object and then compute a timedelta. Problem is, the strptime function has been introduced with Python 2.5 and I'm using Python2.4.4 (an upgrade is not possible in my context). Any easy way to do this?

Read the article
parsing command option with default values and range constrains in C

- by agramfort

Hi, I need to parse command line arguments in C. My arguments are basically int or float with default values and range constrains. I've started to implement something that look like this: option_float(float* out, int argc, char* argv, char* name, description, float default_val, int is_optional, float min_value, float max_value) which I call for example with: float* pct; option_float(pct, argc, argv, "pct", "My super percentage option", 50, 1, FALSE, 0, 100) however I don't want to reinvent the wheel ! My objective is to have error checking of range constrains, throw an error when the option is not optional and is not set. And generate the help message usually given by usage() function. The usage text would look like this: --pct My super percentage option (default : 50). Should be in [0, 100] I've started with getopt but it is too limited for what I want to do and I feel it still requires me to write too much code for a simple usecase like this. thanks

Read the article
parsing python to csv

- by user185955

I'm trying to download some game stats to do some analysis, only problem is each season the data their isn't 100% consistent. I grab the json file from the site, then wish to save it to a csv with the first line in the csv containing the heading for that column, so the heading would be essentially the key from the python data type. #!/usr/bin/env python import requests import json import csv base_url = 'http://www.afl.com.au/api/cfs/afl/' token_url = base_url + 'WMCTok' player_url = base_url + 'matchItems/round' def printPretty(data): print(json.dumps(data, sort_keys=True, indent=2, separators=(',', ': '))) session = requests.Session() # session makes it simple to use the token across the requests token = session.post(token_url).json()['token'] # get the token session.headers.update({'X-media-mis-token': token}) # set the token Season = 2014 Roundno = 4 if Roundno<10: strRoundno = '0'+str(Roundno) else: strRoundno = str(Roundno) # get some data (could easily be a for loop, might want to put in a delay using Sleep so that you don't get IP blocked) data = session.get(player_url + '/CD_R'+str(Season)+'014'+strRoundno) # print everything printPretty(data.json()) with open('stats_game_test.csv', 'w', newline='') as csvfile: spamwriter = csv.writer(csvfile, delimiter="'",quotechar='|', quoting=csv.QUOTE_ALL) for profile in data.json()['items']: spamwriter.writerow(['%s' %(profile)]) #for key in data.json().keys(): # print("key: %s , value: %s" % (key, data.json()[key])) The above code grabs the json and writes it to a csv, but it puts the key in each individual cell next to the value (eg 'venueId': 'CD_V190'), the key needs to be just across the first row as a heading. It gives me a csv file with data in the cells like this Column A B 'tempInCelsius': 17.0 'totalScore': 32 'tempInCelsius': 16.0 'totalScore': 28 What I want is the data like this tempInCelsius totalScore 17 32 16 28 As I mentioned up the top, the data isn't always consistent so if I define what fields to grab with spamwriter.writerow([profile['tempInCelsius'], profile['totalScore']]) then it will error out on certain data grabs. This is why I'm now trying the above method so it just grabs everything regardless of what data is there.

Read the article
Parsing timestamp with Python2.4

- by jellybean

I want to parse a timestamp from a log file that has been written via datetime.datetime.now().strftime('%Y%m%d%H%M%S') and then compute the number of seconds that have passed since this timestamp. I know I could do it with datetime.datetime.strptime to get back a datetime object and then compute a timedelta. Problem is, the strptime function has been introduced with Python 2.5 and I'm using Python2.4.4 (an upgrade is not possible in my context). Any easy way to do this?

Read the article
parsing position files in ruby

- by john

I have a sample position file like below. 789754654 COLORA SOMETHING1 19370119FYY076 2342423234SS323423 742784897 COLORB SOMETHING2 20060722FYY076 2342342342SDFSD3423 I am interested in positions 54-61 (4th column). I want to change the date to be a different format. So final outcome will be: 789754654 COLORA SOMETHING1 01191937FYY076 2342423234SS323423 742784897 COLORB SOMETHING2 07222006FYY076 2342342342SDFSD3423 The columns are seperated by spaces not tabs. And the final file should have exact number of spaces as the original file....only thing changing should be the date format. How can I do this? I wrote a script but it will lose the original spaces and positioning will be messed up. file.each_line do |line| dob = line.split(" ") puts dob[3] #got the date. change its format 5.times { puts "**" } end Can anyone suggest a better strategy so that positioning in the original file remains the same?

Read the article
Parsing a Multi-Index Excel File in Pandas

- by rhaskett

I have a time series excel file with a tri-level column MultiIndex that I would like to successfully parse if possible. There are some results on how to do this for an index on stack overflow but not the columns and the parse function has a header that does not seem to take a list of rows. The ExcelFile looks like is like the following: Column A is all the time series dates starting on A4 Column B has top_level1 (B1) mid_level1 (B2) low_level1 (B3) data (B4-B100+) Column C has null (C1) null (C2) low_level2 (C3) data (C4-C100+) Column D has null (D1) mid_level2 (D2) low_level1 (D3) data (D4-D100+) Column E has null (E1) null (E2) low_level2 (E3) data (E4-E100+) ... So there are two low_level values many mid_level values and a few top_level values but the trick is the top and mid level values are null and are assumed to be the values to the left. So, for instance all the columns above would have top_level1 as the top multi-index value. My best idea so far is to use transpose, but the it fills Unnamed: # everywhere and doesn't seem to work. In Pandas 0.13 read_csv seems to have a header parameter that can take a list, but this doesn't seem to work with parse.

Read the article

< Previous Page | 13 14 15 16 17 18 19 20 21 22 23 24 | Next Page >