dtd parsing - Page 38 - Developer IT

STR_TO_DATE parsing in mysql

- by wo_shi_ni_ba_ba

I'm trying to parse "06/01/2010 15:00:00 08:00". The problem is the trailing hour offset: MySQL's date/time parsing can't handle it. Any ideas?

R Language XML Parsing

- by C S

What is the best way to parse this XML document so that it is available in R? I would like each country node to be an independent R object.

I have a webpage's source, let's say Newegg's website source, and I store it inside buffer[20000]. Now, I am trying to get the three titles for the three featured products on the home page. I know I have to search, but I don't know exactly how I would go about doing it. Thanks :D

UITableView not displaying parsed data

- by Graeme

I have a UITableView which is setup in Interface Builder and connected properly to its class in Xcode. I also have a "Importer" Class which downloads and parses an RSS feed and stores the information in an NSMutableArray. However I have verified the parsing is working properly (using breakpoints and NSlog) but no data is showing in the UITable View. Any ideas as to what the problem could be? I'm almost out of them. It's based on the XML performance Apple example. Here's the code for TableView.h: #import <UIKit/UIKit.h> #import "IncidentsImporter.h" @class SongDetailsController; @interface CurrentIncidentsTableViewController : UITableViewController <IncidentsImporterDelegate>{ NSMutableArray *incidents; SongDetailsController *detailController; UITableView *ctableView; IncidentsImporter *parser; } @property (nonatomic, retain) NSMutableArray *incidents; @property (nonatomic, retain, readonly) SongDetailsController *detailController; @property (nonatomic, retain) IncidentsImporter *parser; @property (nonatomic, retain) IBOutlet UITableView *ctableView; // Called by the ParserChoiceViewController based on the selected parser type. - (void)beginParsing; @end And the code for .m: #import "CurrentIncidentsTableViewController.h" #import "SongDetailsController.h" #import "Incident.h" @implementation CurrentIncidentsTableViewController @synthesize ctableView, incidents, parser, detailController; #pragma mark - #pragma mark View lifecycle - (void)viewDidLoad { [super viewDidLoad]; self.parser = [[IncidentsImporter alloc] init]; parser.delegate = self; [parser start]; UIBarButtonItem *refreshButton = [[UIBarButtonItem alloc] initWithBarButtonSystemItem:UIBarButtonSystemItemRefresh target:self action:@selector(beginParsing)]; self.navigationItem.rightBarButtonItem = refreshButton; [refreshButton release]; // Uncomment the following line to preserve selection between presentations. 
//self.clearsSelectionOnViewWillAppear = NO; // Uncomment the following line to display an Edit button in the navigation bar for this view controller. // self.navigationItem.rightBarButtonItem = self.editButtonItem; } - (void)viewWillAppear:(BOOL)animated { NSIndexPath *selectedRowIndexPath = [ctableView indexPathForSelectedRow]; if (selectedRowIndexPath != nil) { [ctableView deselectRowAtIndexPath:selectedRowIndexPath animated:NO]; } } // This method will be called repeatedly - once each time the user choses to parse. - (void)beginParsing { NSLog(@"Parsing has begun"); //self.navigationItem.rightBarButtonItem.enabled = NO; // Allocate the array for song storage, or empty the results of previous parses if (incidents == nil) { NSLog(@"Grabbing array"); self.incidents = [NSMutableArray array]; } else { [incidents removeAllObjects]; [ctableView reloadData]; } // Create the parser, set its delegate, and start it. self.parser = [[IncidentsImporter alloc] init]; parser.delegate = self; [parser start]; } /* - (void)viewDidAppear:(BOOL)animated { [super viewDidAppear:animated]; } */ /* - (void)viewWillDisappear:(BOOL)animated { [super viewWillDisappear:animated]; } */ /* - (void)viewDidDisappear:(BOOL)animated { [super viewDidDisappear:animated]; } */ - (BOOL)shouldAutorotateToInterfaceOrientation:(UIInterfaceOrientation)interfaceOrientation { // Override to allow orientations other than the default portrait orientation. return YES; } #pragma mark - #pragma mark Table view data source - (NSInteger)numberOfSectionsInTableView:(UITableView *)tableView { // Return the number of sections. return 1; } - (NSInteger)tableView:(UITableView *)tableView numberOfRowsInSection:(NSInteger)section { // Return the number of rows in the section. return [incidents count]; } // Customize the appearance of table view cells. 
- (UITableViewCell *)tableView:(UITableView *)tableView cellForRowAtIndexPath:(NSIndexPath *)indexPath { NSLog(@"Table Cell Sought"); static NSString *kCellIdentifier = @"MyCell"; UITableViewCell *cell = [ctableView dequeueReusableCellWithIdentifier:kCellIdentifier]; if (cell == nil) { cell = [[[UITableViewCell alloc] initWithStyle:UITableViewCellStyleDefault reuseIdentifier:kCellIdentifier] autorelease]; cell.textLabel.font = [UIFont boldSystemFontOfSize:14.0]; cell.accessoryType = UITableViewCellAccessoryDisclosureIndicator; } cell.textLabel.text = @"Test";//[[incidents objectAtIndex:indexPath.row] title]; return cell; } /* // Override to support conditional editing of the table view. - (BOOL)tableView:(UITableView *)tableView canEditRowAtIndexPath:(NSIndexPath *)indexPath { // Return NO if you do not want the specified item to be editable. return YES; } */ /* // Override to support editing the table view. - (void)tableView:(UITableView *)tableView commitEditingStyle:(UITableViewCellEditingStyle)editingStyle forRowAtIndexPath:(NSIndexPath *)indexPath { if (editingStyle == UITableViewCellEditingStyleDelete) { // Delete the row from the data source [tableView deleteRowsAtIndexPaths:[NSArray arrayWithObject:indexPath] withRowAnimation:YES]; } else if (editingStyle == UITableViewCellEditingStyleInsert) { // Create a new instance of the appropriate class, insert it into the array, and add a new row to the table view } } */ /* // Override to support rearranging the table view. - (void)tableView:(UITableView *)tableView moveRowAtIndexPath:(NSIndexPath *)fromIndexPath toIndexPath:(NSIndexPath *)toIndexPath { } */ /* // Override to support conditional rearranging of the table view. - (BOOL)tableView:(UITableView *)tableView canMoveRowAtIndexPath:(NSIndexPath *)indexPath { // Return NO if you do not want the item to be re-orderable. 
return YES; } */ #pragma mark - #pragma mark Table view delegate - (void)tableView:(UITableView *)tableView didSelectRowAtIndexPath:(NSIndexPath *)indexPath { self.detailController.incident = [incidents objectAtIndex:indexPath.row]; [self.navigationController pushViewController:self.detailController animated:YES]; } #pragma mark - #pragma mark Memory management - (void)didReceiveMemoryWarning { // Releases the view if it doesn't have a superview. [super didReceiveMemoryWarning]; // Relinquish ownership any cached data, images, etc that aren't in use. } - (void)viewDidUnload { // Relinquish ownership of anything that can be recreated in viewDidLoad or on demand. // For example: self.myOutlet = nil; } - (void)parserDidEndParsingData:(IncidentsImporter *)parser { [ctableView reloadData]; self.navigationItem.rightBarButtonItem.enabled = YES; self.parser = nil; } - (void)parser:(IncidentsImporter *)parser didParseIncidents:(NSArray *)parsedIncidents { //[incidents addObjectsFromArray: parsedIncidents]; // Three scroll view properties are checked to keep the user interface smooth during parse. When new objects are delivered by the parser, the table view is reloaded to display them. If the table is reloaded while the user is scrolling, this can result in eratic behavior. dragging, tracking, and decelerating can be checked for this purpose. When the parser finishes, reloadData will be called in parserDidEndParsingData:, guaranteeing that all data will ultimately be displayed even if reloadData is not called in this method because of user interaction. if (!ctableView.dragging && !ctableView.tracking && !ctableView.decelerating) { self.title = [NSString stringWithFormat:NSLocalizedString(@"Top %d Songs", @"Top Songs format"), [parsedIncidents count]]; [ctableView reloadData]; } } - (void)parser:(IncidentsImporter *)parser didFailWithError:(NSError *)error { // handle errors as appropriate to your application... } - (void)dealloc { [super dealloc]; } @end

XML: what processing rules apply for values intertwined with tags?

- by iCE-9

I've started working on a simple XML pull-parser, and as I've just defuzzed my mind on what's correct syntax in XML with regards to certain characters/sequences, ignorable whitespace and such (thank you, http://www.w3schools.com/xml/xml_elements.asp), I realized that I still don't know squat about the following case (which Validome finds perfectly well-formed; note that I only want to use XML files for data storage, no entities, DTD or Schemas needed): <bookstore> <book id="1"> <author>Kurt Vonnegut Jr.</author> <title>Slapstick</title> </book> We drop a pie here. <book id="2">Who cares anyway? <author>Stephen King</author> <title>The Green Mile</title> </book> And another one here. <book id="3"> <author>Next one</author> <title>This time with its own title</title> </book> </bookstore> "We drop a pie here." and "And another one here." are values of the 'bookstore' element. "Who cares anyway?" is a value related to the second 'book' element. How are these processed, if at all? Will "We drop a pie here." and "And another one here." be concatenated to form one value for the 'bookstore' element, or are they treated separately, stored somewhere, affecting the outcome of the parsing of the element they belong to, or...?
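As a point of comparison, here is a minimal sketch of how one existing parser exposes this kind of mixed content. It uses Python's standard xml.etree.ElementTree (not the pull-parser being written in the question), and the helper name text_chunks is made up for illustration. The stray text is kept as separate runs of character data, attached to the position where it occurred, rather than merged into one value:

```python
import xml.etree.ElementTree as ET

# The document from the question, lightly reindented.
DOC = """<bookstore>
  <book id="1"><author>Kurt Vonnegut Jr.</author><title>Slapstick</title></book>
  We drop a pie here.
  <book id="2">Who cares anyway?
    <author>Stephen King</author><title>The Green Mile</title></book>
  And another one here.
  <book id="3"><author>Next one</author><title>This time with its own title</title></book>
</bookstore>"""


def text_chunks(element):
    """Return the non-blank character data sitting directly inside `element`.

    ElementTree keeps the text before the first child in `element.text`
    and the text following each child's end tag in that child's `tail`.
    """
    chunks = [element.text] + [child.tail for child in element]
    return [c.strip() for c in chunks if c and c.strip()]


root = ET.fromstring(DOC)
bookstore_text = text_chunks(root)
second_book_text = text_chunks(root.find("book[@id='2']"))
print(bookstore_text)
print(second_book_text)
```

Whether a data-storage format should accept such text at all, or report it as an error, is a design decision for the application; the XML syntax itself allows it.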

JavaCC: How can I specify which token(s) are expected in certain context?

- by java.is.for.desktop

Hello, everyone! I need to make JavaCC aware of a context (current parent token), and depending on that context, expect different token(s) to occur. Consider the following pseudo-code: TOKEN <abc> { "abc*" } // recognizes "abc", "abcd", "abcde", ... TOKEN <abcd> { "abcd*" } // recognizes "abcd", "abcde", "abcdef", ... TOKEN <element1> { "element1" "[" expectOnly(<abc>) "]" } TOKEN <element2> { "element2" "[" expectOnly(<abcd>) "]" } ... So when the generated parser is "inside" a token named "element1" and it encounters "abcdef", it recognizes it as <abc>, but when it's "inside" a token named "element2" it recognizes the same string as <abcd>. element1 [ abcdef ] // aha! it can only be <abc> element2 [ abcdef ] // aha! it can only be <abcd> If I'm not wrong, it would behave similarly to more complex DTD definitions of an XML file. So, how can one specify in which "context" which token(s) are valid/expected? NOTE: It would not be enough for my real case to define a kind of "hierarchy" of tokens, so that "abcdef" is always first matched against <abcd> and then <abc>. I really need context-aware tokens.

How to parse nagios status.dat file?

- by daniels

I'd like to parse status.dat file for nagios3 and output as xml with a python script. The xml part is the easy one but how do I go about parsing the file? Use multi line regex? It's possible the file will be large as many hosts and services are monitored, will loading the whole file in memory be wise? I only need to extract services that have critical state and host they belong to. Any help and pointing in the right direction will be highly appreciated. LE Here's how the file looks: ######################################## # NAGIOS STATUS FILE # # THIS FILE IS AUTOMATICALLY GENERATED # BY NAGIOS. DO NOT MODIFY THIS FILE! ######################################## info { created=1233491098 version=2.11 } program { modified_host_attributes=0 modified_service_attributes=0 nagios_pid=15015 daemon_mode=1 program_start=1233490393 last_command_check=0 last_log_rotation=0 enable_notifications=1 active_service_checks_enabled=1 passive_service_checks_enabled=1 active_host_checks_enabled=1 passive_host_checks_enabled=1 enable_event_handlers=1 obsess_over_services=0 obsess_over_hosts=0 check_service_freshness=1 check_host_freshness=0 enable_flap_detection=0 enable_failure_prediction=1 process_performance_data=0 global_host_event_handler= global_service_event_handler= total_external_command_buffer_slots=4096 used_external_command_buffer_slots=0 high_external_command_buffer_slots=0 total_check_result_buffer_slots=4096 used_check_result_buffer_slots=0 high_check_result_buffer_slots=2 } host { host_name=localhost modified_attributes=0 check_command=check-host-alive event_handler= has_been_checked=1 should_be_scheduled=0 check_execution_time=0.019 check_latency=0.000 check_type=0 current_state=0 last_hard_state=0 plugin_output=PING OK - Packet loss = 0%, RTA = 3.57 ms performance_data= last_check=1233490883 next_check=0 current_attempt=1 max_attempts=10 state_type=1 last_state_change=1233489475 last_hard_state_change=1233489475 last_time_up=1233490883 last_time_down=0 
last_time_unreachable=0 last_notification=0 next_notification=0 no_more_notifications=0 current_notification_number=0 notifications_enabled=1 problem_has_been_acknowledged=0 acknowledgement_type=0 active_checks_enabled=1 passive_checks_enabled=1 event_handler_enabled=1 flap_detection_enabled=1 failure_prediction_enabled=1 process_performance_data=1 obsess_over_host=1 last_update=1233491098 is_flapping=0 percent_state_change=0.00 scheduled_downtime_depth=0 } service { host_name=gateway service_description=PING modified_attributes=0 check_command=check_ping!100.0,20%!500.0,60% event_handler= has_been_checked=1 should_be_scheduled=1 check_execution_time=4.017 check_latency=0.210 check_type=0 current_state=0 last_hard_state=0 current_attempt=1 max_attempts=4 state_type=1 last_state_change=1233489432 last_hard_state_change=1233489432 last_time_ok=1233491078 last_time_warning=0 last_time_unknown=0 last_time_critical=0 plugin_output=PING OK - Packet loss = 0%, RTA = 2.98 ms performance_data= last_check=1233491078 next_check=1233491378 current_notification_number=0 last_notification=0 next_notification=0 no_more_notifications=0 notifications_enabled=1 active_checks_enabled=1 passive_checks_enabled=1 event_handler_enabled=1 problem_has_been_acknowledged=0 acknowledgement_type=0 flap_detection_enabled=1 failure_prediction_enabled=1 process_performance_data=1 obsess_over_service=1 last_update=1233491098 is_flapping=0 percent_state_change=0.00 scheduled_downtime_depth=0 } It can have any number of hosts and a host can have any number of services.
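One way to avoid both multi-line regexes and loading the whole file is to read it line by line and emit one block at a time. The sketch below assumes the on-disk layout puts each "type {", each key=value pair, and each closing "}" on its own line (the listing above has lost its line breaks), and that a critical service has current_state=2. The function names and the acceptance of both "service" and "servicestatus" as block names are assumptions made for illustration.

```python
def iter_blocks(lines):
    """Yield (block_type, fields) for each 'type { ... }' block, one at a time."""
    block_type, fields = None, None
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the comment banner
        if block_type is None:
            if line.endswith("{"):
                block_type, fields = line[:-1].strip(), {}
        elif line == "}":
            yield block_type, fields
            block_type, fields = None, None
        else:
            # Split on the first '=' only: values such as plugin_output
            # may themselves contain '='.
            key, _, value = line.partition("=")
            fields[key] = value


def critical_services(lines):
    """Yield (host_name, service_description) for services in state 2."""
    for block_type, fields in iter_blocks(lines):
        if block_type in ("service", "servicestatus") and fields.get("current_state") == "2":
            yield fields["host_name"], fields["service_description"]


# Hypothetical sample data in the assumed one-item-per-line layout.
SAMPLE = [
    "info {",
    "    created=1233491098",
    "    }",
    "service {",
    "    host_name=gateway",
    "    service_description=PING",
    "    current_state=2",
    "    plugin_output=PING CRITICAL - Packet loss = 100%",
    "    }",
    "service {",
    "    host_name=gateway",
    "    service_description=SSH",
    "    current_state=0",
    "    }",
]
critical = list(critical_services(SAMPLE))
print(critical)
```

Because iter_blocks only ever holds one block in memory, passing an open file object instead of the sample list should keep memory use flat regardless of how many hosts and services are monitored.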

Iterating through json object doesn't seem to work for me...

- by Pandiya Chendur

From a previous question on Stackoverflow Iterating through/Parsing JSON Object via JavaScript.... My json object doesn't seem get parsed.... here is my function function Iteratejsondata(HfJsonValue) { var jsonObj = eval('(' + HfJsonValue + ')'); for (var i = 0, len = HfJsonValue.length; i < len; ++i) { var employee = HfJsonValue[i]; document.write(employee.Emp_Name); } } employee.Emp_Name is undefined but when i give document.write(employee); i get this {"Table" : [{"Emp_Id" : "3","Identity_No" : "","Emp_Name" : "Jerome","Address" : "Madurai","Date_Of_Birth" : "","Desig_Name" : "Supervisior","Desig_Description" : "Supervisior of the Construction","SalaryBasis" : "Monthly","FixedSalary" : "25000.00"},{"Emp_Id" : "4","Identity_No" : "","Emp_Name" : "Mohan","Address" : "Madurai","Date_Of_Birth" : "","Desig_Name" : "Acc ","Desig_Description" : "Accountant","SalaryBasis" : "Monthly","FixedSalary" : "200.00"},{"Emp_Id" : "5","Identity_No" : "","Emp_Name" : "Murugan","Address" : "Madurai","Date_Of_Birth" : "","Desig_Name" : "Mason","Desig_Description" : "Mason","SalaryBasis" : "Weekly","FixedSalary" : "150.00"},{"Emp_Id" : "6","Identity_No" : "","Emp_Name" : "Ram","Address" : "Madurai","Date_Of_Birth" : "","Desig_Name" : "Mason","Desig_Description" : "Mason","SalaryBasis" : "Weekly","FixedSalary" : "120.00"},{"Emp_Id" : "7","Identity_No" : "","Emp_Name" : "Raja","Address" : "Madurai","Date_Of_Birth" : "","Desig_Name" : "Mason","Desig_Description" : "Mason","SalaryBasis" : "Weekly","FixedSalary" : "135.00"},{"Emp_Id" : "8","Identity_No" : "","Emp_Name" : "Raja kumar","Address" : "Madurai","Date_Of_Birth" : "","Desig_Name" : "Mason Helper","Desig_Description" : "Mason Helper","SalaryBasis" : "Weekly","FixedSalary" : "105.00"},{"Emp_Id" : "9","Identity_No" : "","Emp_Name" : "Lakshmi","Address" : "Madurai","Date_Of_Birth" : "","Desig_Name" : "Mason Helper","Desig_Description" : "Mason Helper","SalaryBasis" : "Weekly","FixedSalary" : "100.00"},{"Emp_Id" : "10","Identity_No" : 
"","Emp_Name" : "Palani","Address" : "Madurai","Date_Of_Birth" : "","Desig_Name" : "Carpenter","Desig_Description" : "Carpenter","SalaryBasis" : "Weekly","FixedSalary" : "200.00"},{"Emp_Id" : "11","Identity_No" : "","Emp_Name" : "Annamalai","Address" : "Madurai","Date_Of_Birth" : "","Desig_Name" : "Carpenter","Desig_Description" : "Carpenter","SalaryBasis" : "Weekly","FixedSalary" : "220.00"},{"Emp_Id" : "12","Identity_No" : "","Emp_Name" : "David","Address" : "Madurai","Date_Of_Birth" : "","Desig_Name" : "Steel Fixer","Desig_Description" : "Steel Fixer","SalaryBasis" : "Weekly","FixedSalary" : "220.00"},{"Emp_Id" : "13","Identity_No" : "","Emp_Name" : "Chandru","Address" : "Madurai","Date_Of_Birth" : "","Desig_Name" : "Steel Fixer","Desig_Description" : "Steel Fixer","SalaryBasis" : "Weekly","FixedSalary" : "220.00"},{"Emp_Id" : "14","Identity_No" : "","Emp_Name" : "Mani","Address" : "Madurai","Date_Of_Birth" : "","Desig_Name" : "Steel Helper","Desig_Description" : "Steel Helper","SalaryBasis" : "Weekly","FixedSalary" : "175.00"},{"Emp_Id" : "15","Identity_No" : "","Emp_Name" : "Karthik","Address" : "Madurai","Date_Of_Birth" : "","Desig_Name" : "Wood Fixer","Desig_Description" : "Wood Fixer","SalaryBasis" : "Weekly","FixedSalary" : "195.00"},{"Emp_Id" : "16","Identity_No" : "","Emp_Name" : "Bala","Address" : "Madurai","Date_Of_Birth" : "","Desig_Name" : "Wood Fixer","Desig_Description" : "Wood Fixer","SalaryBasis" : "Weekly","FixedSalary" : "185.00"},{"Emp_Id" : "17","Identity_No" : "","Emp_Name" : "Tamil arasi","Address" : "Madurai","Date_Of_Birth" : "","Desig_Name" : "Wood Helper","Desig_Description" : "Wood Helper","SalaryBasis" : "Weekly","FixedSalary" : "185.00"},{"Emp_Id" : "18","Identity_No" : "","Emp_Name" : "Perumal","Address" : "Madurai","Date_Of_Birth" : "","Desig_Name" : "Cook","Desig_Description" : "Cook","SalaryBasis" : "Weekly","FixedSalary" : "105.00"},{"Emp_Id" : "19","Identity_No" : "","Emp_Name" : "Andiappan","Address" : 
"Madurai","Date_Of_Birth" : "","Desig_Name" : "Watchman","Desig_Description" : "Watchman","SalaryBasis" : "Weekly","FixedSalary" : "150.00"}]} Any suggestion how to get this done...
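One likely cause, judging only from the posted function, is that the loop walks HfJsonValue (the raw input) instead of the parsed object, and that the employee records sit under the "Table" key rather than at the top level. The sketch below shows the same two steps in Python rather than JavaScript: parse first, then iterate over the array inside "Table". The shortened sample string and the function name are illustrative only.

```python
import json

# Shortened, hypothetical sample with the same shape as the data above.
raw = '{"Table": [{"Emp_Id": "3", "Emp_Name": "Jerome"}, {"Emp_Id": "4", "Emp_Name": "Mohan"}]}'


def employee_names(json_text):
    """Parse the JSON text, then iterate over the records under "Table"."""
    data = json.loads(json_text)   # parsed object, not the raw string
    return [row["Emp_Name"] for row in data["Table"]]


names = employee_names(raw)
print(names)
```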

Is there any open source tool that automatically 'detects' email threading like Gmail?

- by Chris W.

For instance, if the original message (message 1) is... Hey Jon, Want to go get some pizza? -Bill And the reply (message 2) is... Bill, Sorry, I can't make lunch today. Jonathon Parks, CTO Acme Systems On Wed, Feb 24, 2010 at 4:43 PM, Bill Waters wrote: > Hey Jon, > Want to go get some pizza? > -Bill In Gmail, the system (a) detects that message 2 is a reply to message 1 and turns this into a 'thread' of sorts and (b) detects where the replied portion of the message actually is and hides it from the user. (In this case the hidden portion would start at "On Wed, Feb..." and continue to the end of the message.) Obviously, in this simple example it would be easy to detect the "On <Date>, <Name> wrote:" line or the ">" character prefixes. But many email systems have many different styles of marking replies (not to mention HTML emails). I get the feeling that you would need some damn smart string parsing algorithms to get anywhere near how good Gmail's is. Does this technology already exist in an open source project somewhere? Either in some library devoted to this exclusively or perhaps in some open source email client that does similar message threading? Thanks.
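The two easy cases named in the question can be sketched in a few lines. This is only a naive baseline, not how Gmail does it: threading is checked through the standard Message-ID, In-Reply-To and References headers, and the quoted part is cut at the first "On ... wrote:" line or ">" prefix. The function names, the header dictionaries and the message IDs are hypothetical.

```python
import re

# Matches attribution lines such as "On Wed, Feb 24, 2010 at 4:43 PM, Bill Waters wrote:"
ATTRIBUTION = re.compile(r"^On .+ wrote:$")


def is_reply_to(reply_headers, original_headers):
    """True if the reply names the original's Message-ID in its threading headers."""
    parent_id = original_headers.get("Message-ID")
    refs = (reply_headers.get("In-Reply-To", "") + " " +
            reply_headers.get("References", "")).split()
    return parent_id is not None and parent_id in refs


def strip_quoted_reply(body):
    """Keep everything before the first attribution line or '>'-quoted line."""
    kept = []
    for line in body.splitlines():
        stripped = line.strip()
        if ATTRIBUTION.match(stripped) or stripped.startswith(">"):
            break
        kept.append(line)
    return "\n".join(kept).rstrip()


reply_body = (
    "Bill,\n"
    "Sorry, I can't make lunch today.\n"
    "\n"
    "Jonathon Parks, CTO\n"
    "Acme Systems\n"
    "\n"
    "On Wed, Feb 24, 2010 at 4:43 PM, Bill Waters wrote:\n"
    "> Hey Jon,\n"
    "> Want to go get some pizza?\n"
    "> -Bill"
)
visible = strip_quoted_reply(reply_body)
threaded = is_reply_to(
    {"In-Reply-To": "<msg1@example.com>"},   # hypothetical IDs
    {"Message-ID": "<msg1@example.com>"},
)
print(visible)
```

Other quoting styles (localized attribution lines, "-----Original Message-----" separators, HTML blockquotes) would each need their own rule, which is where the real difficulty described above lies.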

How can I use Convert.ChangeType to convert string into numerics with group separator?

- by Loic

Hello, I want to make a generic string-to-numeric converter and provide it as a string extension, so I wrote the following code: public static bool TryParse<T>( this string text, out T result, IFormatProvider formatProvider ) where T : struct try { result = (T)Convert.ChangeType( text, typeof( T ), formatProvider ); return true; } catch(... I call it like this: int value; var ok = "123".TryParse(out value, NumberFormatInfo.CurrentInfo) It works fine until I want to use a group separator. I live in France, where the thousands separator is a space and the decimal separator is a comma, so the string "1 234 567,89" should be equal to 1234567.89 (in the invariant culture). But the function crashes! When I try to perform a non-generic conversion, like double.Parse(...), I can use an overload which accepts a NumberStyles parameter. I specify NumberStyles.Number and this time it works! So, the questions are: Why does the parsing not respect my NumberFormatInfo (where the NumberGroupSeparator is correctly set to a space, as specified in my OS)? How can I make the generic version work with Convert.ChangeType, as it has no overload which accepts a NumberStyles parameter?

Extract known pattern substring from NSString (without regex)

- by d11wtq

I'm really tempted to drop RegexKit (or my own libpcre wrapper) into my project in order to do this, but before I do that I want to know how Cocoa developers manage to do half of this basic stuff without really convoluted code or without linking with RegexKit or another regular expression library. I find it gobsmacking that Cocoa does not include any regular expression matching features. I've become so accustomed to using regular expressions for all kinds of things that I'm lost without them. I can do what I need without them, but the code would be rather convoluted. So, Cocoa devs, I ask you, what's the "Cocoa way" to do this... The problem is an everyday problem in programming as far as I'm concerned. Cocoa must have ways of doing this with the built-in features. Note that the position of the elements I want to match changes, and sometimes "quotes" are present. Whitespace is variable. Take the following strings: Content-Type: application/xml; charset=utf-8 Content-Type: text/html; charset="iso-8859-1" Content-Type: text/plain; charset=us-ascii Content-Type: text/plain; name="example.txt"; charset=utf-8 From all of these strings, how would you go about determining the mime type (e.g. text/plain) and the charset (e.g. utf-8) using just the built-in Cocoa classes? I'd end up performing a series of -rangeOfString: and substring calls, with conditional checks to deal with the optional quotes etc. Is there a way to do this with NSScanner? The NSScanner class seems to have a pretty naive API to me. Something like C's sscanf() that works for NSString objects would be an ideal fit. Most of my string parsing needs are simple such as this example, so maybe regular expressions, while I'm accustomed to them, are overkill?
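To show how little machinery the regex-free route needs, here is a sketch of the split-and-trim approach in Python rather than Cocoa. It splits on ":" and ";", trims whitespace, and strips optional quotes; the function name is made up, and the sketch assumes no parameter value contains a ";" inside its quotes. The same sequence of steps should map onto whatever split and trim calls the string class at hand provides.

```python
def parse_content_type(header_line):
    """Split a Content-Type header into (mime_type, params) without regex."""
    _, _, value = header_line.partition(":")          # drop the header name
    parts = [p.strip() for p in value.split(";")]     # mime type, then parameters
    mime_type = parts[0].lower()
    params = {}
    for part in parts[1:]:
        name, sep, val = part.partition("=")
        if sep:                                        # ignore malformed pieces
            params[name.strip().lower()] = val.strip().strip('"')
    return mime_type, params


samples = [
    'Content-Type: application/xml; charset=utf-8',
    'Content-Type: text/html; charset="iso-8859-1"',
    'Content-Type: text/plain; charset=us-ascii',
    'Content-Type: text/plain; name="example.txt"; charset=utf-8',
]
parsed = [parse_content_type(s) for s in samples]
for mime_type, params in parsed:
    print(mime_type, params.get("charset"))
```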

Perl XML SAX parser emulating XML::Simple record for record

- by DVK

Short Q summary: I am looking for a fast XML parser (most likely a wrapper around some standard SAX parser) which will produce per-record data structures 100% identical to those produced by XML::Simple. Details: We have a large code infrastructure which depends on processing records one-by-one and expects each record to be a data structure in the format produced by XML::Simple, because it has used XML::Simple since the early Jurassic era. An example simple XML is: <root> <rec><f1>v1</f1><f2>v2</f2></rec> <rec><f1>v1b</f1><f2>v2b</f2></rec> <rec><f1>v1c</f1><f2>v2c</f2></rec> </root> And example rough code is: sub process_record { my ($obj, $record_hash) = @_; # do_stuff } my $records = XML::Simple->XMLin(@args)->{root}; foreach my $record (@$records) { $obj->process_record($record) }; As everyone knows XML::Simple is, well, simple. And more importantly, it is very slow and a memory hog - due to being a DOM parser and needing to build/store 100% of data in memory. So, it's not the best tool for parsing an XML file consisting of a large number of small records record-by-record. However, re-writing the entire code (which consists of a large number of "process_record"-like methods) to work with a standard SAX parser seems like a big task not worth the resources, even at the cost of living with XML::Simple. What I'm looking for is an existing module which will probably be based on a SAX parser (or anything fast with a small memory footprint) which can be used to produce $record hashrefs one by one based on the XML pictured above that can be passed to $obj->process_record($record) and be 100% identical to what XML::Simple's hashrefs would have been. I don't care much what the interface of the new module is - e.g. whether I need to call next_record() or give it a callback coderef accepting a record.
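The record-at-a-time shape being asked for can be sketched as follows. This is Python's xml.etree.ElementTree.iterparse, not a Perl module, and it only illustrates the technique: build one small record, hand it over, discard it. It makes no attempt to reproduce XML::Simple's handling of attributes, repeated elements or its many options; the function name and the default tag "rec" are assumptions taken from the example XML.

```python
import io
import xml.etree.ElementTree as ET


def iter_records(source, tag="rec"):
    """Yield one dict per <rec> element while the file is still being read."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == tag:
            yield {child.tag: child.text for child in elem}
            elem.clear()  # drop the record's children once it has been consumed


SAMPLE = io.BytesIO(
    b"<root>"
    b"<rec><f1>v1</f1><f2>v2</f2></rec>"
    b"<rec><f1>v1b</f1><f2>v2b</f2></rec>"
    b"<rec><f1>v1c</f1><f2>v2c</f2></rec>"
    b"</root>"
)
records = list(iter_records(SAMPLE))
for record in records:
    print(record)
```

In real use the loop body would call the existing per-record handler instead of collecting the records into a list, so that only one record is alive at any time.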

Extracting images from a PDF

- by sagar

My Query I want to extract only images from a PDF document, using Objective-C in an iPhone Application. My Efforts I have gone through the info on this link, which has details regarding different operators on PDF documents. I also studied this document from Apple about PDF parsing with Quartz. I also went through the entire PDF reference document from the Adobe site. According to that document, for each image there are the following operators: q Q BI EI I have created a table to get the image: myTable = CGPDFOperatorTableCreate(); CGPDFOperatorTableSetCallback(myTable, "q", arrayCallback2); CGPDFOperatorTableSetCallback(myTable, "TJ", arrayCallback); CGPDFOperatorTableSetCallback(myTable, "Tj", stringCallback); I use this method to get the image: void arrayCallback2(CGPDFScannerRef inScanner, void *userInfo) { // THIS DOESN'T WORK // CGPDFStreamRef stream; // represents a sequence of bytes // if (CGPDFDictionaryGetStream (d, "BI", &stream)){ // CGPDFDataFormat t=CGPDFDataFormatJPEG2000; // CFDataRef data = CGPDFStreamCopyData (stream, &t); // } } This method is called for the operator "q", but I don't know how to extract an image from it. What should be the solution for extracting the images from the PDF documents? Thanks in advance for your kind help.

Using JavaCC to infer semantics from a Composite tree

- by Skice

Hi all, I am programming (in Java) a very limited symbolic calculus library that manages polynomials, exponentials and expolynomials (sums of elements like "x^n * e^(c x)"). I want the library to be extensible in the sense of new analytic forms (trigonometric, etc.) or new kinds of operations (logarithm, domain transformations, etc.), so a Composite pattern that represents the syntactic structure of an expression, together with a bunch of Visitors for the operations, does the job quite well. My problem arises when I try to implement operations that depend on the semantics more than on the syntax of the Expression (like integrals, for instance: there are a lot of resolution methods for specific classes of functions, but these same classes can be represented with more than a single syntax). So I thought I need something to "parse" the Composite tree to infer its semantics in order to invoke the right integration method (if any). Someone pointed me to JavaCC, but all the examples I've seen deal only with string parsing, so I don't know if I'm digging in the right direction. Any suggestions? (I hope I have been clear enough!)

How to parse text fragments located outside tags (inbetween tags) by simplehtmldom?

- by moogeek

Hello! I'm using simplehtmldom to parse HTML and I'm stuck on parsing plain text located outside of any tag (but between two different tags): <div class="text_small"> <b>Adress:</b> 7 Hange Road<br> <b>Phone:</b> 415641587484<br> <b>Contact:</b> Alex<br> <b>Meeting Time:</b> 12:00-13:00<br> </div> Is it possible to get these values of Adress, Phone, Contact, Meeting Time? I wonder if there is a way to pass CSS selectors into the nextSibling/previousSibling functions... foreach($html->find('div.text_small') as $div_descr) { foreach($div_descr->find('b') as $b) { if ($b->innertext=="Adress:") {//someaction } if ($b->innertext=="Phone:") { //someaction } if ($b->innertext=="Contact:") { //someaction } if ($b->innertext=="Meeting Time:") { //someaction } } } What should I use instead of "someaction"? upd. Yes, I don't have access to edit the target page. Otherwise, would it be worth asking? :)
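In DOM terms, each value is the bare text that follows a <b> label and ends at the next <br>. The sketch below shows that idea with Python's standard html.parser instead of simplehtmldom, so none of the names are simplehtmldom API: the class LabelValueParser and its attributes are made up for illustration. It collects text after each closing </b> until a <br> is seen.

```python
from html.parser import HTMLParser


class LabelValueParser(HTMLParser):
    """Pair each <b>label</b> with the loose text that follows it, up to <br>."""

    def __init__(self):
        super().__init__()
        self.pairs = {}
        self._in_label = False   # currently inside <b>...</b>
        self._label = None       # label waiting for its value
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag == "b":
            self._finish_value()          # a new label closes any pending value
            self._in_label = True
        elif tag == "br":
            self._finish_value()

    def handle_endtag(self, tag):
        if tag == "b":
            self._label = "".join(self._buffer).strip()
            self._in_label = False
            self._buffer = []

    def handle_data(self, data):
        if self._in_label or self._label is not None:
            self._buffer.append(data)

    def _finish_value(self):
        if self._label is not None:
            self.pairs[self._label] = "".join(self._buffer).strip()
        self._label = None
        self._buffer = []


HTML = ('<div class="text_small"> <b>Adress:</b> 7 Hange Road<br> '
        '<b>Phone:</b> 415641587484<br> <b>Contact:</b> Alex<br> '
        '<b>Meeting Time:</b> 12:00-13:00<br> </div>')
parser = LabelValueParser()
parser.feed(HTML)
pairs = parser.pairs
print(pairs)
```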

Can't read some attributes with SAX

- by akappa

Hi all, I'm trying to parse this document with SAX: <scxml version="1.0" initialstate="start" name="calc"> <datamodel> <data id="expr" expr="0" /> <data id="res" expr="0" /> </datamodel> <state id="start"> <transition event="OPER" target="opEntered" /> <transition event="DIGIT" target="operand" /> </state> <state id="operand"> <transition event="OPER" target="opEntered" /> <transition event="DIGIT" /> </state> </scxml> I can read all the attributes, except "initialstate" and "name"... I get the attributes in the startElement handler, but the size of the attribute list for scxml is zero. Why? How can I overcome that problem? Edit: public void startElement(String uri, String localName, String qName, Attributes attributes){ System.out.println(attributes.getValue("initialstate")); System.out.println(attributes.getValue("name")); } This doesn't work when parsing the first tag (it prints "null" twice). In fact, attributes.getLength(); evaluates to zero. Thanks
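For comparison, here is a minimal sketch of what a SAX handler is normally expected to see for the root element of a document like this. It uses Python's xml.sax rather than the Java API from the question, on a shortened copy of the document, and it does not diagnose the cause of the empty attribute list; it is only a reference point showing that root-element attributes are reported like any others. The handler class name is made up.

```python
import xml.sax


class RootAttributeHandler(xml.sax.ContentHandler):
    """Record the attributes reported for the <scxml> start tag."""

    def __init__(self):
        super().__init__()
        self.seen = {}

    def startElement(self, name, attrs):
        if name == "scxml":
            self.seen = {key: attrs.getValue(key) for key in attrs.getNames()}


# Shortened copy of the document from the question.
DOC = (b'<scxml version="1.0" initialstate="start" name="calc">'
       b'<state id="start"><transition event="OPER" target="opEntered" /></state>'
       b'</scxml>')

handler = RootAttributeHandler()
xml.sax.parseString(DOC, handler)
print(handler.seen)
```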

BeautifulSoup can't parse a webpage?

- by JLTChiu

I am using Beautiful Soup to parse a webpage. I've heard it's well known and good, but it doesn't seem to work properly. Here's what I did: import urllib2 from bs4 import BeautifulSoup page = urllib2.urlopen("http://www.cnn.com/2012/10/14/us/skydiver-record-attempt/index.html?hpt=hp_t1") soup = BeautifulSoup(page) print soup.prettify() I think this is fairly straightforward: I open the webpage and pass it to Beautiful Soup. But here's what I got: Warning (from warnings module): File "C:\Python27\lib\site-packages\bs4\builder\_htmlparser.py", line 149 "Python's built-in HTMLParser cannot parse the given document. This is not a bug in Beautiful Soup. The best solution is to install an external parser (lxml or html5lib), and use Beautiful Soup with that parser. See http://www.crummy.com/software/BeautifulSoup/bs4/doc/#installing-a-parser for help.")) ... HTMLParseError: bad end tag: u'</"+"script>', at line 634, column 94 I thought the CNN website would be well designed, so I am not sure what's going on. Does anyone have an idea about this?

Read the article

Why does 12:20 PM parse to 0:20 on the next day?

- by Hanno Fietz

I'm using java.text.SimpleDateFormat to parse string representations of date/time values inside an XML document. I'm seeing all times that have an hour value of 12 shifted by 12 hours into the future, i.e. 20 minutes past noon gets parsed to mean 20 minutes past midnight the following day. I wrote a unit test which seems to confirm that the error is made upon parsing (I checked the return values from getTime() with the Linux shell command date). Now I'm wondering:

- is there a bug in the parse() method?
- is there something wrong with the input string?
- am I using the wrong format string for the input?

The input data is taken from Yahoo's YWeather service. Here's the test and its output:

    public class YWeatherReaderTest {
        public static final String[] rgDateSamples = {
            "Thu, 08 Apr 2010 12:20 PM CEST",
            "Thu, 08 Apr 2010 12:20 AM CEST"
        };

        public void dateParsing() throws ParseException {
            DateFormat formatter = new SimpleDateFormat("EEE, dd MMM yyyy K:m a z", Locale.US);
            for (String dtsSrc : YWeatherReaderTest.rgDateSamples) {
                Date dt = formatter.parse(dtsSrc);
                String dtsDst = formatter.format(dt);
                System.out.println(dtsSrc);
                System.out.println(dtsDst);
                System.out.println();
            }
        }
    }

    Thu, 08 Apr 2010 12:20 PM CEST
    Fri, 09 Apr 2010 0:20 AM CEST

    Thu, 08 Apr 2010 12:20 AM CEST
    Thu, 08 Apr 2010 0:20 PM CEST

The second output line of the second iteration is slightly weird, because 00:20 isn't PM. The milliseconds value of the Date object, however, corresponds to the (wrong) time of 20 minutes past noon.
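One possible reading of this output: in SimpleDateFormat's pattern syntax, `K` is the hour in am/pm on a 0-11 scale, while `h` is the hour in am/pm on a 1-12 scale. The sketch below is a toy Python model of lenient field arithmetic, not the Java implementation, and the function names `resolve_k` and `resolve_h` are hypothetical. It treats the parsed hour as an offset from the start of the AM or PM half of the day.

```python
from datetime import datetime, timedelta


def resolve_k(day, hour, minute, pm):
    """Model of pattern letter K (hour 0-11), lenient: the hour is an offset into the half-day."""
    half_day_start = day + timedelta(hours=12 if pm else 0)
    return half_day_start + timedelta(hours=hour, minutes=minute)


def resolve_h(day, hour, minute, pm):
    """Model of pattern letter h (hour 1-12): 12 denotes the first hour of the half-day."""
    return resolve_k(day, hour % 12, minute, pm)


day = datetime(2010, 4, 8)
print(resolve_k(day, 12, 20, pm=True))   # 2010-04-09 00:20:00
print(resolve_k(day, 12, 20, pm=False))  # 2010-04-08 12:20:00
print(resolve_h(day, 12, 20, pm=True))   # 2010-04-08 12:20:00
print(resolve_h(day, 12, 20, pm=False))  # 2010-04-08 00:20:00
```

Under this model both samples from the question reproduce with `K`, which suggests that trying `h` in place of `K` in the format string is a reasonable first step.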

Read the article

Problem using NSXMLParser with NOAA data on iPhone

- by Amagrammer

Can anyone help me see why NSXMLParser is not causing these methods parser:didStartElement:namespaceURI:qualifiedName:attributes: parser:didEndElement:namespaceURI:qualifiedName:attributes: to fire for the part of the following data: <?xml version="1.0" encoding="ISO-8859-1"?><SOAP-ENV:Envelope SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/" xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:SOAP-ENC="http://schemas.xmlsoap.org/soap/encoding/"><SOAP-ENV:Body><ns1:NDFDgenResponse xmlns:ns1=""><dwmlOut xsi:type="xsd:string"><?xml version="1.0"?> <dwml version="1.0" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="http://www.nws.noaa.gov/forecasts/xml/DWMLgen/schema/DWML.xsd"> (body excluded) </dwml> </dwmlOut></ns1:NDFDgenResponse></SOAP-ENV:Body></SOAP-ENV:Envelope> I'm not an XML expert, but to me, the part looks like just a regular element, to be parsed just like the parts before it. I do get two parser:parseErrorOccurred: errors, #200 and #201, but they occur during the parsing of the <SOAP-ENV:Body> element, not the element, so I'm not sure if they are relevant. Thanks for any help you can give me.

Read the article

How to make a small engine like Wolfram|Alpha?

- by Koning WWWWWWWWWWWWWWWWWWWWWWW

Let's say I have three models/tables: operating_systems, words, and programming_languages:

    # operating_systems
    name:string   created_by:string   family:string
    Windows       Microsoft           MS-DOS
    Mac OS X      Apple               UNIX
    Linux         Linus Torvalds      UNIX
    UNIX          AT&T                UNIX

    # words
    word:string   defenitions:string
    window        (serialized hash of definitions)
    hello         (serialized hash of definitions)
    UNIX          (serialized hash of definitions)

    # programming_languages
    name:string   created_by:string   example_code:text
    C++           Bjarne Stroustrup   #include <iostream> etc...
    HelloWorld    Jeff Skeet          h
    AnotherOne    Jon Atwood          imports 'SORULEZ.cs' etc...

When a user searches hello, the system shows the definitions of 'hello'. This is relatively easy to implement. However, when a user searches UNIX, the engine must choose: word or operating_system. Also, when a user searches windows (small letter 'w'), the engine chooses word, but should also show:

    Assuming 'windows' is a word. Use as an <a href="etc..">operating system</a> instead.

Can anyone point me in the right direction with parsing and choosing the topic of the search query? Thanks.

Note: it doesn't need to be able to perform calculations as WA can do.
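A rough sketch of one way to rank topics for a query, assuming the three tables are small enough to hold as in-memory lists of names. Everything here (the `TOPICS` dictionary, the scoring levels, the crude plural stripping, and the tie-break by table order) is an illustrative assumption rather than a known design. Exact matches outrank case-sensitive singular/plural matches, which outrank case-insensitive matches; the lower-ranked topics become the "use as ... instead" suggestions.

```python
# Toy in-memory stand-ins for the name columns of the three tables in the question.
TOPICS = {
    "operating_system": ["Windows", "Mac OS X", "Linux", "UNIX"],
    "word": ["window", "hello", "UNIX"],
    "programming_language": ["C++", "HelloWorld", "AnotherOne"],
}


def strip_plural(term):
    """Deliberately crude: drop one trailing 's' from terms longer than three characters."""
    return term[:-1] if len(term) > 3 and term.endswith("s") else term


def match_score(query, name):
    """3 = exact, 2 = same apart from plural 's', 1 = same apart from letter case, 0 = no match."""
    if query == name:
        return 3
    if strip_plural(query) == strip_plural(name):
        return 2
    if strip_plural(query.lower()) == strip_plural(name.lower()):
        return 1
    return 0


def classify(query):
    """Return (best_topic, alternative_topics); ties keep the order of TOPICS."""
    scored = []
    for topic, names in TOPICS.items():
        best = max((match_score(query, name) for name in names), default=0)
        if best:
            scored.append((best, topic))
    scored.sort(key=lambda pair: -pair[0])  # stable sort, so ties stay in table order
    topics = [topic for _, topic in scored]
    return (topics[0], topics[1:]) if topics else (None, [])


print(classify("windows"))  # ('word', ['operating_system'])
print(classify("UNIX"))     # ('operating_system', ['word'])
```

For a query like UNIX, which matches two tables equally well, this sketch falls back to table order. A real engine would need an explicit preference, for example usage statistics.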

Read the article

How could I parse this HTML file?

- by Sergio Tapia

    <div id="main">
      <style type="text/css"> </style>
      <script language="JavaScript"> </script>
      <p style="margin: 0pt 0pt 0.5em;"><b>Media from <a onclick="(new Image()).src='/rg/find-media-title/media_strip/images/b.gif?link=/title/tt0087538/';" href="/title/tt0087538/">The Karate Kid</a> (1984)</b></p>
      <style type="text/css"> </style>
      <table style="border-collapse: collapse;"> </table>
    </div>

I need to somehow extract the href value of the (new Image()). How exactly would I accomplish this with HtmlAgilityPack? I'm new to it, and so far I haven't found a useful tutorial on how to effectively use it for parsing. Thanks for the help!

Read the article

Line Break in XML?

- by ew89

Hello, I'm a beginner in web development, and I'm trying to insert line breaks in my XML file. This is what my XML looks like: Song Title Lyrics <song> <title>Song Title</title> <lyric>Lyrics</lyric> <song> <title>Song Title</title> <lyric>Lyrics</lyric> <song> <title>Song Title</title> <lyric>Lyrics</lyric> I want to have line breaks between the sentences of the lyrics. I have tried everything from /n and other similar codes to PHP parsing, and nothing works! I have been googling for hours and can't seem to find the answer. I'm using the XML to insert data into an HTML page using JavaScript. Does anyone know how to solve this problem? Thanks in advance :)
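A small sketch of the usual approach, written in Python rather than the JavaScript the question uses: put literal newlines inside the element text, and convert them to `<br>` only at the point where the text is placed into HTML. The `<songs>` wrapper element, the sample lyric lines, and the helper name `lyric_to_html` are assumptions made for illustration.

```python
import html
import xml.etree.ElementTree as ET

# Assumed sample: a well-formed document with literal newlines inside <lyric>.
XML = """<songs>
  <song>
    <title>Song Title</title>
    <lyric>First line
Second line
Third line</lyric>
  </song>
</songs>"""


def lyric_to_html(text):
    """Escape each line for HTML, then join the lines with <br> tags."""
    return "<br>".join(html.escape(line) for line in text.splitlines())


root = ET.fromstring(XML)
lyric = root.find("song/lyric").text  # the parser keeps the newlines in the text
print(repr(lyric))
print(lyric_to_html(lyric))
```

The same idea carries over to JavaScript: read the element's text, split it on newline characters, and join the pieces with `<br>` before inserting the result into the page.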

Read the article

Can Haskell's Parsec library be used to implement a recursive descent parser with backup?

- by Thor Thurn

I've been considering using Haskell's Parsec parsing library to parse a subset of Java as a recursive descent parser, as an alternative to more traditional parser-generator solutions like Happy. Parsec seems very easy to use, and parse speed is definitely not a factor for me. I'm wondering, though, if it's possible to implement "backup" with Parsec, a technique which finds the correct production to use by trying each one in turn. For a simple example, consider the very start of the JLS Java grammar:

    Literal:
        IntegerLiteral
        FloatingPointLiteral

I'd like a way to not have to figure out how I should order these two rules to get the parse to succeed. As it stands, a naive implementation like this:

    literal = do {
        x <- try (do { v <- integer; return (IntLiteral v) })
             <|> (do { v <- float; return (FPLiteral v) });
        return (Literal x)
    }

will not work: inputs like "15.2" will cause the integer parser to succeed first, and then the whole thing will choke on the "." symbol. In this case, of course, it's obvious that you can solve the problem by re-ordering the two productions. In the general case, though, finding things like this is going to be a nightmare, and it's very likely that I'll miss some cases. Ideally, I'd like a way to have Parsec figure out stuff like this for me. Is this possible, or am I simply trying to do too much with the library? The Parsec documentation claims that it can "parse context-sensitive, infinite look-ahead grammars", so it seems like I should be able to do something here.
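To make the "backup" idea concrete, here is a toy sketch in Python, not Parsec, of the list-of-successes style: every alternative is tried from the same position, and the caller keeps the first result that consumes the whole input. The combinator names (`regex_parser`, `alt`, `parse_all`) are hypothetical and the two token patterns are simplified assumptions, far from the JLS definitions. By contrast, Parsec's `<|>` commits to the first alternative that succeeds, and `try` only rewinds when the wrapped parser fails.

```python
import re


def regex_parser(pattern, build):
    """Parser returning every way (here at most one) to match pattern at position pos."""
    compiled = re.compile(pattern)

    def parse(text, pos):
        m = compiled.match(text, pos)
        return [(build(m.group()), m.end())] if m else []

    return parse


def alt(*parsers):
    """Unordered choice: collect the results of every alternative from the same position."""

    def parse(text, pos):
        return [result for p in parsers for result in p(text, pos)]

    return parse


def parse_all(parser, text):
    """Accept the first result that consumes the whole input; partial parses are rejected."""
    for value, end in parser(text, 0):
        if end == len(text):
            return value
    raise ValueError(f"no parse for {text!r}")


integer = regex_parser(r"\d+", lambda s: ("IntLiteral", int(s)))
floating = regex_parser(r"\d+\.\d+", lambda s: ("FPLiteral", float(s)))
literal = alt(integer, floating)  # deliberately the "wrong" order

print(parse_all(literal, "15.2"))  # ('FPLiteral', 15.2)
print(parse_all(literal, "15"))    # ('IntLiteral', 15)
```

The price of this style is that every alternative runs at every choice point, which may be acceptable when parse speed is not a concern.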

Read the article

Technique to remove common words(and their plural versions) from a string

- by Jake M

I am attempting to find tags (keywords) for a recipe by parsing a long string of text. The text contains the recipe ingredients, directions and a short blurb. What do you think would be the most efficient way to remove common words from the tag list? By common words, I mean words like 'the', 'at', 'there', 'their', etc. I have two methodologies I can use. Which do you think is more efficient in terms of speed, and do you know of a more efficient way I could do this?

Methodology 1:
- Determine the number of times each word occurs (using the collections library).
- Have a list of common words, and remove all common words from the Counter object by attempting to delete each key from it if it exists.
- Therefore the speed will be determined by the length of the variable delims.

    from collections import Counter

    delims = ['there', 'there\'s', 'theres', 'they', 'they\'re']
    # the above will end up being a really long list!
    word_freq = Counter(recipe_str.lower().split())
    for delim in set(delims):
        del word_freq[delim]  # Counter ignores missing keys on delete
    return word_freq.most_common()

Methodology 2:
- For common words that can be plural, look at each word in the recipe string and check if it partially contains the non-plural version of a common word. E.g. for the string "There's a test", check each word to see if it contains "there" and delete it if it does.

    delims = ['this', 'at', 'them']  # words that can't be plural
    partial_delims = ['there', 'they']  # words that could occur in many forms
    word_freq = Counter(recipe_str.lower().split())
    for delim in set(delims):
        del word_freq[delim]
    # really slow
    for delim in set(partial_delims):
        for word in list(word_freq):  # copy the keys so deleting while looping is safe
            if delim in word:
                del word_freq[word]
    return word_freq.most_common()
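A third option, sketched here under assumptions, is to filter while counting: normalize each word once and test it against a set, so the cost grows with the length of the recipe text rather than with the length of the stop-word list. The stop-word set, the punctuation list, and the crude `singular` helper are illustrative placeholders, not a vetted word list or stemmer.

```python
from collections import Counter

# Illustrative stop-word set; a real list would be much longer.
STOP_WORDS = frozenset({"the", "at", "there", "their", "they", "this", "a", "and"})
PUNCTUATION = ".,;:!?\"()"


def singular(word):
    """Crude normalization: drop "'s" / "'re" tails, else one plural 's' from longer words."""
    for tail in ("'s", "'re"):
        if word.endswith(tail):
            return word[: -len(tail)]
    return word[:-1] if len(word) > 3 and word.endswith("s") else word


def tag_counts(recipe_str):
    """Count normalized words, skipping any whose raw or normalized form is a stop word."""
    counts = Counter()
    for raw in recipe_str.lower().split():
        raw = raw.strip(PUNCTUATION)
        base = singular(raw)
        if raw and raw not in STOP_WORDS and base not in STOP_WORDS:
            counts[base] += 1
    return counts.most_common()


print(tag_counts("There's the garlic and the onions. They're at the garlic!"))
# [('garlic', 2), ('onion', 1)]
```

Because set membership is a constant-time lookup on average, this avoids both the loop over the stop-word list in Methodology 1 and the nested loop in Methodology 2.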

Search Results

Search found 4222 results on 169 pages for 'dtd parsing'.

Page 38/169