pdf parsing - Page 70 - Developer IT

Webserver parsing chrome input from post request

- by ravenspoint

I am developing a small embedded web server. I want to add parsing of POST requests, but I am having a problem with input password fields from Chrome. Firefox and IE work perfectly.

The HTML:

    <form action=start.webem method=post>
    <input value="START" type=submit />
    <p>Password: <input TYPE=PASSWORD name=yourname AUTOCOMPLETE=OFF /><br>
    </form>

From Firefox I get:

    POST /stop.webem HTTP/1.1
    Host: 127.0.0.1:8080
    User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.9) Gecko/20100315 Firefox/3.5.9 (.NET CLR 3.5.30729)
    Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
    Accept-Language: en-us,en;q=0.5
    Accept-Encoding: gzip,deflate
    Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
    Keep-Alive: 300
    Connection: keep-alive
    Referer: http://127.0.0.1:8080/
    Content-Type: application/x-www-form-urlencoded
    Content-Length: 13

    yourname=test

However, from Chrome, about 90% of the time, the yourname=test is missing:

    POST /start.webem HTTP/1.1
    Host: 127.0.0.1:8080
    Connection: keep-alive
    User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) AppleWebKit/532.5 (KHTML, like Gecko) Chrome/4.1.249.1045 Safari/532.5
    Referer: http://127.0.0.1:8080/
    Content-Length: 13
    Cache-Control: max-age=0
    Origin: http://127.0.0.1:8080
    Content-Type: application/x-www-form-urlencoded
    Accept: application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
    Accept-Encoding: gzip,deflate,sdch
    Accept-Language: en-US,en;q=0.8
    Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.3

Though, occasionally it does work!!!

    POST /start.webem HTTP/1.1
    Host: 127.0.0.1:8080
    Connection: keep-alive
    User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) AppleWebKit/532.5 (KHTML, like Gecko) Chrome/4.1.249.1045 Safari/532.5
    Referer: http://127.0.0.1:8080/start.webem
    Content-Length: 13
    Cache-Control: max-age=0
    Origin: http://127.0.0.1:8080
    Content-Type: application/x-www-form-urlencoded
    Accept: application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
    Accept-Encoding: gzip,deflate,sdch
    Accept-Language: en-US,en;q=0.8
    Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.3

    yourname=test

I cannot find what causes it to work sometimes.
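
One thing worth checking, though nothing in the post confirms it: a browser may deliver the headers and the body in separate TCP segments, so a server that parses after a single read would see Content-Length: 13 but no body. Below is a minimal Python sketch of reading until the announced number of body bytes has arrived. The function name `read_http_request` and the `recv` callable are made up for illustration and are not part of the poster's server.

```python
def read_http_request(recv, bufsize=4096):
    """Read one request through recv(bufsize) -> bytes; return (header_text, body_bytes)."""
    data = b""
    # Keep reading until the blank line that terminates the headers has arrived.
    while b"\r\n\r\n" not in data:
        chunk = recv(bufsize)
        if not chunk:
            raise ConnectionError("connection closed before headers were complete")
        data += chunk
    head, _, body = data.partition(b"\r\n\r\n")

    # Find Content-Length; header names are case-insensitive.
    content_length = 0
    for line in head.split(b"\r\n")[1:]:
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            content_length = int(value.strip().decode("ascii"))

    # The body can arrive in a later read than the headers, so keep reading.
    while len(body) < content_length:
        chunk = recv(bufsize)
        if not chunk:
            raise ConnectionError("connection closed before body was complete")
        body += chunk
    return head.decode("iso-8859-1"), body[:content_length]
```

With a real socket, `recv` would be the connection's `recv` method. The point of the design is that parsing only starts once the byte count promised by Content-Length is satisfied, regardless of how the data was split on the wire.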

Parsing language for both binary and character files

- by Thorsten S.

The problem: you have some data, and your program needs input in a specified form, for example strings which are numbers. You are searching for a way to transform the original data into the format you need. And the problem is that the source can be anything: XML, property lists, or binary which contains the needed data deeply embedded in binary junk. Your output format may vary too: number strings, floats, doubles...

You don't want to program. You want routines which give you commands capable of transforming the data into the form you wish. Surely it contains regular expressions, but it is very well designed and offers capabilities which are sometimes much easier and more powerful. Something like a super-grep which you can access (!) as program routines, not only as a tool. It allows:

- joining/grouping/merging of results
- inserting/deleting/finding/replacing
- writing macros which execute a command chain repeatedly
- meta-grouping (lists-tables-hypertables)

Example (no, I am not looking for a solution to this, it is just an example): you want to read XML strings embedded in a binary file with variable-length records. Your tool reads the record length and deletes the junk surrounding your text. Now it splits open the XML and extracts the strings. Since they are Indian number glyphs and contain decimal commas instead of decimal points, your tool transforms them into ASCII and replaces commas with points. Now the results must be stored in matrices of variable length... etc. etc.

I am searching for a good language / language design and, if possible, an implementation. Which design do you like, or, even if it does not fulfill the conditions, which one wouldn't you want to miss?

EDIT: The question is whether a solution for the problem exists and, if yes, which implementations are available. You DO NOT implement your own sorting algorithm if Quicksort, Mergesort and Heapsort are available. You DO NOT invent your own text parsing method if you have regular expressions. You DO NOT invent your own 3D language for graphics if OpenGL/Direct3D is available. There are existing solutions, or at least papers describing the problem and giving suggestions. And there are people who may have worked on and experienced such problems and who can give ideas and suggestions. The idea that this problem is totally new and that I should work it out and implement it myself without background knowledge seems to me, I must admit, totally off the mark.

Parsing Data in XML and Storing to DB in Python

- by Rakesh

Hi guys, I have a problem parsing an XML file and entering the data into SQLite. The format is as below; I need to enter the characters inside each TOKEN element (111, AAA, BBB, etc.):

    <DOCUMENT>
      <PAGE width="544.252" height="634.961" number="1" id="p1">
        <MEDIABOX x1="0" y1="0" x2="544.252" y2="634.961"/>
        <BLOCK id="p1_b1">
          <TEXT width="37.7" height="74.124" id="p1_t1" x="51.1" y="20.8652">
            <TOKEN sid="p1_s11" id="p1_w1" font-name="Verdanae" bold="yes" italic="no">111</TOKEN>
          </TEXT>
        </BLOCK>
        <BLOCK id="p1_b3">
          <TEXT width="151.267" height="10.725" id="p1_t6" x="24.099" y="572.096">
            <TOKEN sid="p1_s35" id="p1_w22" font-name="Verdanae" bold="yes" italic="yes">AAA</TOKEN>
            <TOKEN sid="p1_s36" id="p1_w23" font-name="verdanae" bold="yes" italic="no">BBB</TOKEN>
            <TOKEN sid="p1_s37" id="p1_w24" font-name="verdanae" bold="yes" italic="no">CCC</TOKEN>
          </TEXT>
        </BLOCK>
        <BLOCK id="p1_b4">
          <TEXT width="82.72" height="26" id="p1_t7" x="55.426" y="138.026">
            <TOKEN sid="p1_s42" id="p1_w29" font-name="verdanae" bold="yes" italic="no">DDD</TOKEN>
            <TOKEN sid="p1_s43" id="p1_w30" font-name="verdanae" bold="yes" italic="no">EEE</TOKEN>
          </TEXT>
          <TEXT width="101.74" height="26" id="p1_t8" x="55.406" y="162.026">
            <TOKEN sid="p1_s45" id="p1_w31" font-name="verdanae" bold="yes" italic="no">FFF</TOKEN>
          </TEXT>
          <TEXT width="152.96" height="26" id="p1_t9" x="55.406" y="186.026">
            <TOKEN sid="p1_s47" id="p1_w32" font-name="verdanae" bold="yes" italic="no">GGG</TOKEN>
            <TOKEN sid="p1_s48" id="p1_w33" font-name="verdanae" bold="yes" italic="no">HHH</TOKEN>
          </TEXT>
        </BLOCK>
      </PAGE>
    </DOCUMENT>

In .NET it is done with 3 foreach loops (1. for "DOCUMENT/PAGE/BLOCK", 2. "TEXT", 3. "TOKEN") and then it is entered into the DB. I don't get how to do it in Python, and I am trying it with the lxml module.
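
The nested-loop approach described for .NET carries over fairly directly. The sketch below uses the standard library's `xml.etree.ElementTree` rather than lxml (lxml's etree offers a largely compatible interface, including `iterfind`). The table name `tokens`, its columns, and the function name `store_tokens` are assumptions made for illustration, since the post does not describe the database schema.

```python
import sqlite3
import xml.etree.ElementTree as ET

def store_tokens(xml_text, conn):
    """Insert one row per TOKEN element; return the number of rows inserted."""
    # Hypothetical schema: the post does not say what the table looks like.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tokens "
        "(page TEXT, block TEXT, text_id TEXT, token_id TEXT, value TEXT)"
    )
    root = ET.fromstring(xml_text)  # root is the DOCUMENT element
    count = 0
    for page in root.iterfind("PAGE"):
        for block in page.iterfind("BLOCK"):
            for text in block.iterfind("TEXT"):
                for token in text.iterfind("TOKEN"):
                    conn.execute(
                        "INSERT INTO tokens VALUES (?, ?, ?, ?, ?)",
                        (page.get("id"), block.get("id"), text.get("id"),
                         token.get("id"), token.text),
                    )
                    count += 1
    conn.commit()
    return count

# A cut-down version of the document from the post.
SAMPLE = """<DOCUMENT><PAGE id="p1"><BLOCK id="p1_b1"><TEXT id="p1_t1">
<TOKEN id="p1_w1">111</TOKEN></TEXT></BLOCK>
<BLOCK id="p1_b3"><TEXT id="p1_t6"><TOKEN id="p1_w22">AAA</TOKEN>
<TOKEN id="p1_w23">BBB</TOKEN></TEXT></BLOCK></PAGE></DOCUMENT>"""

conn = sqlite3.connect(":memory:")
inserted = store_tokens(SAMPLE, conn)
values = [row[0] for row in conn.execute("SELECT value FROM tokens ORDER BY rowid")]
```

Keeping the ids of the enclosing PAGE, BLOCK and TEXT on each row is a design choice that preserves the nesting in a flat table; drop the columns if only the token text matters.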

Problem parsing XML data to Multi dimensional array

- by Cam

Hi there, I'm still transitioning from AS2 to AS3, and I'm having trouble parsing XML data into a multi-dimensional array. Below is the onComplete handler, which successfully traces 'event.target.data' but outputs 'A term is undefined and has no properties' when tracing _vein_data[0][0].xPos. I'm guessing there is an easier way to approach it than this attempt:

    private function on_xml_completed(event:Event):void
    {
        var XMLPoints:XML = new XML(event.target.data);
        for ( var i:int = 0; i < XMLPoints.shape.length(); i++ )
        {
            var shapeArray:Array = new Array();
            _vein_data.push(shapeArray);
            for ( var j:int = 0; j < 4; i++ )
            {
                _vein_data[i].push({'xPos':XMLPoints.shape[i].point[j].@xPos, 'yPos':XMLPoints.shape[i].point[j].@yPos});
            }
        }
        trace(_vein_data[0][0].xPos)
        loadAsset();
    }

Here's a portion of my XML:

    <items>
      <shape>
        <point xPos="60" yPos="23" />
        <point xPos="65" yPos="23" />
        <point xPos="93" yPos="85" />
        <point xPos="88" yPos="87" />
      </shape>
      <shape>
        <point xPos="88" yPos="87" />
        <point xPos="92" yPos="83" />
        <point xPos="145" yPos="174" />
        <point xPos="138" yPos="175" />
      </shape>
      <shape>
        <point xPos="138" yPos="175" />
        <point xPos="143" yPos="171" />
        <point xPos="147" yPos="211" />
        <point xPos="141" yPos="212" />
      </shape>
    </items>

Thank you in advance for any guidance on this.

Cam
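
Note that the inner loop in the handler above tests `j` but increments `i`, which may well be related to the error. For comparison, here is the same shapes-to-nested-list idea as a short Python sketch that iterates over the elements directly, so no index can be mis-incremented. The function name `parse_shapes` is made up for illustration.

```python
import xml.etree.ElementTree as ET

def parse_shapes(xml_text):
    """Return a list of shapes, each a list of {'xPos': int, 'yPos': int} dicts."""
    root = ET.fromstring(xml_text)  # the <items> element
    shapes = []
    for shape in root.iterfind("shape"):
        points = []
        for point in shape.iterfind("point"):
            points.append({"xPos": int(point.get("xPos")),
                           "yPos": int(point.get("yPos"))})
        shapes.append(points)
    return shapes

SHAPES_XML = """<items>
  <shape>
    <point xPos="60" yPos="23" /><point xPos="65" yPos="23" />
    <point xPos="93" yPos="85" /><point xPos="88" yPos="87" />
  </shape>
  <shape>
    <point xPos="88" yPos="87" /><point xPos="92" yPos="83" />
    <point xPos="145" yPos="174" /><point xPos="138" yPos="175" />
  </shape>
</items>"""

vein_data = parse_shapes(SHAPES_XML)
```

Iterating over child elements also removes the hard-coded assumption that every shape has exactly 4 points.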

parsed xml file: skip creation if blank?

- by GoodGets

This could be a HappyMapper-specific question, but I don't think so. In my app, users can upload their blog subscriptions (via an OPML file), which I parse and add to their profile. The only problem is that during the parsing, or more specifically the creation of each subscription, I can't figure out how to skip over entries that are just "labels". OPML files allow you to label your blogs, or organize them into folders, and that is my problem: the actual blog subscriptions and their labels both have "outline" tags.

    <outline text="Rails" >
      <outline title="Katz Got Your Tongue?" text="Katz Got Your Tongue?" htmlUrl="http://yehudakatz.com" type="rss" xmlUrl="http://feeds.feedburner.com/KatzGotYourTongue" />

After parsing, I create each feed via a method call inside of the HappyMapper module:

    def create_feed
      Feed.new(
        :feed_htmlUrl => self.htmlUrl,
        :feed_title => self.title,
        ...

But how do I prevent it from creating new "feeds" for those outline tags that are just tags (i.e. those that don't have an htmlUrl)?
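
The filtering rule itself is independent of HappyMapper: visit every outline element and skip those that carry no htmlUrl attribute. A minimal Python sketch of that rule follows. The function name `feed_outlines` and the surrounding `<opml><body>` wrapper in the sample are assumptions for illustration, and how the equivalent guard is best expressed inside HappyMapper is not something the post establishes.

```python
import xml.etree.ElementTree as ET

def feed_outlines(opml_text):
    """Return (title, htmlUrl, xmlUrl) for outlines that are real subscriptions."""
    root = ET.fromstring(opml_text)
    feeds = []
    for outline in root.iter("outline"):   # visits nested outlines too
        if not outline.get("htmlUrl"):     # label/folder entry: nothing to create
            continue
        feeds.append((outline.get("title"), outline.get("htmlUrl"), outline.get("xmlUrl")))
    return feeds

OPML = """<opml version="1.0"><body>
<outline text="Rails">
  <outline title="Katz Got Your Tongue?" text="Katz Got Your Tongue?"
           htmlUrl="http://yehudakatz.com" type="rss"
           xmlUrl="http://feeds.feedburner.com/KatzGotYourTongue" />
</outline>
</body></opml>"""

feeds = feed_outlines(OPML)
```

The same test (no htmlUrl, so return early) could presumably sit at the top of create_feed in the Ruby code.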

Shift-reduce: when to stop reducing?

- by Joey Adams

I'm trying to learn about shift-reduce parsing. Suppose we have the following grammar, using recursive rules that enforce order of operations, inspired by the ANSI C Yacc grammar:

    S : A ;
    P : NUMBER | '(' S ')' ;
    M : P | M '*' P | M '/' P ;
    A : M | A '+' M | A '-' M ;

And we want to parse 1+2 using shift-reduce parsing. First, the 1 is shifted as a NUMBER. My question is, is it then reduced to P, then M, then A, then finally S? How does it know where to stop?

Suppose it does reduce all the way to S, then shifts '+'. We'd now have a stack containing:

    S '+'

If we shift '2', the reductions might be:

    S '+' NUMBER
    S '+' P
    S '+' M
    S '+' A
    S '+' S

Now, on either side of the last line, S could be P, M, A, or NUMBER, and it would still be valid in the sense that any combination would be a correct representation of the text. How does the parser "know" to make it

    A '+' M

so that it can reduce the whole expression to A, then S? In other words, how does it know to stop reducing before shifting the next token? Is this a key difficulty in LR parser generation?
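
In table-driven LR parsers the decision is made from the state of the stack together with the next input token (the lookahead). The Python sketch below is a toy for this one grammar: its hand-written `if`/`elif` chain stands in for the tables a generator such as Yacc would build, so it illustrates the principle rather than how any real generator is implemented. The names `tokenize` and `parse` are made up for illustration.

```python
import re

def tokenize(text):
    """Return (symbol, value) pairs ending with an END marker; other characters are ignored."""
    tokens = []
    for lexeme in re.findall(r"\d+|[-+*/()]", text):
        if lexeme.isdigit():
            tokens.append(("NUMBER", int(lexeme)))
        else:
            tokens.append((lexeme, lexeme))
    tokens.append(("END", None))
    return tokens

def parse(text):
    """Evaluate text; return (value, list of shift/reduce actions taken)."""
    tokens, pos = tokenize(text), 0
    stack, trace = [], []                      # stack holds (symbol, value) pairs

    def shift():
        nonlocal pos
        stack.append(tokens[pos])
        trace.append("shift " + tokens[pos][0])
        pos += 1

    def reduce(count, symbol, value):
        del stack[-count:]
        stack.append((symbol, value))
        trace.append("reduce to " + symbol)

    while True:
        look = tokens[pos][0]                  # one token of lookahead
        top, value = stack[-1] if stack else (None, None)
        below = [sym for sym, _ in stack[-3:-1]]   # symbols just under the top
        if top == "NUMBER":
            reduce(1, "P", value)
        elif top == ")" and below == ["(", "S"]:
            reduce(3, "P", stack[-2][1])
        elif top == "P" and len(below) == 2 and below[0] == "M" and below[1] in ("*", "/"):
            left = stack[-3][1]
            reduce(3, "M", left * value if below[1] == "*" else left / value)
        elif top == "P":
            reduce(1, "M", value)
        elif top == "M" and look in ("*", "/"):
            shift()                            # M can still grow: do not reduce yet
        elif top == "M" and len(below) == 2 and below[0] == "A" and below[1] in ("+", "-"):
            left = stack[-3][1]
            reduce(3, "A", left + value if below[1] == "+" else left - value)
        elif top == "M":
            reduce(1, "A", value)
        elif top == "A" and look in ("+", "-"):
            shift()                            # A can still grow: stop reducing here
        elif top == "A" and look in (")", "END"):
            reduce(1, "S", value)
        elif top == "S" and look == "END" and len(stack) == 1:
            return value, trace
        elif look == "END":
            raise SyntaxError("unexpected end of input")
        else:
            shift()

value, trace = parse("1+2")
```

For 1+2 the trace shows the first operand being reduced only as far as A, because the lookahead is '+'; the reduction to S happens once, when the lookahead is the end of input.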

int.Parse of "8" fails. int.Parse always requires CultureInfo.InvariantCulture?

- by Henrik Carlsson

We develop an established piece of software which works fine on all known computers except one. The problem is parsing strings that begin with "8". It seems like "8" at the beginning of a string is a reserved character.

Parsing:

    int.Parse("8")    -> Exception message: Input string was not in a correct format.
    int.Parse("80")   -> 0
    int.Parse("88")   -> 8
    int.Parse("8100") -> 100

    CurrentCulture: sv-SE
    CurrentUICulture: en-US

The problem is solved using int.Parse("8", CultureInfo.InvariantCulture). However, it would be nice to know the source of the problem.

Question: Why do we get this behaviour of "8" if we don't specify invariant culture?

Additional information: I sent a small program to my client to achieve the result above:

    private int ParseInt(string s)
    {
        int parsedInt = -1000;
        try
        {
            parsedInt = int.Parse(s);
            textBoxMessage.Text = "Success: " + parsedInt;
        }
        catch (Exception ex)
        {
            textBoxMessage.Text = string.Format("Error parsing string: '{0}'", s)
                + Environment.NewLine + "Exception message: " + ex.Message;
        }
        textBoxMessage.Text += Environment.NewLine + Environment.NewLine
            + "CurrentCulture: " + Thread.CurrentThread.CurrentCulture.Name + "\r\n"
            + "CurrentUICulture: " + Thread.CurrentThread.CurrentUICulture.Name + "\r\n";
        return parsedInt;
    }
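
The four results listed are exactly what one would see if a leading "8" were being consumed as a sign character, for example because of customised regional number settings on that one machine. That is a hypothesis, not something the post establishes. The Python sketch below only models the idea: `parse_int` and its `positive_sign` and `negative_sign` parameters are hypothetical and are not the .NET API.

```python
def parse_int(text, positive_sign="+", negative_sign="-"):
    """Parse an integer, first stripping one leading culture-defined sign string."""
    sign = 1
    if text.startswith(positive_sign):
        text = text[len(positive_sign):]
    elif text.startswith(negative_sign):
        sign = -1
        text = text[len(negative_sign):]
    # Whatever is left must consist of plain ASCII digits.
    if not (text.isascii() and text.isdigit()):
        raise ValueError("Input string was not in a correct format.")
    return sign * int(text)

# Hypothetical misconfigured machine: the positive sign is the string "8".
results = [parse_int(s, positive_sign="8") for s in ("80", "88", "8100")]
```

Under this model "8" alone fails because nothing remains after the sign is removed, while passing the default sign strings (the analogue of the invariant culture) parses "8" normally.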

How do I efficiently parse a CSV file in Perl?

- by Mike

I'm working on a project that involves parsing a large CSV-formatted file in Perl and am looking to make things more efficient. My approach has been to split() the file by lines first, and then split() each line again by commas to get the fields. But this is suboptimal, since at least two passes over the data are required (once to split by lines, then once again for each line). This is a very large file, so cutting processing in half would be a significant improvement to the entire application.

My question is: what is the most time-efficient means of parsing a large CSV file using only built-in tools?

Note: Each line has a varying number of tokens, so we can't just ignore lines and split by commas only. Also, we can assume fields will contain only alphanumeric ASCII data (no special characters or other tricks). Also, I don't want to get into parallel processing, although it might work effectively.

Edit: It can only involve built-in tools that ship with Perl 5.8. For bureaucratic reasons, I cannot use any third-party modules (even if hosted on CPAN).

Another edit: Let's assume that our solution is only allowed to deal with the file data once it is entirely loaded into memory.

Yet another edit: I just grasped how stupid this question is. Sorry for wasting your time. Voting to close.

How to parse a string (by a "new" markup) with R ?

- by Tal Galili

Hi all, I want to use R to do string parsing that (I think) is like a simplistic HTML parsing. For example, let's say we have the following two variables:

    Seq <- "GCCTCGATAGCTCAGTTGGGAGAGCGTACGACTGAAGATCGTAAGGtCACCAGTTCGATCCTGGTTCGGGGCA"
    Str <- ">>>>>>>..>>>>........<<<<.>>>>>.......<<<<<.....>>>>>.......<<<<<<<<<<<<."

Say that I want to parse "Seq" according to "Str", by using the legend here:

    Seq: GCCTCGATAGCTCAGTTGGGAGAGCGTACGACTGAAGATCGTAAGGtCACCAGTTCGATCCTGGTTCGGGGCA
    Str: >>>>>>>..>>>>........<<<<.>>>>>.......<<<<<.....>>>>>.......<<<<<<<<<<<<.
         |     |  |              | |               |     |               ||     |
         +-----+  +--------------+ +---------------+     +---------------++-----+
            |          Stem 1           Stem 2                Stem 3         |
            |                                                                |
            +----------------------------------------------------------------+
                                          Stem 0

Assume that we always have 4 stems (0 to 3), but that the length of letters before and after each of them can vary. The output should be something like the following list structure:

    list(
      "Stem 0 opening" = "GCCTCGA",
      "before Stem 1" = "TA",
      "Stem 1" = list(opening = "GCTC", inside = "AGTTGGGA", closing = "GAGC"),
      "between Stem 1 and 2" = "G",
      "Stem 2" = list(opening = "TACGA", inside = "CTGAAGA", closing = "TCGTA"),
      "between Stem 2 and 3" = "AGGtC",
      "Stem 3" = list(opening = "ACCAG", inside = "TTCGATC", closing = "CTGGT"),
      "After Stem 3" = "",
      "Stem 0 closing" = "TCGGGGC"
    )

I don't have any experience with programming a parser, and would like advice on what strategy to use when programming something like this (and any recommended R commands to use). What I was thinking of is to first get rid of "Stem 0", then go through the inner string with a recursive function (let's call it "seperate.stem") that each time will split the string into:

1. before stem
2. opening stem
3. inside stem
4. closing stem
5. after stem

where the "after stem" will then be recursively entered into the same function ("seperate.stem"). The thing is that I am not sure how to try and do this coding without using a loop. Any advice will be most welcome.
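
One possible first step, sketched here in Python rather than R, is to cut the sequence wherever the structure string changes symbol, giving one piece per run of '>', '.' or '<'. The function name `split_by_structure` is made up for illustration, and this covers only the run-splitting step, not the full nested list asked for.

```python
from itertools import groupby

def split_by_structure(seq, struct):
    """Split seq into (symbol, substring) pieces, one per run of equal symbols in struct."""
    if len(seq) != len(struct):
        raise ValueError("sequence and structure must have the same length")
    pieces = []
    start = 0
    for symbol, run in groupby(struct):
        length = len(list(run))            # how many characters this run covers
        pieces.append((symbol, seq[start:start + length]))
        start += length
    return pieces

pieces = split_by_structure("ACGTAC", ">>..<<")
```

One caveat with the data in the post: the closing of Stem 3 and the closing of Stem 0 are adjacent, so they come out as a single run of twelve '<' characters. A follow-up step would have to divide such a run by matching each closing length to its opening (5 for Stem 3, 7 for Stem 0).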

Parsing GeoRSS Feed with jQuery

- by senfo

I'm attempting to use the jQuery jFeed plugin for parsing an Atom GeoRSS feed, and I'm running into issues extracting the information I need. For example, I need to extract the summary element, and I would like to render the contents in a div on my HTML page. Additionally, I'd like to extract the contents from the georss:point elements and pass them into Google Maps to render them as points on a map. The problem is that it seems jFeed is stripping out the GeoRSS-related information. For example, I can extract the title element without issues, but it seems it doesn't extract the summary or georss:point elements at all.

Following is a snippet of the XML I'm working with:

    <feed xmlns="http://www.w3.org/2005/Atom" xmlns:georss="http://www.georss.org/georss">
      <title>Search Results from DataWarehouse.HRSA.gov</title>
      <link rel="self" href="http://datawarehouse.hrsa.gov/HGDWDataWebService/HGDWDataService.aspx?service=HC&zip=20002&radius=10"/>
      <link rel="alternate" href="http://datawarehouse.hrsa.gov/"/>
      <author>
        <name>HRSA Geospatial Data Warehouse</name>
      </author>
      <id>tag:datawarehouse.hrsa.gov,2010-04-05:/</id>
      <updated>2010-04-05T19:25:28-05:00</updated>
      <entry>
        <title>Christ House</title>
        <link href="http://www.christhouse.org" />
        <id>tag:datawarehouse.hrsa.gov,2010-04-05:/D388C4C6-FFA4-4091-819B-64D67DC64931</id>
        <summary type="xhtml">
          <div xmlns="http://www.w3.org/1999/xhtml">
            <div class="vcard">
              <div class="fn org">Christ House</div>
              <div class="adr">
                <div class="street-address">1717 Columbia Rd. N.W.</div>
                <span class="locality">Washington</span>,
                <span class="region">District of Columbia</span>,
                <span class="postal-code">20009-2803</span>
              </div>
              <div class="tel">202-328-1100</div>
            </div>
            <div>
              Categories: <span class="category">Service Delivery Site</span>
            </div>
          </div>
        </summary>
        <georss:point>38.9243636363636 -77.0395636363637</georss:point>
        <updated>2010-04-04T00:00:00-05:00</updated>
      </entry>
    </feed>

Following is the jQuery code that I'm using:

    $(document).ready(function() {
        $.getFeed({
            //url: 'http://datawarehouse.hrsa.gov/HGDWDataWebService/HGDWDataService.aspx?service=HC&zip=20002&radius=10',
            url: 'test.xml',
            success: function(feed) {
                $.each(feed.items, function(index, value) {
                    $('#rssContent').append(value.title); // Set breakpoint here
                });
            }
        });
    });

I set a breakpoint on the line that appends to the rssContent div and noticed the objects in feed.items don't have the properties I'm after. Am I doing something wrong, or was jFeed simply not designed to work the way I want it to?
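
Whether jFeed can be made to expose these fields is not something the post settles. What any parser has to do, though, is resolve `georss:point` through its namespace URI, since it lives in a different XML namespace from the Atom elements. A minimal Python sketch of that step follows; the function name `entry_points` and the prefix map `NS` are chosen for illustration.

```python
import xml.etree.ElementTree as ET

# Prefix -> namespace URI, taken from the xmlns declarations in the feed.
NS = {
    "atom": "http://www.w3.org/2005/Atom",
    "georss": "http://www.georss.org/georss",
}

def entry_points(feed_text):
    """Return (title, latitude, longitude) for every entry that has a georss:point."""
    root = ET.fromstring(feed_text)
    results = []
    for entry in root.iterfind("atom:entry", NS):
        title = entry.findtext("atom:title", default="", namespaces=NS)
        point = entry.findtext("georss:point", default=None, namespaces=NS)
        if point is None:
            continue
        lat, lon = (float(part) for part in point.split())
        results.append((title, lat, lon))
    return results

FEED = """<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:georss="http://www.georss.org/georss">
  <title>Search Results from DataWarehouse.HRSA.gov</title>
  <entry>
    <title>Christ House</title>
    <georss:point>38.9243636363636 -77.0395636363637</georss:point>
  </entry>
</feed>"""

points = entry_points(FEED)
```

The latitude/longitude pairs are then in a form that could be handed to a mapping API.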

Parsing HTML using HtmlParser

- by Blankman

My HTML has 20 or so rows of the following HTML pattern. So the below is considered a single instance of the pattern. Each instance of this pattern represents a product. Again, the below is a single instance; it spans multiple rows in the HTML table.

    <table>
    ..
    <tr>
      <td rowspan="5" class="product" valign="top"><nobr> ????????????</td>
    </tr>
    <tr>
      <td class="title" ??????????>?????????</td>
      <td class="title" ??????????>?????????</td>
      <td class="title" ??????????>?????????</td>
      <td class="title" ??????????>?????????</td>
      <td class="title" ??????????>?????????</td>
      <td class="title" ??????????>?????????</td>
    </tr>
    <tr>
      <td class="data" ?????? </td>
      <td class="data" ?????? </td>
      <td class="data" ?????? </td>
      <td class="data" ?????? </td>
      <td class="data" ?????? </td>
      <td class="data" ?????? </td>
    </tr>
    </tr>
    <tr>
      <td colspan="5" ????????</td>
    </tr>
    <tr>
      <td colspan="6" width="100%"> <hr></td>
    </tr>
    ..
    <table>

I am trying to use HtmlParser for this.

    Parser rowParser = new Parser();
    rowParser.setInputHtml(page.getHtml()); // page object represents a html page
    rowParser.setEncoding("UTF-8");
    NodeFilter productRowFilter = new AndFilter(
        new TagNameFilter("tr"),
        new HasChildFilter(
            new AndFilter(
                new TagNameFilter("td"),
                new HasAttributeFilter("class", "product")))

The above filter doesn't work; I'm just showing you what I have so far. I need to somehow combine these filters, and use the last td to mark the end of the pattern, i.e. the td with colspan=6 and width=100% with the child element hr. I have been struggling with this and have resorted to regexing, but was told numerous times NOT to use regex for HTML parsing, so here I am! Your help is much appreciated!

What good alternatives to CHM are there for context sensitive help documents in desktop applications

- by ninesided

We currently have a number of desktop applications (PowerBuilder, WinForms, WPF) that make use of a single CHM for context-sensitive help. We'd like to move away from CHM as it's difficult to maintain, but we've not found a suitable alternative. Ideally we'd like our developers to keep the help files up to date (perhaps in a wiki) as they add functionality and simply export this to PDF or something like that. But is it possible to use a PDF for context-sensitive help, or are there any other promising alternatives to CHM?

XML Parsing Error: junk after document element

- by Jake

I am using the following script to generate an RSS feed for my site:

    <?php
    class RSS {
        public function RSS() {
            $root = $_SERVER['DOCUMENT_ROOT'];
            require_once ("../connect.php");
        }

        public function GetFeed() {
            return $this->getDetails() . $this->getItems();
        }

        private function dbConnect() {
            DEFINE ('LINK', mysql_connect (DB_HOST, DB_USER, DB_PASSWORD));
        }

        private function getDetails() {
            $detailsTable = "rss_feed_details";
            $this->dbConnect($detailsTable);
            $query = "SELECT * FROM ". $detailsTable ." WHERE feed_category = ''";
            $result = mysql_db_query (DB_NAME, $query, LINK);
            while($row = mysql_fetch_array($result)) {
                $details = '<?xml version="1.0" encoding="ISO-8859-1" ?>
                    <rss version="2.0">
                    <channel>
                    <title>'. $row['title'] .'</title>
                    <link>'. $row['link'] .'</link>
                    <description>'. $row['description'] .'</description>
                    <language>'. $row['language'] .'</language>
                    <image>
                    <title>'. $row['image_title'] .'</title>
                    <url>'. $row['image_url'] .'</url>
                    <link>'. $row['image_link'] .'</link>
                    <width>'. $row['image_width'] .'</width>
                    <height>'. $row['image_height'] .'</height>
                    </image>';
            }
            return $details;
        }

        private function getItems() {
            $itemsTable = "rss_posts";
            $this->dbConnect($itemsTable);
            $query = "SELECT * FROM ". $itemsTable ." ORDER BY id DESC";
            $result = mysql_db_query (DB_NAME, $query, LINK);
            $items = '';
            while($row = mysql_fetch_array($result)) {
                $items .= '<item>
                    <title>'. $row["title"] .'</title>
                    <link>'. $row["link"] .'</link>
                    <description><![CDATA['.$row["readable_date"]."<br /><br />".$row["description"]."<br /><br />".']]></description>
                    </item>';
            }
            $items .= '</channel> </rss>';
            return $items;
        }
    }
    ?>

The baffling thing is, the script works perfectly fine on my localhost but gives the following error on my remote server:

    XML Parsing Error: junk after document element
    Location: http://mysite.com/rss/main/
    Line Number 2, Column 1:<b>Parse error</b>: syntax error, unexpected T_STRING in <b>/home/studentw/public_html/rss/global-reach/rssClass.php</b> on line <b>1</b><br />
    ^

Can someone please tell me what's wrong?

save a cfdocument as an excel file

- by Winter

is there a workaround to use the cfdocument tag to save a page/file as an excel sheet instead of a PDF file? I already have a process set up to make pdf files and email them out and would like to give my customers the option of getting an excel file instead. It would be nice if I could reuse the code I already have instead of having to rewrite it in POI or something like that.

Parsing xml file that comes in as one object per line

- by Casey

I haven't been here in so long, I forgot my prior account! Anyway, I am working on parsing an XML document that comes in ugly. It is for banking statements. Each line is a <statement>all tags</statement>. Now, what I need to do is read this file in and parse the XML document at the same time, while also formatting it to be more human-readable. Point being, the original input looks like this:

    <statement><accountHeader><fiAddress></fiAddress><accountNumber></accountNumber><startDate>20140101</startDate><endDate>20140228</endDate><statementGroup>1</statementGroup><sortOption>0</sortOption><memberBranchCode>1</memberBranchCode><memberName></memberName><jointOwner1Name></jointOwner1Name><jointOwner2Name></jointOwner2Name></summary></statement>
    <statement><accountHeader><fiAddress></fiAddress><accountNumber></accountNumber><startDate>20140101</startDate><endDate>20140228</endDate><statementGroup>1</statementGroup><sortOption>0</sortOption><memberBranchCode>1</memberBranchCode><memberName></memberName><jointOwner1Name></jointOwner1Name><jointOwner2Name></jointOwner2Name></summary></statement>
    <statement><accountHeader><fiAddress></fiAddress><accountNumber></accountNumber><startDate>20140101</startDate><endDate>20140228</endDate><statementGroup>1</statementGroup><sortOption>0</sortOption><memberBranchCode>1</memberBranchCode><memberName></memberName><jointOwner1Name></jointOwner1Name><jointOwner2Name></jointOwner2Name></summary></statement>

I need the final output to be as follows:

    <statement>
      <name></name>
      <address></address>
    </statement>

This is fine and dandy. I am using the following "very slow considering 5.1 million lines, 254k data file, and about 60k statements takes around 8 minutes".

    foreach(String item in lines)
    {
        XElement xElement = XElement.Parse(item);
        sr.WriteLine(xElement.ToString().Trim());
    }

Then, when the file is formatted, this is what sucks. I need to check every single tag in the transaction elements, and if a tag that could be there is missing, I have to fill it in. Our designer software will default prior values in if a tag is possible and the current object does not have it. It defaults in the value of a prior one that was not null. "I know, and they swear up and down it is not a bug... ok?" So, that is also taking about 5 to 10 minutes.

I need to break all this down and find a faster method for working with the initial XML. This is a preprocess action, and it cannot take that long if not necessary. It just seems redundant. Is there a better way to parse the XML, or is this the best I can do? I parse the XML, write to a temp file, and then read that file in to the output file, inserting the missing tags. 2 IO runs for one process. Yuck.
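
One way to avoid the temp file is to do both steps on each statement while it is already parsed in memory: fill in the missing tags, then write the result once. The Python sketch below shows that single-pass shape. The tag names in `REQUIRED`, their defaults, and the function names are hypothetical, since the post does not list the real transaction tags.

```python
import xml.etree.ElementTree as ET

# Hypothetical tags and defaults; the post does not list the real ones.
REQUIRED = {"amount": "0", "description": "NONE"}

def fill_missing(line, required=REQUIRED):
    """Parse one <statement> line, add any missing transaction tags, return XML text."""
    statement = ET.fromstring(line)
    for txn in statement.iter("transaction"):
        for tag, default in required.items():
            if txn.find(tag) is None:
                ET.SubElement(txn, tag).text = default
    return ET.tostring(statement, encoding="unicode")

def process(lines):
    """Yield one repaired statement per non-empty input line (a single pass)."""
    for line in lines:
        line = line.strip()
        if line:
            yield fill_missing(line)

sample = ["<statement><transaction><amount>5</amount></transaction></statement>\n"]
output = list(process(sample))
```

Because `process` is a generator, it can be fed directly from an open input file and its results written straight to the output file, so each statement is read and written once. On Python 3.9 or later, `ET.indent` could be applied before serialising if indented output is wanted.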

Can I reset a forgotten owner password with iText?

- by Tom Hubbard

With iText I can use Java to open a pdf and write it. If the pdf has an owner password I can still open it but it can not be written. Clearly the content is readable, it seems like at that point you could simply write the document to a new file. iText doesn't allow this, it throws a bad password exception. Is there a way around this?

iPhone ebook app

- by pablo

Hi people, I was wondering if you know of any tutorial or have any experience in building an ebook reader. Is it possible to read a PDF file and extract its pages for use, or do I have to convert the pages directly to PNG and use those? Also, if it is possible to use the PDF data, can I somehow access the text within, for example for doing a word search? Thanks!

Problem in named destination

- by Palanisamy

Hi, I want to provide deep linking to a named destination. Is that possible? I am using this tool: http://flexpaper.devaldi.com/ It displays a PDF by converting it to SWF with PDF2SWF (www.swftools.org). Is there any way to get the named destinations and deep-link to them? Please help me. Thanks in advance. Palanisamy

Proxy Issues with Javascript Cross Domain RSS Feed Parsing

- by Amir

This is my Javascript function, which grabs an RSS feed via the proxy script and then spits out the 5 latest RSS items from the feed along with a link to my stylesheet:

    function getWidget (feed,limit) {
        if (window.XMLHttpRequest) {
            xhttp=new XMLHttpRequest()
        } else {
            xhttp=new ActiveXObject("Microsoft.XMLHTTP")
        }
        xhttp.open("GET","http://MYSITE/proxy.php?url="+feed,false);
        xhttp.send("");
        xmlDoc=xhttp.responseXML;
        var x = 1;
        var div = document.getElementById("div");
        srdiv.innerHTML = '<link type="text/css" href="http://MYSITE/css/widget.css" rel="stylesheet" /><div id="rss-title"></div></h3><div id="items"></div><br /><br /><a href="http://MYSITE">Powered by MYSITE</a>';
        document.body.appendChild(div);
        content=xmlDoc.getElementsByTagName("title");
        thelink=xmlDoc.getElementsByTagName("link");
        document.getElementByTagName("rss-title").innerHTML += content[0].childNodes[0].nodeValue;
        for (x=1;x<=limit;srx++) {
            y=x;
            y--;
            var shout = '<div class="item"><a href="'+thelink[y].childNodes[0].nodeValue+'">'+content[x].childNodes[0].nodeValue+'</a></div>';
            document.getElementById("items").innerHTML += shout;
        }
    }

Here is the code from proxy.php:

    $session = curl_init($_GET['url']);                   // Open the Curl session
    curl_setopt($session, CURLOPT_HEADER, false);         // Don't return HTTP headers
    curl_setopt($session, CURLOPT_RETURNTRANSFER, true);  // Do return the contents of the call
    $xml = curl_exec($session);                           // Make the call
    header("Content-Type: text/xml");                     // Set the content type appropriately
    echo $xml;                                            // Spit out the xml
    curl_close($session);                                 // And close the session

Now when I try to load this on any domain that's not my site, nothing loads. I get no JS errors, but in the Console tab in Firebug I get "407 Proxy Authentication Required", so I'm not really sure how to make this work. The goal is to be able to grab the RSS feed, parse it to grab the titles and links, and spit it out into some HTML on any website on the web. I'm basically making a simple RSS widget for my site's various RSS feeds.

My Javascript is wack

Also, I'm really a beginner with Javascript. I know jQuery pretty well, but I wasn't able to use it in this case, because this script will be embedded on any site and I can't really rely on the jQuery library. So I decided to write some basic Javascript relying on the default XML parsing options available. Any suggestions here would be cool. Thanks!

What's with the x and y

The way my site creates RSS feeds is that the first title is actually the RSS feed title. The second title is the title of the first item. The first link is the link to the first item. So when using the Javascript to get the titles, I had to first grab the first title (which is the RSS title) and then start with the second title, that being the title of the first item. Sorry for the confusion, but I don't think this is related to my issue. Just wanted to clarify my code.

Crystal Reports "File Break"

- by Chris B. Behrens

I'm generating a Crystal Reports report which will ultimately need to be split into thousands of pdf files. What would be ideal would be if Crystal Reports had something like a "file break", like a page break, that you could insert into the file at the appropriate places. I will need reasonably fine control over the file names, as well....something like "fileName_{CustomerId}_{CustomerIsLocal}.pdf". I'm presuming a third-party piece of software will probably be needed. Thoughts? TIA.

Problem with parsing strings

- by Peter Small

I am trying to put a line of dialog on each of a series of images. To match each dialog line with the correct image, I end the line with a forward slash (/) followed by a number that identifies the matching image. I then parse each line to get the dialog and then the reference number for the image.

It all works fine, except that when I put the dialog line into a textView I get the whole line in the textView instead of just the dialog part. What is confusing is that the console seems to indicate that the parsing of the dialog line has been carried out correctly.

Here are the details of my coding:

    @interface DialogSequence_1ViewController : UIViewController {
        IBOutlet UIImageView *theImage;
        IBOutlet UITextView *fullDialog;
        IBOutlet UITextView *selectedDialog;
        IBOutlet UIButton *test_1;
        IBOutlet UIButton *test_2;
        IBOutlet UIButton *test_3;
        NSArray *arrayLines;
        IBOutlet UISlider *readingSpeed;
        NSArray *cartoonViews;
        NSMutableString *dialog;
        NSMutableArray *dialogLineSections;
        int lNum;
    }

    @property (retain,nonatomic) UITextView *fullDialog;
    @property (retain,nonatomic) UITextView *selectedDialog;
    @property (retain,nonatomic) UIButton *test_1;
    @property (retain,nonatomic) UIButton *test_2;
    @property (retain,nonatomic) UIButton *test_3;
    @property (retain,nonatomic) NSArray *arrayLines;
    @property (retain,nonatomic) NSMutableString *dialog;
    @property (retain,nonatomic) NSMutableArray *dialogLineSections;
    @property (retain,nonatomic) UIImageView *theImage;
    @property (retain,nonatomic) UISlider *readingSpeed;

    -(IBAction)start:(id)sender;
    -(IBAction)counter:(id)sender;
    -(IBAction)runNextLine:(id)sender;

    @end

    @implementation DialogSequence_1ViewController

    @synthesize fullDialog;
    @synthesize selectedDialog;
    @synthesize test_1;
    @synthesize test_2;
    @synthesize test_3;
    @synthesize arrayLines;
    @synthesize dialog;
    @synthesize theImage;
    @synthesize readingSpeed;
    @synthesize dialogLineSections;

    -(IBAction)runNextLine:(id)sender{
        //Get dialog line to display from the arrayLines array
        NSMutableString *dialogLineDetails;
        dialogLineDetails =[arrayLines objectAtIndex:lNum];
        NSLog(@"dialogLineDetails = %@",dialogLineDetails);

        //Parse the dialog line
        dialogLineSections = [dialogLineDetails componentsSeparatedByString: @"/"];
        selectedDialog.text =[dialogLineSections objectAtIndex: 0];
        NSLog(@"Dialog part of line = %@",[dialogLineSections objectAtIndex: 0]);
        NSMutableString *imageBit;
        imageBit = [dialogLineSections objectAtIndex: 1];
        NSLog(@"Image code = %@",imageBit);

        //Select right image
        int im = [imageBit intValue];
        NSLog(@"imageChoiceInteger = %i",im);
        //------more code
    }

I get a warning on the line:

    dialogLineSections = [dialogLineDetails componentsSeparatedByString: @"/"];

    warning: incompatible Objective-C types assigning 'struct NSArray *', expected 'struct NSMutableArray *'

I don't quite understand this and have tried to change the types, but to no avail. I would be grateful for some advice here.
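
As a rough illustration of the line format described in the question (dialog text, then a slash, then an image number), here is a minimal sketch in Python rather than Objective-C. The function name `split_dialog_line` and the sample line are hypothetical. On the warning itself: `componentsSeparatedByString:` is declared to return an `NSArray`, so assigning its result to an `NSMutableArray *` variable is what the compiler is flagging.

```python
from typing import Tuple


def split_dialog_line(line: str) -> Tuple[str, int]:
    """Split 'dialog text/3' into the dialog text and the image number."""
    # rpartition splits on the LAST slash, so a slash inside the dialog
    # text itself does not confuse the parse.
    dialog, separator, image_code = line.rpartition("/")
    if not separator:
        raise ValueError(f"no '/' separator in line: {line!r}")
    # int() raises ValueError if the image code is not a whole number.
    return dialog.strip(), int(image_code)


if __name__ == "__main__":
    # Hypothetical sample line, not taken from the original post.
    dialog, image_number = split_dialog_line("Hello there, how are you?/2")
    print(dialog)
    print(image_number)
```

Splitting on the last slash is a choice made for this sketch so that a slash inside the dialog does not break the parse; the code in the question splits on every slash instead.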


What is the process of turning HTML into Postscript programmatically

- by Dean

I am trying to understand the process of turning HTML into PDF/PostScript programmatically. All Google searches turn up libraries that do this, but I am more interested in the actual process required. I know you could just set up a PostScript printer and print directly to that, but some of these libraries appear to create the PDF on the fly to allow previews, etc. Has anyone had any experience with this, or can anyone provide any guidance?


Can MikTeX create tagged PDFs?

- by soundasleepful

Tagged PDFs allow for easy reflow and better accessibility. This seems like a natural use case for LaTeX, which advocates content over style. But as far as I can tell, there is no way to create a tagged PDF with MikTeX 2.8. Does anybody know of any tips, tricks or techniques to get a tagged PDF through LaTeX without resorting to the commercial version of Adobe Acrobat?


Need help with regex parsing (in perl)

- by Charlie

Hi all, I need some help parsing an HTML file in Perl. I used the LWP module to retrieve a webpage into $_, with $/ undefined so there are no newline issues. Now I am trying to find all strings matching a pattern. How do I do that? I know how to find one instance of it, but how do I match all instances? And what data structure should the results go into? A multi-dimensional array?

My text (excerpt) looks like the following:

    <TR>
    <TD BGCOLOR=EEEEEE><A HREF="/program.cgi?pid=1233"><FONT FACE="ARIAL,HELVETICA,SANS-SERIF" SIZE=2>Title 1</A></FONT></TD>
    <TD BGCOLOR=EEEEEE nowrap><FONT FACE="ARIAL,HELVETICA" SIZE=2>Jun 27 2010 3:00PM</FONT></TD>
    <TD BGCOLOR=EEEEEE> </TD>
    </TR>
    <TR><TD BGCOLOR=EEEEEE COLSPAN=3><IMG SRC="http://images.domain.com/images/spacer.gif" WIDTH=1 HEIGHT=2><BR></TD></TR>
    <TR><TD COLSPAN=3 BGCOLOR=999999><IMG SRC="http://images.domain.com/images/spacer.gif" HEIGHT=1 WIDTH=1></TD></TR>
    <TR><TD COLSPAN=3 ><IMG SRC="http://images.domain.com/images/spacer.gif" WIDTH=1 HEIGHT=2><BR></TD></TR>
    <TR>
    <TD><A HREF="/program.cgi?pid=1234"><FONT FACE="ARIAL,HELVETICA,SANS-SERIF" SIZE=2>Title 2</A></FONT></TD>
    <TD nowrap><FONT FACE="ARIAL,HELVETICA" SIZE=2>Jun 29 2010 7:00PM</FONT></TD>
    <TD> </TD>
    </TR>
    <TR><TD COLSPAN=3><IMG SRC="http://images.domain.com/images/spacer.gif" WIDTH=1 HEIGHT=2><BR></TD></TR>
    <TR><TD COLSPAN=3 BGCOLOR=999999><IMG SRC="http://images.domain.com/images/spacer.gif" HEIGHT=1 WIDTH=1></TD></TR>
    <TR><TD COLSPAN=3 BGCOLOR=EEEEEE><IMG SRC="http://images.domain.com/images/spacer.gif" WIDTH=1 HEIGHT=2><BR></TD></TR>
    <TR>
    <TD BGCOLOR=EEEEEE><A HREF="/program.cgi?pid=1235"><FONT FACE="ARIAL,HELVETICA,SANS-SERIF" SIZE=2>Title 3</A></FONT></TD>
    <TD BGCOLOR=EEEEEE nowrap><FONT FACE="ARIAL,HELVETICA" SIZE=2>Jul 3 2010 7:00PM</FONT></TD>
    <TD BGCOLOR=EEEEEE> </TD>
    </TR>

I want to get the following into an array (or any structure):

    { ["/program.cgi?pid=1233", "Title 1"],
      ["/program.cgi?pid=1234", "Title 2"],
      ["/program.cgi?pid=1235", "Title 3"] }

Thanks
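
One way to collect every match rather than just the first is a global match that returns all capture groups at once. The sketch below uses Python's `re.findall` for illustration (in Perl the usual counterpart is the `/g` modifier in list context). The pattern and the name `extract_links` are assumptions fitted to the excerpt above; this is not a general HTML parser.

```python
import re
from typing import List, Tuple

# Assumed pattern: an <A HREF="..."> immediately followed by a <FONT ...> tag,
# with the title running up to the closing </A>, as in the excerpt.
LINK_PATTERN = re.compile(
    r'<A HREF="([^"]+)">\s*<FONT[^>]*>([^<]*)</A>',
    re.IGNORECASE,
)


def extract_links(html: str) -> List[Tuple[str, str]]:
    """Return one (href, title) pair for every link matching the pattern."""
    # With two capture groups, findall returns a list of 2-tuples.
    return LINK_PATTERN.findall(html)


if __name__ == "__main__":
    sample = (
        '<TR> <TD BGCOLOR=EEEEEE><A HREF="/program.cgi?pid=1233">'
        '<FONT FACE="ARIAL,HELVETICA,SANS-SERIF" SIZE=2>Title 1</A></FONT></TD> </TR>'
        '<TR> <TD><A HREF="/program.cgi?pid=1234">'
        '<FONT FACE="ARIAL,HELVETICA,SANS-SERIF" SIZE=2>Title 2</A></FONT></TD> </TR>'
    )
    for href, title in extract_links(sample):
        print(href, title)
```

A list of pairs corresponds to the array-of-arrays structure asked about in the question. For markup that varies more than this excerpt does, a real HTML parser is likely to be more robust than a regular expression.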


Getting size of a webpage before parsing it

- by user2869844

I am trying to parse a webpage using jsoup, and all is working well using this code:

    class DownloadSearchResultsTask extends AsyncTask<String, Integer, ArrayList> {

        private String link = "link";
        private String title = "title";
        private String vote = "vote";
        private String age = "age";
        private String size = "size";
        private String seeders = "seeders";
        private String leechers = "leachers";

        @Override
        protected void onPreExecute() {
            // TODO Auto-generated method stub
            super.onPreExecute();
        }

        @Override
        protected ArrayList doInBackground(String... params) {
            // TODO Auto-generated method stub
            ArrayList <HashMap<String, String>> searchResult = new ArrayList<HashMap<String, String>>();
            HashMap<String, String> map;
            String link, title, vote, age, size, seeders, leechers;
            try {
                HttpURLConnection httpURLConnection=(HttpURLConnection) new URL("http://www.facebook.com").openConnection();
                Log.d("VIVZ", httpURLConnection.getContentLength()+"");
            } catch (MalformedURLException e1) {
                // TODO Auto-generated catch block
                e1.printStackTrace();
            } catch (IOException e1) {
                // TODO Auto-generated catch block
                e1.printStackTrace();
            }
            Document mDocument;
            try {
                long l1=System.nanoTime();
                Log.e("VIVZ",l1+"");
                mDocument = Jsoup
                        .connect(params[0])
                        .userAgent(
                                "Mozilla/5.0 (Windows; U; WindowsNT 5.1; en-US; rv1.8.1.6) Gecko/20070725 Firefox/2.0.0.6")
                        .referrer("http://www.google.com").get();
                long l2=System.nanoTime();
                Log.e("VIVZ",(l2-l1)+"");
                Elements mResults = mDocument.select("div.results dl");
                for (Element result : mResults) {
                    map = new HashMap<String, String>();
                    Elements elements = result.select("dt a");
                    for (Element linkAndTitle : elements) {
                        link = linkAndTitle.attr("abs:href");
                        title = linkAndTitle.text();
                        map.put(this.link, link);
                        map.put(this.title, title);
                    }
                    elements = result.select("dd span.v");
                    for (Element v : elements) {
                        vote = v.text();
                        map.put(this.vote, vote);
                    }
                    elements = result.select("dd span.a");
                    for (Element a : elements) {
                        age = a.text();
                        map.put(this.age, age);
                    }
                    elements = result.select("dd span.s");
                    for (Element s : elements) {
                        size = s.text();
                        map.put(this.size, size);
                    }
                    elements = result.select("dd span.u");
                    for (Element u : elements) {
                        seeders = u.text();
                        map.put(this.seeders, seeders);
                    }
                    elements = result.select("dd span.d");
                    for (Element d : elements) {
                        leechers = d.text();
                        map.put(this.leechers, leechers);
                    }
                    searchResult.add(map);
                }
                Log.e("VIVZ", searchResult.toString());
                return searchResult;
            } catch (IOException e) {
                // TODO Auto-generated catch block
                Log.e("VIVZ",e+"");
            }
            return null;
        }

        @Override
        protected void onPostExecute(ArrayList result) {
            // TODO Auto-generated method stub
            super.onPostExecute(result);
        }
    }

The problem is that I want to get the size of the page before parsing it, so that I can show a determinate progress bar. Please help. Thanks in advance.
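
A determinate progress bar needs the total size up front, which usually comes from the `Content-Length` response header. The sketch below, in Python for illustration, shows only the bookkeeping: it reads a stream in chunks and reports a percentage when a total is known, and falls back to an indeterminate signal (`None`) when it is not. The names `read_with_progress` and `on_progress` are hypothetical, and an in-memory stream stands in for the network response.

```python
import io
from typing import BinaryIO, Callable, Optional


def read_with_progress(
    stream: BinaryIO,
    total: Optional[int],
    on_progress: Callable[[Optional[int]], None],
    chunk_size: int = 8192,
) -> bytes:
    """Read the whole stream, reporting percent complete after each chunk.

    `total` is the expected size in bytes (e.g. from Content-Length), or
    None / a non-positive value when the size is unknown.
    """
    received = bytearray()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # empty bytes means end of stream
            break
        received.extend(chunk)
        if total is not None and total > 0:
            # Clamp to 100 in case the body is larger than advertised.
            on_progress(min(100, len(received) * 100 // total))
        else:
            # Unknown size: the caller should show an indeterminate bar.
            on_progress(None)
    return bytes(received)


if __name__ == "__main__":
    body = b"x" * 1000  # stand-in for a downloaded page
    read_with_progress(io.BytesIO(body), len(body), print, chunk_size=250)
```

Servers are not required to send `Content-Length` (for example with chunked transfer encoding), and `HttpURLConnection.getContentLength()` returns -1 in that case, so the unknown-size branch is worth keeping. The downloaded bytes would then be handed to the parser once the read completes.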

Search Results

Search found 7251 results on 291 pages for 'pdf parsing'.
