Search Results

Search found 7251 results on 291 pages for 'pdf parsing'.


Export SharePoint Wiki to PDF from the Command Line

- by Wyatt Barnett

We use a SharePoint wiki* at the office as a knowledge base for our IT operations. Recently we went through a disaster recovery exercise and realized we had a key hole in our plans: how do you restore the services if your instruction manual is down because some services are offline? We decided that we definitely want to keep the wiki, but that we should find a way to create offline backups of it that can be read with common software we can set up without any knowledge from the wiki. So, does anyone know of a good utility that can take a SharePoint wiki and dump it to PDF/Word/RTF/[INSERT HUMAN FRIENDLY FORMAT] easily from the command line? *-Yes, there are better solutions out there. But this was easy and used existing infrastructure and generally does what we need it to do.

Should I post my PDF library for SEO? [closed]

- by Iunknown

Possible Duplicate: Do search engines crawl PDFs and if so are there any rules to follow when making them When a Sales call comes in, the caller often says something like: 'I searched for 3 days before finding your product and it's exactly what I need!' That's telling me that I need some SEO work. We redid our website and streamlined it which removed many of our 'How-To' documents. Since those PDF documents contain words that people might search for, I was wondering if I could add a 'Complete library' link to the bottom of a page that will load up the entire PDF library. Would that help my ranking?

Why does a pdf file download result in varying bytes logged, all with sc-status 200

- by Pat James

I have a mojoportal CMS installation on an IIS7 server where users are reporting problems downloading a pdf file. It always downloads fine for me and most others, either displaying in browser or in Adobe Reader. Using logparser to query the IIS logs, all the responses are status 200 (OK) or 304 (Not modified), but the bytes sent vary quite a bit. Sometimes zero, some 211, some about half the full file size of 27059, and lots in between. Plenty show the full size of 27059. Do these other entries for smaller byte counts represent errors of some kind, correlating with the problems reported? Is this likely to be a browser/client issue or a server side problem? If there is any other info that would be helpful let me know. This is a shared hosting server though so I am somewhat limited in what I can dig into on the server.

How do I convert this filetype to pdf?

- by Gnoupi

This question comes up often, and the general answers are usually the same. To concentrate the useful information in one place, here is a community wiki about it. How can I convert this filetype to PDF? This question will have two kinds of answers: the generic case, which works for most filetypes, and the specific cases, which should be one answer per filetype. The OS field is restricted to Windows, as most of these questions are about this OS. This may change eventually. As it is a community wiki, feel free to edit this question to improve it as well.

Word document to PDF: open hyperlinks in new window

- by baens

I have a Microsoft Word document with hyperlinks in it. When I save it as a PDF document, those hyperlinks no longer open in a new window. I have tried all the settings under the "Target Frame..." option, but those don't seem to persist. Are there any settings that make all hyperlinks in the document open in a new window? I am currently using the Acrobat plugin, but could move to a different plugin if it offers this feature.

rel="Canonical": Ranking Benefits ? & specifying for PDF?

- by Miak

I think I understand the basic case for using rel="canonical": to tell Google which is the preferred URI when the same page/content may be accessed via more than one URI. This helps you avoid duplicate content penalties. But what else does it do? Does it also affect search ranking? I.e., will the page I specify in the canonical be ranked higher than the others (all else being equal)? And in the case of PDF documents, I understand that you can now specify rel="canonical" for them too, using HTTP headers (i.e. in htaccess). Again, this would obviously help avoid duplicate content penalties if the PDF content is the same as the HTML page or if it can be accessed in more than one place. But does it affect ranking? Or are there any other benefits to doing this?
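For the PDF case, the canonical hint travels in an HTTP `Link` response header rather than in markup. The sketch below only shows the shape of that header value; the function name and the example URL are assumptions made for illustration, and it says nothing about whether ranking is affected.

```python
def canonical_link_header(canonical_url: str) -> str:
    """Build the value of a Link header that marks canonical_url as canonical.

    Hypothetical helper: the header format is standard, the function is not
    part of any library.
    """
    # Characters that would break the <...> delimiters or the header line.
    if any(ch in canonical_url for ch in '<>"\r\n'):
        raise ValueError("URL contains characters not allowed in this header")
    return f'<{canonical_url}>; rel="canonical"'


# Example: headers a server could attach to the response for the PDF copy.
# The URL is a placeholder, not one from the question.
pdf_response_headers = {
    "Content-Type": "application/pdf",
    "Link": canonical_link_header("https://www.example.com/white-paper.html"),
}
```

How the header is attached in practice (htaccess, application code, CDN rule) depends on the server setup, which the question does not describe.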

IE10 does not open .pdf

- by user203298

I can't open any PDFs in IE10 on Win7 64-bit. I've tested with PDFs from the intranet, the Internet, and the local file system, over http and https. I've tested installing/uninstalling Acrobat Reader 11.0.03 and the Nitro PDF Reader. I've also tried enabling/disabling the Tools > Internet Options > Advanced > Security > "Do not save encrypted pages to disk" option. In Google Chrome, PDFs are opened in the Acrobat Reader plugin, but in IE10 the only thing I ever get is a small cross in the top left corner of the browser. Can anybody help me?

Free tool to automatically deskew and crop PDF made up of scanned pages [closed]

- by Pietro M.

I have several PDFs made up of book pages' scans. The scans are made from two pages at a time and some of these scans are skewed, making text appear slightly tilted. I'm looking for a tool that could allow me to do an automatic optimization by deskewing the scans without losing readability. I've found the GPL software briss to crop the scans in order to have a 1:1 page ratio instead of 2:1, but I don't have any tool to deskew the pages. I stumbled upon unpaper, another open source tool that seems perfect for what I want to do, but that tool is Linux only and it doesn't work on PDF files directly. Any hint is appreciated. Thank you.

Selection Issues with a PDF from a Word document

- by syrion

I have a long Word document that has a running footer. When I try to copy and paste across pages in the PDF generated from this document, the behavior of this footer is unpredictable--sometimes it is unselected, sometimes it is selected, sometimes the footer on the next page is selected. I would prefer to make this portion of the document unselectable, so that it still shows up but doesn't interfere with copying and pasting. Does anyone have an idea of how to do this? No, changing it to an image isn't possible, because it includes a page number.

PDF - re/generate image using stream content

- by tom_tap

I have a PDF file with 8 content streams (bytes) which behave like image layers (but they are not layers that I can turn off/on in Adobe Reader). I would like to extract these images separately, because they overlap each other (thus I am not able to use "Take a Snapshot" or "Copy File to Clipboard"). So now I have these streams in the format below: <Start Stream> q 599.7601 0 0 71.99921 5951.03423 4282.48177 cm /Im0 Do Q q 599.7601 0 0 71.99921 5951.03432 4210.48177 cm /Im1 Do Q q 599.7601 0 0 71.99921 5951.03441 4138.48177 cm /Im2 Do [...] My question is: how can I use this data to generate or regenerate these images so that I can save them as raster or vector files? I have already tried pstoedit, but it doesn't work properly because of these multiple streams. Same with PDFedit.
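The stream shown above only places images: each `a b c d e f cm /ImN Do` sequence sets a transformation matrix and then draws the image XObject named `/ImN`. The pixel data itself lives in the XObject referenced from the page resources, not in this stream. A minimal sketch, assuming unrotated placements like the ones quoted (b = c = 0), that lists which image is drawn where; the function name and the returned keys are made up for illustration:

```python
import re

# A PDF number: optional sign, digits with an optional fractional part.
NUM = r"[-+]?(?:\d+\.?\d*|\.\d+)"
# Six numbers, the "cm" operator, then "/Name Do".
PLACEMENT = re.compile(rf"((?:{NUM}\s+){{6}})cm\s+/(\w+)\s+Do")


def image_placements(stream_text):
    """Return one dict per '... cm /Name Do' sequence found in the stream text."""
    placements = []
    for match in PLACEMENT.finditer(stream_text):
        a, b, c, d, e, f = (float(x) for x in match.group(1).split())
        placements.append({
            "name": match.group(2),  # XObject name to look up in the page resources
            "width": a,              # horizontal scale of the unit image square
            "height": d,             # vertical scale of the unit image square
            "x": e,                  # lower-left corner, in page units
            "y": f,
        })
    return placements


sample = (
    "q 599.7601 0 0 71.99921 5951.03423 4282.48177 cm /Im0 Do Q "
    "q 599.7601 0 0 71.99921 5951.03432 4210.48177 cm /Im1 Do Q"
)
for placement in image_placements(sample):
    print(placement["name"], placement["x"], placement["y"])
```

With the names in hand, the image streams themselves would still have to be pulled out of the file with a PDF library or extraction tool.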

Print each bookmark of a PDF separately

- by Dave

I have a very large (1000-page) PDF which contains about 100 ten-page documents, one after the other. I would like to have them sent to my office printer as individual files so that the printer will print them double-sided and staple each one individually. I'm using Adobe Acrobat X and think the first step is to bookmark the start of each of those 100 documents. I don't know the next step, though. I also have a batch printing program, so if I can extract each of those 100 bookmarks to individual files, that would work too. Thanks for all the help.

Parsing basic math equations for children's educational software?

- by Simucal

Inspired by a recent TED talk, I want to write a small piece of educational software. The researcher created miniature computers in the shape of blocks called "Siftables". [David Merril, inventor - with Siftables in the background.] He used the blocks in many applications, but my favorite was when each block was a number or a basic operation symbol. You could then rearrange the blocks of numbers or operation symbols in a line, and it would display an answer on another Siftable block. So, I've decided I want to implement a software version of "Math Siftables" on a limited scale as my final project for a CS course I'm taking. What is the generally accepted way to parse and interpret a string of math expressions and, if they are valid, perform the operation? Is this a case where I should implement a full parser/lexer? I would imagine interpreting basic math expressions is a semi-common problem in computer science, so I'm looking for the right way to approach this. For example, if my Math Siftable blocks were arranged like: [1] [+] [2] This would be a valid sequence and I would perform the necessary operation to arrive at "3". However, if the child were to drag several operation blocks together, such as: [2] [\] [\] [5] It would obviously be invalid. Ultimately, I want to be able to parse and interpret any number of chains of operations with the blocks that the user can drag together. Can anyone explain to me or point me to resources for parsing basic math expressions? I'd prefer as language-agnostic an answer as possible.
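One common way to handle a block sequence like this is a small recursive descent evaluator: one function per precedence level, each consuming tokens from left to right and rejecting anything unexpected. The sketch below is only an illustration under some assumptions: blocks arrive as a list of strings, numbers are non-negative integers, and division is written "/".

```python
def evaluate(tokens):
    """Evaluate block tokens such as ["1", "+", "2"]; raise ValueError if invalid."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def number():
        nonlocal pos
        tok = peek()
        if tok is None or not tok.isdigit():
            raise ValueError(f"expected a number at position {pos}, got {tok!r}")
        pos += 1
        return int(tok)

    def term():
        # term := number (("*" | "/") number)*
        nonlocal pos
        value = number()
        while peek() in ("*", "/"):
            op = tokens[pos]
            pos += 1
            rhs = number()
            if op == "*":
                value *= rhs
            elif rhs == 0:
                raise ValueError("division by zero")
            else:
                value /= rhs
        return value

    def expression():
        # expression := term (("+" | "-") term)*
        nonlocal pos
        value = term()
        while peek() in ("+", "-"):
            op = tokens[pos]
            pos += 1
            rhs = term()
            value = value + rhs if op == "+" else value - rhs
        return value

    result = expression()
    if pos != len(tokens):
        raise ValueError(f"unexpected token {tokens[pos]!r} at position {pos}")
    return result


print(evaluate(["1", "+", "2"]))  # a valid sequence
```

A sequence with two adjacent operator blocks fails in `number()`, which gives a natural place to tell the child which block is out of place.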

Parsing HTTP - Bytes.length != String.length

- by hotzen

Hello, I consume HTTP via nio.SocketChannel, so I get chunks of data as Array[Byte]. I want to feed these chunks into a parser and continue parsing after each chunk has been added. HTTP itself seems to use an ISO-8859 charset, but the payload/body may be arbitrarily encoded: if the HTTP Content-Length specifies X bytes, the UTF-8-decoded body may have far fewer characters (one character may be represented in UTF-8 by 2 bytes, etc.). So what is a good parsing strategy that honors an explicitly specified Content-Length and/or a Transfer-Encoding: chunked, which specifies a chunk length to be honored?

1. Append each data chunk to a mutable.ArrayBuffer[Byte], search for CRLF in the bytes, decode everything from 0 until CRLF to a String, and match with regular expressions like StatusRegex, HeaderRegex, etc.?
2. Decode each data chunk with the proper charset (e.g. iso8859, utf8, etc.) and add it to a StringBuilder. With this solution I am not able to honor any Content-Length or chunk size, but do I have to care about it?
3. Any other solution?
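The byte-buffer approach can be sketched as follows: keep everything as raw bytes, decode only the header block (as ISO-8859-1), and count the body in bytes so that Content-Length stays meaningful. This is a simplified illustration in Python rather than Scala; the class and attribute names are hypothetical, and chunked transfer encoding is deliberately left out.

```python
class HttpMessageBuffer:
    """Accumulates raw chunks until one message with a Content-Length body is complete."""

    def __init__(self):
        self._buf = bytearray()
        self.start_line = None
        self.headers = None   # dict once the header block has been seen
        self.body = None      # raw bytes; decode later using the declared charset

    def feed(self, chunk: bytes) -> bool:
        """Add a chunk; return True once the full message has arrived."""
        self._buf.extend(chunk)
        if self.headers is None:
            end = self._buf.find(b"\r\n\r\n")
            if end < 0:
                return False  # header block still incomplete
            head = bytes(self._buf[:end]).decode("iso-8859-1")
            del self._buf[:end + 4]
            lines = head.split("\r\n")
            self.start_line = lines[0]
            self.headers = {}
            for line in lines[1:]:
                name, _, value = line.partition(":")
                self.headers[name.strip().lower()] = value.strip()
        # Content-Length counts bytes, so compare against the byte buffer.
        length = int(self.headers.get("content-length", "0"))
        if len(self._buf) < length:
            return False
        self.body = bytes(self._buf[:length])
        return True
```

Because the body is decoded only after all of its bytes are present, a multi-byte character split across two network chunks causes no trouble, which is the main weakness of decoding each chunk to a string as it arrives.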

Need some ideas on how to accomplish this in Java (parsing strings)

- by Matt

Sorry I couldn't think of a better title, but thanks for reading! My ultimate goal is to read a .java file, parse it, and pull out every identifier. Then store them all in a list. Two preconditions are there are no comments in the file, and all identifiers are composed of letters only. Right now I can read the file, parse it by spaces, and store everything in a list. If anything in the list is a java reserved word, it is removed. Also, I remove any loose symbols that are not attached to anything (brackets and arithmetic symbols). Now I am left with a bunch of weird strings, but at least they have no spaces in them. I know I am going to have to re-parse everything with a . delimiter in order to pull out identifiers like System.out.print, but what about strings like this example: Logger.getLogger(MyHash.class.getName()).log(Level.SEVERE, After re-parsing by . I will be left with more crazy strings like: getLogger(MyHash getName()) log(Level SEVERE, How am I going to be able to pull out all the identifiers while leaving out all the trash? Just keep re-parsing by every symbol that could exist in java code? That seems rather lame and time consuming. I am not even sure if it would work completely. So, can you suggest a better way of doing this?
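Given the two stated preconditions (no comments, identifiers made of letters only), one alternative to repeatedly splitting on symbols is to scan for runs of letters directly and drop reserved words. The sketch below shows the idea in Python for brevity; the same regular expression works with Java's `Pattern` and `Matcher`. The function name is hypothetical, and words inside string literals would still be picked up.

```python
import re

# Java keywords plus the literals true, false and null.
JAVA_RESERVED = {
    "abstract", "assert", "boolean", "break", "byte", "case", "catch", "char",
    "class", "const", "continue", "default", "do", "double", "else", "enum",
    "extends", "final", "finally", "float", "for", "goto", "if", "implements",
    "import", "instanceof", "int", "interface", "long", "native", "new",
    "package", "private", "protected", "public", "return", "short", "static",
    "strictfp", "super", "switch", "synchronized", "this", "throw", "throws",
    "transient", "try", "void", "volatile", "while", "true", "false", "null",
}


def identifiers(source):
    """Return the distinct letter-only identifiers in source, in order of first use."""
    seen = []
    for word in re.findall(r"[A-Za-z]+", source):
        if word not in JAVA_RESERVED and word not in seen:
            seen.append(word)
    return seen


line = "Logger.getLogger(MyHash.class.getName()).log(Level.SEVERE,"
print(identifiers(line))
```

If the preconditions are ever relaxed (digits, underscores, comments, string literals), a real Java lexer or parser library becomes the safer route.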

Which is best website for generating simple invoices with pdf and email pdf facility

- by Mirage

I am looking for a website where I can generate invoices and send them to customers. There are many on the internet, but I want one that others have actually used.

Exceptions with DateTime parsing in RSS feed in C#

- by hIpPy

I'm trying to parse Rss2 and Atom feeds using SyndicationFeedFormatter and SyndicationFeed objects, but I'm getting XmlExceptions while parsing DateTime fields like pubDate and/or lastBuildDate.

    Wed, 24 Feb 2010 18:56:04 GMT+00:00   does not work
    Wed, 24 Feb 2010 18:56:04 GMT         works

So it's throwing because of the timezone field. As a workaround, for familiar feeds I would manually fix those DateTime nodes: catch the XmlException, load the RSS into an XmlDocument, fix those nodes' values, create a new XmlReader, and then return the formatter from this new XmlReader object (code not shown). But for this approach to work, I need to know beforehand which nodes cause the exception.

    SyndicationFeedFormatter syndicationFeedFormatter = null;
    XmlReaderSettings settings = new XmlReaderSettings();
    using (XmlReader reader = XmlReader.Create(url, settings))
    {
        try
        {
            syndicationFeedFormatter = SyndicationFormatterFactory.CreateFeedFormatter(reader);
            syndicationFeedFormatter.ReadFrom(reader);
        }
        catch (XmlException xexp)
        {
            // fix those datetime nodes with exceptions and read again.
        }
        return syndicationFeedFormatter;
    }

RSS feed: http://news.google.com/news?pz=1&cf=all&ned=us&hl=en&q=test&cf=all&output=rss

Exception details:

    XmlException Error in line 1 position 376. An error was encountered when parsing a DateTime value in the XML.
       at System.ServiceModel.Syndication.Rss20FeedFormatter.DateFromString(String dateTimeString, XmlReader reader)
       at System.ServiceModel.Syndication.Rss20FeedFormatter.ReadXml(XmlReader reader, SyndicationFeed result)
       at System.ServiceModel.Syndication.Rss20FeedFormatter.ReadFrom(XmlReader reader)
       at ... cs:line 171

    <rss version="2.0">
      <channel>
        ...
        <pubDate>Wed, 24 Feb 2010 18:56:04 GMT+00:00</pubDate>
        <lastBuildDate>Wed, 24 Feb 2010 18:56:04 GMT+00:00</lastBuildDate> <-----exception
        ...
        <item>
          ...
          <pubDate>Wed, 24 Feb 2010 16:17:50 GMT+00:00</pubDate>
          <lastBuildDate>Wed, 24 Feb 2010 18:56:04 GMT+00:00</lastBuildDate>
        </item>
        ...
      </channel>
    </rss>

Is there a better way to achieve this? Please help. Thanks.
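The failing value differs from the working one only in its zone suffix, so one option is to normalize that suffix before the date is parsed, instead of needing to know in advance which nodes fail. The sketch below illustrates the normalization in Python, not C#; the function name is hypothetical, and it assumes the only deviation is a `GMT` or `UTC` label followed by a `+hh:mm` style offset.

```python
import re
from email.utils import parsedate_to_datetime

# Matches a trailing "GMT+00:00" / "UTC-0500" style zone and captures sign, hours, minutes.
_ZONE_WITH_OFFSET = re.compile(r"\b(?:GMT|UTC)([+-])(\d{2}):?(\d{2})$")


def parse_feed_date(text):
    """Parse an RFC 822 style date, tolerating a 'GMT+hh:mm' zone suffix."""
    # "GMT+00:00" becomes "+0000", which standard RFC 822 parsers accept.
    cleaned = _ZONE_WITH_OFFSET.sub(r"\1\2\3", text.strip())
    return parsedate_to_datetime(cleaned)


print(parse_feed_date("Wed, 24 Feb 2010 18:56:04 GMT+00:00"))
print(parse_feed_date("Wed, 24 Feb 2010 18:56:04 GMT"))
```

The same rewrite could be applied to every date node's text before the feed is handed to the formatter, which avoids the catch-and-retry loop.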

Looking for a tutorial on Recursive Descent Parsing.

- by bodacydo

I am trying to parse some data to no success. Can anyone recommend a good introduction with a lot of examples to Recursive Descent Parsing? I haven't been able to find any.
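As a small taste of the technique: a recursive descent parser has one function per grammar rule, and a rule that can contain itself is handled by the function calling itself. The sketch below parses a made-up nested-list notation such as `(a (b c) d)`; the grammar and the function name are assumptions chosen only to keep the example short.

```python
import re


def parse_nested(text):
    """Parse '(a (b c) d)' into nested Python lists of strings."""
    # Tokens are parentheses or runs of non-space, non-parenthesis characters.
    tokens = re.findall(r"[()]|[^\s()]+", text)
    pos = 0

    def parse_item():
        # item := WORD | "(" item* ")"
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of input")
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            items = []
            while pos < len(tokens) and tokens[pos] != ")":
                items.append(parse_item())  # recursion handles the nesting
            if pos >= len(tokens):
                raise ValueError("missing closing parenthesis")
            pos += 1  # consume ")"
            return items
        if tok == ")":
            raise ValueError("unexpected ')'")
        return tok

    result = parse_item()
    if pos != len(tokens):
        raise ValueError("unexpected trailing input")
    return result


print(parse_nested("(a (b c) d)"))
```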

Any python libs for parsing apache config files?

- by daniels

Are there any Python libraries for parsing Apache config files? If not in Python, is anyone aware of such a thing in other languages (Perl, PHP, Java, C#)? I would be able to rewrite it in Python.
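For simple files, a hand-rolled reader may be enough while a proper library is being evaluated. The sketch below is not an existing library, just an illustration: it treats each non-comment line as either a directive, an opening `<Section args>`, or a closing `</Section>`, and it ignores line continuations, quoting rules, and `Include` handling.

```python
def parse_apache_config(text):
    """Parse a simplified Apache-style config into nested dicts."""
    root = {"name": None, "args": "", "directives": [], "sections": []}
    stack = [root]
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("</"):
            if len(stack) == 1:
                raise ValueError(f"unexpected closing tag: {line}")
            stack.pop()
        elif line.startswith("<") and line.endswith(">"):
            parts = line[1:-1].split(None, 1)
            section = {
                "name": parts[0],
                "args": parts[1] if len(parts) > 1 else "",
                "directives": [],
                "sections": [],
            }
            stack[-1]["sections"].append(section)
            stack.append(section)
        else:
            parts = line.split(None, 1)
            stack[-1]["directives"].append((parts[0], parts[1] if len(parts) > 1 else ""))
    if len(stack) != 1:
        raise ValueError("unclosed section: " + stack[-1]["name"])
    return root


example = """
ServerName example.com
<VirtualHost *:80>
    DocumentRoot /var/www
</VirtualHost>
"""
config = parse_apache_config(example)
print(config["sections"][0]["directives"])
```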

Sending and Parsing JSON in Android

- by primal

Hi, in the application I am developing, I would like to send messages in the form of JSON objects to a Django server, parse the JSON response from the server, and populate a custom ListView. From the little JSON knowledge I have, I thought of this format for the response from the server: { "post": { "username": "someusername", "message": "this is a sweet message", "image": "http://localhost/someimage.jpg", "time": "present time" } } How much knowledge of JSON do I need to accomplish this? Also, it would be great if someone could provide links to some tutorials on sending and parsing JSON objects.
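Reading a response of this shape takes very little JSON knowledge: parse the text into nested dictionaries and pick out the fields. The sketch below does that in Python (which is also what the Django side would use); the function name is hypothetical, and on Android the equivalent calls live in the `org.json` classes. One detail worth knowing is that strict JSON parsers reject a trailing comma after the last member of an object.

```python
import json


def parse_post(response_text):
    """Return (username, message) from a response of the form {"post": {...}}."""
    data = json.loads(response_text)
    post = data["post"]
    return post["username"], post["message"]


response_text = """
{ "post": { "username": "someusername",
            "message": "this is a sweet message",
            "image": "http://localhost/someimage.jpg",
            "time": "present time" } }
"""
print(parse_post(response_text))

# A trailing comma makes the document invalid for a strict parser.
try:
    json.loads('{ "post": { "username": "someusername" }, }')
except json.JSONDecodeError as error:
    print("rejected:", error.msg)
```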

Which is best pdf parser ?

- by Harikrishna

I want to parse the tabular information from a PDF file and display it in a DataGridView. Which is the best PDF parser for this in a C#.NET application?

parsing HTML on the iPhone

- by Ben Alpert

Can anyone recommend a C or Objective-C library for HTML parsing? It needs to handle messy HTML code that won't quite validate. Does such a library exist, or am I better off just trying to use regular expressions?

Looking for a good text parsing library for C#

- by Chris Stewart

Has anyone run across a quality library that will parse, line by line, CSV, tab-delimited, and Excel files? I've started to do it manually but have noticed some of the intricacies in parsing a comma-delimited file, such as situations where a cell has a comma in it as part of the data (blah,"LastName, Jr.",blah,blah).
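The quoted-comma case is exactly what a dedicated CSV reader handles and what a plain split on "," gets wrong. The sketch below shows the behavior with Python's standard `csv` module as an illustration of what to look for in a C# library; the helper name is made up, and Excel workbooks are a separate format that this does not cover.

```python
import csv
import io


def parse_delimited(text, delimiter=","):
    """Parse delimited text into a list of rows, honoring double-quoted fields."""
    return list(csv.reader(io.StringIO(text, newline=""), delimiter=delimiter))


line = 'blah,"LastName, Jr.",blah,blah\n'
print(line.strip().split(","))       # naive split breaks the quoted field in two
print(parse_delimited(line))         # the reader keeps "LastName, Jr." together
print(parse_delimited("a\tb\n", delimiter="\t"))  # same reader, tab-delimited input
```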

XML Parsing need help iphone sdk

- by neha

Hi all, how do you get "MayurS123" from the following XML tag by parsing? <eletitle lnk="http://192.168.10.2/justmeans/trunk/newsfeed/mayurs">MayurS123 Sharma</eletitle> My file is getting parsed properly. I'm able to retrieve the lnk attribute by doing: if([elementName isEqualToString:@"eletitle"]) { aGoodwork.lnk = [attributeDict objectForKey:@"lnk"]; } But I can't work out how to get the actual title text. Thanks in advance.
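The distinction that matters here is between an element's attributes and its text content: `lnk` is an attribute, while "MayurS123 Sharma" is character data between the tags. With `NSXMLParser` that text is delivered through the `parser:foundCharacters:` delegate callback rather than the attribute dictionary. The sketch below shows the same distinction in Python for illustration; the function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def title_and_link(xml_text):
    """Return (text content, lnk attribute) of a single <eletitle> element."""
    element = ET.fromstring(xml_text)
    return element.text, element.get("lnk")


xml_text = ('<eletitle lnk="http://192.168.10.2/justmeans/trunk/newsfeed/mayurs">'
            'MayurS123 Sharma</eletitle>')
title, link = title_and_link(xml_text)
first_word = title.split()[0]  # the part before the first space
print(first_word, link)
```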

Path parsing in rails

- by fl00r

Hi! I am looking for a method for parsing a route path like this: ActionController::Routing.new("post_path").parse #=> {:controller => "posts", :action => "index"} It should be the opposite of url_for. Update: I've found http://stackoverflow.com/questions/2222522/what-is-the-opposite-of-url-for-in-rails-a-function-that-takes-a-path-and-genera ActionController::Routing::Routes.recognize_path("/posts") So now I need to convert posts_path into "/posts".

Parsing plain text to some structured object

- by Jeriho

I am working on parsing plain text and converting it to key-value pairs. For example, plain text: some_uninteresting_thing key1 valueA, valueB, valueC key2 valueD key3 valueE valueF key4 valueG(valueH, valueI) key5 some_uninteresting_thing valueJ some_uninteresting_thing key6 some_uninteresting_thing (key6 shouldn't be mapped because has no appropriate values) As you can see plain text is lenient. What java library can handle this? If no such library exist, any suggestions on algorithm to do this.

