pdf scraping - Page 68 - Developer IT

Online OCR website for processing an entire pdf file at one time?

- by Tim

I am looking for an online OCR website that can process a multi-page PDF file in one go, preferably free. I know of http://www.newocr.com/. If I am correct, it can only OCR one page at a time: I have to click "Preview" and then "OCR" for each page, and after each page is OCRed I also have to copy the text result out manually. If my PDF file has about 30 pages, repeating this process for each page is tedious. Are there other websites that OCR a whole PDF file without manual operation for every page? Thanks!

How to enlarge a .PDF document to better show it in a Kindle 6"?

- by Gus

I have a 6" Kindle. I often read technical PDF files, which do not convert well to the Kindle's native format (for example, code blocks get mangled). When I view a PDF page as-is, the text is too small to read comfortably, so I have to rotate the screen to landscape to see it better, but then each page gets split. Some documents would be easy to read in portrait if I could first enlarge the font size a little in an external PDF editor. Has anybody been in the same situation? Is there a solution?

How to suppress the unsolicited footer when converting HTML -> PDF with Acrobat?

- by gojira

I often convert and combine (via the context menu) HTML pages to PDF using Acrobat (not Acrobat Reader). I use Adobe Acrobat Pro 9 Extended, version 9.1.2. The converted PDFs always have the full path of the original file at the bottom of each page, and they also have an additional header line with the document. I need to suppress both. I do not want the unsolicited header and footer in the resulting PDF files, as they are a pain to remove manually, and beyond a certain page count per document it becomes impossible. Is it possible to suppress them, and if so, how?

How to get the headers for all the pages of the exported data from php to pdf

- by udaya

Hi, I am exporting data from a PHP page to PDF. When the data exceeds one page, the header is not shown on the following pages. The function where I call the export to PDF is:

    function changeDetails()
    {
        $bType = $this->input->post('textvalue');
        if ($bType == "pdf") {
            $this->load->library('table');
            $this->load->plugin('to_pdf');
            $data['countrytoword'] = $this->AddEditmodel1->export();
            $this->table->set_heading('Country', 'State', 'Town', 'Name');
            $out = $this->table->generate($data['countrytoword']);
            $html = $this->load->view('newpdf', $data, true);
            pdf_create($html, $cur_date);
        }
    }

This is my view page, from which I export the data to PDF:

    Name  Country  State  Town

Here is the result I get:

    page 1:
    Name     Country   State      Town
    udaya    india     Tamilnadu  kovai
    chandru  srilanka  columbo    aaaaa

    page 2:
    vivek    england   gggkj      gjgjkj

On page 2 I don't get the headers Name, Country, State and Town.

How To Send a PDF File to a Progress AppServer?

- by Jay

I have a PDF file on the client and I want to send it to the AppServer. How can I send this PDF file to the AppServer?

How to calculate the correct image size in the output PDF using iTextSharp?

- by MK

I am trying to add an image to a PDF using iTextSharp. Regardless of the image size, it always appears to be mapped to a different, larger size inside the PDF. The image I add is 624x500 pixels (DPI: 72). Here is how I created the document:

    Document document = new Document();
    System.IO.MemoryStream stream = new MemoryStream();
    PdfWriter writer = PdfWriter.GetInstance(document, stream);
    document.Open();

    System.Drawing.Image pngImage = System.Drawing.Image.FromFile("test.png");
    Image pdfImage = Image.GetInstance(pngImage, System.Drawing.Imaging.ImageFormat.Png);
    document.Add(pdfImage);
    document.Close();

    byte[] buffer = stream.GetBuffer();
    FileStream fs = new FileStream("test.pdf", FileMode.Create);
    fs.Write(buffer, 0, buffer.Length);
    fs.Close();

Any idea why this happens, or how to calculate the correct size?
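
One possible explanation, offered as an assumption and not something the question confirms: PDF page space is measured in points (1/72 inch), so if each image pixel is mapped to one point while the image is assumed elsewhere to have a different resolution (96 DPI is used below purely as an illustration), the image will look larger than expected. The arithmetic can be sketched in Python; the helper names `pixels_to_points` and `scale_percent` are hypothetical and are not part of iTextSharp.

```python
POINTS_PER_INCH = 72.0  # PDF user space: 1 point = 1/72 inch


def pixels_to_points(pixels: float, dpi: float) -> float:
    """Return the size in PDF points of `pixels` pixels at `dpi` dots per inch."""
    if dpi <= 0:
        raise ValueError("dpi must be positive")
    return pixels * POINTS_PER_INCH / dpi


def scale_percent(dpi: float) -> float:
    """Return the percentage by which a 1-pixel-per-point image must be
    scaled so that it is drawn at its physical size for the given DPI."""
    if dpi <= 0:
        raise ValueError("dpi must be positive")
    return 100.0 * POINTS_PER_INCH / dpi


# The 624x500 image from the question, under two assumed resolutions.
size_at_72 = (pixels_to_points(624, 72), pixels_to_points(500, 72))  # (624.0, 500.0)
size_at_96 = (pixels_to_points(624, 96), pixels_to_points(500, 96))  # (468.0, 375.0)
```

If the assumption holds, the value from `scale_percent` is the kind of number one would pass to a scaling call such as iTextSharp's `Image.ScalePercent`; whether that matches the behaviour seen in the question would need to be checked against the actual output.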

Poppler installation

- by Menopia

I downloaded the new poppler 0.15 tarball and built it from source successfully, but when I run

    dpkg -l | grep poppler

it outputs

    ii  libpoppler-dev       0.14.3-0ubuntu1.1          PDF rendering library -- development files
    ii  libpoppler-glib-dev  0.14.3-0ubuntu1.1          PDF rendering library -- development files (GLib interface)
    ii  libpoppler-glib4     0.12.4-1ubuntu1            PDF rendering library (GLib-based shared library)
    ii  libpoppler-glib5     0.14.3-0ubuntu1.1          PDF rendering library (GLib-based shared library)
    ii  libpoppler5          0.12.4-1ubuntu1            PDF rendering library
    rc  libpoppler6          0.14.2.is.0.14.1-0ubuntu1  PDF rendering library
    ii  libpoppler7          0.14.3-0ubuntu1.1          PDF rendering library
    ii  poppler-utils        0.14.3-0ubuntu1.1          PDF utilitites (based on libpoppler)

So AFAIK this means that the new version is not installed!

How to search a pdf after opening it from a new intent?

- by Nate

I've used the code from the PDF rendering question http://stackoverflow.com/questions/2883355/how-to-render-pdf-in-android And it works! Props to the answerer, but my question is about doing that same thing but also sending a keyword to search in the pdf. I have no idea how to do this, should I set a flag? Any help would be greatly appreciated.

java.awt.Desktop.open doesn’t work with PDF files?

- by Jason S

It looks like I cannot use Desktop.open() on PDF files regardless of location. Here's a small test program:

    package com.example.bugs;

    import java.awt.Desktop;
    import java.io.File;
    import java.io.IOException;

    public class DesktopOpenBug {
        static public void main(String[] args)
        {
            try {
                Desktop desktop = null;
                // Before more Desktop API is used, first check
                // whether the API is supported by this particular
                // virtual machine (VM) on this particular host.
                if (Desktop.isDesktopSupported()) {
                    desktop = Desktop.getDesktop();
                    for (String path : args)
                    {
                        File file = new File(path);
                        System.out.println("Opening "+file);
                        desktop.open(file);
                    }
                }
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }

If I run DesktopOpenBug with arguments c:\tmp\zz1.txt c:\tmp\zz.xml c:\tmp\ss.pdf (3 files I happen to have lying around) I get this result (the .txt and .xml files open up fine):

    Opening c:\tmp\zz1.txt
    Opening c:\tmp\zz.xml
    Opening c:\tmp\ss.pdf
    java.io.IOException: Failed to open file:/c:/tmp/ss.pdf. Error message: The parameter is incorrect.
        at sun.awt.windows.WDesktopPeer.ShellExecute(Unknown Source)
        at sun.awt.windows.WDesktopPeer.open(Unknown Source)
        at java.awt.Desktop.open(Unknown Source)
        at com.example.bugs.DesktopOpenBug.main(DesktopOpenBug.java:21)

What the heck is going on? I'm running WinXP, and I can type "c:\tmp\ss.pdf" at the command prompt and it opens up just fine.

Edit: if this is an example of Sun Java bug #6764271, please help by voting for it. What a pain. :(

Anyone have a good solution for scraping the HTML source of a page with content (in this case, HTML

- by phpwns

Anyone have a good solution for scraping the HTML source of a page with content (in this case, HTML tables) generated with Javascript? An embarrassingly simple, though workable, solution using Crowbar:

    <?php
    function get_html($url) // $url must be urlencode(d)
    {
        $context = stream_context_create(array(
            'http' => array('timeout' => 120) // HTTP timeout in seconds
        ));
        // substr removes HTML from the Crowbar web service, returning only the $url HTML
        $html = substr(file_get_contents('http://127.0.0.1:10000/?url=' . $url . '&delay=3000&view=browser', 0, $context), 730, -32);
        return $html;
    }
    ?>

The advantage of using Crowbar is that the tables will be rendered (and accessible) thanks to the headless Mozilla-based browser. The problem, of course, is being dependent on an external web service, especially given that SIMILE seems to undergo regular server maintenance. :( A pure PHP solution would be nice, but any functional (and reliable) alternatives would be great.

why i cannot download the pdf document from openstack? [closed]

- by hugemeow

http://docs.openstack.org/trunk/openstack-compute/admin/os-compute-adminguide-trunk.pdf

You can find the above link by clicking through from http://wiki.openstack.org/Documentation#Administration. It seems a bit strange: I used to think OpenStack was a well-known project, yet such a nice project still has broken links, and I am sorry to find this. If somebody knows how to download this PDF, just let me know :) Thank you.

3D PDF with WebGL, invented by members of Developpez.com, who are founding the company SPACEGOO

3D PDF with WebGL, invented by members of Developpez.com, who are creating a new mode of navigation and founding SPACEGOO. The standardization of WebGL technology in the summer of 2010, and its implementation by recent browsers (Firefox 4, Chrome 7, Opera 11.50, Webkit), have made it simple to do quality 3D rendering in the browser. The 3D scene is rendered in an HTML5 canvas, which makes it possible to integrate these 3D elements into a "standard" site, developed in PHP and CSS for example. WebGL is used mainly for games, demos, and the visualization of 3D objects; members of Developpez founded the company SPACEGOO and had the idea of using WebGL to represent content...

error of pdf2djvu: "Bogus memory allocation size"

- by Tim

I am using pdf2djvu to convert a PDF file into a DjVu file, but I get this error when trying to convert to either a bundled or an indirect multi-page DjVu file:

    $ pdf2djvu 1.pdf -o 1.djvu
    1.pdf:
    - page #1 -> #1
    Bogus memory allocation size
    $ pdf2djvu 1.pdf -i 1.djvu
    1.pdf:
    - page #1 -> #1
    Bogus memory allocation size

What is wrong here, and how can I fix the problem? Suggestions for an application other than pdf2djvu are welcome. My PDF file can be downloaded from here, in case you wonder what is special about it. Thanks and regards!

[Title lost to encoding damage]

- by Kumiko Fujita

[Japanese-language text lost to encoding damage. The surviving fragments mention Oracle ASM Cluster File System (ACFS), Oracle Enterprise Manager 12c, SPARC, Solaris, OVM, and Oracle DB 11g R2, with materials offered as PDF, WMV, and MP4.]

How to set header font style as bold for the header of the table in a pdf file, in jsf

- by Radhika

Hi, I have used PdfPTable (com.itextpdf.text.pdf.PdfPTable) to convert table data into a PDF file. The table is displayed, but the table data and the header are in the same style. To tell them apart, I need to set the header font style to bold. Can anybody help me with this? I have attached my code here. Thanks in advance.

    import java.awt.Color;
    import java.util.ArrayList;
    import java.util.List;

    import javax.faces.model.ListDataModel;

    import com.mypackage.core.filter.domainobject.FilterResultDO;
    import com.itextpdf.text.Font;
    import com.itextpdf.text.FontFactory;
    import com.itextpdf.text.Phrase;
    import com.itextpdf.text.pdf.PdfPTable;

    public class PDFGenerator {

        //This method will generate PDF for Filter Result Screen (only DataTable level)
        @SuppressWarnings("unchecked")
        public static PdfPTable generatePDF(PdfPTable table, List filterResultDOList, List filterResultHeaderList) {

            //Initialize the table with number of columns required for the Datatable header
            int numberOfFilterLabelCols = filterResultHeaderList.size();

            //PDF Table Frame
            table = new PdfPTable(numberOfFilterLabelCols);

            //Getting Filter Detail Table Heading
            for (int i = 0; i < numberOfFilterLabelCols; i++) {
                ColumnHeader commandHeaderObj = filterResultHeaderList.get(i);
                table.addCell(commandHeaderObj.getLabel());
            }

            //Getting Filter Detail Data (Rows X Cols)
            FilterResultDO filterResultDOObj = filterResultDOList.get(0);
            List filterResultDataList = filterResultDOObj.getFilterResultLst();
            int numberOfFilterDataRows = filterResultDataList.size();

            //each row iteration
            for (int row = 0; row < numberOfFilterDataRows; row++) {
                List filterResultCols = filterResultDataList.get(row);
                int numberOfFilterDataCols = filterResultCols.size();

                //columns iteration of each row
                for (int col = 0; col < numberOfFilterDataCols; col++) {
                    String filterColumnsValues = (String) filterResultCols.get(col);
                    table.addCell(filterColumnsValues);
                }
            }
            return table;
        }//generatePDF
    }

PHP MYSQL FPDF retrieving pdf string stored as blob.

- by jj.amonit

Using the above technologies, I want to create a PDF, store it in my DB, and email it, all with the click of one button. I also want to be able to call it up and display it via a hyperlink. I am very new to FPDF, so I am trying to start off very slowly. I began with this link: stackoverflow Q. I put both parts of his code into the same page and also tried with separate pages. I made the suggested changes/additions and even did a line-by-line comparison. I still get the message "format error: not a PDF or corrupted". If I just call $pdf->Output(); the PDF displays. So it's either the way the string is being output, or it's the header() function. It's not the storage method, unless my column setup is incorrect. But a blob is a blob, right? If you want, I can upload the sanitized code. Just let me know what would help answer this. Thanks, JJ

UIView using Quartz rendering engine to display PDF has poor quality compared to original.

- by Josh Kerr

I'm using the Quartz rendering engine to display a PDF file on the iPhone using the 3.0 SDK. The result is a bit blurry compared to a PDF shown in a UIWebView. How can I improve the quality in the UIView so that I don't need to rewrite my app to use the UIWebView? I'm using code that is very close to the example code Apple provides. Here is some of my sample code:

    CGContextRef gc = UIGraphicsGetCurrentContext();
    CGContextSaveGState(gc);
    CGContextTranslateCTM(gc, 0.0, rect.size.height);
    CGContextScaleCTM(gc, 1.0, -1.0);
    CGAffineTransform m = CGPDFPageGetDrawingTransform(page, kCGPDFCropBox, rect, 0, false);
    CGContextConcatCTM(gc, m);
    CGContextSetGrayFillColor(gc, 1.0, 1.0);
    CGContextFillRect(gc, rect);
    CGContextDrawPDFPage(gc, page);
    CGContextRestoreGState(gc);

Apple's tutorial code actually results in a blurry PDF view as well. If you drop the same PDF into a UIWebView, you'll see it is actually sharper. Anyone have any ideas? This one issue is holding a two-year development project from launching. :(

Digitally sign MS Office (Word, Excel, etc..) and PDF files on the server

- by Sébastien Nussbaumer

I need to digitally sign MS Office and PDF files that are stored on a server. I really mean a digital signature that is integrated in the document, according to each specific file format. This is the process I had in mind:

1. Create a hash of the file's content.
2. Send the hash to a custom-written Java applet in the browser.
3. The user encrypts the hash with his/her private key (on a USB token via PKCS#11, for example), thus effectively signing the file.
4. The applet then sends the signature to the server.
5. On the server, I would then incorporate the signature in the file (MS Office and PDF files can do that without changing the file's content, probably by just setting some metadata field).

What is cool is that you never have to download and upload the complete file to the server again. What is even cooler, the customer doesn't need Office or PDF Writer to sign the files.

Parts 2, 3 and 4 are OK for me; my company bought all the Java technology I need for that for a previous project I worked on.

Problem: I can't seem to find any documentation/examples to do parts 1 and 5 for Office files. Are my Google skills failing me this time? Do you have any pointers to documentation or examples for doing that for MS Office files? The underlying technology isn't that important to me: I can use Java, .Net, COM; any working technology is OK!

Note: I'm 95% sure I can nail points 1 and 5 for PDF files using iText. Thanks.

Edit: If I can't do that with hashes and must download the complete file to the client, that is also possible. But then I still need the documentation to be able to sign Office files... in Java this time (from an applet).
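
As a minimal sketch of the first step only (hashing a file on the server), assuming SHA-256 and a plain whole-file digest: the function name `file_sha256` and the chunk size are illustrative choices, not something from the question. One caveat that matters here: embedded PDF and Office signatures are normally computed over specific parts of the document (a byte range that excludes the signature placeholder in PDF, selected package parts via XML signatures in Office Open XML), so a digest of the entire file is unlikely to be the value that ends up being signed.

```python
import hashlib


def file_sha256(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA-256 digest of the file at `path`.

    The file is read in chunks so that large documents do not have to be
    loaded into memory at once.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps calling read() until it returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The hex string returned here is what would travel to the client in step 2 of the process described above; which bytes must actually be hashed for each file format is the part the question is asking about and is not settled by this sketch.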

SQLAuthority News – SQL Server Wait Stats – eBook to Download on Kindle – Answer to FREE PDF Download Request

- by pinaldave

Being a book author is a completely new experience for me. I have yet to come across all the issues faced by expert book authors, and I assume that what seems interesting to me is routine for them. The SQL Server Wait Stats [Amazon] | [Flipkart] | [Kindle] book is my humble attempt to write a book. This is our very first experiment, and the book is an introduction to the subject of SQL Server Wait Stats; we will come up with a new version of the book later next year, when we have enough information for the SQL Server 2012 version.

Following are the top 2 requests that I keep receiving in emails, on blogs, Twitter, and Facebook:

- “Please send us FREE PDF of your book so we do not have to purchase it.”
- “If you can share with us the eBook (free and downloadable) format of your book, we will share it with everybody we know and you will get additional exposure.”

Here is my response to the above-mentioned requests:

- If you really need my book and cannot purchase it due to financial trouble, then feel free to let me know and I will purchase it myself and ship it to you.
- If you are in a country where the print book is not available, then you can buy the Kindle book, which is available online in any country, and you can read it on your computer and mobile devices. You DO NOT have to own a Kindle to read a Kindle format book. You can freely download the Kindle software for your desired platform and purchase the book online.

For the next 5 days, the Kindle book is available at 3.99 in the USA, and in other countries the price is anywhere between 3.99 and 5.99. The price will go up by USD 2 everywhere across the world after 1st November, 2011. Here is the link to download the Kindle software for free: PC, WP7, and in the marketplace for various other mobile devices.

I thank you for giving a warm response to the SQL Server Wait Stats book. I am motivated to write the next expanded version of this book.

Reference: Pinal Dave (http://blog.SQLAuthority.com)

How do I populate multiple records of data into a PDF form like a mail-merge?

- by user38801

I have Acrobat Pro, and I have a PDF with a form on it. Assuming the fields in the form correspond to a data source (like rows in an RDBMS table or xml file), I want to then print multiple copies of the PDF file, with each copy having the values of a different row in the data source. It is preferable to directly interface with an actual database, rather than having to save an XML file every time I do this. If this involves programming that's cool too, I only posted here because the question didn't seem appropriate for StackOverflow. Thanks!

How do I get a PDF link in a Word document to open in the default browser?

- by Tweek

I'm trying to create a Word document with links to resources on the web. If I create a hyperlink to a regular HTML file and click the link, it opens in my default browser (Google Chrome) as expected. However, if I click a link to a PDF file on a website, it opens in Internet Explorer. Before it opens, I also get the following prompt:

    Microsoft Office
    Opening http://www.example.com/example.pdf
    Some files can contain viruses or otherwise be harmful to your computer. It is important to be certain that this file is from a trustworthy source.
    Would you like to open this file?
    OK   Cancel

I'm using Office 2010, but I'm asking for a user who is using Office 2007 and is experiencing the same issue. (His default browser is Firefox.) We're both on Windows 7.

How do I send a PDF in a MemoryStream to the printer in .Net?

- by Ryan ONeill

I have a PDF created in memory using iTextSharp and contained in a MemoryStream. I now need to translate that MemoryStream PDF into something the printer understands. I've used Report Server in the past to render the pages to the printer format, but I can't use it for this project. Is there a native .Net way of doing this? For example, GhostScript would be OK if it were a .Net assembly, but I don't want to bundle any non-.Net stuff along with my installer. The PrintDocument class in .Net is great for sending content to the printer, but I still need to translate it from a PDF stream into GDI at the page level. Any good hints? Thanks in advance, Ryan

how to display a binary content of image/pdf in java script?

- by Ka-rocks

I have the binary content of an image or PDF in a JavaScript variable, downloaded from the server. The server indicates the type of the file. I have to display the content in its respective file format: if it is an image, I have to display the image; if it is a PDF, I have to open the content as a PDF; and so on. How do I parse the binary content and display it? I have searched for it, but I couldn't find an exact solution. I'm using the jQuery Mobile framework. Please help.
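
One way to frame the problem, sketched in Python to show only the logic (the question itself is about JavaScript): wrap the bytes in a `data:` URI labelled with the right MIME type, which can then be used as an image source or a link target. The question says the server already reports the type, so the signature check below is only a fallback; the names `sniff_mime_type` and `to_data_uri` and the short list of signatures are illustrative assumptions.

```python
import base64

# Leading bytes ("magic numbers") of a few common formats.
MAGIC_NUMBERS = [
    (b"%PDF-", "application/pdf"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
]


def sniff_mime_type(data: bytes, default: str = "application/octet-stream") -> str:
    """Guess a MIME type from the first bytes of `data`."""
    for magic, mime in MAGIC_NUMBERS:
        if data.startswith(magic):
            return mime
    return default


def to_data_uri(data: bytes, mime_type: str = "") -> str:
    """Build a base64 data: URI, preferring the caller-supplied MIME type."""
    mime = mime_type or sniff_mime_type(data)
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime};base64,{encoded}"
```

For example, `to_data_uri(content, "image/png")` uses the type reported by the server, while `to_data_uri(content)` falls back to the signature check. Whether a given mobile browser will render a PDF from such a URI is a separate question that this sketch does not answer.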

What's wrong with my svn:ignore pattern?

- by boris callens

I have the pattern

    svn:ignore datasheets/*/*.pdf

It is supposed to ignore all PDFs that are at an arbitrary depth under multiple "datasheets" directories under the current root folder. As an example, say I have a directory structure like this:

    Websites
      -web1
        -dataSheets
          -AT
            -ignore.pdf
          -BE
          -NL
            -ignore.pdf
          -FR
            -ignore.pdf
            -ignore2.pdf
        -licenseAgreements
          -important.pdf
      -web2
        -datasheets
          -etc

In this example the pattern needs to ignore all the ignore.pdf files without ignoring important.pdf too. The pattern shown still includes all my PDF files. I know there are a bunch of similar questions, but none of them seem to tackle the problem of the various hierarchy levels.
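
A sketch of one workaround, resting on the understanding that `svn:ignore` is a per-directory property whose patterns are matched only against names directly inside the directory it is set on (so a pattern containing `/` does not reach into subdirectories): generate one `svn propset` command for every directory at or below a "datasheets" directory. The helper names are hypothetical, and the case-insensitive match on the directory name is an assumption based on the `dataSheets`/`datasheets` spellings in the example tree.

```python
import os


def datasheet_dirs(root, marker="datasheets"):
    """Yield every directory under `root` that is, or lies below, a
    directory named `marker` (compared case-insensitively)."""
    marker = marker.lower()
    for dirpath, dirnames, _filenames in os.walk(root):
        # Skip Subversion's own metadata and keep the walk order stable.
        dirnames[:] = sorted(d for d in dirnames if d != ".svn")
        rel = os.path.relpath(dirpath, root)
        parts = [] if rel == "." else rel.split(os.sep)
        if marker in (part.lower() for part in parts):
            yield dirpath


def propset_commands(root, pattern="*.pdf"):
    """Return one 'svn propset svn:ignore' command line per matching directory."""
    return [
        f'svn propset svn:ignore "{pattern}" "{directory}"'
        for directory in datasheet_dirs(root)
    ]
```

Printing `propset_commands("Websites")` gives a list of commands to review before running. Two things to keep in mind: `svn propset` replaces any `svn:ignore` value already set on a directory, and ignore patterns only affect unversioned files, so PDFs that are already under version control are not hidden by this.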

New User of UPK?

- by Kathryn Lustenberger

The UPK Developer comes with a variety of manuals to help support your organization in the development and deployment of content. The Developer manuals can be found in the \Documentation\Language Code\Reference folder where the Developer has been installed. As of 3.5.x, the documentation can also be accessed via the Start menu: Start\Programs\User Productivity Kit\Documentation\Reference.

- Content Deployment.pdf: This manual provides information on how to deploy content to your audience.
- Content Development.pdf: This manual provides information on how to create, maintain, and publish content using the Developer. The content of this manual also appears in the Developer help system.
- Content Player.pdf: This manual provides instructions on how to view content using the Player. The content of this manual also appears in the Player help system.
- In-Application Support Guide.pdf: This manual provides information on how to implement content-sensitive, in-application support for enterprise applications using Player content.
- Installation & Administration.pdf: This manual provides instructions for installing the Developer in a single-user or multi-user environment, as well as information on how to add and manage users and content in a multi-user installation. An Administration help system also appears in the Developer for authors configured as administrators. This manual also provides instructions for installing and configuring Usage Tracking.
- Upgrade.pdf: This manual provides information on how to upgrade from a previous version to the current version.
- Usage Tracking Administration & Reporting.pdf: This manual provides instructions on how to manage users and usage tracking reports.

- Kathryn Lustenberger, Oracle UPK Outbound Product Management

Search Results

Search found 4479 results on 180 pages for 'pdf scraping'.
