Search Results

Search found 4479 results on 180 pages for 'pdf scraping'.

vector quality of svg and pdf

- by Kasper

I'm converting PDF files to SVG because SVG files are easier to use on web pages. I first assumed the quality of SVG would be similar to PDF, as both are vector graphics. However, now that I look more closely, the PDF seems slightly superior: (https://dl.dropboxusercontent.com/u/58922976/Photos/1.png) I wonder if I could change this in some way. Is this because PDF vectors are simply better quality? Or is it because Chrome renders SVG at lower quality than Adobe Reader renders PDF? Is there a setting in the SVG file that I could change? Here is the PDF file: https://dl.dropboxusercontent.com/u/58922976/syllabusLinAlg2012.59.pdf And here is the SVG file: (https://dl.dropboxusercontent.com/u/58922976/syllabusLinAlg2012.59.svg) I made this SVG file in Illustrator, and only Chrome is able to use the embedded SVG fonts, so Firefox and Internet Explorer won't give the expected result.

Read the article
Merge two PDF files containing even and odd pages of a book

- by Yurij73

I have two searchable PDF documents, say even.pdf and odd.pdf which contain even and odd pages of a book, respectively. I can decompile each PDF to separate files 001.pdf 002.pdf 003.pdf, et cetera. The question is how to merge them? They are both even and odd sequences numbered 1, 2, 3. If the numbering in the decompile process with pdftk were different, e.g. 1, 3, 5 for even and 2, 4, 6 for odd instead of 1, 2, 3, 4, I could simply merge them. Can I do this any other way?
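
The interleaving step described in the question can be sketched as follows. This is a minimal sketch, assuming both PDFs have already been split into per-page files numbered 001, 002, 003; the directory names `odd/` and `even/` and the helper `interleave` are hypothetical and not from the question. It only computes the book order of the page files, and the resulting list would then be passed to whatever tool performs the concatenation.

```python
from itertools import zip_longest


def interleave(odd_pages, even_pages):
    """Return pages in book order: odd[0], even[0], odd[1], even[1], ..."""
    merged = []
    for odd, even in zip_longest(odd_pages, even_pages):
        # zip_longest pads the shorter list with None, so a book with an
        # odd total page count (one more odd page than even) still works.
        if odd is not None:
            merged.append(odd)
        if even is not None:
            merged.append(even)
    return merged


# Hypothetical per-page files produced by splitting odd.pdf and even.pdf.
odd_pages = [f"odd/{n:03d}.pdf" for n in range(1, 4)]
even_pages = [f"even/{n:03d}.pdf" for n in range(1, 3)]

book_order = interleave(odd_pages, even_pages)
print(book_order)
```

Keeping the two sequences in separate directories sidesteps the problem that both are numbered 1, 2, 3: the order comes from position in each list, not from the file names.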

Read the article
PDF Encrypted, Hidden Watermark

- by Dave Jarvis

Background

Using LaTeX to write a book. When a user purchases the book, the PDF will be generated automatically.

Problem

The PDF should have a watermark that includes the person's name and contact information.

Question

What software meets the following criteria:

- Applies encrypted, undetectable watermarks to a PDF
- Open Source
- Platform independent (Linux, Windows)
- Fast (marks a 200 page PDF in under 1 second)
- Batch processing (exclusively command-line driven)
- Collusion-attack resistant
- Non-fragile (e.g., PDF - EPS - PDF still contains the watermark)
- Well documented (shows example usages)

Ideas & Resources

Some thoughts and findings:

- Natural language processing (NLP) watermarks.
- Apply steganography on a randomly selected image. http://openstego.sourceforge.net/cmdline.html

The problem with NLP is that grammatical errors can be introduced. The problem with steganography is that the images are sourced from an image cache, and so recreating that cache with watermarked images will impart a delay when generating the PDF (I could just delete one image from the cache, but that's not an elegant solution).

Thank you!

Read the article
Converting PDF eBooks into a Kindle format

- by Ender

Over the past couple of years I've amassed quite a collection of guides, tutorials and ebooks in PDF format. A lot of these are quite useful for work, especially PDF documentation, and rather than having to be at a computer every time I want to read how to do something in Sitecore or read through a software testing ebook, I'd like to do it on my brand-spanking-new Kindle. However, even though there is now a native PDF reader on the Kindle, the nature of PDFs makes them practically unreadable. The text doesn't wrap because of how PDFs are sized, and after a bunch of Google searches I've yet to find a viable way to convert my PDFs into a readable Kindle format. Sometimes these books have code or pictures/tables in them, but most of the time they're text-heavy, and to be honest I'd be surprised if there wasn't a free tool to convert PDF to one of the (seemingly many) Kindle formats. So, can anyone help me out with this?

Read the article
Sharepoint .PDF contents displaying as 'searchtext.xml' in searches

- by Green Muffins

Hi Experts, I recently installed an IFilter in my SharePoint farm to enable searching the contents of .pdf documents. All went well, except that if I search for the contents of any .pdf file, the matches appear in the search results with the document title "searchtext.xml", and the link to the document gives a giant page of the .pdf contents in an XML-looking browser page. I have added the .pdf file type to the search, so I am unsure why it is reading them incorrectly. If I search for a .pdf document title such as 'document.pdf', it displays the result as an HTML page, though the link does lead to a readable .pdf file. Any help?

Read the article
merge two parts of pdf in one

- by Yurij73

I have two searchable pdf documents, say even.pdf and odd.pdf, which contain respectively the even and odd pages of a book. I can decompile each pdf to separate files 001.pdf 002.pdf 003.pdf .... The question is how to merge them? Both the even and the odd sequences are numbered 1,2,3. If the decompile stage with pdftk numbered them differently, 1,3,5 for one and 2,4,6 for the other instead of the existing order 1,2,3,4.., I could simply merge them, but I don't know how to get this numbering with pdftk. Maybe I need to do the task another way?

Read the article
Dynamic PDF Generation from HTML form tool

- by user289833

I would like to create a service that would take 2 fields (name & company name) from an HTML form and place it in a PDF document (a completion certificate that the user can print/save etc.) How would you recommend doing this?

Read the article
Conversion html/pdf

- by Guido

Hi, I would like to know the command to use in a PHP script to convert HTML output to PDF or any other non-modifiable file format. Thanks

Read the article
Using "Microsoft Save as PDF" add-in programmatically without installing Word

- by jayrdub

The sample code in this article for creating a PDF from a Word doc works great if you have Word installed on the machine. http://msdn.microsoft.com/en-us/library/bb412305.aspx I'm curious whether it is possible to do this without having to install Word.

Read the article
pdf <pre> equivalent

- by ddowns

I'm trying to put some code examples in a PDF, but copying them out messes up the formatting and rearranges the lines, so a lot of manual cleanup is needed after pasting. Is there an equivalent to HTML's pre for PDFs? Something that says: for "this" block of text, respect line breaks and spacing, and copy as plain text exactly as it's shown. The closest thing I can see is adding note annotations next to every code example.

Read the article
Drupal print module: controlling layout in PDF files?

- by WhyKiki

So I'm trying to fix the layout of the PDF files for a specific content type. I'm messing around in the tpl.php file and can't find a way to modify individual fields. There's just the massive $print['content'] variable that contains all of the page content. So, is there a way to access each field?

Read the article
How to convert pdf to png ?

- by lisyqiao

PDF to PNG Converter (www.pdftopng.net) is professional software that can convert PDF to PNG with high quality and fast speed. Besides, Free PDF to PNG can convert PDF to other image formats such as JPG, GIF, TIFF, BMP and so on. What's more, PDF to PNG lets you customize your output settings, including output type, output color and page range.

Read the article
Co-ordinates of an element in a pdf file using iText

- by Arun P Johny

Hi all, I'm creating a PDF file using the BIRT reporting library. Later I need to digitally sign these files, and I'm using iText to do so. The issue I'm facing is that I need to place the signature in different places in different reports. I already have the code to digitally sign the document; for now I'm always placing the signature at the bottom of the last page in every report. Eventually I need each report to specify where the signature should go. Then I have to read that location using iText and place the signature there. Is this possible to achieve using BIRT and iText? Thanks

Read the article
Insert PDF image in MS Word

- by serhio

Hello. I have a .doc which I will convert to PDF. This .doc contains an image. When I convert the .doc to PDF and then zoom in, the image becomes ugly and pixelated. I found a tool that converted my bitmap .png image to a vector .PDF image. Now how can I import the PDF image into MS Word (which I will finally convert to PDF once again)?

Read the article
Shell extension to display thumbnails of PDF files

- by hamilton

Foxit PDF doesn't have a shell extension to display thumbnails of PDF files in Windows Explorer (so that thumbnails are shown instead of PDF document icons). Is there a shell extension that does that? BTW, PDFXchange and Adobe both have such a shell extension.

Read the article
Rotate PDF document

- by Rogier

We have created thousands of PDF files that are printed as labels on a special label printer. Printing these labels works, but some of the label paper is turned a quarter turn, so those PDFs are printed incorrectly. It is possible to rotate the page before printing. But is it possible to rotate a PDF file and save it again as a PDF file? And since there are thousands of PDF files, is it also possible to do this in a batch program?

Read the article
Quarter turn pdf document

- by Rogier

We have created thousands of pdf files that are printed as labels on a special label printer. Printing these labels works, but some of the label paper is turned a quarter turn, so those pdfs are printed incorrectly. It is possible to rotate the page before printing. But is it possible to rotate a pdf file and save it again as a pdf file? And since there are thousands of pdf files, is it also possible to do this in a batch program?

Read the article
JQGrid PDF Export

- by thanigai

Originally posted on: http://geekswithblogs.net/thanigai/archive/2013/06/17/jqgrdi-pdf-export.aspx

JQGrid PDF Export

The aim of this article is to address PDF export from client-side grid frameworks. The solution uses ASP.NET MVC 4 and Visual Studio 2012. The article assumes the developer has a fair amount of knowledge of ASP.NET MVC and C#.

Tools used:

- Visual Studio 2012
- ASP.NET MVC 4
- NuGet Package Manager

JQGrid is a client-side grid framework built on top of jQuery. It helps in building a good-looking grid with paging, sorting and editing options. Other features are available as extension plugins, and developers can write their own if needed. You can download JQGrid from the JQGrid homepage or as a NuGet package. To install it through the Package Manager Console, select "Library Package Manager" from the Tools menu and then select "Package Manager Console". The install command pulls down the latest JQGrid package and adds it to the Scripts folder.

Once the script is downloaded and referenced in the project, update the BundleConfig file to add the script reference to the pages. BundleConfig can be found in the App_Start folder of the project.

    bundles.Add(new StyleBundle("~/Content/jqgrid").Include("~/Content/ui.jqgrid.css"));
    bundles.Add(new ScriptBundle("~/bundles/jquerygrid").Include("~/Scripts/jqGrid/jquery.jqGrid*"));

Once the config is updated, reference the bundles in Views/Shared/LayoutPage.cshtml. Add the following line to the head section of the page.

    @Styles.Render("~/Content/jqgrid")

Add the following lines to the end of the page, before the closing html tag.

    @Scripts.Render("~/bundles/jquery")
    @Scripts.Render("~/bundles/jqueryui")
    @Scripts.Render("~/bundles/jquerygrid")

That is all that needs to be done from the view perspective. Once these steps are done the developer can start coding for JQGrid.

In this example we will modify the HomeController for the demo. The Index action is the default action. We add one argument to it, a nullable bool, just to mark the PDF request. In Index.cshtml we add a table tag with the id "gridTable" and use this table for the grid. Since JQGrid is an extension for jQuery, we initialize the grid settings in the script section of the page. The script section is placed at the end of the page, just below the bundle references for jQuery and jQuery UI, to improve performance. This is one of the improvements recommended by Yahoo's YSlow.

    <table id="gridTable" class="scroll"></table>
    <input type="button" value="Export PDF" onclick="exportPDF();" />

    @section scripts {
    <script type="text/javascript">
        $(document).ready(function () {
            $("#gridTable").jqGrid({
                datatype: "json",
                url: '@Url.Action("GetCustomerDetails")',
                mtype: 'GET',
                colNames: ["CustomerID", "CustomerName", "Location", "PrimaryBusiness"],
                colModel: [
                    { name: "CustomerID", width: 40, index: "CustomerID", align: "center" },
                    { name: "CustomerName", width: 40, index: "CustomerName", align: "center" },
                    { name: "Location", width: 40, index: "Location", align: "center" },
                    { name: "PrimaryBusiness", width: 40, index: "PrimaryBusiness", align: "center" }
                ],
                height: 250,
                autowidth: true,
                sortorder: "asc",
                rowNum: 10,
                rowList: [5, 10, 15, 20],
                sortname: "CustomerID",
                viewrecords: true
            });
        });

        // Request the Index action with pdf=true to trigger the PDF download.
        function exportPDF() {
            document.location = '@Url.Action("Index")?pdf=true';
        }
    </script>
    }

The exportPDF method just sets the document location to the Index action method with the pdf Boolean set to true, to mark the request as a PDF download. An in-memory list is used for demo purposes. The GetCustomerDetails method is the server-side action method that provides the data as a JSON list. The method is explained below.

    [HttpGet]
    public JsonResult GetCustomerDetails()
    {
        var result = new
        {
            total = 1,
            page = 1,
            records = customerList.Count(),
            rows = customerList.Select(e => new
            {
                id = e.CustomerID,
                cell = new string[] { e.CustomerID.ToString(), e.CustomerName, e.Location, e.PrimaryBusiness }
            }).ToArray()
        };
        return Json(result, JsonRequestBehavior.AllowGet);
    }

JQGrid understands response data from the server only in a certain format. The server method shown above takes care of formatting the response so that JQGrid reads the data properly. The response should contain the total number of pages, the current page, the full record count, and the rows of data, each with an id and the remaining columns as a string array. The response is built using an anonymous object and sent as an MVC JsonResult. Since we are using HTTP GET, it is better to mark the action with the HttpGet attribute and set the JSON request behavior to AllowGet. The in-memory list is initialized in the HomeController constructor for reference.

    public class HomeController : Controller
    {
        private readonly IList<CustomerViewModel> customerList;

        public HomeController()
        {
            customerList = new List<CustomerViewModel>()
            {
                new CustomerViewModel { CustomerID = 100, CustomerName = "Sundar", Location = "Chennai", PrimaryBusiness = "Teaching" },
                new CustomerViewModel { CustomerID = 101, CustomerName = "Sudhagar", Location = "Chennai", PrimaryBusiness = "Software" },
                new CustomerViewModel { CustomerID = 102, CustomerName = "Thivagar", Location = "China", PrimaryBusiness = "SAP" },
            };
        }

        public ActionResult Index(bool? pdf)
        {
            if (!pdf.HasValue)
            {
                return View(customerList);
            }
            else
            {
                // Map the virtual path to a physical file inside the Content folder.
                string filePath = Server.MapPath("~/Content/Sample.pdf");
                ExportPDF(customerList, new string[] { "CustomerID", "CustomerName", "Location", "PrimaryBusiness" }, filePath);
                return File(filePath, "application/pdf", "list.pdf");
            }
        }
    }

The Index action method has a Boolean argument named "pdf", used to indicate a PDF download. When the application starts, this method is hit first for the initial page request. For the PDF operation a file name is generated and passed to the ExportPDF method, which takes care of generating the PDF from the data source. The ExportPDF method is listed below.

    private static void ExportPDF<TSource>(IList<TSource> customerList, string[] columns, string filePath)
    {
        Font headerFont = FontFactory.GetFont("Verdana", 10, Color.WHITE);
        Font rowfont = FontFactory.GetFont("Verdana", 10, Color.BLUE);
        Document document = new Document(PageSize.A4);
        PdfWriter writer = PdfWriter.GetInstance(document, new FileStream(filePath, FileMode.OpenOrCreate));
        document.Open();
        PdfPTable table = new PdfPTable(columns.Length);

        // Header row: one cell per column name.
        foreach (var column in columns)
        {
            PdfPCell cell = new PdfPCell(new Phrase(column, headerFont));
            cell.BackgroundColor = Color.BLACK;
            table.AddCell(cell);
        }

        // Data rows: read each named property from the item via reflection.
        foreach (var item in customerList)
        {
            foreach (var column in columns)
            {
                string value = item.GetType().GetProperty(column).GetValue(item).ToString();
                PdfPCell cell5 = new PdfPCell(new Phrase(value, rowfont));
                table.AddCell(cell5);
            }
        }

        document.Add(table);
        document.Close();
    }

iTextSharp is one of the pioneers in PDF export. It is an open-source library readily available as a NuGet package; installing it through NuGet pulls down the latest available version. I am using version 4.1.2.0; the latest version may differ. There are three main things in this library.

- Document: the class that takes care of creating the document sheet with a particular size. We have used A4 size. There is also an option to define the rectangle size. This document instance is used by the following methods for reference.
- PdfWriter: takes the file name and the document as references. This class enables the document class to generate the PDF content and save it to a file.
- Font: using the Font class the developer can control the font features. Since I need a nice-looking font, I am using Verdana.

Following these, PdfPTable and PdfPCell are used for generating the normal table layout. We have created two fonts, one for the header and one for the rows.

    Font headerFont = FontFactory.GetFont("Verdana", 10, Color.WHITE);
    Font rowfont = FontFactory.GetFont("Verdana", 10, Color.BLUE);

We get the header columns as a string array. The columns array is looped over and the header is generated, using headerFont for this purpose.

    PdfWriter writer = PdfWriter.GetInstance(document, new FileStream(filePath, FileMode.OpenOrCreate));
    document.Open();
    PdfPTable table = new PdfPTable(columns.Length);
    foreach (var column in columns)
    {
        PdfPCell cell = new PdfPCell(new Phrase(column, headerFont));
        cell.BackgroundColor = Color.BLACK;
        table.AddCell(cell);
    }

Then reflection is used to generate the row-wise details and form the grid.

    foreach (var item in customerList)
    {
        foreach (var column in columns)
        {
            string value = item.GetType().GetProperty(column).GetValue(item).ToString();
            PdfPCell cell5 = new PdfPCell(new Phrase(value, rowfont));
            table.AddCell(cell5);
        }
    }
    document.Add(table);
    document.Close();

Once the process is done, the PDF table is added to the document and the document is closed, which writes all the changes to the given file path. Control then returns to the controller, which sends the response as a file result with a file name. If the file name is not given, the PDF opens in the same page; otherwise a popup opens asking whether to save or open the file.

    return File(filePath, "application/pdf", "list.pdf");

Conclusion: This is how PDF export is done for JQGrid. The problem addressed here is that client-side grid frameworks won't support PDF export. In that case it is better to have fine-grained control over the data and the generated PDF. iTextSharp has helped us achieve our goal.
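
The response shape the article describes for the grid (total pages, current page, record count, and rows with an id plus a string array of cells) can be illustrated outside ASP.NET. The following is a minimal sketch in Python, not the article's implementation: the `Customer` dataclass and the `grid_response` helper are hypothetical names, and unlike the article's action, which hard-codes `total` and `page` to 1, this version derives the page count from the row limit as an assumption about how paging would normally be wired up.

```python
import json
import math
from dataclasses import astuple, dataclass


@dataclass
class Customer:
    customer_id: int
    customer_name: str
    location: str
    primary_business: str


def grid_response(customers, page=1, rows_per_page=10):
    """Build a payload with total pages, current page, record count and rows."""
    total_pages = max(1, math.ceil(len(customers) / rows_per_page))
    start = (page - 1) * rows_per_page
    page_items = customers[start:start + rows_per_page]
    return {
        "total": total_pages,
        "page": page,
        "records": len(customers),
        # Each row carries an id and every column rendered as a string.
        "rows": [
            {"id": c.customer_id, "cell": [str(v) for v in astuple(c)]}
            for c in page_items
        ],
    }


customers = [
    Customer(100, "Sundar", "Chennai", "Teaching"),
    Customer(101, "Sudhagar", "Chennai", "Software"),
    Customer(102, "Thivagar", "China", "SAP"),
]

payload = grid_response(customers, page=1, rows_per_page=2)
print(json.dumps(payload, indent=2))
```

With three records and two rows per page, the payload reports two pages and returns only the first two rows, which is the information the grid needs to draw its pager.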

Read the article
How to dynamically generate PDF documents

- by Thomas

I want to build a web application for generating stylish PDF documents. The layout should be based on design templates and the data should come dynamically from the database. Ideally I want to design the template in a "publishing-like" tool with placeholders, and have the web application replace these placeholders with the data from the database. Think of something like an invoice generator, where a customer can choose from different invoice templates and the invoice data itself comes from the DB. Thanks for your ideas!

Read the article
Draw text on a loaded pdf file with Zend Framework

- by Rick de Graaf

Hello, I'm trying to load an existing pdf file and fill it with database information. Loading the file works, but writing data to the loaded page does not: no text is written to it. If I add a new page and use a foreach to apply drawing to all pages, all added pages are written, except for the loaded one. Below is the code I'm using:

    $pdf = Zend_Pdf::load('./documents/agreements/_root/gegevens.pdf'); // Load pdf
    $pdf->pages = array_reverse($pdf->pages); // reverse pages
    $pdf->pages[] = new Zend_Pdf_Page(Zend_Pdf_Page::SIZE_A4); // Add a page (A4)
    $font = Zend_Pdf_Font::fontWithName(Zend_Pdf_Font::FONT_HELVETICA); // Set font
    foreach ($pdf->pages as $page) // Apply settings+text to every page (total of 2)
    {
        $page->setFont($font, 36);
        $page->setAlpha(0.25);
        $page->drawText('LALALALALALALA', 62, 260, 'UTF-8');
    }
    $pdf->save('./documents/agreements/Gegevens_'.$this->school_id.'.pdf'); // Save file

Read the article
PHPForm Generate PDF Send to Email

- by tom

I'm a beginner in PHP and was wondering whether this is easy to do or whether I'd have to outsource it to a programmer. Basically, when a user fills in the PHP form and submits it, I need it to be generated as a PDF that is then emailed as an attachment to MY email address, NOT to the user who submitted the form. I have looked at TCPDF and FPDI, but I don't think either of those scripts allows this, because from what I've heard they generate a download link for the user, which is not what I need. If anyone can help me it would be greatly appreciated. Regards Tom

Read the article
A PDF viewer for large margins in fullscreen

- by jmn

I am looking for a way to pleasantly read PDF files on my widescreen (22" 1680x1050) monitor. My problem with all the PDF viewer applications I have tried is that they do not handle wide and high margins well. If I go to fullscreen mode in my viewer and zoom in so that the extra margins are cropped, I can view the pages nicely; the annoyance, however, is that I have to reposition the page every time I navigate to another one. I am sure there must be a way for a PDF viewer to solve this problem, and perhaps there is one you know of? I am aware of something called PDF Reflow in Acrobat Reader, but that only works with certain specific (tagged) files. I want a PDF viewer with a smarter zoom/next page function or an automatic margin-crop function. Is there such a thing?

Read the article
Converting DOCX files to PDF via SSH without losing formatting

- by Reado

I'm struggling to find a solution that will allow me to convert a DOCX file to a PDF without losing or malforming the formatting of the document on CentOS 5.7. I have tried CUPS-PDF but it doesn't work; spool files appear in the /var/spool folder but nothing happens after that. OpenOffice and LibreOffice converted a DOCX to PDF but the formatting was all wrong. However if I print the DOCX to a Windows PDF printer from my Windows 7 workstation, it outputs to PDF absolutely fine. So why can't Linux do the same? I tried to print via CUPS to the Windows PDF printer (shared) but the document appears in the queue as "Remote Downlevel Document" and doesn't print. This only happens when I print from Linux.

Read the article
Save a PDF created with FPDF php library in a MySQL blob field

- by Davide Gualano

I need to create a pdf file with the fpdf library and save it in a blob field in my MySQL database. The problem is, when I try to retrieve the file from the blob field and send it to the browser for download, the downloaded file is corrupted and does not display correctly. The same pdf file is correctly displayed if I send it immediately to the browser without storing it in the db, so it seems some of the data gets corrupted when it is inserted in the db. My code is something like this:

    $pdf = new MyPDF(); //class that extends FPDF and creates the pdf file
    $content = $pdf->Output("", "S"); //return the pdf file content as string
    $sql = "insert into mytable(myblobfield) values('".addslashes($content)."')";
    mysql_query($sql);

to store the pdf, and like this:

    $sql = "select myblobfield from mytable where id = '1'";
    $result = mysql_query($sql);
    $rs = mysql_fetch_assoc($result);
    $content = stripslashes($rs['myblobfield']);
    header('Content-Type: application/pdf');
    header("Content-Length: ".strlen(content));
    header('Content-Disposition: attachment; filename=myfile.pdf');
    print $content;

to send it to the browser for downloading. What am I doing wrong? If I change my code to:

    $pdf = new MyPDF();
    $pdf->Output(); //send the pdf to the browser

the file is correctly displayed, so I assume that it is correctly generated and the problem is in the storing in the db. Thanks in advance.
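
One common way to rule out escaping as the source of corruption is to bind the binary data as a query parameter instead of escaping it by hand, and then compare the stored bytes with the original. The following is a minimal sketch of that check in Python, with the standard-library `sqlite3` module standing in for MySQL purely so the example is self-contained (an assumption, as is the sample byte string); the table and column names mirror the question. It illustrates the round-trip test and does not diagnose the PHP code above.

```python
import sqlite3

# Stand-in for the generated PDF: arbitrary bytes that include quotes,
# backslashes and a NUL, the characters manual escaping tends to disturb.
content = b"%PDF-1.4\n\x00\\'\"\xff binary payload \\\\ %%EOF"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, myblobfield BLOB)")

# Parameter binding hands the bytes to the driver unchanged; no escaping
# on the way in, so nothing needs to be unescaped on the way out.
conn.execute("INSERT INTO mytable (myblobfield) VALUES (?)", (content,))
conn.commit()

(stored,) = conn.execute(
    "SELECT myblobfield FROM mytable WHERE id = 1"
).fetchone()

# A byte-for-byte comparison shows whether storage altered the data.
round_trip_ok = bytes(stored) == content
print(len(stored), round_trip_ok)
conn.close()
```

If the same comparison fails in the real application, the difference between the original and retrieved lengths is usually the quickest hint about which step changed the data.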

Read the article
