Search Results

Search found 7251 results on 291 pages for 'pdf parsing'.


Obey the MediaBox/CropBox in PDF when using ghostscript to render a PDF to a png

- by gordonwatts

I've been using ghostscript to convert my single-figure plots rendered in PDF to png:

    gswin32c -sDEVICE=png16m -r300x300 -sOutputFile=junk.png -dBATCH -dNOPAUSE Figure_001-a.pdf

This works in the sense that I get a png out and it contains the plot. But it contains a huge amount of white space as well (an example source image: http://cdsweb.cern.ch/record/1258681/files/Figure_001-a.pdf). If you view it in Acrobat you'll note there is no white space around the plot. If you use the above command line you'll find the plot takes up only about 1/3 of the space. When doing the same thing with an eps file I run into the same problem. However, there is the command-line parameter -dEPSCrop that one can pass to get the PS rendering engine to pay attention to the BoundingBox. I need a similar argument for rendering PDFs. I was not able to find it in the docs (nor even EPSCrop, actually).
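
A minimal sketch of the command line described above, written in Python so the argument list is easy to inspect. It assumes the `-dUseCropBox` switch, which Ghostscript documents for PDF input as the counterpart of `-dEPSCrop`; whether it removes the margin depends on the CropBox stored in the file. The function name `build_gs_png_command` is hypothetical.

```python
def build_gs_png_command(pdf_path, png_path, dpi=300,
                         use_crop_box=True, gs_binary="gswin32c"):
    """Build (but do not run) a Ghostscript command that renders a PDF to PNG."""
    cmd = [
        gs_binary,
        "-sDEVICE=png16m",
        f"-r{dpi}x{dpi}",
        f"-sOutputFile={png_path}",
        "-dBATCH",
        "-dNOPAUSE",
    ]
    if use_crop_box:
        # Assumption: the page's CropBox is tighter than its MediaBox.
        cmd.append("-dUseCropBox")
    # The input file goes last, after all switches.
    cmd.append(pdf_path)
    return cmd


if __name__ == "__main__":
    # Print the command; pass the list to subprocess.run(cmd, check=True) to execute it.
    print(" ".join(build_gs_png_command("Figure_001-a.pdf", "junk.png")))
```

Keeping the command as a list means it can be handed to `subprocess.run` without shell quoting problems.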

Convert PDF to PDF/A-1

- by AZtec

I know this probably is not strictly a programming question (well, maybe it is, I don't know), but I'm having serious problems trying to convert a regular PDF (with hyperlinks, bookmarks, images, embedded fonts, etc.) into PDF/A-1 format. I get all kinds of errors when I check it with pdfaPilot. How can I prepare a PDF so that no problems occur when I try to convert it to PDF/A-1? Most problems can be fixed with pdfaPilot, but apparently not all. One of the problems I get is with the XMP metadata, which is "not properly defined". What exactly does this mean, and can I do something to prevent it? Another one is: "Syntax problem: Array with more than 8191 elements" (I hope this one is solvable). I hope someone can help me out here, since I'm in a tight spot right now with deadlines that are killing me.

Free Java library for converting existing PDF to PDF/A

- by Shervin

Hi. I am trying to convert PDF to PDF/A. Currently I can do this using the OpenOffice PDF viewer plugin together with Jodconverter 2, but this is pretty cumbersome. Does anybody know of any open source / free Java libraries I can use to do this?

PDF files made using inkscape do not show everything when opened in windows

- by Manu P Nair

I made a small vector graphic using Inkscape and converted it to PDF. Then I opened the PDF in Windows for printing purposes. Many of the lines and curves I made in Inkscape were missing. Then I tried the same graphic in CorelDRAW, converted it to PDF, and opened the file in Ubuntu. All lines and curves were there. I want to use Ubuntu for all my work, but this problem makes it difficult for me, as I have to take the PDF to a printer who works only with Windows.

firefox addon to save web page as pdf [closed]

- by Jayapal Chandran

Is there a Firefox addon to save a web page as a PDF file? I want a free service if available. In Chrome, "Save as PDF" works after pressing Ctrl + P, but this is not available in Firefox. You may ask why I don't use Chrome: I am using YSlow to generate reports, and YSlow does not show the printable view option in Chrome, whereas in Firefox it does. But Firefox does not have print/save as PDF, while Chrome does.

Auto convert odt to pdf

- by Gautam K

I am creating a few documents in LibreOffice and I always have to send them as .pdf, but every time I forget to export to PDF. Is there any way to auto-convert the .odt document to PDF every time I save it? I have only about 4 docs and I keep making changes to them, so every time I make a change and save the .odt, I need that change to be reflected in the corresponding PDF file. PS: I understand that unoconv can be used to convert via the command line, but is there a way to do it automatically? Another PS: I found out that there is something called inotify (and inotify-tools) that can be used to trigger events when a file changes, but I have no idea how to use it.
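
One way to picture the "convert on save" idea is a small Python sketch. It deliberately swaps inotify for a simpler technique, comparing file modification times, so it needs nothing beyond the standard library. The helper names are hypothetical, and the command assumes `unoconv -f pdf <file>` is installed and on the PATH.

```python
from pathlib import Path


def pdf_is_stale(odt_path):
    """True when the PDF next to the .odt is missing or older than the .odt."""
    odt_path = Path(odt_path)
    pdf_path = odt_path.with_suffix(".pdf")
    if not pdf_path.exists():
        return True
    return odt_path.stat().st_mtime > pdf_path.stat().st_mtime


def stale_documents(folder):
    """List the .odt files in `folder` whose PDF needs to be regenerated."""
    return sorted(p for p in Path(folder).glob("*.odt") if pdf_is_stale(p))


def unoconv_command(odt_path):
    """Command that would convert one document (assumes unoconv is installed)."""
    return ["unoconv", "-f", "pdf", str(odt_path)]


if __name__ == "__main__":
    # Print what would be converted; pass each command to subprocess.run to act on it.
    for document in stale_documents("."):
        print(" ".join(unoconv_command(document)))
```

Run from a scheduler or a loop with a short sleep, this approximates the inotify behaviour for a handful of files; an event-based watcher would react faster but needs extra tooling.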

How to organize my 1000s of PDF?

- by mmb

I have a huge collection of PDFs. It consists mostly of research papers and self-created documents, but also of scanned documents. Right now I drop them all in one folder and give them precise names with tags in the filename. But even that gets impractical, so I am looking for a PDF library management application. I am thinking of something like Yep for Mac, with the following features:
- PDF cover browsing (with large preview, larger than Nautilus allows)
- tagging of PDFs (data should be readable cross-platform)
- possibility to share across a network (thus flat files rather than a database)
- if possible: cross-platform
Mendeley seemed to be a good choice, but I don't only have academic papers and I don't want to fill in all the metadata that is required there. The only alternative I could find thus far is Shoka, but its features are limited and development seems to have stopped already.

Parsing scripts that use curly braces

- by Keikoku

To get an idea of what I'm doing, I am writing a python parser that will parse directx .x text files. The problem I have deals with how the files are formatted. Although I'm writing it in python, I'm looking for general algorithms for dealing with this sort of parsing. .x files define data using templates. The format of a template is

    template_name {
        [some_data]
    }

The goal I have is to parse the file line by line, and whenever I come across a template, deal with it accordingly. My initial approach was to check if a line contains an opening or closing brace. If it's an open brace, then I check what the template name is. Now the catch here is that the open brace doesn't have to occur on the same line as the template name. It could just as well be

    template_name
    {
        [some_data]
    }

So if I were to use my "open brace exists" criterion, it won't work for any files that use the latter format. A lot of languages also use curly braces (though I'm not sure when people would be parsing the scripts themselves), so I was wondering if anyone knows how to accurately get the template name (or in some other languages, it could just as well be a function name, though there aren't any keywords to look for).
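
A common way around the line-break problem described above is to stop working line by line and split the whole text into tokens, so that a name and its opening brace look the same whether or not a newline sits between them. The sketch below is a generic illustration, not a full .x parser: it ignores comments, strings and the specifics of the .x grammar, and `find_blocks`, `Mesh` and `Frame` are hypothetical example names.

```python
import re

# A token is either a single brace or a run of characters that is
# neither whitespace nor a brace.
TOKEN_RE = re.compile(r"[{}]|[^\s{}]+")


def find_blocks(text):
    """Return (name, depth) for every opening brace, in order of appearance.

    `name` is the token immediately before the brace (None if there is none)
    and `depth` is the nesting level at which the block opens.
    """
    blocks = []
    depth = 0
    previous = None  # last non-brace token seen since the last brace
    for token in TOKEN_RE.findall(text):
        if token == "{":
            blocks.append((previous, depth))
            depth += 1
            previous = None
        elif token == "}":
            depth -= 1
            previous = None
        else:
            previous = token
    return blocks


if __name__ == "__main__":
    same_line = "Mesh {\n  1; 2;\n}"
    next_line = "Mesh\n{\n  Frame { 1; }\n}"
    print(find_blocks(same_line))
    print(find_blocks(next_line))
```

Because whitespace, including newlines, is discarded during tokenizing, both layouts produce the same sequence of tokens and the name lookup no longer depends on where the brace was written.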

Rails and Prawn PDF - add current item ID to filename?

- by dannymcc

Hi Everyone, I have two PDFs that are made "on the fly" using Prawn PDF. The PDFs are called jobsheet.pdf and discharge.pdf; their URLs are:

    railsroot/kases/IDNO/jobsheet.pdf
    railsroot/kases/IDNO/discharge.pdf

I am trying to work out how to automagically append the ID number to the filename:

    railsroot/kases/IDNO/jobsheet_IDNO.pdf
    railsroot/kases/IDNO/discharge_IDNO.pdf

To create the PDFs the code is as follows:

Kases Controller

    def jobsheet
      @kase = Kase.find(params[:id])
      respond_to do |format|
        format.html {} # jobsheet.html.erb
        format.xml { render :xml => @kase }
        format.pdf { render :layout => false }
        prawnto :prawn => {
          :background => "#{RAILS_ROOT}/public/images/jobsheet.png",
          :left_margin => 0,
          :right_margin => 0,
          :top_margin => 0,
          :bottom_margin => 0,
          :page_size => 'A4' }
      end
    end

    # GET /kases/1
    # GET /kases/1.xml
    def discharge
      @kase = Kase.find(params[:id])
      respond_to do |format|
        format.html { } # discharge.html.erb
        format.xml { render :xml => @kase }
        format.pdf { render :layout => false }
        prawnto :prawn => {
          :background => "#{RAILS_ROOT}/public/images/discharge.png",
          :left_margin => 0,
          :right_margin => 0,
          :top_margin => 0,
          :bottom_margin => 0,
          :page_size => 'A4' }
      end
    end

Routes

    map.resources :kases, :member => { :discharge => :get }
    map.resources :kases, :member => { :jobsheet => :get }

To view the PDFs I use the following links:

    jobsheet_kase_path(@kase, :format => 'pdf')
    discharge_kase_path(@kase, :format => 'pdf')

Is this even possible? Thanks, Danny

How to print documents to pdf

- by Artur Carvalho

Hi there, I like to print PDFs of my documents. I've been using PDFCreator. Is this a good choice, or are there better ones? Thanks

How to download a full website as PDF?

- by MartyIX

I'm trying to make an offline version of a web site and I'm looking for a tool that would do the task automatically for the whole web site (circa 1000 pages of HTML + images). Is there anything like that, and free? I know it is quite a challenge for a program, but maybe I'll be lucky :). EDIT: It should be a program for Windows.

Seeking reporting or templating tool to generate large formatted PDF reports from dataset

- by Mr. Tacos

Say I have some data in MySQL or a big ole CSV file. I also have a report. It's a PDF, call it 100 pages long. I need to generate variations on this PDF for slices of the data. More specific example: I have a CSV file with each StackOverflow user in a row, and each column contains various statistics about that user. I have a report called "Your StackOverflow Performance". It's got lots of text, always the same, but each section contains something like: "You Vs. The Average StackOverflow Poster on this metric". I want a table to appear there that has the average data, which is the same in every run of the PDF, in one column. In the second column, I want your data, which is different for each PDF/row in the CSV file/user of StackOverflow. I'm pretty sure people use things like Crystal for this? Is there something in MS SQL Server that's good for this? An open source template language? I'm not even really sure if what I need is called a 'reporting' tool (since I don't really need to do any crunching; the data in this case is being crunched by a series of scripts and SPSS, and I don't need bands and subbands and so on) or 'templating'. Is there even such a thing as templating PDFs? Natch, I'd be fine with something that generates output easily scriptable to PDF, like eps, but not something like HTML. The report formatting is fussy and done and externally determined and handed down from on high. It's print-oriented, not webby. Thanks in advance.

Which is the best PDF library for PHP?

- by Darryl Hein

I'm wondering which is the best PDF creation library for PHP, mainly for creating PDFs from scratch (not so much HTML to PDF)? I have worked with FPDF for quite a while now, but it's getting quite old and hasn't had many updates. I found TCPDF the other day (thanks to another question on SO). It seems very good and is based on FPDF, so I don't think it'd be a big transition. FPDI also supports TCPDF, which is nice as I have used it before and found it to be useful. I have also seen DOMPDF, but it too hasn't had many updates for quite some time and is lacking a lot of functionality for general PDF generation. Zend (Zend_Pdf) as well as many other frameworks have their own PDF libraries or extend another one, but you often have to set up the entire framework, which for existing projects can be a problem. What other libraries are there, and what have your experiences been with the above or other libraries?

PDF Disable Anti-alias on Lines

- by Travis

I'm creating a dynamically generated PDF using FPDF. My PDF requires many exactly horizontal/vertical lines in a grid, and when rendered they are anti-aliased and look very fuzzy, which is unacceptable to the client. I need to remove the anti-aliasing for these (or all) lines in the doc. I know this is possible because it's shown correctly in the Adobe PDF spec itself, http://www.adobe.com/devnet/acrobat/pdfs/PDF32000_2008.pdf (warning: big file); see the box on page 2 for how this should look. How would I duplicate the box shown on this page?

Ruby library for manipulating existing PDF

- by simonwh

I'm searching for a library to edit already existing PDFs and, for example, add a watermark to each page. It could also be blanking every other page, etc. There seem to be a few PDF libraries out there, but only very few of them can edit existing PDFs and I'm a bit lost on which way to go. Any recommendations? Thank you.

Alternative to latex / a way to typeset good looking documents from Java to PDF

- by drasto

I'm working on an application in Java that will maintain a database of song lyrics in plain text and print out songbooks/chordbooks (that is, create a PDF file from selected songs). I was planning for the Java application to generate source code for pdflatex; after compiling this source, the user would get a PDF file. Lately I've run into a lot of problems because of LaTeX's limitations: fixed memory size (some pictures will also be drawn to the PDF), with an error when it is exceeded; no way to query the end of a line or the end of a page dynamically; it's very hard to override the LaTeX placement algorithm in a complex way; and so on (see also some of my other questions regarding LaTeX). I have come to the conclusion that LaTeX is not a good option for automated PDF generation, so I need a replacement. I need to be able to typeset:
- Chords over lyrics when the lyrics are in a variable-width font, so I need to be able to measure text width
- Chord diagrams, which means I'll have to draw quite complex pictures
- Each song on a separate double page
- Different fonts, etc.
Thanks for all answers

Filling in PDF Forms with ASP.NET and iTextSharp

The Portable Document Format (PDF) is a popular file format for documents, for two primary reasons. First, because the PDF standard is an open standard, there are many vendors that provide PDF readers across virtually all operating systems, and many proprietary programs, such as Microsoft Word, include a "Save as PDF" option. Consequently, PDFs serve as a sort of common currency of exchange. A person writing a document using Microsoft Word for Windows can save the document as a PDF, which can then be read by others whether or not they are using Windows and whether or not they have Microsoft Word installed. Second, PDF files are self-contained. Each PDF file includes its complete text, fonts, images, input fields, and other content. This means that even complicated documents with many images, an intricate layout, and user interface elements like textboxes and checkboxes can be encapsulated in a single PDF file. Due to their ubiquity and layout capabilities, it's not uncommon for websites to use PDF technology. For example, when purchasing goods at an online store you may be offered the ability to download an invoice as a PDF file. PDFs also support form fields, which are user interface elements like textboxes, checkboxes, comboboxes, and the like. These form fields can be filled in by a user viewing the PDF or, with a bit of code, they can be filled in programmatically. This article is the first in a multi-part series that examines how to programmatically work with PDF files from an ASP.NET application using iTextSharp, a .NET open source library for PDF generation. This installment shows how to use iTextSharp to open an existing PDF document with form fields, fill those form fields with user-supplied values, and then save the combined output to a new PDF file. Read on to learn more!

Convert a colored PDF into a white/black

- by polslinux

On Debian Sid, I have a PDF with a blue background and yellow font. I've searched a lot on Super User but I haven't found anything useful for me. I have tried to convert the PDF into a grayscale one with:

    gs -o grayscale.pdf -sDEVICE=pdfwrite -sColorConversionStrategy=Gray -sProcessColorModel=DeviceGray -dCompatibilityLevel=1.4 colored.pdf

The problem is that I obtain a PDF with white fonts and a dark grey background, so I cannot print it. After that I tried:

    convert -density 96x96 gs2.pdf -density 96x96 -negate -compress zip inv.pdf

I got a PDF with black fonts (and this is okay) and a grey background (and this is not okay). What can I do to obtain a PDF with a white background and black fonts?
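
The grey background reported above can be reproduced with a little arithmetic. The sketch below is only an illustration: it assumes pure blue (0, 0, 255) and pure yellow (255, 255, 0), and it uses the Rec. 601 luma weights as a stand-in for whatever conversion Ghostscript actually applies. It shows that negating a grayscale value swaps dark and light but never reaches pure white, whereas an extra threshold step does.

```python
def luma(rgb):
    """Approximate gray level (0-255) using the Rec. 601 luma weights."""
    r, g, b = rgb
    return round(0.299 * r + 0.587 * g + 0.114 * b)


def negate(gray):
    """Invert a gray level, as a negate step would."""
    return 255 - gray


def threshold(gray, cutoff=128):
    """Push a gray level to pure black (0) or pure white (255)."""
    return 255 if gray >= cutoff else 0


# Assumed colors; the real document's shades may differ.
BLUE_BACKGROUND = (0, 0, 255)
YELLOW_TEXT = (255, 255, 0)

if __name__ == "__main__":
    for label, color in (("background", BLUE_BACKGROUND), ("text", YELLOW_TEXT)):
        gray = luma(color)
        inverted = negate(gray)
        print(label, "gray:", gray, "negated:", inverted,
              "thresholded:", threshold(inverted))
```

Under these assumptions the background ends up light grey after negation and only becomes white once thresholded. ImageMagick has a `-threshold` option that performs this kind of operation on rasterized pages, though whether rasterizing is acceptable depends on the document.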

Including BLOB images in your PDF Reports

- by thatjeffsmith

Earlier this year we walked through how to work with BLOBs in Oracle SQL Developer. So you already know how to INSERT, UPDATE and view the BLOBs stored in your tables. But now I want to show you how to include those images in your PDF reports. You know how to work with SQL Developer reports, right? No? OK, let’s do a quick run down memory lane then:
- How to Build a Bar Chart
- Child reports – click on parent record for on-the-fly children records
Alright, so if you have a GRID report that contains a BLOB column, you have the option of including the BLOB contents when you create a PDF export. At design time, specify how you want the BLOB content to be treated when you export to PDF. Note that you must specify the treatment of the BLOBs in the report design. You won’t be prompted when you launch the Export wizard dialog. When you open your PDF, there will be a link to the image. Click it, then confirm. It will launch the default image viewer on your machine. I hope your pictures are more exciting than mine.

Flattening PDF transparency

- by Jan

I have a PDF, made with Inkscape, that uses transparent colors. This image will be used in a LaTeX document. While preserving the transparency is nice for editing, it can be a problem for printing. Printing usually involves PDF to PS conversion. Since PostScript does not support transparency, this requires either flattening, i.e. creating a vector graphic that works without transparency, or rasterizing, i.e. rendering a bitmap image. When a PDF document containing such a figure is printed (or converted to PS) using Evince (or Cairo or Ghostscript), the whole page gets rendered as a bitmap, rendering fonts ugly (different from other pages). (Adobe Acrobat handles such PDFs well.) Unfortunately, converting the PDF figures to EPS (before including them with LaTeX) doesn't help much, because both pdftops and pdf2ps (again, Cairo or Ghostscript) rasterize the image, i.e. render a bitmap (saved as EPS). (This is slightly better, because it doesn't affect the whole page, but I'd still prefer a vector graphic.) How can I flatten transparency with Inkscape or other software on Linux?

Ruby libraries for parsing .doc files?

- by Platinum Azure

Hi all, I was just wondering if anyone knew of any good libraries for parsing .doc files (and similar formats, like .odt) to extract text, yet also keep formatting information where possible for display on a website. Capability of doing similarly for PDFs would be a bonus, but I'm not looking as much for that. This is for a Rails project, if that helps at all. Thanks in advance!

How to Use Ghostscript DLL to convert PDF to PDF/A

- by imgen

How do I use the Ghostscript DLL to convert PDF to PDF/A? I know I have to call the function exported by gsdll32.dll named gsapi_init_with_args, but how do I pass the right arguments? BTW, I'm using C#.
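
As a rough illustration of the argument list in question, here is a Python sketch that only builds the array; it does not load the DLL. It assumes the switches shown in Ghostscript's own PDF/A example (`-dPDFA`, `-sDEVICE=pdfwrite`, `-dPDFACompatibilityPolicy`, and the `PDFA_def.ps` definition file shipped with Ghostscript), which may vary between Ghostscript versions. The first element is a placeholder because gsapi_init_with_args follows the C `main` convention and ignores argv[0]. The function name `build_pdfa_args` is hypothetical.

```python
def build_pdfa_args(input_pdf, output_pdf, pdfa_def="PDFA_def.ps"):
    """Build the argv-style list that would be handed to gsapi_init_with_args."""
    return [
        "gs",  # argv[0]: placeholder program name, ignored by the API
        "-dPDFA=1",
        "-dBATCH",
        "-dNOPAUSE",
        "-sDEVICE=pdfwrite",
        # Assumption: device-independent color suits the target profile.
        "-sColorConversionStrategy=UseDeviceIndependentColor",
        "-dPDFACompatibilityPolicy=1",
        f"-sOutputFile={output_pdf}",
        pdfa_def,   # PDF/A definition file, must come before the input
        input_pdf,  # the document to convert goes last
    ]


if __name__ == "__main__":
    args = build_pdfa_args("input.pdf", "output-a.pdf")
    # argc is simply the length of the list, including the placeholder.
    print(len(args), args)
```

In C# the same strings would be marshalled as a string array together with its length; the order of the entries matters more than the host language.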

android: open a pdf from my app using the built in pdf viewer

- by mtmurdock

I want to be able to open a PDF file from my app using Android's built-in PDF viewer app, but I don't know how to start other apps. I'm sure I have to call startActivity; I just don't know how to identify the app I'm opening or how to pass the file to that specific app. Anyone have a clue?

Is there a PDF parser for PHP?

- by elviejo

Hi, I know about several PDF generators for PHP (fpdf, dompdf, etc.). What I want to know about is a parser. For reasons beyond my control, certain information I need is only in a table inside a PDF, and I need to extract that table and convert it to an array. Any suggestions?

Convert Docx or Odt to Pdf

- by luxifer

Hi there, I need to find a way to convert docx or odt files to pdf on a Linux web server. Therefore I'm not willing to install openoffice.org, for obvious reasons. I've tried Google but it failed me, so I'm here :-) I can't imagine there's no other solution to this problem than to install a huge chunk of binaries, given that
a) there are (or at least should be) lots of packages which can read docx or at least odt, and
b) there are as many packages which can write pdf files.
What am I missing here? *scratching head* Regards, luxifer
PS edit: I don't want to use a web service, neither free nor paid.
edit 2: at this point it would also help to convert the docx back to doc so I could use wvpdf to generate the pdf...
edit 3: of course it would also help if I could do search and replace on a doc file in the first place; or xps for that matter.
