Search Results

Search found 4479 results on 180 pages for 'pdf scraping'.


save as pdf in linux

- by Neilvert Noval

I have seen how simple it is on Mac OS to generate a PDF from a document without installing additional software, and I am looking for the same functionality in Linux. For example, if I have myDocument.txt containing an article, how can I convert it to PDF? My next question: assuming myDocument.txt is a 3-page document, will it generate one continuous 3-page PDF and not 3 separate PDFs? Are there any tools for Linux that do this? (A GUI is fine, but the command line is preferable.)

Hyperlinks on images in PDF from Word 2010

- by Bristol

I've got a Word 2010 document that I'm trying to convert to a PDF with "Save As...", preserving hyperlinks. Something odd is going on: Hyperlinks on inline text, or images that are inline, work fine. Hyperlinks on images with layout "in front of" text don't work in the PDF, same for hyperlinked drawing shapes. What I'm trying to do is make a "clickmap" image by putting an image on the page and overlaying parts of it with transparent shapes that hyperlink to different URLs. This isn't working, and the transparency has nothing to do with it - hyperlinks in the PDF seem only to work on "in line with text" elements. Am I missing something, or is there a better way to do this?

How to convert a power point pdf to a pdf that is easy to read on kindle?

- by SpaceTrucker

I have several PowerPoint presentations as PDFs. I would like to read them on the original Kindle in landscape format. When I read the original on the Kindle, a single slide won't fit on the Kindle's display. I thought the easiest way to convert the PDF was to reprint it with a PDF printer. However, I don't know what paper size to use. I already tried using Calibre as suggested by this question, but the output is not usable because of formatting issues. So what paper size should I use for the PDF printer to reprint them in landscape format, or are there any other tools I could use for that task?

OSX pdf-kit vs Linux poppler or pdf/x

- by Tahnoon Pasha

I keep reading and hearing that the reason there is no good PDF editing software for Linux is that the libraries are not as well developed, and that this is why there is no equivalent of Skim or Preview on Linux. I had a look at the pdf-kit documentation and the poppler documentation, and they looked very similar to my admittedly non-technical eye. Could someone explain why it is so much easier to write projects like Skim with the OSX libraries than with the Linux ones? I'm not sure if the same applies to OSX projects such as NVAlt, but it seems to be a common theme. I'd just like to understand what is behind the thesis that OSX is easier to code these projects in, and what would be involved in changing that. (I'm not disputing the value of Okular, Evince and the like, just noting that they don't have the richness of functionality of Skim, Preview or even things like GoodReader on the iPad.)

How to unmangle PDF format into a usable text or spreadsheet document?

- by Chuck

Upon requesting some daily/hourly sales data from a coworker who is responsible for such requests, I was given a series of PDF files. The point of sale program that is used, for some reason, answers requests for this type of information in the form of PDF files. The issue: the PDF files look to be in a format that should easily be copied and pasted into a spreadsheet. There are three columns that look to be neatly organized across two pages. When copy/pasting the first page, all three columns from the PDF's first page are dumped into a single column consisting of the Date followed by the Hours for the transactions on that day. The end of this Date/Time information is followed by all of the Total Sales values that should be attached to a Date and Time of the transaction. (NOTE: There are no duplicated Dates in the Date column, i.e., multiple transactions for a day have only one yyyy/mm/dd listed, on the first row but not the following rows.) While it was a huge pain, it was possible, in about four or five steps, to get the single column of data broken out into three columns that matched the PDF. The second page of the PDF file, when I attempt to copy/paste it into a spreadsheet, creates a single column with the first third of the cells being the Dates from the PDF, the second third of the cells being the Hours of the transactions, and the final third of the cells being filled with the Total Sales. After the copy/paste there is no way to figure out which Hours belong to which Dates or Total Sales, due to the lack of duplicated Dates in the Date column as mentioned above. My PDF-fu is next to non-existent. I've just now started to work with PDF editors and some www.convertmyPDFforfree.com websites, so far with nothing remotely near usable output. (Both methods have so far done nothing but produce blank documents.)
Before I go back and pester my co-worker into figuring out a way to create a report in some format other than PDF, is there any method by which to take the data that looks to be formatted correctly in a PDF and copy/paste it into a spreadsheet so that it looks the same? I appreciate any help that can be made available. The sales data isn't so sensitive that I couldn't part with a bit to let somebody actually see what it is that needs to be dealt with; just let me know. The PDFs are less than 100 KB each, so sending them shouldn't be a burden to any interested party.
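The regrouping described for the first page (each Date followed by that day's Hours, then all Total Sales at the end) can be sketched in Python. This is only an illustration under stated assumptions: the sample values are made up, the HH:MM hour format is a guess, and the function name rebuild_rows is hypothetical; none of it comes from the actual report.

```python
import re

# Date format taken from the question: yyyy/mm/dd
DATE_RE = re.compile(r"^\d{4}/\d{2}/\d{2}$")
# Assumed (hypothetical) hour format: HH:MM
HOUR_RE = re.compile(r"^\d{1,2}:\d{2}$")


def rebuild_rows(cells):
    """Turn one pasted column into (date, hour, total) rows.

    Expects the page-one layout: a date, then that day's hours, repeated,
    followed by one total per hour in the same order.
    """
    pairs = []    # (date, hour) in document order
    totals = []   # every non-date cell once the date/hour run has ended
    current_date = None
    for cell in (c.strip() for c in cells):
        if not cell:
            continue
        if DATE_RE.match(cell):
            current_date = cell
        elif HOUR_RE.match(cell) and not totals:
            if current_date is None:
                raise ValueError("hour found before any date")
            pairs.append((current_date, cell))
        else:
            totals.append(cell)
    if len(pairs) != len(totals):
        raise ValueError(f"{len(pairs)} hours but {len(totals)} totals")
    return [(d, h, t) for (d, h), t in zip(pairs, totals)]


# Made-up sample data, for illustration only
sample = ["2010/05/01", "09:00", "10:00", "2010/05/02", "09:00",
          "120.50", "98.00", "143.25"]
rows = rebuild_rows(sample)
```

This only works because the first page keeps each day's hours directly under its date. The second page's layout (all dates, then all hours, then all totals) drops that grouping, so the same approach would not recover it.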

Show PDF in iPad using CGPDF APIs

- by AJ

I have learned that Apple has released CGPDF APIs in SDK 3.2 for drawing to a PDF context. What I understand from these APIs is that you can draw a PDF to a data object or a PDF file. You can then export it, maybe to your sandbox's directory, or add it as an attachment to a mail. But I am not sure if we can use these APIs to read a PDF from the application bundle and show it to the user page by page on the screen. What I want to do is open a PDF of a magazine in a magazine reader app. I was also wondering if we can identify the links in a PDF file and open them in the app. Let me know if you have done or are doing anything like this. Thanks, AJ

Screen scraping in C# using HtmlAgilityPack.

In my example, you can scrape a complete page or part of a page.

Apple iPad and PDF support

- by STeN

Hi, I have a few questions related to PDF and its use on the Apple iPad: 1) Does the iPad support all Quartz PDF functions (i.e. all CGPDFxxx functions/classes)? 2) Does the iPad support PDF Kit? 3) Is it possible with either API to detect the underlying PDF item (e.g. article, text, annotations) based on the coordinates of a finger touch? 4) What is the difference between the Quartz PDF functions and PDF Kit? Thanks a lot. Regards, STeN

Is there a way to extract semantic information from a PDF? (converting PDF to pure XHTML)

- by Eonil

Hi. I'm looking for a way to extract semantic structural information (such as titles, headings, paragraphs, or lists) from a PDF, because I want to get purely structural data out of it. Ultimately, I want to create pure XHTML from the PDF, with structural information only: no design or layout. I know a PDF can be created without any structural information; I'm not considering those PDFs, only well-structured ones. I'm new to PDF, so I don't know whether it offers a regular semantic structure. If it does, a library for it should expose that structure. So I want to know whether the PDF spec carries this information, and the best way to get at it if it exists.

Using ImageMagick to create an image from a PDF...efficiently

- by bigsweater

I'm using ImageMagick to create a tiny JPG thumbnail image of an already-uploaded PDF. The code works fine. It's a WordPress widget, though this isn't necessarily WordPress specific. I'm unfamiliar with ImageMagick, so I was hoping somebody could tell me if this looks terrible or isn't following some best practices of some sort, or if I'm risking crashing the server. My questions, specifically, are: Is that image cached, or does the server have to re-generate the image every time somebody views the page? If it isn't cached, what's the best way to make sure the server doesn't have to regenerate the thumbnail? I tried to create a separate folder (/thumbs) for ImageMagick to put all the images in, instead of cluttering up the WP upload folders with images of PDFs. It kept throwing a permission error, despite 777 permissions on the folder in my testing environment. Why? Do the source/destination directories have to be the same? Am I doing anything incorrectly/inefficiently here that needs to be improved? 
The whole widget is on Pastebin: http://pastebin.com/WnSTEDm7

Relevant code:

    <?php
    if ( $url ) {
        $pdf = $url;
        $info = pathinfo($pdf);
        $filename = basename($pdf, '.'.$info['extension']);
        $uploads = wp_upload_dir();
        $file_path = str_replace( $uploads['baseurl'], $uploads['basedir'], $url );
        $dest_path = str_replace( '.pdf', '.jpg', $file_path );
        $dest_url = str_replace( '.pdf', '.jpg', $pdf );
        exec("convert \"{$file_path}[0]\" -colorspace RGB -geometry 60 $dest_path");
    ?>
    <div class="entry">
        <div class="widgetImg">
            <p><a href="<?php echo $url; ?>" title="<?php echo $filename; ?>"><?php echo "<img src='".$dest_url."' alt='".$filename."' class='blueBorder' />"; ?></a></p>
        </div>
        <div class="widgetText">
            <?php echo wpautop( $desc ); ?>
            <p><a class="downloadLink" href="<?php echo $url; ?>" title="<?php echo $filename; ?>">Download</a></p>
        </div>
    </div>
    <?php } ?>

As you can see, the widget grabs whatever PDF is attached to the current page being viewed, creates an image of the first page of the PDF, stores it, then links to it in HTML. Thanks for any and all help!
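On the caching question, one common pattern is to regenerate the thumbnail only when it is missing or older than the PDF. Below is a minimal Python sketch of that check, written in Python rather than PHP purely for illustration. It assumes ImageMagick's convert is on the PATH; the helper names needs_thumbnail, thumbnail_command and make_thumbnail are hypothetical and not part of the widget.

```python
import os
import subprocess


def needs_thumbnail(pdf_path, thumb_path):
    """True if the thumbnail is missing or older than the source PDF."""
    if not os.path.exists(thumb_path):
        return True
    return os.path.getmtime(thumb_path) < os.path.getmtime(pdf_path)


def thumbnail_command(pdf_path, thumb_path):
    """Same convert options as the widget: first page, RGB, width 60."""
    return ["convert", f"{pdf_path}[0]", "-colorspace", "RGB",
            "-geometry", "60", thumb_path]


def make_thumbnail(pdf_path, thumb_path, run=subprocess.run):
    """Run convert only when needed; returns True if it was invoked."""
    if not needs_thumbnail(pdf_path, thumb_path):
        return False
    # Passing a list (no shell) avoids quoting problems with odd filenames.
    run(thumbnail_command(pdf_path, thumb_path), check=True)
    return True
```

With a check like this, a page view only triggers a conversion the first time, or after the PDF has been replaced; every other view just links to the existing JPG.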

Play Framework: Generate PDF from template that uses Javascript for graphing

- by digiarnie

I have a template that has some Javascript used to generate graphs in the browser. I would like to use that same template to create a PDF and send as an attachment in an e-mail. In this scenario, there would be no browser/client interaction. I am using the PDF module that is available from the Play website and I have managed to get the PDF rendering to work. The only issue is that the graphs don't show up in the PDF but all other static text does. I'm assuming the graphs aren't appearing in the PDF due to the Javascript not being executed prior to the PDF generation. Does anyone have any ideas on how to get around this problem?

Convert Byte [] to PDF

- by Sri Kumar

Hello all, With the help of this question, "C# 4.0: Convert pdf to byte[] and vice versa", I was able to convert a byte[] to a PDF. But the problem is that not all of the contents were written to the PDF. The byte array length is approximately 25990. Only a 21 to 26 KB PDF file was created. When I try to open the PDF, it says the file is corrupted. What could be the reason? I tried BinaryWriter, but it creates a PDF of 0 KB.
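One way to narrow down this kind of corruption is to write the bytes in binary mode and then confirm that the file on disk matches the array. The sketch below is in Python rather than C#, purely as an illustration, and save_pdf_bytes is a hypothetical helper. It relies on the fact that a PDF file normally begins with the marker %PDF-.

```python
def save_pdf_bytes(data, path):
    """Write raw PDF bytes to disk and sanity-check the result."""
    if not data.startswith(b"%PDF-"):
        # The array itself is not a PDF (for example, it was decoded as
        # text or truncated before it got here).
        raise ValueError("data does not start with the %PDF- header")
    with open(path, "wb") as f:   # binary mode: no newline/encoding translation
        f.write(data)
    with open(path, "rb") as f:
        written = f.read()
    if written != data:
        raise OSError("file on disk differs from the byte array")
    return len(written)
```

If the header check fails, the problem lies in how the byte array was produced, not in how it was written. If the lengths differ, the write path is the likelier culprit.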

[LaTeX]: Add in the TOC an included PDF

- by ILoveMyLatexReport

In my document I include a PDF using \includepdf[pages=-]{./mypdf.pdf}. The problem I'm having is how to add a TOC entry for this PDF. It is supposed to be an appendix. I tried adding a new section in the appendix, but of course the section name can't be printed on the same page as the included PDF, so the resulting TOC line points to the wrong page. If I use \addcontentsline, I lose the numbering, and the page is wrong too because the included PDF actually starts on the next page. I'm a bit lost here, so I would really appreciate it if someone knows how to do this. Note: the PDF I'm trying to include was not generated from LaTeX. Thanks in advance.

Planning to create PDF files in Ruby on Rails

- by deau

Hi there, A Ruby on Rails app will have access to a number of images and fonts. The images are components of a visual layout which will be stored separately as a set of rules. The rules specify document dimensions along with which images are used and where. The app needs to take these rules, fetch the images, and generate a PDF that is ready for local printing or emailing. The fonts will also be important. The user needs to customize the layout by inputting text which will be included in the PDF. The PDF must therefore also contain the desired font so that the document renders identically across different machines. Each PDF may have many pages. Each page may have different dimensions, but this is not essential. Either way, the ability to manipulate the dimensions and margins given by the PDF is essential. The only thing that needs to be changed regularly is the text. If this takes too much development, then the app can store the layouts in third-party PDFs and edit the textual content directly. Eventually, though, this will prove too restrictive for the app's intended functionality, so I would prefer the app to generate the PDFs itself. I have never worked with PDFs before and, for the most part, I've never had to output anything to the user outside their monitor. A printed medium could require a very different approach to get the best results. If anyone has any advice on how to model the PDF format, it would be really appreciated. The technical aspects of printing such as bleed, resolution and colour have already been factored into the layouts and images. I am aware that PDF is a proprietary file format and I want to use free or open source software. I have seen a number of Ruby libraries for generating PDF files, but because I am new on this scene I have no way to reliably compare them and too little time to implement and test them all. I also have the option of using C to handle this feature, and if this is process-intensive then that might be preferred.
What should I be thinking about and how should I be planning to implement this?

How to know which fonts are used in selected part of a PDF document

- by Mehper C. Palavuzlar

I'm using Foxit Reader as default PDF viewer. How can I see what type of font is used for a selected part of a PDF document?

Modifying y-axis label in pdf Diagram

- by Andrea

Hi all, I have mislabelled the y-axis in a PDF graph that I generated from a Matlab file. Unfortunately, I can no longer find the data that I used to generate the graph. So I am wondering: is there a free tool that would let me rename the y-axis in my PDF diagram, or is this something "impossible" to do? Many thanks for your help, Andrea

Software for reading PDF ebooks?

- by Sridhar Ratnakumar

I have a bunch of PDF ebooks. While I know that traditional PDF readers (Acrobat, Preview) can be used to read them, I wonder if there is ebook software specifically tailored for long periods of staring at the computer screen, for example with a white-on-black mode for night reading. Is there any? Preferably software that runs on Mac OS X. If not, then Windows would be nice too.

About the security of adding a signature to a PDF file

- by ????

We can add a "bitmap" or image signature to a PDF file, either by using Adobe Acrobat or by Mac's Preview app, but I wonder, besides always encrypting it with a password before sending it by email to the other party, how valid and secure is it? The reason is, if the signature is a bitmap, then there is nothing that prevents anybody copying and pasting that image to other documents, or even, if a cheque is written to anybody at all (such as to the landlord), then there is nothing that prevents the signature from being scanned and copied and pasted to any other PDF documents as well.

Batch crop PDF pages?

- by boost

I have a PDF document containing pages which have crop marks on them. I'd like to copy these pages to another PDF without the crop marks. I'm assuming I have to crop-out the crop marks but is there any way to do this in batch rather than interactively?

PDF printer in Wine on Ubuntu?

- by Arkapravo

I wish to convert (print) my MS Word files to pdf on the fly. I am on Ubuntu 9.10 and using Wine 1.1.40. Can someone help? I have heard that a pdf printer can be installed using Wine Cups.

How to save rotated Adobe pdf file?

- by WilliamKF

I received an Adobe PDF scan of a document that displays upside-down. I rotated it inside Adobe Acrobat and chose Save As to make a new document; however, the rotation is not saved, and when I open the new document it is upside-down again. How can I save the corrected document as a new PDF file?

How to enlarge a PDF document on Kindle?

- by Gustavo

The Kindle doesn't zoom in on a PDF document, so I'd like to know if there is a way to enlarge the file myself before it is displayed on the Kindle screen. I've tried converting some PDF files to the .azw Kindle format, but the images weren't converted, so I have to find another way.

how to extract fonts from pdf?

- by Joey

Is there a way to extract fonts from PDF files? I know that the fonts embedded in PDF files are usually only subsets of the full fonts. Even so, is there a way to do this?

Print a series of PDF files

- by Tim Coker

Is there a way to print several PDF files at once? I have a bunch of individual files that I want to print (about 42). Printing each one is tedious. Does anyone know a way to print a whole series at once? Maybe something like a PDF reader with a "Print All" function? I ask because this isn't the first time I've run into this problem and have never been able to find a good solution...

problem with nitro pdf

- by Nrew

I'm converting lots of .doc files into .pdf using Nitro PDF Express. The program hadn't finished converting the 37 files yet; maybe 10 had been converted and were already in the output directory. But when I cancelled it, even the converted ones were deleted. Can I still find them somewhere, or are they gone for good?

