html to pdf - Page 9 - Developer IT

Why does resaving a 4MB PDF file with Acrobat 10 shrink it to 3MB?

- by Jian Lin

I had some PDF files that I opened and highlighted using Acrobat 10 (also called Adobe Reader X). After highlighting, I saved the file under a different filename, and the file went from 4MB to 3MB. Is it just compression, or were the images saved at lower quality (though I cannot see any difference)? What is the reason? If it is just compression, why wasn't it done before, given that WinZip-style compression was already quite mature 10 or 12 years ago?


a[type="application/pdf"] vs a[href$=".pdf"]

- by metal-gear-solid

What is the difference between these two selectors, a[type="application/pdf"] and a[href$=".pdf"]?

a[type="application/pdf"] { background-image: url(/images/pdf.gif); padding-left: 20px; }

a[href$=".pdf"] { background-image: url(/images/pdf.gif); padding-left: 20px; }


PDF Form Field Manipulation

- by 108039818756939362532

I'm making a web interface to autofill PDF forms with user data from a database. The admin needs to be able to upload a PDF (right now targeted at IRS PDF forms) and then associate the fields in the PDF with data fields in the database. I need a way to help the admin associate the field names (stuff like "topmostSubform[0].Page2[0].p2-t66[0]") with the data fields in the database. I'm looking for a way to modify the PDF programmatically to provide this information. Basically, I'm open to suggestions on how I might make the field names appear in an obvious manner on a modified version of the original PDF. The closest I've gotten is inserting tooltips into the fields by editing the raw PDF line by line. However, when editing the PDF in this manner the field names are gibberish, so I can't just use them. An optimal solution would be anything that could automatically parse a PDF and set each field's tooltip to the field's name. Anything that can be run from the command line, any Python tool, or just a basic how-to on correctly parsing a field's name from a raw PDF file would be amazing.
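As a rough illustration of the naming question above: in the PDF format, a name such as "topmostSubform[0].Page2[0].p2-t66[0]" is a fully qualified field name, built by joining the partial names (the /T entries) of a field and its ancestors with periods, and the tooltip is stored in the field's /TU entry. The sketch below assumes the field dictionaries have already been loaded by some PDF library; plain Python dicts stand in for them here, and the function names are hypothetical.

```python
def qualified_name(field):
    """Join the partial names (/T) from the root ancestor down to this field."""
    parts = []
    node = field
    while node is not None:
        partial = node.get("/T")
        if partial is not None:  # skip nodes that carry no partial name
            parts.append(partial)
        node = node.get("/Parent")
    return ".".join(reversed(parts))


def add_name_tooltips(fields):
    """Set each field's tooltip entry (/TU) to its fully qualified name."""
    for field in fields:
        field["/TU"] = qualified_name(field)
    return fields


# Stand-in dictionaries (assumption: a real library would supply these).
root = {"/T": "topmostSubform[0]"}
page = {"/T": "Page2[0]", "/Parent": root}
leaf = {"/T": "p2-t66[0]", "/Parent": page}
add_name_tooltips([leaf])
```

Writing the modified dictionaries back into a valid PDF is the part a library has to handle; this only shows how the displayed name is derived.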


Dynamically generated PDF files working in most readers except Adobe Reader

- by Shane

I'm trying to dynamically generate PDFs from user input, where I basically print the user input and overlay it on an existing PDF that I did not create. It works, with one major exception: Adobe Reader doesn't read it properly, on Windows or on Linux. QuickOffice on my phone doesn't read it either. So I thought I'd trace the path of how I create the files:

1 - Original PDF of background: PDF 1.2, made with Adobe Distiller with the LZW encoding. I didn't make this.
2 - PDF of background: PDF 1.4, made with Ghostscript. I used pdf2ps then ps2pdf on the above to strip LZW so that the reportlab and pyPDF libraries would recognize it. Note that this file looks "fuzzy," like a bad scan, in Adobe Reader, but looks fine in other readers.
3 - PDF of user-input text formatted to be combined with the background: PDF 1.3, made with Reportlab from user input. Opens properly and looks good in every reader I've tried.
4 - Finished PDF: PDF 1.3, made from PyPDF's mergePage() function on 2 and 3.

Does not open in:
- Adobe Reader for Windows
- Adobe Reader for Linux
- QuickOffice for Android

Opens perfectly in:
- Google Docs' PDF viewer on the web
- evince for Linux
- ghostscript viewer for Linux
- Foxit Reader for Windows
- Preview for Mac

Are there known issues that I should know about? I don't know exactly what "flate" is, but from the internet I gather that it's some sort of open source alternative to LZW for PDF compression. Could that be causing my problem? If so, are there any libraries I could use to fix the cause in my code?


Creating PDF Documents with ASP.NET and iTextSharp

The Portable Document Format (PDF) is a popular file format for documents. Due to their ubiquity and layout capabilities, it's not uncommon for a website to use PDF technology. For example, an eCommerce store may offer a "printable receipt" option that, when selected, displays a PDF file within the browser. Last week's article, Filling in PDF Forms with ASP.NET and iTextSharp, looked at how to work with a special kind of PDF document, namely one that has one or more fields defined. A PDF document can contain various types of user interface elements, which are referred to as fields. For instance, there is a text field, a checkbox field, a combobox field, and more. Typically, the person viewing the PDF on her computer interacts with the document's fields; however, it is possible to enumerate and fill a PDF's fields programmatically, as we saw in last week's article. This article continues our investigation into iTextSharp, a .NET open source library for PDF generation, showing how to use iTextSharp to create PDF documents from scratch. We start with an example of how to programmatically define and piece together paragraphs, tables, and images into a single PDF file. Following that, we explore how to use iTextSharp's built-in capabilities to convert HTML into PDF.


From escaped html -> to regular html? - Python

- by RadiantHex

Hi folks, I used BeautifulSoup to handle XML files that I have collected through a REST API. The responses contain HTML code, but BeautifulSoup escapes all the HTML tags so it can be displayed nicely. Unfortunately, I need the HTML code. How would I go about transforming the escaped HTML back into proper markup? Help would be very much appreciated!
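One possible approach, sketched under the assumption that the escaped text uses standard HTML entities (&lt;, &gt;, &amp; and so on) and that Python 3 is available: the standard library's html.unescape converts entities back to characters in a single pass. The wrapper name unescape_markup is just for illustration.

```python
import html


def unescape_markup(escaped: str) -> str:
    """Turn entity-escaped text such as '&lt;p&gt;' back into markup."""
    # One pass only: '&amp;amp;' becomes '&amp;', not '&'.
    return html.unescape(escaped)


print(unescape_markup("&lt;p&gt;Fish &amp;amp; chips&lt;/p&gt;"))
# prints: <p>Fish &amp; chips</p>
```

Because only one level of escaping is removed, entities that were part of the original HTML survive, which is usually what is wanted when the markup itself was escaped once for display.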


Problems in "Save as PDF" plugin with Arabic numbers

- by Mohamed Mohsen

I use the "Save as PDF" plugin with Word 2007 to generate a PDF document from a DOCX document. It works great, except that the Arabic numbers in the Word file are converted to English numbers in the PDF document. Please find two links to screenshots explaining the problem. The first image is the generated PDF file with the English numbers highlighted. The second image is the original Word file with the Arabic numbers highlighted. Update: Thanks very much Isaac, ChrisF and Wil. I changed the Numeral setting in Word to Context and confirmed that all the numbers in the Word file are Arabic. I still have the problem, as the PDF file still has English numbers. (Note: the Arabic numbers are also called Hindi numbers.) I also tried changing the font to Tahoma, with no luck.


ifilter not working with MOSS 2007, can't crawl .pdf

- by SORRYPROFESSEROFYEARNING

I installed the iFilter and followed this guide: http://msmvps.com/blogs/sundar_narasiman/archive/2008/02/06/configuring-moss-2007-to-search-pdf-documents-install-and-configure-pdf-ifilters.aspx along with the accompanying link to the MS hotfix. I have initiated multiple crawls, but none of them show any .pdf documents, let alone the contents of the PDFs (I kept uploading test documents with real content). In the 'file types' menu of the shared services, the PDF icon doesn't show as I think it was meant to, and 'pdf' is listed as file type 'AcroExch.Document'. Is this correct? Any ideas, anyone?


Apple Automator "New PDF from Images" maintaining same filename

- by mech

I potentially have 26k old legacy PICT images that I first need to convert to PDF for migration. I am using Apple Automator with "Dispense Items Incrementally" to loop through them. However, I can't get "New PDF from Images" to keep the original filename. Can anyone offer some advice? :) FYI, I am converting to PDF because ImageMagick can't convert the files directly to my ultimate JPEG format: my PICT files were created very long ago, and convert fails with an "improper image header" error. See this ticket for more information. So I am doing an intermediate conversion from PICT to PDF first, then converting that PDF to JPEG :) The only thing left is the "Output File Name" setting, which does not let me carry over the original filename. See the screen here:


Libre Office/Writer PDF export: white borders appear between lines even after setting borders to none

- by Yttric

My document prints ok (exactly as it appears on the screen in Libre Office), but when I export to PDF and view the PDF on screen there are white borders around each text or picture object. Here's a sample snapshot from PDF/Preview: http://imgur.com/TWip5 I've tried selecting a paragraph and changing the border property to None as described in Libre Office help (http://help.libreoffice.org/Common/Borders), setting the "Line arrangement Default" to "Set no borders". But borders set by the Format dialog don't correspond to the borders I see in PDF/Preview. In PDF/Preview the border appears on line boundaries. Borders set in Format appear around each picture, for example. What am I doing wrong?


Finding Image resolution in PDF file?

- by Dave

I have a problem with some users creating very large PDFs. On the other hand, the PDFs sent from our fax machines are really small and perfectly printable. My questions: Is there any way to find the resolution (DPI) of a PDF? I searched the internet and could not find an answer. I checked the file's properties, but this information was not stored there, at least in my case. What is the optimum resolution for converting a text file into an image PDF: 96dpi, 300dpi, or more? Fun question: can I resize a PDF that was scanned at a high DPI to a lower DPI? I know some answers might not be available, as I have already searched the internet without success. Note: my PDFs are entirely images (text rendered to images). I am also familiar with PrimoPDF (free), something you can experiment with.
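A PDF page has no single stored DPI; each embedded image has a pixel size, and its effective resolution follows from how large it is drawn on the page. The arithmetic can be sketched as below, assuming the image's pixel width and its placed width in points (1 point = 1/72 inch, the default PDF unit) are already known, for example from a tool that lists a PDF's images. The function name is hypothetical.

```python
POINTS_PER_INCH = 72  # default PDF user-space unit is 1/72 inch


def effective_dpi(pixel_width: int, placed_width_pts: float) -> float:
    """Pixels per inch of an image drawn placed_width_pts points wide."""
    placed_width_inches = placed_width_pts / POINTS_PER_INCH
    return pixel_width / placed_width_inches


# A 2550-pixel-wide scan filling a US Letter page (612 points = 8.5 inches).
print(effective_dpi(2550, 612))  # prints: 300.0
```

The same relation explains file size: halving the DPI of a full-page scan quarters the pixel count, which is why downsampled image PDFs shrink so much.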


Stripping Non-Text from a Scanned, OCRd PDF

- by Daniel S.

I have a PDF created from a scanned document. OCR was used to recognize text. In Acrobat, if I select text and click 'copy with formatting', I can paste the formatted text into Word, so it seems that fonts and colors, and possibly the size, are embedded in the document in addition to the plain text. Is there any way to use this information to create a PDF that contains just the formatted OCR'd text, without the scanned image? Currently, my document only shows the scanned image, and the text is on an invisible layer. I would like to create a PDF document that removes the scanned image and displays the formatted text that is currently hidden. The post "PDF has an extra blank in all words after running through Ghostscript" has a section on "How can we make the invisible text visible?" However, doing this does not show the correct text formatting (which is retained when pasting into Word), and I also would like to remove the scanned image so that the final PDF contains just formatted (color, font, size) vector fonts, and no images.


PowerPoint 2007 animated slides are only partially converted to PDF

- by Tim

I have recently encountered a problem with PowerPoint 2007. When I use "Save as PDF/XPS" to create a PDF version of my presentation, some slides are only partially included in the resulting PDF file (the original post showed a complete slide and its reduced PDF version here). So far, I have only encountered this with slides that contain animation elements, but which of the elements remain in the PDF version appears to have nothing to do with the order in which the animated elements appear, so that might just be a coincidence. When viewing the affected slides in Acrobat Reader, it complains that the file contains invalid elements and that I should complain to whoever generated the PDF file. Perhaps it has something to do with Office 2007 Service Pack 3, because these problems started only after it had been installed. Has anyone noticed something similar? Is there a workaround?


Converting massive images to PDF, without crashing applications

- by BloodyIron

I'm trying to work with a large-format scanner, and we are scanning very long documents. For example, we cut one of our documents into two pieces, and one of those pieces is 3633x82486 pixels. My application, Scanning Master 21+, which comes with the device (Graphtec CSX300-09), can output PDF; however, when I try to save to PDF it complains that the file is too large. I can successfully output to BMP, however. GIMP can even open this BMP, after taking a while to load it. The resulting files range from 200MB to 1.2GB in size. Acrobat refuses to open the BMP format, saying it isn't supported or is damaged (which I know is not true). The PDF plugin for GIMP also crashes when I try to export to PDF. I'm really not sure what the best tool for this job is. So what is the best tool to produce PDF documents from very large images?


Copying first 1000 PDF files having single, double quotes in their name to another folder

- by racer_ace

I have a folder of PDFs and need to process 1000 at a time. So I need to move them into another folder, process them, and delete them. For this I tried:

$ find . -maxdepth 1 -type f | head -1000 | xargs cp -t $destdir

It fails on single and double quotes in filenames. There are thousands of files, and I have no idea how many of them have quotes in their names. Can anyone help me find a solution? I also tried the -0 option, and it did not work.
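One way to sidestep shell quoting entirely is to do the batch copy in a script, where filenames are passed as values and never re-parsed as words. The sketch below uses only the Python standard library; the function name copy_batch and the choice to sort the names (so batches are reproducible) are assumptions for illustration, not part of the original question.

```python
import shutil
from pathlib import Path


def copy_batch(src_dir, dest_dir, limit=1000, pattern="*.pdf"):
    """Copy up to `limit` matching files from src_dir into dest_dir.

    Filenames are never passed through a shell, so quotes, spaces and
    other special characters in names need no escaping.
    """
    src = Path(src_dir)
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    # Non-recursive, like `find . -maxdepth 1 -type f`.
    files = sorted(p for p in src.glob(pattern) if p.is_file())
    copied = []
    for path in files[:limit]:
        shutil.copy2(path, dest / path.name)
        copied.append(path.name)
    return copied
```

Calling copy_batch(".", "/path/to/destdir") would copy the first 1000 PDFs by name; swapping shutil.copy2 for shutil.move would turn the copy into the move the question describes.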


How to embed PDF in a web page using Acrobat Reader instead of Acrobat.

- by Lachlan Roche

I have a pdf form that uses Acrobat 8 features. The form contains Javascript that interacts with the hosting web page. Some of my Windows users have both Adobe Acrobat and Acrobat Reader installed, and need Adobe Acrobat to be the default handler for pdf files. The users with Adobe Acrobat 7 are unable to use the form, even though they might have Acrobat Reader 8 or 9 installed. Currently, the PDF is embedded like this: <object id="host" data="/path/to/document.pdf" type="application/pdf" width="900" height="550" ></object>


Showing HTML comment strings (<!-- -->) in HTML files

- by Andrei

Hello all. I'm building a source code search engine, and I'm returning the results on an HTML page (aspx to be exact, but the view logic is in HTML). When someone searches for a string, I also return the whole line of code where this string can be found in a file. However, some lines of code come from HTML/aspx files, and these lines contain HTML comments (<!-- -->). When I try to print such a line on the HTML page, the browser interprets it as a comment and does not show it on the screen. How should I go about solving this so that it actually shows up? Any help would be welcomed. Thanks. Edit: err... I see now that Firebug could help me with this:
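The usual remedy is to HTML-encode each source line before writing it into the page, so that "<" and ">" arrive as entities and the browser displays them instead of interpreting them. The sketch below shows the idea in Python with the standard library's html.escape; in an aspx page the analogous call would be something like HttpUtility.HtmlEncode. The helper name render_source_line and the choice of a pre element are assumptions for illustration.

```python
import html


def render_source_line(line: str) -> str:
    """Return an HTML snippet that displays `line` literally."""
    # html.escape replaces &, <, > (and quotes) with entities.
    return "<pre>" + html.escape(line) + "</pre>"


print(render_source_line("<!-- TODO: remove -->"))
# prints: <pre>&lt;!-- TODO: remove --&gt;</pre>
```

Encoding at output time, rather than when indexing, keeps the stored lines searchable in their original form.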


Adobe Acrobat: How to batch to combine multiple pdf files?

- by Andrei Andre

I have 3 folders:

Folder 1
Folder 2
Folder 3

In each folder I have two PDF files:

Folder 1: file1.pdf, file2.pdf
Folder 2: file1.pdf, file2.pdf
Folder 3: file1.pdf, file2.pdf

I want each folder to end up with one combined file made from those two files:

Folder 1: binder.pdf
Folder 2: binder.pdf
Folder 3: binder.pdf

Any ideas? Don't tell me to do it manually. This case is just to explain my problem; assume that I have hundreds of folders. :) Maybe I can use another tool instead of Adobe Acrobat?


Open Source PDF reader for windows as an alternative to Adobe reader

- by Tom Feiner

With the latest JavaScript vulnerabilities in Adobe Reader and the bloat it has acquired over the years, I've been thinking of moving the network I'm in charge of to a different product for PDF reading on Windows. The ideal PDF reader should be:

- Small in size (Adobe Reader is more than 200MB these days after installation).
- As secure by default as possible (for example, JavaScript disabled by default).
- Nice looking, with an easy-to-use interface.
- Not bloated with features (I just want to read PDFs, that's it).
- Free of toolbars, unwanted add-ons and spyware.
- Free of ads while viewing PDFs.
- Preferably open source (this pretty much ensures no ads).
- Fully Unicode capable.

Ideally, something like evince from GNOME would be the best option, but unfortunately that's not available on Windows. Foxit is an option, as it is small and has a nice interface. But it still has JavaScript enabled by default, which might lead to vulnerabilities, and it installs a toolbar and displays ads while reading PDFs, which is distracting. There is a site dedicated to open source PDF readers, pdfreaders.org; however, the Windows PDF readers listed there each have their problems, mostly that the interface is not as convenient (as evince, Adobe or Foxit). Here's a list of all PDF software from Wikipedia. There's a "Viewers" section for each OS. What Windows PDF reader would you recommend?


How do you parse an HTML in vb.net

- by tooleb

I would like to know if there is a simple way to parse HTML in VB.NET. I know that HTML is not a strict subset of XML, but it would be nice if it could be treated that way. Is there anything out there that would let me parse HTML in an XML-like way in VB.NET?


Fast, lightweight HTML parser for C++

- by Jen

I'm looking for a fast, lightweight open-source HTML parser, something along the lines of a non-validating SAX parser (except, of course, for HTML). The answers to this question cover a parser that generates a DOM (don't want that), and these answers suggest conforming the HTML to XML before sending it to Xerces (can't do that in my case). Any suggestions?


How to parse malformed HTML in python, using standard libraries

- by bukzor

There are so many HTML and XML libraries built into Python that it's hard to believe there's no support for real-world HTML parsing. I've found plenty of great third-party libraries for this task, but this question is about the Python standard library.

Requirements:
- Use only Python standard library components (I'm currently using v2.6)
- DOM support
- Handle HTML entities (&nbsp;)
- Handle partial documents (like: Hello, <i>World</i>!)

Bonus points:
- XPATH support
- Handle unclosed/malformed tags. (<big>does anyone here know <html ???
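For comparison, here is a minimal sketch of what the standard library offers today. It assumes Python 3 (the question targets 2.6, where the module was called HTMLParser and behaved more strictly) and builds a tiny tree on top of the event-based html.parser.HTMLParser. The Node and TreeBuilder classes are illustrative names, not a standard DOM, and there is no XPath support here.

```python
from html.parser import HTMLParser

# Elements that never have content or an end tag.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class Node:
    def __init__(self, tag, attrs=None, parent=None):
        self.tag = tag
        self.attrs = dict(attrs or [])
        self.parent = parent
        self.children = []  # Node objects or plain text strings


class TreeBuilder(HTMLParser):
    def __init__(self):
        # convert_charrefs=True decodes entities such as &nbsp; in text.
        super().__init__(convert_charrefs=True)
        self.root = Node("#document")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs, self.current)
        self.current.children.append(node)
        if tag not in VOID_TAGS:
            self.current = node

    def handle_endtag(self, tag):
        # Close the nearest open element with this name; a stray end
        # tag that matches nothing is ignored.
        node = self.current
        while node is not self.root and node.tag != tag:
            node = node.parent
        if node is not self.root:
            self.current = node.parent

    def handle_data(self, data):
        self.current.children.append(data)


def parse_fragment(markup):
    builder = TreeBuilder()
    builder.feed(markup)
    builder.close()  # flushes any buffered trailing text
    return builder.root


def text_of(node):
    """Concatenate all text beneath a node."""
    return "".join(
        child if isinstance(child, str) else text_of(child)
        for child in node.children
    )


print(text_of(parse_fragment("Hello, <i>World</i>!")))  # prints: Hello, World!
```

Unclosed elements simply stay open until the input ends, so partial documents parse without errors. That leniency is a design choice of this sketch, and it does not reproduce the error-recovery rules browsers apply.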


linking jpeg image in html to php code

- by Avtar Brar

I'm creating a website which includes a button (a JPEG image) that links (via 'a href') to a PHP file. I need the PHP to be called when the 'email.jpg' button is clicked, but it only shows up as plain text in the web browser. Any ideas on how to resolve this? Any help is much appreciated! Thanks.

MAIN HTML SITE CODE

<div id="content-container">
<p align="center"><a href="video.mp4" class="html5lightbox" data-width="720" data-height="404"><img src="bg.jpg" width="1023" height="820" id="imgvideo" /></a>
<div align="center">
<table width="1027" height="46" border="0" cellpadding="0px">
<tr>
<td><a href="mail.php"><img src="email.png" width="130" height="46" /></a></td>
</tr>
</table>
</div>
</div>

PHP FILE CODE (mail.php)

<?php
/*EMAIL TEMPLATE BEGINS*/
$imgSrc = 'bg.jpg';
$imgDesc = 'test_sell_new/';
$imgTitle = 'bg.jpg';
$subjectPara1 = 'Now Available';
$subjectPara2 = NULL;
$subjectPara3 = NULL;
$subjectPara4 = NULL;
$subjectPara5 = NULL;
$message = '<!DOCTYPE HTML>'.
'<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"'.
'"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">'.
'<html xmlns="http://www.w3.org/1999/xhtml">'.
'<head>'.
'<title>Available Now</title>'.
'<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />'.
'</head>'.
'<body style="background-color:#ffffff; padding:0; margin:-10px 0 0 0; _margin:0 0 0 0; *margin:0 0 0 0; text-align:center;">'.
'<table border="0" cellpadding="0" cellspacing="0" width="1024" style="background-color:#ffffff; font-family:Arial, Helvetica, sans-serif; font-size:10px; padding-top:0px; margin:auto;">'.
'<tr>'.
'.<td><a href="index.html" style="text-decoration:none"><p style="color:#000000; text-align:center; margin:10px 0 0 0; *margin:0 0 0 0; _margin:0 0 0 0; font-family:Arial, Helvetica, sans-serif; font-size:11px;">Having trouble viewing this message? Click here.</p></a></td>'.
'</tr>'.
'<tr>'.
'<td>'.
'<table border="0" cellspacing="0" cellpadding="0" width="100%" height="820">'.
'<tr>'.
'<td>'.
'<a href="index.html"><img src="bg.jpg" width="1024" height="820" border="0" /></a><br />'.
'</td>'.
'</tr>'.
'</table>'.
'</td>'.
'</tr>'.
'</table>'.
'</body>'.
/*EMAIL TEMPLATE ENDS*/
$to = '[email protected]';
$subject = 'IT WORKS!';
$from = '[email protected]';
$headers = "From: " . $from . "\r\n";
$headers .= "Reply-To: ". $from . "\r\n";
$headers .= "CC: [email protected]\r\n";
$headers .= "MIME-Version: 1.0\r\n";
$headers .= "Content-Type: text/html; charset=ISO-8859-1\r\n";
?>


How to retain headers for all the pages of an exported pdf in php?

- by udaya

Hi, I am exporting data from a PHP page to PDF. When the data exceeds the page limit, the header is not available on the following pages. The function where I call the export to PDF is:

function changeDetails()
{
    $bType = $this->input->post('textvalue');
    if ($bType == "pdf") {
        $this->load->library('table');
        $this->load->plugin('to_pdf');
        $data['countrytoword'] = $this->AddEditmodel1->export();
        $this->table->set_heading('Country','State','Town','Name');
        $out = $this->table->generate($data['countrytoword']);
        $html = $this->load->view('newpdf', $data, true);
        pdf_create($html, $cur_date);
    }
}

This is my view page, from which I export the data to PDF:

Name Country State Town

Here is the result I am getting:

page:1
Name country State Town
udaya india Tamilnadu kovai
chandru srilanka columbo aaaaa

page:2
vivek england gggkj gjgjkj

On page 2 I don't get the headers Name, Country, State and Town.


Converting Creole to HTML, PDF, DOCX, ..

- by Marko Apfel

Challenge

We documented a project on Github using the Wiki there. For most articles we used Creole as the markup language. Now we have to deliver a lot of the content to our client in a common format like PDF or DOCX. So we need an automated way to extract all relevant content, merge it together and convert it to a new format.

Problem

One of the most popular toolsets for converting between formats is Pandoc. But unfortunately Pandoc does not support Creole (see the converting matrix).

Approach

So we need an intermediate step: converting from Creole to a format Pandoc supports. Creole/c is a Creole to Html converter and does exactly what we need. After converting our Creole content to Html we can use Pandoc for all the subsequent tasks.

Solution

Getting the Creole stuff

First of all we need the Creole content on our local machines. This is easy: because the Github Wiki is itself a Git repository, we can clone it to our machine. In the working copy we now see all the files, and the suffix gives us the hint for the markup language.

Converting and Merging Creole content to Html

Because we would like the content from several Creole files in one HTML file, we have to convert and merge all the input files into one output file. Creole/c has an option (-b) to generate only the Html that goes below the <body> tag. This is the hook for us to start from. We have to manually create the leading Html tags (<html>, <head>, ..), then we merge all needed Creole content into our output file, and last we add the closing tags. This can be done in a straightforward way with a little bit of old DOS magic:

REM === Generate the intro tags ===
ECHO ^<html^> > %TMP%\output.html
ECHO ^<head^> >> %TMP%\output.html
ECHO ^<meta name="generator" content="creole/c"^> >> %TMP%\output.html
ECHO ^</head^> >> %TMP%\output.html
ECHO ^<body^> >> %TMP%\output.html

REM === Mix in all interesting Creole stuff with creole/c ===
.\Creole-C\bin\creole.exe -b .\..\datamodel+overview.creole >> %TMP%\output.html
.\Creole-C\bin\creole.exe -b .\..\datamodel+domain+CvdCaptureMode.creole >> %TMP%\output.html
.\Creole-C\bin\creole.exe -b .\..\datamodel+domain+CvdDamageReducingActivity.creole >> %TMP%\output.html
.\Creole-C\bin\creole.exe -b .\..\datamodel+lookup+IncidentDamageCodes.creole >> %TMP%\output.html
.\Creole-C\bin\creole.exe -b .\..\datamodel+table+Attachments.creole >> %TMP%\output.html
.\Creole-C\bin\creole.exe -b .\..\datamodel+table+TrafficLights.creole >> %TMP%\output.html

REM === Generate the outro tags ===
ECHO ^</body^> >> %TMP%\output.html
ECHO ^</html^> >> %TMP%\output.html

REM === Convert the Html file to Docx with Pandoc ===
.\Pandoc\bin\pandoc.exe -o .\Database-Schema.docx %TMP%\output.html

Some explanation for this

- The first ECHO call creates the file: the opening <html> tag is sent via > to a temporary working file. All following calls append content to the existing file via >>.
- The tag characters < and > must be escaped. This is done with the caret sign (^).
- We use a file in the default temporary folder (%TMP%) to avoid writing into our current folders (better for continuous integration).
- Both toolsets (Creole/c and Pandoc) are copied to a versioned tools folder in the Wiki. This can be committed and causes no problem after pushing; Github does not do anything with it. This folder also contains the batch file (Export-Docx.bat) for all the steps.
- Pandoc recognizes the conversion from the suffixes of the file names, so it is enough to specify only the input and output files.
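The same pipeline can be sketched in Python for readers who prefer a script over batch redirection. This is a hedged sketch, not the author's tooling: the function names are hypothetical, the executable paths are passed in by the caller, and only the two invocations shown in the batch file are used (creole.exe -b <file> writing Html to standard output, and pandoc.exe -o <output> <input>).

```python
import subprocess
import tempfile
from pathlib import Path

# Same wrapper tags the batch file writes with ECHO.
HEADER = (
    "<html>\n<head>\n"
    '<meta name="generator" content="creole/c">\n'
    "</head>\n<body>\n"
)
FOOTER = "</body>\n</html>\n"


def wrap_fragments(fragments):
    """Merge body-only Html fragments into one complete document."""
    return HEADER + "".join(fragments) + FOOTER


def creole_to_body(creole_exe, source):
    """Run creole/c with -b and return the body-only Html it prints."""
    result = subprocess.run(
        [str(creole_exe), "-b", str(source)],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def export_docx(creole_exe, pandoc_exe, sources, output):
    """Convert each Creole file, merge the results, and hand off to Pandoc."""
    html_text = wrap_fragments(creole_to_body(creole_exe, s) for s in sources)
    # Work in a temporary folder, mirroring the %TMP% choice above.
    with tempfile.TemporaryDirectory() as tmp:
        html_path = Path(tmp) / "output.html"
        html_path.write_text(html_text, encoding="utf-8")
        # Pandoc infers the formats from the file suffixes.
        subprocess.run(
            [str(pandoc_exe), "-o", str(output), str(html_path)],
            check=True,
        )
```

Passing argument lists to subprocess.run avoids the caret escaping the batch file needs, and check=True makes a failing converter stop the export instead of silently producing a partial document.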

Search Results

Search found 32919 results on 1317 pages for 'html to pdf'.
