Search Results

Search found 4479 results on 180 pages for 'pdf scraping'.

Page 16/180 | < Previous Page | 12 13 14 15 16 17 18 19 20 21 22 23 | Next Page >

Removing the password from a PDF file

- by Alister

I have a couple of ebooks as PDFs with passwords, but my ebook reader (Sony PRS-600) doesn't seem to support password-protected PDFs. What is the easiest way of removing the password from a PDF? (I know the password, which presumably helps a lot.) It's a bit annoying to buy a book and then only be able to read it in front of a computer.

Open Source PDF editor to make files non-printable

- by Nissan Fan

I need an open source solution to simply make any PDF unprintable. I know there are questions about editors, but I need only this particular feature; editing the content isn't a concern.

Adjust colours in a PDF

- by user1035

I'd like to make colour adjustments to an existing PDF file, the equivalent of Photoshop's adjustments, and save a new version with the altered colours. I'm after more than a colourspace conversion. I'd like to take a file that's black and white, and convert it into green, blue, yellow, pink etc versions. It's upwards of 100 pages full of text and graphic elements, so doing it by hand isn't really an option. Is there any way of achieving this?

PDF form (not) saving

- by gregseth

I've created a form in a PDF with Adobe Acrobat Pro. When empty, I want to use it as a template which the user opens, fills in, and saves as a copy, to preserve the blank state of the template. Here's the catch. I have found two ways to set this up: make the document read-only, in which case the user can't save the form values, only print them; or make the document writable, in which case the document acting as a template can be modified too. Any ideas? Thanks.

Batch convert multiple PDF to Image on Mac

- by RdSchull

What would be the quickest way to convert a bunch of pdf files to jpeg files? I know I can open Preview and Save As image, but that could take a long time. Thanks for any help.

Valid links in PDF Documents

- by Cosi

I am looking for an application that can check the links in my PDF documents to ensure that they are still up to date.

Scrape zipcode table for different urls based on county

- by Dr.Venkman

I used lxml and ran into a wall: my new computer won't install lxml, so the code doesn't work. I know this is simple; maybe someone can help with a Beautiful Soup script. This is my code:

    import codecs
    import lxml as lh
    from selenium import webdriver
    import time
    import re

    results = []
    # The loops below iterate over these lists, so the names must match.
    citys = ['amador']
    states = ['CA']

    for state in states:
        for city in citys:
            browser = webdriver.Firefox()
            link2 = 'http://www.getzips.com/cgi-bin/ziplook.exe?What=3&County=' + city + '&State=' + state + '&Submit=Look+It+Up'
            browser.get(link2)
            bcontent = browser.page_source
            # Slice from the first ZIP code cell up to the first <p> tag.
            zipcode = bcontent[bcontent.find('<td width="15%"'):bcontent.find('<p>')+0]
            if len(zipcode) > 0:
                print zipcode
            else:
                print 'none'
            browser.quit()

Thanks for the help.
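For the parsing step, one option that needs neither lxml nor Beautiful Soup is the standard-library `html.parser` module. The following is only a minimal sketch in Python 3 under stated assumptions: `CellCollector`, `extract_zipcodes`, and `SAMPLE_HTML` are hypothetical names, the sample markup is made up for illustration, and the code assumes each ZIP code sits alone in a `<td>` cell as a five-digit string, which the real page may not do.

```python
from html.parser import HTMLParser
import re


class CellCollector(HTMLParser):
    """Collect the text content of every <td> cell in a document."""

    def __init__(self):
        super().__init__()
        self._in_cell = False
        self.cells = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names before calling this method.
        if tag == "td":
            self._in_cell = True
            self.cells.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self.cells[-1] += data


def extract_zipcodes(html):
    """Return the cells that consist of exactly five digits (assumed ZIP codes)."""
    parser = CellCollector()
    parser.feed(html)
    parser.close()
    return [c.strip() for c in parser.cells if re.fullmatch(r"\d{5}", c.strip())]


# Hypothetical sample markup; the real page layout is an assumption.
SAMPLE_HTML = """
<table>
  <tr><td width="15%">95640</td><td>Ione</td><td>Amador</td></tr>
  <tr><td width="15%">95642</td><td>Jackson</td><td>Amador</td></tr>
</table>
"""

print(extract_zipcodes(SAMPLE_HTML))
```

The page source could still come from `browser.page_source` as in the code above; only the string slicing would be replaced by the parser.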

How does Cell Minute Tracker work?

- by embedded

It's been a mystery how Cell Minute Tracker manages to fetch AT&T users' data. Maybe someone here has the long-awaited answer. I'm really curious whether they got permission to scrape users' cellular reports, and how they can fire multiple requests at the AT&T site without being banned. I'm waiting for someone who could shed some light on this mystery. Thanks. Link: http://www.uquery.com/apps/311637771-cell-minute-tracker-for-att

How to scrape user's data without being banned by the server?

- by embedded

I'm developing a site which monitors users' data. It uses cURL from PHP. It first gets authorized using a cookie and then parses the required data. My problem is that it needs to fire multiple requests at the server (one for each registered user), and this may get me banned by the remote server. I would like to know if there is something I could do to prevent being banned. (This activity is legal: the users have provided their login information.) Thanks.
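One common way to reduce the load on a remote server is to space the requests out. The sketch below illustrates that idea in Python rather than PHP, and it is not a guarantee against being banned: `Throttle`, `fetch_all`, and the `fetch` callback are hypothetical names, and a suitable interval depends on the remote site's own limits and terms, which are not known here.

```python
import time


class Throttle:
    """Enforce a minimum delay between consecutive calls to wait()."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock    # injectable so the logic can be tested without real delays
        self._sleep = sleep
        self._last = None      # time of the previous call, None before the first one

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


def fetch_all(user_ids, fetch, throttle):
    """Call fetch(user_id) for each user, pausing between calls."""
    results = {}
    for user_id in user_ids:
        throttle.wait()
        results[user_id] = fetch(user_id)
    return results


# Demo with a stand-in fetch function and a very short interval.
demo = fetch_all(["alice", "bob"], lambda u: "report for " + u, Throttle(0.01))
print(demo)
```

In a real deployment the `fetch` callback would wrap the authenticated request for one user; spreading the users across a scheduled job instead of one burst follows the same principle.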

How can I totally flatten a PDF in Mac OS on the command line?

- by Matthew Leingang

I use Mac OS X Snow Leopard. I have a PDF with form fields, annotations, and stamps on it. I would like to freeze (or "flatten") that PDF so that the form fields can't be changed and the annotations/stamps are no longer editable. Since I actually have many of these PDFs, I want to do this automatically on the command line.

Some things I've tried/considered, with their degree of success:

- Open in Preview and Print to File. This creates a totally flat PDF without changing the file size. The only way to automate seems to be to write a kludgy UI-based AppleScript, though, which I've been trying to avoid.
- Open in Acrobat Pro and use a JavaScript function to flatten. Again, not sure how to automate this on the command line.
- Use pdftk with the flatten option. But this only flattens form fields, not stamps and other annotations.
- Use cupsfilter, which can create PDF from many file formats. Like pdftk, this flattened only the form fields.
- Use cups-pdf to hook into the Mac's print server and save a PDF file instead of printing. I used the MacPorts version. The resulting file is flat but huge. I tried this on an 8MB file; the flattened PDF was 358MB! Perhaps this can be combined with a Ghostscript call as in "Ubuntu Tip: Howto reduce PDF file size from command line".

Any other suggestions would be appreciated.
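The Ghostscript idea in the last item could be scripted roughly as follows. This is a hedged sketch, not a tested fix: `build_gs_compress_command` and `compress_pdf` are hypothetical helper names, the `ebook` quality preset is an arbitrary choice, and whether the recompressed file is small enough (or keeps acceptable quality) for these particular PDFs is unverified.

```python
import subprocess


def build_gs_compress_command(src, dst, quality="ebook"):
    """Build a Ghostscript argument list that rewrites src as a recompressed PDF."""
    return [
        "gs",
        "-sDEVICE=pdfwrite",          # write a new PDF rather than rendering to an image
        "-dCompatibilityLevel=1.4",
        "-dPDFSETTINGS=/" + quality,  # preset such as screen, ebook, printer, prepress
        "-dNOPAUSE",
        "-dQUIET",
        "-dBATCH",
        "-sOutputFile=" + dst,
        src,
    ]


def compress_pdf(src, dst, quality="ebook"):
    """Run Ghostscript; assumes a gs binary is available on the PATH."""
    subprocess.run(build_gs_compress_command(src, dst, quality), check=True)


# Only print the command here; running it requires Ghostscript and a real file.
print(" ".join(build_gs_compress_command("flattened.pdf", "flattened-small.pdf")))
```

Looping `compress_pdf` over a directory of cups-pdf output would give the batch behaviour asked for, assuming the flattening survives the rewrite.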

how to write barcode in html format when using tcpdf

- by JewelThief

I am using TCPDF to generate a PDF file using the following command:

    $pdf->writeHTML($htmlcontent, true, 0, true, 0);

TCPDF also provides a way to create a barcode with the following commands:

    $pdf->Cell(0, 0, 'C39+', 0, 1);
    $pdf->write1DBarcode('Code 39', 'C39+', '', '', 80, 15, 0.4, $style, 'N');
    $pdf->Ln();

I want to be able to write the barcode as part of the HTML code above. Is there an easy way? I could potentially reference a barcode image in the writeHTML call above, but I'm not sure how to use the barcode function above (or any other in TCPDF) to create an image and then get that image into the HTML generation.

Print same text several time to one page

- by RiaD

I have an ODT (or PDF, or PS) file. (Really I have an ODT, but I can easily convert it.) It consists of one page. Now I want to print it to another PDF four times on one page. There is a "pages per side" option, so if I copy and paste my document four times and set this option to 4, I'll get the expected result. But I want to do it without copying and pasting, because it's quite annoying to do that before each printing. Is there a simpler way?

How to Edit PDFs?

- by snowguy

I typically have two needs:

Scenario A. Change a single PDF page. In this case I have a PDF but not the original source file used to create the PDF. I don't want to try to recreate the document from scratch. I'd like to open the PDF and change a few things. A good example of this scenario: I was responsible for planning a big event at a campground site, and I had a PDF of the site. I wanted to start with that document, highlight some parts, add some labels, and remove some parts that weren't relevant.

Scenario B. Combine PDFs or extract information from a PDF. This scenario usually arises because I want a single PDF deliverable that is made up of parts that are best created in different programs. In this case I have the source files for all the documents, but they don't play well enough together to easily create a single PDF deliverable. For part of it, I may want to use LibreOffice Writer. For another page I may want to use GIMP. For still another page I may use LibreOffice Calc. I could use Writer as the master document and embed images or the Calc object into that, but for ultimate control, you can't beat separate PDF documents that are then combined.

What are the best tools / processes for editing PDFs in Ubuntu?

Embedded pdf object steals focus and will not let it go

- by Kristian Hebert

Hi guys, I was given the task of adding some usability to one of our applications, i.e. making sure that every control has a shortcut key and that every control can be reached by "tabbing" through the page. The GUI runs in an IE control on a WinForm and consists of ASP.NET pages, so basically it is just ASP.NET always running in Internet Explorer. My problem is that one of the pages has an embedded PDF in it, like so:

    <object tabindex="-1" height="273" width="663" visible="false" type="Application/pdf" data="showpdf.ashx#navpanes=0"></object>

showpdf.ashx is an HTTP handler that streams the PDF contents to the response. It does not handle focus in any way. Now when I run this page, the PDF application steals focus, no matter what I do to set it to another control. And once it takes focus, I cannot take it back with the keyboard. Only a mouse click on the page will set it to another control. I have tried to set focus in the code-behind OnPreRender, or in JavaScript, but no luck. It seems that the HTTP handler always runs after all the other code, and it sets focus on the PDF object. Any thoughts would be greatly appreciated.

download html file as pdf using abcpdf

- by nandini

How can I download an HTML file as a PDF using ABCpdf in ASP.NET with C#?

Imagemagick PDF to JPG conversion failing

- by sbressler

I'm trying to convert the first page of a PDF to a JPG. I'm pretty sure I got this to work with certain PDFs, but is it really possible that certain PDFs are made incorrectly and cannot be converted? I tried running this first:

    $ convert 10-03-26.pdf[1] test.jpg

And I got the following:

    Error: /syntaxerror in readxref
    Operand stack:
    Execution stack:
       %interp_exit .runexec2 --nostringval-- --nostringval-- --nostringval-- 2 %stopped_push --nostringval-- --nostringval-- --nostringval-- false 1 %stopped_push 1 3 %oparray_pop 1 3 %oparray_pop --nostringval-- --nostringval-- --nostringval-- --nostringval-- --nostringval-- --nostringval--
    Dictionary stack:
       --dict:1062/1417(ro)(G)-- --dict:0/20(G)-- --dict:73/200(L)-- --dict:73/200(L)-- --dict:97/127(ro)(G)-- --dict:229/230(ro)(G)-- --dict:14/15(L)--
    Current allocation mode is local
    ESP Ghostscript 7.07.1: Unrecoverable error, exit code 1
    convert: Postscript delegate failed `10-03-26.pdf'.

Running this instead:

    $ convert -verbose -colorspace rgb '10-03-26.pdf[1]' test.jpg

I get the following:

    Error: /syntaxerror in readxref
    Operand stack:
    Execution stack:
       %interp_exit .runexec2 --nostringval-- --nostringval-- --nostringval-- 2 %stopped_push --nostringval-- --nostringval-- --nostringval-- false 1 %stopped_push 1 3 %oparray_pop 1 3 %oparray_pop --nostringval-- --nostringval-- --nostringval-- --nostringval-- --nostringval-- --nostringval--
    Dictionary stack:
       --dict:1062/1417(ro)(G)-- --dict:0/20(G)-- --dict:73/200(L)-- --dict:73/200(L)-- --dict:97/127(ro)(G)-- --dict:229/230(ro)(G)-- --dict:14/15(L)--
    Current allocation mode is local
    ESP Ghostscript 7.07.1: Unrecoverable error, exit code 1
    "gs" -q -dBATCH -dSAFER -dMaxBitmap=500000000 -dNOPAUSE -dAlignToPixels=0 "-sDEVICE=pnmraw" -dTextAlphaBits=4 -dGraphicsAlphaBits=4 "-g792x1611" "-r72x72" -dFirstPage=2 -dLastPage=2 "-sOutputFile=/tmp/magick-XXU3T44P" "-f/tmp/magick-XXoMKL8Z" "-f/tmp/magic2eec1F"
    Start of Image
    Define Huffman Table 0x00
      0 1 5 1 1 1 1 1 1 0 0 0 0 0 0 0
    Define Huffman Table 0x01
      0 3 1 1 1 1 1 1 1 1 1 0 0 0 0 0
    Define Huffman Table 0x10
      0 2 1 3 3 2 4 3 5 5 4 4 0 0 1 125
    Define Huffman Table 0x11
      0 2 1 2 4 4 3 4 7 5 4 4 0 1 2 119
    End Of Image
    convert: Postscript delegate failed `10-03-26.pdf'.

Why would the conversion fail? Thanks!
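One detail visible in the verbose output: the `gs` call was given `-dFirstPage=2 -dLastPage=2` for the `[1]` suffix, which is consistent with ImageMagick's page index being zero-based. The first page would therefore be `[0]`. That may or may not be related to the `readxref` error itself. The sketch below only shows the index arithmetic; `page_spec` and `build_convert_command` are hypothetical helper names.

```python
def page_spec(pdf_path, page_number):
    """Return an ImageMagick input spec for a 1-based page number."""
    if page_number < 1:
        raise ValueError("page_number is 1-based and must be >= 1")
    # ImageMagick counts pages from zero, so page 1 becomes index 0.
    return "{}[{}]".format(pdf_path, page_number - 1)


def build_convert_command(pdf_path, page_number, out_path):
    """Build the argument list for converting one PDF page to an image."""
    return ["convert", page_spec(pdf_path, page_number), out_path]


print(build_convert_command("10-03-26.pdf", 1, "test.jpg"))
```

Passing the arguments as a list (for example to `subprocess.run`) also avoids the shell interpreting the square brackets, which is why the second command in the question quotes the file name.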

Use a System.Drawing.Printing.PrintDocument to generate a PDF in memory

- by MarkB29

Does anyone know if the following is possible and, if so, what the best way of doing it for free is? I am generating a PrintDocument in a project I am currently working on and displaying a print dialog box so a user can choose which printer they want to use, etc. This is currently a Windows Forms application, and if a user wants to print to a PDF they can select CutePDF or something similar as the printer. However, I am now putting an ASP.NET web front end on the application and want to use the same code to generate the PrintDocument, but print it to a PDF on the fly and serve it up via the Response stream as a PDF download. So my question is: how can I use the current PrintDocument and generate a PDF in memory from it? Thanks.

Load PDF from Memory ASP.Net

- by Sandhurst

I am using iTextSharp to generate a PDF on the fly, then saving it to disk and displaying it using a Frame. The Frame has an attribute called src, to which I pass the generated file name. This is all working fine. What I want to achieve is passing the generated PDF file to the Frame without saving it to disk.

    HtmlToPdfBuilder builder = new HtmlToPdfBuilder(PageSize.LETTER);
    HtmlPdfPage first = builder.AddPage();
    //import an entire sheet
    builder.ImportStylesheet(Request.PhysicalApplicationPath + "CSS\\Stylesheet.css");
    string coupon = CreateCoupon();
    first.AppendHtml(coupon);
    byte[] file = builder.RenderPdf();
    File.WriteAllBytes(Request.PhysicalApplicationPath+"final.pdf", file);
    printable.Attributes["src"] = "final.pdf";

pdf viewer for pyqt4 application?

- by japs

Hi all, I'm writing a Python+Qt4 application that would ideally need to pop up a window every once in a while to display PDF documents and allow very basic operations, namely scrolling through the different pages and printing the document. I've found ReportLab for creating PDF files, but nothing about PDF viewers. Does anyone know of anything that might help? I was really hoping for the existence of something like the QWebView widget... Thanks in advance to all.

Problem to display a pdf from my JSF Portlet of Liferay

- by Stefano

I use Liferay 5.2 with a JSF portlet. From the page I want to press a button to generate a PDF. In the managed bean I build the PDF, and I want to send it in the response. I have my PDF, built with JasperReports, in a ByteArrayOutputStream named outputStream. I write:

    PortletResponse portletResponse = (PortletResponse)externalCtx.getResponse();
    HttpServletResponse httpResponse = PortalUtil.getHttpServletResponse(portletResponse);
    ServletOutputStream out = httpResponse.getOutputStream();
    String filename="Pdf" + System.currentTimeMillis()+".pdf";
    httpResponse.reset();
    httpResponse.setContentType("application/pdf");
    httpResponse.setHeader("Content-Disposition", "attachment; filename=\""+ filename + "\"");
    httpResponse.setContentLength(outputStream.size());
    outputStream.writeTo(out);
    out.flush();
    out.close();

I do not see any output! In the JBoss log I read: IllegalStateException.... What is wrong?

LOG

    11:03:19,716 INFO [STDOUT] 11:03:19,716 ERROR [IncludeTag] Current URL /web/organo-di-governo/datawarehouse?p_p_id=1_WAR_Portlet_Datawarehouse_INSTANCE_D7s7&p_p_lifecycle=1&p_p_state=normal&p_p_mode=view&p_p_col_id=column-2&p_p_col_count=1&_1_WAR_Portlet_Datawarehouse_INSTANCE_D7s7_com.sun.faces.portlet.VIEW_ID=%2Fview.xhtml&_1_WAR_Portlet_Datawarehouse_INSTANCE_D7s7_com.sun.faces.portlet.NAME_SPACE=_1_WAR_Portlet_Datawarehouse_INSTANCE_D7s7_ generates exception: null
    11:03:19,717 INFO [STDOUT] 11:03:19,717 ERROR [IncludeTag] java.lang.IllegalStateException
        at com.liferay.portal.servlet.filters.strip.StripResponse.getWriter(StripResponse.java:85)
        at org.apache.jasper.runtime.JspWriterImpl.initOut(JspWriterImpl.java:125)
        at org.apache.jasper.runtime.JspWriterImpl.flushBuffer(JspWriterImpl.java:118)
        at org.apache.jasper.runtime.JspWriterImpl.write(JspWriterImpl.java:326)
        at org.apache.jasper.runtime.JspWriterImpl.write(JspWriterImpl.java:342)
        at org.apache.jasper.runtime.JspWriterImpl.print(JspWriterImpl.java:468)
        at com.liferay.taglib.util.ThemeUtil.includeVM(ThemeUtil.java:208)
        at com.liferay.taglib.util.ThemeUtil.include(ThemeUtil.java:68)
        at com.liferay.taglib.util.IncludeTag.doEndTag(IncludeTag.java:59)
        at org.apache.jsp.html.common.themes.portal_jsp._jspx_meth_liferay_002dtheme_005finclude_005f1(portal_jsp.java:816)
        at org.apache.jsp.html.common.themes.portal_jsp._jspx_meth_c_005fotherwise_005f0(portal_jsp.java:788)
        at org.apache.jsp.html.common.themes.portal_jsp._jspService(portal_jsp.java:724)
        at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:70)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:803)
        at org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:373)
        at org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:336)
        at org.apache.jasper.servlet.JspServlet.service(JspServlet.java:265)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:803)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    11:03:19,718 ERROR [[jsp]] Servlet.service() for servlet jsp threw exception java.lang.IllegalStateException
    11:03:19,719 ERROR [[Main Servlet]] Servlet.service() for servlet Main Servlet threw exception java.lang.IllegalStateException
        at com.liferay.portal.servlet.filters.strip.StripResponse.getWriter(StripResponse.java:85)
        at org.apache.jasper.runtime.JspWriterImpl.initOut(JspWriterImpl.java:125)
        at org.apache.jasper.runtime.JspWriterImpl.flushBuffer(JspWriterImpl.java:118)
    11:03:19,722 INFO [STDOUT] 11:03:19,720 ERROR [OpenSSOFilter] org.apache.jasper.JasperException: java.lang.IllegalStateException
    org.apache.jasper.JasperException: java.lang.IllegalStateException
        at org.apache.jasper.servlet.JspServletWrapper.handleJspException(JspServletWrapper.java:521)
        at org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:409)
        at org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:336)
        at org.apache.jasper.servlet.JspServlet.service(JspServlet.java:265)
    11:03:19,722 INFO [STDOUT] n.internalDoFilter(ApplicationFilterChain.java:235)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at org.jboss.web.tomcat.filters.ReplyHeaderFilter.doFilter(ReplyHeaderFilter.java:96)
    Caused by: java.lang.IllegalStateException
        at com.liferay.portal.servlet.filters.strip.StripResponse.getWriter(StripResponse.java:85)
        at org.apache.jasper.runtime.JspWriterImpl.initOut(JspWriterImpl.java:125)
        at org.apache.jasper.runtime.JspWriterImpl.flushBuffer(JspWriterImpl.java:118)

Password protected PDF using C#

- by balaweblog

I am creating a PDF document using C# code in my process. I need to protect the document with some standard password like "123456" or an account number. I need to do this without referencing any DLLs such as a PDF writer. I am generating the PDF file using SQL Server Reporting Services reports. Is there an easy way?

CSS to PDF, using THEAD for repeating header on new page

- by behrk2

Hey everyone, I have CSS and HTML that I will be converting into PDF. I want to specify a header on each page that, in the PDF, will repeat on each new page. I know that I can use THEAD to specify the header, however, is there a free html-to-pdf converter that will respect the THEAD tag? If not, are there any alternatives? Thanks...

Extracting images from a PDF

- by sagar

My Query

I want to extract only images from a PDF document, using Objective-C in an iPhone Application.

My Efforts

I have gone through the info on this link, which has details regarding different operators on PDF documents. I also studied this document from Apple about PDF parsing with Quartz. I also went through the entire PDF reference document from the Adobe site. According to that document, for each image there are the following operators: q Q BI EI

I have created a table to get the image:

    myTable = CGPDFOperatorTableCreate();
    CGPDFOperatorTableSetCallback(myTable, "q", arrayCallback2);
    CGPDFOperatorTableSetCallback(myTable, "TJ", arrayCallback);
    CGPDFOperatorTableSetCallback(myTable, "Tj", stringCallback);

I use this method to get the image:

    void arrayCallback2(CGPDFScannerRef inScanner, void *userInfo)
    {
        // THIS DOESN'T WORK
        // CGPDFStreamRef stream; // represents a sequence of bytes
        // if (CGPDFDictionaryGetStream (d, "BI", &stream)){
        //     CGPDFDataFormat t=CGPDFDataFormatJPEG2000;
        //     CFDataRef data = CGPDFStreamCopyData (stream, &t);
        // }
    }

This method is called for the operator "q", but I don't know how to extract an image from it. What should be the solution for extracting the images from the PDF documents? Thanks in advance for your kind help.

Mac, PDFKit, PDF/X

- by PhillipeTKern

I need to generate and view PDF/X-1a. After spending quite some time on this, I came to the conclusion that the only way to achieve it (hopefully someone will prove me wrong) is to use Cocoa. More context: I need to generate PDF/X-1a, that is, with all the fonts embedded, spot colors, overprint, ... preferably from Python. But the only libraries which support such things are iText and perhaps Apache FOP. The first one could be used with Jython, which is OK but not optimal. But then, I simply could not find any viable viewer. Poppler, xpdf, Sumatra, MuPDF, Ghostscript: none of them can handle large CMYK PDFs, lots of text, ... I really would like to use open source libraries, but unfortunately the only option I see right now is to buy a Mac, as I saw one could print (Save As) to the built-in PDF printer with PDF/X conformance, and I expect Preview to be comparable, if not better, to Acrobat. But I'm not sure whether it is even possible to programmatically access the Mac OS libraries responsible for generating PDF. I'm asking because I heard that not everything in Mac OS is available via a public API... (And as far as Python goes, I thought I could use PyObjC?) Any other ideas are of course very welcome!

Delphi Load and Edit Pdf Documents

- by Pieter van Wyk

Hi. Does anybody know of a product that allows loading and editing of PDF files in Delphi? We need to break apart a PDF document with multiple images (one per page) into single PDFs. Regards, Pieter
