Search Results

Search found 4479 results on 180 pages for 'pdf scraping'.

Page 39/180 | < Previous Page | 35 36 37 38 39 40 41 42 43 44 45 46  | Next Page >

  • Dynamically creating a member ID card as pdf using PHP?

    - by aefxx
    I need to code a PHP script that would let me generate a pdf file which displays a member ID card (something like a credit card used to identify oneself) at a certain resolution. Let me explain: I do have the basic blueprint of the card in png file format. The script needs to drop in a member's name and birthday along with a serial. So far, no problem - there are plenty of good working PHP libraries out there. My problem is to ensure that the resulting pdf (the generated image of the card, to be precise) meets a certain resolution (preferably 300dpi), so that printing it would look right. Any ideas? EDIT I solved it using the TCPDF library which lets you scale images at a certain resolution. Get it here: http://www.tecnick.com/public/code/cp_dpage.php?aiocp_dp=tcpdf
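    The 300 dpi requirement above comes down to simple arithmetic: the printed size is the pixel size divided by the resolution. The sketch below is only an illustration of that arithmetic in Python, not the asker's TCPDF code; the function names are hypothetical, and the 85.6 mm x 53.98 mm card size is an assumption (the common credit-card format), since the question does not state the card's dimensions.

```python
MM_PER_INCH = 25.4


def print_size_mm(width_px, height_px, dpi=300):
    """Physical size (in mm) at which an image must be placed to print at `dpi`."""
    return (width_px / dpi * MM_PER_INCH, height_px / dpi * MM_PER_INCH)


def pixels_needed(width_mm, height_mm, dpi=300):
    """Pixel dimensions an image needs to cover the given physical size at `dpi`."""
    return (round(width_mm / MM_PER_INCH * dpi), round(height_mm / MM_PER_INCH * dpi))


if __name__ == "__main__":
    # Assumed card size: 85.6 mm x 53.98 mm (credit-card format).
    print(pixels_needed(85.6, 53.98))
    # A 300 x 600 px blueprint placed at this size prints at 300 dpi.
    print(print_size_mm(300, 600))
```

    In other words, if the PNG blueprint is placed in the PDF at the physical size returned by print_size_mm, each printed inch is covered by 300 image pixels.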

    Read the article

  • How to deal with PDF annotation on iPad in Objective-C?

    - by Sarah
    Hello, I know this may sound like a silly question, but I am really confused. I am going to work on an application that has operations such as PDF loading, annotation, scrolling, zooming, and similar functions. I am a little confused about which template I should use. I went through the Quartz 2D Programming Guide and was unsure whether I will be able to implement the functions listed above by following that guide, as it displays the PDF page on the whole screen. Or is there another way? Can I use UIWebView for the functions I listed above? I would be grateful for any help. Thank you.

    Read the article

  • Is it not possible to print a pdf from a hyperlink?

    - by andrew
    I have looked for weeks and I keep hitting dead ends. I know you can create a text or image link and tell it to "print page" in a browser. But so far, I can't get it to print a document, specifically a pdf. I would like the print dialog to show after the link is clicked and yes, the pdf linked to has been printed. Why does this seem to be such an impossible feat? I have seen it work in a Flash movie, but since I cannot access the native file I cannot see how it was done. Any advice? Thanks.

    Read the article

  • How to generate a PDF of dynamic HTML content?

    - by chris Frisina
    I want to let users generate content dynamically, have that information be in a <div>, and then allow that specific <div> to be exported to a PDF. I have Joomla up and running locally (with the appropriate MySQL and ANT) with the Web2PDF extension, but how would I get those running on my domain (hosted by Dreamhost)? Are there any other approaches you might recommend? The content is generated by JS and jQuery, and formatted with CSS and HTML. Other considerations: Web2PDF generates a PDF of the entire content (pulling the entire page's HTML, not just the specific <div>).

    Read the article

  • How to know if a PDF contains only images or has been OCR scanned for searching?

    - by Bratch
    I have a bunch of PDF files that came from scanned documents. The files contain a mix of images and text. Some were scanned as images with no OCR, so each PDF page is one large image, even where the whole page is entirely text. Others were scanned with OCR and contain images and searchable text where text is present. In many cases even words in the images were made searchable. I want to make an automated process to recognize the text in all of the scanned documents using OCR, with Acrobat 8 Pro, but I don't want to re-OCR the files that have already been through the OCR process in the past. Does anyone know if there is a way to tell which ones contain only images, and which ones already contain searchable text? I'm planning on doing this in C# or VB.NET but I don't think being able to tell the two kinds of files apart is language dependent.
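    One language-independent way to frame the check above: extract the text layer of each page with whatever PDF library is available, and treat pages that yield (almost) no text as image-only. The Python sketch below shows only the decision step. The function name and the min_chars threshold are hypothetical, and it assumes the per-page strings were already obtained from a text extractor (for example pypdf's PdfReader(path).pages[i].extract_text(), which is not exercised here).

```python
def pages_missing_text(page_texts, min_chars=20):
    """Return the indexes of pages whose extracted text layer is (nearly) empty.

    `page_texts` holds one string per page, as produced by a PDF text
    extractor. `min_chars` is an arbitrary threshold (an assumption) that
    ignores stray characters such as page numbers.
    """
    missing = []
    for index, text in enumerate(page_texts):
        if len((text or "").strip()) < min_chars:
            missing.append(index)
    return missing


if __name__ == "__main__":
    sample = ["Invoice 2011, total due: 42.00 EUR", "", "  \n"]
    # Pages listed here would still need to go through OCR.
    print(pages_missing_text(sample))
```

    A file with an empty result already has searchable text on every page and could be skipped by the batch OCR run. A scanned page whose only text is inside an image that was never recognized will still be reported as missing text, which is the intended behaviour here.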

    Read the article

  • Why do some PDFs lag in Adobe Acrobat?

    - by Coldblackice
    I have a handful of PDFs open. One of them in particular is extremely laggy, almost to the point of being unreadable. When I scroll through its pages, it's almost like an extreme version of v-sync being turned off. Very choppy. Overall system resources are plentiful, and all of the other PDFs cruise up and down with no stuttering or problems. I've tried closing and reopening the problem PDF to no avail. It's a small PDF, only 3MB in size, with no graphics (only programming code snippets). Surely, it must be some type of problem with the specific PDF (I'll try opening it in another PDF-viewing program, rather than Acrobat X). Possible corruption? Could there be some type of GPU/hardware-acceleration intervening going on? I've never heard of such with PDF-viewing.

    Read the article

  • How to split a big PostScript file (3000 pages) into one individual file per page (using Windows 7)?

    - by Pablo
    Hi, I'm having trouble doing the following: I have a big PDF file that I converted to PostScript (for commercial printing). The resulting file is too big to be processed by the printer (machine). I've been trying to find a way to either: (1) convert the original (many-page) PDF file to many PostScript files (one PostScript file per page of the original PDF), or (2) convert from PDF to PS (or even EPS), which I managed to do, and then split the PS file into a collection of smaller files. I've tried using Ghostscript, but it is all gibberish to me. Thanks. P.S. If you have a good Ghostscript tutorial (for dummies?), please share the link.
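    For the first option above (one PostScript file per PDF page), a possible approach is to call Ghostscript once per page with its -dFirstPage and -dLastPage switches and the ps2write output device. The Python sketch below only builds and runs those command lines. The function names, the page_NNNN.ps naming scheme and the default executable name gswin64c (the 64-bit Windows console build) are assumptions, and the page count has to be supplied by the caller.

```python
import subprocess


def page_command(pdf_path, page, gs="gswin64c"):
    """Build the Ghostscript command that writes one PDF page as PostScript."""
    output = f"page_{page:04d}.ps"  # hypothetical naming scheme
    return [
        gs,
        "-dBATCH",            # exit when processing is done
        "-dNOPAUSE",          # do not wait for a key press between pages
        "-dSAFER",
        "-sDEVICE=ps2write",  # PostScript output device
        f"-dFirstPage={page}",
        f"-dLastPage={page}",
        f"-sOutputFile={output}",
        pdf_path,
    ]


def split_pdf(pdf_path, page_count, gs="gswin64c"):
    """Run Ghostscript once per page; raises if any invocation fails."""
    for page in range(1, page_count + 1):
        subprocess.run(page_command(pdf_path, page, gs), check=True)


if __name__ == "__main__":
    # Print the command for page 7 without running Ghostscript.
    print(" ".join(page_command("book.pdf", 7)))
```

    Starting one Ghostscript process per page is slow for 3000 pages, but it keeps each output file independent and makes it easy to resume after a failure.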

    Read the article

  • Web scraping with Python

    - by Jack
    I'm currently trying to scrape a website that has fairly poorly-formatted HTML (often missing closing tags, no use of classes or ids so it's incredibly difficult to go straight to the element you want, etc.). I've been using BeautifulSoup with some success so far, but every once in a while (though quite rarely), I run into a page where BeautifulSoup creates the HTML tree a bit differently from (for example) Firefox or WebKit. While this is understandable, as the formatting of the HTML leaves this ambiguous, if I were able to get the same parse tree as Firefox or WebKit produces I would be able to parse things much more easily. The problems are usually something like the site opening a <b> tag twice: when BeautifulSoup sees the second <b> tag, it immediately closes the first, while Firefox and WebKit nest the <b> tags. Is there a web scraping library for Python (or even any other language (I'm getting desperate)) that can reproduce the parse tree generated by Firefox or WebKit (or at least get closer than BeautifulSoup in cases of ambiguity)?

    Read the article

  • A question on webpage data scraping using Java

    - by Gemma
    Hi there. I am trying to implement a simple HTML webpage scraper using Java, and I have a small problem. Suppose I have the following HTML fragment: <div id="sr-h-left" class="sr-comp"> <a class="link-gray-underline" id="compare_header" rel="nofollow" href="javascript:i18nCompareProd('/serv/main/buyer/ProductCompare.jsp?nxtg=41980a1c051f-0942A6ADCF43B802'); " Compare Showing 1 - 30 of 1,439 matches, The data I am interested in is the integer 1,439 shown at the bottom. How can I get that integer out of the HTML? I am considering using a regular expression with java.util.regex.Pattern to extract the data, but I am still not clear about the process. I would be grateful for any hints or ideas on this data scraping. Thanks a lot.
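    The regular-expression idea above can be sketched as follows. The example is in Python rather than Java, but the pattern itself should carry over to java.util.regex unchanged. It assumes the total always appears in the form "of 1,439 matches", as in the fragment quoted in the question; the function name is hypothetical.

```python
import re

# Capture the number (digits and thousands separators) between "of" and "matches".
MATCHES_RE = re.compile(r"of\s+([\d,]+)\s+matches")


def total_matches(html):
    """Return the total result count as an int, or None if the text is absent."""
    found = MATCHES_RE.search(html)
    if found is None:
        return None
    # Drop the thousands separators before converting.
    return int(found.group(1).replace(",", ""))


if __name__ == "__main__":
    print(total_matches("Compare Showing 1 - 30 of 1,439 matches,"))
```

    Anchoring the pattern on the surrounding words "of" and "matches" avoids picking up the "1" and "30" earlier in the same sentence.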

    Read the article

  • Scraping paginated items from a website using scrapy

    - by Mridang Agarwalla
    I'm using Scrapy to scrape items from a site, and I'm not able to implement the following scraping pattern. The site I'm trying to scrape is a forum, and I scrape it once a day. Each page has a table containing posts. New posts are added to the top of the table, and as more and more posts are added to the site, the older posts move further back through the pages due to pagination. This is a very simple scenario, and we will assume that the order of the posts never changes. I would like to scrape this site and collect all the "new" records until the last scraped post from yesterday is encountered. I have configured my spider to paginate endlessly, and when it encounters yesterday's last scraped post, it should stop. How can I implement this? (My Scrapy installation works with my Django installation using django-dynamic-scraper.)
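    The stopping rule described above can be sketched independently of Scrapy: walk the pages newest-first and stop as soon as yesterday's last post appears. The Python sketch below is a plain illustration under stated assumptions, not a Scrapy spider. The function name and the post dictionaries with an "id" key are hypothetical, and pages is assumed to be a lazy iterable so that later pages are never requested once the stop condition is hit.

```python
def collect_new_posts(pages, last_seen_id):
    """Collect posts, newest first, until the post scraped last time is reached.

    `pages` is any iterable yielding one list of posts per page; if it is a
    generator, pages after the stopping point are never fetched.
    """
    new_posts = []
    for page in pages:
        for post in page:
            if post["id"] == last_seen_id:
                return new_posts  # everything from here on was scraped before
            new_posts.append(post)
    return new_posts  # marker never found: every post is new


if __name__ == "__main__":
    fake_site = [[{"id": 9}, {"id": 8}], [{"id": 7}, {"id": 6}], [{"id": 5}]]
    print(collect_new_posts(iter(fake_site), last_seen_id=7))
```

    Inside a Scrapy callback, the same check could decide whether to yield the request for the next page at all; raising scrapy.exceptions.CloseSpider is another way to end the crawl once the marker post is seen.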

    Read the article

  • Document conversion and viewing, what are the cutting edge solutions?

    - by DigitalLawyer
    Goal: building a web application where a user can:
      - Upload a document (doc, docx, pdf; additional office formats a plus)
      - View that document in a browser, preferably in HTML
      - Download the document (in doc, pdf; additional open formats a plus)

    Current solution:
      - Ruby on Rails application on Rackspace
      - Users can upload doc and pdf files (AWS)
      - Files can be downloaded in the format in which they were uploaded
      - Thumbnail generation ([doc, pdf] - pdf - png) is done through AbiWord. Certain doc files do not convert well.
      - Documents can be viewed in the embedded Google Docs viewer (https://docs.google.com/viewer). Certain doc files cannot be displayed. Little flexibility.

    Potential improvements:
      - Document viewing in pdf through pdf.js
      - Viewing in HTML (+ annotation) through Crocodoc

    I'd be glad to hear other users' experiences, and will add good recommendations to this list.

    Read the article

  • How to Export Crystal Reports to Excel, Word, Rich Text, PDF or HTML

    - by jaullo
    When we work with reports, we always need export functionality. In Crystal Reports for ASP.NET, this task is very simple. However, the biggest question that always comes up is how to do it from code-behind. To access the Crystal libraries and their components, we must first import the namespaces:

        Imports CrystalDecisions.CrystalReports.Engine
        Imports CrystalDecisions.Shared

    CrystalDecisions.CrystalReports.Engine lets us work with our reportDocument, and CrystalDecisions.Shared is what we use for the export. So let's see how we can export our report without having to send it to the printer. Remember that by default Crystal Reports already has an option to export to PDF, but it requires going through the same steps as printing, which is what we will avoid here.

    We place a button on our ASP page:

        <asp:Button ID="btntopdf" runat="server" Text="Exportar a PDF" />

    And in our button we run the following routine:

        Protected Sub btntopdf_Click(ByVal sender As Object, ByVal e As System.EventArgs) Handles btntopdf.Click
            'Load the report, binding it to the data source.
            LoadReport()
            'Later we will see that these lines can be omitted
            Response.Buffer = False
            Response.Clear()  'ClearContent, ClearHeaders
            reporteDoc.ExportToHttpResponse(ExportFormatType.PortableDocFormat, Response, True, "NombreArchivo")
        End Sub

    LoadReport is in charge of filling our report with the data source. This was the first way to export our Crystal Report, but it is not the only one, so let's look at another way of using the ExportToHttpResponse method.

    For this method, the code in our button changes somewhat, but first let's review the parameters used. Our first parameter, FormatType, is a value of type ExportFormatType, which can be any of the values listed below:

      - CrystalReport: the export format is Crystal Report.
      - Excel: the export format is Excel.
      - ExcelRecord: the export format is Excel Record.
      - NoFormat: no export format has been specified.
      - PortableDocFormat: the export format is PDF.

    I am not going to list them all, since I imagine you already get the idea of each format; the ones listed above are the most important.

    Our second parameter, the Response object, lets us attach the file. And finally, our third parameter defines whether the file should be sent as an attachment or not. If we set it to True, we will be sending our file as an attachment, which means we no longer need the following lines of code:

        Response.Buffer = False
        Response.Clear()

    With this done, we can now send the file directly to the client.

    Now let's see how much our code has shrunk. Only two lines of code remain in our button:

        'Load the report, binding it to the data source.
        LoadReport()
        reporteDoc.ExportToHttpResponse(ExportFormatType.PortableDocFormat, Response, True, "NombreArchivo")

    To finish, I will just say that I hope this is helpful to you and, of course, that it makes your life easier when using Crystal Reports.

    Read the article

  • Purpose of Adobe PDF Link Helper

    - by user770750
    I have an idea of what this browser add-on does: Adobe PDF Browser Control (AcroPDF.dll). Apparently, if I disable this one, PDFs embedded in a page with the embed or object tag fail to function properly. So it's pretty clear what its function is. However, I can't find accurate documentation anywhere on what the add-on below does: Adobe PDF Link Helper (AcroIEHelperShim.dll). IE9 (with Reader X) seems to work flawlessly with it disabled. PDFs still open within the browser. Only if I uncheck Display PDF in Browser in Reader's preferences does that cease. I played around on an XP VM with IE7 and Reader X and noticed no issues with it disabled. Does anyone know the purpose of this add-on? At one time I believed it was necessary for the 'within browser' functionality to work, though that was never verified. Did something change?

    Read the article

  • Can PaperPort be used to convert a non OCR PDF to OCR PDF?

    - by Senseful
    My scanner came with the following software: ScanSoft PageViewer, ScanSoft PaperPort, and ScanDirect. I believe it also comes with a basic version of OmniPage. I'm not sure which of these programs is the one that actually performs the OCR. When I scan a document, it can perform OCR on it and convert it to a searchable PDF. Is there any way I can take an existing image or PDF file and run the same OCR engine on it in order to create a searchable PDF?

    Read the article

  • Problems in "Save as PDF" plugin with Arabic numbers

    - by Mohamed Mohsen
    I use the "Save as PDF" plugin with Microsoft Word 2007 to generate a PDF document from a DOCX document. It works great, except that the Arabic numbers in the Word file have been converted to English numbers in the PDF document. Here are two links to screenshots explaining the problem: http://img27.imageshack.us/img27/2893/englishpdf.jpg http://img4.imageshack.us/img4/1857/arabicword.jpg The first image is the generated PDF file with the English numbers highlighted. The second image is the original Word file with the Arabic numbers highlighted. Thanks in advance.

    Read the article

  • Printer "ripping" forever (network printer)

    - by Julien Gorenflot
    Since I installed Ubuntu 11.10, printing is a disaster. I did not have the problem with Lucid Lynx (Ubuntu 10.04), but maybe it just comes from the fact that someone else had installed it for me, and possibly it configured better. When I print a pdf, even 2 pages, my printer (SHARP MX 2300N) stays in rippen for hours. "Rippen" is a German word, not really sure how to translate. Google translate says, The English equivalent is "Rib". And eventually, sometimes, the pages finally get printed. But in between my whole floor is very angry because they also need the printer. Additionally, I don't always have the whole day for waiting for my pages. I remember that when printing I used to be asked if I wanted to reduce transparency effects, which does not seem to happen anymore after I installed Ubuntu 11.10. Is there any connection? Not sure, because I don't think it was for pdf files.

    Read the article

  • Pros and cons of creating a print friendly page to remove the use of pdfs?

    - by Phil
    The company I work for has a one-page invoice that uses the TCPDF library. They wanted some design changes that I found incredibly difficult to set up in PDF format. Using HTML/CSS I could easily create the page and have it print very nicely, but I have a feeling that I am overlooking something. What are the pros and cons of setting up a page just for printing? What are the pros and cons of putting out a PDF? I could also use inline CSS so that if they wanted to download the page and open it they could.

    Read the article

  • Is it bad practice to create a print-friendly page to remove the use of PDFs?

    - by Phil
    The company I work for has a one-page invoice that uses the TCPDF library. They wanted some design changes that I found incredibly difficult to set up in PDF format. Using HTML/CSS I could easily create the page and have it print very nicely, but I have a feeling that I am overlooking something. Is it good practice to set up a page just for printing? And if not, is it at least better than putting out an ugly PDF? I could also use inline CSS so that if they wanted to download the page and open it they could.

    Read the article
