pdf scraping - Page 36 - Developer IT

PDF Where does Acrobat reader save information about last view settings?

- by Alex Kachanov

I usually check "Restore last view settings" in the preferences of Acrobat Reader. But I'm wondering where Acrobat Reader actually saves this information. In the Windows Registry? In the PDF file itself? On my local disk?

Read the article

How can I sign a PDF document quickly and cheaply?

- by user35981

I need to sign a PDF document. However, Acrobat reader does not let me sign documents. I just need to sign the document, not edit it. Do I need to buy the full Acrobat software? Or is there a better, simpler way?

Read the article

How to keep word document, html and pdf documentation aligned

- by dendini

Is there a way to write documentation in a WYSIWYG editor which can then export to HTML, Word and PDF and keep the copies synchronized? This documentation is mostly technical notes and some contextual help for some software products, so it must contain images and some styling. It is not programmer's documentation (API list or function list), for which a program like Javadoc or Doxygen would probably be the best choice. For example, how do companies with hundreds of different software lines and thousands of programmers deal with this?

I have several solutions, but they all seem lacking in some aspect:

- LaTeX/TeX: very good PDF and HTML export, but not very user friendly and no full-blown WYSIWYG editor available.
- LibreOffice/OpenOffice: full-blown WYSIWYG editor, but the HTML export is not so good (the exported HTML needs to be edited manually and then maintained separately).
- MediaWiki or any other wiki: the documentation could be kept in wikitext format, so HTML is generated automatically, and PDF export is quite good with the many available plugins. However, the staff would need some training to use it, and a server has to be set up for it.

Notice that I'm not asking for software A vs. software B. I'm asking for general advice, the documentation procedures of big companies and, yes, some software product names if available.

Read the article

Free 48 Page PDF Guide to Mastering Social Networking with Gwibber [Linux]

- by Asian Angel

Are you using Gwibber on your Linux system but not making full use of its potential? This free 48 page PDF guide will show you how to use and tweak Gwibber for the best performance when it comes to working with your social networks.

Examples of sections included in the guide are:

- Installing and getting started with Gwibber
- Becoming familiar with Gwibber’s UI
- Broadcasting and interacting on your social networks through Gwibber
- Filtering the flow of information
- Customizing the interface
- And more

The step by step instructions combined with helpful, labeled screenshots make this a nice guide to have for anyone wanting to get the most out of Gwibber for their social networking needs.

Note: Gwibber works with Twitter, identi.ca, StatusNet, Facebook, FriendFeed, Digg, Flickr, and Qaiku.

Download the Master Social Networking with Gwibber PDF Guide [via OMG! Ubuntu!]. Note: In this instance this is a direct download of the PDF Guide itself.

Visit the Gwibber Homepage

Read the article

Symfony2 - PdfBundle not working

- by ElPiter

I am using Symfony2 and PdfBundle to generate PDF files dynamically, but the files are not actually generated. Following the documentation instructions, I have set up the bundle.

autoload.php:

    'Ps' => __DIR__.'/../vendor/bundles',
    'PHPPdf' => __DIR__.'/../vendor/PHPPdf/lib',
    'Imagine' => array(__DIR__.'/../vendor/PHPPdf/lib', __DIR__.'/../vendor/PHPPdf/lib/vendor/Imagine/lib'),
    'Zend' => __DIR__.'/../vendor/PHPPdf/lib/vendor/Zend/library',
    'ZendPdf' => __DIR__.'/../vendor/PHPPdf/lib/vendor/ZendPdf/library',

AppKernel.php:

    ...
    new Ps\PdfBundle\PsPdfBundle(),
    ...

I guess the setup is configured correctly, as I am not getting any "library not found" error or anything of that kind. So, after all that, I am doing this in the controller:

    ...
    use Ps\PdfBundle\Annotation\Pdf;
    ...
    /**
     * @Pdf()
     * @Route ("/pdf", name="_pdf")
     * @Template()
     */
    public function generateInvoicePDFAction($name = 'Pedro')
    {
        return $this->render('AcmeStoreBundle:Shop:generateInvoice.pdf.twig', array(
            'name' => $name,
        ));
    }

And I have this twig file:

    <pdf>
        <dynamic-page>
            Hello {{ name }}!
        </dynamic-page>
    </pdf>

Somehow, what I get in my page is just the normal HTML, generated as if it were a normal Response rendering. The Pdf() annotation is supposed to give the "special" behavior of creating the PDF file instead of rendering normal HTML. So, with the above code, when I request the route http://www.mysite.com/*...*/pdf, all I get is the following HTML rendered:

    <pdf>
        <dynamic-page>
            Hello Pedro!
        </dynamic-page>
    </pdf>

(so a blank HTML page with just the words Hello Pedro! on it). Any clue? Am I doing anything wrong? Is it mandatory to have the alternative *.html.twig apart from the *.pdf.twig version? I don't think so... :(

Read the article

SQL Developer: BLOBs and the External Editor

- by thatjeffsmith

We already know how easy it is to view images and plain text with the BLOB editor, yes? But what if I have a bunch of PDFs stored in my column? I want to see that stuff without having to save the file, find it, and then open it. Why can’t I just automatically open it directly from the database? Well, it seems you can. Here’s how.

External Editors

Step 1: Make sure you have the file types and associated editors defined in the preferences. Based on what’s going on in your OS, you’ll have several of these already defined. If not, it’s pretty simple to add them manually. Now, assuming you’ve got some fun data loaded up, let’s try it out.

A PDF

In the preferences, PDF is mapped to Adobe Reader. I just happen to have a PDF loaded into a BLOB, so let’s send it to the external editor. Click on the hyperlinked text to load the PDF straight to Adobe. If it’s a big file, you will see a dialog where we’re downloading the data. Now if I were to edit said document and save it back to the database via the ‘Load’ mechanism, then we’ve come full circle.

Read the article

Scraping *.aspx content using Python

- by tomato

I'm having difficulties scraping a dynamically generated table in ASPX. I'm trying to scrape the gas prices from a site like this one: GasPrices. I can extract all the information in the gas price table (address, time submitted, etc.) except for the actual gas price. Is there a way I could scrape the gas prices, i.e. somehow get a text representation of them? I'm not very familiar with ASP/ASPX, but whatever is being generated is not showing up in the final HTML. I'm using Python to do the scraping, but that's irrelevant unless there's a specific library... Thanks in advance.

Read the article

Scraping with multiple IP, in java.

- by Titi Wangsa bin Damhore

Basically, I have a scraping application. It scrapes around n items per minute. Currently I have only one IP. The site I'm scraping allows me 3 connections per IP. I'm thinking about getting another IP, so I'll be able to get 6 connections. In theory I should then be able to get n items in 40 seconds, more or less. Currently I'm using Java (commons-httpcore) to get the job done. I'm not sure if this is a Java question or an OS question. My machine has IP 1 and IP 2. How do I connect to, say, www.microsoft.com using IP 1 and using IP 2? How can I specify which IP I want to use for a connection?
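The idea of choosing the outgoing IP per connection can be sketched in Python (not the Java library named in the question). This is a minimal illustration, assuming the machine really has the given local address configured; `open_from` is a hypothetical helper name, and only the standard-library `source_address` parameter is relied on.

```python
import http.client


def open_from(local_ip: str, host: str, port: int = 80,
              timeout: float = 10.0) -> http.client.HTTPConnection:
    """Return an HTTP connection whose outgoing socket is bound to local_ip.

    Hypothetical helper: local_ip must be an address configured on this machine.
    """
    # Port 0 lets the operating system pick any free local port on that address.
    return http.client.HTTPConnection(
        host, port, timeout=timeout, source_address=(local_ip, 0)
    )
```

With two addresses on the machine, one pool of connections would be created with the first address and another pool with the second. Whether the remote site then counts them as separate clients depends on routing and on the site itself.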

Read the article

Best way to process a queue in C# (PDF treatment)

- by Bartdude

First of all, let me explain what I would like to do. I have a long-running web app developed in ASP.NET (C#) 2.0. In this app, users can upload standard PDF files (text + pictures). The app runs in production on Windows Server 2003 and has a dedicated database server (SQL Server 2008), also running Windows Server 2003. I am a fairly experienced web developer, but I have never actually programmed anything non-web (or at least nothing serious).

I plan to add a feature to the web app for which I need a JPG snapshot of each page of the PDF. Creating these "thumbnails" isn't a big deal as such; I already do it inside my web app using Ghostscript. I've only done it on 1-page documents so far, though, and the new feature will need to process bigger documents. For this process to be transparent to both the admins and the end users, I would like to implement some kind of queue to defer the processing of the thumbnails. Again, creating the queue is no problem: it will consist of records in a table, with enough information to find the PDF document.

Then I will need to process this queue, and that's where my questions start. Obviously the best solution for processing it isn't an ASP script or the like, so I will have to leave my known environment. No problem, but I have no idea which direction to go. Therefore, a few questions:

- What should I develop? I presumably need something that is on standby on the server, runs when needed, then returns to an idle state until further notice. Should I be looking into a Windows service? Is there another, more appropriate type of project?
- Depending on the first answer, what will the approach be? Should I somehow have SQL Server "tell" the program/service/... to process the queue, or should I have that program/service/... periodically check the state of the queue and treat new items? In both cases, which functionality can I use?
- We're not talking about hundreds of PDFs a day (maybe 50 at most), so I can easily afford to treat the queue 1 item at a time. Can you confirm that I don't have to look much further into threads and the like? (I found a lot of answers talking about threads in queue processing, but it looks like overkill for my needs.)
- Maybe linked to the previous question: what about concurrent calls to the program, whatever it is? Suppose it is currently running and a new record comes into the queue. What should the behaviour be?

I don't need very detailed answers and would already be happy with answers like "You can do the processing with a service, and yes, it's possible to have SQL Server on machine A trigger a service start on machine B" or "You have to develop xxx and then use the scheduler to run it every xxx minutes". I don't mind reading articles, but I can hardly afford to spend too much time learning things only to realize I went the wrong way for this project, so basically I'm trying to narrow down the scope of what I need to investigate. Thanks for reading; I hope I'll find some helping hands here :-)
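The "periodically check the queue and treat one item at a time" option from the question can be sketched as below. This is only an illustration in Python, not the C# service the question is about, and it uses SQLite as a stand-in for SQL Server. The table name `pdf_queue`, its columns, and the function names are all assumptions made for the example.

```python
import sqlite3
import time
from typing import Callable, Optional


def create_queue(conn: sqlite3.Connection) -> None:
    """Create the hypothetical queue table: one row per PDF awaiting thumbnails."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pdf_queue ("
        "id INTEGER PRIMARY KEY, "
        "pdf_path TEXT NOT NULL, "
        "status TEXT NOT NULL DEFAULT 'pending')"
    )


def process_pending(conn: sqlite3.Connection, handler: Callable[[str], None]) -> int:
    """Handle pending rows one at a time, oldest first; return how many were handled."""
    handled = 0
    while True:
        row = conn.execute(
            "SELECT id, pdf_path FROM pdf_queue "
            "WHERE status = 'pending' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return handled
        item_id, pdf_path = row
        try:
            handler(pdf_path)  # e.g. render the thumbnails for this PDF
            status = "done"
        except Exception:
            status = "failed"  # keep going; a failed row is not retried here
        conn.execute("UPDATE pdf_queue SET status = ? WHERE id = ?", (status, item_id))
        conn.commit()
        handled += 1


def poll(conn: sqlite3.Connection, handler: Callable[[str], None],
         interval_seconds: float = 60.0, max_cycles: Optional[int] = None) -> None:
    """Drain the queue, sleep, repeat. max_cycles=None means run until stopped."""
    cycles = 0
    while max_cycles is None or cycles < max_cycles:
        process_pending(conn, handler)
        cycles += 1
        time.sleep(interval_seconds)
```

Because a single worker drains the queue sequentially, a record that arrives while the worker is busy is simply picked up on the next pass of the loop, so no threading is needed at this volume.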

Read the article

find command in Linux to locate pdf files

- by Martin

My goal is to find all PDF files on a remote machine, so I resort to the useful find command. I type find ~ *.pdf or find ~ "*.pdf" and I get nothing. I do the same on my own machine and I get nothing. When I do a regular search from the menu on my machine, I find quite a few PDF files. Would somebody please tell me what I am doing wrong?
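For context, find treats the arguments that come before its first expression as starting points, so a name pattern is normally passed with -name, as in find ~ -name "*.pdf". The same recursive search can be sketched in Python; `find_pdfs` is a hypothetical helper name used only for this illustration.

```python
from pathlib import Path


def find_pdfs(root: Path) -> list[Path]:
    """Recursively list regular files under root whose suffix is .pdf (any case)."""
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() == ".pdf"
    )
```

Calling `find_pdfs(Path.home())` would walk the home directory. The case-insensitive suffix check roughly corresponds to using -iname instead of -name with find.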

Read the article

How to open a second instance of the pdf x-change viewer?

- by rumtscho

Every time I open a new document in the pdf x-change viewer, it gets opened in the same window. I want to place different documents on different places on the screen (independently of each other, not just tiling the window for side-by-side view), but cannot do it. Even going to the start menu and starting the reader again doesn't open a new instance of the application. Can I force the program to open a new instance or a new independent window? If this is impossible, which other free reader does what I need and also lets me make changes to the file? I don't need to edit the text itself, but I want to be able to add comments, underline and highlight text, and add some graphic elements (e.g. a circle or a freehand line).

Read the article

Do I need a license to create pdf files? [closed]

- by Fire-Dragon-DoL

I hope this is the correct place to ask this question. My mother is an accountant with a degree in economics. She works as a freelancer and needs some licenses for her job. The biggest problem is Adobe Acrobat Standard, which costs 400€, quite a lot. I want to understand whether she must buy it to create PDF files, or whether she can use free programs (free even for commercial use) or programs that she has because of her job (the chamber of commerce provides some advantages to accountants). She is currently using PDFCreator, which, as far as I can tell, is free for business usage (and open source too!): http://sourceforge.net/projects/pdfcreator/ Thanks for any suggestions.

Read the article

Adobe Reader issue in Ubuntu 13.10

- by Ridwan Ahmed Khan

I have downloaded Adobe Reader 9.5.5 and installed it using gdebi. Now if I click on any PDF, it does not start. I tried "acroread" in a terminal and it shows me this error:

    /opt/Adobe/Reader9/Reader/intellinux/bin/acroread: error while loading shared libraries: libxml2.so.2: cannot open shared object file: No such file or directory

Then I installed libxml2, but it still shows the same error. My OS is Ubuntu 13.10. Is there any solution to my problem with Adobe Reader, or is there an alternative PDF reader, other than Foxit, the default (Evince) or Okular, with which I can highlight text in my PDFs?

Read the article

blurry images with mogrify & convert

- by user140393

Does anyone know why this image is so blurry? I converted a PDF to PNG and it turned out like that. Before I removed ImageMagick and its entire toolset from the Software Center, most of my image programs were displaying like this image. Now for the most part it's just blurry, though a couple, such as GIMP, still display like that. I am running Xfce, so maybe it has to do with the desktop environment. The main issue is the absurd blurriness. I reinstalled all the additional packages that were available for ImageMagick in the Software Center. To convert, I use convert *.pdf *.png and mogrify -format png *.pdf. On the other hand, if I convert the file to djvu and then convert that to a tif, the images convert without a problem. Moreover, it does not generate an oversized tif file of around 25 MB, compared to 3 MB with djvu, which is super clear with no blurriness.

Read the article

Drupal 7: Documents as a node/block/field

- by WernerCD

I'm working on my first Drupal site. I've made progress in learning the basics, though I still have a lot to learn. Using FileViewer I can load a PDF saved in a field, to view it in content of various types. I haven't found something that does the same for Word Docs, Excel, PDF, etc. Does anyone know of something that works in Drupal 7 to load documents other than PDF inside a browser, like FileViewer does? Or like Scribd does? (Scribd is hosted. I am behind a firewall with limited access for users, so I don't want to use a Scribd-like service.)

Read the article

How do I set a wine program (ex. Foxit Reader for Windows) as the default program?

- by To Do

I regularly annotate pdf files and unfortunately there is no good linux pdf reader that supports decent annotations. Evince has a very rudimentary and buggy annotation feature. So I'm stuck using a Windows viewer through wine. This works pretty well but, when I simply right-click a file (in this case a pdf), properties, open with and selected Foxit Reader, the Unity Launcher icon remained the wine icon instead of the application icon. Has anyone set a wine program as the default program for any file? Any ideas?

Read the article

Is there any PDF parser written in objective-c or c?

- by user549683

I'm writing a PDF reader iPhone application. I know how to show a PDF file in a view using the CGPDF** classes in iOS. What I want to do now is to search for text in the PDF file and highlight the matches. So I need a library which can detect which text is at which position. I also want the library to be able to handle Unicode and Chinese characters. I've searched for a few days but still cannot find anything suitable. I've tried xpdf, but it is written in C++, and I don't know how to use C++ code in an iPhone app. I've also tried http://www.codeproject.com/KB/cpp/ExtractPDFText.aspx but it does not handle Chinese characters. I've tried to code it myself, but the encoding in PDF is really complicated. For example, I don't know what to refer to when I want to decode the text set in the following font:

    8 0 obj
    << /Type /Font /Subtype /Type0 /Encoding /Identity-H /BaseFont /RNXJTV+PMingLiU /DescendantFonts [ 157 0 R ] >>
    endobj
    157 0 obj
    << /Type /Font /Subtype /CIDFontType2 /BaseFont /RNXJTV+PMingLiU /CIDSystemInfo << /Registry (Adobe) /Ordering (CNS1) /Supplement 0 >> /FontDescriptor 158 0 R /W 161 0 R /DW 1000 /CIDToGIDMap 162 0 R >>
    endobj
    158 0 obj
    << /Type /FontDescriptor /Ascent 801 /CapHeight 711 /Descent -199 /Flags 32 /FontBBox [0 -199 999 801] /FontName /RNXJTV+PMingLiU /ItalicAngle 0 /StemV 0 /Leading 199 /MaxWidth 1000 /XHeight 533 /FontFile2 159 0 R >>
    endobj

Read the article

What is the best way to create PDF reports with iText and zip them together?

- by Suresh S

I have to create PDF reports using the iText API, and the reports should be zipped. For example, a report has to be generated for the people staying in a location of a state. Each state has many locations. For each location, the details of each person under that location should be saved as a PDF (one per person), and finally all the PDFs for a location should be zipped. In the same way, all the zip files for all locations should be zipped and placed in the zip file for the state. My question is how best to develop this code in Java. I want a skeleton framework for the above functionality. I thought of using a recursive method. Also, I would like to hear from experienced users of the zip API whether any errors can occur when creating many zip files.
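The nesting described above (person PDFs inside a location zip, location zips inside a state zip) can be sketched with in-memory archives. This is a Python illustration using the standard zipfile module, not the Java/iText code the question asks for. The PDF contents are assumed to be already-generated bytes, and the helper names `zip_bytes` and `build_state_zip` are hypothetical.

```python
import io
import zipfile


def zip_bytes(files: dict[str, bytes]) -> bytes:
    """Pack {archive member name: content} into a zip and return the zip's bytes."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, data in files.items():
            archive.writestr(name, data)
    return buffer.getvalue()


def build_state_zip(locations: dict[str, dict[str, bytes]]) -> bytes:
    """Build the state archive.

    locations maps location name -> {person name -> PDF bytes}.
    Each location becomes '<location>.zip' holding '<person>.pdf' entries.
    """
    location_zips = {
        f"{location}.zip": zip_bytes(
            {f"{person}.pdf": pdf for person, pdf in people.items()}
        )
        for location, people in locations.items()
    }
    return zip_bytes(location_zips)
```

Since the hierarchy has a fixed depth of two levels, two plain loops are enough here and recursion is not required. For very large reports, writing to temporary files instead of memory would be the more cautious choice.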

Read the article

OpenOffice document (odt) to PDF with command line on Linux?

- by Data-Base

Hi, we are building a PHP script that we need at work to create reports as PDFs. The reports will be created using templates from PostgreSQL. So far I have found that it can be done with PHP and odt (OpenOffice) files [http://www.odtphp.com/] (do you have any other suggestions?). Now, how can I convert the results to PDF, so that teachers will get the final reports as PDFs? Any tips? The server has no GUI and I want to make it as simple as possible. We tried going from PHP to PDF directly with FPDF [http://www.fpdf.org/], but it is really a CPU killer! Cheers
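One commonly used route on a server without a GUI is the office suite's own headless converter. The sketch below, in Python rather than PHP, only wraps that command line. It assumes a LibreOffice (or compatible) `soffice` binary on the PATH that supports the `--headless`, `--convert-to` and `--outdir` options; the function names are hypothetical.

```python
import subprocess
from pathlib import Path


def build_convert_command(odt_path: Path, out_dir: Path) -> list[str]:
    """Return the soffice command line that converts one .odt file to PDF."""
    return [
        "soffice", "--headless",
        "--convert-to", "pdf",
        "--outdir", str(out_dir),
        str(odt_path),
    ]


def convert_to_pdf(odt_path: Path, out_dir: Path) -> Path:
    """Run the conversion (assumes soffice is installed) and return the PDF path."""
    subprocess.run(build_convert_command(odt_path, out_dir), check=True)
    # soffice names the output after the input file, with a .pdf extension.
    return out_dir / (odt_path.stem + ".pdf")
```

The same command could be started from a PHP script through its process-execution functions; the Python wrapper is only there to show the argument order.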

Read the article

Adobe publishes two methods to counter the exploitation of malicious PDFs, brought to light last week...

Update of 08/04/10. Adobe publishes two methods to counter the exploitation of malicious PDFs, brought to light last week in a "proof of concept". Following Didier Stevens's "proof of concept" (POC), which showed how to carry out an attack using a malicious PDF (a method which, as far as Adobe is concerned, relied heavily on "social engineering", in other words on manipulating the user by displaying a modified message), Adobe has decided to make changes to its applications (Acrobat and Reader). Until these take effect, the company has just ...

Read the article

The Case of the Extra Page: Rendering Reporting Services as PDF

- by smisner

I had to troubleshoot a problem with a mysterious extra page appearing in a PDF this week. My first thought was that it was likely caused by one of the most common problems that people encounter when developing reports that eventually get rendered as PDF: blank pages inserted into the PDF document. The cause of the blank pages is usually related to sizing. You can learn more at Understanding Pagination in Reporting Services in Books Online.

When designing a report, you have to be really careful with the layout of items in the body. As you move items around, the body will expand to accommodate the space you're using and you might eventually tighten everything back up again, but the body doesn't automatically collapse. One of my favorite things to do in Reporting Services 2005 - which I dubbed the "vacu-pack" method - was to just erase the size property of the Body and let it auto-calculate the new size, squeezing out all the extra space. Alas, that method no longer works beginning with Reporting Services 2008.

Even when you make sure the body size is as small as possible (with no unnecessary extra space along the top, bottom, left, or right side of the body), it's important to calculate the body size plus header plus footer plus the margins and ensure that the calculated height and width do not exceed the report's page height and width. This won't matter if users always render reports online, but they'll get extra pages in a PDF document if the report's height and width are smaller than the calculated space.

Beginning the Investigation

In the situation that I was troubleshooting, I checked the properties:

    Item         Property            Value
    Body         Height              6.25in
    Body         Width               10.5in
    Page Header  Height              1in
    Page Footer  Height              0.25in
    Report       Left Margin         0.1in
    Report       Right Margin        0.1in
    Report       Top Margin          0.05in
    Report       Bottom Margin       0.05in
    Report       Page Size - Height  8.5in
    Report       Page Size - Width   11in

So I calculated the total width using Body Width + Left Margin + Right Margin and came up with a value of 10.7 inches. And then I calculated the total height using Body Height + Page Header Height + Page Footer Height + Top Margin + Bottom Margin and got 7.6 inches. Well, page sizing couldn't be the reason for the extra page in my report because 10.7 inches is smaller than the report's width of 11 inches and 7.6 inches is smaller than the report's height of 8.5 inches. I had to look elsewhere to find the culprit.

Conducting the Third Degree

My next thought was to focus on the rendering size of the items in the report. I've adapted my problem to use the Adventure Works database. At the top of the report are two charts, and then below each chart is a rectangle that contains a table. In the real-life scenario, there were some graphics present as a background for the tables which fit within the rectangles that were about 3 inches high, so the visual space of the rectangles matched the visual space of the charts - also about 3 inches high. But there was also a huge amount of white space at the bottom of the page and, as I mentioned at the beginning of this post, a second page which was blank except for the footer that appeared at the bottom. When I placed a textbox beneath the rectangles to see whether it would appear on the first page, the textbox appeared on the second page. For some reason, the rectangles wanted a buffer zone beneath them. What's going on?

Taking the Suspect into Custody

My next step was to see what was really going on with the rectangle. The graphic appeared to be correctly sized, but the behavior in the report indicated the rectangle was growing. So I added a border to the rectangle to see what it was doing. When I added borders, I could see that the size of each rectangle was growing to accommodate the table it contains. The rectangle on the right is slightly larger than the one on the left because the table on the right contains an extra row. The rectangle is trying to preserve the whitespace that appears in the layout.

Closing the Case

Now that I knew what the problem was, what could I do about it? Because of the graphic in the rectangle (not shown), I couldn't eliminate the use of the rectangles and just show the tables. But fortunately, there is a report property that comes to the rescue: ConsumeContainerWhitespace (accessible only in the Properties window). I set the value of this property to True. Problem solved. Now the rectangles remain fixed at the configured size and don't grow vertically to preserve the whitespace. Case closed.
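The page-fit check from the investigation can be written out as a small calculation. This is a sketch in Python using the values quoted in the article; the function name `page_fit` and its parameter names are made up for the illustration and are not Reporting Services properties.

```python
def page_fit(body_w: float, body_h: float, header_h: float, footer_h: float,
             left: float, right: float, top: float, bottom: float,
             page_w: float, page_h: float) -> tuple[float, float, bool]:
    """Return (total width, total height, fits) for a report layout in inches."""
    total_w = round(body_w + left + right, 2)
    total_h = round(body_h + header_h + footer_h + top + bottom, 2)
    # The layout fits only if neither dimension exceeds the page size.
    return total_w, total_h, total_w <= page_w and total_h <= page_h


# Values from the report being investigated in the article.
total_w, total_h, fits = page_fit(
    body_w=10.5, body_h=6.25, header_h=1.0, footer_h=0.25,
    left=0.1, right=0.1, top=0.05, bottom=0.05,
    page_w=11.0, page_h=8.5,
)
print(total_w, total_h, fits)  # 10.7 7.6 True
```

Both totals are within the 11 by 8.5 inch page, which matches the article's conclusion that page sizing was not the cause of the extra page.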

Read the article

Protecting PDF files and XDO.CFG

- by Greg Kelly

Security related properties can be overridden at runtime through PeopleCode, like all other XMLP properties, using the SetRuntimeProperties() method on the ReportDefn class. This is documented in PeopleBooks. Basically, this method needs to be called right before calling the processReport() method:

    . .
    /* Build parallel arrays of property names and values */
    &asPropName = CreateArrayRept("", 0);
    &asPropValue = CreateArrayRept("", 0);
    &asPropName.Push("pdf-open-password");
    &asPropValue.Push("test");
    /* Apply the overrides, then generate the report */
    &oRptDefn.SetRuntimeProperties(&asPropName, &asPropValue);
    &oRptDefn.ProcessReport(&sTemplateId, %Language_User, &dAsOfDate, &sOutputFormat);

Of course, users should not hardcode the password value in the code. Instead, if the password is stored encrypted in the database or somewhere else, they can use the Decrypt() API.

Read the article

Dynamic PDF output from your .NET project with ReportLab PLUS

Report Markup Language is an XML-style language for creating PDF documents. We've just written a sample ASP.NET project demonstrating how to use ReportLab's RML2PDF to create PDF documents from inside your .NET project. Create great looking custom dynamic PDFs from your website or application with the minimum of fuss. Download the sample project from here: RML with Microsoft .NET...

Read the article

Crystal reports: Dynamically bind reports and export as PDF or any format

In this article you will learn how to use Crystal Reports to bind reports dynamically and export them as PDF or any other format.

Read the article

Gmail Now Searches Inside PDF, Word, and PowerPoint Attachments

- by Jason Fitzpatrick

Gmail has long had a robust system for searching within the subjects and bodies of your emails. Now you can search inside select attachments: PDF, Word, and PowerPoint attachments are all searchable. Prior to this update, Gmail could search inside HTML attachments but lacked more advanced attachment querying abilities. Now when you search your Gmail account you’ll see search results for not only the subject and body contents but also the contents of popular formats like PDF and Word documents. Don’t forget to take advantage of advanced search terms to speed up your query. If you know the information you need is in an attachment but can’t remember which email, include “has:attachment” in your search to only peek inside emails with attachments. [via GadgetBox]

Search Results

Search found 4479 results on 180 pages for 'pdf scraping'.

Page 36/180 | < Previous Page | 32 33 34 35 36 37 38 39 40 41 42 43 | Next Page >
