Search Results

Search found 4479 results on 180 pages for 'pdf scraping'.

Issue with web scraping in C#: downloading and parsing zipped text files

- by user64094

I am writing a web scraper to download content from a website. Navigating to the website URL triggers the creation of a temporary URL, which points to a zipped text file. This zipped file is to be downloaded and parsed. I have written a scraper in C# using WebClient and its DownloadFileAsync() method. The zipped file is read from the designated location when the DownloadFileCompleted event fires. My issue: the Windows "Open/Save" dialog is triggered. This requires user input and disrupts the automation. Can you suggest a way to bypass the issue? I am fine with rewriting the code using any alternate libraries. :) Thanks for reading.
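
The parsing half of this task can be illustrated without touching the disk at all. The following is a minimal sketch in Python rather than the poster's C#, assuming the zipped payload has already been fetched as bytes by some HTTP client; the function name `read_zipped_text` and the UTF-8 default are assumptions made for illustration, not part of the original question.

```python
import io
import zipfile


def read_zipped_text(payload: bytes, encoding: str = "utf-8") -> dict:
    """Return {member name: decoded text} for every file in a ZIP held in memory."""
    texts = {}
    # Wrap the raw bytes in a file-like object so zipfile can seek in it.
    with zipfile.ZipFile(io.BytesIO(payload)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue  # skip directory entries, keep only real files
            texts[info.filename] = archive.read(info).decode(encoding)
    return texts


if __name__ == "__main__":
    # Build a small archive in memory to stand in for the downloaded payload.
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("report.txt", "line one\nline two")
    for name, text in read_zipped_text(buffer.getvalue()).items():
        print(name, len(text.splitlines()))
```

Because the archive is read from a byte buffer, no file dialog or temporary file is involved in this step; whether the same holds for the download step depends on the client used.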

How does one decrypt a PDF with an owner password, but no user password?

- by Tony Meyer

Although the PDF specification is available from Adobe, it's not exactly the simplest document to read through. PDF allows documents to be encrypted so that a user password and/or an owner password is required to do various things with the document (display, print, etc.). A common use is to lock a PDF so that end users can read it without entering any password, but a password is required to do anything else. I'm trying to parse PDFs that are locked in this way (to get the same privileges as you would get opening them in any reader). Using an empty string as the user password doesn't work, but it seems (section 3.5.2 of the spec) that there has to be a user password to create the hash for the owner password. What I would like is either an explanation of how to do this, or any code that I can read (ideally Python, C, or C++, but anything readable will do) that does this so that I can understand what I'm meant to be doing. Standalone code, rather than reading through (e.g.) the gsview source, would be best.
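
One reading of the key-derivation algorithm in that section of the PDF Reference (Algorithm 3.2, standard security handler revisions 2 and 3) is that an empty user password is not hashed as an empty string: it is first padded to 32 bytes with a fixed padding string. The sketch below illustrates only that derivation, under stated assumptions. The function and parameter names are hypothetical; in a real file `o_entry`, `p_value` and `file_id` would come from the encryption dictionary's /O and /P entries and the first element of the trailer's /ID array. It does not cover revision 4's metadata flag or the newer AES-based handlers, and it stops short of the RC4 decryption itself.

```python
import hashlib
import struct

# 32-byte padding string defined by the standard security handler.
PAD = bytes.fromhex(
    "28BF4E5E4E758A4164004E56FFFA0108"
    "2E2E00B6D0683E802F0CA9FE6453697A"
)


def pad_password(password: bytes) -> bytes:
    """Truncate or pad a password to exactly 32 bytes."""
    return (password + PAD)[:32]


def compute_key(user_password: bytes, o_entry: bytes, p_value: int,
                file_id: bytes, revision: int = 2, key_len: int = 5) -> bytes:
    """Derive the encryption key for revisions 2 and 3 (key_len is in bytes)."""
    md5 = hashlib.md5()
    md5.update(pad_password(user_password))
    md5.update(o_entry)
    # /P is a 32-bit value, fed in low-order byte first.
    md5.update(struct.pack("<I", p_value & 0xFFFFFFFF))
    md5.update(file_id)
    digest = md5.digest()
    if revision >= 3:
        # Revision 3 re-hashes the first key_len bytes 50 times.
        for _ in range(50):
            digest = hashlib.md5(digest[:key_len]).digest()
    return digest[:key_len]


if __name__ == "__main__":
    # Placeholder inputs only; these are not taken from any real document.
    key = compute_key(b"", bytes(32), -44, bytes(16))
    print(key.hex())
```

With revision 2 the key is always 5 bytes; for revision 3 the caller would pass the length implied by the dictionary's /Length entry divided by 8.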

Can I print an HTMLLoader (pdf) in Adobe Air?

- by Stephano

I'm using AlivePDF to create a PDF file, then save it to the desktop. I can then use an HTMLLoader to display my lovely PDF file. Now, the print button in Adobe Reader works fine. However, there will be young children using the app, so I'd like to have a big "Print" button right above it. I figured I could just start up a print job and feed it my HTMLLoader. Am I doing something wrong here? I can't seem to get any output. Note: the variable "stuff" below is my HTMLLoader. I also have access to the PDF file if that comes in handy.

    private function print():void
    {
        var myPrintJob:PrintJob = new PrintJob();
        var result:Boolean = myPrintJob.start();
        if (result && stuff != null)
        {
            var rect:Rectangle = new Rectangle(0, 0, 2550, 3300);
            var opt:PrintJobOptions = new PrintJobOptions(true);
            myPrintJob.addPage(stuff, rect, opt);
            myPrintJob.send();
        }
        else
        {
            // User does not have a printer or canceled the print action
        }
    }

Generating PDF Files With iTextSharp

- by Ricardo Peres

I recently had the need to generate a PDF file containing a table where some of the cells included images. Of course, I used iTextSharp to do it. Because it has some obscure parts, I decided to publish a simplified version of the code I used.

    using iTextSharp;
    using iTextSharp.text;
    using iTextSharp.text.pdf;
    using iTextSharp.text.html;

    //...

    protected void OnGeneratePdfClick()
    {
        String text = "Multi\nline\ntext";
        String name = "Some Name";
        String number = "12345";
        Int32 rows = 7;
        Int32 cols = 3;
        Single headerHeight = 47f;
        Single footerHeight = 45f;
        Single rowHeight = 107.4f;
        String pdfName = String.Format("Labels - {0}", name);

        PdfPTable table = new PdfPTable(3) { WidthPercentage = 100, HeaderRows = 1 };
        PdfPCell headerCell = new PdfPCell(new Phrase("Header"))
        {
            Colspan = cols,
            FixedHeight = headerHeight,
            HorizontalAlignment = Element.ALIGN_CENTER,
            BorderWidth = 0f
        };
        table.AddCell(headerCell);

        FontFactory.RegisterDirectory(@"C:\WINDOWS\Fonts"); // required for the Verdana font
        Font cellFont = FontFactory.GetFont("Verdana", 6f, Font.NORMAL);

        for (Int32 r = 0; r …

Unwanted font Helvetica in PDF from Jasper

- by Fabio

When I create a PDF from a Jasper Report, the resulting PDF declares that it uses the "Helvetica" font, even if it contains no text. Unfortunately I cannot embed the "Helvetica" font, because it is not among the Windows fonts. Under the PDF/A rules, I need to embed all the fonts in the PDF file. How can I create a PDF from Jasper that doesn't declare that it uses Helvetica? Thank you in advance. Fabio

Displaying a PDF from a local directory or database in an <object> tag

- by nilesh

Hi All, Is there a way to display a PDF from a local directory or database in an object tag? My problem is that I am trying to display the PDF print dialog after the PDF is loaded. This is possible if I load the PDF using an object tag, but currently my PDF is loaded dynamically using response.binarywrite. Any help will be highly appreciated. Regards, Nilesh

Source for Names to use in web scraping

- by PyNEwbie

Can anyone suggest a good source of names that I can use to help analyze some tables on web pages? The first column of the tables I am scraping has names alone, names and titles, or just titles. The names can be as varied as John Smith to Vikram Saksena. I have been poking around for a compiled list of words that can be found in proper names.

search APIs versus screen scraping

- by vbNewbie

As a newbie programmer, I would like to know what the benefits are of using, for example, the Google search API or the newest Buzz API for gathering data content instead of screen scraping, apart from the obvious legal aspects.

Free PDF viewers in ASP.net

- by rowmark

Hello experts, I have many PDF documents stored in binary format in a SQL Server 2008 database. I have a GridView in my ASP.NET page. When a user clicks the ID column of any record, I need to open the PDF in the browser. Are there any free PDF viewer controls out there? How can I convert the binary PDF data and display it as a PDF in the browser? Thanks

Update existing pdf with C#

- by Prashant

I have an existing PDF file which needs to be updated with information that varies for each client. There are around 50 clients, and I have to update the PDF with their details. How can I achieve this in C#? The PDF has to be shown in the browser (IE only). Is there a third-party DLL that could parse the PDF and then write to it?

Embedded .swf file in .pdf - Ubuntu 10.04

- by Thanos

I have just finished a presentation in LaTeX. In the .pdf file I have included a .swf animation (made with Adobe Flash CS5 on Windows) that starts when you click on it. Although I have installed a suitable player (swfdec flash player), neither Document Viewer nor Okular can play it. I opened the .swf in my player to make sure the file is not corrupted, and it played. I tried the same .pdf file on Windows using Adobe Reader and there is no problem there: the embedded file plays. So I thought of installing Adobe Reader on Ubuntu to see whether that solved the problem. Things got a bit better. Adobe Reader could tell that there is something there, so when I clicked it I got a message that I had to get the proper player. When I clicked the relevant button I expected my browser to open on a player's page. Instead nothing happened. If I place the mouse cursor next to the space that defines my animation, there is a "message" stating "Media File(application/x-shockwave flash)". The next step was to install Adobe Flash Player, but I couldn't find the standalone player, only the browser plug-ins... How can I get this .swf file to play in the PDF?

Comb Over

- by Tim Dexter

Being somewhat follically challenged, and to my wife's utter relief, the comb over is not something I have ever considered. The title is a tenuous reference to a formatting feature that Adobe offers in their PDF documents. The comb provides the ability to equally space a string of characters on a pre-defined form layout so that it fits neatly in the area. See the numbers above are being spaced correctly. It's not a function of the font but a property of the form field. For the first time in a long time, I had the chance to build a PDF template today to help out a colleague. I spotted the property and thought, hey, let's give it a whirl and see if Publisher supports it. Lo and behold, Publisher handles the comb spacing in its PDF outputs. Exciting, eh? OK, maybe not that exciting, but I was very pleasantly surprised to see it working. I am reliably informed by Leslie, BIP Evangelist and Tech Writer, that this feature was introduced from version 10.1.3.4.2 onwards. Official docs and no mention of comb overs here. Happy Combing!

Wkhtmltopdf margin (top and bottom)

- by Kwarkjes

I am using wkhtmltopdf 0.10.0 rc2 on Linux 3.2.0-24-generic #38-Ubuntu x86_64 GNU/Linux. I can't create PDFs with margin-top or margin-bottom (no errors). I'm using the commands below:

    wkhtmltopdf -T 50 -B 50 http://google.com ./test.pdf
    wkhtmltopdf --margin-top 50 --margin-bottom 50 page.html ./test.pdf

When I try this:

    wkhtmltopdf -L 50 -R 50 -T 50 -B 50 page.html ./test.pdf

the left and right margins work perfectly (still no margin-top/margin-bottom). It doesn't matter which URL or page I convert.
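
When invocations like these are generated from a script, it can help to build the argument list in one place so every margin is passed the same way and with an explicit unit. The sketch below is only an illustration: the helper name `wkhtmltopdf_args` and the default "mm" unit are assumptions, the long-form flags are the ones used in the question, and nothing here claims to resolve the reported top/bottom margin behaviour.

```python
def wkhtmltopdf_args(source, output, top=None, bottom=None,
                     left=None, right=None, unit="mm"):
    """Build an argument list for wkhtmltopdf with explicit margin units."""
    args = ["wkhtmltopdf"]
    margins = (
        ("--margin-top", top),
        ("--margin-bottom", bottom),
        ("--margin-left", left),
        ("--margin-right", right),
    )
    for flag, value in margins:
        if value is not None:
            # Append the unit so the value is not left to a default.
            args += [flag, f"{value}{unit}"]
    args += [source, output]
    return args


if __name__ == "__main__":
    # The list can be handed to subprocess.run(...) on a machine with wkhtmltopdf.
    print(" ".join(wkhtmltopdf_args("page.html", "./test.pdf", top=50, bottom=50)))
```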

CGContext - PDF margin

- by Manoj Khaylia

Hi All, I am showing PDF content on a view using this code from the Quartz sample:

    // PDF page drawing expects a lower-left coordinate system, so we flip the coordinate system
    // before we start drawing.
    CGContextTranslateCTM(context, 0.0, self.bounds.size.height);
    CGContextScaleCTM(context, 1.0, -1.0);

    // Grab the first PDF page
    CGPDFPageRef page = CGPDFDocumentGetPage(pdf, pageNo);

    // We're about to modify the context CTM to draw the PDF page where we want it,
    // so save the graphics state in case we want to do more drawing
    CGContextSaveGState(context);

    // CGPDFPageGetDrawingTransform provides an easy way to get the transform for a PDF page.
    // It will scale down to fit, including any base rotations necessary to display the PDF page correctly.
    CGAffineTransform pdfTransform = CGPDFPageGetDrawingTransform(page, kCGPDFCropBox, self.bounds, 0, true);

    // And apply the transform.
    CGContextConcatCTM(context, pdfTransform);

    // Finally, we draw the page and restore the graphics state for further manipulations!
    CGContextDrawPDFPage(context, page);
    CGContextRestoreGState(context);

With this, everything works fine. I want to set the margin for the PDF context; by default it shows a 50 px margin on every side. I have tried the CGContext methods but have not found the appropriate one. Can anybody help me with this? Thanks, Monaj
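
The margin in a setup like this usually comes from the aspect-preserving fit itself: a page scaled to fit a rectangle and centered in it leaves empty bands wherever the aspect ratios differ. The sketch below works through that arithmetic in Python. It is not the Quartz API; `fit_page`, its parameters and the `allow_upscale` switch are hypothetical names used for illustration, and the assumption is that the margin would be controlled by fitting into an inset rectangle instead of the full view bounds.

```python
def fit_page(page_w, page_h, view_w, view_h, margin=0.0, allow_upscale=False):
    """Return (scale, tx, ty) that centers a page inside a view with a margin."""
    avail_w = view_w - 2 * margin
    avail_h = view_h - 2 * margin
    if avail_w <= 0 or avail_h <= 0:
        raise ValueError("margin leaves no room for the page")
    # Preserve the aspect ratio: the tighter dimension decides the scale.
    scale = min(avail_w / page_w, avail_h / page_h)
    if not allow_upscale:
        scale = min(scale, 1.0)  # only ever scale down, never enlarge
    # Center the scaled page inside the inset rectangle.
    tx = margin + (avail_w - page_w * scale) / 2
    ty = margin + (avail_h - page_h * scale) / 2
    return scale, tx, ty


if __name__ == "__main__":
    # A US Letter page (612 x 792 points) in a 320 x 480 view with a 10-unit margin.
    print(fit_page(612, 792, 320, 480, margin=10))
```

The returned offsets show how much of the visible gap is the requested margin and how much is left over from the aspect-ratio mismatch.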

How to easily make a PDF version of a website?

- by MartyIX

I'm trying to make an offline version of a website and I'm looking for a tool that would do the task automatically for the whole site (circa 1000 pages of HTML plus images). Is there anything like that, and free? I know it is quite a challenge for a program, but maybe I'll be lucky :). EDIT: It should be a program for Windows. Thanks!

Is it possible to add your own bookmarks/tabs to a PDF file?

- by Pure.Krome

Hi folks, I've purchased a few e-books and love them. Some come with a massive list of bookmarks (kewl!) and some don't. Regardless, is there a way I can create my OWN bookmarks so I can jump to specific pages? I don't want to mess up the current list of official bookmarks that came with the e-books (where they were provided). It's like I want to add my own sticky-note tabs so I can quickly jump between pages without having to remember the page number. Also, this is for Adobe Reader (the free thingy). If it's available in another program (e.g. Foxit), please say so also :) Cheers!

Is there an online service that allows you to upload and edit PDF documents?

- by Naveen Garg

Does acrobat.com allow it? It doesn't seem to advertise it in the features list... Any others?

Batch convert AppleWorks files into PDF within originating folder, delete original file?

- by Manca Weeks

Probably AppleScript is the way to go with this - I have found scripts online that do this, but snag on oversize printable area and put files in the same folder - I need files to stay in the folder the source came from. If the script also deleted the original AppleWorks file, that would be even better, but not required. I have tried the last script from this post: https://discussions.apple.com/message/10127260#10127260#10127260 Any suggestions would be much appreciated.

Why can't I 'justify' text that I have copied from PDF into MS Word?

- by Uday Kanth

I find it really annoying that when I copy text that looks good in Adobe Reader into Word, the sentences which are left-aligned by default won't change accordingly when I press 'Justify'. The only way I could get the result I need is to press back-spaces and Delete key to align the right border. Why is this? Here's an example from the Word document. The text is right- and center-aligning perfectly but Justify does not seem to work.

PDFObject load event

- by Priyank

Hi. We are trying to load a PDF file in a web browser using the PDFObject JavaScript API. Currently the PDFs we are trying to display are close to 10 MB in size. This creates a long delay before the PDF appears on the web page, while the complete PDF is downloaded. We need to remove this lag through one of these alternatives:

1. Show a progress bar until the PDF is actually displayed. We couldn't find an event that is triggered and can be used to find out whether the PDF is now visible. Without it we cannot decide when to stop showing the progress bar/spinner.
2. Lazy load the PDF so that it is displayed as soon as the first page has loaded. That way the user will at least have a visual indication that something is happening. We couldn't find anything in PDFObject that lets us do a lazy load.
3. Use an alternative PDF rendering API. This is a low priority as we already have complete code in place, but if the first two alternatives cannot be met we'd have to consider this option, so please feel free to suggest one.

Any other ideas on how the user interaction can be made more intuitive or pleasant would be welcome. Cheers

What does the JS function 'postMessage()' do when called on an html object tag?

- by Stephano

I was recently searching for a way to call the print function on a PDF I was displaying in Adobe AIR. I solved this problem with a little help from this fellow, and by calling postMessage on my PDF like so:

    // this is the HTML I use to view my PDF
    <object id="PDFObj" data="test.pdf" type="application/pdf"/>
    ...
    // this ActionScript lives in my AIR app
    var pdfObj:Object = htmlLoader.window.document.getElementById("PDFObj");
    pdfObj.postMessage([message]);

I've tried this in JavaScript as well, just to be sure it wasn't Adobe sneaking in and helping me out...

    var obj = document.getElementById("PDFObj");
    obj.postMessage([message]);

Works well in JavaScript and in ActionScript. I looked up what the MDC had to say about postMessage, but all I found was window.postMessage. Now, the code works like a charm, and postMessage magically sends my message to my PDF's embedded JavaScript. However, I'm still not sure how I'm doing this. I found Adobe talking about this method, but not really explaining it:

HTML-PDF communication basics: JavaScript in an HTML page can send a message to JavaScript in PDF content by calling the postMessage() method of the DOM object representing the PDF content.

Any ideas how this is accomplished?

How to restrict the content of a string to less than 4 MB and save that string in a DB using C#

- by Pranay B

I'm working on a project where I need to get the text data from PDF files and dump the whole text into a DB column. With the help of iTextSharp, I got the data and stored it in a string. But now I need to check whether the string exceeds the 4 MB limit, and if it does, keep only the part of the string that is less than 4 MB in size. This is my code:

    internal string ReadPdfFiles()
    {
        // extracted text to return; declared here so the snippet compiles on its own
        string text1 = null;
        // variable to store file path
        string filePath = null;
        // open dialog box to select file
        OpenFileDialog file = new OpenFileDialog();
        // dialog box title
        file.Title = "Select Pdf File";
        // file types offered to the user
        file.Filter = "Pdf file (*.pdf)|*.pdf|All files (*.*)|*.*";
        // set initial directory
        file.InitialDirectory = Environment.GetFolderPath(Environment.SpecialFolder.Desktop);
        // restore the previous directory when the dialog closes
        file.RestoreDirectory = true;
        // run this block when the dialog is confirmed with OK
        if (file.ShowDialog() == DialogResult.OK)
        {
            // store selected file path
            filePath = file.FileName.ToString();
        }
        // use a string array and pass all the PDFs for searching
        //String filePath = @"D:\Pranay\Documentation\Working on SSAS.pdf";
        try
        {
            // creating an instance of PdfReader class
            using (PdfReader reader = new PdfReader(filePath))
            {
                StringBuilder text = new StringBuilder();
                // loop over the pages to read; I started from the 5th page as Piyush told me
                for (int i = 5; i <= reader.NumberOfPages; i++)
                {
                    // read the PDF page
                    text.Append(PdfTextExtractor.GetTextFromPage(reader, i));
                }
                int k = 4096000;
                // test whether the string exceeds the 4 MB limit
                if (text.Length < k)
                {
                    text1 = text.ToString();
                }
            }
        }
        catch (Exception ex)
        {
            MessageBox.Show(ex.Message, "Please Do select a pdf file!!", MessageBoxButtons.OK, MessageBoxIcon.Warning);
        }
        return text1;
    }

Do help me!
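
One point worth separating out in a check like this is that a string's length in characters is not its size in bytes, so a 4 MB limit has to be applied to the encoded form. The sketch below shows one way to do that, in Python rather than the poster's C#. The function name `truncate_utf8`, the choice of UTF-8, and the reading of "4 MB" as 4 * 1024 * 1024 bytes are assumptions for illustration; the actual limit depends on how the database column stores text.

```python
def truncate_utf8(text: str, max_bytes: int) -> str:
    """Return the longest prefix of text whose UTF-8 encoding fits in max_bytes."""
    encoded = text.encode("utf-8")
    if len(encoded) <= max_bytes:
        return text
    # Cutting the byte string may split a multi-byte character; because the
    # input was valid UTF-8, only that trailing fragment is invalid, and
    # errors="ignore" drops it.
    return encoded[:max_bytes].decode("utf-8", errors="ignore")


if __name__ == "__main__":
    limit = 4 * 1024 * 1024  # assumed meaning of "4 MB"
    sample = "résumé " * 1_000_000
    clipped = truncate_utf8(sample, limit)
    print(len(sample.encode("utf-8")), len(clipped.encode("utf-8")))
```

Truncating on the encoded bytes and then decoding keeps the result a valid string that never exceeds the limit, at the cost of possibly dropping one partial character at the cut.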

Windows 8 UX Guidelines in one PDF

- by nmarun

There are quite a few things you need to do differently in order to write a great Windows 8 app. Although MSDN documents all of it on their site, the sheer volume of other related information might overwhelm you. To make it easier, they have a single PDF with all the relevant information. The file will also serve as a ‘quick ref’ document whether you are developing in the C#-XAML, HTML5-JS-CSS or C++-DirectX style. And yes, this has been updated for the RTM version.

Bringing improved PDF support to Google Chrome

Millions of web users rely on PDF files every day to consume a wide variety of text and media content. To enable this, a number of plug-ins exist...
