PDF search on the iPhone

Posted by pt2ph8 on Stack Overflow
Published on 2010-11-04T13:18:26Z Indexed on 2011/06/28 16:22 UTC

After two days of trying to read annotations from a PDF using Quartz, I managed to do it and posted my code.

Now I'd like to do the same for another frequently asked question: searching PDF documents with Quartz. The situation is the same as before: this question has been asked many times, with almost no practical answers. I haven't implemented this myself yet, so I need some pointers first.

What I tried:

I tried using CGPDFScannerScan, handling the TJ and Tj operators. It returns the right text for some PDFs, whereas for other documents it returns mostly random letters. Maybe it's related to text encoding? Someone pointed out that text blocks (delimited by the BT/ET operators) should be handled instead, but I still haven't managed to do so. Has anyone managed to extract text reliably from arbitrary PDFs?
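To make the encoding suspicion concrete: as far as I understand the spec, the strings passed to Tj and TJ are character codes in the current font's encoding, not necessarily ASCII, and a TJ array mixes such strings with numeric position adjustments (in thousandths of a text-space unit, where a large negative value widens the gap). Below is a minimal, portable C sketch of decoding one TJ array under those assumptions. The names `TJElement`, `CodeMap`, `decode_tj` and the `WORD_GAP_THRESHOLD` value are hypothetical, made up for illustration; a real implementation would build the map from the font's encoding or ToUnicode data and read the elements through the CGPDFScanner callbacks.

```c
#include <assert.h>
#include <stddef.h>

/* One element of a TJ array: either a string of character codes or a
   numeric position adjustment (thousandths of a text-space unit). */
typedef struct {
    int is_string;              /* 1 = string element, 0 = number */
    const unsigned char *codes; /* character codes (not necessarily ASCII) */
    size_t length;              /* number of codes in the string */
    double adjustment;          /* used only when is_string == 0 */
} TJElement;

/* Hypothetical stand-in for a font's code-to-character mapping:
   to_char[i] is the character for code i, or 0 if the code is unmapped. */
typedef struct {
    char to_char[256];
} CodeMap;

/* Assumed heuristic, not from the spec: an adjustment below this value
   is treated as a gap between words. */
#define WORD_GAP_THRESHOLD (-200.0)

/* Decodes a TJ array into out (capacity out_size, always NUL-terminated).
   Returns the number of characters written, excluding the terminator. */
size_t decode_tj(const TJElement *elems, size_t count, const CodeMap *map,
                 char *out, size_t out_size)
{
    assert(elems != NULL && map != NULL && out != NULL && out_size > 0);
    size_t n = 0;
    for (size_t i = 0; i < count; i++) {
        if (elems[i].is_string) {
            for (size_t j = 0; j < elems[i].length && n + 1 < out_size; j++) {
                char c = map->to_char[elems[i].codes[j]];
                out[n++] = c ? c : '?';  /* '?' marks an unmapped code */
            }
        } else if (elems[i].adjustment < WORD_GAP_THRESHOLD &&
                   n + 1 < out_size) {
            out[n++] = ' ';              /* wide gap: assume a word break */
        }
    }
    out[n] = '\0';
    return n;
}
```

If the codes are copied straight into a string without going through a map like this, any font with a custom or subset encoding would come out as seemingly random letters, which would match the symptom described above.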

After that, searching should be easy: store all the text in an NSMutableString and use rangeOfString: (if there's a better way, please let me know).
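The search step itself can be sketched in plain C once the page text is available as a string. This is only an illustration of the idea, not Quartz or Foundation code: `find_all` and `matches_at` are hypothetical helpers, and the comparison is a simple ASCII case-insensitive match. In Objective-C, repeatedly calling `rangeOfString:options:range:` with `NSCaseInsensitiveSearch` should give equivalent results.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Returns 1 if needle matches at the start of text, ignoring ASCII case. */
static int matches_at(const char *text, const char *needle)
{
    while (*needle != '\0') {
        if (*text == '\0' ||
            tolower((unsigned char)*text) != tolower((unsigned char)*needle)) {
            return 0;
        }
        text++;
        needle++;
    }
    return 1;
}

/* Stores the byte offsets of up to max_hits occurrences of needle in hits
   and returns how many were stored. Overlapping matches are reported.
   An empty needle yields no hits. */
size_t find_all(const char *text, const char *needle,
                size_t *hits, size_t max_hits)
{
    assert(text != NULL && needle != NULL && hits != NULL);
    size_t found = 0;
    if (*needle == '\0') {
        return 0;
    }
    for (size_t i = 0; text[i] != '\0' && found < max_hits; i++) {
        if (matches_at(text + i, needle)) {
            hits[found++] = i;
        }
    }
    return found;
}
```

Keeping the offsets rather than just a yes/no answer matters for the next step, since each offset has to be mapped back to the glyphs that produced it in order to highlight the match.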

But then how do I highlight the result? I know there are a few operators for finding the glyph sizes, so I could calculate the resulting rect from those values, but I've been reading the spec for hours... it's a bloated mess and I'm going insane. Does anyone have a practical explanation? Thanks.
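As a rough illustration of the rect calculation, here is a simplified sketch in C. It assumes the per-glyph advance widths are already known in thousandths of a text-space unit (for simple fonts these come from the font dictionary's Widths array, as far as I can tell from the spec) and that the whole run is drawn in one font at one size. It deliberately ignores character spacing, word spacing, horizontal scaling and TJ adjustments, and it uses the font size as an approximate height. `TextRect` and `match_rect` are hypothetical names, and the result is in text space, so it would still need to be transformed by the text matrix and the CTM before drawing.

```c
#include <assert.h>
#include <stddef.h>

/* A rectangle in text space, relative to the start of the text run. */
typedef struct {
    double x, y, width, height;
} TextRect;

/* Computes the rectangle covering glyphs [start, start + length) of a run.
   widths holds count advance widths in thousandths of a text-space unit.
   Simplification: spacing parameters and horizontal scaling are ignored,
   and the height is approximated by font_size. */
TextRect match_rect(const double *widths, size_t count,
                    size_t start, size_t length, double font_size)
{
    assert(widths != NULL && start + length <= count);
    double x = 0.0;
    double w = 0.0;
    for (size_t i = 0; i < start; i++) {
        x += widths[i] / 1000.0 * font_size;   /* advance before the match */
    }
    for (size_t i = start; i < start + length; i++) {
        w += widths[i] / 1000.0 * font_size;   /* advance of the match */
    }
    TextRect r = { x, 0.0, w, font_size };
    return r;
}
```

For example, with widths of 500, 250, 750 and 500 at a font size of 10, the two middle glyphs start 5 units into the run and span 10 units.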

© Stack Overflow or respective owner
