OCR combined with font recognition?

Posted by Adam on Stack Overflow
Published on 2011-01-05T06:10:30Z Indexed on 2011/01/08 10:53 UTC

Filed under: ocr

I have a bold idea where a user could take an image like the following

[image: example source image]

and, after a few seconds of processing, edit a document that looks roughly the same.

The software would use WhatTheFont (or something similar) to recognize the fonts used, and OCR plus other software to determine the font size, color, line spacing, and of course the text content itself. In the case of the example image, three separate text boxes would be produced, each starting at the upper-left corner of its text and extending as far toward the bottom right as it can before running into another text box. So the user would then see something like this:

[image: the example image with a rectangle drawn around each text box]

(The rectangles are just used to show the boundaries of each textbox.)

From here, the user would be able to edit the text in each of these boxes to create a new document.
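
As a rough sketch of the box-growing step described above (not a tested design): assume the OCR stage has already returned, for each block of text, the tight bounds of the recognized text. The `TextBlock` class, the `grow_boxes` function, and the page size and coordinates below are all made up for illustration, since the example image is not reproduced here. Each box keeps its upper-left corner and is stretched right, then down, until it meets the page edge or another block.

```python
from dataclasses import dataclass


@dataclass
class TextBlock:
    """Tight bounds of one recognized block of text (hypothetical OCR output)."""
    text: str
    x: int  # left edge of the detected text
    y: int  # top edge of the detected text
    w: int  # width of the detected text
    h: int  # height of the detected text


def grow_boxes(blocks, page_w, page_h):
    """Return one (left, top, right, bottom) editable box per block.

    Each box starts at the block's upper-left corner and extends right and
    then down until it reaches the page edge or the start of another block.
    """
    boxes = []
    for b in blocks:
        # Extend right: stop at the nearest block that shares rows with this text.
        right = page_w
        for o in blocks:
            if o is b:
                continue
            rows_overlap = o.y < b.y + b.h and b.y < o.y + o.h
            if rows_overlap and o.x >= b.x + b.w:
                right = min(right, o.x)

        # Extend down: stop at the nearest block below that shares columns
        # with the (already widened) box.
        bottom = page_h
        for o in blocks:
            if o is b:
                continue
            cols_overlap = o.x < right and b.x < o.x + o.w
            if cols_overlap and o.y >= b.y + b.h:
                bottom = min(bottom, o.y)

        boxes.append((b.x, b.y, right, bottom))
    return boxes


# Made-up layout: a title above two columns of body text on a 600x400 page.
blocks = [
    TextBlock("Title", 20, 20, 300, 40),
    TextBlock("Left column", 20, 100, 250, 200),
    TextBlock("Right column", 320, 100, 250, 200),
]
boxes = grow_boxes(blocks, page_w=600, page_h=400)
for block, box in zip(blocks, boxes):
    print(block.text, box)
```

Growing right before growing down is an arbitrary choice; a different order, or growing both directions at once, would give different boxes for some layouts.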

Of course there are tons of obvious uses for such an application, especially on a mobile phone with a built-in camera.

So my questions are the following:

I doubt the answer is yes, but does anything do this already?
If I'm going to try to build this, what should I write it in? Can I use Python?
What would be the best OCR libraries to start with?
Is there a service other than WhatTheFont for font recognition that has better API support?
Anybody want to help me build it? :)

etc. etc.

Update: One thing I wanted to mention (but forgot) is that I would also like the background to be preserved. In other words, if the example above had an image behind the text, I'd like the document to use that image with the text removed. I know this complicates things a lot, because it would require some image editing techniques too (something akin to Photoshop CS5's "content-aware fill"). But if we can solve diminished reality on iPhones, I think we can figure this out!
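
To make the background idea concrete, here is a very crude sketch of filling in the pixels where text used to be. It is nowhere near content-aware fill: it only averages known neighboring pixels inward from the edge of the hole, so it works on flat or gently varying backgrounds and smears anything detailed. The function name `fill_masked`, the grayscale list-of-rows image format, and the mask (assumed to come from the OCR stage) are all assumptions made for illustration. A real attempt would more likely use an existing inpainting routine, for example OpenCV's `cv2.inpaint`.

```python
def fill_masked(pixels, mask, max_passes=1000):
    """Fill masked pixels by averaging already-known 4-neighbors.

    pixels: list of rows of grayscale values (0-255).
    mask:   same shape, True where text was detected and must be removed.
    Returns a new image; the inputs are not modified.
    """
    h, w = len(pixels), len(pixels[0])
    out = [[float(v) for v in row] for row in pixels]
    known = [[not m for m in row] for row in mask]

    for _ in range(max_passes):
        # Collect this pass's updates first so a pass only uses values
        # that were known before the pass started.
        updates = []
        for y in range(h):
            for x in range(w):
                if known[y][x]:
                    continue
                neighbors = [
                    out[ny][nx]
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                    if 0 <= ny < h and 0 <= nx < w and known[ny][nx]
                ]
                if neighbors:
                    updates.append((y, x, sum(neighbors) / len(neighbors)))
        if not updates:
            break  # nothing left to fill (or nothing known to fill from)
        for y, x, value in updates:
            out[y][x] = value
            known[y][x] = True

    return [[round(v) for v in row] for row in out]


# Made-up 5x5 image: flat background of 200 with a 3x3 patch of dark "text".
image = [[200] * 5 for _ in range(5)]
text_mask = [[False] * 5 for _ in range(5)]
for y in range(1, 4):
    for x in range(1, 4):
        image[y][x] = 0
        text_mask[y][x] = True

restored = fill_masked(image, text_mask)
```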
