Save a binary file in SQL Server as BLOB and text (or get the text from Full-Text index)

Posted by Glennular on Stack Overflow
Published on 2010-03-26T19:24:31Z Indexed on 2010/03/26 20:23 UTC

Currently we save files (PDF, DOC) in the database as BLOB fields. I would like to retrieve the raw text of each file so that I can manipulate it for hit-highlighting and other functions.

Does anyone know of a simple way to parse the files and store the raw text when they are saved, either via SQL or .NET code? I have found that Adobe has a filtdump utility that will convert a PDF to text. Filtdump seems to be a command-line tool, and I don't see a way to use it with a file stream. And what would the extractor be for Office documents and other file types?

-or-

Is there a way to pull the raw text out of the full-text index?

Note: I am trying to build a .NET and MS SQL solution without having to use a third-party tool such as Lucene.

© Stack Overflow or respective owner
