ASP.NET library to extract plain text from docx, pptx, xlsx (for search index)
Posted by Myster on Stack Overflow
Published on 2010-05-06T03:37:59Z
Indexed on 2010/05/17 4:00 UTC
Is there a pre-existing library to extract plain text from docx, pptx, and xlsx files?
I need this to populate a Lucene.NET index.
I've found this example, which extracts text from docx and seems to work OK. Before building my own solution on top of it, though, I'd like to know whether something is already available that also handles the other file formats.
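For context, all three formats are Open XML packages: ZIP archives whose parts are XML files, with the visible text held in `t` elements (`w:t` in Word, `a:t` in PowerPoint, `t` in the Excel shared-string table). The sketch below illustrates only that idea; it is not the linked example and not a .NET library. It is written in Python for brevity, the function name `extract_text` is hypothetical, and the list of parts it reads is an assumption that covers main body text but skips headers, footnotes, comments, speaker notes, and inline or numeric spreadsheet cells.

```python
import re
import zipfile
import xml.etree.ElementTree as ET

# Package parts assumed to hold the main user-visible text in each format.
TEXT_PARTS = re.compile(
    r"^(word/document\.xml"          # docx body
    r"|ppt/slides/slide\d+\.xml"     # pptx slides
    r"|xl/sharedStrings\.xml)$"      # xlsx shared string table
)


def extract_text(path_or_file):
    """Return the plain text of a .docx, .pptx, or .xlsx package as one string."""
    chunks = []
    with zipfile.ZipFile(path_or_file) as package:
        # Sorted by name, so slide10 comes before slide2; acceptable for indexing.
        for name in sorted(package.namelist()):
            if not TEXT_PARTS.match(name):
                continue
            root = ET.fromstring(package.read(name))
            for element in root.iter():
                # Tags look like "{namespace}t"; compare the local name only.
                if element.tag.rsplit("}", 1)[-1] == "t" and element.text:
                    chunks.append(element.text)
    return " ".join(chunks)
```

Calling `extract_text("report.docx")` (a hypothetical file name) returns a single space-joined string that could be passed to an indexer. In .NET, the same package-plus-XML approach should be possible with `System.IO.Packaging` and an XML reader.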
© Stack Overflow or respective owner