How can I index HTML documents?

Posted by Swami on Stack Overflow
Published on 2009-12-17T01:57:46Z Indexed on 2010/03/23 10:03 UTC

I am using Lucene.NET for full-text search. Until now I have only been indexing PDF documents, but I now have a few web pages that I also need to index. What is the best (or easiest) way to index HTML documents and add them to my Lucene index? I am using .NET/C#.

© Stack Overflow or respective owner
