Are there libraries or techniques for collecting and weighing keywords from a block of text?
Posted by Soviut on Stack Overflow, 2010-05-27.
I have a field in my database that can contain large blocks of text. I need to make this searchable, but I don't have the ability to use full-text searching. Instead, on update, I want my business layer to process the block of text and extract keywords from it, which I can save as searchable metadata.

Ideally, these keywords could then be weighted by the number of times they appear in the block of text. Naturally, words like "the", "and", "of", etc. should be discarded, as they just add noise to the search.
Are there tools or libraries in Python that can do this filtering, or should I roll my own?
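For reference, a minimal standard-library sketch of the roll-your-own approach: lower-case the text, split it into words, drop everything on a stop-word list, and count what remains with collections.Counter. The stop-word set and the max_keywords cutoff below are placeholder choices, not a recommendation; libraries such as NLTK ship much more complete English stop-word lists if you would rather not maintain one yourself.

    import re
    from collections import Counter

    # Illustrative stop-word list only; a real application would use a
    # fuller list, e.g. the English stop words shipped with NLTK.
    STOP_WORDS = {
        "the", "and", "of", "a", "an", "to", "in", "is", "it", "that",
        "for", "on", "as", "with", "was", "are", "be", "this", "by", "or",
    }

    def extract_keywords(text, max_keywords=20):
        """Return (word, count) pairs for the most frequent non-stop-words."""
        # Lower-case and split on anything that is not a letter or apostrophe.
        words = re.findall(r"[a-z']+", text.lower())
        counts = Counter(w for w in words
                         if w not in STOP_WORDS and len(w) > 2)
        return counts.most_common(max_keywords)

    if __name__ == "__main__":
        sample = ("Python is a programming language. Python emphasises "
                  "readability, and readability counts.")
        print(extract_keywords(sample))
        # [('python', 2), ('readability', 2), ('programming', 1), ...]

The counts returned here are the weights: storing each (word, count) pair as metadata alongside the row lets a search rank matches by how often the keyword appears in the original text.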