Are there libraries or techniques for collecting and weighing keywords from a block of text?
- by Soviut
I have a field in my database that can contain large blocks of text. I need to make this searchable but can't use full-text search. Instead, on update, I want my business layer to process the block of text and extract keywords from it, which I can save as searchable metadata. Ideally, these keywords would then be weighted by the number of times they appear in the block of text. Naturally, words like "the", "and", "of", etc. should be discarded, as they just add noise to the search.
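To make the idea concrete, here is a minimal sketch of what rolling my own might look like, using only the standard library. The `extract_keywords` function, the `min_length` parameter, and the tiny `STOP_WORDS` set are hypothetical names made up for illustration; a real stop-word list would be much longer.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch).
STOP_WORDS = frozenset({"the", "and", "of", "a", "an", "in", "to", "is", "it"})


def extract_keywords(text, stop_words=STOP_WORDS, min_length=2):
    """Return a Counter mapping each keyword to how often it occurs in text."""
    # Lowercase the text and pull out runs of letters, keeping inner apostrophes.
    words = re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())
    # Drop stop words and very short tokens, then count what is left.
    return Counter(
        w for w in words if len(w) >= min_length and w not in stop_words
    )


keywords = extract_keywords("The cat and the hat. The cat sat in the hat of the cat.")
print(keywords.most_common(2))  # [('cat', 3), ('hat', 2)]
```

The counts could then be stored alongside each keyword as its weight. This only handles plain ASCII words and does no stemming, so "cat" and "cats" would be counted separately.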
Are there tools or libraries in Python that can do this filtering, or should I roll my own?