Filtering spam and dirty words in comment posts in Python (Django)
- by sintaloo
Hi All,
My basic question is how to filter spam and dirty words in a comment system built with Python (Django).
I have my own collection of approximately 3000 phrases to be filtered.
Question (1): Is there an existing open-source Python (or Django) package/module/plugin that can handle this job? I know there is one called Akismet, but from what I understand, it will not solve my problem. Akismet is a web service that filters against its own word dictionary, whereas I have my own collection of words. Please correct me if I am wrong.
Question (2): If there is no such open-source package I can use, how do I create my own? The only thing I can think of is to build a single regular expression that joins all the phrases with 'or' (|). But with 3000 phrases, I suspect it will not perform well when every comment post has to be filtered. Any suggestions on where I should start?
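To make the regular-expression idea concrete, here is a minimal sketch of what I mean. The names PHRASES, build_pattern, find_blocked and is_clean are hypothetical, the three sample phrases stand in for the real list of about 3000, and it assumes every phrase starts and ends with a letter or digit so that \b word boundaries apply. The pattern is compiled once rather than per comment. Whether this is fast enough at 3000 phrases is exactly what I am unsure about.

```python
import re

# Hypothetical sample data; the real list would hold the ~3000 phrases.
PHRASES = ["buy cheap pills", "darn", "free money"]


def build_pattern(phrases):
    """Join all phrases into one case-insensitive alternation."""
    # Longest phrases first, so a longer phrase is tried before a shorter
    # phrase that happens to be its prefix.
    ordered = sorted(phrases, key=len, reverse=True)
    escaped = [re.escape(p) for p in ordered]
    # \b keeps "darn" from matching inside "darning" (assumes phrases
    # begin and end with word characters).
    return re.compile(r"\b(?:" + "|".join(escaped) + r")\b", re.IGNORECASE)


# Compile once at import time, not once per comment.
PATTERN = build_pattern(PHRASES)


def find_blocked(comment):
    """Return the blocked phrases found in the comment, lowercased."""
    return [m.group(0).lower() for m in PATTERN.finditer(comment)]


def is_clean(comment):
    """True if the comment contains none of the blocked phrases."""
    return PATTERN.search(comment) is None
```

In a Django project, is_clean could presumably be called from the comment form's validation, but that wiring is not shown here.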
Thank you very much for your help and time.