Algorithm to detect repeating/similar strings in a corpus of data (say, email subjects) in Python
- by RizwanK
I'm downloading a long list of my email subject lines, with the intent of finding email lists that I joined years ago so that I can purge them from my Gmail account (which is getting pretty slow).
I'm specifically thinking of newsletters that often come from the same address, and repeat the product/service/group's name in the…
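As a rough starting point for the kind of grouping described above, here is a minimal sketch that greedily clusters subject lines by string similarity using the standard library's `difflib.SequenceMatcher`. The function names, the sample subjects, and the 0.6 threshold are all illustrative assumptions, not something taken from the question; the cutoff in particular would need tuning on real data.

```python
from difflib import SequenceMatcher


def normalize(subject):
    """Lowercase and collapse whitespace so trivial differences don't matter."""
    return " ".join(subject.lower().split())


def cluster_subjects(subjects, threshold=0.6):
    """Greedily group subjects that resemble the first member of a cluster.

    `threshold` is an arbitrary similarity cutoff (assumption); tune it.
    """
    clusters = []  # each cluster is a list of original subject strings
    for subject in subjects:
        norm = normalize(subject)
        for cluster in clusters:
            # Compare against the cluster's first member only (keeps it simple).
            ratio = SequenceMatcher(None, normalize(cluster[0]), norm).ratio()
            if ratio >= threshold:
                cluster.append(subject)
                break
        else:
            # No existing cluster was similar enough: start a new one.
            clusters.append([subject])
    # Largest clusters first: these are the likeliest newsletters.
    return sorted(clusters, key=len, reverse=True)


if __name__ == "__main__":
    # Made-up sample data for illustration only.
    sample = [
        "Acme Weekly Newsletter: Issue 12",
        "Acme Weekly Newsletter: Issue 13",
        "Lunch on Friday?",
        "Acme Weekly Newsletter: Issue 14",
        "Re: quarterly budget review",
    ]
    for group in cluster_subjects(sample):
        print(len(group), group[0])
```

This compares every subject against every existing cluster, so it is quadratic in the worst case; for a very large mailbox it would probably make sense to bucket subjects first (for example by sender address, if that is available) and cluster within each bucket.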