Calculating probability that a string has been randomized? - Python
- by RadiantHex
Hi folks,
this is related to a question I asked earlier (question)
I have a list of manually created strings such as:
lucy87
gordan_king
fancy_unicorn77
joplucky_kanga90
base_belong_to_narwhals
and a list of randomized strings:
johnkdf
pancake90kgjd
fancy_jagookfk
manhattanljg
What gives away that the strings in the second list are randomized is the presence of sequences such as 'kjg', 'jgf', 'lkd', ...
Is there any clever way I could separate strings that contain these apparently randomized sequences from the rest?
I guess that this relies heavily on the fact that certain characters are more likely to be placed next to each other (e.g. 'co', 'ka', 'ja', ...).
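To make the idea concrete, here is a rough sketch of what I have in mind: count how often each pair of adjacent letters occurs in a list of "normal" strings, then score a new string by the average log-probability of its letter pairs. The function names (train_bigrams, avg_log_prob) and the add-one smoothing are my own assumptions for illustration, not something taken from Reverend or any other library. The training list here is just the hand-made strings above, used as a stand-in; a real run would presumably need a much larger list of genuine names or words.

```python
import math
import re
from collections import Counter

ALPHABET_SIZE = 26  # only a-z is modelled; digits and '_' act as separators

def letter_runs(s):
    """Return the runs of consecutive letters in s, lowercased."""
    return re.findall(r"[a-z]+", s.lower())

def train_bigrams(corpus):
    """Count adjacent letter pairs, and how often each letter starts a pair."""
    pair_counts, first_counts = Counter(), Counter()
    for text in corpus:
        for run in letter_runs(text):
            for a, b in zip(run, run[1:]):
                pair_counts[a, b] += 1
                first_counts[a] += 1
    return pair_counts, first_counts

def avg_log_prob(s, model):
    """Average log P(next letter | current letter), with add-one smoothing.

    Lower (more negative) values mean the letter pairs look less familiar.
    """
    pair_counts, first_counts = model
    total, n = 0.0, 0
    for run in letter_runs(s):
        for a, b in zip(run, run[1:]):
            p = (pair_counts[a, b] + 1) / (first_counts[a] + ALPHABET_SIZE)
            total += math.log(p)
            n += 1
    return total / n if n else 0.0

# Stand-in training data: the hand-made strings themselves (an assumption;
# scoring the same strings you trained on is optimistic).
manual = ["lucy87", "gordan_king", "fancy_unicorn77",
          "joplucky_kanga90", "base_belong_to_narwhals"]
randomized = ["johnkdf", "pancake90kgjd", "fancy_jagookfk", "manhattanljg"]

model = train_bigrams(manual)
for s in manual + randomized:
    print(f"{s:25s} {avg_log_prob(s, model):.2f}")
```

A threshold on this score would then split the two groups, though an average probably hides a short random tail on a long, otherwise normal string, so scoring the worst few consecutive pairs might work better.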
Any ideas on this one? Kylotan mentioned Reverend, but I am not sure if it can be used for such a purpose.
Help would be much appreciated!