Finding partial substrings within a string
- by Peter Chang
I have two strings which must be compared for similarity. The algorithm must be designed to find the maximal similarity. In this instance, the ordering matters, but intervening (or missing) characters do not. Edit distance cannot be used in this case for various reasons.
The situation is basically as follows:
string 1: ABCDEFG
string 2: AFENBCDGRDLFG
the resulting algorithm would find the substrings A, BCD, FG
I currently have a recursive solution, but because this must be run on massive amounts of data, any improvements would be greatly appreciated