On counting pairs of words that differ by one letter
- by Quintofron
Let us consider n words, each of length k. Those words consist of letters over an alphabet (whose cardinality is n) with defined order. The task is to derive an O(nk) algorithm to count the number of pairs of words that differ by one position (no matter which one exactly, as long as it's only a single position).
For instance, in the following set of words (n = 5, k = 4):
abcd, abdd, adcb, adcd, aecd
there are 5 such pairs: (abcd, abdd), (abcd, adcd), (abcd, aecd), (adcb, adcd), (adcd, aecd).
So far I've managed to find an algorithm that solves a slightly easier problem: counting the number of pairs of words that differ by one GIVEN position (i-th). In order to do this I swap the letter at the ith position with the last letter within each word, perform a Radix sort (ignoring the last position in each word - formerly the ith position), linearly detect words whose letters at the first 1 to k-1 positions are the same, eventually count the number of occurrences of each letter at the last (originally ith) position within each set of duplicates and calculate the desired pairs (the last part is simple).
However, the algorithm above doesn't seem to be applicable to the main problem (under the O(nk) constraint) - at least not without some modifications. Any idea how to solve this?