Visually and audibly unambiguous subset of the Latin alphabet?
- by elliot42
Imagine you give someone a card with the code "5SBDO0" on it.
In some fonts, the letter "S" is difficult to visually distinguish from the number five (as is the number zero from the letter "O").
Reading the code out loud, it might be difficult to distinguish "B" from "D", necessitating saying "B as in boy," "D as in dog," or using a "phonetic alphabet" instead.
What's the biggest subset of letters and numbers that will, in most cases, both look unambiguous visually and sound unambiguous when read aloud?
Background:
We want to generate a short string that can encode as many values as possible while still being easy to communicate.
Imagine you have a 6-character string, "123456". In base 10 this can encode 10^6 values.
In hex (e.g. "1B23DF") you can encode 16^6 values in the same number of characters, but this can sound ambiguous when read aloud ("B" vs. "D").
Likewise for any string of N characters, you get (size of alphabet)^N values.
The string is limited to a length of about six characters so that it fits easily within human working memory.
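To make the arithmetic above concrete, here is a minimal sketch. The function name `capacity` and the 36-symbol row (all letters plus digits, before removing any ambiguous ones) are just illustrative assumptions, not part of the question itself.

```python
def capacity(alphabet_size: int, length: int) -> int:
    """Number of distinct strings of `length` symbols over an alphabet."""
    return alphabet_size ** length

# Compare a few alphabet sizes at the 6-character limit discussed above.
for name, size in [("decimal", 10), ("hex", 16), ("letters + digits", 36)]:
    print(f"{name:>16}: {capacity(size, 6):,} values")
```

Every symbol dropped for being ambiguous shrinks the base, so the capacity falls off quickly at a fixed length.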
Thus, to find the maximum number of values we can encode, we need to find the largest unambiguous set of letters/numbers. There's no reason we can't consider the letters G-Z and some common punctuation, but I don't want to manually compare every pair ("does G sound like A?", "does G sound like B?", "does G sound like C?") myself. As we know, this would be O(n^2) linguistic work to do =)...
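If the confusable pairs were already known, picking the largest unambiguous subset could be automated. It amounts to finding a maximum independent set in a "confusion graph", which is NP-hard in general but should be tractable for a few dozen symbols with sparse conflicts. The sketch below is only an illustration under that assumption: `largest_unambiguous` is a hypothetical helper, and the pair list contains only the three confusions mentioned in this post, not a real survey of look-alike or sound-alike characters.

```python
def largest_unambiguous(symbols, confusable_pairs):
    """Return a largest subset of `symbols` containing no confusable pair."""
    # Build the confusion graph: an edge joins two symbols that can be mixed up.
    neighbors = {s: set() for s in symbols}
    for a, b in confusable_pairs:
        neighbors[a].add(b)
        neighbors[b].add(a)

    def solve(remaining):
        if not remaining:
            return set()
        # Branch on the symbol with the most conflicts still in play.
        v = max(remaining, key=lambda s: len(neighbors[s] & remaining))
        if not (neighbors[v] & remaining):
            # No conflicts remain at all, so every remaining symbol is safe.
            return set(remaining)
        without_v = solve(remaining - {v})
        with_v = {v} | solve(remaining - {v} - neighbors[v])
        return with_v if len(with_v) >= len(without_v) else without_v

    return solve(frozenset(symbols))

# Placeholder input: only the confusions named in the post (visual or audible).
pairs = [("5", "S"), ("0", "O"), ("B", "D")]
keep = largest_unambiguous("5SBDO0", pairs)
print(sorted(keep))  # one symbol survives from each confusable pair
```

This doesn't remove the O(n^2) judging work; it only handles the selection step once someone has supplied the pairs.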