I have been reading up on arithmetic coding and, while I understand how it works, all the guides and instructions I've read start with something like:
Set up your intervals based upon the frequency of symbols in your data; i.e., more likely symbols get proportionally larger intervals.
My main query is, once I have encoded my data, presumably I also need to include this statistical model with the encoding, otherwise the compressed data can't be decoded. Is that correct? I don't see this mentioned anywhere -- the most I've seen is that you need to include the number of iterations (i.e., encoded symbols) -- but unless I'm missing something, this also seems necessary to me.
If this is true, that will obviously add an overhead to the final output. At what point does this outweigh the benefits of compression (e.g., say if I'm trying to compress just a few thousand bits)? Will the choice of symbol size also make a significant difference (e.g., if I'm looking at 2-bit words, rather than full octets/whatever)?