Arithmetic Coding Questions
Posted
by
Xophmeister
on Programmers
See other posts from Programmers
or by Xophmeister
Published on 2012-12-17T12:31:43Z
Indexed on
2012/12/17
17:13 UTC
Read the original article
Hit count: 517
I have been reading up on arithmetic coding and, while I understand how it works, all the guides and instructions I've read start with something like:
Set up your intervals based upon the frequency of symbols in your data; i.e., more likely symbols get proportionally larger intervals.
My main query is, once I have encoded my data, presumably I also need to include this statistical model with the encoding, otherwise the compressed data can't be decoded. Is that correct? I don't see this mentioned anywhere -- the most I've seen is that you need to include the number of iterations (i.e., encoded symbols) -- but unless I'm missing something, this also seems necessary to me.
If this is true, that will obviously add an overhead to the final output. At what point does this outweigh the benefits of compression (e.g., say if I'm trying to compress just a few thousand bits)? Will the choice of symbol size also make a significant difference (e.g., if I'm looking at 2-bit words, rather than full octets/whatever)?
© Programmers or respective owner