Why create a Huffman tree per character instead of a Node?

Posted by Omega on Programmers See other posts from Programmers or by Omega
Published on 2012-11-13T18:53:28Z Indexed on 2012/11/13 23:22 UTC
Read the original article Hit count: 282

For a school assignment we're supposed to make a Java implementation of a compressor/decompresser using Huffman's algorithm.

I've been reading a bit about it, specially this C++ tutorial: http://www.cprogramming.com/tutorial/computersciencetheory/huffman.html

In my program, I've been thinking about having Nodes that have the following properties:

  • Total Frequency
  • Character (if a leaf)
  • Right child (if any)
  • Left child (if any)
  • Parent (if any)

So when building the Huffman tree, it is just a matter of linking a node to others, etc.

However, I'm a bit confused with the following quote (emphasis mine):

First, every letter starts off as part of its own tree and the trees are ordered by the frequency of the letters in the original string. Then the two least-frequently used letters are combined into a single tree, and the frequency of that tree is set to be the combined frequency of the two trees that it links together.

My question: why should I create a tree per letter, instead of just a node per letter and then do the linking later?

I have not begun coding, I'm just studying the algorithm first, so I guess I'm missing an important detail. What is it?

© Programmers or respective owner

Related posts about algorithms

Related posts about algorithm-analysis