Huffman Coding
Introduction
A quick overview of huffman - bet I never read this again. This is a lossless compression algorithm and works by assigning a code to each byte which is smaller than the original byte. He achieves this by counting the occurrence of characters and assigning a code smaller than the source data e.g. a = 0 b = 01. You need to store the translation so you know when you see 01 it is a b and a 0 is a.
Approach
Our example will be "AABCBAD"
- Count the frequency of each unique character
- Identify the two least used at characters
- Add nodes in order of lowest frequency to the parent node
- Assign the bits, by convention left is 0
- Assign codes to letters
- Write out example using the codes
Huffman basically
Count the frequency of each unique character
This gives
A = 3 B = 2 C = 1 D = 1
Identify the two least used at characters.
This is clearly c and d. We make these the bottom nodes of our tree adding a parent node and placing the sum of the occurrences in the node
Add nodes in order of lowest frequency to the parent node
B is the lowest frequency to this gets added to the tree and parent node is added which is the sum of the children 4
Now the remaining A node.
Assign the bits, by convention left is 0
I have nothing to say about this picture right now
Assign codes to letters
We do this be writing down the edges writing the values. Perhaps C is the best example. 7->4 = 0 4->2 = 1 2->C = 0 = 010. So this gives a table of
A = 3 | 1 B = 2 | 00 C = 1 | 010 D = 1 | 011
Which you will notice the more frequent the smaller the number.
Write out example using the codes
Should be like this (no spaces obviously)
A A B C B A D 1 1 00 010 00 1 011
And that it done with the value 1100010001011.
Build a JavaScript Frequency Table
const buildFreqTable = (byteStream) => {
const freq = new Map();
for (const byte of byteStream) {
freq.set(byte, (freq.get(byte) || 0) + 1);
}
return freq;
}
const bytes = new Uint8Array([1, 2, 2, 3, 255, 0, 2, 255]);
const freqTable = buildFreqTable(bytes);
for (const [byte, count] of freqTable.entries()) {
console.log(`Byte ${byte}: ${count}`);
}