Problem with creating a Huffman code tree - Printable Version +- QB64 Phoenix Edition (https://qb64phoenix.com/forum) +-- Forum: QB64 Rising (https://qb64phoenix.com/forum/forumdisplay.php?fid=1) +--- Forum: Code and Stuff (https://qb64phoenix.com/forum/forumdisplay.php?fid=3) +---- Forum: Help Me! (https://qb64phoenix.com/forum/forumdisplay.php?fid=10) +---- Thread: Problem with creating a Huffman code tree (/showthread.php?tid=1803) Pages:
1
2
|
Problem with creating a Huffman code tree - SagaraS - 07-01-2023 Hello, I started to write a code to read, count and sort a byte array. Now I want to take the next step, but I don't know how to start exactly by creating a Huffman code tree with QB64 syntax. I don't want a C or C++ solution, I want a QB64 solution for it. Here is my current code: In the 'test.txt' stand an example like 'aaavvrijgtmmspoe' The file input can be anything. So all bytes should be considered from 0 to 255. Code: (Select All) 'Huffman Encoding RE: Problem with creating a Huffman code tree - DSMan195276 - 07-02-2023 I haven't closely read all of your code, but it looks like you have the right idea for creating the table of character frequencies. For creating the actual Huffman tree you can just make an array of nodes, and in place of where you'd normally use pointers you just store an index into the array of nodes. Once you do that it's reasonably easy to use the typical algorithm to make the huffman tree. So it would work something like this (I didn't test this): Code: (Select All)
When creating the huffman tree you'll initialize that `tree()` array with leaf nodes created from your existing `Table()` array (and really, you could probably combine them into one array with a little work). One catch is that you can't sort that array to find the next two nodes to handle, because sorting will change the position of the nodes in the array and make all your `.Left` and `.Right` entries wrong. To fix that you can just use another array of indexes and sort that (which is what you'd typically do in C anyway), like this: Code: (Select All)
To actually build the huffman tree you will sort the `HuffmanQueue()` array, then take the first two entries and build a new `HuffmanNode` that has those two entries as its `.Left` and `.Right`. Then you replace one of the `HuffmanQueue()` entries with your newly created node, remove the other one (Maybe set it to `-1`, or shift everything over and reduce the size of the array), and then sort and remove again until only one node is left in `HuffmanQueue()` (either due to reduced size, or all of them are `-1`). RE: Problem with creating a Huffman code tree - bplus - 07-02-2023 Hi @SagaraS Welcome to the forum! What's a Hoffman Tree and why in the world would anyone want one? Asking such questions is how I got into Pathfinder and an alternate Paint3 code. You never know what rabbit hole leads to buried treasure RE: Problem with creating a Huffman code tree - DSMan195276 - 07-02-2023 (07-02-2023, 12:01 PM)bplus Wrote: What's a Hoffman Tree and why in the world would anyone want one?A Huffman tree is a data structure that maps bit sequences to bytes/characters. Each node in the tree has a left and right node, representing either a 1 or a 0. If you follow a particular left/right path through the tree, you eventually end up at a leaf node representing an actual byte or character (Ex. 'A'). The particular left/right path then tells you the bit pattern mapping to that character. In this case it's being used to do compression. SagaraS is creating the huffman tree via analyzing the frequency of each character in the text he is compressing, and then the huffman tree is built such that higher frequency characters have shorter bit sequences (and lower frequency characters have longer ones). You then replace each character in the original text with the bit sequence it maps to in the huffman tree, which results in less bits than the original text. Someone else can then decompress the bits into the original text via using the huffman tree to read each bit and follow the tree until they find a leaf node. Then they output the character at the leaf node and start over at the next bit. RE: Problem with creating a Huffman code tree - Jack - 07-02-2023 DSMan195276 that's an excellent explanation, brief and eloquent RE: Problem with creating a Huffman code tree - SMcNeill - 07-02-2023 Did you guys know that morse code is a huffman/binary tree? Absolutely blew my mind when I learned that fact a few years back! RE: Problem with creating a Huffman code tree - bplus - 07-02-2023 (07-02-2023, 03:34 PM)Jack Wrote: DSMan195276 Well sure Matt knows but describes well too! I was hoping our new guy might respond and give us more a glimpse of who he is. I mean I could have looked it up on Wiki. but applications of algos are very interesting. BTW is all that fooling around really necessary in this day and age, I know back in DOS days space was precious but these days space is cheap and speed is king! RE: Problem with creating a Huffman code tree - SagaraS - 07-02-2023 (07-02-2023, 02:12 AM)DSMan195276 Wrote: I haven't closely read all of your code, but it looks like you have the right idea for creating the table of character frequencies. For creating the actual Huffman tree you can just make an array of nodes, and in place of where you'd normally use pointers you just store an index into the array of nodes. Once you do that it's reasonably easy to use the typical algorithm to make the huffman tree. ... (07-02-2023, 12:01 PM)bplus Wrote: Hi @SagaraS Welcome to the forum!Sorry for my late reply. How the tree is built and how it works on a piece of paper is simple. But implementing the whole thing is harder for me. Especially these recursive matters. After all, the whole thing should automatically generate the code for the respective ASCII characters at the end. Had found many examples on the net, but didn't really know where to start. @'DSMan1s276' Thanks for your post with the sample code. Now I have a plan how I can implement it for my purposes. ^^ @'bplus' On the one hand, I need this for my own interest. Would like to learn that. On the other hand, I would like to use it to reduce the size of small tables or small images. I have already managed to get a working RLE compression for images. Based on Targa images. But if the entropy is too big, it doesn't do me much good. Therefore an additional Huffman Compression. This means that not everyone can manipulate the data immediately. And the data is smaller in size. That's my goal at the moment. Have another QB64 project where I would like to include this for the resources. Here are some pictures of it: The helpfile on the last image is a B800 text format. 1 byte char + 1 byte for foreground and background color. So 2 bytes per character. The helpfile exist 1x in german an 1x in english. The total size for the helpfile is 28 KB. This * 2 is 56 KB. Then i have 3 images in the ressource file. About 212 KB. mouse cursor + my own 8x8 font. About 2,2 KB. I have brought all resources into an archive which is foot-controlled. I can attach this archive to the EXE file. Thus, the EXE file with the archive is standalone. And I also like to apologize for my bad English. Actually, German is my mother tongue. ^^ RE: Problem with creating a Huffman code tree - mnrvovrfc - 07-02-2023 Welcome to the forums. You seem to have a nice project going. RE: Problem with creating a Huffman code tree - SMcNeill - 07-03-2023 @ SagaraS One thing which you might want to keep in mind: QB64PE has built-in compression routines already. If you can simply use _INFLATE and _DEFLATE for your file size reduction, then you may not have to struggle to reinvent the wheel. (Or in this case, huffman compression trees and routines.) |