It sort of goes to show that while Huffman codes have been around for ages, implementations can still improve, especially as the hardware we use changes.
Spent several hours talking about various topics, but one of his favorite areas of exploration when I was around campus was paper folding.
A couple of links with examples:
The certificate is only valid for the following names: *.s3.amazonaws.com, s3.amazonaws.com
Let's hope you happened to be served a valid cert, rather than cert validation being disabled or broken in your Firefox.
For example, a "canonical Huffman code" can save you space when encoding the table, through some clever trickery. In short, you can encode the table by counting the number of bytes that use each bit-count, plus the list of bytes used in the file. You don't need to store the bit patterns at all, since in the canonical algorithm the patterns are regenerated from the bit lengths.
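As a rough sketch of one possible header layout (my own, not any particular format's):

    from collections import Counter

    def table_header(lengths, max_len=15):
        # lengths: dict mapping byte -> code length. Emit one count per
        # possible bit length, then the bytes sorted by (length, value);
        # a canonical code is fully determined by this.
        hist = Counter(lengths.values())
        counts = [hist.get(n, 0) for n in range(1, max_len + 1)]
        symbols = [s for s, _ in sorted(lengths.items(),
                                        key=lambda kv: (kv[1], kv[0]))]
        return counts, symbols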
Right now I'm trying to implement the package-merge algorithm, which will let me create the table without building an inefficient fully-allocated tree, and more importantly will let me limit the lengths of my codes (the maximum code length is n-1, where `n` is the size of your alphabet; working with bytes, 255-bit codes are obnoxious). Unfortunately, all the explanations of the algorithm I've found are very academic and mathematical, so I'm having trouble working it out.
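In case a concrete version helps: here's a minimal sketch of the list-based formulation as I understand it (the naming and structure are mine, not from any particular paper). It produces optimal length-limited code lengths:

    def package_merge(freqs, max_len):
        # freqs: dict mapping symbol -> frequency; max_len: length limit
        # (requires 2 ** max_len >= len(freqs)).
        # Each entry is (weight, symbols it covers).
        items = sorted(((f, (s,)) for s, f in freqs.items()),
                       key=lambda e: e[0])
        level = items
        for _ in range(max_len - 1):
            # "Package" adjacent pairs, then "merge" with the originals.
            packages = [(a[0] + b[0], a[1] + b[1])
                        for a, b in zip(level[::2], level[1::2])]
            level = sorted(items + packages, key=lambda e: e[0])
        # A tree with n leaves has n - 1 internal nodes; taking the
        # 2(n - 1) cheapest entries counts each symbol once per level
        # it appears on, which is exactly its code length.
        lengths = dict.fromkeys(freqs, 0)
        for _, syms in level[:2 * (len(freqs) - 1)]:
            for s in syms:
                lengths[s] += 1
        return lengths

For example, package_merge({'a': 1, 'b': 1, 'c': 2, 'd': 4}, 2) forces every code to 2 bits, while a limit of 3 gives back the ordinary Huffman lengths {'a': 3, 'b': 3, 'c': 2, 'd': 1}.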
Some of you might be interested in the short video Tom Scott made about Huffman coding.
> The advantage of a canonical Huffman tree is that it can be encoded in fewer bits than an arbitrary tree.
To take the example from the Wikipedia article you linked, canonicalizing

B = 0
A = 11
C = 101
D = 100

gives

B = 0
A = 10
C = 110
D = 111
The same two codes as trees (left branch = 0, right = 1):

Original:

      .
     / \
    B   .
       / \
      .   A
     / \
    D   C

Canonical:

      .
     / \
    B   .
       / \
      A   .
         / \
        C   D
Sure, after using trees for understanding, you don't need to actually represent a tree in your data structures / source code / encoding, but that's another matter.
Additionally, after posting the comment to HN from desktop, I happened to notice within the 2-hour edit window that it looked awful on mobile (as does the comment I copied from: probably a bug with the default font stack used by Chrome on Android not using monospace versions of the box-drawing characters), so I edited to replace "─" with an ASCII "-". It looks less perfect on desktop, but less poorly aligned on mobile, so is a reasonable compromise.
The former was confusing at first but mind-blowing when it clicked for me, and the latter looks awesome but the implementation is incomprehensible to me; I guess I'm stuck allocating trees until I can figure it out!
If we have three symbols, A, B, and C, and we assign bit lengths of A=1, B=2, C=2 (meaning A is the most probable), then we count up:
A = 0
B = 10
C = 11
A = 0
B = 10
C = 1100
D = 1101
E = 1110
Much faster, smaller, and generally better than sending over a whole tree.
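If it helps make that concrete, here's a minimal sketch (mine, in Python) of regenerating canonical codes from nothing but the bit lengths; it reproduces the codes in both examples above:

    def canonical_codes(lengths):
        # lengths: dict mapping symbol -> code length. Walk the symbols
        # in (length, symbol) order, incrementing a counter and
        # left-shifting it whenever the code length grows.
        code, prev_len, codes = 0, 0, {}
        for sym, length in sorted(lengths.items(), key=lambda kv: (kv[1], kv[0])):
            code <<= length - prev_len
            codes[sym] = format(code, "0{}b".format(length))
            code += 1
            prev_len = length
        return codes

    # canonical_codes({'A': 1, 'B': 2, 'C': 4, 'D': 4, 'E': 4})
    # -> {'A': '0', 'B': '10', 'C': '1100', 'D': '1101', 'E': '1110'}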
Your example isn't a complete/perfect tree; one of the length-4 symbols could be length 3, giving level sequence 1,1,1,2. Converted to internal node counts (excluding the root) this is 1,1,1 and could be encoded 0,0,0 (commas for illustration only). That's a little boring; another example is:
A = 00
B = 01
C = 10
D = 110
E = 111
I'm not sure if this is well known, I thought I'd discovered it but there's an equivalent coding in Narimani and Khosravifard 2008, "The supertree of the compact codes."
A. You can avoid encoding the number of leaves by ending the string with the impossible sequence 0,11 (that is, 1 internal node branching into at least 3 at the next level).
B. Or, you can use this as a 1-to-1 mapping between binary strings and canonical trees. If you read 2m-1 1s after a row with m internal nodes, you know this row has 2m so the final 0 isn't needed. (e.g. 2,1 would just be 1,0 instead of 10,0). This can't be combined with A since there are no longer impossible sequences, but at most (on a tree with all symbols the same length) it only saves lg(n) bits, so if you want to save bits you're better off with A.
If you're encoding a byte stream, the receiver already knows what the set of symbols is, although not the order. Is it really more efficient to send the ordered list of symbols plus the histogram of code lengths than it is to send a length per symbol, with an implicit list of symbols? Surely not.
But DEFLATE sends the list of 288 code lengths, with all the symbols being implicit.
It also compresses that list with another layer of repetition encoding and Huffman coding.
For example, if you were encoding DEFLATE's default Huffman table, you would first write it as (8, repeat 143, 9, repeat 111, 7, repeat 23, 8, repeat 7), and then compress that with a separate tiny Huffman table.
The tiny Huffman code used on the code lengths is itself encoded as a series of 3-bit code lengths, taking up 5-ish bytes, with the order of symbols predefined in the spec.
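For reference, the two predefined pieces from RFC 1951 look like this (my transcription of the spec into Python):

    # The 288 fixed literal/length code lengths (RFC 1951 section 3.2.6):
    # symbols 0-143 use 8 bits, 144-255 use 9, 256-279 use 7, 280-287 use 8.
    FIXED_LIT_LENGTHS = [8] * 144 + [9] * 112 + [7] * 24 + [8] * 8

    # The predefined order in which the tiny code-length code's own
    # 3-bit lengths are transmitted (RFC 1951 section 3.2.7):
    CODE_LENGTH_ORDER = [16, 17, 18, 0, 8, 7, 9, 6, 10,
                         5, 11, 4, 12, 3, 13, 2, 14, 1, 15]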
...but I believe DEFLATE goes the exact opposite way, for reasons unknown.
As always, ryg has great posts on this stuff: https://fgiesen.wordpress.com/2018/02/19/reading-bits-in-far...
With a normal (ascending) canonical code, e.g. "2 codes of length 3, 2 codes of length 4" becomes 000, 001, 0100, 0101.
The DEFLATE RFC describes it as such:
"Huffman codes in bit-reversed order (i.e., with the first bit of the code in the relative LSB position)."
What's the reasoning for leaving the tree structure needed for decoding out of this argument?
I think it still suffices as an example because it would be easy to infer how this scales up to an entire book. Finer details are left out, but is that the sort of detail that should be present in a very short introductory article?
My intention was to pick an example that produced a small tree with only a few leaf nodes (so that the diagram was easy to follow), but still contained some duplication.
My hope was that somebody new to the concept could then infer the results for larger inputs.
I did not intend to imply that this would be a valid use case for building a Huffman tree in practice.
It captivated my interest immediately - it was such a simple and effective approach, and it demystified how compression algorithms worked.
I really like this overview. It's not meant to be a full discourse, more just an intro for newcomers. And I think it gets the basic idea across very effectively.
Probably the first non-trivial data structure/algorithm I had implemented (previously I was making games, where you need no algorithm more complex than rectangle intersection).
Why should it work differently? Text files are just regular files where the bytes only use a specific subset of the possible values they could have.
If you do have some knowledge about the data you are encoding, you can be a little bit smarter about it: e.g. for the text section of an executable, you might work on individual instructions instead of bytes, or maybe use common, prepared Huffman trees so you don't have to encode the tree itself.
On a side note: IIRC the Intel Management Engine does that using a proprietary Huffman tree, baked into the hardware itself, as an obfuscation technique.
To circle back to your question: PNG simply feeds the pixel data into the zlib deflate() function as-is.
And it actually does often work differently for PNG! PNG has a handful of preprocessing options for the pixels. So in filter mode 2, for example, deflate is encoding the difference between each pixel and the pixel above it. More or less.
I think this is true even when using 16 bits per sample, but I'm not sure.
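For the curious, the "Up" preprocessing step (PNG filter type 2) is just a byte-wise subtraction, which is also why the sample depth shouldn't matter; a rough sketch:

    def png_filter_up(row, prev_row):
        # PNG filter type 2 ("Up"): each byte minus the byte directly
        # above it, modulo 256. Purely byte-wise, so 8-bit and 16-bit
        # samples are handled identically.
        return bytes((cur - up) & 0xFF for cur, up in zip(row, prev_row))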
From what I remember, in gzip, Huffman coding is applied to the offsets of repeated blocks, which are not single bytes.
Some encoders just use a precomputed Huffman table, but you can make an optimized one as discussed here.
Huffman is not the best entropy coding available; for example, Dropbox's Lepton saves an additional 20% by replacing the Huffman stage with something better.
However, since it's a lossless stage this can be done transparently, which is nice.
Anything to save a few bits...
It is generally used as the final stage of some other compression algorithm, and operates on symbols generated by that algorithm. Often, this is some variation on LZ77, and the symbols are something like "bytes 0-255" in addition to various symbols that denote a match in previous data of some length and at some offset.
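As a hypothetical illustration (the token shapes are mine, not any real codec's), the symbol stream handed to the Huffman stage might look like:

    # Literals 0-255 pass through; a repeated run becomes a
    # (length, distance) match pointing back into earlier output.
    # Encoding "abcabcabcd" might produce:
    symbols = [
        ord('a'), ord('b'), ord('c'),  # literals
        ('match', 6, 3),               # copy 6 bytes from 3 back: "abcabc"
        ord('d'),                      # literal
    ]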