
History of Lossless Data Compression Algorithms - _nullandnull_
http://ieeeghn.org/wiki/index.php/History_of_Lossless_Data_Compression_Algorithms
======
derf_
Sadly, the description of arithmetic coding bears almost no resemblance to the
actual algorithm (it roughly describes the equal probability case, but that
misses most of the point). The description of Shannon-Fano as "bottom up" and
Huffman as "top-down" is also exactly backwards (the actual descriptions of
those algorithms are accurate, but the labeling is confused).

The article contains a lot of terms you can search for if you are interested
in these things, but sadly is not very informative on its own.
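
For what it's worth, Huffman's construction really is bottom-up: it repeatedly merges
the two least-probable nodes into a parent until a single tree remains, while
Shannon-Fano works top-down by recursively splitting the symbol set into roughly
equal-probability halves. A minimal Python sketch of the bottom-up merge, with
made-up frequencies:

    import heapq

    def huffman_code_lengths(freqs):
        # Bottom-up: repeatedly merge the two least-frequent subtrees.
        # Each heap entry is (weight, tiebreak, {symbol: depth so far}).
        heap = [(w, i, {sym: 0}) for i, (sym, w) in enumerate(freqs.items())]
        heapq.heapify(heap)
        tiebreak = len(heap)
        while len(heap) > 1:
            w1, _, d1 = heapq.heappop(heap)
            w2, _, d2 = heapq.heappop(heap)
            merged = {s: d + 1 for s, d in {**d1, **d2}.items()}
            tiebreak += 1
            heapq.heappush(heap, (w1 + w2, tiebreak, merged))
        return heap[0][2]  # symbol -> code length in bits

    print(huffman_code_lengths({'a': 45, 'b': 13, 'c': 12, 'd': 16, 'e': 9, 'f': 5}))
    # code lengths: a=1, b=3, c=3, d=3, e=4, f=4

Shannon-Fano would instead start from the whole symbol set and recursively partition
it, which is where the top-down label belongs.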

------
jmspring
An interesting read; I will need to go through it with more care when I get
home. Of personal interest: I studied with both David Huffman (though not for
compression) and Glen Langdon (one of the arithmetic coding pioneers), so I
need to see how this article compares with my notes.

I also need to see about getting some of my old (15+ years) course notes
online.

Just in those two, much history and memory has been lost, and it is a real blow
to UC Santa Cruz's computer engineering and science departments (Huffman passed
away a number of years back, Glen this year after a period of retirement).

Edit: there is a serious disconnect between the article's scope and the breadth
its title promises. But it's better than nothing.

~~~
dekhn
I took Huffman's class, 'Cybernetics', at UCSC when I was an undergrad, mainly
because I thought I'd learn how to make robots.

The first day, it didn't take long for the lecture to turn to 7-dimensional
spheres, their packing, and its applications to network routing. He was a
tough bastard -- lots of multiplication of large numbers on tests, with only a
table of logarithms and exponentials. He loved tricky questions on the tests:
for example, he taught us Karnaugh maps, mentioned that the sides wrap around,
then had an example on the final with 1s in all four corners (I hadn't
realized the maps wrap around the corners, too).

Great teacher: wonderful lecturer, hard grader. I failed the class, the only
CS course I took in college (I was a biochem major, and I graduated anyway).

I really wish I had kept the notes from that class.

------
adbge
If anyone is interested in the PAQ code, I have put one of the stronger (but
still open-source) variants on GitHub, here:
[https://github.com/robertseaton/paq8pxd](https://github.com/robertseaton/paq8pxd)

------
webreac
This article is very good, but it says very little about gzip. Even if there
was no technical innovation, I think that gzip was really important for the
development of compression software that is not patent-encumbered.

------
khitchdee
It seems like even though people went after this, there has not been much
innovation in the last few decades. Most of the new stuff looks quite
incremental. It almost seems like we are losing our edge in our ability to
build from the ground up. This is because we are part of a system that keeps
an eagle eye on new ideas. We need to take some of that pressure off so that
we can think a bit outside the box.

~~~
gwern
I wonder how much of this is that relatively simple techniques work very well
on small amounts of data, and that more complex techniques don't perform
sufficiently better to justify baking them into a compression format?

That is, to some extent I subscribe to the compression-as-intelligence school
of thought: compressors are tiny little AIs which try to predict regularities
in the bitstreams they are given (
[http://prize.hutter1.net/](http://prize.hutter1.net/)
[http://mattmahoney.net/dc/dce.html](http://mattmahoney.net/dc/dce.html)
[http://www.danburfoot.net/research.html](http://www.danburfoot.net/research.html)
).

But when we look at the state of the art like ZPAQ, the AI techniques used
don't seem to be much more complex than, say, a one- or two-layer neural
network that might as well be from the 1970s. You don't see anything fancy
like deep networks or other modern staples like random forests.

So this makes me wonder: maybe compression performance has stagnated because
we're not willing to give compression algorithms extremely large amounts of
data or runtime, and so simple algorithms really do perform best with the
minimal resources we're willing to spend on compression. (People are happy to
run neural networks on thousands of GPUs with many gigabytes of data and wait
weeks for training to finish; can you imagine a compression utility which
required that?)
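
To make the compressor-as-predictor framing concrete, here is a tiny sketch
(nothing like ZPAQ's actual models, and the streams below are made up) of the
kind of adaptive bit predictor that context-mixing compressors are built from;
an ideal arithmetic coder spends -log2(p) bits per bit, so better prediction
directly means smaller output:

    import math

    def adaptive_order0_cost(bits):
        # Predict P(next bit = 1) from the counts seen so far (Laplace
        # smoothing), and charge -log2(probability of what actually occurred),
        # which is what an ideal arithmetic coder would spend.
        ones, zeros, cost = 1, 1, 0.0
        for b in bits:
            p1 = ones / (ones + zeros)
            cost += -math.log2(p1 if b else 1 - p1)
            if b:
                ones += 1
            else:
                zeros += 1
        return cost

    skewed = [1] * 900 + [0] * 100               # mostly-predictable stream
    alternating = [i % 2 for i in range(1000)]   # regular, but invisible to order-0

    print(adaptive_order0_cost(skewed))       # ~470 bits, close to the stream's entropy
    print(adaptive_order0_cost(alternating))  # ~1000 bits: this model can't exploit the alternation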

~~~
vtuulos
It naturally follows from the compression-as-intelligence school of thought
that building general-purpose compression/intelligence is hard.

I very much believe in domain-specific intelligence, and correspondingly
domain-specific compression. Here's a practical business use case for lossless
compression (in-memory analytics), which I have been developing:

[http://tuulos.github.io/pydata-2014/](http://tuulos.github.io/pydata-2014/)

In contrast to general-purpose encoders, this approach is extremely data-
intensive, compressing terabytes per chunk.

~~~
gwern
> that building general-purpose compression/intelligence is hard.

Yes, but my point is that we seem to be doing better at general-purpose
intelligence than at compression, despite the apparent equivalence of the two.

------
heyalexej
Site seems to be down. Cached version:
[http://webcache.googleusercontent.com/search?q=cache%3Aieeeg...](http://webcache.googleusercontent.com/search?q=cache%3Aieeeghn.org%2Fwiki%2Findex.php%2FHistory_of_Lossless_Data_Compression_Algorithms&oq=cache%3Aieeeghn.org%2Fwiki%2Findex.php%2FHistory_of_Lossless_Data_Compression_Algorithms)

------
_delirium
One interesting domain-specific class of compression algorithms not mentioned
here is for lossless audio compression, which tends to use a different (though
also pretty simple) technique, somewhat related to PPM. A common approach is
to predict the waveform using linear predictive coding (LPC), and then
entropy-code the residuals. FLAC does that, for example.
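
As a rough illustration (not FLAC's actual code; the waveform below is made up),
here is one of the simple fixed predictors FLAC can fall back on, the order-2
polynomial predictor. Because the prediction errors are far smaller than the raw
samples, they are cheap to entropy-code (FLAC uses Rice codes for that last step):

    import math

    def order2_residuals(samples):
        # FLAC-style fixed predictor of order 2: predict 2*x[n-1] - x[n-2]
        # and keep only the prediction error (the residual). The first two
        # warm-up samples would be stored verbatim.
        return [samples[i] - (2 * samples[i - 1] - samples[i - 2])
                for i in range(2, len(samples))]

    # A made-up smooth "waveform": a quantised 440 Hz sine at 44.1 kHz.
    wave = [round(10000 * math.sin(2 * math.pi * 440 * t / 44100)) for t in range(2000)]
    res = order2_residuals(wave)

    print(max(abs(x) for x in wave))  # ~10000: raw samples need ~15 bits each
    print(max(abs(x) for x in res))   # ~40: residuals fit in far fewer bits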

------
mistercow
> Arithmetic coding is arguably the most optimal entropy coding technique

Provably optimal, even (assuming infinite precision).
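
A quick way to see what that buys you: Huffman has to spend a whole number of
bits per symbol, while an idealized arithmetic coder can spend -log2(p) bits, so
for skewed distributions the gap is large. A back-of-the-envelope sketch with a
made-up two-symbol source:

    import math

    # Made-up source: 'a' occurs with probability 0.99, 'b' with 0.01.
    p = {'a': 0.99, 'b': 0.01}

    # Shannon entropy: the lower bound on average bits per symbol,
    # which an ideal (infinite-precision) arithmetic coder approaches.
    entropy = -sum(q * math.log2(q) for q in p.values())
    print(entropy)   # ~0.081 bits per symbol

    # Huffman must assign each symbol a whole codeword, so with two symbols
    # the best it can do is 1 bit per symbol, over 12x the entropy.
    print(1.0)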

------
elliptic
Are there compression techniques that are only "mostly" lossless? I was
thinking of something along the lines of: for delta, N > 0, the probability
that compressing and then decompressing a random stream of N bytes will result
in loss is less than delta?

------
oakwhiz
I'm surprised that LZ4 is not in the list. It's based on LZ77 like many of the
others.

------
mariuolo
They left out .zoo (LZW) and .lzh/.lha (LZSS).

Both were quite widespread in the BBS era.

~~~
jmalicki
They mentioned LZW, LZSS, and LZH, if you had bothered to read it.

------
hcrisp
How does Snappy compression fit into the picture?

~~~
duskwuff
It's another LZ77 derivative. Nothing unprecedented.

~~~
wolf550e
It's the current leader in performance:
[http://fastcompression.blogspot.com/p/compression-benchmark.html](http://fastcompression.blogspot.com/p/compression-benchmark.html)

~~~
maaku
Speed != efficiency. Some of the tricks Snappy uses, e.g. not compressing at
all, are not that interesting theoretically.

------
leftrightupdown
So, based on this, if someone wanted the best compression program they would
choose PAQ?

~~~
kybernetikos
By the nature of things, anything that compresses some input data must
necessarily lengthen other input data: there are only 2^n distinct outputs of
n bits, so they cannot represent every longer input. In fact, it will almost
certainly lengthen many more of the possible inputs than it shortens.

I once heard someone describe compression programs as 'expansion programs with
interesting failure cases', and so, of course, the best compression program to
use depends on exactly which failure cases you're interested in.
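
You can see the expansion half of that with any real compressor; a quick sketch
using Python's zlib (the data here is made up on the spot):

    import os
    import zlib

    random_data = os.urandom(100_000)              # incompressible by construction
    text_like = b"the quick brown fox " * 5_000    # highly repetitive, also 100000 bytes

    print(len(zlib.compress(random_data)))   # slightly *longer* than 100000
    print(len(zlib.compress(text_like)))     # a tiny fraction of the original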

~~~
skybrian
While true, this doesn't seem to be a practical issue. Any incompressible data
can be encoded using only one bit of overhead, where the bit is a flag
indicating whether the rest of the data is compressed. In practice, there is a
header with a field indicating which compression method to use, and you pay
for the size of the header. Adding support for another compression method is
nearly free as far as space is concerned; one byte can switch between 256 of
them. (Time is another matter.)
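
A minimal sketch of that escape hatch (a hypothetical container format, with
zlib standing in as the single compression method): one leading byte says how
the payload is encoded, and the encoder falls back to storing the input
verbatim whenever compression doesn't help.

    import os
    import zlib

    RAW, DEFLATE = b"\x00", b"\x01"   # hypothetical 1-byte method field

    def pack(data: bytes) -> bytes:
        compressed = zlib.compress(data)
        if len(compressed) < len(data):
            return DEFLATE + compressed
        return RAW + data             # worst case: one byte of overhead

    def unpack(blob: bytes) -> bytes:
        method, payload = blob[:1], blob[1:]
        return zlib.decompress(payload) if method == DEFLATE else payload

    noise = os.urandom(10_000)
    assert unpack(pack(noise)) == noise
    assert len(pack(noise)) == len(noise) + 1   # incompressible input costs only the flag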

------
drydot
ARJ is missing from the article.

------
danieldrehmer
insert weissman score joke here

------
SurfScore
But what about Pied Piper?

~~~
jamra
Do we really need to hear about a fictional TV show every time compression is
brought up?

~~~
joshuajenkinsyo
It's actually a real TV show

~~~
jamra
Yes it is. I'm sorry. Poor English. Thank you for your constructive comment.

------
thisjepisje
...GIF is lossless?

~~~
AlyssaRowan
Absolutely, the _compression_ part of it (LZW) is, yes. (LZW has been out of
patent for about a decade, by the way, but it isn't particularly noteworthy on
its own.)

GIF looks like crap on some images because it's paletted to
[2, 4, 8, 16, 32, 64, 128, 256] colours, so many images are reduced to a
palette (say, with an octree) and/or dithered (perhaps badly, as dithering
tends to increase noise, if not entropy). Sometimes it's also because there
are techniques (one implementation can be found in Photoshop's "Save for Web")
which perform lossy transforms on the data so that LZW compresses it better;
the result is noisier, however, because it intentionally introduces repeating
patterns.
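
For what it's worth, that palette-reduction step is easy to reproduce with
Pillow (just a sketch with an assumed input file, not what any particular
encoder does internally); all the loss happens before GIF's lossless LZW stage
ever runs:

    from PIL import Image  # assumes Pillow is installed

    img = Image.open("photo.png").convert("RGB")   # hypothetical truecolour input

    # Quantise to a 256-colour adaptive palette, with and without dithering.
    dithered = img.convert("P", palette=Image.ADAPTIVE, colors=256,
                           dither=Image.FLOYDSTEINBERG)
    flat = img.convert("P", palette=Image.ADAPTIVE, colors=256,
                       dither=Image.NONE)

    dithered.save("dithered.gif")  # noisier, but gradients look smoother
    flat.save("flat.gif")          # banded gradients, usually compresses smaller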

