
Zstandard – Fast and efficient compression algorithm - jaytaylor
https://github.com/Cyan4973/zstd
======
ot
This looks very interesting. Before everybody starts making the obvious
comparisons, note that this is from the same guy that made LZ4, and he is
someone who clearly knows what he's doing. I've been following his work for
some time.

It looks like this is the evolution of Zhuff, an experimental (closed-source)
compressor [1]. It is basically LZ4 followed by a fast entropy coder,
specifically FSE [2], which is a flavor of arithmetic coding particularly
suited to lookup-table-based implementations.

From a quick look at the source code, it seems that the entropy stage uses 3
probability tables: one for literal bytes, one for match offsets, and one for
match lengths. This is not dissimilar from gzip (which, however, uses Huffman
coding).

EDIT: from a second look it seems that the LZ77 compression stage is basically
LZ4: it uses a simple hash table with no collision resolution, which offers
very high compression speed but a poor match search (see the sketch below).
I'm surprised he didn't implement (yet?) an HC variant as he did for LZ4; it
could even beat gzip's compression ratio with no overhead in decompression
speed or memory requirements.

[1] http://fastcompression.blogspot.fr/p/zhuff.html

[2] https://github.com/Cyan4973/FiniteStateEntropy

~~~
MichaelGG
I wish some of these provided static, preset dictionaries. I'm working on a
packet capture system for SIP, which has a layout similar to HTTP's. All the
common header fields and values are perfect for a preset dictionary, and in
fact, the compression doesn't even need to keep any more state than that
dictionary. That is, packets share more with the preset dictionary than with
each other.

LZ4 allows you to "prime the stream", as it were, but I'm not sure it is really
made for this scenario. As far as I can tell, I'd essentially have to make
separate compression/decompression calls for each packet, resetting the state
to the dictionary between packets (roughly as in the sketch below).
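
For what it's worth, here is roughly what that per-packet reset looks like
with LZ4's streaming API. The function names are from lz4.h, but exact names
and signatures have shifted across versions, so treat this as a sketch:

    #include "lz4.h"

    /* Compress one packet against a fixed preset dictionary. The stream
     * state is rebuilt for every packet, so each packet references only
     * the dictionary, never earlier packets. */
    int compress_packet(const char *dict, int dictSize,
                        const char *packet, int packetSize,
                        char *dst, int dstCapacity)
    {
        LZ4_stream_t stream;
        LZ4_resetStream(&stream);              /* LZ4_initStream in newer releases */
        LZ4_loadDict(&stream, dict, dictSize); /* "prime the stream" with the dict */
        return LZ4_compress_fast_continue(&stream, packet, dst,
                                          packetSize, dstCapacity, 1);
    }

    /* Decompress one packet with the same dictionary. */
    int decompress_packet(const char *dict, int dictSize,
                          const char *src, int srcSize,
                          char *dst, int dstCapacity)
    {
        return LZ4_decompress_safe_usingDict(src, dst, srcSize, dstCapacity,
                                             dict, dictSize);
    }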

~~~
bhouston
What one needs is a preset dictionary: store a hash of it during compression,
then require a preset dictionary with the same hash on decompression. It
should be a straightforward extension, but it is a format change. I think
preset dictionaries would be useful in a lot of contexts.
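
A sketch of what that extension could look like. The 4-byte header field and
the use of XXH32 (from the same author's xxHash library) are my assumptions
here, not part of any existing format:

    #include "xxhash.h"   /* XXH32() */
    #include <stdint.h>
    #include <string.h>

    /* Compressor side: record a checksum of the preset dictionary in a
     * (hypothetical) 4-byte frame header field. */
    void write_dict_id(uint8_t header[4], const void *dict, size_t dictSize)
    {
        uint32_t h = XXH32(dict, dictSize, 0);
        memcpy(header, &h, sizeof h);
    }

    /* Decompressor side: refuse to proceed unless the supplied dictionary
     * hashes to the value stored in the header. Returns 1 on match. */
    int check_dict_id(const uint8_t header[4], const void *dict, size_t dictSize)
    {
        uint32_t expected;
        memcpy(&expected, header, sizeof expected);
        return expected == XXH32(dict, dictSize, 0);
    }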

~~~
MichaelGG
But the preset dictionary functionality I've seen, in say, LZ4, really is just
the saved state of a normal compression run. So once you start compressing,
the dictionary eventually evaporates as more data goes in, and backreferences
can no longer point back that far into the dictionary. That's fine if all your
content compresses well, but not if the dictionary is a far superior state...

------
twotwotwo
Neat.

Google's Gipfeli (https://github.com/google/gipfeli) also aims for compression
ratio and time that fall between gzip -6 and the fastest algorithms. I've only
glanced at the code; I think it includes a Snappy/LZO/LZ4-like repeat matcher
using a hash table, combined with simple-but-fast entropy coding where each
literal input byte becomes 6, 8, or 10 bits.
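
For illustration, one way a prefix-free code with exactly those three lengths
could be laid out (a guess on my part, not necessarily Gipfeli's actual
layout): rank the 256 byte values by frequency, then give the 32 most common a
"0"-prefixed 6-bit code, the next 64 a "10"-prefixed 8-bit code, and the
remaining 160 a "11"-prefixed 10-bit code.

    #include <stdint.h>

    /* rank[b] = frequency rank of byte value b (0 = most common).
     * Returns the code in the low bits; *nbits receives its length. */
    static uint32_t encode_literal(uint8_t b, const uint16_t rank[256], int *nbits)
    {
        uint16_t r = rank[b];
        if (r < 32) {                       /* "0"  + 5 bits -> 6 bits  */
            *nbits = 6;
            return r;
        } else if (r < 96) {                /* "10" + 6 bits -> 8 bits  */
            *nbits = 8;
            return (2u << 6) | (r - 32);
        } else {                            /* "11" + 8 bits -> 10 bits */
            *nbits = 10;
            return (3u << 8) | (r - 96);
        }
    }

Because no short code is a prefix of a longer one, the decoder can consume the
bit stream unambiguously.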

Super cool that Collet's working on still-pretty-fast-but-better compression
out in the open.

~~~
david-given
...I don't know anything about compression, but 'gipfeli' is Swiss German for
'croissant'.

~~~
twotwotwo
Yep.

Two of Google's other custom compression tools are Zopfli (a much slower
zlib-compatible compressor producing slightly smaller files, for things you
compress once and serve many, many times) and Brotli (a high-compression
algorithm used in the WOFF2 font format). The bread algorithms!

------
jaytaylor
Also see corresponding article:
http://fastcompression.blogspot.fr/2015/01/zstd-stronger-compression-algorithm.html

------
pdknsk
I wonder how it compares to Intel's custom zlib implementation, and to a fast
zlib compressor by IBM researchers.

http://www.intel.com/content/dam/www/public/us/en/documents/white-papers/zlib-compression-whitepaper-copy.pdf

http://dx.doi.org/10.1109/DCC.2014.66

~~~
twotwotwo
Zstandard beats the Intel zlib tweak (as does Google's Gipfeli, mentioned in
another comment) -- Intel's zlib -6 could also be described as "faster than
stock zlib with only a little compression loss", but (from the chart in the
Intel paper) it's "only" 1.8x faster, whereas Zstandard is closer to 10x.

The IBM implementation's abstract mentions "a factor of 2.6x or higher", so
it's a similar deal, unless the "or higher" language hides a 10x gain. Also,
the abstract says they did it by combining LZ4's matcher with zlib's entropy
encoder, so they can't be expected to outperform the LZ4 author's new work. :)

Of course, those are still some great backwards-compatible gains.

Could be that on modern processors, Huffman coding just isn't the best option
to make your compressor scream. Gipfeli uses a simple non-Huffman entropy
code, and Collet (author of Zstandard) has been working on a state-machine-
based coding approach for a while. deflate is over 20 years old, so it would
make sense for it to be designed for different realities.
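
To put one worked number on the Huffman limitation: a prefix code has to spend
an integer number of bits on every symbol, while the entropy bound can be
fractional. For a symbol with probability 0.9:

    I(0.9) = -\log_2 0.9 \approx 0.152 \text{ bits (entropy bound)}
    \text{Huffman cost} \geq 1 \text{ bit} \approx 6.6 \times I(0.9)

Arithmetic coding and FSE-style (ANS) coders can approach the 0.152-bit
figure; Huffman cannot, which matters most for highly skewed distributions.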

Speaking of chipmakers and zlib, though, Intel sells "QuickAssist" dedicated
compression hardware, and AMD promises a compression accelerator in their
server ARM SoC. As with AES, performance in software might come to matter less
if systems start doing it in hardware. I don't know the silicon-area
requirements for a compression accelerator, though (whereas AES's design was
hardware-optimized), and I haven't seen any suggestion that compression
accelerators are on the roadmap to be in any consumer chips.

------
ac29
Looks like this would be an excellent candidate for transparent compression in
newer filesystems like btrfs/ZFS. I'll never go back to non-compressed
filesystems for my personal machines -- btrfs with LZO compression has been an
excellent, efficient, fast FS for me (YMMV -- I don't know that I'd use it in
an important production environment).

------
bhouston
How does it compare to lzham? That also has some amazing speed and ratio, and
is zlib compatible.

~~~
MichaelGG
This looks to compress much faster and is suited to some realtime usage. lzham
is OK for decompression (but still slower) and is something like 8x slower on
compression.

------
spcoll
This could be a good replacement for gzip/zlib. Similar compression ratios,
but much faster.

There is another variable that isn't mentioned here: memory efficiency when
compressing/decompressing large files. For me it's an important feature of
gzip.

------
acqq
I dream about 7zip adding this algorithm: the convenience and reliability of
7zip together with speed that would let me never skip compression, because it
would always be almost as fast as plain copying.

------
aristidb
Zstd and LZ4 look amazing. Here's another compressor I really like:
http://mattmahoney.net/dc/zpaq.html

------
jheriko
The interface is good. :)

I cannot praise that element enough. So many libraries have gigantic,
needlessly elaborate interfaces and forget to provide clean interfaces for the
most common use cases. That makes them hard to use... this, however, is easy
to use.
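
For reference, a minimal round-trip sketch using the one-shot entry points.
The names match zstd.h from around the time of this thread; later versions add
a compression-level parameter, so check the header you actually build against:

    #include "zstd.h"
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    int main(void)
    {
        const char src[] = "hello hello hello hello hello";
        size_t bound = ZSTD_compressBound(sizeof src);
        char *comp = malloc(bound);
        char *out  = malloc(sizeof src);

        /* Compress, then decompress, checking each call for errors. */
        size_t csize = ZSTD_compress(comp, bound, src, sizeof src);
        if (ZSTD_isError(csize)) return 1;

        size_t dsize = ZSTD_decompress(out, sizeof src, comp, csize);
        if (ZSTD_isError(dsize) || dsize != sizeof src) return 1;

        printf("%zu -> %zu -> %zu bytes\n", sizeof src, csize, dsize);
        int ok = memcmp(src, out, sizeof src) == 0;
        free(comp);
        free(out);
        return ok ? 0 : 1;
    }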

