
Zstandard – Fast real-time compression algorithm - pierreneter
https://github.com/facebook/zstd
======
pvg
Bunch of previous discussions:

[https://hn.algolia.com/?query=zstandard&sort=byPopularity&da...](https://hn.algolia.com/?query=zstandard&sort=byPopularity&dateRange=all&type=story&storyText=false&prefix&page=0)

------
glinscott
If you are interested in compression at all, be sure to take a trip through
Charles Bloom's blog [1]. It's an incredible read; he covers everything from
the basics all the way through to state-of-the-art algorithms.

A great example is this post [2], where he talks about how to correctly
implement a Huffman encoder/decoder. It's a lot trickier than most books make
it sound. For example, most Huffman codes used in practice are length-limited,
to allow the decoder to use smaller lookup tables. There are a bunch of
surprisingly interesting tricks to get that to work well on the encoding side
(which symbols do you choose to make shorter than they would be otherwise?).

[1] [http://cbloomrants.blogspot.com/](http://cbloomrants.blogspot.com/)

[2] [http://cbloomrants.blogspot.com/2010/08/08-12-10-lost-huffman-paper.html](http://cbloomrants.blogspot.com/2010/08/08-12-10-lost-huffman-paper.html)
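
To make the length-limiting trick concrete: here is a minimal C sketch of one
common heuristic (clamp the Huffman code lengths, then repair the Kraft
inequality by lengthening other codes). This is an illustration only, not how
zstd's Huff0 does it; the function name is made up, an optimal assignment
would come from the Package-Merge algorithm, and a real encoder would also
pick _which_ codes to lengthen based on symbol frequency, which this sketch
sidesteps:

    /* Force all Huffman code lengths to be <= max_len. Works in units of
     * 2^-max_len so the Kraft sum stays integral; a valid prefix code
     * requires kraft <= 1 << max_len. */
    static void limit_code_lengths(unsigned *len, int n, unsigned max_len) {
        unsigned long long kraft = 0, budget = 1ULL << max_len;
        for (int i = 0; i < n; i++) {
            if (len[i] > max_len) len[i] = max_len;  /* cap over-long codes */
            kraft += 1ULL << (max_len - len[i]);
        }
        /* Capping may have broken the Kraft inequality; lengthen other
         * codes (ideally the rarest symbols first) until it holds again. */
        while (kraft > budget) {
            for (int i = 0; i < n && kraft > budget; i++) {
                if (len[i] < max_len) {
                    kraft -= 1ULL << (max_len - len[i] - 1);  /* one extra bit halves its share */
                    len[i]++;
                }
            }
        }
    }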

~~~
lifthrasiir
I like Charles Bloom's blog, but I'm not sure it's very approachable (after
all, it's "rants" :-). If you want more variety in your reading, ryg's blog
[1] also makes for a good read. Start with the most recent series on
efficiently reading bits [2].

[1] [https://fgiesen.wordpress.com/category/compression/](https://fgiesen.wordpress.com/category/compression/)

[2] [https://fgiesen.wordpress.com/2018/02/19/reading-bits-in-far-too-many-ways-part-1/](https://fgiesen.wordpress.com/2018/02/19/reading-bits-in-far-too-many-ways-part-1/)

------
pierreneter
It has been standardized as RFC 8478
([https://tools.ietf.org/html/rfc8478](https://tools.ietf.org/html/rfc8478))
and registered in the Hypertext Transfer Protocol (HTTP) Parameters
([https://www.iana.org/assignments/http-parameters/http-parameters.xhtml](https://www.iana.org/assignments/http-parameters/http-parameters.xhtml))
under the name: zstd

~~~
ncmncm
"Despite use of the word "standard" as part of its name, readers are advised
that this document is not an Internet Standards Track specification; it is
being published for informational purposes only."

So, there's an RFC, but it is not standardized, per se. But close enough, for
many uses.

~~~
duskwuff
It's as standard as most file formats get. The IETF reserves the phrase
"Internet Standard" for certain _very_ important protocols -- there's no place
in the process for file formats.

~~~
tialaramex
The IETF has a Standards process, which develops new de jure standards, and
this just isn't one of those. Most products of the IETF's dozens of working
groups (e.g. the replacement effort for CAT is being worked on by a group
named kitten) will be Standards Track documents.

IETF working groups are perfectly capable of defining file formats, e.g. RFC
7468.

Zstandard wasn't developed using the IETF process; that's why it isn't on the
IETF Standards Track. PNG likewise is not on the Standards Track, whereas Ogg
(the container format) is.

------
yjftsjthsd-h
So "real-time" here means "probably line speed", not "hard realtime (constant
time per byte)", right? That's probably _more_ impressive, but a bit ambiguous
phrasing.

~~~
caf
Probably "online" would be a better term than "real-time" for this meaning.

------
maxpert
I’ve used lz4 several times for compressing blobs and storing them for serving
real-time traffic. In my opinion, LZ4 and Snappy hit a fine balance of CPU
usage and network usage when you are sensitive to p99 latencies. ZSTD seems to
pay more attention to compressed size. I hope to do a real-life test instead
of synthetic benchmarks. Has someone already used it in prod?

~~~
matsur
One production write up, compressing Kafka messages at Cloudflare:
[https://blog.cloudflare.com/squeezing-the-firehose/](https://blog.cloudflare.com/squeezing-the-firehose/)

“For our data zstd was giving amazing results even on the lowest compression
level. Compression ratio was better than even gzip at maximum compression
level, while throughput was a lot higher.”

------
franciscop
The "training mode" bit sounds amazing. Imagine training on the top 1000
websites with this in 3 modes: zhtml, zjs, and zcss. Then make the training
output a different standard, that basically encode the peculiarities of the
languages. Finally apply compression in the server and in the browser, that
would be basically the same as zstd but without having to send the dictionary
each time.

It might be small gains, it might be large ones on both transfer size and
decompression speed, I'd love to see some tests on this. The best thing is
that, if a browser (say Chrome) and a CDN (say Cloudflare) agreed on something
like this there would be no need to even to anything on the front-end nor the
server side, automatic free benefit for the users.
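
For anyone curious what "training" looks like against the real library: below
is a minimal sketch using zstd's public dictionary-builder API (zdict.h). The
sample data here is synthetic filler and error handling is minimal; real
training wants thousands of representative samples (e.g. small HTML/JS/CSS
files):

    #include <stdio.h>
    #include <zdict.h>

    int main(void) {
        /* Toy training set: many small, similar records laid out back to
         * back, with each sample's size recorded alongside. */
        enum { NB_SAMPLES = 1000, SAMPLE_LEN = 64 };
        static char samples[NB_SAMPLES * SAMPLE_LEN];
        size_t sampleSizes[NB_SAMPLES];
        for (int i = 0; i < NB_SAMPLES; i++) {
            snprintf(samples + i * SAMPLE_LEN, SAMPLE_LEN,
                     "{\"user\":%d,\"status\":\"ok\"}", i);
            sampleSizes[i] = SAMPLE_LEN;
        }

        char dict[4096];  /* small for the example; ~16-112 KB is typical */
        size_t dictSize = ZDICT_trainFromBuffer(dict, sizeof(dict),
                                                samples, sampleSizes, NB_SAMPLES);
        if (ZDICT_isError(dictSize)) {
            fprintf(stderr, "training failed: %s\n", ZDICT_getErrorName(dictSize));
            return 1;
        }
        printf("built a %zu-byte dictionary\n", dictSize);
        /* Ship the dictionary once; both sides then reference it, e.g. via
         * `zstd -D dict file` or the ZSTD_*_usingDict() APIs. */
        return 0;
    }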

~~~
felixhandte
We are investigating doing that! We're pretty excited about the possibilities
too!

------
waterhouse
Standard comment on zstd: The "zstdcat" program (equivalent to "zstd -cdfq", I
think) is capable of reading a few other compression formats, including .gz
and .lz4 (apparently .lzma and .xz are available too), if support is built in
at compile time. _Unbelievably, zstdcat is 1.5x faster (in my experience) at
decoding .gz files than zcat / gunzip / gzip -d._ I dare you to try it.

Anyway, this should ease transitions away from legacy compression formats.

~~~
goodside
If you find 1.5x-faster gzip useful, you should consider benchmarking pigz, a
parallel gzip implementation that goes faster still:
[https://zlib.net/pigz/](https://zlib.net/pigz/)

Similarly there's lbzip2 ( [http://lbzip2.org](http://lbzip2.org) ) for
parallel bz2.

~~~
waterhouse
Good point. But for my use case, I have hundreds of logfiles or more, and I
parallelize at that level already. So I'm concerned with the total CPU time
per file more than anything else.

------
hultner
I've seen zstd pop up frequently in zfs and hammer2 discussions, and I've been
running lz4 on my filesystems (and in some cases for RAM compression) for a
while.

From my point of view zstd looks like a very interesting alternative to gzip,
since it's an order of magnitude faster in the tests I've seen.

But lz4 still seems to be the champion for raw throughput with decent
compression; this might change (or has it already changed?) with the negative
compression levels in zstd.

It would be interesting to hear from people who've got a bit more hands-on
experience with zstd in these contexts.

The dictionary training, would that be applicable to a dataset/volume in a FS
context? It would be awesome if, for instance, I had one dataset for jpgs and
another for raw photos and could get good compression for each.

Media usually yields quite bad compression ratios with more traditional
compression formats; dedupe can improve this some, but it usually requires
large DDTs (deduplication tables). Could dictionary training be an alternative
in these cases?

------
nvahalik
I've been using zstd now for over a year to compress my large SQL dumps.
Consistently amazed at how fast it works and how small the results are.

------
mnw21cam
So now what the world needs is for rsync (or heck, even just ssh) to have an
option to use zstd compression instead of gzip. Using gzip compression is
great if you're moving stuff over a slow connection, but I'd like a faster
method for when I'm moving loads of data over Gb ethernet between fast discs.
I'd even settle for rsync/ssh supporting lz4.

~~~
aberoham
This may help?
[https://github.com/facebook/zstd/issues/1155](https://github.com/facebook/zstd/issues/1155)

~~~
mnw21cam
No, that's a completely different feature.

------
olliej
Ok, so they say that small-file compression is achieved via a training step
that produces a dictionary for subsequent compression/decompression cycles.

For this to work, either you need the library to include all of those models,
or you have to transmit those models at least once so they can be cached by
the recipient.

I don't see why any of the other compression schemes couldn't also use that
type of bootstrap mechanism. Obviously it would not be binary compatible with
the baseline libraries, but it seems disingenuous to claim a huge improvement
if the bulk of it is coming from just that.

~~~
asdfasgasdgasdg
Those other compression algorithms _don 't_ have that feature. If they did
have that feature, they would be different than what they are. If they did
have that feature, zstd would not claim a big improvement. But they don't, so
it does.

~~~
felixhandte
To add to @ot's reply, zlib and lz4 actually do support dictionaries. (Zlib
via `deflateSetDictionary()`, lz4 via
`LZ4_loadDict()`/`LZ4_attach_dictionary()`.)

A few things set Zstd's implementation apart.

1. Zstd actually comes with tooling to generate dictionaries (`zstd --train`,
`ZDICT_trainFromBuffer()`). No other compressor ships with this capability,
even the libraries that support using dictionaries. So we use Zstd to create
dictionaries at Facebook, even when, for example, the application is using
lz4.

2. Both zlib and lz4 treat dictionaries strictly as prefixes into which LZ77
matches can be made. Zstd can additionally use metadata in the dictionary to
prime the entropy stage.

3. Zstd's support for efficiently using dictionaries is much more extensive
than other compressors'. Dictionaries are much more of a first-class citizen
in the internals of the algorithm. Zlib implements support for dictionaries
similarly to @ot's suggestion, i.e., the dictionary must be parsed/loaded at
the beginning of each compression, or (slightly more efficiently) copied from
a pre-loaded context into a working context that will then be used by the
compression. For very small inputs (which is where dictionaries are most
effective) this loading and/or copying can end up being the bulk of the work
performed. LZ4 used to work this way, but additional functionality
(`LZ4_attach_dictionary()`) was added that lets it use the dictionary in place
(as a warm-up exercise in a simpler codebase, in preparation for doing the
same work in Zstd). Zstd includes mature support for maximally pre-processing
a dictionary, producing a `ZSTD_CDict`. This object can then be searched in
place with no per-compression set-up work, which lets Zstd use a large
dictionary over and over again very efficiently.
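
To illustrate point 3, here is a minimal sketch of that one-time-setup
pattern against the public API (the dictionary bytes are a placeholder; in
reality they would come from `zstd --train`/`ZDICT_trainFromBuffer()`):

    #include <stdio.h>
    #include <string.h>
    #include <zstd.h>

    int main(void) {
        const char dictBuf[] = "...dictionary bytes...";  /* placeholder */
        const char src[] = "a small message to compress";

        /* Pay the dictionary digestion cost exactly once, up front... */
        ZSTD_CDict *cdict = ZSTD_createCDict(dictBuf, sizeof(dictBuf), 3);
        ZSTD_CCtx *cctx = ZSTD_createCCtx();

        /* ...then reuse it for many small compressions with no per-call
         * dictionary loading. */
        char dst[256];
        for (int i = 0; i < 3; i++) {
            size_t n = ZSTD_compress_usingCDict(cctx, dst, sizeof(dst),
                                                src, strlen(src), cdict);
            if (ZSTD_isError(n)) {
                fprintf(stderr, "error: %s\n", ZSTD_getErrorName(n));
                break;
            }
            printf("compressed %zu -> %zu bytes\n", strlen(src), n);
        }

        ZSTD_freeCCtx(cctx);
        ZSTD_freeCDict(cdict);
        return 0;
    }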

------
bruce_one
I always find [lrzip](https://github.com/ckolivas/lrzip) is underappreciated
when it comes to compression discussions; it doesn't suit all circumstances,
but it works really well in the ones it does (we're using it with the
nocompress flag and then using zstd, hence why it comes to mind :-) )

Edit: it's not well suited to real-time...

~~~
terrelln
If you're using lrzip, you should also check out zstd long range mode [0]. It
uses a long window (128 MB by default, up to 2 GB), together with an efficient
search strategy, and multithreading. For example, a 2 GB window, with 4
threads, at level 10:

    
    
    zstd --long=31 -T4 -10

It should be faster than lrzip + zstd and provide about the same results.

[0]
[https://github.com/facebook/zstd/releases/tag/v1.3.2](https://github.com/facebook/zstd/releases/tag/v1.3.2)
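
For library users, the same mode is reachable through the advanced parameter
API. A minimal sketch (the function name is mine, the parameter names are from
the newer stable API, and error checks on the setters are elided):

    #include <zstd.h>

    /* Configure a compression context to mirror `zstd --long=31 -T4 -10`. */
    ZSTD_CCtx *make_long_range_cctx(void) {
        ZSTD_CCtx *cctx = ZSTD_createCCtx();
        if (cctx == NULL) return NULL;
        ZSTD_CCtx_setParameter(cctx, ZSTD_c_compressionLevel, 10);
        ZSTD_CCtx_setParameter(cctx, ZSTD_c_nbWorkers, 4);  /* like -T4 */
        ZSTD_CCtx_setParameter(cctx, ZSTD_c_enableLongDistanceMatching, 1);
        ZSTD_CCtx_setParameter(cctx, ZSTD_c_windowLog, 31); /* 2 GB window */
        return cctx;
    }

Note that a window this large also requires the decompressor to raise its
limit (`--long=31` on the CLI, or `ZSTD_d_windowLogMax` via
`ZSTD_DCtx_setParameter()`), and `ZSTD_c_nbWorkers` needs a
multithreading-enabled build.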

------
walrus01
Very nice to see that it's released under a standard BSD license.

~~~
pmontra
And also GPLv2.

> The project is provided as an open-source dual BSD and GPLv2 licensed C
> library.

I don't understand how this works. "You either contribute back your changes or
not". Wouldn't the BSD license be enough for that?

~~~
dymk
It's so that projects licensed under either GPLv2 or BSD can use it, with no
worry about license incompatibility.

~~~
geokon
Is that meaningful? Since you can make it closed source, can't you just as
well fork a BSD project and re-license it as GPLv2?

~~~
TheDong
Copyright does not work that way.

You cannot relicense code you are not the copyright holder of (ignoring public
domain, that's a special case).

Just because it is under a permissive BSD license does not mean that you own
the copyright; only that the copyright owner permits you to use the material
permissively.

You are still unable to re-license it unless you are the copyright owner (or
the copyright owner gives you permission to do so, e.g. by dual-licensing the
code).

In practice, there's little reason to license under both BSD and GPL since the
BSD is compatible with the GPL on its own.

By the way, that might be what you're thinking of; when you have BSD code, you
can integrate it into a GPL codebase, but that's not because you're
re-licensing it as GPL, but because the GPL is explicitly meant to be
compatible with most other free software licenses. You're still using said
code under the terms of the BSD, which allow you to use it alongside GPL code.

Presumably what has happened here is that Facebook's lawyers felt more
comfortable being more explicit about the intended licenses, though I can't
think of a concrete reason why they'd feel the need to do so.

~~~
terrelln
GPLv2 was added (before the patent grant was removed) so that zstd could be
included in the Linux kernel.

------
sligor
So... choose:

- lz4 for speed/CPU usage
- zstd for compression ratio
- zlib for portability/compatibility, with decent compression ratio... but slow...

And dump everything else?

