
Zstandard v1.3.4 – faster everything - terrelln
https://github.com/facebook/zstd/releases/tag/v1.3.4
======
koolba
One of the coolest parts of zstd is the simple support for custom
dictionaries. If you have a lot of mid-sized blobs that you want to separately
compress (so that you can separately decompress them), you can create a common
dictionary that covers the entire corpus. In real-world use cases the
compression ratio can go from 3x to 9x:
[https://github.com/facebook/zstd#the-case-for-small-data-compression](https://github.com/facebook/zstd#the-case-for-small-data-compression)
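
For a concrete picture, here is a minimal sketch of the dictionary API (the
file name, blob contents, and compression level are placeholders, and the
dictionary would normally be trained beforehand, e.g. with the CLI's --train
mode or ZDICT_trainFromBuffer):

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <zstd.h>   /* ZSTD_compress_usingDict, ZSTD_compressBound */

    /* Load an already-trained dictionary (e.g. built with `zstd --train`)
       and use it to compress one small blob. "blobs.dict" is a placeholder. */
    static void* load_file(const char* path, size_t* size) {
        FILE* f = fopen(path, "rb");
        if (!f) return NULL;
        fseek(f, 0, SEEK_END);
        *size = (size_t)ftell(f);
        fseek(f, 0, SEEK_SET);
        void* buf = malloc(*size);
        if (buf && fread(buf, 1, *size, f) != *size) { free(buf); buf = NULL; }
        fclose(f);
        return buf;
    }

    int main(void) {
        size_t dictSize;
        void* dict = load_file("blobs.dict", &dictSize);
        if (!dict) { fprintf(stderr, "missing dictionary\n"); return 1; }

        const char* blob = "one mid-sized record out of many similar ones";
        size_t srcSize = strlen(blob);
        size_t bound = ZSTD_compressBound(srcSize);
        void* dst = malloc(bound);

        ZSTD_CCtx* cctx = ZSTD_createCCtx();
        /* The same dictionary must be supplied again at decompression time,
           e.g. via ZSTD_decompress_usingDict. */
        size_t csize = ZSTD_compress_usingDict(cctx, dst, bound, blob, srcSize,
                                               dict, dictSize, 3 /* level */);
        if (ZSTD_isError(csize)) {
            fprintf(stderr, "%s\n", ZSTD_getErrorName(csize));
            return 1;
        }
        printf("compressed %zu -> %zu bytes\n", srcSize, csize);

        ZSTD_freeCCtx(cctx);
        free(dst);
        free(dict);
        return 0;
    }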

What was the finality on the issue of patents with this library? Is there an
active covenant from Facebook not to sue users?

~~~
terrelln
The library is dual licensed under plain BSD [1] and GPLv2 [2].

[1]
[https://github.com/facebook/zstd/blob/dev/LICENSE](https://github.com/facebook/zstd/blob/dev/LICENSE)
[2]
[https://github.com/facebook/zstd/blob/dev/COPYING](https://github.com/facebook/zstd/blob/dev/COPYING)

~~~
koolba
Which only makes things more confusing. Is it dual licensed as in both apply,
or dual licensed as in you pick one? That only complicates the patent question
further.

~~~
terrelln
The GPLv2 license was added for inclusion in the kernel; you may choose either
GPLv2 or BSD. See the header file
[https://github.com/facebook/zstd/blob/dev/lib/zstd.h](https://github.com/facebook/zstd/blob/dev/lib/zstd.h).

~~~
koolba
Sweet. Would be nice if that were in the README too, as it only mentions the
BSD license, while the top level includes the GPL as well.

There's still the open patent question but this (i.e. BSD license) makes it
easier to decide to use it.

~~~
terrelln
It is mentioned at the bottom of the README, but it isn't very findable. I've
opened a PR
[https://github.com/facebook/zstd/pull/1085](https://github.com/facebook/zstd/pull/1085).

~~~
exikyut
Which got accepted (an hour after you submitted it) - nice!

------
portmanteaufu
From the project homepage[1]:

"Zstandard, or zstd as short version, is a fast lossless compression
algorithm, targeting real-time compression scenarios at zlib-level and better
compression ratios. It's backed by a very fast entropy stage, provided by
Huff0 and FSE library."

1: [https://github.com/facebook/zstd](https://github.com/facebook/zstd)

------
stochastic_monk
A very important aspect of this project is the zlibWrapper. It provides an
interface for transparently reading from zstd-compressed, zlib-compressed, or
uncompressed files with one API. It's quite fast, provides an excellent
balance between speed and compression ratio, and is generally my preferred way
to work with compressed data.
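
As a rough illustration, here is a sketch of what that one-API reading looks
like, assuming the program is built against zstd's zlibWrapper (its
zstd_zlibwrapper.h used in place of zlib.h); the input file name is a
placeholder:

    #include <stdio.h>
    #include "zstd_zlibwrapper.h"  /* stands in for <zlib.h>; build paths vary */

    /* Dump a file that may be zstd-compressed, zlib/gzip-compressed, or plain,
       using the ordinary zlib gz* calls. "input.any" is a placeholder name. */
    int main(void) {
        gzFile in = gzopen("input.any", "rb");
        if (in == NULL) { fprintf(stderr, "cannot open input\n"); return 1; }

        char buf[4096];
        int n;
        while ((n = gzread(in, buf, sizeof(buf))) > 0)
            fwrite(buf, 1, (size_t)n, stdout);

        gzclose(in);
        return 0;
    }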

Their emphasis on branchless coding is even more valuable now than it was when
zstd was released, since we are living in a post-Spectre/Meltdown world.

I dislike Facebook as a company, but I cannot find fault with their
engineering.

[Edit/Off-topic: I want someone to make an LZ77/DEFLATE/Led Zeppelin pun-based
product sometime. Please. That joke has been begging to be made since at least
1990.]

~~~
tmd83
In this case, though, the core innovation actually happened pre-Facebook.
Post-Facebook it got more polish, tweaking, multi-threading, etc., but I don't
recall any breakthrough kind of change.

------
segmondy
I'll still pick lz4 based on the benchmark.

~~~
loeg
Which algorithm is best is use-case dependent. As of now, zstd offers best-in-
class compression for a wider variety of use cases than lz4. lz4 (created by
the same author) still wins for high-throughput software compression, yes. But
_zstd --fast_ levels 4 or 5 are getting pretty close.

~~~
0xcde4c3db
It's not obvious to me what the relevant measurements are on the zstd side,
but I'm pretty sure lz4 wins considerably where code size and RAM footprint
are major considerations, as in some bootloader and embedded firmware
situations.

------
devinus
These are rather large release notes for a patch release.

------
realPubkey
I wonder if it's possible to create a blockchain where the proof-of-work
consists of building a better compression dictionary instead of doing useless
hashing.

~~~
s17n
No

~~~
h4l0
To give a little detail on why this is not possible: a proof-of-whatever should
be quick to verify but hard to generate. Compression might sound like it can
meet both requirements, but the raw data would need to be very large for the
problem to be non-trivial, and large data is a red flag in any blockchain
application.

------
throwaway84742
I wish he’d stop tweaking it at this point. Newer versions are incompatible
with older versions, which is a cardinal sin if you want mass adoption for
something like this.

Or at least identify a stable subset of some sort and put it into another tool
that I would not be hesitant to use.

So when I need something very fast I use Snappy or lz4, and when I need a
decent compression ratio I use pigz.

~~~
antientropic
Incompatible in what way? API? File format?

~~~
terrelln
Zstandard also maintains ABI stability for a portion of the API, and requires a
macro definition to access the unstable parts.
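
Concretely, the opt-in looks something like this (a minimal sketch):

    /* The stable declarations in zstd.h are visible by default and covered by
       the API/ABI guarantee. Defining this macro before the include exposes
       the experimental section, which is intended for static linking only and
       may change between releases. */
    #define ZSTD_STATIC_LINKING_ONLY
    #include <zstd.h>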

