Hacker News new | past | comments | ask | show | jobs | submit login
Zstandard – Real-time data compression algorithm (facebook.github.io)
286 points by josephscott on Jan 24, 2018 | hide | past | web | favorite | 96 comments

> Zstandard is a real-time compression algorithm, providing high compression ratios. It offers a very wide range of compression / speed trade-off

This would have been very handy about 17 years ago when I was doing a virtual CD-ROM driver, and needed to store compressed images, and the client wanted a wide range of speed/compression options for the user--much wider than zlib offered.

I ended up doing a hack that was either the most disgusting thing I have ever done, or was brilliant. I have not been able to decide which.

My hack: I gave them a compression/speed slider that went from 0 to 100. If the slider was set to N, I would apply zlib at maximum compression to N consecutive blocks, and then apply no compression to 100-N consecutive blocks. Repeat until the whole image is stored.

The client loved it.

Performance wise it's far from ideal. In your N block you're going to spend a lot of time, per-byte, to compress while leaving trivially compressible data in the remaining blocks. For N=50, you'd get much better compression for the same amount of cycles applying a faster compression to all the bytes.

> I would apply zlib at maximum compression to N consecutive blocks, and then apply no compression to 100-N consecutive blocks

Zlib is actually highly configurable - it has a huge number of tweaks beyond just that one flag of size.

DEFAULT_STRATEGY is pretty much only one possible strategy.

You can tweak the LZ77, RLE and Huffman separately to get different compression performance (the levels correspond to different LZ77 lookups and deflate_fast vs deflate_slow).

In my work, I use different zlib strategies for different data (i.e already bit-packed integers vs raw strings), which works out pretty well.

My current complaint is that the Zlib algorithm doesn't quite detect record boundaries when building a dictionary, which is all only relevant to the compression algorithm, because the output is eventually just a DEFLATE stream (Zopfli is an example of an encoder only Zlib library).

Zstandard has a much wider set of options to use already built-in (fast, greedy, lazy, btlazy, btopt, btultra) etc.

Nice hack, I like it!

Small proposal for improvement: Instead of

    N * "gzip -9"  +  (100-N) * "cat"
I would have used

    N * "gzip -9"  +  (100-N) * "gzip -1"
because "gzip -1" is really, really fast. Sometimes I use it just to speed up copying data (over fast network or onto large disks).

(Yes, I'm aware that zlib's output is slightly different from gzip's output, as gzip adds some more metadata on top of zlib, but this is irrelevant in terms of size and speed.)

My small proposal for further improvement: change the -9 to a -7.

In my experience, -9 is considerably slower than -7 but the gains in compression ratio are negligible. It's almost never a good trade-off.

You would be surprised how much it is relevant in term of speed ....

Reminds me of how LEDs are dimmed using Pulse Width Modulation: "40% brightness" actually means "on at 100% intensity for 40% of the length of each refresh cycle, and then completely off for the rest."

One notable thing is that perceived brightness is not a linear function of the duty cycle.

I don't know the numbers, but there exists a frequency and duty cycle at which humans perceive the light to be brighter even though there's less energy used.

Or playing PCM on the old PC speakers having only on/off beep function... It seemed like a miracle at the time :)

> Or playing PCM on the old PC speakers having only on/off beep function

That is PWM, pulse width modulation.


Ok then: playing PCM with PWM...

PC speakers ended up being pretty amazing for some games though - there was this golf game with good environmental sounds, some platformer that had spoken text, all through the old PC speaker. Sound blaster? Pffff.

Well, you could play 4-channel mods via PC speaker! Many games used it, but I most fondly remember Star Control II that was bathing in great music and sound effects, all through PWM-driven PC speaker, on a 33 MHz 80386.

I believe microwave ovens operate this way, too. 40% power means 100% power 40% of the time.

That's pretty crazy and definitely brilliant in that it solved a client problem entirely to their satisfaction.

The client loved it and it solved their problem. Therefore I'd call it brilliant.

Or the more generic: if it is you think it is stupid but it works, you might be wrong about it being stupid.

I used to like the saying, "If it's stupid but it works, then it's not stupid".

Having seen some of the things my co-workers have implemented, however, I now disagree completely.

So your code presented a 'virtual' optical drive to the OS, and the compression was to squeeze the on-disk virtual CD image?

I imagine compression could be a real performance win when reading from optical media, too.

Anyone here have any experience developing games for optical-media-based games consoles? I imagine high-performance on-the-fly decompression could have been useful on the PS2, for instance. Or perhaps I'm wrong and the weak CPU and slow seek times dominate things.

Vaguely related game-data madness: How Naughty Dog Fit Crash Bandicoot into 2MB of RAM on the PS1, https://news.ycombinator.com/item?id=9737156

Edit: I see compression is indeed used in those contexts - https://news.ycombinator.com/item?id=16228840

Compression has long been used in console games. Not only can you fit a bigger game on a ROM, but you could also use smaller ROMs (which were cheaper) to save money in the cartridge days.

Some CD-ROM games would actually store more than one copy of assets on the disc. It's much faster to read all your level data in one swoop than to use performance-killing seeks.

Sounds like your client was a moron.

How so?

The client, a major Japanese software distributor who was going to be selling copies of the software to end users, wanted a way to let the user decide how they wanted to trade off ripping speed with disk usage.

That was a perfectly reasonable thing for the client to want, given his extensive knowledge of the hardware characteristics of the computers that were in widespread use among his potential customers, and his knowledge of Japanese usage patterns.

What I gave him did that.

So how exactly is he a moron for (1) asking for a reasonable feature, and (2) being happy that he received that feature?

Asking for a “compression slider” with a 0-100 range, when zlib already had compression levels 0-9, is not reasonable. It simply does not make any difference in compression/speed, and needlessly complicates the storage format. The fact that he pushed it through, and was so satisfied with such an ad-hoc, needless hack, clearly illustrates that he’s one of those moronic clients who think they know everything better than the contractor and keep coming up with an ever increasing number of ridiculous and infuriating change requests.

He didn't ask for 0-100. He asked for multiple settings so the user could do a speed/compression trade off, but didn't say how many settings I should give him.

I tried zlib's compression level settings. Maybe it's different in modern zlib implementations, on today's hardware, but a couple of decades ago on typical hardware of the day the 10 compression levels of zlib really were clustered around only two or three performance levels.

I'm the one who decided on the skipping block approach, and decided to give him a percentage slider. I didn't actually expect him to accept that as is. I expected he'd test it at various settings, and pick out 4 or 5 specific percentages and ask me to replace the slider with radio buttons that selected those, and give them descriptive names.

But a lot of users like to have more knobs and tweaks on their software, especially their utility software, so my guess is that based on his understanding of the market he thought that his customers were those kind of users so accepted the slider.

Not exactly an unusual circumstance. And that is a good thing. If our clients understood the technicalities of what we do as well as we do for them, they probably wouldn't need to pay us for project work...

And it's our moral imperative to tell them why the thing they're asking for doesn't make any sense. If I were to ask a plastic surgeon to put wheels on my butt, the reasonable thing to do for him would be to reject my request, and explain me why it doesn't make any sense. I wouldn't expect him to do it and then brag on an internet forum about the clever hack he did to fit those Michelin tires onto my buttocks.

> And it's our moral imperative to tell them why the thing they're asking for doesn't make any sense.

It is, and I do.

But if they still want it anyway, unless it poses a security problem or other ethical issue or would reflect badly on us (as doing something silly very publicly could do), then it comes down to "we get paid for it or someone else does".

zstd is incredible, but just in case the thought hasn't occurred to someone here that may benefit from it: if you're in control of the send and the receive, type-specific compression is hard to beat.

For example, if you know you're dealing with text, you can use snappy, if you know you're dealing with images, webp, videos x264 (or x265 if you only care about decode speed and encoded size), etc and then fall back to zstd only when you don't have a specific compressor for the chosen file type.

Agreed that if you have the control (and potentially the time depending on the algorithm), type specific compression is the way to go. Having said that, zstd beats Snappy handily for text ^_^

On enwik8 (100MB of Wikipedia XML encoded articles, mostly just text), zstd gets you to ~36MB, Snappy gets you to ~58MB, while gzip will get you to 36MB. If you turn up the compression dials on zstd, you can get down to 27MB - though instead of 2 seconds to compress it takes 52 seconds on my laptop. Decompression takes ~0.3 seconds for the low or high compression rate.

ppmd will get you better compression on text (better than zstd and even better than lzma), but it is slow to compress and decompress. Use `7z a -m0=PPMd demo.7z demo.txt`

Why not use a XML/XSD specific compression like EXI for that ?


It really is mostly just text. It's not _quite_

<Article>... [30 kb of text] ...</Article>

but almost.

It really depends. I tend to use a specialized compression tool if I need to compress once and send/decompress often, but use zstd when I compress / decompress a lot. In my experience, if you have a fix, small amount of time(single digit minutes or less), zstd is the one that will compress to the smallest size. I even often pick `-3` as it is typically a lot faster than `-4` and subsequent, for not a huge difference in resulting size.

In my experience, if compression time is not a factor, for text (non-random letters and numbers), lzip is the best. I recently had to redistribute internally the data from python nltk, and tried to compress/decompress with different tools, this was my result (picked lzip again):

    gzip -9                 10 m  503 MiB  31 s
    zstd -19                29 m  360 MiB  29 s
    7za a -si               26 m  348 MiB     s
    lzip -9                 78 m  310 MiB  50 s
    lrzip -z -L 9 (ZPAQ)   125 m  253 MiB  95 m

I did some tests myself on a 22MB SQL file and it turns out:

* 7za -m0=PPMd produced the smallest file being faster than bzip2

* bzip2 turned out to be way faster than both lz (684%) and xz (644%) and produced a smaller file

* xz is marginally faster than lz, compressed sizes are about the same with the xz file being a tad smaller

* without any switches 7za produces an archive a bit bigger than xz and lzip in about the same amount of time

* gzip and zst produce about the same compressed size, only zstd is a lot faster (517%) than gzip

The 7z file was produced using the -m0=PPMd switch. For the other files no command line switches were supplied. Here are the file sizes:

  23668150 file.sql
   3899477 file.sql.7z
   4149962 file.sql.bz2
   5954982 file.sql.gz
   4540628 file.sql.lz
   4506720 file.sql.xz
   5961291 file.sql.zst

When going for smallest size, it'd be interesting to see your comparison using command lines switches for best compression (makes a big difference, both in terms of time and size).

Was bzip2 slightly, or considerably slower than zst?

Bzip2 being slower than gzip, yes, it's also considerably slower than zstd. Yet zstd -19 produced a bigger (4.3M) file in about the same amount of time.

If I can remember correctly zstd = 0.2s, gzip = 0.8s, 7zip (PPMd) = 2.1s, bzip2 = 2.7s, lzip, xz, 7zip (lzma) = 15..16s. This is CPU time from memory, might not be fully accurate.

I'd say zstd and gzip is better suited for general use, while bzip2 and 7zip (PPMd) are better suited for high compression of text files.

We've also had great success using zstd with training. We dump a lot of JSON data into kafka, most of which has a similar schema, and training it easily gave a 2-3x reduction in size over lz4.

Is there a type specific algorithm for data that mostly consists of close numbers? I figure if I send the deltas only it would be just a sequence of small / close numbers that would easily be compressed by standard compression libraries.

An example of close number sequences is just simple graphs. Your CPU temperature is 78 degrees, most likely it'll be 78, 79 or 77 the next tick, so they're almost close, the delta's will be 0's and 1's usually.

Compression via next-symbol prediction seems to be what you'd be looking for. That's what the PAQ compression schemes focus on, although they're very slow and definitely overkill for non-archival purposes. You'd probably just want to write out that data as deltas manually and have the reader know a delta format is being used. So I guess the answer is actually delta encoding, because that's a compression algorithm too.

Good point. While it won't beat a hand-optimized algorithm for a specific use case, compression with a dictionary (like zstd supports/encourages) is sort of a partial specialization of the algorithm to the type of data you're compressing.

Provided that the compression stems from repetition of blocks. If you have a file with 16-bit integers, each either exactly 1 or 2 greater than the previous one, you will have 128K with no repetition that zstd will be unable to compress; However, if you transpose the bytes (all high order bytes, followed by all low-order bytes), the compression will be very significant; similarly, if you replace the numbers with their difference.

The correct term from information theory is that it approximates a "universal" compressor with respect to observable markov or fsmx sources.

zstd is in snappy's domain, as a low overhead way of reducing bandwidth usage. You've got devs writing some service that talks in JSON or protobuf, just rub a little zstd on it, and bingo, your bandwidth is reduced.

I'd pick zstd over snappy for text

I used to object to using zstd based on Facebook's onerous license and patent policy; but now that zstd is plain BSD+GPLv2, I endorse it.

Zstd recently added a long range mode that can find matches up to 2 GB in the past using a specialized algorithm [1]. It can be enabled on the command line with `zstd --long` for a 128 MB window, or `--long=windowLog` for a `2^windowLog` byte window.

[1] https://github.com/facebook/zstd/releases/tag/v1.3.2

I'm not sure if you'll see this, but I have a question. I have a situation where I need to implement in-place compression with perfect rollback in case of failure. The first approach I thought of was feeding LZMA input until I had X amount of output, then finalizing and starting over, but then I started wondering if I could simply periodically serialize the LZMA state machine data every 30 seconds or so.

I haven't investigated either approach, but this is one of my next projects I'm working on, so I'll be figuring it out either way. I was wondering, does Zstd provide any ability to serialize/restore and continue where it left off?

(Context: Disk compression. Due to long-term factors beyond my control I've had severe disk space issues for over a decade (with frequently only tens of MBs free). Complex story ending in "cannot work". I recently realized that some of my disks contained data I didn't immediately need, and began wondering about ways I might be able to compress this data out "of the way" so I could carefully partition the remaining space. This would need to be done in-place as I would have nowhere with enough space to hold the compressed data, at least not to begin with.)

Lending/buying an empty (external) drive for starting seems to be much simpler and safer.

That would be a highly preferable option.

The last time someone gave me an old empty 320GB HDD they weren't using, one of my own disks started clicking about a week later and I was able to save everything on it. I still shake my head at that perfect timing.

Of course, this meant I lost all the additional free space, haha. One step forward, two steps back...

A non-obvious drawback of this is you give up multi-threaded compression (maybe fixed in a later release). You also need to use the same memory size when decompressing.

We intend to improve the interaction of multithreaded compression and long range mode in a future release, as they go hand in hand. We also intend to support mmap in the CLI, which should help reduce the memory usage for very large window sizes when writing to a file.

I understand this to mean that you believe the pages in the long range aren't accessed frequently. Otherwise, mmap doesn't reduce memory usage so much as just hide it.

Heavy mmap usage (VMWare sometimes uses an mmap to hold a guest's memory, for example) doesn't show up as memory usage in any system monitoring tools, and the system starts to thrash long before apparent memory usage gets high. Maybe someone here has a solution to that?

Generally, pages in long ranges will be accessed less than those in a short ranges, and pages beyond the window size will never be accessed. You can construct data that doesn't fit this pattern, but its probably a good bet to make. With a 1-2 GB window size, I would expect there to be large chunks of the window that are rarely accessed.

The long range matcher has a configurable match size, where it will only look for matches that are at least that large. By default it is 64 B, but by making it larger, say 4 KB, you can ensure that if you are going to force a page fault, you get enough benefit.

We added zstd to largely supersede gzip use at work. It performs better than oldschool gzip in both compression and speed.

That said, I'm not sure I'd call it a "real-time" compression algorithm. It's still an factor of 2x slower than lz4 compress and a factor of 4x slower decompress.

It's not clear exactly what "real-time" is supposed to mean, but I don't think it makes sense to define it in terms of other algorithms. If someone developed an algorithm faster than lz4, would lz4 cease to be real-time?

I thought that 'real-time' is not about how fast a program is. Instead, it's about being able to guarantee that a program will always complete in under a fixed amount of time, under all circumstances.

I can't see how that applies to Zstandard either.

It’s a definition not used much anymore but this relates to on demand (streaming) versus ahead of time data processing.

With only a small delay, can I compress and decompress at least one second of data per second, indefinitely.

Think live TV versus film.

That's usually called an "online" algorithm, in my experience. It does seem to agree with the way it's used in this context.

Online isn't quite the right distinction here imo, since that's really more about possibility than speed. An algorithm could be online (doesn't require all the data at once to run) but still not be fast enough to stream.

Talking about compression you're generally talking about things that when you factor in network overhead have a neutral or better impact on end-to-end time.

Yes, I that would be a correct definition although practically speaking the deadline that is guaranteed is important as well. Part of the problem is that the term "real-time" is overloaded.

Hard real time certainly means that. I've seen "soft real-time" used to refer to programs where the standard deviation of the runtime given random input is low.

It's really subjective based on your computing resources client CPU + network speed and your server CPU if you can't compress ahead of time. Here's a great page that captures this: https://quixdb.github.io/squash-benchmark/#transfer-plus-pro...

TL;DR: Your optimal compression algorithm will vary based on these parameters. There's no universal answer.

A curious fact I encountered: First, the zstdcat executable that ships with zstd (actually a symlink to zstd itself, roughly equivalent to "zstd -cqdf") can decompress .gz files. Second, zstd is faster than gzip at decompressing .gz files, taking about 2/3 as long.

I'm not kidding. I couldn't believe it myself, but subsequent testing stubbornly bore it out—on one file that was 15 MB after compression, and on a mix of smaller files. I tried compiling gzip from source, using the same compiler I used for zstd, and the results were the same. strace seemed to show zstd read and wrote in chunks 2x the size, but the number of syscalls didn't seem to be nearly enough to explain the difference. It seems to have this "zlibWrapper" subdirectory; its README has some benchmark numbers that I don't fully understand, but some of them seem to match the 2/3 factor I got.

I'm wondering if this is a clever tactic to drive adoption—get people to use the zstd executable even when they're still using gzipped files. ;-)

Also, the fact that it has support for reading (and, apparently, writing) gzip, lz4, and xz on top of its own format really makes "z standard" an appropriate name.

A zlib feature is being patent free, that might very well affect speed. I do not know if that is the issue you are describing.

So, the zstd executable links to libz.so (says ldd), and I don't think it reimplements the core algorithmic part of gzip. I gather that it's simply more intelligent in how it uses the API... e.g. it says some things about reusing contexts, though that seems to apply only to handling multiple files.

Supported on Btrfs since kernel 4.14. Like other compression options, it's a mount time option.

Edit: Also in squashfs. Here's the git pull request which includes some benchmarks. https://lkml.org/lkml/2017/9/11/369

Upstream SquashFUSE also has zstd support.


I wonder how the RAD Game Tools compressors would fit into that benchmark list. In their benchmarks Oodle Selkie has roughly the same ratio as Zlib but is 3-4x faster at decompression than Zstd (not the same benchmark though). http://www.radgametools.com/oodle.htm

Asset compression has a specific optimization function (Charles Bloom of RAD Game Tools has a ton about this on his blog, cbloomrants): maximize output bandwidth given this specific CPU and input bandwidth.

Compression speed, for example, doesn't really matter. Sure, it should be something reasonable (e.g. not like PAQ), but if you can bump the above metric by 10 % while compression takes 50 % longer: probably worth it.

Aha, he actually addressed zstd in a post. tldr is zstd usually lower ratio and lower speed than mermaid but sometimes better ratio for text.

We've been using zstd for a while in production, switching over from rar/gzip/7z for archiving huge XML files. It's speed to compression ratio is really impressive. Hats off to the development team.

I use the C bindings for ZStandard, it performs very nicely as an algorithm. I wish the documentation was a little easier to use. I am still not sure if the provided examples really properly handle the degenerate cases of /dev/urandom and /dev/zero when using `ZSTD_DStream{In,Out}Size` and friends.

In any case - thanks for releasing this, it's been very helpful to me.

Some good discussion in earlier postings too: https://hn.algolia.com/?query=zstandard&sort=byPopularity&pr...

Anyone using zstd along with an incremental rsync approach? Like gzip --rsyncable ?

Say I have 1000 files. I want to compress them and let the cron rsync do it's thing. Next day, if only one file had changed, rsync should pickup only the differential instead of the whole archive.

Or is there a better way of doing it?

In the "Compression Speed vs Ratio" graph, the Y-axis doesn't start at zero. If it were changed to start at zero, it would be easier to evaluate the magnitude of the compression ratio change at a glance. IMO, that's probably worth the extra whitespace in the graph.

From that benchmark list I'd prefer lz4..

lz4 and zstd are complementary, not competitive. lz4 targets applications where extremely fast decompression speed (and compression speed) are required, at the expense of compression ratio. However, many applications would prefer to spend a bit (or a lot) more CPU in exchange for better compression, while maintaining fast decompression speed.

Last time this was discussed Zstandard didn't have a splittable mode and by the looks of it[1] they still don't. That doesn't make it a bad algorithm it just means that it's not a good choice yet for Hadoop et al. As far as I know no container format has implemented Zstandard yet.

Does anyone know any better? It seems like we could use a better alternative to Snappy.

[1] https://github.com/facebook/zstd/issues/395

> As far as I know no container format has implemented Zstandard yet.

ORC has a reserved future enum for ZSTD [1] (my guess is that FB already uses it).

The original headache was the patent licensing scheme[2] for the actual codebase, but now that hadoop has a ZStandard built into libhadoop.so, this is much more easy to ship to Apache.

My branch is so far behind now that I probably would implement it faster if I started over & used the libhadoop.so impl as the basis - https://github.com/t3rmin4t0r/orc/tree/zstd

[1] - https://issues.apache.org/jira/browse/ORC-46

[2] - https://github.com/facebook/zstd/issues/775

In a lot of big data workflows, zstd is no brainer replacement of gzip. Always faster decompression, plus you can save tons of storage and data transfer.

When one talks about general data compression, isn't that relevant that it generally applies to text compression, and not video or audio?

I guess general data compression works on audio and video, but most of the time you either choose to compress text, audio, video or you create a file format that indexes your data.

We switched many of our Redshift compression encodings to zstd. Works fantastically on text and JSON.

Not really totally this topic, but I deal with a bunch of 8 Gig images of device firmwares. Is there anything that can make compressing these and de-duplicating these images fast? I have used borg in the past, but archive operations are just seem slow.

You may want to store them uncompressed and rely on ZFS block level deduplication. It's similar to borg but may be faster.

Also, you may want to play with borg options and use lz4

So like 'middle out'

see Compression Benchmark at (https://github.com/powturbo/TurboBench)

That's the best Weissman score I have ever seen.

Come on people, take a joke

Why is this site an SPA?!?!?

Genuine question with everything happening in Intel land these days I've turned off JS except for a select few websites. Seems like an interesting project though.

It's... not? It's a static page with some JavaScript-driven charts.

It's a page with the majority of the content wrapped in an XMP element with style="display:none;". I'd consider that pretty hostile.

TL;DR: Facebook acquires "Pied Piper" to gain exclusive rights on the Mean-Jerk-Time (MJT) compression patent; renames it to Zstandard for political correctness.

That patent clause though. Stick to the fork from before Facebook bought the man and the algorithm.

It was relicensed in v1.3.1. It is now dual licensed under BSD without the patents clause [1] and GPLv2 [2]. GPLv2 was added for inclusion in the Linux kernel.

[1] https://github.com/facebook/zstd/blob/dev/LICENSE

[2] https://github.com/facebook/zstd/blob/dev/COPYING

Applications are open for YC Winter 2020

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact