
Now using Zstandard instead of xz for package compression - nloomans
https://www.archlinux.org/news/now-using-zstandard-instead-of-xz-for-package-compression/
======
Lammy
Meta: This post is yet another victim of the HN verbatim title rule despite
the verbatim title making little sense as one of many headlines on a news
page.

How is "Now using Zstandard instead of xz for package compression" followed by
the minuscule low-contrast grey "(archlinux.org)" better than "Arch Linux now
using Zstandard instead of xz for package compression" like it was when I
originally read this a few hours ago?

~~~
nmstoker
Saying it's "yet another victim" seems slightly too emotive to me.

If people can't read the source site's domain after the headline then I agree
there wouldn't be much context, but equally, if they can't read that, surely
their best solution is to adjust the zoom level in the browser.

It's clear you won't get complete context from the headline list plus domain,
but a hint of it is provided and if you want more you click the link. Maybe
I'm being a little uncharitable but I don't see a big problem here.

~~~
rat9988
Even using the source domain isn't informative enough. The alternative
headline is better. You are being too charitable to an inferior title.

~~~
Dylan16807
"archlinux.org" is less informative than "Arch Linux"?

I'm sympathetic to disliking the change, but that's taking it to an extreme.

~~~
rat9988
With this title, I thought it was a post advocating the use of Zstandard or
vaunting its technical merits, and that it was posted on archlinux.org.

When you take out information, don't expect people to guess correctly.

------
WinonaRyder
Zstandard is awesome!

Early last year I was doing some research that involved repeatedly grepping
through over a terabyte of data, most of it tiny text files that I had to
un-zip/7zip/rar/tar, and it was painful (maybe I needed a better laptop).

With Zstd I was able to re-compress the whole thing down to a few hundred gigs
and use ripgrep which solved the problem beautifully.
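
For anyone wanting to reproduce that kind of setup, here's a minimal sketch under assumed paths and gzip'd inputs -- the corpus/ directory and file names are made up, and it relies on ripgrep's --search-zip support shelling out to the zstd binary:

      # recompress gzip'd text files to zstd in place (destructive; test on a copy first)
      find corpus/ -name '*.gz' -exec sh -c \
        'gunzip -c "$1" | zstd -q -o "${1%.gz}.zst" && rm "$1"' _ {} \;
      # search the compressed tree directly; -z/--search-zip decompresses on the fly
      rg -z "needle" corpus/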

Out of curiosity I tested compression with (single-threaded) lz4 and found
that multi-threaded zstd was pretty close. It was an unscientific and maybe
unfair test but I found it amazing that I could get lz4-ish compression speeds
at the cost of more CPU but with much better compression ratios.

EDIT: Btw, I use arch :) - yes, on servers too.

~~~
bufferoverflow
Here's a compression benchmark.

[http://pages.di.unipi.it/farruggia/dcb/](http://pages.di.unipi.it/farruggia/dcb/)

Looks like Snappy beats both LZ4 and Zstd in compression speed and compression
ratio, by a huge margin.

LZ4 is ahead of Snappy in decompression speed.

~~~
ncmncm
I find these numbers for Snappy entirely implausible.

The numbers I do know about are wrong: zstd always beats gzip for compression
ratio.

I will need to do my own testing.

~~~
ncmncm
I have tested snzip 1.0.4.

It compresses about as well as lz4, but more slowly. It also decompresses more
slowly.

It is faster than zstd -1, but compresses less well.

It is possible that it does better with certain kinds of data, but 12x remains
implausible.

Apparently the current file format has suffix ".sz".

------
filereaper
Apparently this is how to use Zstd with tar, if anyone else was wondering:

      tar -I zstd -xvf archive.tar.zst

[https://stackoverflow.com/questions/45355277/how-can-i-decom...](https://stackoverflow.com/questions/45355277/how-can-i-decompress-an-archive-file-having-tar-zst)

Hopefully there's another option added to tar that simplifies this if this
compression becomes mainstream.

~~~
viraptor
tar has accepted `-a` for format autodetection for a while now. You can do:

        tar -axf archive.tar.whatever

and it should work for gz, bz2, Z, zstd, and probably more (verified to work
for zstd on GNU tar 1.32).

~~~
yjftsjthsd-h
I think that's a GNU extension, so obviously fine on Arch, but probably not
available by default on e.g. macOS (Darwin) or Alpine (busybox).

~~~
_ZeD_
With GNU tar, the -a flag is not needed

~~~
Carpetsmoker
You don't need it with libarchive-based BSD tar either. It can also extract
zipfiles, ISO archives, and many other formats with just "tar xf file.zip".

~~~
qalmakka
libarchive is really awesome! I've started using bsdtar everywhere, it's so
well done and polished I never felt the need to bother with anything else
(same goes for bsdcpio)

------
m4rtink
BTW, Fedora recently switched to zstd compression for its packages as well,
for basically the same reasons: much better overall de/compression speed while
keeping the result mostly the same size.

One more benefit of zstd compression that is not widely noted: a zstd file
compressed with multiple threads is byte-identical to one compressed with a
single thread. So you can use multi-threaded compression and still end up with
the same file checksum, which is very important for package signing.

On the other hand xz, which was used before, produces a _binary different
file_ when compressed with a single thread versus multiple threads. This
basically precludes multi-threaded compression at package build time, as the
compressed file checksums would not match if the package were rebuilt with a
different number of compression threads. (The unpacked payload will always be
the same, but the compressed xz file _will_ be binary different.)
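
A quick way to check this for yourself (a sketch; payload.tar stands in for any package payload). If the claim above holds, the two zstd digests match while the two xz digests differ:

      zstd -19 -T1 -c payload.tar | sha256sum
      zstd -19 -T4 -c payload.tar | sha256sum   # expected: same digest as above
      xz -9 -T1 -c payload.tar | sha256sum
      xz -9 -T4 -c payload.tar | sha256sum      # expected: differs from the -T1 xz digest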

------
ncmncm
Zstd has an enormous advantage in compression and, especially, decompression
speed. It often doesn't compress _quite_ as much, but we don't care as much as
we once did. We rebuild packages more than we once did.

This looks like a very good move. Debian should follow suit.

~~~
beatgammit
I build packages periodically from the AUR, and compression is the longest
part of the process much of the time. For a while, I disabled compression on
AUR packages because it was becoming enough of a problem for me to look into
solutions. If it's annoying for me, I can imagine it's especially problematic
for package maintainers. I can only imagine how much CPU time switching the
compression tool will save.

~~~
SamWhited
I love the AUR, but every single time I have to wait for it to compress
Firefox Nightly, and then wait for it to immediately decompress it again
(the only reason I was building the package in the first place was to install
it), I about lose my mind. Hopefully this helps, but I really wish AUR helpers
would just disable compression and call it a day, so I don't have to go mess
with config files in a way that would also change my manual package-building
routine.

__EDIT__: nevermind, this doesn't seem to have made zstd the default for
building packages locally, just for the ones you download from the official
repos. Guess I'll go change that by hand and then still be sad that I can't
easily disable compression entirely for AUR helpers while keeping it for the
packages I build myself.

~~~
beanaroo
This isn't a function of an AUR helper but rather makepkg itself.

In makepkg.conf, omit compression by specifying:

        PKGEXT='.pkg.tar'

More information can be found at
[https://wiki.archlinux.org/index.php/Makepkg#Tips_and_tricks](https://wiki.archlinux.org/index.php/Makepkg#Tips_and_tricks)

~~~
SamWhited
Then it happens for packages I build myself (and want compressed) and those
that I just want to install.

~~~
zerogara
If you care about space more than speed, you may want to stick with xz; its
ratio is hard, if not impossible, for zstd to beat. So set your own priorities
rather than adopting those of the Arch devs.

As long as the tools keep supporting xz, individual builders of packages can
use whichever they prefer for their own use.

~~~
SamWhited
I don't really care about space all that much, but packages I build tend to
get uploaded, and the people downloading them may not have fast internet.
Meanwhile, for packages built by my AUR helper I care about speed (seriously,
it takes _ages_ to compress and then immediately decompress Firefox). The
problem isn't that I want to optimize for one or the other; it's that AUR
helpers generally have a different need than I do when building packages
myself, but for some reason they don't override the compression setting just
for their own installs. Probably due to caching, like I said, which means they
can't assume everyone wants compression off all the time, but I'm not sure;
that's just a guess.

------
cmurf
Fedora 31 switched RPM to use zstd.
[https://fedoraproject.org/wiki/Changes/Switch_RPMs_to_zstd_c...](https://fedoraproject.org/wiki/Changes/Switch_RPMs_to_zstd_compression)

Package installations are quite a bit faster, and while I don't have any
numbers I expect that the ISO image compose times are faster, since it
performs an installation from RPM to create each of the images.

Hopefully in the near future the squashfs image on those ISOs will use zstd,
not only for the client-side speed boost at boot and install, but also because
it cuts the decompression CPU hit by a lot (more than 50%) compared to lzma.
[https://pagure.io/releng/issue/8581](https://pagure.io/releng/issue/8581)
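
For reference, squashfs-tools can already produce such images; a hedged one-liner (paths made up, and the zstd compressor requires a squashfs-tools build with zstd support):

      mksquashfs rootfs/ root.squashfs -comp zstd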

------
kbumsik
> Recompressing all packages to zstd with our options yields a total ~0.8%
> increase in package size on all of our packages combined, but the
> decompression time for all packages saw a ~1300% speedup.

Impressive. As an AUR package maintainer, though, I'm also wondering how the
compression speed compares.
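
zstd's built-in benchmark mode is a quick way to answer that for your own packages; a sketch, with some-package.tar standing in for a real package payload:

      # benchmark compression levels 1 through 19 on a sample tarball
      zstd -b1 -e19 some-package.tar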

~~~
ncmncm
Compression speed is many, many, many times faster than xz, and (only) _much_
faster than gzip. Really, only lz4 beats it.

~~~
integricho
After reading these comments, I can't help but wonder: what is the benefit of
Zstd over lz4? Why didn't they switch to lz4 if speed was what they favored,
even at marginally worse compression ratios?

~~~
isatty
Guessing that 0.8x size increase for 1300% speedup was worth the tradeoff but
maybe ≥1.5 size increase or more was not (especially considering a
1300%->2000% increase is not going to be user visible for 99% of the
packages).

~~~
TJSomething
It's not 0.8 times size increase, it's a 0.008 times size increase, since the
unit is percent. The latter seems pretty marginal to me.

------
JeremyNT
I learned about this one the hard way when I went to update a really crufty (~
1 year since last update) Arch system I use infrequently the other day. I had
failed to update my libarchive version prior to the change and the package
manager could not process the new format.

Luckily updating libarchive manually with an intermediate version resolved my
issue and everything proceeded fine.

This is a good change, but it's a reminder to pay attention to the Arch Linux
news feed, because every now and then something important will change. The
maintainers provided ample warning about this change there (and indeed I had
updated my other systems in response), so we procrastinators really had no
excuse :)

------
golergka
I used zstd for on-the-fly compression of game data for p2p multiplayer
synchronization, and got 2-5x as much data (depending on the payload type)
into each TCP packet. Sad that it still doesn't see much adoption in the
industry.

~~~
ncmncm
Zstd knows how to use a user-supplied dictionary at each end. I hope you are
doing that.

But if latency matters you might be better off with lz4.
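
A hedged sketch of that dictionary workflow using the zstd CLI (sample file names invented; the C API offers the same thing via ZSTD_compress_usingDict and friends):

      # train a shared dictionary on representative payload samples
      zstd --train samples/*.bin -o payload.dict --maxdict=16384
      # both ends then compress/decompress with the same dictionary
      zstd -D payload.dict -c message.bin > message.zst
      zstd -D payload.dict -d message.zst -o message.out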

~~~
golergka
Yes, I did. Too bad I didn't get to set up CI in that project, and the current
maintainers have probably forgotten to update the dictionary.

------
loeg
I'd love to see Zstandard accepted in other places where the current option is
only the venerable zlib. E.g., git packing, ssh -C. It's got more breadth and
is better (ratio / cpu) than zlib at every point in the curve where zlib even
participates.

~~~
jacobolus
It would be great to see better compression supported by browsers.

~~~
dictum
Brotli has wide browser support
([https://caniuse.com/#feat=brotli](https://caniuse.com/#feat=brotli)) and
comes closer to zstd in compression ratio and compression speed, but its
decompression speed is significantly lower and closer to zlib.

[https://github.com/facebook/zstd#benchmarks](https://github.com/facebook/zstd#benchmarks)

AFAIK (I haven't looked much into it since 2018) it's not widely supported by
CDNs, but at least Cloudflare seems to serve it by default (EDIT: must be
enabled per-site [https://support.cloudflare.com/hc/en-us/articles/200168396-W...](https://support.cloudflare.com/hc/en-us/articles/200168396-What-will-Cloudflare-compress-))

~~~
imtringued
That's interesting. Brotli has wide browser support although it's less than 5
years old, while WebP is approaching a decade and Safari still doesn't support
it...

~~~
JyrkiAlakuijala
WebP has an excellent lossless image compressor (like PNG, just 25-40% more
dense), but the lossy format has weaknesses that people focused on, which
slowed down adoption. The initial lossy encoder had weaknesses in quality --
it had bugs and was a port of a video coder. Nowadays the quality is much
better, but the format forces YUV420 coding (it does not allow YUV444), which
limits the quality of colors and fine textures.

------
rwmj
I wish zstd supported seeking and partial decompression
([https://github.com/facebook/zstd/issues/395#issuecomment-535...](https://github.com/facebook/zstd/issues/395#issuecomment-535875379)).
We could then use it for hosting disk images as it would be a lot faster than
xz which we currently use.

~~~
ncmncm
Fun fact: two zstd files appended together form a valid zstd file.

Also, parallel zstd must have some way to split up the work that you could
maybe use too.
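
A hedged illustration of the append property with the CLI:

      printf 'part1\n' | zstd -qc > a.zst
      printf 'part2\n' | zstd -qc > b.zst
      cat a.zst b.zst | zstd -dqc   # prints part1 then part2 from the combined stream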

~~~
rwmj
I would suggest reading the github issue that I linked to; you'll see why it's
not currently possible.

------
gravitas
AUR users -- the default settings in /etc/makepkg.conf (delivered by the
pacman package as of 5.2.1-1) are still at xz; you must manually edit your
local config:

      PKGEXT='.pkg.tar.zst'

The largest package I always wait on, and a perfect fit for this scenario, is
`google-cloud-sdk` (the re-compression is a killer -- `zoom` is another one in
AUR that's a beast), so I used it as a test on my laptop here in "real world
conditions" (browsers running, music playing, etc.). It's an old Dell m4600
(i7-2760QM, rotating disk), nothing special. The upshot: with the default xz,
compression takes twice as long and _appears_ to drive the CPU harder. With xz
my fans always kick in for a bit (normal behaviour); testing zst here did not
kick the fans on the same way.

After warming up all my caches with a few pre-builds to try and keep it fair
by reducing disk I/O, here's a sampling of the results:

      xz defaults  - Size: 33649964
      real  2m23.016s
      user  1m49.340s
      sys   0m35.132s
      ----
      zst defaults - Size: 47521947
      real  1m5.904s
      user  0m30.971s
      sys   0m34.021s
      ----
      zst mpthread - Size: 47521114
      real  1m3.943s
      user  0m30.905s
      sys   0m33.355s

I can re-run them and get a pretty consistent return (so that's good, we're
"fair" to a degree); there's disk activity building this package (seds, etc.)
so it's not pure compression only. It's a scenario I live every time this AUR
package (google-cloud-sdk) is refreshed and we get to upgrade. Trying to stick
with real world, not synthetic benchmarks. :)

I did not notice any appreciable difference when adding `--threads=0` to
`COMPRESSZST=` (from the Arch wiki); both consistently gave me right around
what you see above. This was compression-only testing, which is where my wait
time goes when upgrading these packages; a huge improvement with zst seen
here...

~~~
Foxboron
It should be noted that the makepkg.conf file distributed with pacman does not
contain the same compression settings as the one used to build official
packages.

pacman:

        COMPRESSZST=(zstd -c -z -q -)

[https://git.archlinux.org/svntogit/packages.git/tree/trunk/m...](https://git.archlinux.org/svntogit/packages.git/tree/trunk/makepkg.conf?h=packages/pacman#n133)

devtools:

        COMPRESSZST=(zstd -c -T0 --ultra -20 -)

[https://github.com/archlinux/devtools/blob/master/makepkg-x8...](https://github.com/archlinux/devtools/blob/master/makepkg-x86_64.conf#L135)

~~~
gravitas
The man page for zstd mentions that using the --ultra flag will cause
decompression to take more RAM as well when used to compress. Does this
indicate a _huge_ increase in memory to decompress, or just a trivial amount
per package, say something large like... `libreoffice-fresh`? Or `go`? They're
two of the largest main repo packages I have installed... (followed by linux-
firmware)

~~~
telendram
Without `--ultra`, the decompression memory budget is capped at 8 MB. At
`--ultra -20`, it's increased to 32 MB.

That's still less than XZ, which reaches 64 MB.

~~~
JyrkiAlakuijala
The respective flag for brotli would be `--large_window 25 --quality 11`

Brotli defines memory use as log2 on command line, i.e., 32 MB = 1 << 25

zstd uses a lookup table where the number given by the user is mapped to a
decoding-time memory use. The user just needs to look it up if they want to
control decoder memory use.

If one benchmarks zstd with `20` and brotli with `20`, zstd may be using 32 MB
of decoding memory, whereas one is specifying 1 MB for brotli. By default,
zstd tends to use 8 MB for decoding (though this varies with the encoding
effort setting) and brotli 4 MB (which does not change with the encoding
effort setting).
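
To make the unit mismatch concrete, here is a hedged pair of invocations that both ask for roughly a 2^25-byte (32 MB) window; flag spellings as I recall them from the respective CLIs, with `input` as a placeholder file:

      zstd --long=25 -19 -c input > out.zst                     # zstd window log: 1 << 25 bytes
      brotli --large_window 25 --quality 11 -c input > out.br   # brotli: same 2^25-byte window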

------
maxpert
I've used LZ4 and Snappy in production for compressing cache/MQ payloads, on a
service serving billions of clicks a day. So far I'm very happy with the
results. I know zstd requires more CPU than LZ4 or Snappy on average, but has
anyone used it under heavy traffic loads on web services? I am really
interested in trying it out but at the same time held back by "don't fix it if
it ain't broken".

~~~
loeg
Zstd has "fast" negative levels (-5, -4, ... -1, 1, ..., 22). -4 or -5 are
purportedly comparable to (but not quite as good as) LZ4.
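
On the command line the negative levels are spelled --fast; a quick hedged example with a made-up input file:

      zstd --fast=5 -c data.bin > data.zst   # roughly "level -5": LZ4-ish speed, weaker ratio
      zstd -1 -c data.bin > data.zst         # lowest regular level, for comparison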

~~~
ncmncm
Better to just use Lz4, then.

~~~
emn13
Maybe. The thing is: zstd is quite close, and unlike lz4, zstd offers a broad
curve of supported speed/ratio tradeoffs. Unless you're huge and engineering
effort is essentially free, or the micro-optimization for one specific ratio
is worth the tradeoff, you may be better off choosing the solution that's less
opinionated about the settings. If it then turns out that you care mostly
about decompression speed plus compression ratio and a little less about
compression speed, it's trivial to go there. Or maybe it turns out you only
sometimes need the speed, but can usually afford to spend a little more CPU
time -- so you default to higher compression ratios, but under load use lower
ones (there's even a streaming mode built in that does this for you on large
streams). Or maybe your dataset is friendly to the parallelization options,
and zstd actually outperforms lz4.

If you know your use case well and are sure the situation won't change (or
don't mind swapping compression algorithms when it does), then lz4 still has a
solid niche, especially where _compression_ speed matters more than
decompression speed. But in many if not most cases I'd say it's probably a
kind of premature optimization at this point, even if you think you're close
to lz4's sweet spot.

------
G4E
For those who want a TL;DR: the trade-off is a 0.8% increase in package size
for a ~1300% increase in decompression speed. Those numbers come from a sample
of 542 packages.

~~~
agumonkey
thanks, a great change for those with SSDs

~~~
crazysim
or CPUs!

~~~
fctorial
Or RAM

------
Phlogi
The wiki is already up to date if you build your own or AUR packages and want
to use multiple CPU cores:
[https://wiki.archlinux.org/index.php/Makepkg#Utilizing_multi...](https://wiki.archlinux.org/index.php/Makepkg#Utilizing_multiple_cores_on_compression)
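
In short, the wiki's suggestion boils down to a makepkg.conf line along these lines (check the wiki for the current recommendation):

      COMPRESSZST=(zstd -c -z -q --threads=0 -)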

------
yjftsjthsd-h
> If you nevertheless haven't updated libarchive since 2018, all hope is not
> lost! Binary builds of pacman-static are available from Eli Schwartz'
> personal repository, signed with their Trusted User keys, with which you can
> perform the update.

I am a little shocked that they bothered; Arch is rolling release and
explicitly does not support partial upgrades
([https://wiki.archlinux.org/index.php/System_maintenance#Part...](https://wiki.archlinux.org/index.php/System_maintenance#Partial_upgrades_are_unsupported)).
So to hit this means that you didn't update a rather important library for
over a year, which officially implies that you didn't update _at all_ for over
a year, which... is unlikely to be sensible.

~~~
jpgvm
Arch is actually surprisingly stable and even with infrequent updates on the
order of months still upgrades cleanly most of the time. The caveats to this
were the great period of instability when switching to systemd, changing the
/usr/lib layout, etc but those changes are now pretty far in the past.

~~~
yjftsjthsd-h
Sure, and I've done partial upgrades and it was mostly fine:) It just
surprised me to see the devs going out of their way to support it on volunteer
time. On the other hand, maybe that's exactly the reason; maybe someone said
"hey look, I can make static packages that are immune to library changes! I
guess I'll publish these in case they're useful". Open source is fun like
that:)

~~~
semi-extrinsic
Also, Arch devs probably run Arch servers, and I'd not be surprised if some of
those have uptimes in hundreds of days.

~~~
Foxboron
All Arch infra runs on Arch Linux. The infrastructure repository is all open.

[https://git.archlinux.org/infrastructure.git/](https://git.archlinux.org/infrastructure.git/)

Now, official infra doesn't reach hundreds of days. But personal systems might
:)

------
shmerl
Was xz used in a parallelized fashion? Otherwise the comparison is kind of
pointless; single-threaded xz decompression is way too slow.

~~~
esotericn
Multithreaded xz is non-deterministic and so it's not a candidate.

~~~
ComputerGuru
We are talking about decompression speed, not compression. Decompression is
necessarily deterministic.

~~~
shmerl
Maybe the point is that the compressed package can change every time, which is
an issue for the reproducible-builds idea many distros are now pursuing.
Though I'm not sure why parallelized xz can't behave in a predictable fashion.

~~~
ComputerGuru
No, I mean you don't need to compress in parallel. Compression speed doesn't
matter here, and the output is compatible with single- or multi-threaded
decompression.

~~~
shmerl
Compression speed can matter in general (to improve build times).

For xz, you need to compress with chunking (and maybe indexing for more
benefit) in order to allow parallel decompression in the first place.
Otherwise xz produces a blob that you can't split into independent parts
during decompression, which makes using many decompression threads pointless.

But yes, if parallel compression creates non-determinism, you can do all the
compression work with chunking but without parallelism, still allowing
parallel decompression. I'm just not sure why it even has to create
non-determinism in the first place.
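
A hedged sketch of that chunked approach with the xz CLI (block size picked arbitrarily; threaded decompression also needs a recent enough xz):

      # deterministic single-threaded compression, but split into independent blocks
      xz -9 -T1 --block-size=16MiB -c payload.tar > payload.tar.xz
      # a decompressor that understands multi-block streams can then parallelize
      xz -d -T0 -c payload.tar.xz > payload.tar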

------
zerogara
Most of the published results show very little difference, positive or
negative, in decompression speed; where is all this 1300% coming from?

edit: Sorry, my fault, it was decompression RAM I was thinking about, not
speed. Though I was also influenced by my own test, where, without measuring,
both xz and zstd seemed instant.

------
dhsysusbsjsi
Quick shout out to LZFSE. Similar compression ratio to zlib but much faster.

[https://github.com/lzfse/lzfse](https://github.com/lzfse/lzfse)

------
nwah1
I wonder if they will switch to using zstd for mkinititcpio

~~~
Squithrilve
mkinitcpio is being replaced with dracut, so zstd probably won't happen.

~~~
nwah1
Man page says zstd is an option on dracut

[http://man7.org/linux/man-pages/man5/dracut.conf.5.html](http://man7.org/linux/man-pages/man5/dracut.conf.5.html)

~~~
Foxboron
The kernel doesn't support booting a zstd-compressed initramfs yet, but you
can very well use zstd compression with dracut and mkinitcpio.
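
For reference, both tools take a one-line config change; option names per their respective man pages, and mkinitcpio's zstd support may depend on the version you're running:

      # /etc/dracut.conf (or a drop-in under /etc/dracut.conf.d/)
      compress="zstd"
      # /etc/mkinitcpio.conf
      COMPRESSION="zstd"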

------
imtringued
This blog post probably wasted more of my time than I will ever gain from the
faster decompression...

------
vmchale
What of lzip?

------
Annatar
I couldn't care less about decompression speed, because the bottleneck is the
network, which means I want my packages as small as possible. Smaller packages
mean faster installation; at xz's decompression rate of 54 MB/s or faster, I
couldn't care less about a few milliseconds saved during decompression. For
me, this decision is dumbass stupid.

~~~
ubercow13
Why do you care so much about the few extra milliseconds wasted downloading,
then? (A 0.8% size increase is ~0.) Also don't forget that Arch can be used on
machines with a very slow CPU but a very fast network connection, like many
VPSs; I think this will make a tangible difference on mine. It's also a big
improvement for package maintainers and anyone building their own packages
without bothering to modify the makepkg defaults, e.g. most people using an
AUR helper.

~~~
Annatar
Because size does matter.

