
Apple Open-Sources its Compression Algorithm LZFSE - laktak
https://www.infoq.com/news/2016/07/apple-lzfse-lossless-opensource
======
svckr
With energy efficiency as a primary goal I was expecting way more use of
explicit SIMD instructions.

The InfoQ post mentions xcodebuild, but there is also a Makefile. I really
appreciate the presence of a no-nonsense Makefile. No autoconf, no pkgconfig,
just plain and simple make. Also, because nobody mentioned it: yes, it
compiles on Linux out of the box.

~~~
vog
_> I really appreciate the presence of a no-nonsense Makefile._

Indeed, the current version of their Makefile is a great example of how to
write a simple yet portable Makefile:

[https://github.com/lzfse/lzfse/blob/33629bc65f4b356072c9a7d5...](https://github.com/lzfse/lzfse/blob/33629bc65f4b356072c9a7d5b6321d3b360ed5ec/Makefile)

 _> No autoconf, no pkgconfig, just plain and simple make._

While I agree with your sentiment, I believe that your statement about pkg-
config goes a bit over the top.

Yes, the LZFSE project doesn't use pkg-config, but it also doesn't have any
library dependencies. There's not a single "-l" argument in the linker flags.

If it had, I would prefer pkg-config over any other mechanism, as that is
right now the best "simple yet portable" method of defining library
dependencies.

Pkg-config is especially handy when it comes to cross-compiling, or when you
have a special need for a static build instead of shared libraries.
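To make that concrete: the whole mechanism is just a small text file that pkg-config queries. A minimal sketch (the `foo` library, its `.pc` contents, and all paths are invented for the demo; only the pkg-config invocations themselves are real):

```shell
# Drop a minimal .pc file into a temporary search path.
mkdir -p /tmp/pcdemo
cat > /tmp/pcdemo/foo.pc <<'EOF'
prefix=/opt/foo
libdir=${prefix}/lib
includedir=${prefix}/include

Name: foo
Description: demo library
Version: 1.0
Cflags: -I${includedir}
Libs: -L${libdir} -lfoo
EOF

# Consumers ask pkg-config for flags instead of hardcoding paths:
PKG_CONFIG_PATH=/tmp/pcdemo pkg-config --cflags foo   # -I/opt/foo/include
PKG_CONFIG_PATH=/tmp/pcdemo pkg-config --libs foo     # -L/opt/foo/lib -lfoo

# For static builds, --static also pulls in Libs.private; for
# cross-compiling, point PKG_CONFIG_LIBDIR at the target's .pc directory.
```

A Makefile then stays one line per dependency: `cc main.c $(pkg-config --cflags --libs foo)`.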

~~~
mschuster91
> Indeed, the current version of their Makefile is a great example of how to
> write a simple yet portable Makefile:

Yet it forgets the MOST important thing: make uninstall.

Nothing worse than software where I have to reverse engineer a makefile in
order to uninstall!

~~~
vog
I don't trust "make uninstall" anyway. Instead, I usually put self-compiled
packages into separate directories, such as:

    
    
        make install INSTALL_PREFIX=/opt/lzfse
    

or

    
    
        make install INSTALL_PREFIX=$HOME/.../lzfse
    

Uninstalling is then as simple as:

    
    
        rm -r /opt/lzfse
    

For convenience, I either add _/opt/lzfse/bin_ to _$PATH_ , or create a
symlink from _/opt/lzfse/bin/lzfse_ to _/usr/bin/lzfse_.

More generally, I believe that uninstalling, upgrading and related operations
are the task of a package manager, not a build script.
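The whole workflow can be sketched with a dummy script standing in for lzfse (every path below is a throwaway example, not the project's actual install layout):

```shell
# Simulate a per-package prefix with a disposable directory.
PREFIX="$(mktemp -d)/lzfse"
mkdir -p "$PREFIX/bin"
printf '#!/bin/sh\necho lzfse-demo\n' > "$PREFIX/bin/lzfse"
chmod +x "$PREFIX/bin/lzfse"

# Use it by extending PATH (a symlink into /usr/local/bin works the same way):
PATH="$PREFIX/bin:$PATH" lzfse    # prints: lzfse-demo

# Uninstalling really is just one directory removal:
rm -r "$PREFIX"
```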

~~~
mschuster91
Good luck with getting pkg-config to recognize your include files
automatically, for example. Or the good old "man" utility.

Alternative: maintain a HUUUUUGE list of *PATH env variables, and update them
every time you recompile something and change the directory in the process.
Oh, and hope that other people's Makefiles are intelligent enough not to mix
things up (e.g. use headers from the system, but library .so files from your
own build)...

~~~
vog
If you are alluding to cross-compiling, rest assured that I wrote my comment
being fully aware of the plenty of pitfalls, which is why I started the MXE
(mingw-cross-env) project some time ago:

[http://mxe.cc/](http://mxe.cc/)

~~~
mschuster91
Cross-compiling is yet another dungheap on top of the one I already mentioned.

I usually set up a Debian chroot with qemu and compile "natively" (e.g. for
RPi). It's dog slow, yes, but at least it works reliably in contrast to cross
compilation.

The only way I ever got cross-compiling to work is with the buildroot
toolchain, which has the downside that it isn't Debian.

------
espadrine
I feel like LZFSE is too little, too late. It would be great to have a proper
comparison, but Zstd is stable, and offers a superior compression ratio with
compression and decompression speeds that seem on par with LZFSE.

And Zstd is not proprietary. (This issue is relevant in this regard:
[https://github.com/lzfse/lzfse/issues/21](https://github.com/lzfse/lzfse/issues/21))

[https://github.com/Cyan4973/zstd](https://github.com/Cyan4973/zstd)

Edit: here is a quick comparison I did on Linux with Project Gutenberg's
webster
([http://sun.aei.polsl.pl/~sdeor/corpus/webster.bz2](http://sun.aei.polsl.pl/~sdeor/corpus/webster.bz2)).

    
    
      $ time ./lzfse-master/build/bin/lzfse -encode -i webster -o webster.lzfse
      real    0m1.885s
      user    0m1.860s
      sys     0m0.024s
    
      $ time ./zstd-master/programs/zstd webster -8 -f -o webster.zstd
      webster              : 25.98%   (41458703 =>10772836 bytes, webster.zstd)      
      real    0m1.700s
      user    0m1.660s
      sys     0m0.036s
    
      $ ls -l
      -rw-r--r-- 1 tyl tyl 12209496 Jul  7 16:26 webster.lzfse
      -rw-rw-r-- 1 tyl tyl 10772836 Jul  7 16:31 webster.zstd
    
      $ time ./lzfse-master/build/bin/lzfse -decode -i webster.lzfse -o /dev/null
      real    0m0.127s
      user    0m0.112s
      sys     0m0.012s
    
      $ time ./zstd-master/programs/zstd -d webster.zstd -o /dev/null
      webster.zstd        : 41458703 bytes                                           
      real    0m0.116s
      user    0m0.112s
      sys     0m0.000s
    

LZFSE's -h option doesn't show a flag to tweak compression. Zstd's default -1
compression is super-fast, but obviously not optimal. Its -8 is the closest I
got to LZFSE's compression speed; its -4 was the closest to LZFSE's
compression ratio, with a speed of 0m0.527s real compression, 0m0.101s real
decompression.
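The level/ratio tradeoff is easy to see with any multi-level codec; e.g. with gzip as a stand-in (zstd's -1/-4/-8 behave analogously, and the sample data here is generated, not the webster corpus):

```shell
# Generate a compressible sample (base64 of random bytes) and compare levels.
head -c 2000000 /dev/urandom | base64 > sample2.txt
for level in 1 6 9; do
  gzip -c -$level sample2.txt | wc -c
done
# output size shrinks (and compression slows down) as the level rises
```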

~~~
Someone
I’m not that comfortable with Xcode, but I just built two versions of this
tool, one direct from GitHub, one (called lzfse2) that calls the Compression
library on Mac OS X. Typical timings on my system are:

    
    
        > time ./lzfse -encode -i webster -o webster.lzfse
    
        real    0m0.945s
        user    0m0.864s
        sys     0m0.062s
    
        > time ./lzfse2 -encode -i webster -o webster.lzfse2
    
        real    0m0.803s
        user    0m0.715s
        sys     0m0.072s
    
    
        > time ./lzfse -decode -i webster.lzfse -o /dev/null
    
        real    0m0.133s
        user    0m0.091s
        sys     0m0.036s
    
        > time ./lzfse2 -decode -i webster.lzfse2 -o /dev/null
    
        real    0m0.083s
        user    0m0.053s
        sys     0m0.025s
    

So, the version that ships on Mac OS X seems to be faster (10% at encoding,
35% at decoding) than what this source and makefile produce. I don’t think
that has to do with my way of building them, as I used the makefile (which
uses -Os) to build the original tool, and any compiler flags will not have
much effect on the one using the Mac OS X library.

Worryingly, the two versions also produce different files (12,209,496 bytes
for the GitHub code, 12,234,159 bytes for the library on Mac OS X 10.11.5),
but they can decompress each other's files and produce the original file.

~~~
espadrine
> _Worryingly, the two versions also produce different files_

That is actually pretty common. I would guess the version you have in macOS is
an older version. They may have tweaked the algorithm to deliver slightly
better compression at the expense of speed, which is always the tradeoff in
this field.

On the other hand, I would not be surprised if they prepared special tweaks in
their internal version to better support arm64. Strategically, Apple seems to
believe that people will stick with building for their App Store if they are
pushed to write non-cross-platform code.

(Oddly enough, LZFSE/LZVN seems well-suited for file system compression on
hard drives, but here again Yann Collet wins, with LZ4's superior compression
speeds, which directly impact write speeds.)

------
nodesocket
If you want to see some crazy C code, check out this file from the GitHub
repo:
[https://github.com/lzfse/lzfse/blob/master/src/lzvn_encode_b...](https://github.com/lzfse/lzfse/blob/master/src/lzvn_encode_base.c)

~~~
drinchev
Excerpt from the link :

    
    
          if (D == D_prev) {
            if (L == 0) {
              *q++ = 0xF0 + (x + 3); // XM!
            } else {
              *q++ = (L << 6) + (x << 3) + 6; //  LLxxx110
            }
            *(uint32_t *)q = literal;
            q += L; // non-aligned access OK
          } else if (D < 2048 - 2 * 256) {
            // Short dist    D>>8 in 0..5
            *q++ = (D >> 8) + (L << 6) + (x << 3); // LLxxxDDD
            *q++ = D & 0xFF;
            *(uint32_t *)q = literal;
            q += L; // non-aligned access OK
          } else if (D >= (1 << 14) || M == 0 || (x + 3) + M > 34) {
            // Long dist
            *q++ = (L << 6) + (x << 3) + 7;
            *(uint16_t *)q = D;
            q += 2; // non-aligned access OK
            *(uint32_t *)q = literal;
            q += L; // non-aligned access OK
          } else {
            // Medium distance
            x += M;
            M = 0;
            *q++ = 0xA0 + (x >> 2) + (L << 3);
            *(uint16_t *)q = D << 2 | (x & 3);
            q += 2; // non-aligned access OK
            *(uint32_t *)q = literal;
            q += L; // non-aligned access OK
          }

~~~
willvarfar
I've never seen this code base before, but it makes straightforward sense to
me, as I keep my hand in with compression software, and I don't anticipate
that others who work with compression algorithms would have any trouble
either.

It looks just like the style of code in all the other fast LZ codebases. They
are all in this style.

The "non-aligned access OK" comment litter is presumably there to silence an
LLVM sanitizer; dereferencing a cast pointer at an unaligned address is
technically undefined behavior in C, even where the hardware handles it fine.

~~~
nabla9
When implementing a complex algorithm, this kind of code is usually the
easiest to understand.

When you read the code, you use the paper that describes the algorithm as
documentation. Using the same short one-letter variable names in the code as
in the paper makes understanding much easier.

The thing I hate most is when the paper uses 1-based numbering and the
programming language uses 0-based numbering. We should settle on 0-based
numbering when describing algorithms.

~~~
eru
Arithmetic coding often looks just as dense, because C is not a good vehicle
to describe algorithms.

Look at eg
[https://www.cs.ox.ac.uk/jeremy.gibbons/publications/arith.pd...](https://www.cs.ox.ac.uk/jeremy.gibbons/publications/arith.pdf)
to see a cleaner alternative.

(This is about describing algorithms in papers. Optimizing for performance
after the big-O has been taken care of is a different matter.)

------
dchest
[https://news.ycombinator.com/item?id=11944975](https://news.ycombinator.com/item?id=11944975)

~~~
legulere
And even earlier:
[https://news.ycombinator.com/item?id=11922453#11925386](https://news.ycombinator.com/item?id=11922453#11925386)

------
DeepYogurt
Here's the github link
[https://github.com/lzfse/lzfse](https://github.com/lzfse/lzfse)

------
microcolonel
It's nice to see that we will be able to at least decode these archives.
Though I think for new software people are better off using zstd if they're
looking for this set of performance characteristics.

------
nodesocket
It would be interesting to see a benchmark against .zip and .gz in terms of
size reduction and decompression time.

------
shmerl
_> LZFSE is only present in iOS and OS X, so it can’t be used when the
compressed payload has to be shared to other platforms (Linux, Windows)._

So now it will be cross platform?

~~~
hollander
You can bet there will be a Linux version soon, and if it's good enough, it
will end up in the default repos. Windows is a different story, but maybe,
with the iPhone and iPad and the many MS employees using them, it will be
supported somewhere in the not-so-near future. Anyway, 7-Zip and WinZip will
probably support it soon enough.

~~~
coldcode
It compiles on Linux as is. There is a generic Makefile.

------
skreuzer
If anyone on FreeBSD wants to try this I just added a port under
archivers/lzfse

[http://www.freshports.org/archivers/lzfse/](http://www.freshports.org/archivers/lzfse/)

------
Negative1
I'm not fully up on the latest and greatest in compression technologies, but
my go-to format these days is usually 7zip, which I believe is just a
container that uses LZMA. For whatever reason *nix people seem to hate it,
even though I get much better compression with it than with tarballs or
zlib/zip. Is there a similar container format that will or does use LZFSE?
And how much better is it than 7zip/LZMA?

~~~
wmf
The Unixy version of LZMA is called XZ. Unix people prefer tar over zip,
probably for historical/tribal reasons.

There are really three different compression "markets" (strong, moderate, and
light); LZMA/XZ serves the strong end while LZFSE serves the moderate and
light ends, so they don't really compete.

~~~
macintux
> Unix people prefer tar over zip, probably for historical/tribal reasons.

To this day I refuse to use the z argument to tar. Packaging and compressing
are two different operations, and this is why the UNIX gods created pipes.

And also I'm a stubborn old fart.
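For instance, packaging and compression stay two independent, swappable stages joined by a pipe (filenames here are invented for the demo):

```shell
# Stage 1 (packaging) piped into stage 2 (compression):
mkdir -p demo && echo hello > demo/file.txt
tar -cf - demo | gzip -9 > demo.tar.gz

# The reverse, also two piped stages:
gzip -dc demo.tar.gz | tar -xf -

# Swapping the codec is just swapping the filter (xz, zstd, lz4, ...),
# with no support from tar itself required.
```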

------
tracker1
Just a side comment: it would be _really_ nice if we could get the browser
vendors to support a newer compression algorithm beyond gzip and deflate. I
know a couple of others have been implemented, but nothing that has stuck
across multiple browsers. We really need MS, Google, Apple and Mozilla to come
together on this. It should be patent-free.

~~~
j_s
I think Brotli is getting there, with support in Firefox and Chrome so far.

[https://samsaffron.com/archive/2016/06/15/the-current-state-...](https://samsaffron.com/archive/2016/06/15/the-current-state-of-brotli-compression)

~~~
tracker1
Thanks for this... the last I checked was a few months ago, as I couldn't
believe we were still limited to deflate/gzip... I'm somewhat surprised lzma
didn't get broader support earlier on, though.

Mental note: set up a new dokku box and try getting this working...

------
technion
For anyone wondering, I've tried throwing afl-fuzz at this. It's only been a
few hours, but as yet, nothing has turned up.

------
0x54MUR41
Sorry, I think this comment is off topic. I don't know why Apple put this
separate from their other open source projects at
[https://github.com/apple](https://github.com/apple).

------
ausjke
A quick test result (compressing a 1.5 GB file):

    
    
        lzfse:
        real	1m44.481s
        user	1m17.956s
        sys	0m2.852s
    
        lz4:
        real	0m28.136s
        user	0m1.200s
        sys	0m2.240s
    

lz4 is much faster somehow. The final sizes are very close.

~~~
klodolph
Testing compression speed isn't terribly useful for most applications. The
only applications I can think of where it's relevant are data backup and
archiving.

There are two typical speed benchmarks you want to do. For the "compress once,
decompress many times" situation, benchmark the time it takes to _decompress_
and ignore compression time. For the "compress once, decompress once"
situation, add the compression and decompression times.

The first situation is common for distributing packages and static assets, the
second situation is common for distributing dynamic assets.
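Using gzip as a stand-in codec (any of lzfse/zstd/lz4 would slot into the same pattern), the two measurements look like this:

```shell
# Make a compressible 1 MB sample file.
head -c 1000000 /dev/zero | tr '\0' 'a' > sample.txt

# "Compress once, decompress many times": the compression cost is paid
# up front, so it is excluded from the benchmark.
gzip -c -9 sample.txt > sample.txt.gz      # one-time cost, not measured
gzip -dc sample.txt.gz > /dev/null         # wrap in time(1), repeat runs

# "Compress once, decompress once": measure the whole round trip.
gzip -c -9 sample.txt | gzip -dc > /dev/null   # wrap the whole pipe in time(1)
```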

~~~
eye2sky
For the second situation, I prefer the compress/transmit/decompress metric (as
a function of transmit pipe speed), as described here:

[http://fastcompression.blogspot.com/p/compression-benchmark....](http://fastcompression.blogspot.com/p/compression-benchmark.html)
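That metric is easy to compute by hand. A sketch, plugging in the zstd numbers from the webster comparison upthread; the 100 Mbit/s pipe speed is the one invented parameter:

```shell
# total = compress time + compressed_size/bandwidth + decompress time
awk 'BEGIN {
  bytes  = 10772836    # webster at zstd -8 (from the thread)
  t_comp = 1.700       # compression seconds (from the thread)
  t_dec  = 0.116       # decompression seconds (from the thread)
  bps    = 100e6       # assumed 100 Mbit/s pipe
  t_xfer = bytes * 8 / bps
  printf "transfer: %.3fs  total: %.3fs\n", t_xfer, t_comp + t_xfer + t_dec
}'
# prints: transfer: 0.862s  total: 2.678s
```

On a slow pipe the ratio term dominates; on a fast LAN the codec speed does, which is exactly the tradeoff the linked benchmark page visualizes.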

------
finchisko
Waiting for the first person to take this implementation and run it through
emscripten so we can actually use it in client -> server communication, e.g.
sending compressed JSON payloads to the server.

~~~
panic
Is running (not to mention downloading/parsing/compiling) emscriptened
decompression code really that much faster than just transferring the data?

I think a better strategy would be to add support for LZFSE to the browsers
themselves.

~~~
finchisko
Native support in browsers would be appreciated. But I was also thinking about
server-side decompression in Node.js via a native or emscripten-transpiled
module.

~~~
ascagnel_
A quick reading of the license[0] shows that there are no legal stumbling
blocks to the code showing up in browsers, but the license doesn't mention
anything about patents.

[0]
[https://github.com/lzfse/lzfse/blob/master/LICENSE](https://github.com/lzfse/lzfse/blob/master/LICENSE)

------
kazinator
Open source; but is it patent-free?

------
kirkdouglas
Huh, I hope macOS code looks better.

------
panic
The article is essentially a link to
[https://www.infoq.com/news/2016/07/apple-lzfse-lossless-open...](https://www.infoq.com/news/2016/07/apple-lzfse-lossless-opensource)
with a bunch of ads on top. Maybe someone could update it to point there
instead?

~~~
dang
Ok, we changed to that from
[http://www.appleworld.today/blog/2016/7/6/apples-lzfse-compr...](http://www.appleworld.today/blog/2016/7/6/apples-lzfse-compression-algorithm-goes-open-source).

------
pilif
It's 2016. How can you launch a reasonably high-profile open source project
with code that looks like this? It ticks every box for unreadable code:
one-character variable names, one-character parameter names, full of magic
numbers...

Yes. This is very performance critical code and I completely see the need to
write very optimized code. That's fine. But optimizing code for speed
shouldn't imply also optimizing it to use as few characters as possible.

Compression code is code that often runs at boundaries to the external world
and thus is a very significant attack surface. To release compression code in
a non-safe language is risky enough but then using what amounts to practically
write-only code is, IMHO, irresponsible.

~~~
johncolanduoni
> To release compression code in a non-safe language is risky enough

At the moment, what's their real alternative? Rust is the only memory-safe
language I can think of that could hope to meet their performance
requirements, but even the Rust runtime would be a lot of overhead for this
application.

That said, I agree this isn't acceptable C code for something that runs on
untrusted data while using tons of pointer arithmetic.

~~~
pilif
You're right about C. C in general, I would find acceptable, because, yes,
there aren't that many good alternatives around for this kind of code.

But there's nothing stopping you from writing readable C code. That's where my
concerns come from.

~~~
msvalkon
I don't really understand where the downvotes come from. I find the
readability concerns legitimate, and I'd like to understand why
compression-algorithm developers feel this is OK. Is it just the math-heavy
background? I can't think of any real benefits to this style.

~~~
throwaway2048
If you don't understand the underlying mathematical algorithms it's using, no
amount of explicit variable names is going to help you. If you do, the concise
structure makes things straightforward. The code is not meant to be read alone
and understood; the papers published along with it need to be understood
first.

~~~
rimantas
Exactly. Not all code can be understandable to a layman with zero effort.

------
tlrobinson
Well, what's the Weissman score?

~~~
dietrichepp
The Weissman score is the most moronic compression "metric" ever devised. I
put "metric" in quotes because from a mathematical perspective, it is
practically gibberish. I'm tired of seeing it mentioned in every HN post on
compression.

~~~
legodt
I also see this kind of response a lot when the score is brought up. I don't
have a background in compression; what is the current method for measuring
compression efficiency? Is it a balance between size reduction and speed to
unpack, or is measuring compression efficacy far more nuanced?

~~~
dietrichepp
Usually you compare on a 2D graph, with one axis being decompression speed and
the other axis being compression ratio. Sometimes decompression speed is
replaced by decompression + compression speed, for round-trip data. Very
rarely you only consider compression speed, for "write-many read-seldom" data
(like backups).

Normally, any sensible 1D metric would be visualized as isolines in that 2D
plane. But the "Weissman score" doesn't even make that much sense, it's a
discontinuous, non-monotonic function with singularities right in the middle!
It doesn't even make sense from a dimensional analysis perspective… the score
will fluctuate wildly based on the _units_ you choose for time and size
(minutes? seconds? octets? bits?). There is no conceivable real-world
application of the Weissman score. It is just a bit of bad math that appeared
on TV once.

------
DiabloD3
So... basically a clone of LZ4?

~~~
awalton
Basically an Apple-specific reimplementation of Zstd:
[https://github.com/Cyan4973/zstd](https://github.com/Cyan4973/zstd)

~~~
DiabloD3
So, worse than LZ4 for what Apple seems to be using it for. Why didn't they
just use LZ4? Confusing company, they are.

~~~
shmerl
They are very much into NIH, seemingly out of paranoid fear of patent attacks
(though not sure how it can protect them).

~~~
dchest
This is baseless speculation. First of all, Apple provides LZ4 in
libcompression. Secondly, LZFSE uses a Lempel–Ziv algorithm and the ANS coder
invented by Jarek Duda
([https://arxiv.org/abs/1311.2540](https://arxiv.org/abs/1311.2540)):
[https://developer.apple.com/library/ios/documentation/Perfor...](https://developer.apple.com/library/ios/documentation/Performance/Reference/Compression/index.html)

~~~
twitch_checksum
Well, many people nowadays pretend that, as if they read Jarek's paper,
understood it, and came up with a genuine implementation of their own.

But Jarek's ANS paper first came out in 2007, and almost no one paid attention
to it, because it was plain inscrutable.

Many years later, it took an individual to create FSE
([https://github.com/Cyan4973/FiniteStateEntropy](https://github.com/Cyan4973/FiniteStateEntropy)),
to prove that it could be transformed into something actually useful and
competitive. Since then, the paper has been updated a few times, borrowing a
few points from FSE in order to become more readable. But it's still very hard
to read.

In contrast, FSE code can be copy/pasted.

And all of a sudden, lots of versions popped up all over the Internet. By pure
chance, they all look like derivatives of FSE or Fabian Giesen's rANS, but
they pay tribute to Jarek's ANS paper, because quite clearly it is the source
of their work, and the prior existence of an actual open source implementation
which works and looks pretty damn close to theirs was purely accidental.

This is not paying tribute where it's due.

~~~
tmd83
Very well said. It's a stark contrast to how Yann Collet acknowledges others'
inspiration and collaboration
([http://fastcompression.blogspot.fr/2013/12/finite-state-entr...](http://fastcompression.blogspot.fr/2013/12/finite-state-entropy-new-breed-of.html))

------
benmarten
I just quickly tested it; in terms of highest compression ratio it still does
not beat xz, e.g. `tar -cf - FILE | xz -c9e > FILE.tar.xz`
[https://blog.benmarten.me/2016/04/01/Compress-Files-With-Hig...](https://blog.benmarten.me/2016/04/01/Compress-Files-With-Highest-Compression/)

~~~
Someone
That's not surprising, given that they optimized for compression and
decompression speed and for energy usage.

Their goal seems to have been to compress at least as well as zlib while using
less energy and doing it faster (speed often correlates quite well with energy
use on modern CPUs, as it allows them to drop into low-energy states sooner).

~~~
a_imho
Could you give me some pointers on the actual numbers? My searches came back
with nothing. I'm especially interested how they benchmarked the energy
consumption.

~~~
Someone
[http://asciiwwdc.com/2015/sessions/712](http://asciiwwdc.com/2015/sessions/712)
is the best pointer I know, but it does not give details.

My guesses would be that they have a simulator that computes/estimates power
usage, and that they have CPU setups where they measure power usage directly.
I doubt they regularly do the "compress things till you run out of battery"
thing that that talk mentions. That takes too long, and cannot be used to
measure small changes in power usage.

~~~
a_imho
Do you mean by extrapolating from executed instructions? That would be very
architecture-dependent. It would be nice to read some independent studies and
experiments; after recent events I take any manufacturer's claims on
consumption with a pinch of salt.

------
grewil2
Kind of weak licence, what are you actually allowed to do with this code?
Change it? Distribute your changes?

[https://github.com/lzfse/lzfse/blob/master/LICENSE](https://github.com/lzfse/lzfse/blob/master/LICENSE)

~~~
dalbin
It's the BSD 3-Clause license:

[https://opensource.org/licenses/BSD-3-Clause](https://opensource.org/licenses/BSD-3-Clause)

