
I feel like LZFSE is too little, too late. It would be great to have a proper comparison, but Zstd is stable, and offers a superior compression ratio with compression and decompression speeds that seem on par with LZFSE.

And Zstd is not proprietary. (This issue is relevant in this regard: https://github.com/lzfse/lzfse/issues/21)

https://github.com/Cyan4973/zstd

Edit: here is a quick comparison I did on Linux with Project Gutenberg's webster (http://sun.aei.polsl.pl/~sdeor/corpus/webster.bz2).

  $ time ./lzfse-master/build/bin/lzfse -encode -i webster -o webster.lzfse
  real    0m1.885s
  user    0m1.860s
  sys     0m0.024s

  $ time ./zstd-master/programs/zstd webster -8 -f -o webster.zstd
  webster              : 25.98%   (41458703 =>10772836 bytes, webster.zstd)      
  real    0m1.700s
  user    0m1.660s
  sys     0m0.036s

  $ ls -l
  -rw-r--r-- 1 tyl tyl 12209496 Jul  7 16:26 webster.lzfse
  -rw-rw-r-- 1 tyl tyl 10772836 Jul  7 16:31 webster.zstd

  $ time ./lzfse-master/build/bin/lzfse -decode -i webster.lzfse -o /dev/null
  real    0m0.127s
  user    0m0.112s
  sys     0m0.012s

  $ time ./zstd-master/programs/zstd -d webster.zstd -o /dev/null
  webster.zstd        : 41458703 bytes                                           
  real    0m0.116s
  user    0m0.112s
  sys     0m0.000s
LZFSE's -h output doesn't show a flag to tweak compression. Zstd's default -1 level is super fast, but obviously not optimal for ratio. Its -8 is the closest I got to LZFSE's compression speed; its -4 was the closest to LZFSE's compression ratio, at 0m0.527s real for compression and 0m0.101s real for decompression.
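For what it's worth, the speed/ratio tradeoff across levels is easy to demonstrate with any level-parameterized codec. Here is a small Python sketch using the stdlib zlib as a stand-in for Zstd's levels (the data and level choices are illustrative assumptions, not a reproduction of the benchmark above):

```python
import time
import zlib

# Illustrative compressible input, repeated to get measurable sizes.
data = b"the quick brown fox jumps over the lazy dog " * 20000

for level in (1, 4, 9):  # fast, middle, slow: same tradeoff shape as zstd -1/-4/-19
    t0 = time.perf_counter()
    out = zlib.compress(data, level)
    dt = time.perf_counter() - t0
    print(f"level {level}: ratio {len(out) / len(data):.2%}, {dt * 1000:.1f} ms")
```

Higher levels spend more cycles searching for matches, so the ratio improves while throughput drops, which is exactly the knob the zstd CLI exposes and the open-sourced lzfse tool does not.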



Can you measure energy impact? Apple has specifically said that LZFSE is the right choice when you want lower energy impact, so doing benchmarks of LZFSE against other algorithms is kind of meaningless when you don't measure one of the key metrics. Of course, I'm not sure offhand how to measure energy impact, but Activity Monitor displays energy impact numbers so there must be some way to measure it.


Based on Apple's own presentation, it looks like energy efficiency is pretty much proportional to performance, so it's very likely that a faster algorithm (e.g. LZ4) would also be more efficient. In modern "brainiac" processors, 90% of the energy is spent on overhead and doesn't vary with instruction mix. Memory accesses consume both energy and time, so fewer is better.

(From the previous discussion: https://news.ycombinator.com/item?id=11944975 )


Surely LZ4 is not in fact more energy efficient; otherwise Apple would be pushing LZ4 as the algorithm to use on mobile devices instead of recommending LZFSE.

If you look at that previous discussion, I was asking the same questions there and got no real answer. It looks to me like nobody outside of Apple has actually tested the energy efficiency.


LZ4 is a format. The most you can say about a format's energy efficiency comes from looking at the average speeds of its implementations. For instance, zlib (a DEFLATE library) is more energy efficient than zopfli (another DEFLATE library), just by virtue of spending fewer CPU cycles to crunch the same data.

There is in fact a very high correlation between CPU cycles and energy use, since compression codecs don't sit idle and use roughly the same instruction mix. In fact, Yann Collet's Zstd uses the same principles as LZFSE; both grew out of Jarek Duda's ANS research (http://arxiv.org/abs/1311.2540).
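To make that shared lineage concrete, here is a toy rANS coder following Duda's construction. This is a sketch of the principle only: names are mine, and real implementations (FSE in Zstd, the FSE stage in LZFSE) add renormalization and precomputed tables, which this version sidesteps by letting the state grow as a Python big integer.

```python
def build_cum(freqs):
    # Cumulative frequency table; M is the total of all frequencies.
    cum, total = {}, 0
    for s, f in freqs.items():
        cum[s] = total
        total += f
    return cum, total

def rans_encode(symbols, freqs):
    cum, M = build_cum(freqs)
    x = 1  # initial state
    for s in reversed(symbols):  # encode in reverse so decoding reads forward
        f = freqs[s]
        x = (x // f) * M + (x % f) + cum[s]
    return x

def rans_decode(x, n, freqs):
    cum, M = build_cum(freqs)
    out = []
    for _ in range(n):
        r = x % M
        # Find the symbol whose cumulative range [cum, cum + f) contains r.
        for s, f in freqs.items():
            if cum[s] <= r < cum[s] + f:
                break
        out.append(s)
        x = f * (x // M) + r - cum[s]
    return out

freqs = {"a": 5, "b": 2, "r": 2, "c": 1, "d": 1}  # symbol counts for "abracadabra"
msg = list("abracadabra")
state = rans_encode(msg, freqs)
assert rans_decode(state, len(msg), freqs) == msg
```

Frequent symbols grow the state less per step, which is how ANS approaches the entropy limit while staying table-driven and fast; that is the common core under both FSE-based codecs.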

The reference LZ4 implementation is absolutely more energy efficient than LZFSE, and in fact Apple does push for its use by offering it in its Compression library. However, it tends not to compress as well as either LZFSE or Zstd. Over 4G or WiFi (or even broadband), the time lost transferring more data is not made up by faster decompression, resulting in much slower downloads than even zlib. LZ4 is still relevant at higher speeds, such as those of magnetic hard drives. (Beyond a certain speed, such as with SSDs, compression no longer makes I/O faster, but you might accept the slowdown given the drive space you gain.)
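The break-even point is simple arithmetic: total time is transfer time of the compressed payload plus decompression time of the output. The numbers below are made-up but plausible assumptions, purely to illustrate why a better ratio wins on a slow link:

```python
def total_seconds(original_mb, ratio, link_mbps, decomp_mb_s):
    """Transfer time of the compressed payload plus time to decompress it."""
    compressed_mb = original_mb * ratio
    transfer = compressed_mb * 8 / link_mbps  # link speed is in megabits/s
    decompress = original_mb / decomp_mb_s    # decoder output throughput, MB/s
    return transfer + decompress

# Hypothetical 100 MB payload over a 50 Mbit/s link:
t_lz4  = total_seconds(100, ratio=0.50, link_mbps=50, decomp_mb_s=2000)
t_zlib = total_seconds(100, ratio=0.35, link_mbps=50, decomp_mb_s=300)
print(f"lz4-like: {t_lz4:.2f}s, zlib-like: {t_zlib:.2f}s")
```

With these figures the lz4-like codec takes about 8.05 s against roughly 5.93 s for the zlib-like one: its decompression speed saves milliseconds while its weaker ratio costs whole seconds on the wire. Raise the link speed enough and the ordering flips, which is the hard-drive case mentioned above.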

There is a separate discussion to be had about the fact that the open-sourced LZFSE reference implementation is not the one Apple actually ships (which explains why they have touched it so little since releasing it): it does not even have ARM-specific code. Also, LZFSE does not claim to be patent-unencumbered. LZ4 and Zstd, by contrast, do have ARM-optimized code.

All in all, it is not a stretch to assume that Apple benefits from this FUD, which explains why there is no comparative benchmark anywhere to be found on their GitHub or in their documentation. It really looks like Zstd is better all around.


If Zstd is better, then how does Apple win by pushing LZFSE? It's not like Apple gets anything from having people use a compression algorithm they wrote. If Apple convinces third-party devs to use an algorithm that is worse all around, then everybody loses.


Zstd was a work in progress when Apple released LZFSE, even though it already seemed to be better than LZFSE. But Apple needed a finished product, and at release time LZFSE looked to be the state of the art (among released algorithms) on a combined metric of compression ratio and performance, a fact they promoted heavily. Apple is unlikely to say "we think a work-in-progress compression scheme will beat ours in a few months", or to say six months later "please don't use our state-of-the-art compression format anymore, it's been superseded", even if it's true, especially since LZFSE is much better than zlib, the format people have used for so long.

I think once Zstd gets a bit more mainstream the best they would do is add it as an option.


Zstd and LZ4 fit different spaces. Zstd is more symmetrical in compression/decompression performance. The reference LZ4 implementation offers a fast compressor and a slower one with a better compression ratio; both are slower to compress than Zstd, but both are also much faster to decompress.


I’m not that comfortable with Xcode, but I just built two versions of this tool: one straight from GitHub, and one (called lzfse2) that calls the Compression library on Mac OS X. Typical timings on my system are:

    > time ./lzfse -encode -i webster -o webster.lzfse

    real    0m0.945s
    user    0m0.864s
    sys     0m0.062s

    > time ./lzfse2 -encode -i webster -o webster.lzfse2

    real    0m0.803s
    user    0m0.715s
    sys     0m0.072s


    > time ./lzfse -decode -i webster.lzfse -o /dev/null

    real    0m0.133s
    user    0m0.091s
    sys     0m0.036s

    > time ./lzfse2 -decode -i webster.lzfse2 -o /dev/null

    real    0m0.083s
    user    0m0.053s
    sys     0m0.025s
So the version that ships with Mac OS X seems to be faster (10% at encoding, 35% at decoding) than what this source and makefile produce. I don’t think that is down to how I built them: I used the makefile (which uses -Os) to build the original tool, and compiler flags will have little effect on the version that just calls the Mac OS X library.

Worryingly, the two versions also produce different files (12,209,496 bytes for the GitHub code, 12,234,159 bytes for the library on Mac OS X 10.11.5), but each can decompress the other’s files and reproduce the original.
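That behavior is normal for LZ-family codecs: the format pins down the decoder, not the encoder, so two conforming encoders can make different match choices and emit different bytes that decode identically. The stdlib zlib shows the same thing across its levels:

```python
import zlib

data = b"the format specifies the decoder, not the encoder " * 500

fast = zlib.compress(data, 1)  # fast encoder: greedy, short match search
best = zlib.compress(data, 9)  # slow encoder: exhaustive match search

assert fast != best                   # different encoder choices, different bytes...
assert zlib.decompress(fast) == data  # ...but both decode to the original
assert zlib.decompress(best) == data
```

So differing output sizes between the GitHub build and the shipped library only mean the two encoders search differently, not that either stream is malformed.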


> Worryingly, the two versions also produce different files

That is actually pretty common. I would guess the version you have in macOS is an older version. They may have tweaked the algorithm to deliver slightly better compression at the expense of speed, which is always the tradeoff in this field.

On the other hand, I would not be surprised if they prepared special tweaks in their internal version to better support arm64. Strategically, Apple seems to believe that people will stick with building for their App Store if they are pushed to write non-cross-platform code.

(Oddly enough, LZFSE/LZVN seems well-suited for file system compression on hard drives, but here again Yann Collet wins, with LZ4's superior compression speeds, which matter for write speeds.)


The difference in size probably comes from suboptimalities that were repaired after open-sourcing: http://encode.ru/threads/2221-LZFSE-New-Apple-Data-Compressi... The difference in speed is more surprising; it could come from using a different compiler, as there can be really large differences.


Try compiling with a recent 64-bit GCC and adding `-mtune=native -flto` to the mix.


Thanks for this. Downthread I was asking for exactly this, and it seems Zstd is significantly better. We would of course need a wider range of data to really judge.

One possibility is that Apple's work started before, or in parallel with, Zstd, and they didn't know Zstd was going to be better. But the problem remains: we might end up with one compression algorithm in wide use (by virtue of being pushed by Apple) while another, very similar and better, algorithm waits to go mainstream. If there isn't something special about LZFSE's power usage (beyond "faster operation reduces power usage"), it would be best if, once Zstd is really proved to be solid, they phased out LZFSE and started pushing Zstd instead. I don't really think that will happen, though.

It also seems that Zstd has some dictionary support, so an even bigger question is whether Zstd can actually replace Brotli, which probably has a much bigger impact. I really like the idea of Zstd being available everywhere if all the numbers are as good as they seem to be.
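Dictionary support is what makes a codec interesting for lots of small, similar payloads, the niche Brotli's built-in dictionary targets. The idea can be sketched with the stdlib zlib, which has preset-dictionary support via the zdict parameter (Zstd's own dictionary API differs; this, with a made-up shared dictionary, just illustrates the concept):

```python
import zlib

# Hypothetical shared dictionary: substrings common to many small payloads.
zdict = b'{"user":"","status":"ok","error":null,"id":}'
msg = b'{"user":"alice","status":"ok","error":null,"id":42}'

plain = zlib.compress(msg, 9)  # no dictionary: tiny inputs barely shrink

co = zlib.compressobj(level=9, zdict=zdict)
with_dict = co.compress(msg) + co.flush()  # matches against the dictionary

do = zlib.decompressobj(zdict=zdict)  # decoder needs the same dictionary
assert do.decompress(with_dict) == msg
print(len(plain), len(with_dict))
```

Because the encoder can reference the dictionary as prior history, the short message compresses far better than it does cold, which is exactly the use case where Zstd's trained dictionaries shine.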


Can somebody test the binary that ships with Mac OS X and iOS? I'm asking because the GitHub page says

"This is a reference C implementation of the LZFSE compressor introduced in the Compression library with OS X 10.11 and iOS 9."

My guess is that the versions on Mac OS X and iOS are different, and aren't even written in C (they might have started as C programs, but would have been hand-optimized later).


I’d love to see Charles Bloom add LZFSE and ZSTD to his “pareto frontier” charts: http://cbloomrants.blogspot.com (read down a few posts)



It does seem to show decoding slightly faster with LZFSE, though, unlike your parent's local benchmark. It's not clear to me from those charts that either is always superior.



