
Why you should always use gzip instead of deflate compression - tbassetto
http://stackoverflow.com/questions/388595/why-use-deflate-instead-of-gzip-for-text-files-served-by-apache/9856879#9856879
======
DHowett
The "CRC-32 is slow" sentiment feels like a bit of a straw-man argument
against gzip. With today's computing power, the difference is nigh-unto
negligible: checksumming is dwarfed by the actual decompression, let alone by
the network overhead and latency.

Gzip is, doubtless, better for the reasons laid out in the article, but why
aren't we moving forward? Do any browsers support LZMA or bzip2? Would they be
at all worth the effort? I assume not in the case of HTML/resources, but maybe
in raw non-compressed binary streams.

~~~
rmc
The largest assets will already be compressed (e.g. images and video). HTTP
compression really only benefits textual content. Gzip gives you a massive
space saving over uncompressed textual data, but after that you're into
diminishing returns: although bzip2 is better than gzip, it's not a lot
better.

~~~
burstlag
bzip2 compresses significantly better than gzip (I typically see a delta of
between 10% and 30% in filesize) but it comes at a significant cost:
compressing/decompressing the same file generally takes many times longer. xz
compresses even better but takes even longer.
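The trade-off described here can be illustrated with Python's stdlib `bz2` and `zlib` modules as stand-ins for the command-line tools; the exact ratios and timings will vary with the input, so treat this as a sketch rather than a benchmark:

```python
import bz2
import zlib

# Arbitrary repetitive text payload, roughly 225 KB.
data = b"The quick brown fox jumps over the lazy dog. " * 5_000

gz = zlib.compress(data, 9)   # DEFLATE at max level, as used by gzip
bz = bz2.compress(data, 9)    # bzip2 at max level

print(f"original {len(data)}, deflate {len(gz)}, bzip2 {len(bz)}")
```

On typical textual data bzip2 wins on size and loses badly on time, which is why it never displaced gzip for HTTP.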

Also, some web servers (or at least Apache) will serve gzipped static content
automagically if it finds a matching gzipped file. For example, if you have
/var/www/html/foobar.css.gz, it will serve the contents of that file directly
when the client requests <http://example.com/foobar.css>. So if you're running
a stock Apache configuration, there's no reason not to just run gzip on all of
your static assets and get lower bandwidth bills right away.
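Pre-compressing the assets is trivial to script. A minimal sketch, assuming the server is configured to pick up `.gz` siblings (whether that happens out of the box or needs e.g. MultiViews or mod_rewrite depends on the Apache setup); the document root path is hypothetical:

```python
import gzip
import os
import shutil

TEXT_EXTS = {".css", ".js", ".html", ".svg"}   # assets worth gzipping

def precompress(root):
    """Write foo.css.gz next to foo.css for every text asset under root."""
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            if os.path.splitext(name)[1] in TEXT_EXTS:
                src = os.path.join(dirpath, name)
                with open(src, "rb") as f_in, \
                        gzip.open(src + ".gz", "wb", compresslevel=9) as f_out:
                    shutil.copyfileobj(f_in, f_out)

# Example (hypothetical path): precompress("/var/www/html")
```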

------
justincormack
Adler32 is very broken, but it doesn't really matter, as you can use other
checksums on top.
<http://www.leviathansecurity.com/blog/archives/16-Analysis-of-Adler32.html>

Browsers should just stop supporting deflate and stop advertising it in the
request, and servers should drop it too. Ideally, gzip support should be
required, not negotiable.
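Part of why deflate is worth dropping is that the name is ambiguous: the same DEFLATE stream can be wrapped three different ways, and servers historically disagreed on which one the `deflate` Content-Encoding meant. Python's `zlib` can show all three via its `wbits` parameter:

```python
import zlib

data = b"hello, content-encoding"

def deflate(payload, wbits):
    c = zlib.compressobj(9, zlib.DEFLATED, wbits)
    return c.compress(payload) + c.flush()

gzip_stream = deflate(data, 31)    # gzip container: header + CRC-32
zlib_stream = deflate(data, 15)    # zlib container: header + Adler-32
raw_stream = deflate(data, -15)    # bare DEFLATE, what some servers sent

assert gzip_stream[:2] == b"\x1f\x8b"   # gzip magic bytes
```

Per the RFCs, `deflate` means the zlib container, but enough servers sent the raw stream that browsers had to guess, whereas `gzip` is unambiguous.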

~~~
dfox
On the other hand, the quality of Adler32 is hardly a valid argument here, as
HTTP without compression does not use any checksums whatsoever. The fact that
it's not consistently implemented is.

By the way, I don't think it's right to call Adler32 broken: there are much
worse algorithms in common use (the TCP checksum, for example), and the
limitations of Adler32 are widely known and not too relevant for this
application.
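One of those well-known limitations is easy to see with stdlib `zlib`: for short inputs, Adler-32 can only reach a tiny corner of the 32-bit space, because both of its internal sums stay small:

```python
import zlib

# Adler-32 keeps two sums: A = 1 + sum(bytes), B = running sum of A,
# combined as (B << 16) | A. For any 2-byte input, A <= 511 and
# B <= 767, so the checksum never exceeds ~50 million of the ~4.3
# billion possible 32-bit values.
worst = zlib.adler32(b"\xff\xff")
assert worst == (767 << 16) | 511

print(f"max 2-byte adler32: {worst} of {2 ** 32} possible values")
```

For HTTP payloads of realistic size this matters little, which is the point being made above.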

~~~
calloc
Luckily once we all switch to IPv6 we will no longer have TCP checksums :-)

------
vasi
Good post, but it would be great if "for HTTP" was added to the title. This
has nothing to do with other use cases.

------
sciurus
It's interesting to learn about the web browser and server compatibility
problems.

In the general case, I recommend one of two compression algorithms:

* lzo, for when compression or decompression speed matters most

* lzma, for when file size matters most
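The size end of that trade-off can be sketched with Python's stdlib, using `zlib` (DEFLATE) and `lzma` as stand-ins; lzo and snappy are not in the stdlib, so they are omitted here, and the payload is arbitrary:

```python
import lzma
import zlib

# Arbitrary repetitive payload, roughly 300 KB.
data = b"a somewhat repetitive payload " * 10_000

deflated = zlib.compress(data, 9)          # DEFLATE at max level
lzmaed = lzma.compress(data, preset=9)     # LZMA at max preset

print(f"original {len(data)}, deflate {len(deflated)}, lzma {len(lzmaed)}")
```

LZMA typically wins clearly on size and loses on compression time, matching the recommendation above.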

~~~
premchai21
LZO, though, has the potential problem or benefit (depending on your
situation) of not being readily incorporable into proprietary software without
navigating Oberhumer's commercial-license maze, unless there's some form of
clean-room reimplementation around that I don't know about. (“LZO
Professional” at <http://www.oberhumer.com/products/lzo-professional/> claims
to be available only as a “binary-only evaluation library” under NDA; I don't
know whether “evaluation” in this context would obstruct use in an actual
product.) The zlib license makes it much easier to deploy. liblzma seems to be
in the public domain.

~~~
wmf
I think LZO has been obsoleted by Snappy anyway.

~~~
pmjordan
Even where the GPL is fine, LZO consists of some pretty scary code. Though at
this point, I think probably enough people smarter than me have understood it
and deemed it safe (it's part of the Linux kernel).

Snappy is fine as long as you can compile C++ and link against the C++
standard library, as it uses std::string for byte buffers for some strange
reason. I can only assume it's mainly used for compressing strings at Google,
but unless someone ports it to C or at least rips out the standard-library
dependency, that will preclude its use in some embedded systems and in the
Linux kernel.

As a substitute for LZO or Snappy, I recently integrated the BSD-licensed LZ4
[1] into a project where no C++ standard library was available. Dependencies
are minimal, memory use is predictable and the code is actually readable.
Speed and ratios are comparable to LZO and Snappy.

[1] <http://code.google.com/p/lz4/>

