
Ask HN: Best compression format for HTML? - cztomsik
Is there any custom algorithm for compressing HTML? I mean, gzip is fast & good, bzip2 can compress even better, and PAQ can probably do the best.

Is there anything specifically designed for HTML?
======
dabmancer
HTML is repetitive text, and that is what conventional compression algorithms
(like the ones you mentioned) are good at. How they work is pretty
interesting, so I'd encourage you to go look it up.
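
As a rough illustration of the point about repetitive text, here is a minimal sketch using only Python's standard library. The sample page is made up (a table of near-identical rows), so the exact sizes say nothing about real-world pages; it only shows that general-purpose compressors shrink markup like this a lot.

```python
import bz2
import lzma
import zlib

# Hypothetical sample page: a table with many near-identical rows.
rows = "".join(
    f'<tr class="row"><td>{i}</td><td><a href="/item/{i}">Item {i}</a></td></tr>\n'
    for i in range(200)
)
html = f"<!DOCTYPE html><html><body><table>\n{rows}</table></body></html>".encode("utf-8")

# Compressed size in bytes for each general-purpose algorithm.
sizes = {
    "raw": len(html),
    "zlib": len(zlib.compress(html, 9)),  # DEFLATE, the algorithm behind gzip
    "bz2": len(bz2.compress(html, 9)),
    "lzma": len(lzma.compress(html)),
}

for name, size in sizes.items():
    print(f"{name:>5}: {size} bytes")
```

Swapping in a real page for `html` gives a more honest comparison.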

The fact that you need to compress HTML makes me think there is a bigger
problem, though (people shouldn't make complicated web pages).

~~~
cztomsik
I know how these work, and it's indeed interesting. For example, PAQ achieves
great compression by training multiple neural networks, but it's still generic.
So I thought maybe somebody has already pretrained something similar on large
amounts of HTML, so that it gives really great compression, even better than
PAQ, which has to start from scratch every time...

------
zimpenfish
Brotli?

[https://en.wikipedia.org/wiki/Brotli](https://en.wikipedia.org/wiki/Brotli)

> Brotli uses a pre-defined 120 kilobyte dictionary [...] contains over 13000
> common words, phrases and other substrings derived from a large corpus of
> text and HTML documents.
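
The pre-defined dictionary idea can be tried without Brotli itself. The sketch below is not Brotli: it uses the preset dictionary support in Python's standard `zlib` module (the `zdict` argument) as an analogous technique. The dictionary contents and the sample page are made up for illustration; a real dictionary would be derived from a large corpus.

```python
import zlib

# Hypothetical preset dictionary: substrings expected to show up in many pages.
# Both the compressor and the decompressor must use exactly the same bytes.
HTML_DICT = (
    b'<script src="</script><link rel="stylesheet" href="'
    b'<meta charset="utf-8"><title></title></head><body></body></html>'
    b'<div class="</div><a href="</a><p></p><!DOCTYPE html><html><head>'
)


def compress(data: bytes, zdict: bytes = b"") -> bytes:
    """DEFLATE-compress data, optionally primed with a preset dictionary."""
    if zdict:
        comp = zlib.compressobj(level=9, zdict=zdict)
    else:
        comp = zlib.compressobj(level=9)
    return comp.compress(data) + comp.flush()


def decompress(blob: bytes, zdict: bytes) -> bytes:
    """Inverse of compress() for data that was compressed with zdict."""
    decomp = zlib.decompressobj(zdict=zdict)
    return decomp.decompress(blob) + decomp.flush()


# Hypothetical small page: short documents benefit most from a preset
# dictionary, since they contain little internal repetition to exploit.
page = (
    b'<!DOCTYPE html><html><head><meta charset="utf-8"><title>Hi</title></head>'
    b'<body><div class="post"><p>Hello</p><a href="/next">Next</a></div></body></html>'
)

plain = compress(page)
with_dict = compress(page, HTML_DICT)
restored = decompress(with_dict, HTML_DICT)

print(len(page), len(plain), len(with_dict))
```

The catch is the same as with Brotli's built-in dictionary: the receiver has to already have the dictionary, so it only pays off when it is shipped once and reused across many documents.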

------
cztomsik
BTW: this is by far the best thing I've read about compression:
[http://mattmahoney.net/dc/dce.html#Section_1](http://mattmahoney.net/dc/dce.html#Section_1)

