Despite the "extreme inefficiency", HTTP/HTML have managed to work successfully for a very long period of time (and even worked decently well on very slow hardware in the '90s).
There is actually a binary format that much of the web content is compiled into: gzip. It's remarkably effective.
I will grant that following Postel's Law means that browsers have more work to do to ensure that all kinds of "busted" stuff on the web continues to work, but I'd guess that, at this stage, that work is pretty small compared to everything else browsers are trying to do.
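For a rough sense of how effective gzip is on markup, here's a quick sketch (synthetic, highly repetitive HTML, so real pages will compress less uniformly):

```python
import gzip

# Synthetic, repetitive markup; real-world pages are less uniform than this.
html = b"<li class='item'><a href='/page'>Link</a></li>\n" * 500

compressed = gzip.compress(html)
ratio = len(compressed) / len(html)
print(f"{len(html)} -> {len(compressed)} bytes ({ratio:.1%})")
```

The redundancy of tag names is exactly the kind of repetition DEFLATE's dictionary handles well.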
> There is actually a binary format that much of the web content is compiled into: gzip. It's remarkably effective.
I'm sure you realize this, but gzip is a content encoding. If there were a binary HTML format, it could still (and should) be gzipped (or, better yet, 7-zipped).
I did some playing around with this, and gzipped binary-encoded HTML ended up around 1/4 the size of the equivalent gzipped minified HTML.
Gzip doesn't give you the other advantages of the hypothetical binary format, primarily: An extremely standardized and quickly parseable format.
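A toy experiment along these lines might look like the following. The one-byte token table is purely illustrative (not any real binary-HTML format), and this says nothing about which version gzips smaller on real pages; it just shows the shape of the comparison:

```python
import gzip

# A tiny repetitive snippet; a real test would use actual pages.
html = b"<div><p>Hello</p><p>World</p></div>" * 100

# Naive "binary HTML": replace common tags with one-byte opcodes.
# This token table is hypothetical, for illustration only.
TOKENS = {b"<div>": b"\x01", b"</div>": b"\x02",
          b"<p>": b"\x03", b"</p>": b"\x04"}

binary = html
for tag, op in TOKENS.items():
    binary = binary.replace(tag, op)

text_gz = gzip.compress(html)
binary_gz = gzip.compress(binary)
print(len(html), len(text_gz), len(binary), len(binary_gz))
```

The pre-gzip binary form is strictly smaller, but since gzip already exploits the repeated tag strings, the post-gzip gap is the interesting number.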
What "binary encoded HTML" did you use? I'd be interested in seeing this, as my initial thought would be that both would gzip to almost exactly the same size.
So which is which? Is HTML the CSV or the DB dump? What is this metaphor even supposed to mean? It's not as though a CSV file is sufficient to serialize a database.
Actually, it would be more informative to point us at a "binary encoding", any binary encoding, that compresses smaller than the equivalent html text.