
As much Stack Overflow as possible in 4096 bytes - df07
http://danlec.com/blog/stackoverflow-in-4096-bytes
======
mberning
Very impressive. I wish extreme performance goals and requirements would
become a new trend. I think we have come to accept a certain level of
sluggishness in web apps. I hate it.

I wrote a tire search app a few years back and made it work extremely fast
given the task at hand. But I did not go to the level that this guy did.
[http://tiredb.com](http://tiredb.com)

~~~
bane
Now that we have blisteringly fast computers, it's worth it to browse _old_
websites and see what "snappy" looks like.

[http://info.cern.ch/hypertext/WWW/TheProject.html](http://info.cern.ch/hypertext/WWW/TheProject.html)

If we could cram more modern functionality into say...twice or three times the
performance of the above, I think the web would be a better place. Instead the
web is a couple orders of magnitude slower.

~~~
deckiedan
Yes. In some ways I think we're still at a very primitive level of web
development. Either you do it by hand, tweaking each individual parameter
like the old demoscene, making it fast and amazingly small, or else you
write huge chunky slow web apps, or, more usually, something in the middle.

I feel like the big thing I'm missing is smart compilers that can take web app
concepts and turn them into extremely optimised 'raw' HTML/CSS/JS/SQL/backend.
All of the current frameworks still rely on hand-written, frequently bloated
or inelegant CSS & HTML, and still require thinking manually about how and
when to do AJAX so it's least offensive to the user. Maybe something like
yesod ( [http://www.yesodweb.com/](http://www.yesodweb.com/) ) is heading in
the right direction. [http://pyjs.org/](http://pyjs.org/) has some nice ideas
too... But I'm thinking of something bigger than individual technologies like
CoffeeScript or LESS... Something that doesn't 'compile to JS' or 'compile to
CSS', but 'compile to stack'. I dunno. Maybe I'm just rambling.

~~~
aaronblohowiak
the "sufficiently smart compiler" is kind of like "world peace"; something to
work towards, but i doubt we'll have it this lifetime.

[http://c2.com/cgi/wiki?SufficientlySmartCompiler](http://c2.com/cgi/wiki?SufficientlySmartCompiler)

~~~
derefr
"Sufficiently Smart Compiler", like most AI, is a concept with constantly
shifting goal-posts. As soon as compilers _can_ do something, we no longer
consider that thing "smart." Consider variable lifetime analysis, or stream
fusion -- a decade ago, these would be considered "sufficiently smart
compiler" features. Today, they're just things we expect (of actually-decent
compilers), and "sufficiently smart" means something even cleverer.

~~~
calinet6
And, given those optimizations, the programmers get sufficiently dumber to
compensate, resulting in a constant or decreasing level of performance.

That's gotta be a law codified somewhere, right?

------
jc4p
Some of the workarounds he mentions at the end of his Trello in 4096 bytes[1]
post seem really interesting:

\- I optimized for compression by doing things the same way everywhere; e.g. I
always put the class attribute first in my tags

\- I wrote a utility that tried rearranging my CSS, in an attempt to find the
ordering that was the most compressible

[1] [http://danlec.com/blog/trello-in-4096-bytes](http://danlec.com/blog/trello-in-4096-bytes)
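
A rough way to reproduce the second trick (purely a sketch: the file name, the
naive rule splitting, and the random-swap search are my own assumptions, not
the author's actual utility) is to shuffle the order of the CSS rules, gzip
each candidate, and keep the smallest result:

    
    
        import gzip, random
        
        # Hypothetical input: st4k.css split into rules (naive split; ignores @media nesting)
        rules = [r.strip() + "}" for r in open("st4k.css").read().split("}") if r.strip()]
        
        def gz_size(ordering):
            return len(gzip.compress("".join(ordering).encode(), 9))
        
        best, best_size = rules[:], gz_size(rules)
        for _ in range(10000):                     # crude random-swap search
            cand = best[:]
            i, j = random.randrange(len(cand)), random.randrange(len(cand))
            cand[i], cand[j] = cand[j], cand[i]
            size = gz_size(cand)
            if size <= best_size:
                best, best_size = cand, size
        
        print("original:", gz_size(rules), "best found:", best_size)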

~~~
baddox
> \- I optimized for compression by doing things the same way everywhere; e.g.
> I always put the class attribute first in my tags

Compression algorithms can do a better job when they're domain-aware. An HTML-
aware algorithm could compress HTML much better than a general-use plain-text
compression algorithm, without requiring the user to do things like put the
class attribute first. Of course, that also requires the _de_ compression
algorithm to be similarly aware, which can be a problem if you're distributing
the compressed bits widely.

~~~
MasterScrat
> that also requires the decompression algorithm to be similarly aware, which
> can be a problem if you're distributing the compressed bits widely.

Well, not necessarily... An HTML-aware algorithm could, for example, rearrange
attributes into the same order everywhere because it knows _it doesn't matter_.

Actually that would be a nice addition to the HTML "compressors" out there.

~~~
baddox
That's a good point. You could have an HTML-aware "precompressor" prepare the
HTML for a general-use compression algorithm. However, with end-to-end HTML
awareness I think you could do even better.
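
Something like this, say (a toy sketch of my own, not an existing tool): a
lossless "normalize before gzip" pass that sorts the attributes of every start
tag so identical tags always serialize identically:

    
    
        import re
        
        def normalize_attrs(html):
            """Sort attributes alphabetically inside each start tag.
            Sketch only; a real tool would use a proper HTML parser, not a regex."""
            def fix(m):
                tag, attrs = m.group(1), m.group(2)
                parts = re.findall(r'[^\s=]+(?:="[^"]*")?', attrs)
                return "<" + tag + " " + " ".join(sorted(parts)) + ">"
            return re.sub(r'<(\w+)\s+([^<>]+?)\s*>', fix, html)
        
        print(normalize_attrs('<div id="a" class="row"><span class="b" id="x">hi</span></div>'))
        # -> <div class="row" id="a"><span class="b" id="x">hi</span></div>

Since the output is still equivalent HTML, an unmodified browser and an
unmodified gzip keep working; only the server-side pipeline needs to know
about it.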

~~~
MasterScrat
Actually that's what Google has done with Courgette:
[http://www.chromium.org/developers/design-documents/software-updates-courgette](http://www.chromium.org/developers/design-documents/software-updates-courgette)

> Courgette transforms the input into an alternate form where binary diffing
> is more effective, does the differential compression in the transformed
> space, and inverts the transform to get the patched output in the original
> format. With careful choice of the alternate format we can get substantially
> smaller updates.

------
Whitespace
I'm curious if a lot of the customizations re:compression could be similarly
achieved if the author used Google's modpagespeed for apache[0] or nginx[1],
as it does a lot of these things automatically including eliding css/html
attributes and generally re-arranging things for optimal sizes.

It could make writing for 4k less of a chore?

In any case, this is an outstanding hack. The company I work for has TLS
certificates that are larger than the payload of his page. Absolutely terrific
job, Daniel.

[0]:
[https://code.google.com/p/modpagespeed/](https://code.google.com/p/modpagespeed/)

[1]:
[https://github.com/pagespeed/ngx_pagespeed](https://github.com/pagespeed/ngx_pagespeed)

 _edit: formatting_

~~~
lstamour
Well, the TLS problem is why we'd also want QUIC. But that's another story...

------
nej
Wow, navigating around feels instant, almost as if I'm hosting the site
locally. Great job!

------
derefr
> I threw DRY out the window, and instead went with RYRYRY. Turns out just
> saying the same things over and over compresses better than making reusable
> functions

This probably says something about compression technology vs. the state of the
art in machine learning, but I'm not sure what.

------
cobookman
First off, nice work. I've noticed that St4k loads each thread using AJAX,
whereas stackoverflow actually opens a new 'page', re-issuing a lot of web
requests. Disclaimer: I've got browser cache disabled.

E.g. on a thread click:

St4k:

GET
[https://api.stackexchange.com/2.2/questions/21840919](https://api.stackexchange.com/2.2/questions/21840919)
[HTTP/1.1 200 OK 212ms] 18:02:16.802

GET
[https://www.gravatar.com/avatar/dca03295d2e81708823c5bd62e75...](https://www.gravatar.com/avatar/dca03295d2e81708823c5bd62e752121)
[HTTP/1.1 200 OK 146ms] 18:02:16.803

stackoverflow.com (a lot of web requests):

GET [http://stackoverflow.com/questions/21841027/override-volume-...](http://stackoverflow.com/questions/21841027/override-volume-button-in-background-service) [HTTP/1.1 200 OK 120ms] 18:02:54.791

GET
[http://ajax.googleapis.com/ajax/libs/jquery/1.7.1/jquery.min...](http://ajax.googleapis.com/ajax/libs/jquery/1.7.1/jquery.min.js)
[HTTP/1.1 200 OK 62ms] 18:02:54.792

GET
[http://cdn.sstatic.net/Js/stub.en.js](http://cdn.sstatic.net/Js/stub.en.js)
[HTTP/1.1 200 OK 58ms] 18:02:54.792

GET
[http://cdn.sstatic.net/stackoverflow/all.css](http://cdn.sstatic.net/stackoverflow/all.css)
[HTTP/1.1 200 OK 73ms] 18:02:54.792

GET
[https://www.gravatar.com/avatar/2a4cbc9da2ce334d7a5c8f483c92...](https://www.gravatar.com/avatar/2a4cbc9da2ce334d7a5c8f483c9216e1)
[HTTP/1.1 200 OK 90ms] 18:02:55.683

GET [http://i.stack.imgur.com/tKsDb.png](http://i.stack.imgur.com/tKsDb.png)
[HTTP/1.1 200 OK 20ms] 18:02:55.683

GET [http://static.adzerk.net/ados.js](http://static.adzerk.net/ados.js)
[HTTP/1.1 200 OK 33ms] 18:02:55.684

GET [http://www.google-analytics.com/analytics.js](http://www.google-analytics.com/analytics.js) [HTTP/1.1 200 OK 18ms] 18:02:55.684

GET [http://edge.quantserve.com/quant.js](http://edge.quantserve.com/quant.js)

....and more....

~~~
oneeyedpigeon
Almost all that really tells us is that you have browser cache disabled
(resources such as jQuery wouldn't be re-requested on Stack Overflow if you
didn't). As a matter of interest, why are you disabling browser cache? Doesn't
that needlessly waste bandwidth? Is it for some kind of security reason?

~~~
cobookman
I'm currently developing a lot of websites. Disabling browser cache ensures
that every file is the most recent version (and not cached), and also lets me
see dependency loading more clearly.

Even with browser cache enabled, Stack Overflow loads a considerable number of
resources compared to st4k. St4k makes one API call to get the S.O. data
(JSON, ~1KiB), then loads any needed images. Stack Overflow loads the entire
HTML document again (~15KiB), along with a lot of other web resources. Without
going into their code, I've got no idea what is lazy-loaded.

But my point still stands on the speed of page navigation (not first-time
landing). St4k is faster because each change of page requires fewer KiB of
information to perform a page render, and less content to render: compressed
JSON vs. the entire HTML markup, and rendering only the changes vs.
re-rendering the entire window/document.

~~~
oneeyedpigeon
Oh, yes - I certainly agree that St4k is a lot faster than Stack Overflow, but
I'm not quite sure whether to attribute that to what's happening on the server
(remember that the real SO will have a load several orders of magnitude
greater), fewer 'requirements' (e.g. no analytics?), more efficient markup, or
specifically loading data via AJAX+JSON rather than the 'normal' web route.

I'm surprised there's _such_ a difference (15:1) between the HTML and JSON
versions of the same data. Both add their own syntactic cruft, but I wouldn't
expect the weight of the markup to be that much greater than the equivalent
JSON, unless it's being implemented horribly inefficiently (e.g. very verbose
class names, inline styling, DIVitis). I'm suspicious about that 15:1 figure.

All in all, though, this is an interesting approach. Nothing radically new,
but definitely good to see a solid proof of concept that we can all relate to.
I particularly like the way this gets around any api throttling limits since
the St4k server isn't doing the communication with SO, it's all happening
client->server, much as if one were just browsing SO as normal. Is there a
term for this? It's not quite a proxy, since it's not 'in the middle', but
more 'off to the side, not interfering directly, merely offering helpful
advice' :)

~~~
oneeyedpigeon
OK, having looked at the SO source, it's evidently _not_ very concise. Full of
inline script, lots of 'data-' attributes, even - gasp - tables for layout. (I
think I was dimly aware of that last point, but had chosen to pretend it
wasn't the case. And, yes, I know that HN is no better in that regard.) Still,
FIFTEEN times weightier ...?

------
SmileyKeith
This is amazing. As others have said, I really wish this kind of insane
performance were a goal for sites like this. After trying this demo I found it
difficult to go back to the same pages on the normal site. I also imagine this
would save them a lot of bandwidth, even accounting for server costs.

------
masswerk
And now consider that 4096 bytes (words) was exactly the total memory of a DEC
PDP-1, considered to be a mainframe in its time and featuring timesharing and
things like Spacewar!.

And now we're proud to have a simple functional list compiled into the same
amount of memory ...

~~~
rangibaby
Your iPhone also has more computing power than the rest of the world had back
then. Combined! :-)

~~~
masswerk
And it even can play Bach and connect to the network, like the PDP-1! :-)

------
afhof
4096 is a good goal, but there is a much more obvious benefit at 1024 since it
would fit within the IPv6 1280 MTU (i.e. a single packet). I recall hearing
stories that the Google Homepage had to fit within 512 bytes for IPv4's 576
MTU.

~~~
tedd4u
One packet is great if you can do it. There's a big penalty once the sender on
a new TCP connection exhausts the initial congestion window. A lot of sites
these days have configured this up from 2x or 3x MSS to 10x MSS (about 5,360
bytes) to increase what can be sent in the first round trip back from the
server (the HTTP response, for example).

~~~
Dylan16807
If they're configured for 10x they're probably also going to be using an MSS
of 1460, so you can cram 14 kilobytes of data into the initial request.
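
For what it's worth, both figures check out if you just multiply segments by
MSS (a back-of-the-envelope sketch using the values quoted above):

    
    
        # first-flight budget = initcwnd (segments) * MSS (bytes per segment)
        for mss in (536, 1460):              # old default MSS vs. typical Ethernet MSS
            print(mss, "->", 10 * mss)       # assuming an initcwnd of 10 segments
        # 536  -> 5360   (the "about 5,360 bytes" above)
        # 1460 -> 14600  (roughly 14 kilobytes)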

------
jonalmeida
Pages load almost instantly, as if it's a local webserver - I'm quite
impressed.

------
blazespin
Very impressive! So incredibly fast.

My only thought is that search is the real bottleneck.

------
Jakob
I didn’t realize that the original site is already quite optimized. With a
primed cache the original homepage results in only one request:

    
    
        html ~200KB (~33KB gzipped)
    

Not bad at all. Of course the 4k example is even more stunning. Could the gzip
compression best practices perhaps be added to an extension like
mod_pagespeed?

------
kislayverma
Very very awesome.

I'd take some trade-off between crazy optimization and maintainability, but
I'd definitely rather do this than slap on any number of frameworks because
they are the new 'standard'.

Of course, the guy who has to maintain my code usually ends up crying like a
little girl.

------
dclowd9901
>"I threw DRY out the window, and instead went with RYRYRY. Turns out just
saying the same things over and over compresses better than making reusable
functions"

I would love to investigate this further. I've always had a suspicion that the
aim to make everything reusable for the sake of byte size actually has the
opposite effect, as you have to start writing support code and handling tons
of edge cases as well, not to mention you now have to write unit tests so
anyone who consumes your work isn't burned by a refactor. Obviously, there's a
place for things like Underscore, jQuery, and boilerplate code like Backbone,
but bringing enterprise-level extensibility to client code is probably mostly
a bad thing.
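
A quick way to poke at the quoted claim (a sketch; both snippets are invented
stand-ins, not the author's actual code) is to gzip a "DRY" version that
defines a helper once against a "RYRYRY" version that repeats the literal
markup, and compare the sizes:

    
    
        import gzip
        
        # "RYRYRY": the same markup written out literally, 50 times
        ryryry = "".join(
            '<li class="item"><a class="link" href="/q/%d">title</a></li>\n' % i
            for i in range(50)
        )
        
        # "DRY": a tiny helper defined once, plus 50 short calls
        dry = (
            "function row(id,t){return '<li class=\"item\">"
            "<a class=\"link\" href=\"/q/'+id+'\">'+t+'</a></li>';}\n"
            + "".join("row(%d,'title');\n" % i for i in range(50))
        )
        
        for name, text in (("RYRYRY", ryryry), ("DRY", dry)):
            data = text.encode()
            print(name, "raw:", len(data), "gzipped:", len(gzip.compress(data, 9)))

The raw sizes differ by a factor of a few, but gzip's back-references eat most
of the repetition, which is presumably why the repetitive version can come out
ahead once you count the bytes the abstraction itself costs.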

------
nathancahill
This is really fast! Love it. I thought the real site was fast until I clicked
around on this.

------
arocks
Looks broken on my Android mobile, but seriously this is incredible!

Wonder how we can unobfuscate the source. It would be great if there were a
readable version of the source as well, just like we have for the Obfuscated C
Code Contest. Or perhaps there's some way to use the Chrome inspector for this.

~~~
rangibaby
Using HTML prettify on the source is a start at least:

[https://github.com/victorporof/Sublime-HTMLPrettify](https://github.com/victorporof/Sublime-HTMLPrettify)

------
TacticalCoder
In a different style, the "Elevated" demo, coded in 4K (you'll have a hard
time believing it if you haven't seen it yet):

[http://www.youtube.com/watch?v=_YWMGuh15nE](http://www.youtube.com/watch?v=_YWMGuh15nE)

------
shdon
His root element is "<html dl>". I'm not aware of the dl attribute even
existing... Is that for compressibility or does the "dl" actually do
something?

------
jazzdev
Impressive, and a useful exercise, but it doesn't seem practical to give up
DRY in favor of RYRYRY just because it compresses better and saves a few
bytes.

------
iamdanfox
The simpler UI is quite pleasant to use isn't it! I wonder if companies would
benefit from holding internal '4096-challenges'?

------
nandhp
Code is formatted in a serif font, instead of monospace, which seems like a
rather important difference. Otherwise, it is quite impressive.

~~~
dubcanada
You most likely don't have Consolas or Monaco, then.

That font family should have been

Consolas,Monaco,monospace

Rather than

Consolas,Monaco,serif

But whatever :)

~~~
timtadh
Yep, but fixing it breaks the 4096 barrier:

    
    
        $ curl -s http://danlec.com/st4k | gzip -cd | sed 's/serif/monospace/' | gzip -9c | wc
            14      94    4098

------
timtadh
funny, his compressor must do a better job than mine:

    
    
        $ curl -s http://danlec.com/st4k | wc
             14      80    4096
        $ curl -s http://danlec.com/st4k | gzip -cd | wc
             17     311   11547
        $ curl -s http://danlec.com/st4k | gzip -cd | gzip -c | wc
             19     103    4098

~~~
bdonlan
Turn up the compression level:

    
    
        $ curl -s http://danlec.com/st4k | gzip -cd | gzip -9c | wc
             14      80    4096

~~~
timtadh
Right. I feel silly for not trying that. Good spot.

------
scoopr
There seem to be many bytes left! :)

    
    
       $ zopfli -c st4k |wc
          11     127    4050

~~~
slackito
Thanks for the pointer to zopfli. I've used p7zip in the past as a "better
gzip", and it gets good results for this one too :D

    
    
      $ curl -s http://danlec.com/st4k | gzip -cd | 7z a -si -tgzip -mx=9 compressed.gz
      $ wc compressed.gz 
        14   84 4048 compressed.gz

------
dangayle
I'd love to see a general list of techniques you use, as best practices.

~~~
thedufer
There's a short list at the end of the post about Trello4k:
[http://danlec.com/blog/trello-in-4096-bytes](http://danlec.com/blog/trello-in-4096-bytes)

~~~
dangayle
Thanks. How much of that could we do during the original design phase?

------
tantalor
> The stackoverflow logo is embedded?

Did you try a png data url? Could be smaller.
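
For reference, a data URL is just the image bytes base64-encoded and inlined
into the markup (a sketch; "st4k.png" is a hypothetical file name, not
something from the post):

    
    
        import base64
        
        with open("st4k.png", "rb") as f:     # hypothetical logo file
            b64 = base64.b64encode(f.read()).decode("ascii")
        
        img_tag = '<img src="data:image/png;base64,%s">' % b64
        print(len(img_tag))                   # bytes this would add to the page

Worth noting that base64 inflates the payload by about a third, and since PNG
data is already compressed, gzip can't win much of that back, so whether it
beats the current approach depends on the image.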

------
jpatel3
Way to go!

------
stefan_kendall
Maybe part of the story here is that gzip isn't the be-all and end-all of
compression. A lot of the changes were made to appease the compression
algorithm; it seems like the algorithm could change to handle the input
instead.

A specialized compression protocol for the web?
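
One existing hook in that direction is deflate's preset dictionary: both ends
agree on a blob of strings common in the domain (tag names, attribute names,
and so on), and the compressor can back-reference it from the very first byte.
A sketch with an invented toy dictionary:

    
    
        import zlib
        
        # Invented toy dictionary of byte strings common in HTML
        zdict = b'<div class="</div><span class="</span><a href="http://</a>'
        
        html = (b'<div class="question"><span class="title">'
                b'<a href="http://example.com">hi</a></span></div>')
        
        plain = zlib.compress(html, 9)
        
        co = zlib.compressobj(9, zlib.DEFLATED, 15, 9, zlib.Z_DEFAULT_STRATEGY, zdict)
        primed = co.compress(html) + co.flush()
        
        print(len(html), len(plain), len(primed))
        
        # The receiver must be primed with the same dictionary to decode:
        do = zlib.decompressobj(15, zdict)
        assert do.decompress(primed) == html

SPDY takes the same approach for its header compression: a predefined
dictionary of common header strings shipped with the protocol.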

