

Digg's New DUI.Stream and MXHR - snewe
http://blog.digg.com/?p=621

======
aston
Interesting idea, attempting to crunch all of the requests for a page snippet
into a single request. Why do it in Javascript, though?

It seems like you'd be better off with a pre-processor server-side to inline
include CSS, Javascript and images, then send the resulting page back for
rendering. Javascript isn't nearly as fast as my C++-based templating
engine...

edit to add: Also, don't discount the big savings you can get from "304 Not
Modified" return statuses, especially on big chunks of content like images.
The DUI.Stream image demo they had would get smoked by the 'dumb' version if
they allowed the browser to cache the Digg dude.

~~~
thamer
I suspect there's not much to gain by caching images in the browser: if
you're looking at a page full of comments with hundreds of avatars, chances
are the next page will make you load hundreds more that you haven't seen
already. There is a very large number of commenters, and you probably won't
see the same people on every page.

The demo is right: they ask for 300 different images (as in, different URLs),
and their packager makes it faster.

Also, how would you inline include images? Using base 64 in the image tag?
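For reference, inlining via base64 would look roughly like this; the payload
below is a tiny 1x1 transparent GIF, and the markup is a sketch rather than
anything from Digg's code:

```javascript
// Sketch of inlining an image as a data: URI; no extra HTTP request needed.
// The base64 string is a 1x1 transparent GIF.
var base64Gif = "R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7";
var imgTag = '<img src="data:image/gif;base64,' + base64Gif + '" alt="">';
// The browser decodes the bytes in place instead of fetching a URL.
```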

~~~
aston
The demo is pretty unfair. If you want to package lots of the same image, you
just use it and let cache control do its job. If you want to package lots of
different images, you should use a big sprite (again, server-side packaging
without any js). If you want to make image loading look slow, do what they did
in the demo.

------
tjpick
what advantage does this have over using something like keep-alive?
<http://httpd.apache.org/docs/1.3/keepalive.html>

Keep-alive, to me, seems better since you don't have to modify the frontend
architecture. Turn it on at the server and let the server and browser do the
hard work.
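For reference, turning it on in Apache 1.3 takes only a few httpd.conf lines;
the values below are the stock defaults, shown for illustration:

```apache
# Allow more than one request per TCP connection
KeepAlive On
# Cap the number of requests served over a single connection
MaxKeepAliveRequests 100
# Seconds to wait for another request before closing the connection
KeepAliveTimeout 15
```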

~~~
zemaj
Keep-alive maintains a persistent connection, but you are still limited by
the number of concurrent connections to the server and the latency that each
one experiences.

MXHR bundles all your requests into one. This saves the latency overhead of
each request (and probably some server resources too, since the server
doesn't have to deal with multiple connections).

I tell you what: piping binary data into objects? Genius! I had no idea that
was possible.
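A minimal sketch of that bundling, assuming a made-up boundary marker and
part layout rather than Digg's actual wire format: the server concatenates
many payloads into one response, and the client splits them back apart:

```javascript
// Made-up delimiter and part layout, not Digg's actual format: each part
// is a Content-Type line, a blank line, then the body.
var BOUNDARY = "\n--PART--\n";

function parseMultipart(payload) {
  return payload.split(BOUNDARY).map(function (part) {
    var headerEnd = part.indexOf("\n\n");
    return {
      contentType: part.slice(0, headerEnd).replace("Content-Type: ", ""),
      body: part.slice(headerEnd + 2)
    };
  });
}

// One round trip carries three resources instead of three requests:
var response =
  "Content-Type: text/html\n\n<p>hi</p>" + BOUNDARY +
  "Content-Type: text/css\n\np { color: red }" + BOUNDARY +
  "Content-Type: image/gif\n\n(base64 bytes here)";

var parts = parseMultipart(response);
// parts[0].contentType is "text/html"; parts[2].body holds the image data
```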

~~~
tjpick
yeah, my understanding was that the overhead of a request is significantly
less than the overhead of establishing a connection. So by using
keepalive/persistent connections, you get the majority of the savings. All the
headers and data for each file still have to get sent over the wire in the
digg solution anyway, and the concurrent connections limit is the same in both
cases.

But thanks for explaining, I see the latency particularly is an issue.

~~~
Glide
I found this interesting tidbit on the HAProxy page:

Keep-alive was invented to reduce CPU usage on servers when CPUs were 100
times slower. But what is not said is that persistent connections consume a
lot of memory while not being usable by anybody except the client who opened
them. Today in 2009, CPUs are very cheap and memory is still limited to a few
gigabytes by the architecture or the price. If a site needs keep-alive, there
is a real problem. Highly loaded sites often disable keep-alive to support the
maximum number of simultaneous clients. The real downside of not having keep-
alive is a slightly increased latency to fetch objects. Browsers double the
number of concurrent connections on non-keepalive sites to compensate for
this.

I don't know if it's true or not, but it doesn't take much thought to realize
that if someone wanted to DDoS a server, they would use persistent
connections.

------
mr_justin
Unless I'm reading their demo wrong, their technique makes the page load
slower (for me anyway): <http://demos.digg.com/stream/streamDemo.html>

Normal is consistently under 100ms and MXHR is around 400ms (using Safari 4
beta, OS X)

~~~
mitchellh
I get the same thing with the text version. I refreshed 20 times and only
got one run where it was faster (and only by 7ms at that).

But if you try this in a non-IE browser:
<http://demos.digg.com/stream/imageDemo.html>

The image demo performs amazingly well! I wish there were a timer for that
one, but it's extremely fast.

~~~
mitchellh
CORRECTION TO THIS BY AUTHOR: There is a timer, it just waits until all the
normal images load (and I was too lazy to wait). Upon refreshing, the MXHR
stream was 10.3x faster on my FF.

~~~
mr_justin
Hmm, I tried 10-20 times and MXHR was always slower. Sometimes a full half-
second slower. I had some times approaching 1-second load time, meanwhile the
"normal" load was consistently between 350-400ms.

I know it's just a proof of concept and yes, YMMV, but couldn't they come up
with a demo that clearly showed this new technology they invented was worth
using?

------
schtono
Did you figure out how they're transforming the serialized image data into
real images? I remember there was something like a data attribute on the img
tag back in the day, but I thought it was deprecated years ago...?

Any hints?

~~~
a-priori
It looks like they do it by replacing the image tags in the DOM with an
object tag pointing to a "data:" URI containing the Base64-encoded image
data. Here are two pieces of relevant code: first where they fire off the
listeners for the DOM objects, and second where they replace the DOM objects:

[http://github.com/digg/stream/blob/66dcce320c73b109cd5836e44...](http://github.com/digg/stream/blob/66dcce320c73b109cd5836e44ca2163c235906dd/js/Stream.js#L222)

[http://github.com/digg/stream/blob/23a411eb0cc00048bbb090889...](http://github.com/digg/stream/blob/23a411eb0cc00048bbb09088937ab556c5cebe26/imageDemo.html#L40)

~~~
Glide
I don't think they're replacing any objects.

Every time a Content-Type of image/gif is encountered, it appends a new
object to #stream with the image data inlined. I was really confused by the
object tags at first, so I went ahead and copy-pasted one of them, image data
and all, into a blank HTML file and loaded it in Firefox. The image showed up
fine.
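Roughly, the appending step might look like this (the function name and
payload are mine, not from Digg's Stream.js; in the real demo the markup
would be appended to the #stream element as each part arrives):

```javascript
// Sketch: turn one base64 image payload into an <object> carrying an
// inline data: URI, the markup shape described above.
function inlineImageMarkup(base64Data) {
  return '<object type="image/gif" data="data:image/gif;base64,' +
         base64Data + '"></object>';
}

var markup = inlineImageMarkup("R0lGODlhAQABAAAAACw=");
// markup is a self-contained tag; no request is made for the image
```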

This is an extremely clever solution. I know jQuery UI's icons are in one
giant image file and background offsets are used to display the correct
icon, but this is probably better for a dynamic solution because of the
amount of CPU it would take to stitch upwards of 100 images together.

~~~
smhinsey
That's called a sprite, by the way (the giant jQuery UI file with multiple
images), in case you want to look up more info.

------
vdm
Hmm, this goes some way to making up for the diggbar.

