

The Several Million Dollar Bug - oskarth
http://jacquesmattheij.com/the-several-million-dollar-bug

======
stevejones
This isn't remotely a bug; it's not even really anything to do with browsers.
The browser will be doing (roughly) the following:

    while pending_requests():
        send_request()
        read_response()

But what send_request and read_response are doing is putting data on the OS's
outbound queue and then attempting to get data from the inbound queue. If the
data is already in the inbound queue before the request is put on the outbound
queue it doesn't matter - the browser is not aware of this fact. So long as
the "responses" don't come in faster than the browser is sending requests,
overfilling the queue, and so long as the responses come back in the order
the browser sent the requests, this technique will work. In general
this is just an "optimistic strategy".
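
As a rough illustration (my own sketch, nothing from the article): a server can push the response before it has read a single byte of the request, and a naive write-then-read client never notices, because the early bytes simply wait in the OS's inbound queue.

```python
# Sketch of the "optimistic strategy": the server pushes its HTTP
# response *before* reading the request.  The client's naive
# send_request()/read_response() loop still works because the OS
# buffers the early bytes until the client gets around to reading.
import socket
import threading

def eager_server(info):
    srv = socket.socket()
    srv.bind(("127.0.0.1", 0))
    srv.listen(1)
    info["port"] = srv.getsockname()[1]
    info["ready"].set()
    conn, _ = srv.accept()
    # Respond before reading a single request byte.
    conn.sendall(b"HTTP/1.1 200 OK\r\nContent-Length: 5\r\n\r\nhello")
    conn.recv(4096)      # the request shows up whenever it shows up
    conn.close()
    srv.close()

info = {"ready": threading.Event()}
t = threading.Thread(target=eager_server, args=(info,))
t.start()
info["ready"].wait()

cli = socket.socket()
cli.connect(("127.0.0.1", info["port"]))
cli.sendall(b"GET /frame.jpg HTTP/1.1\r\nHost: cam\r\n\r\n")  # send_request()
response = b""
while not response.endswith(b"hello"):                        # read_response()
    response += cli.recv(4096)
cli.close()
t.join()
```

The client code is exactly the loop above: write, then read. Whether the response was generated before or after the request was read is invisible to it.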

~~~
pbhjpbhj
So once the initial request is made you can push anything you like to the
browser using this pipelining method? What is stopping the responses from
coming in too quickly and "overfilling" the queue? Making it work doesn't
seem too hard, but aren't there possibilities for exploits if you're loading
unrequested data into memory?

~~~
jacquesm
Hm. Now you have me wondering what potential for abuse there is in this. I
never looked at it from that angle.

~~~
joosters
Potentially, you could imagine a poorly-written web browser that could be
fooled by an extra HTTP response, confusing it with a later request to a
_different_ web site.

So you could follow up an HTTP response for
[http://empty.website.with.no.other.files.to.request.com/](http://empty.website.with.no.other.files.to.request.com/)
with an HTTP response containing malicious javascript. When the user then
tries to view a different website (say their Facebook page), they get
'served' your javascript, which is now running in the context of the new
site. Cookie stealing and other attacks could be run.

In practice it would be unlikely to happen. If the client doesn't read the
2nd response, it is likely to be sitting in a network buffer assigned to the
first connection, which will probably be thrown away when the browser opens
a connection to the new site.

~~~
chetanahuja
That's not how it would work. The later request would be made on a fresh TCP
connection (since it's a different URL). Your previous unsolicited response is
sitting in the buffers for the old TCP connection. They would not mingle.

------
joosters
An alternative way to pump lots of webcam frames was to use multipart-MIME
responses. That way, there was only one HTTP request and the response just
streamed JPEG images, one after the other. No need to break any specifications
to get full network usage.

~~~
jacquesm
Yes, we did that too (on Netscape, IE did not do multipart for the longest
time and when it did it at first was quite buggy).

~~~
joosters
I'd forgotten the pain of trying to get IE to work with this! I can see how
the technique in the article could be a good workaround.

ISTR some webcam software that would fall back to a java applet (that would
parse the multipart MIME) on IE. Anything is preferable to that though!

~~~
jacquesm
Yes, but we did everything we could to stay away from things like java,
downloads and plug-ins. Our mantra was 'it just has to work with whatever the
user already has'.

And there always was a way, even if sometimes it required some - for want of a
better word ;) - unorthodox methods.

------
x1798DE
For anyone who was, like me, confused by who "we" is and why I was supposed to
know that, this is from the author's "About" page:

> _My main occupations are being owner/operator of ww.com, which pioneered
> streaming webcam technology, and working as a consultant to do technical due
> diligence._

Reading the article, at first I assumed he was someone who worked on an early
browser or something, then maybe a hardware webcam manufacturer. I assume that
he's just not used to people showing up at his blog with no context about who
he is, so there you go.

~~~
oskarth
It's jacquesm
([https://news.ycombinator.com/user?id=jacquesm](https://news.ycombinator.com/user?id=jacquesm)),
a regular and top contributor here
([https://news.ycombinator.com/leaders](https://news.ycombinator.com/leaders)).
I think most regulars on HN recognize his blog and are somewhat familiar with
what he has done. It would be nice to have a short explanation in the article
though :)

~~~
mjn
I put together a set of capsule bios of the top 20 HN contributors, which
might be useful background info:
[http://www.kmjn.org/notes/hacker_news_posters.html](http://www.kmjn.org/notes/hacker_news_posters.html)

~~~
goatforce5
Nice work! It would be useful to have their HN names as part of their bios.

~~~
mjn
I was a bit undecided about that, though I'm leaning towards adding them now.
There was a small discussion about that when I first submitted this:
[https://news.ycombinator.com/item?id=6957005](https://news.ycombinator.com/item?id=6957005)

------
colanderman
Some IPSes (Intrusion Prevention Systems) that perform deep-packet inspection
won't pass such traffic.

But this isn't really a "bug" per se; the TCP model is a stream is a stream is
a stream. There's no notion of time, packets, or correlation between streams.
So browsers (and the OS) are acting the only way they can: by treating a TCP
connection as two independent streams.

(Though, how could it be otherwise? Assume HTTP over SCTP (sequenced packets).
We can't require, or even allow, HTTP clients to ignore response packets that
arrive "too early", since it's possible that observers of the client (e.g.
Wireshark) may not observe the exact same timing, which would lead to
divergent interpretations of the conversation.)

Amazon does this too. Upload APIs will return 4xx errors well before the body
is uploaded in the event there's an issue with the headers. Not that (a) most
HTTP clients pay attention to this, or (b) they can do anything about it
without closing and reopening the connection.

~~~
jacquesm
> Some IPSes that perform deep-packet inspection won't pass such traffic.

So will those ISPs also not pass Amazon's upload APIs responses? That would be
pretty sloppy!

~~~
colanderman
IPS (Intrusion Prevention System), not ISP. And yes, they'd drop the
connection on such an error response if configured to do so. (Which is likely
what you want to do anyway in this case. Amazon's engineers had the foresight
_not_ to respond early in the case of a successful upload thankfully.)

~~~
jacquesm
Ah! Complete reading fail on my end. (thanks for the edit, it is much clearer
now, I apparently substituted ISP for IPS).

I don't think it is possible to respond early in case of a successful upload,
after all, that means the upload can still fail for a variety of reasons.
Success indicates that you can move to the next state, and an 'early success'
might still turn into a late failure.

------
sysexit
Agreed with some of the other posters that this isn't a bug. It would be
pretty hard for a browser to make this _not_ work. To make it not work, the
browser would have to check whether there's data available in the local socket
buffer before issuing an HTTP request. On Unix, you could e.g. put the socket
in non-blocking mode, issue a read() to read 1 byte, and then see if you're
getting an EWOULDBLOCK. If you get data instead of EWOULDBLOCK, then
(supposedly) you're in violation of the RFC and therefore the browser might
decide to close the connection (what should it do otherwise?)

It just doesn't make a lot of sense doing the above. Especially because
there's a fundamental race condition here: there is no way to distinguish
between data that's in-flight but not received prior to the browser issuing
the request, and data that was generated after the remote peer read the
browser's request.
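
For what it's worth, the check is easy to sketch (hypothetical code, using MSG_PEEK so the probe doesn't consume the byte; the destructive 1-byte read described above would behave the same way):

```python
# Sketch of the check described above: before sending a request, peek
# at the socket; any bytes already buffered are "unsolicited".
import socket
import time

def has_unsolicited_data(sock):
    """True if bytes are waiting before we've sent any request."""
    sock.setblocking(False)
    try:
        return len(sock.recv(1, socket.MSG_PEEK)) > 0
    except BlockingIOError:      # i.e. EWOULDBLOCK: nothing there
        return False
    finally:
        sock.setblocking(True)

browser, server = socket.socketpair()
before = has_unsolicited_data(browser)      # nothing sent yet
server.sendall(b"HTTP/1.1 200 OK\r\n\r\n")  # an "early" response
time.sleep(0.1)                             # let it land in the buffer
after = has_unsolicited_data(browser)       # data before any request
browser.close()
server.close()
```

And, as noted, the check is inherently racy: data that happened to be in flight when the request went out is indistinguishable from data the peer sent before ever seeing a request.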

~~~
colanderman
You could encounter this behavior (dropping unsolicited responses) in multiple
"legitimate" ways (all of which suffer from the race condition you mention):
reading & writing in separate threads can do it; so can an asynchronous
receive mechanism.

Erlang TCP connections can be configured for asynchronous receive: any
incoming data is delivered as a message to a given process, which usually
immediately acts on it. Say this process has not yet sent a request; it's not
unreasonable to just drop the incoming data.

Of course, I would consider such behavior non-conforming, for the reasons you
point out. Time isn't really defined in a TCP stream.

Better is to utilize the flow control Erlang provides for asynchronous
receive, but this is extra effort so it's plausible a naive implementation
would miss this.

------
neckro23
I remember this sort of thing being called "push" back in the day (1995 or
so). Before animated GIF support was added to Netscape this was the only way
you could achieve animation of any sort on the Web.

The only concrete example I can recall is that Suck.com used this to have an
animated logo at the top of their page. (I think this predated the Java applet
version that you see on the Internet Archive...)

~~~
jacquesm
You're thinking of multipart/x-mixed-replace.

------
nl
This is a nice hack.

Combining it with dynamically generated DNS names might be a nice "content
accelerator" add-on for CDNs, etc.

ie: a page uses resources, each of which has a unique url.

You have custom infrastructure (that sits in front of a normal website) which
dynamically generates a new subdomain for each resource, and replaces the
resource urls with the new urls (using the subdomain).

At the top of the page (or ideally on the previous page) you include some
zero-length resources with the same MIME-type as the resources you want to
serve.

The browser requests these resources, and as soon as you have the connection
open you reply with the zero length resource and then the actual resource you
want to serve.

Subsequently the browser requests the actual resource, and finds it already
waiting.

The unique hostnames are needed to allow you to predict which resource will be
requested.

(This was probably patentable until I wrote it all out, too ;))

~~~
jacquesm
That needs a POC, I'm really curious if you can get that to work reliably. You
may even have the ordering problem solved.

------
Mawaai
Isn't this what SPDY tries to implement?

~~~
jacquesm
One of the elements in SPDY is that responses to requests can be pushed by the
server anticipating a request. But that's a relatively new development
compared to when I figured out that this 'feature' is supported by just about
every browser out there. And it's kind of logical, if you implement HTTP in
the most straightforward way then the network stack will buffer the response
until the next read, regardless of what the rest of the program is doing. So
when the browser issues that read (either in a separate reader thread or in
the same one if it is programmed single threaded write-then-read style) it
immediately finds the answer to the request it just sent out.

Strictly speaking, extra bytes sent past the end of the response to the current
request (or before any request has even been sent) are a protocol violation, but
I'm really not complaining about this one; after all, that line in the spec
does not actually specify the timing. We all just read between the lines to
see what we expect to see: ping ... pong.

------
codingdave
This sounds less like a bug, and more like a specific tweak to his logic due
to his specialized use case.

In most cases, even if the web server knows that a specific page contains
images, it does not know if the browser is actually going to request those
images. What if it is a bot? What if the user cancels? What if they have
disabled image downloads in that browser? What if they have the images and
other secondary files cached?

I do think it is worthwhile to consider such things for your individual needs,
but most use cases won't change the standard request/response mechanism.

~~~
Svip
While true that you cannot know whether the client will request the images,
you can use their user agent to make a pretty decent prediction. There will be
corner cases where you are wrong, but most of the time your prediction will be
true.

For instance, if a bot is pretending to be a Chrome browser, you'd think it
was a regular client when in fact it is not. But that's the bot's fault, not
your implementation's.

~~~
TeMPOraL
> _But that's the bot's fault, not your implementation._

It's your implementation that breaks the protocol spec, not the bot, so it is
still your fault.

------
atesti
What is the purpose of this trick? To reduce latency?

I get that if the browser sends a GET request for a picture, it's possible
that the server already answered. But what happens afterwards?

Is the connection closed? Or does the server send another JPG again?

multipart-mime / content-replace only worked with Netscape, not with IE.

What else is needed for this solution?

Is there an index.html page with an <img> tag and some javascript that
requests the picture again every time the image is loaded?

------
damian2000
Is it possible that the technique had become widespread and actually known
about by the browser makers? i.e. it was a bug but they didn't want to break
any applications so didn't fix it...

------
pedrocr
Am I missing a trick or does this only work when the only thing you're serving
at that HTTP server is the JPEG image of the camera? Otherwise the user later
refreshes the page, thus doing a "GET / HTTP/1.1", and gets /image.jpg instead.

~~~
jacquesm
Yes, you're missing a trick. The url was modified on each request to bust the
caches in between.

~~~
pedrocr
But how will that work if you're sending the response before you parse the
request? You don't know the URL the client is after. Were you relying on the
browser keeping the same connection alive so you always went
index.html->jpegs?

~~~
jacquesm
The cam _only_ sends out images, it can't really do much else. So you don't
need to know the request, it is implicit.

~~~
pedrocr
Right, what I meant was that you can't have the camera serve a nice
/index.html with the embedded image and other niceties like modern IP cameras
do, because you reply with an image to every request.

~~~
jacquesm
Well, you can actually. All you need to do is switch modes after the _first_
request, which you handle like every other. Which is in fact what it did...
The idea here is that once you've received one request for an image all
subsequent requests on that socket will be images as well.
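
A hypothetical reconstruction of that mode switch (my own sketch, not the camera's actual firmware): parse and answer requests normally until the first frame request, then treat every further request on the socket as an implicit "next frame, please".

```python
# Hypothetical mode-switch server: the first request(s) on a
# connection are parsed and answered normally; once a frame has been
# requested, every subsequent request on that socket just gets the
# next JPEG, no parsing needed.
import socket
import threading

def read_request_path(conn):
    """Read one HTTP request up to the blank line; return its path."""
    data = b""
    while b"\r\n\r\n" not in data:
        chunk = conn.recv(4096)
        if not chunk:
            return None                   # peer closed the connection
        data += chunk
    return data.split(b" ")[1].decode()   # "GET <path> HTTP/1.1..."

def serve_connection(conn, frames):
    streaming = False
    frame_iter = iter(frames)
    while True:
        path = read_request_path(conn)
        if path is None:
            return
        if not streaming and path != "/image.jpg":
            body = b"<html><img src='/image.jpg'></html>"
            conn.sendall(b"HTTP/1.1 200 OK\r\nContent-Length: %d\r\n\r\n%s"
                         % (len(body), body))
            continue
        streaming = True                  # from here on, requests are implicit
        frame = next(frame_iter, None)
        if frame is None:
            return
        conn.sendall(b"HTTP/1.1 200 OK\r\nContent-Type: image/jpeg\r\n"
                     b"Content-Length: %d\r\n\r\n%s" % (len(frame), frame))

def fetch(sock, path):
    """Minimal client: send one request, read one Content-Length body."""
    sock.sendall(b"GET " + path.encode() + b" HTTP/1.1\r\nHost: cam\r\n\r\n")
    buf = b""
    while b"\r\n\r\n" not in buf:
        buf += sock.recv(4096)
    head, body = buf.split(b"\r\n\r\n", 1)
    length = int(head.lower().split(b"content-length:")[1].split(b"\r\n")[0])
    while len(body) < length:
        body += sock.recv(4096)
    return body

client, cam = socket.socketpair()
t = threading.Thread(target=serve_connection,
                     args=(cam, [b"FRAME1", b"FRAME2"]))
t.start()
page = fetch(client, "/")           # handled like any other request
f1 = fetch(client, "/image.jpg")    # switches the socket into frame mode
f2 = fetch(client, "/image.jpg")    # implicit: just "next frame"
client.close()
t.join()
cam.close()
```

Once `streaming` is set, the server could even push frames ahead of the requests, which is the trick the article describes.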

~~~
pedrocr
Ok, then you are relying on multiple requests on a single socket, which was
what I had suggested before. Does that work reliably though if the user
reloads the page while it's streaming? Wouldn't the browser reuse the same
connection to request the HTML page again and get an image instead?

------
carsonreinke
Does anyone have any specific examples of this?

~~~
jacquesm
If you put your email address in your profile (or send me a line) I'll reply
with a link to a cam that is still online from way back when using this
technology.

I'd rather not post the link in the thread because the poor people sending out
the stream would not be able to satisfy even a small portion of the kind of
volume that HN can direct to a site in an eyeblink.

~~~
maxk42
Alright, here's one of those dumb questions you seem very open to: Can we use
this technique to (for example) reply with all of a page's dependencies upon
the initial request? i.e. if a user goes to www.example.com/ and the server
immediately replies with /, /favicon.ico, /styles.css, /script.js,
/banner.png, etc? I imagine if it were possible, this would result in a
massive reduction in latency...

~~~
jacquesm
Well, you can and you can't. See the problem is that you have no idea what the
next request will be about! So if you're sending the _same_ kind of request
from the client you can respond with a payload in the mime type that is being
expected. But for your use case you could receive a request for /style.css and
respond with /favicon.ico if a client decided to make the requests in an order
that you did not anticipate.

If you get lucky it will work, but if you're unlucky then you'll be sending
out the wrong payloads on all but the first request.

The only reason this trick worked for the webcam is because it knows ahead of
time what _kind_ of request will come (the request for the next frame). That's
why it can anticipate.

~~~
maxk42
Informative!

~~~
maxk42
Now that I think about it, one could use a small javascript library embedded
in the index page to make a number of additional requests and interpret them
as the correct types via data: URLs. That would be a lot of messy hacking to
shave off a few hundred ms, but might be an interesting exercise to
undertake...

------
Istof
Using this method, you could possibly reduce load times by sending Javascript
and CSS files with the HTML file

~~~
jonny_eh
Assuming those requests are sent in order, that could be an interesting
optimization!

~~~
jacquesm
> Assuming those requests are sent in order

That's the problem right there. Right off the bat I don't see how to get
around that one.

~~~
nl
Dynamically generated unique hostnames per resource would work.

~~~
corford
I'm probably missing something but... wouldn't all the extra DNS lookups
remove most of the speed advantage?

------
tootie
Isn't this what UDP is for? Or did it not exist in those days?

~~~
jacquesm
Sure UDP existed 'in those days'.

But how are you going to use UDP to send images to a browser without using a
plug-in or an applet? The whole idea was to remain 'compatible' (for small
values of compatible) with HTTP, which more or less guaranteed delivery.

UDP wouldn't make it through most firewalls and would make all kinds of
assumptions about port forwarding and so on, besides the fact that browsers
simply do not expect content to arrive via UDP.

