
Reactive prefetch on Google Search: 100-150ms speedup - igrigorik
https://plus.google.com/+IlyaGrigorik/posts/ahSpGgohSDo
======
qwerta
There is a simple way to speed this up. All Google Search links point to a
redirection service: www.google.gr/url?example.com. It is trivial to write a
script which makes those links direct.
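
A minimal sketch of such a user script, assuming the destination is carried in
the redirect's "q" or "url" query parameter (the parameter names are an
assumption and may vary by Google property):

    // Rewrite Google's /url? redirect links to point at their destinations
    // directly. The "q"/"url" parameter names are assumptions.
    document.querySelectorAll('a[href*="/url?"]').forEach(function (a) {
      var params = new URL(a.href).searchParams;
      var target = params.get('q') || params.get('url');
      if (target) {
        a.href = target;
      }
    });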

~~~
acdha
They use that to improve search quality by seeing which links people actually
click on – a key signal they're not going to give up – but the good news is
that there's a better way to do that and they're already using it. HTML5 added
a ping attribute to the <a> tag which tells the browser to make an untracked
asynchronous request to a different URL to record the click:
[https://developer.mozilla.org/en-US/docs/Web/HTML/Element/a#...](https://developer.mozilla.org/en-US/docs/Web/HTML/Element/a#attr-ping)
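
For illustration, a rough sketch (the logging endpoint is made up): when such
a link is followed, the browser fires a small background POST to the ping URL
without delaying the navigation.

    // Build a result link that reports clicks via the ping attribute.
    // "https://search.example/log-click" is a hypothetical logging endpoint.
    var a = document.createElement('a');
    a.href = 'https://example.com/result';
    a.setAttribute('ping', 'https://search.example/log-click');
    a.textContent = 'Example result';
    document.body.appendChild(a);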

According to the same author, that was deployed over a year ago but only to
browsers which support it:

[https://plus.google.com/+IlyaGrigorik/posts/fPJNzUf76Nx](https://plus.google.com/+IlyaGrigorik/posts/fPJNzUf76Nx)

Unfortunately, this was implemented in Firefox years ago but disabled due to a
fear-mongering campaign by some self-styled privacy advocates who were quite
vocal in sharing their misunderstanding of web privacy:

[https://web.archive.org/web/20060126211610/http://weblogs.mo...](https://web.archive.org/web/20060126211610/http://weblogs.mozillazine.org/darin/archives/009594.html)

EDIT: I forgot to mention the new Beacon API, which is getting more traction
because it's more powerful and is fully supported as of Firefox 31:

[https://developer.mozilla.org/en-US/docs/Web/API/navigator.s...](https://developer.mozilla.org/en-US/docs/Web/API/navigator.sendBeacon)
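
A minimal sketch of the same idea with sendBeacon, assuming a hypothetical
/log-click endpoint:

    // Queue a small click report without blocking navigation or page unload.
    document.addEventListener('click', function (event) {
      var link = event.target.closest('a');
      if (link && navigator.sendBeacon) {
        navigator.sendBeacon('/log-click', JSON.stringify({ href: link.href }));
      }
    });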

~~~
nly
> Unfortunately, this was implemented in Firefox years ago but disabled due to
> a fear-mongering campaign by some self-styled privacy advocates who were
> quite vocal in sharing their misunderstanding of web privacy

What 'misunderstanding'? I don't want people knowing what third party links
I'm clicking on. There's no misunderstanding. I understand it perfectly. I
just don't want it. I disable 3rd party HTTP referers as well (using the
RefControl extension). Sure it's not the only way sites can implement this
behaviour, and I'm glad an official way exists... but only so Google will use
it and then I can disable it. The argument against having an off switch is
basically 'well, they're going to fuck you anyway, so bend over and here's
some lube'.

You say 'some self-styled privacy advocates' are fear-mongering; well, it's
because webheads keep implementing insanely harmful features and aren't
actively making the web better for the privacy conscious. They (we, I guess)
are grossly under-served.

The response to link tracking _should_ be "Hmm, how can we have websites ask
for this permission, and shut down all these nasty means of doing it?" not "oh
boy, these people are tracking people in an ugly way, how can we make this
_fast?_ ". But guess which one is actually a hard problem.

Do you suppose that most everyday users know (not suspect, but know) that
Google are watching every link they click on? A technologically illiterate
user base cannot consent.

~~~
logicallee
Why don't you download the internet and run grep? Then you don't have to leak
your search terms. Seriously - why would you want Google to know what search
terms you're using? Why would anyone want to pass this information to an
untrusted third party?

As for me, I think once a site has seen my exact search term, knowing which of
the results I'm clicking on is a small leak and quite useful so that the
popular results can be put at the top.

~~~
nly
So when I google 'suicide' it's OK for Google to know whether I'm clicking on
the Samaritans or the Wikipedia article? And when I Google 'rape' it's OK for
Google to know whether I click on a news article about a string of recent
rapes or click through to rape fetish erotica website?

Not everything is black or white.

~~~
sliverstorm
The main argument here is, they _already know_ which links you click on. They
use redirect links. So that's not a sensible reason to prohibit the "ping"
attribute.

Given two solutions with equal potential for abuse, why _not_ pick the
technically superior solution?

~~~
nly
> Given two solutions with equal potential for abuse, why not pick the
> technically superior solution?

Straw man. You're presupposing the existence of 'ping'. The argument is why,
given an observation that web features X and Y are being used to implement
contentious function Z, would you want to implement a brand new, even more
insidious, web feature, designed solely for doing Z, in the first place?
Technical superiority of the new implementation of Z is _not_ in dispute.

> they already know which you click on.

I'm pretty confident that they don't in my case.

~~~
phpnode
> I'm pretty confident that they don't in my case.

Would you mind sharing the details of how you achieve this?

~~~
nly
Monitor the HTTP activity with the FF developer tools while using Google. It's
plain to see that no new traffic to Google occurs when I click a link.

~~~
dchichkov
AFAIK they track links only for a subset of users and not every time.

------
cryptoz
Will other browsers be implementing support for this? How much of this type of
improvement should we view as Google's ambitions and fast pace of execution,
and how much as a Microsoft-style move to lock in users to a specific platform
that offers a better, but incompatible, experience?

I'm not taking sides here; I don't know enough to make a judgement. But it's
interesting that Google seems to be increasing their pattern of standards-
tweaking in order to make a superior product - and who can fault them for
that? But isn't that how we got so much of the mess that MS made?

What's to be done?

~~~
jewel
I think this is just a work-around for the sloppiness of the web. If all the
resources for a page were compiled into a single file and sent all at once,
this wouldn't be necessary, but that'd be inefficient for subsequent page
requests.

Sites that want to get the same speedup can make sure they don't have any
secondary resources that block page rendering.

Finally, the particular mechanism, link rel="prefetch", is used by Bing and
IE 11 [1]. Google has just found a way to prefetch even earlier than normal,
by inserting the prefetch links into the search page as soon as the user
clicks.

[1]: [http://blogs.msdn.com/b/ie/archive/2013/12/04/getting-to-the...](http://blogs.msdn.com/b/ie/archive/2013/12/04/getting-to-the-content-you-want-faster-in-ie11.aspx?Redirected=true)
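
A rough sketch of that mechanism, with a hypothetical result selector and a
made-up list of the target page's critical resources:

    // On mousedown, inject <link rel="prefetch"> hints so the browser fetches
    // the destination's critical resources in parallel with the navigation.
    function prefetch(urls) {
      urls.forEach(function (url) {
        var link = document.createElement('link');
        link.rel = 'prefetch';
        link.href = url;
        document.head.appendChild(link);
      });
    }

    document.addEventListener('mousedown', function (event) {
      var result = event.target.closest('a.result');  // hypothetical selector
      if (result) {
        prefetch([
          'https://example.com/critical.css',  // assumed critical resources
          'https://example.com/critical.js'
        ]);
      }
    });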

~~~
toomuchtodo
> I think this is just a work-around for the sloppiness of the web. If all the
> resources for a page were compiled into a single file and sent all at once,
> this wouldn't be necessary, but that'd be inefficient for subsequent page
> requests.

The benefit of the web was that you didn't have a fat GUI locally. With the
push to aggregate assets together and get the entire UI/logic into the client
browser, cached, and read off of remote services, we're slowly making our way
back to local GUI apps. This time, the browser is the OS (forgive the poor
analogy).

What was old is new again.

------
dzhiurgis
How long until Google just hosts the results on their own website - the
destination opens instantly (preloaded while you gaze through the results)?

I mean, it's not great from a net neutrality standpoint, but it's the next
logical step for them.

~~~
lkbm
Well, they do have snippets from Wikipedia and the like. Bing (I believe) had
the magnifying glass tool that would show you a thumbnail of the result on
hover. Google came out with something similar for a bit--you'd click a result
and it would pull up a thumbnail(?) to the right. That may still be a thing,
actually.

------
Illniyar
Other than search engines, what type of sites gain any benefit from this?

For external links, if you aren't a search engine, you don't know what
resources need to be prefetched, and even if you did you would have no way of
knowing when they change (other than manually). For internal links, in most
cases the resources are already cached, and HTTP/2 server push is going to fix
the rest.

Beyond that, the API looks clunky - instead of having a "prefetch" attribute
on a link, you need to add more links with special syntax to the header
dynamically.

What's the point in adding this functionality to a browser? This looks like an
implementation specifically designed to make only Google search faster.

~~~
3minus1
> you don't know what resources need to be prefetched and even if you did you
> will have no way of knowing when it changes

You could find out if Google set up an API that exposed this information. Then
a simple plugin could update the links on your page.

------
mark242
I would imagine that once HTTP/2 sees serious adoption, this kind of thing
will be unnecessary.

~~~
igrigorik
No, HTTP/2 has no effect on this. The insight here is that we're initiating
the fetch for the HTML _and_ its critical resources in parallel... which
requires that the page initiating the navigation knows which critical
resources are being used on the target page.

~~~
oconnore
With HTTP/2 server push, there is no difference between acquiring the HTML and
the critical resources. They all get sent without a round trip.

Then there is no latency benefit from requesting in parallel, because there is
no round trip to avoid.
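
For example, a sketch using Node's http2 module (the certificate files and
resource contents are placeholders): the server pushes a critical stylesheet
alongside the HTML, so the client never issues a second request for it.

    // Serve HTML over HTTP/2 and push the critical stylesheet with it, so the
    // client gets both without an extra round trip.
    const http2 = require('http2');
    const fs = require('fs');

    const server = http2.createSecureServer({
      key: fs.readFileSync('key.pem'),    // placeholder certificate files
      cert: fs.readFileSync('cert.pem'),
    });

    server.on('stream', (stream, headers) => {
      if (headers[':path'] === '/') {
        stream.pushStream({ ':path': '/style.css' }, (err, pushStream) => {
          if (!err) {
            pushStream.respond({ ':status': 200, 'content-type': 'text/css' });
            pushStream.end('body { margin: 0 }');
          }
        });
        stream.respond({ ':status': 200, 'content-type': 'text/html' });
        stream.end('<link rel="stylesheet" href="/style.css"><h1>Hi</h1>');
      }
    });

    server.listen(8443);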

~~~
dragonwriter
But isn't HTTP/2 push restricted to same-origin content? Critical resources
may not be same-origin, so I'd expect this technique to have some utility even
when HTTP/2 is fully deployed on clients and servers and fully utilized.

~~~
igrigorik
Actually, you're both right. With HTTP/2 the server can push critical assets,
delivering similar results. But that requires that the server supports HTTP/2
and is smart enough to initiate the push, AND those resources are same-origin.

The benefit of the above technique is that it's deployable today, doesn't
require the destination server to be upgraded, and works for cross-origin
resources.

~~~
oconnore
Yeah, it would be quite a feat for an HTTP server to force another HTTP server
to inject content into its TCP stream :P (aka, cross-origin)

~~~
dragonwriter
Cross-origin could be an HTTP server itself serving content from a different
origin via a push channel (which it could do either because the different
_origins_ share _servers_ -- which can occur -- or by having, e.g., prefetched
the data itself from the foreign server and pushing it).

HTTP/2, I'm pretty sure, doesn't allow this (and there are all kinds of
reasons it wouldn't be a good idea), but it wouldn't necessarily require
forcing the other server to push the content.

------
ams6110
Is a 0.1-second improvement a big deal? Especially on mobile/3G, that is lost
in the noise. By comparison, pages right here on HN routinely take 10+ seconds
to load for me, even on a gigabit uplink.

~~~
robrenaud
Yes, latency matters a lot in aggregate.

[http://perspectives.mvdirona.com/2009/10/31/TheCostOfLatency...](http://perspectives.mvdirona.com/2009/10/31/TheCostOfLatency.aspx)

From Marissa Mayer while at Google:

> Marissa ran an experiment where Google increased the number of search
> results to thirty. Traffic and revenue from Google searchers in the
> experimental group dropped by 20%.

> Ouch. Why? Why, when users had asked for this, did they seem to hate it?

> After a bit of looking, Marissa explained that they found an uncontrolled
> variable. The page with 10 results took .4 seconds to generate. The page
> with 30 results took .9 seconds.

> Half a second delay caused a 20% drop in traffic. Half a second delay killed
> user satisfaction.

~~~
taeric
This feels like a situation where we found an answer, and extrapolated to say
it must be the answer.

The article does at least have plenty of examples. I just can't help but think
some of these are heavily confounded with "changes cause people to change."
That is, I'd be curious to know if any of the tests did _nothing_ other than
increase latency.

The increase from 10 to 30 results, for example, had to have changed the look
of the page. Would that alone have been enough to change behavior? My
hypothesis is that it would be. Especially if there was a viable alternative
that was still familiar to the user. Can't fathom by how much, though.

------
rjammala
Book by the author:

[http://chimera.labs.oreilly.com/books/1230000000545/index.ht...](http://chimera.labs.oreilly.com/books/1230000000545/index.html)

------
daxfohl
Should reactively download precompiled JavaScript too.

------
mkramlich
If you like this topic you might also be interested in a recent HN post of
mine, which did not get much attention, on all the various possible techniques
or patterns we can draw from in order to improve performance (e.g. lower
latency, increased throughput) or scalability:

[https://news.ycombinator.com/item?id=8665707](https://news.ycombinator.com/item?id=8665707)

------
throwaway4719
Posting from a throwaway account since I work on a competing browser.

I think Google needs to check its steps quite carefully when doing things like
this. For quite some time they have leveraged their search monopoly (think
about their EU search market share) to bring search/browser-type integration
features to Chrome first. I would say this is abusing a monopoly in one market
segment (search in the EU) to attempt to create a monopoly in another segment
(browsers in the EU) by continually making sure that Chrome is the browser
that works better than other browsers when using Google search services.

Yes, this is innovative, but there is also a concept known as antitrust law.
Another way of bringing this to the market would have been to invite competing
browsers to use this and build a credible time plan for a simultaneous launch
for all the browsers that wanted to support it.

~~~
cmelbye
I'm confused; isn't it an open standard?
[http://www.w3.org/html/wg/drafts/html/master/links.html#link...](http://www.w3.org/html/wg/drafts/html/master/links.html#link-type-prefetch)

Perhaps their implementation differs, but they even showed the JS they're
using to perform the prefetch in the post. It doesn't seem like they're trying
to hide anything.

~~~
throwaway4719
They kept the fact that they were going to deploy this on the search service
that has a monopoly on search in the EU secret until it was launched in
Chrome. Before this, link prefetching had seen very little use in the wild.

~~~
LukeB_UK
They don't have to tell anyone what they're doing. As cmelbye stated, it's an
open standard. Other browsers can implement it if they wish.

