

The bugfix that could make the Internet 5% faster - dudus
http://eduardo.cereto.net/the-bugfix-that-could-make-the-internet-faster

======
jgrahamc
I don't think this argument stands up to technical examination. His claim is
based on this:

1\. The average size of HTTP request headers is 700 bytes.
2\. Of that, around 25% is Google Analytics.
3\. About 50% of web sites use GA.

Thus you could save 12% of HTTP request header size across the Internet.
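For what it's worth, the quoted numbers do multiply out to roughly that figure. A quick sketch, where the inputs are the article's claims, not measurements:

```javascript
// The article's claimed inputs (assumptions, not measurements):
const avgHeaderBytes = 700;  // average HTTP request header size
const gaShare = 0.25;        // fraction of that attributed to GA cookies
const sitesWithGA = 0.5;     // fraction of sites running GA

// If both fractions held uniformly across all requests:
const savings = gaShare * sitesWithGA;            // 0.125, i.e. ~12%
const bytesPerRequest = avgHeaderBytes * gaShare; // 175 bytes on GA sites
console.log((savings * 100) + "%", bytesPerRequest + " bytes");
```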

Then he posits that this would make the net 5% faster. How?

If the average size is really 700 bytes then they will fit inside a single TCP
packet and so changing it from 700 to 525 will make no difference at all. The
entire request will be in a packet. The real cost is multiple packets
requiring ACKs and that occurs with the larger header sizes that SPDY
specifically targets with HTTP header compression. I've seen many egregious
cookies, but the GA ones are tiny.

If you really want to speed up the web you need something like SPDY plus SDCH
to do data-aware compression of repeated chunks of HTML.
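The single-packet argument is easy to sketch numerically; the MTU and protocol header sizes below are typical values I'm assuming, not figures from the article:

```javascript
// With a standard 1500-byte Ethernet MTU, the TCP payload per packet is
// about 1460 bytes (1500 minus ~20 bytes IPv4 and ~20 bytes TCP, no options).
const MTU = 1500;
const ipTcpOverhead = 40;
const payloadPerPacket = MTU - ipTcpOverhead; // 1460 bytes

const packetsNeeded = bytes => Math.ceil(bytes / payloadPerPacket);

console.log(packetsNeeded(700));  // 1 -- the article's average request
console.log(packetsNeeded(525));  // 1 -- same packet count after the "fix"
console.log(packetsNeeded(1600)); // 2 -- the case SPDY header compression targets
```

Either way the request costs one packet and one ACK round trip, which is why shaving 175 bytes while staying under the MTU buys essentially nothing.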

~~~
ajross
Packet sizes make very little difference on an unsaturated line, which is
often the case on the client side. It's very often _not_ the case on the
server side, or on mobile networks. There, it's not at all uncommon to see
line saturation at peak usage, and under those circumstances pure bandwidth
optimizations can be helpful.

~~~
amalcon
The server's uplink is certain to be full-duplex and outbound-heavy, so
inbound bandwidth is effectively unconstrained. You could still run into line
congestion somewhere in the middle, in which case bandwidth optimizations
would help, but router congestion is much more common.

Except in mobile networks, as you note.

~~~
jmileham
I'd guess that the mean request size is misleading though - it is presumably
the average of a mix of lightweight CDN requests and gargantuan app requests
loaded up with the egregious cookies jgrahamc mentions. It's not hard to
imagine a healthy number of requests hovering right at the brink of breaking
into two packets with a 1500 byte MTU.

------
ck2
Or block GA entirely in your browser for 100% speed recovery and zero tracking.

But seriously, GA will still work if you add DEFER and make it load only after
page load. Then there is zero impact on the page.
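A minimal sketch of that deferred load, assuming GA's classic ga.js loader. The function wrapper and the document parameter are mine, for illustration; the protocol check mirrors GA's own async snippet of the era:

```javascript
// Inject ga.js only after the "load" event fires, so the tracker never
// competes with page resources during rendering.
function loadGaDeferred(doc) {
  var ga = doc.createElement("script");
  ga.src = ("https:" === doc.location.protocol ? "https://ssl" : "http://www")
         + ".google-analytics.com/ga.js";
  ga.defer = true;                // hint: don't block parsing
  doc.body.appendChild(ga);
  return ga.src;                  // returned only to make the sketch checkable
}

// In a real page, run it strictly after page load:
if (typeof window !== "undefined") {
  window.addEventListener("load", function () { loadGaDeferred(document); });
}
```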

~~~
sp332
But then GA might not record that someone loaded the page and clicked a link
to navigate away, before the page was done loading.

~~~
dextorious
So what? Why should you count this case?

------
aquark
The speed of the request is probably not linear with the size either.

At 700-800 bytes for an average HTTP request you are below the single packet
size. Given the overhead of processing the packet the extra bytes are pretty
cheap: a 1 byte packet is not going to be 1000% faster than a 1K packet.

Now, if the cookie pushed the average request size over the packet boundary
then the cost would be huge.

~~~
mmmooo
Exactly. And even if it were 5%, which it by no means is, what fraction of the
total time is the request versus the response? This sounds like a solution to
a non-problem.

~~~
jqueryin
We must also consider the relative cost of Google implementing fully cross-
browser compatible localStorage code in ga.js, with fallbacks to cookies.
They're likely to prefer putting the burden on the client rather than on their
own bandwidth.

------
loup-vaillant
Please, I beg you, do not call the web "the internet". The web is but a subset
of the internet. Yes, it is taking over the rest of the internet (web-mail and
such). Yes, client-side HTTP is often the only part of the internet you can
have from many networks. But no, that is not a good thing.

But because of sloppy wording, non-technical people get the idea that if they
can query web servers, then "they have the internet". Most restrictions go
unnoticed.

I know it may sound pedantic. It may even remind you of Stallman's
_GNU_/Linux. But really, it can't hurt to say "web" instead of "internet":
unlike "GNU/Linux", it is _shorter_ than the incorrect word.

------
amalcon
_The moment you notice GA is present in about 50% of top websites you notice
that useless GA cookies going around the internet represent 12% of all HTTP
requests._

This is incorrect. "GA is present in about 50% of top websites" does not mean
"50% of HTTP requests are to the domain that carries the cookies." Many
websites load images and other static assets from external domains; see
ytimg.com, for example.

~~~
dudus
Of course, the websites that use cookieless domains for resources reduce this
issue by an order of magnitude, but it still exists for at least the HTML
document request. Besides, if you consider that 50% of the traffic of the top
million sites probably represents 70%-80% of total internet traffic, the real
gain may be even larger.

~~~
amalcon
1) How can half the traffic on a subset ever represent more than half of the
traffic on the superset? The 12% number assumes that these top million sites
have effectively all global traffic; in practice it will only ever be less
than that. 2) These top million sites are exactly the sort that are likely to
serve static content from a different hostname.

------
jmileham
"We're talking about the average speed of the whole internet"

5% fewer bytes in the request headers would seem to be well within the noise
in terms of user experience when you consider latency and application runtime.

This is probably a good argument for using a secondary hostname for assets
(like facebook, with fbcdn.net), though, if you want to shave the last few
bytes off of requests you don't need to track.

~~~
SquareWheel
I never understood that. Why would moving assets to another domain name speed
things up? Wouldn't it involve another DNS lookup, thus adding to download
time?

edit: I remember reading about this on Yahoo's speed guide, actually:
[http://developer.yahoo.com/performance/rules.html#cookie_fre...](http://developer.yahoo.com/performance/rules.html#cookie_free)

Still don't quite get it though. Cookie data is always passed in HTTP requests
for some reason?

~~~
jmileham
Yeah, there's an initial hit, but only one DNS lookup per session-or-so if
you've got your TTLs right. You're likely to hit hundreds (if not thousands)
of assets in a typical Facebook session, though.

Edit: Actually, in Facebook's case, Akamai has a 20 second TTL. But it's not
difficult to perform ~1k asset requests in that amount of time... just scroll
through a large friend list.
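Back-of-the-envelope, the trade looks like this (the 175-byte figure is the article's per-request GA overhead; the session numbers are rough ones from this thread):

```javascript
// Cost: roughly one extra DNS lookup per session, given sane TTLs.
// Benefit: no cookie bytes on any request to the cookie-free asset domain.
const cookieBytesPerRequest = 175;    // article's GA cookie overhead estimate
const assetRequestsPerSession = 1000; // ~1k assets in a heavy session

const bytesSavedPerSession = cookieBytesPerRequest * assetRequestsPerSession;
console.log(bytesSavedPerSession); // 175000 bytes of upstream headers saved
```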

------
gte910h
This article presupposes that a vast quantity of the internet's data consists
of tracked, basic HTTP requests.

Huge quantities of the traffic on the internet are video data being slung
about (which has nowhere near the penetration of GA the poster is assuming).
Cisco's estimate is that video will be >50% of all traffic in 2012.

Source:
[http://www.cisco.com/en/US/solutions/collateral/ns341/ns525/...](http://www.cisco.com/en/US/solutions/collateral/ns341/ns525/ns537/ns705/ns827/white_paper_c11-481360_ns827_Networking_Solutions_White_Paper.html)

------
elliottcarlson
Forgive me if I am missing something, but wouldn't moving the cookies to
localStorage still require an additional call to send the data to the server
in the end? Not to mention the additional JavaScript required to handle the
localStorage management and Ajax calls.

~~~
dudus
The idea is just to get rid of the extra overhead on the HTTP request, because
that overhead is multiplied by every request you make. Any extra code in ga.js
is cached and doesn't incur much overhead.

~~~
elliottcarlson
But you would still have an additional HTTP request sending the same
information plus headers in another call. In the end it would be the same,
performance-wise, as DEFERring the request, and you wouldn't be saving any
overhead on the data passed around.

------
raghavsethi
That's not a bug at all, it's a feature.

How is Google supposed to track repeat visitors without any kind of state
except IP? And IP is an awful variable for state because of NAT.

The heading is misleading and overly sensational.

~~~
dudus
The idea is not to get rid of the data that is stored in the cookie, but
instead to move it into localStorage: still accessible through JS, but it
doesn't go into the HTTP request.
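A sketch of what that would look like, with made-up key names and values. The in-memory fallback just lets the snippet run outside a browser; real code would use the browser's localStorage directly:

```javascript
// Cookies ride along on every HTTP request to the domain; Web Storage
// stays client-side until script explicitly sends it.
const storage = (typeof localStorage !== "undefined")
  ? localStorage
  : { _m: {}, // tiny in-memory stand-in for non-browser environments
      setItem(k, v) { this._m[k] = String(v); },
      getItem(k) { return (k in this._m) ? this._m[k] : null; } };

// Instead of a __utma-style cookie, keep the visitor ID client-side
// (the value below is illustrative):
storage.setItem("__utma", "173272373.1091725054.1294949046");

// ...and attach it explicitly only when the tracker phones home,
// e.g. as a query parameter on the tracking beacon:
const beacon = "/__utm.gif?uid=" + encodeURIComponent(storage.getItem("__utma"));
console.log(beacon);
```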

~~~
TylerE
But the data still needs to go to the Google Analytics server to actually do
the tracking... in an HTTP request.

~~~
AUmrysh
Yes, but you wouldn't have to send the cookie to Google's servers on every
page load. This is a great use of localStorage or sessionStorage in HTML5.
Another obvious benefit of sessionStorage specifically is that you can store
your session ID there and not have to worry about logging in from a new tab
causing a session to leak from one tab to the other.

Cookies are shared between browser sessions, but sessionStorage is not.
localStorage is the same way except it persists over browser sessions and
between tabs.

