Google Launches Web Hosting Disguised As Page Speed Service (searchenginewatch.com)
25 points by processing on July 29, 2011 | 12 comments

I'm half joking, but I wonder... is this just a device to run a giant beta test of SPDY?

It's more of a CDN than web hosting (like amazon cloudfront).

Nice in theory, but with so much content these days being dynamic I can't see offsite caching relying on Google's crawlers serving you quickly being all that attractive. The article mentions using them as a CDN, which could make sense. I'd like to see their proposed use cases before making any judgements one way or 'tother.

I didn't get the impression it was dependent on the normal Google crawlers and I didn't see any warning about dynamic content.

Maybe it's clever enough to work out what can be cached and grab everything else on the fly?

OK. So, on my site, a search request takes 200ms.

Search from me directly (assuming DNS is cached and no lookup is required):

   You to Me: TCP setup, request send, 200ms, response send, TCP teardown
If you get the same thing through Google, assuming they fetch it on the fly, as dynamic content would require (same DNS assumption):

   You to Google: TCP setup, request send +
   Google to Me: TCP setup, request send, 200ms, response send, TCP teardown
   Google to You: response send + TCP teardown
There's no way this could be faster. They'd have to precache everything, or cache static and take the hit for dynamic, in which case, how useful is this really?

edit: As the children of this comment suggest, and after some more consideration, there is certainly some value, and some speed to be gained--certainly with static content of all varieties.
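To put rough numbers on the two paths above, here is a minimal sketch that just counts round trips. It is a model, not a measurement: the function names and the example latencies (40 ms client to origin, 10 ms client to proxy, 35 ms proxy to origin) are made up for illustration, and TCP teardown is ignored because it does not delay the response.

```python
def direct_ms(client_origin_rtt, server_ms):
    """Time to response when the client talks to the origin directly."""
    # One RTT for the TCP handshake, one RTT for request/response,
    # plus the time the server spends on the request.
    return 2 * client_origin_rtt + server_ms


def proxied_ms(client_proxy_rtt, proxy_origin_rtt, server_ms,
               warm_origin_conn=False):
    """Time to response when the request is relayed through a proxy/CDN."""
    # The client always pays handshake + request/response to the proxy.
    client_leg = 2 * client_proxy_rtt
    # The proxy pays a handshake to the origin only if it has no
    # connection already open (hypothetical "warm" flag).
    origin_leg = (1 if warm_origin_conn else 2) * proxy_origin_rtt
    return client_leg + origin_leg + server_ms


# Assumed example latencies in milliseconds, with the 200 ms search.
print(direct_ms(40, 200))
print(proxied_ms(10, 35, 200))
print(proxied_ms(10, 35, 200, warm_origin_conn=True))
```

With these assumed numbers the cold proxied path loses to the direct one, and it only wins if the proxy-to-origin leg is cheaper than what it adds, which is what the replies go on to argue.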

> There's no way this could be faster.

Actually it can be faster due to peering. If you and I use the same ISP, then your assumption is correct but as soon as we enter the real world, the services you use often have different ISPs than yourself.

Google can easily have faster & more direct access across long distances than your local ISP.

This is true, and very shortsighted of me.

The value is in the bit where the client also has to fetch 15 JS and CSS files from you. Getting those from Google instead is very likely to be faster. Also, they may very well minify and compress them, and lump them in with other sites that use the same ones (i.e. automatically making you use http://code.google.com/apis/libraries/).

Also, if you get a prominent link to your site somewhere, this service will very likely be able to discover that all the searches for "cute kitten" return the same page, and cache the results for you, keeping your server from crashing.

Yes, sure you and I know how to make a site fast by employing these techniques, but (a) not everyone does (b) why should we spend time on this if Google can do it for us?
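The "cute kitten" case is essentially a short-lived cache keyed on the query. A minimal sketch of the idea, assuming a fixed time-to-live. `TTLCache` and `get_or_compute` are hypothetical names, and nothing here claims to be how Google's service actually does it.

```python
import time


class TTLCache:
    """Keep computed results for a fixed number of seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock          # injectable so it can be tested
        self._store = {}            # key -> (stored_at, value)

    def get_or_compute(self, key, compute):
        now = self.clock()
        hit = self._store.get(key)
        if hit is not None and now - hit[0] < self.ttl:
            return hit[1]           # still fresh: skip the backend
        value = compute()           # miss or expired: hit the backend
        self._store[key] = (now, value)
        return value


cache = TTLCache(ttl_seconds=5)
backend_calls = []


def search_backend():
    backend_calls.append("cute kitten")
    return "results for cute kitten"


# A burst of identical searches only reaches the backend once.
for _ in range(1000):
    cache.get_or_compute("cute kitten", search_backend)
```

Even a TTL of a few seconds turns a burst of identical requests into a single backend hit.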

I can already use a shared jQuery or other library, I don't need Page Speed for that, though if Google can discover that the library I have is actually exactly the same as one it hosts on its own CDN, then of course it can rewrite my page to make use of the CDN one.

My javascript on the other hand is (likely) not useful to someone else's site, and therefore that example isn't a very effective use.

I just remembered that Page Speed also has an image optimizer, which could be very effective here.

The search example obviously has some flaws, because yes, caching is likely to be effective at least for a few seconds (think Twitter search). What happens in the case of protected resources where a session cookie or XSRF token makes the page uncacheable?

> There's no way this could be faster.

Incorrect, mostly.

Your website is likely hosted on one or two servers in one spot on the Internet. To users physically local to those servers, yes, direct access would be faster, but what about people out of state? What about those on a completely different continent?

Most of Google's services are hosted in several distinct locations, and their DNS setup makes sure that any given user gets sent to the one closest to them.

Even without DNS to help by handing out different IP addresses for the same service depending on the location you are calling from, they have an anycast routing arrangement whereby even when referencing some things by IP address directly you will get a local service of which there are several. For instance if I ping one of their public DNS servers from my home connection (in the UK) I get:

  64 bytes from icmp_seq=1 ttl=56 time=30.6 ms
  64 bytes from icmp_seq=2 ttl=56 time=29.2 ms
  64 bytes from icmp_seq=3 ttl=56 time=28.9 ms
  64 bytes from icmp_seq=4 ttl=56 time=38.4 ms
  64 bytes from icmp_seq=5 ttl=56 time=36.5 ms
and when doing the same from a VM running state-side I get:

  64 bytes from icmp_seq=1 ttl=56 time=12.3 ms
  64 bytes from icmp_seq=2 ttl=56 time=12.3 ms
  64 bytes from icmp_seq=3 ttl=56 time=12.4 ms
  64 bytes from icmp_seq=4 ttl=56 time=12.3 ms
  64 bytes from icmp_seq=5 ttl=56 time=12.3 ms
It is simply not possible, unless packets from/to one of those locations are travelling faster than light speed and/or nipping back in time during their journey, for these two locations to have been talking to the same server(s) despite using a specific address not a name.
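The argument can be checked with a back-of-the-envelope bound: if both clients were reaching the same machine, they could be no farther apart than the sum of the distances their packets can cover in half an RTT. A rough sketch, with two assumptions that are not from the ping output: signals in optical fibre travel at about 200 km per millisecond (around two thirds of c), and the two clients are something like London and New York, roughly 5,500 km apart (the VM's actual location isn't stated).

```python
FIBER_KM_PER_MS = 200.0  # assumed typical propagation speed in fibre


def max_km_from_server(rtt_ms, km_per_ms=FIBER_KM_PER_MS):
    # The signal covers the client-server distance twice per round trip.
    return km_per_ms * rtt_ms / 2


def max_km_between_clients(rtt_a_ms, rtt_b_ms, km_per_ms=FIBER_KM_PER_MS):
    # Triangle inequality: two clients of one server are at most the sum
    # of their individual distances to it apart.
    return (max_km_from_server(rtt_a_ms, km_per_ms)
            + max_km_from_server(rtt_b_ms, km_per_ms))


uk_best_ms, us_best_ms = 28.9, 12.3   # lowest RTTs from the two runs above
bound_km = max_km_between_clients(uk_best_ms, us_best_ms)
assumed_separation_km = 5500          # assumption: roughly London to New York
print(round(bound_km), bound_km < assumed_separation_km)
```

At fibre speed the bound comes out at about 4,120 km, short of the assumed separation. At the vacuum speed of light (about 300 km/ms) it would be roughly 6,200 km, so the conclusion leans on realistic propagation speeds and on the VM not being unusually close to the UK.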

As well as locality of reference, there is the benefit that your most local Google site is likely to be connected to the backbone by a faster pipe than your hosted servers (particularly for users of large ISPs who have more direct peering arrangements with Google). This comes into effect for any static content, and for dynamic content where you can tell Google's checker that the content has not changed so the last version can be used; in both cases the user does not need to wait for your server to respond in full to Google.

For truly dynamic content this is going to be slower, but for many sites, especially if you have proper cache control on your dynamic content, I would expect this service to produce a measurable speed-up on average (obviously you'd have to test your specific site to see how much difference it truly makes) and it could save you a fair chunk of bandwidth if the site becomes popular.
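"Proper cache control" on dynamic content usually means the standard HTTP validators: send an `ETag` and a `Cache-Control` header, and answer a matching `If-None-Match` with `304 Not Modified` and no body. A simplified sketch of that logic follows. The `respond` and `make_etag` helpers are hypothetical, the comparison ignores weak validators and comma-separated lists, and whether Google's service revalidates this way is an assumption.

```python
import hashlib


def make_etag(body: bytes) -> str:
    # Strong validator derived from the content itself.
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(request_headers: dict, body: bytes, max_age: int = 60):
    """Return (status, headers, payload) for a freshly generated page."""
    etag = make_etag(body)
    headers = {
        "ETag": etag,
        "Cache-Control": f"public, max-age={max_age}",
    }
    if request_headers.get("If-None-Match") == etag:
        # The cache already holds this exact version: send no body.
        return 304, headers, b""
    return 200, headers, body


page = b"<html>search results</html>"
status, headers, payload = respond({}, page)
# A later revalidation from the cache carries the ETag back.
status2, _, payload2 = respond({"If-None-Match": headers["ETag"]}, page)
```

The origin still has to generate the page to compute the tag here, so this saves bandwidth rather than server time, unless the tag can be derived from something cheaper such as a last-modified timestamp.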

This is certainly a service I'll keep my eye on with a view to maybe making use of it for certain projects if they ever get off the ground.

Actually, for truly dynamic content, it will still be faster to serve through the CDN. I do this for my own website, and saw nearly 2x latency improvements from seriously-offsite locations like Qatar.

The reason for this is that the CDN can keep "hot" TCP connections "open" to your backend origin servers.

Normally, when the user connects to your website, best case, he/she will have to send a SYN, receive a SYN/ACK, send their request in their final ACK (hopefully it fits), and then get their response. That's two round trips.

Then, if the file is "large", the TCP window will start out at some horrible default size, and will only get larger over the course of the connection being used, with successful packets being sent back over great distances and latency.

However, with a CDN, it likely already has a connection open, saving an entire round trip to your server. Meanwhile, if your response is somewhat heavy (multiple kilobytes), the old connection will already have a large window (as bandwidth between the CDN node and your origin server is likely high, even if the last mile to the user isn't), so you will get much higher performance and won't get stuck waiting for ACKs.
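The window effect is easy to see in a toy model of slow start. This sketch assumes no packet loss, a congestion window that doubles every round trip, and no receiver-window cap. The initial window of 3 segments for a cold connection and 64 for a warm one are illustrative assumptions, as is the 1460-byte segment size.

```python
def round_trips_to_send(total_bytes, initial_segments, mss=1460):
    """Round trips needed to deliver a response under idealised slow start."""
    segments_left = -(-total_bytes // mss)   # ceiling division
    cwnd = initial_segments
    rounds = 0
    while segments_left > 0:
        segments_left -= cwnd   # send one window's worth, wait for ACKs
        cwnd *= 2               # slow start: window doubles each RTT
        rounds += 1
    return rounds


response_bytes = 100 * 1024
cold = round_trips_to_send(response_bytes, initial_segments=3)
warm = round_trips_to_send(response_bytes, initial_segments=64)
print(cold, warm)
```

For a 100 KB response this model gives 5 round trips on the cold connection and 2 on the warm one, before even counting the handshake the warm connection skips. The gap matters most when each of those round trips crosses an ocean.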

In fact, once you start thinking at the TCP level, you realize all sorts of things, like: even if you don't have existing reusable connections, the window will still warm up faster for the CDN (due to their connection being slightly lower latency and likely more stable), yielding greater total bandwidth; and, even more hilariously, even if there is no "last mile effect", having a server halfway between two servers will improve their performance due to these same windowing issues.

I'd not thought of persistent connections and window size effects - both good points especially when the receiving end is either far away or unreliable. It could save other resources too: if you deliver the content to the CDN in 0.1s and it takes them a further 0.4s to get it to the other end, you don't have to care about the 0.4s during which your server might have otherwise been holding a PHP process (or something else similarly heavy) open.

Another thing that will help long distance if you use a large enough, well managed CDN is the interconnectedness of the CDN's nodes. Google probably has a much better pipe between their point of presence here and the one nearest Qatar (to use your example), so much so that "my server -> them -> then again through their fat pipe -> user in Qatar" may well be faster than "me -> user in Qatar via other ISP and international peering arrangements".
