
Compute Engine Load Balancing hits 1 million requests per second - aritraghosh007
http://googlecloudplatform.blogspot.com/2013/11/compute-engine-load-balancing-hits-1-million-requests-per-second.html
======
seiji
This post made the rounds recently when reporters got a hold of it. "Millions!
Millions of googles! Compute!"

It's basically meaningless. They start off with C10k which is about handling
10k concurrent clients on _one machine,_ then they go on to say "a million
requests on our _cloud_!" (which they later mention is a 64-server virtual
load balancer cluster pointing to 200 backend virtual thingamajiggers)

But, like, Eurovision. Totally.

While I'm yelling at them: your problem isn't that nobody knows about your
performance. Your problem is that nobody knows what Google Compute Engine is. Is it
an EC2 competitor? Is it one of those things you have where you give it data,
a query, and you crunch the numbers for the client?

Google means well, but here they suffer from their typical "we know it, so why
don't you know it?" style without much useful intro material (which tends to
get written by fans, which Google Compute Load Balancer Data Ingestion doesn't
have).

</rant>

------
jheriko
I never really understand the challenges of web scalability - the hard problem
I see is bandwidth. Distributing work, other than the lag in communication
maybe? Being clever about bandwidth usage in some special ways?

Why isn't it just bandwidth limited? Throwing data around, aside from crossing
the network layer, is extremely fast and easy... surely distributing the load
is even a secondary problem to being efficient with the bandwidth, since
efficient load distribution seems also like it should be network bandwidth
(not, e.g., clock cycle, memory, disk access...) limited.

1 million anythings a second is a pretty low number in many computing
environments - this is much less than (about two orders of magnitude) the
pixel throughput on the monitor you are reading this on, as a trivial example.

~~~
lukev
Well, let's see...

In the case of my monitor, the software sets a single byte in a direct memory
array which is then sent directly to the monitor via a direct, dedicated
connection, for a total distance of about 3 feet.

An HTTP request is usually closer to a _kilobyte_. So there's your two orders
of magnitude, right there. Then add in the fact that the communication
channels are miles long, routed through dozens of separate hardware modules,
shared not only with other traffic but with other protocols, subject to
synchronization and concurrency issues, etc...

Really, the situations aren't comparable at all.

Not to say networking couldn't be faster. But it's a very, very different
problem.
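
For what it's worth, here is a rough back-of-envelope sketch of that comparison in Python. The 1920x1080 display at 60 Hz is an assumption (no resolution or refresh rate is given above); the one byte per pixel and roughly one kilobyte per request are the figures used in this comment.

```python
# Back-of-envelope comparison of monitor pixel traffic vs. HTTP request traffic.
# The display parameters are assumptions, not figures from the thread.
WIDTH, HEIGHT, REFRESH_HZ = 1920, 1080, 60   # assumed display
BYTES_PER_PIXEL = 1                          # "a single byte", as stated above
REQUESTS_PER_SEC = 1_000_000                 # the headline number
BYTES_PER_REQUEST = 1_000                    # "closer to a kilobyte"

pixels_per_sec = WIDTH * HEIGHT * REFRESH_HZ
pixel_bytes_per_sec = pixels_per_sec * BYTES_PER_PIXEL
request_bytes_per_sec = REQUESTS_PER_SEC * BYTES_PER_REQUEST

# How many pixels are updated for every request served.
count_ratio = pixels_per_sec / REQUESTS_PER_SEC
# How many request bytes move for every pixel byte.
byte_ratio = request_bytes_per_sec / pixel_bytes_per_sec
# Request traffic expressed in gigabits per second.
request_gbit_per_sec = request_bytes_per_sec * 8 / 1e9

print(f"pixels/s vs requests/s: {count_ratio:.0f}x")
print(f"request bytes vs pixel bytes: {byte_ratio:.1f}x")
print(f"request traffic: {request_gbit_per_sec:.1f} Gbit/s")
```

Under these assumptions the pixel count is about 124 times the request count, but the request traffic carries about 8 times as many bytes, roughly 8 Gbit/s, before any of the routing and sharing overhead is counted.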

------
martinml
Previously on HN... :)
[https://news.ycombinator.com/item?id=6804897](https://news.ycombinator.com/item?id=6804897)

------
MrBuddyCasino
This post is a little light on details. They mention they're not using DNS to
distribute the requests, but use a single IP instead - how is that possible?

Also, they boast that it cost only $10 - for how long was this thing running?

~~~
X-Istence
Anycast. If you announce the same IP from multiple networks your request will
be routed to the nearest one automatically.

It is how Google is also able to provide Google DNS on just two IP addresses
while having tens if not hundreds of physical locations those requests get
routed to.
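
A toy model of that idea in Python, as a sketch only: in a real network the choice is made by route selection in the routers (BGP), not by application code. The site names, client names, and path costs below are hypothetical, and 203.0.113.10 is an address from the documentation range.

```python
# Toy model of anycast: several sites announce the same IP address, and each
# client's traffic ends up at whichever site is "nearest" in routing terms.
# All names and costs here are hypothetical illustration values.
ANYCAST_IP = "203.0.113.10"  # documentation address range, not a real service

# Every site announces the same address.
announcements = {
    "site-eu": ANYCAST_IP,
    "site-us": ANYCAST_IP,
    "site-asia": ANYCAST_IP,
}

# Assumed routing cost (e.g. hop count) from each client to each site.
path_cost = {
    "client-berlin": {"site-eu": 2, "site-us": 7, "site-asia": 9},
    "client-chicago": {"site-eu": 6, "site-us": 1, "site-asia": 8},
    "client-tokyo": {"site-eu": 9, "site-us": 7, "site-asia": 2},
}


def route(client: str) -> str:
    """Return the site that receives this client's packets for ANYCAST_IP."""
    costs = path_cost[client]
    # The network delivers to the announcing site with the lowest path cost.
    return min(costs, key=costs.get)


for client in path_cost:
    site = route(client)
    print(f"{client} -> {announcements[site]} served by {site}")
```

Every client uses the same destination address, yet each one lands on a different site, which is the property being described.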

~~~
mritun
This is different. What they mentioned is a load balancer, like F5, NetScaler,
etc. HAProxy is a software-based load balancer, but theirs does a lot more.

------
tiagobraw
How are they able to route >10M packets back and forth with a single IP?

~~~
chrisfarms
Single IP != Single Machine.

There can be multiple paths that all end with the same address.

[http://en.wikipedia.org/wiki/Anycast](http://en.wikipedia.org/wiki/Anycast)

