

Google – A Study In Scalability And A Little Systems Horse Sense - abraham
https://doubleclix.wordpress.com/2010/11/11/google-a-study-in-scalability-and-a-little-systems-horse-sense/

======
elblanco
Absolutely fantastic notes, the standout ones to me:

"They have had 7 significant revisions in 11 years" - This echoes something
I've learned in my own career. You'll never ever ever ever get it right the
first time. If you spend all of your time trying to get it right the first
time, you'll never produce anything. BUILD! (but plan well to eliminate
obvious mistakes).

"All their systems work well inside a datacenter, but have no way of spanning
datacenters." - This was a real surprise. I guess I had mistakenly assumed
that all of Google's distributed resources lived as one giant meta-machine. I
never realized how disconnected the datacenters were.

"Don’t design to scale infinitely – consider 5X – 50X growth. But > 100X
requires redesign << very insightful" - quote says it all.

~~~
btilly
If you care about latency, you can't abstract away the existence of data
centers. The round trip speed of light means that that abstraction leaks too
much.
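
To put a rough number on that, here is a back-of-the-envelope sketch. It assumes light travels at about 200,000 km/s in optical fiber (roughly two thirds of its vacuum speed), and the 4,000 km distance is just an illustrative figure, not a measurement of any real link. The function name is made up for the example.

```python
# Best-case round-trip time between two data centers.
# Assumption: signal speed in fiber is about 200,000 km/s.
SPEED_IN_FIBER_KM_PER_S = 200_000


def min_round_trip_ms(distance_km: float) -> float:
    """Lower bound on round trip in milliseconds, ignoring routing and queuing."""
    one_way_seconds = distance_km / SPEED_IN_FIBER_KM_PER_S
    return 2 * one_way_seconds * 1000


if __name__ == "__main__":
    # Illustrative distance only: two sites about 4,000 km apart.
    print(f"{min_round_trip_ms(4000):.1f} ms")
```

With those assumptions a single round trip costs about 40 ms before any real work happens, and real networks will be slower than this bound.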

~~~
nostrademons
Such a system could, in theory, abstract away the existence of data centers by
_optimizing_ for latency. Wherever the user is, perform the computation by
using the copies of data that result in the lowest time spent hopping between
DCs. And shard the data between DCs so that data that's frequently used
together stays together, and data lives near its likely point of use, and
replicas give good coverage of possible access points.
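
A toy sketch of the "use the closest copy" part, just to make the idea concrete. Everything here is hypothetical: the region names, the data center names, the latency numbers, and the `nearest_replica` helper are invented for illustration and say nothing about how any real system does it.

```python
# Hypothetical measured latency (ms) from a user's region to each data center.
LATENCY_MS = {
    ("eu", "dc-eu"): 10,
    ("eu", "dc-us"): 90,
    ("eu", "dc-asia"): 160,
    ("us", "dc-eu"): 90,
    ("us", "dc-us"): 10,
    ("us", "dc-asia"): 120,
}

# Hypothetical placement: which data centers hold a replica of each item.
REPLICAS = {
    "inbox": ["dc-eu", "dc-us"],
    "profile": ["dc-us", "dc-asia"],
}


def nearest_replica(user_region: str, item: str) -> str:
    """Return the data center holding `item` with the lowest latency to the user."""
    return min(REPLICAS[item], key=lambda dc: LATENCY_MS[(user_region, dc)])


if __name__ == "__main__":
    print(nearest_replica("eu", "inbox"))    # a local copy exists
    print(nearest_replica("eu", "profile"))  # no local copy, must cross an ocean
```

The lookup itself is trivial. The hard part is deciding what goes into the placement table so that most requests find a nearby copy.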

Performing this calculation in a way that's faster than just going and
fetching the data is left as an exercise to the reader. And Jeff Dean.

------
btilly
For those who don't know, Jeff Dean is probably the most respected developer
within Google. See <http://research.google.com/people/jeff/index.html> for
some of what he has done.

~~~
moultano
Internally there's a "Jeff Dean Facts" page in the spirit of Chuck Norris
Jokes. It includes things such as "Once, in early 2002, when the index servers
went down, Jeff Dean answered user queries manually for two hours. Evals
showed a quality improvement of 5 points."

~~~
mdwrigh2
Is there any chance of this being released to the public?

...or even a chance of you sending me a copy (which of course I would keep to
myself!)

~~~
moultano
It references too many internal things for that, but as a consolation prize,
here's my second favorite:

"Jeff Dean puts his pants on one leg at a time, but if he had more than two
legs, you would see that his approach is actually O(log n)."

