
Paper: Netflix’s Transition to High-Availability Storage Systems - blasdel
http://highscalability.com/blog/2010/10/22/paper-netflixs-transition-to-high-availability-storage-syste.html
======
moe
AWS infomercial of the most boring kind - not a single figure in the entire
thing.

I'm starting to suspect big customers get a discount in exchange for putting
their name on the dotted line of these PR Lorem Ipsums.

~~~
toddh
I guess I'll have to ask Sid to add some cartoons next time to liven it up a
little.

~~~
moe
I meant figure as in: numbers.

They (you?) switched from an expensive platform to another expensive platform.
In an article about a transition I hope to find hard data about the
performance- and cost-tradeoffs that were made.

How many hosts, how much data, how much traffic, how many queries, how much
latency, how many dollars?

Instead I got a generic "AWS is awesome" boilerplate along with a generic AWS
best-practice bullet-list...

FWIW, here's a not-so-old chart from Backblaze about relative storage cost:
[http://blog.backblaze.com/wp-content/uploads/2009/08/cost-of...](http://blog.backblaze.com/wp-content/uploads/2009/08/cost-of-a-petabyte-chart.jpg)

S3 ranges way up there and knowing _that_ figure first-hand (and also their
traffic rate) I sort of expected cost considerations to be at least
_mentioned_ in a document titled "XXX transition to ... storage-system".

~~~
terra_t
I did the math for my own applications and found that Amazon's system is
competitive pricewise with something I can set up in a data center that has
acceptable reliability (RAID1 + "enterprise backup" that may or may not pull
through in a pinch).

If you consider that a home-rolled solution is not elastic to demand and
therefore I'd probably wind up with a capacity factor < 50%, it's a wash
pricewise. Meanwhile, I know the AMZN-based system is going to make me sleep
easy when it comes to business continuity nightmares.
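
To make the capacity-factor arithmetic above concrete, here is a minimal sketch in Python. The function name and both prices are hypothetical placeholders chosen for illustration, not figures from this thread, Netflix, or Amazon. It only shows that dividing the cost of provisioned capacity by the fraction actually in use can erase a nominal price advantage.

```python
def cost_per_used_unit(cost_per_provisioned_unit: float,
                       capacity_factor: float) -> float:
    """Cost per unit of capacity actually in use.

    capacity_factor is the fraction of provisioned capacity that is used,
    so idle capacity inflates the effective price.
    """
    if not 0 < capacity_factor <= 1:
        raise ValueError("capacity_factor must be in (0, 1]")
    return cost_per_provisioned_unit / capacity_factor


# Hypothetical placeholder prices in arbitrary units (not real quotes).
self_hosted_provisioned = 0.5  # assumed: half the elastic price per provisioned unit
elastic_price = 1.0            # assumed: elastic service bills only for what is used

# At a 50% capacity factor the self-hosted advantage disappears.
self_hosted_effective = cost_per_used_unit(self_hosted_provisioned, 0.5)
print(self_hosted_effective, elastic_price)  # prints: 1.0 1.0
```

With these assumed inputs the two options cost the same per used unit, which is the "wash" described above. Any capacity factor below 50% would tip the comparison toward the elastic service.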

------
sunjain
I was wondering how much effort it took them to implement these practices. In
a typical Oracle-based system, some features are invariably used (sequences,
triggers), even if you assume that the application was already written without
any stored procedures or PL/SQL (which is itself rare for a system implemented
in Oracle). Some of these I would not call best practices: for example, using
natural keys as primary keys; I'm not sure that is a good idea. And the kind
of replication he is talking about is not easy to set up. One of the reasons
he gave for this migration was that they wanted to save their engineering
resources to focus on product-related work, but the migration itself must have
cost them significant resources to get done.

------
dhess
My Google Docs fu is weak. Anyone know how to generate a PDF of the paper
linked in the article?

~~~
ktsmith
Someone has already made it available as a PDF.

blog post: [http://irr.posterous.com/netflixs-transition-to-high-availab...](http://irr.posterous.com/netflixs-transition-to-high-availability-stor#)

direct download link:
[http://s3.amazonaws.com/files.posterous.com/irr/1OaUyyUq9eF2...](http://s3.amazonaws.com/files.posterous.com/irr/1OaUyyUq9eF2swicItinLGfYyl4T6syE9wGnI5O2rH4mmZc7VhruNvhEM3tM/NetflixsTransitiontoaKey_v3.pdf?AWSAccessKeyId=1C9REJR1EMRZ83Q7QRG2&Expires=1287793669&Signature=qtnZk7/Enly/bA7xuOs%2Bh9tJzfI%3D)

