
How Facebook keeps 100 petabytes of Hadoop data online
http://gigaom.com/cloud/how-facebook-keeps-100-petabytes-of-hadoop-data-online/
======
mpim
Probably better to link to the original article instead of a summary:
[https://www.facebook.com/notes/facebook-engineering/n/101508...](https://www.facebook.com/notes/facebook-engineering/n/10150888759153920)

------
ajays
It's not 100PB of data; it's 100PB of physical disk space. Given the
(standard) 3x replication and filesystem overhead, you're looking at about
30PB of actual data. Still a ginormous amount, for sure.
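The raw-vs-logical arithmetic above is easy to sketch (illustrative numbers only; 3 is HDFS's default replication factor, and real filesystem/metadata overhead varies):

```python
# Back-of-envelope: raw disk capacity vs. logical data under HDFS replication.
raw_pb = 100        # physical disk capacity, in petabytes
replication = 3     # each HDFS block is stored on 3 datanodes by default

logical_pb = raw_pb / replication
print(round(logical_pb, 1))  # ~33.3 PB before filesystem/metadata overhead,
                             # hence the "about 30PB" figure after overhead
```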

------
dude_abides
If only the secondary name node did what one would assume a secondary name
node is supposed to do, Facebook wouldn't have needed to invent avatar nodes,
and even Hitler wouldn't have gotten so upset and fired that intern:
<http://www.youtube.com/watch?v=hEqQMLSXQlY>

------
matan_a
This doesn't actually discuss _how_ Facebook manages and maintains its 100PB
of storage; it just talks a bit about the namenode SPoF issue.
Disappointing article, for sure.

------
halayli
This gigaom article is horrible.

