
Scaling the Facebook data warehouse to 300 PB
https://code.facebook.com/posts/229861827208629/scaling-the-facebook-data-warehouse-to-300-pb/
======
Skywing
Based on my current connections and experience, I can't fully see myself
ever getting the chance to do so, but I'd love to have the opportunity to
work on similar problems - at Facebook or anywhere. I've always loved
working on lower-level problems such as this. I learned to program by
writing computer game hacks, reverse engineering games and coding in
C/ASM. Currently, I find myself writing C# every day for a small company
at which nobody understands a single word I say about programming.

~~~
rdl
Yeah, it's interesting how Facebook does something which on its face seems
trivial and unimportant, but which, due to scale, poses some really amazing
engineering and infrastructure challenges -- and they solve them by
building tech that makes things scale on commodity systems (as Google et
al. have done), vs. buying third-party big-iron solutions (which is what
eBay did when they had similar challenges, and what most big companies do).

~~~
salemh
Do you think it's a cost/control trade-off to not buy big iron (outsourcing
to a degree) vs. building/scaling it themselves? I'd like to hear some HN
thoughts if people don't mind sharing.

~~~
apaprocki
"Big iron" isn't all it's cracked up to be. Everything is a trade-off. Very
few people are doing pure computation and that is where those machines excel
(in addition to lots of aggregate I/O). The government research labs and the
like get a lot of use from these machines.

If you're trying to scale an Internet-style app on one of these machines,
you might need to expand past one machine after a while. By staying on one
machine, you're avoiding all the complexity needed in your software to
coordinate between multiple machines. If you lose the ability to fit on a
single box, you'll need to add that complexity anyway. So what do 10 beefy
boxes buy you as opposed to 1000 smaller ones? There is of course an
operational/DC/power cost involved with more boxes, but I think most shops
consider that an easily solvable problem. For example, a maxed-out POWER7
box from IBM will give you 256 processors and all the memory and I/O
trimmings you need. If you need more than 256 processors or more than the
local amount of RAM, you'll pay the software complexity cost anyway.
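
To give a concrete flavor of that "software complexity cost": once data no
longer fits on one box, even a simple key lookup needs a routing layer to
decide which machine owns the key. A minimal hash-sharding sketch (the
function name and key format are hypothetical, just for illustration):

```python
import hashlib

def shard_for(key: str, num_shards: int) -> int:
    """Map a key to one of num_shards machines via a stable hash."""
    digest = hashlib.md5(key.encode()).hexdigest()
    return int(digest, 16) % num_shards

# On one big box this whole layer disappears: every key is local.
# With 1000 small boxes, every read/write first answers "which machine?",
# and resharding when you add boxes is yet more complexity on top.
print(shard_for("user:12345", 1000))
```

This is the trivial part; the hard parts the comment alludes to (failure
handling, rebalancing, cross-shard transactions) all sit on top of it.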

~~~
jbangert
Well, the 10 beefy boxes will be much, much faster if your problem is not
very distributable. Facebook as an application shards very easily, because
most users don't interact much with each other. Other applications might
have many more interactions.

What you're really paying for when buying a 256-processor POWER7 box is
the fact that the interconnect (and therefore the time to acquire a lock
or update data from another node) is much faster and more reliable than
commodity networks/kernels/stacks.
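
A rough way to see the gap: time an in-process lock and compare it against
a typical commodity-network round trip. The ~100 us RTT below is an
assumed ballpark figure, not a measurement; only the local lock is
actually timed.

```python
import threading
import time

lock = threading.Lock()
N = 100_000

# Time N uncontended acquire/release cycles of an in-process lock.
start = time.perf_counter()
for _ in range(N):
    lock.acquire()
    lock.release()
local_ns = (time.perf_counter() - start) / N * 1e9

# Assumed ballpark for one round trip over commodity Ethernet (~100 us).
assumed_network_rtt_ns = 100_000

print(f"local lock: ~{local_ns:.0f} ns/op, "
      f"network RTT: ~{assumed_network_rtt_ns} ns")
```

Even in interpreted Python the local lock is orders of magnitude cheaper
than the network hop, which is the gap a fast interconnect narrows.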

~~~
srean
Depends on what you are programming in. If it's a language far removed
from the machine, your mileage may vary.

I had the opportunity to try out Google's implementation of MapReduce,
written in C++, way back in time (6 years ago). It ran on fairly
impoverished, essentially laptop-grade processors. I have done work on
Yahoo's Hadoop setup as well, which used high-end multicore machines
provisioned with oodles of RAM (I don't think I should share more than
that). If I were to be generous, Hadoop ran 4 times slower as measured by
wall-clock time. Not only that, Hadoop required about 4 times more memory
for similar-sized jobs. So you ended up requiring more RAM, running for
longer and potentially burning more electricity. This is by no means a
benchmark or anything like that, just an anecdote.
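
For readers who haven't used either system: Google's C++ MapReduce and
Hadoop implement the same two-phase programming model. A toy,
single-process word count in Python sketches the model (the real systems
distribute the map and reduce phases across many machines and shuffle the
intermediate pairs between them):

```python
from collections import defaultdict
from itertools import chain

def map_phase(doc: str):
    # Map: emit a (word, 1) pair for every word in one document.
    for word in doc.split():
        yield (word, 1)

def reduce_phase(pairs):
    # Reduce: group pairs by key and sum the counts for each word.
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)

docs = ["big data big iron", "big data warehouse"]
result = reduce_phase(chain.from_iterable(map_phase(d) for d in docs))
print(result["big"])  # → 3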

That Hadoop would require much more memory did not surprise me; that was
expected. What was really surprising was that it was so much slower. The
JVM is one of the best-optimized virtual machines out there, but its view
of the processor is very antiquated and it does not surface those
hardware-level advances to the programmer. You pay for a hot-rod machine
but run it like an old faithful Crown Victoria.

Four times might not seem like much (and remember, I am being generous),
but it makes a big difference when you can make multiple runs through the
data in a single day and make changes to the code/model. Debugging and
ironing out issues becomes a lot more efficient.

I think MapReduce gave Google a significant competitive advantage over the
rest, and probably still does.

------
andrewchoi
When they introduced ORCFile, I was kind of hoping that the next iteration
would be the URUKFile.

~~~
zerd
Or HUMANFile.

------
NicoJuicy
I'm actually wondering: are there any ways to play with data like this?
(e.g. downloading data from StackExchange
[http://blog.stackexchange.com/category/cc-wiki-dump/](http://blog.stackexchange.com/category/cc-wiki-dump/))

Any other ways? I don't believe there is a VM for this sort of
"experimentation".

~~~
zackangelo
[http://aws.amazon.com/datasets](http://aws.amazon.com/datasets)

------
pokstad
A lot of fancy tech for only 805MB per user (based on estimate of 500 million
users). What if we had a pure p2p web app using webrtc & local storage to
replace Facebook? Think everyone could spare 805MB?

Calculation:
[https://www.google.com/search?q=(300PB)%2F(400000000)](https://www.google.com/search?q=\(300PB\)%2F\(400000000\))

~~~
kosievdmerwe
You're a bit behind; it's 1.23 billion monthly active users, or, if you
feel that's a bit disingenuous, 757 million daily active users.

Source: [http://newsroom.fb.com/company-info/](http://newsroom.fb.com/company-info/)

PS. you calculated using 400 million.

~~~
e12e
300 PB / 10^9 users ≈ 300 × 10^15 B / 10^9 ≈ 300 × 10^6 B ≈ 300 MB/user.

Considering my latest fb backup was ~18MB (unzipped), of which 14MB was
pictures, this doesn't sound too unreasonable to me. If anything, it
sounds very conservative. If I were actively using fb for photos, I'd
easily have at least 100 times as many, maybe 200 times as many. Not to
mention that the single (short) video I've uploaded is 1.4 MB.
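
The per-user arithmetic in this subthread, redone in Python (decimal
petabytes assumed, since the article doesn't say whether PB means 10^15 or
2^50 bytes; the 400M figure is the one used upthread, 1.23B is the monthly
actives figure quoted above):

```python
PB = 10**15  # decimal petabyte, an assumption

def mb_per_user(total_bytes: float, users: float) -> float:
    """Warehouse bytes divided evenly across users, in decimal MB."""
    return total_bytes / users / 10**6

warehouse = 300 * PB
for users in (4.0e8, 1.23e9):
    print(f"{users:.2e} users -> {mb_per_user(warehouse, users):.0f} MB/user")
```

With 400M users that's 750 MB/user; with 1.23B monthly actives it drops to
about 244 MB/user, in line with the ~300 MB estimate above.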

------
kakoni
I wonder how Vertica links to this.
([http://www.vertica.com/2013/12/12/welcoming-facebook-to-the-...](http://www.vertica.com/2013/12/12/welcoming-facebook-to-the-growing-family-of-hp-vertica-customers))

------
ForHackernews
Maybe they should just delete something once in a while. How about actually
deleting data for users that close their account?

------
oregon_engineer
Where's the power coming from to scale like that? If Facebook is relying on
power from the Columbia River for their data centers, they better talk to the
tribes and the salmon. Those dams are failing and many parties in the
Northwest want to see them removed.

A better bet might be for them to scale up in Utah and harvest methane from
waste to generate power.

~~~
taftster
Because Utah has a lot of methane? I'm not quite following. I'd think
methane production would be greater in a cow state like Iowa or Ohio than
in a desert state like Utah. _joke_

Utah's power grid is primarily coal-fired, but they use a fair bit of
natural (methane) gas as well. [1] Utah has a couple of major
hydroelectric dams (Glen Canyon, Flaming Gorge), and the use of solar and
wind is on the rise.

NSA is building a major data warehouse in Utah; one of the considerations
would have to have been cheap power. I'm guessing the Columbia River
produces cheaper electricity than pretty much anywhere else, but Utah has
very diverse (and affordable) power production overall.

[1] [http://www.deseretnews.com/article/700051087/Coal-mostly-pow...](http://www.deseretnews.com/article/700051087/Coal-mostly-powers-electricity-hungry-Utah.html?pg=all)

[edit] failed on humor the first time, trying again.

------
wehadfun
The things facebook does to keep people's shower thoughts readily available.

~~~
Theodores
_...not to mention our uber-lords at the NSA..._ (yawn)

Actually all tech can be criticised for banality. Think of the telephone -
_'they put in all those cables just so she can talk to mother...'_. Or
television - _'they dug up the street just so they could lay those fibre
optic cables so your gran could watch the wrestling...'_. Or even the
trains - the train stopping at my home town does seem a waste of time; I
can't believe they bother when you look at who gets on.

~~~
tedks
It's a fallacy that all technology can be criticized for banality.
Technologies are different. Some are more criticizable than others.

The telephone was the first infrastructure to provide real-time voice
communication. It enables families to stay in contact, but it also enables
economic growth and a more effective society writ large.

Television is now a mindless wasteland of race-to-the-bottom drivel, but
there are newer networks that haven't yet succumbed to it, mostly on
digital cable. I only have the respect for science and technology that I
do because I grew up watching _The Magic School Bus_ and _Bill Nye the
Science Guy_. My parents watched the moon landing on television.

Facebook does not do anything novel, nor has it ever been used for anything
terrifically insightful. It provides some social value and exists for that
reason, but it is clearly not equivalent to all other technologies.

~~~
nemothekid
I don't quite get it. Being able to transmit images is "novel", but being
able to query more information than has ever been available before in the
history of the human race is not?

You should really separate the _application_ of the technology from the
technology itself. Your last sentence could be said in exactly the same
way about the telephone ("It provides some social value and exists for
that reason"), but that is clearly a ridiculous statement to make about
the telephone.

------
siculars
And the moat keeps getting larger. Good for them. Yes, they are kind enough to
open source this, but how advanced is the tech that they don't? And how much
time have they enjoyed as the first to benefit from this impressive tech?
Again, good for them.

