

How Facebook stores its billions of photos - anirudh
http://www.facebook.com/note.php?note_id=76191543919

======
thehigherlife
That was much more exhaustive than I was anticipating. I'm curious how this
stacks up to sites like Flickr.

~~~
viksit
@thehigherlife - I was thinking about the same thing recently (after an
announcement about FB becoming the biggest photo app on the planet). I found
two great sources of information on this:

- <http://highscalability.com/flickr-architecture>

Very deep analysis of Flickr's arch from a couple of years ago, including a
link to Cal Henderson's presentation slides. I'm surprised he got canned
from Y! yesterday though!

~~~
furburger
i don't know anything about cal's departure, but i can say that any large co
like yahoo, google, msft, apple etc is more or less impervious to the
loss of one or a few technical individuals. his departure may have symbolic
value but the machine that is chugging there will keep chugging, for better or
for worse

~~~
jamie
Don't confuse continuing to operate with continuing to innovate.

A few technical individuals can move the needle more than you'd imagine, even
at a big co.

------
ars
I wonder if they have some mechanism for rebuilding the raid array
periodically to handle the inevitable undetected read errors. With raid6 you
can actually do this, by comparing both parity blocks with the actual data.
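
A minimal sketch of the kind of scrub described above, assuming the common
RAID 6 construction where P is plain XOR parity and Q is a Reed-Solomon
syndrome over GF(2^8) (polynomial 0x11d, generator 2). The function names are
hypothetical and the "blocks" are tiny byte strings rather than real stripes;
nothing here is taken from Facebook's setup. With one silently corrupted data
block, the two syndromes together point at the bad block:

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8), reducing by x^8+x^4+x^3+x^2+1 (0x11d)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11d
        b >>= 1
    return result


def gf_pow2(i):
    """Return 2**i in GF(2^8); this is the Q coefficient for data block i."""
    r = 1
    for _ in range(i):
        r = gf_mul(r, 2)
    return r


def parity(blocks):
    """Compute (P, Q) for a list of equal-length data blocks."""
    n = len(blocks[0])
    p, q = bytearray(n), bytearray(n)
    for i, block in enumerate(blocks):
        coef = gf_pow2(i)
        for j, byte in enumerate(block):
            p[j] ^= byte
            q[j] ^= gf_mul(coef, byte)
    return bytes(p), bytes(q)


def scrub(blocks, p, q):
    """Return None if the stripe is consistent, the index of the single
    corrupted data block, "P" or "Q" if a parity block is bad, or
    "multiple" if no single-block explanation fits."""
    p_now, q_now = parity(blocks)
    for j in range(len(p)):
        sp = p[j] ^ p_now[j]  # P syndrome for byte j
        sq = q[j] ^ q_now[j]  # Q syndrome for byte j
        if sp == 0 and sq == 0:
            continue
        if sq == 0:
            return "P"
        if sp == 0:
            return "Q"
        # A single bad data block z gives sq == 2**z * sp.
        for z in range(len(blocks)):
            if gf_mul(gf_pow2(z), sp) == sq:
                return z
        return "multiple"
    return None


blocks = [b"abcd", b"efgh", b"ijkl"]
p, q = parity(blocks)
bad = [blocks[0], b"eFgh", blocks[2]]  # one flipped bit in block 1
print(scrub(bad, p, q))  # 1
```

Once the bad block is located it can be rebuilt from P and the remaining data
blocks, the same way a failed disk would be.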

Hard disks typically have undetected read error rates of 1 bit in 1E15, so
assuming they transfer 1PB per day that's about 9 errors per day. Which isn't
much I guess, but I wonder if they do anything about it.
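
For anyone checking the arithmetic, a quick sketch. The 1E15 rate and the
1PB/day figure are the assumptions from this comment, not measured numbers;
the result is about 9 with a binary petabyte (2^50 bytes) and about 8 with a
decimal one:

```python
def expected_bit_errors(bytes_read, errors_per_bit):
    """Expected number of undetected bit errors for a given read volume."""
    return bytes_read * 8 * errors_per_bit


RATE = 1e-15           # assumed: 1 undetected error per 1E15 bits read
PB_BINARY = 2 ** 50    # bytes in a binary petabyte
PB_DECIMAL = 10 ** 15  # bytes in a decimal petabyte

print(round(expected_bit_errors(PB_BINARY, RATE), 2))   # 9.01
print(round(expected_bit_errors(PB_DECIMAL, RATE), 2))  # 8.0
```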

~~~
kmavm
(Not speaking officially, obviously. I had nothing to do with Haystack
anyway).

They're images. So you flip a bit every once in a great while; as long as the
bit isn't part of image metadata, no human user will notice. The next time
they load the image (at least, assuming it's been long enough for it to get
evicted from the CDN), they see the right data.

------
mildweed
Here's a Flowgram presentation from last year, "Needle in a Haystack", which
covers these updates. <http://www.flowgram.com/f/p.html#2qi3k8eicrfgkv>

------
ars
Why bother putting the haystack in a filesystem? Just use a raw partition.

Haystack effectively IS a filesystem - a log structured filesystem.
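
To make "log structured" concrete, here is a minimal sketch of the general
idea: records are only ever appended to one big file, and an in-memory index
maps each key to its offset. The class name, the record layout (8-byte key,
4-byte length, payload), and the method names are all made up for
illustration; this is not Haystack's actual on-disk format.

```python
import io
import struct


class AppendOnlyStore:
    """Toy append-only blob store with an in-memory key -> offset index."""

    HEADER = struct.Struct("<QI")  # hypothetical layout: key, payload length

    def __init__(self, f):
        self.f = f        # any seekable binary file object
        self.index = {}   # key -> (payload offset, payload size)

    def put(self, key, data):
        # Always write at the end; older copies of a key are never touched.
        self.f.seek(0, io.SEEK_END)
        offset = self.f.tell()
        self.f.write(self.HEADER.pack(key, len(data)))
        self.f.write(data)
        self.index[key] = (offset + self.HEADER.size, len(data))

    def get(self, key):
        # One seek and one read, no directory or inode lookups.
        offset, size = self.index[key]
        self.f.seek(offset)
        return self.f.read(size)

    def rebuild_index(self):
        # Replay the log from the start; the last record for a key wins.
        self.index.clear()
        self.f.seek(0)
        while True:
            header = self.f.read(self.HEADER.size)
            if len(header) < self.HEADER.size:
                break
            key, size = self.HEADER.unpack(header)
            self.index[key] = (self.f.tell(), size)
            self.f.seek(size, io.SEEK_CUR)


store = AppendOnlyStore(io.BytesIO())
store.put(1, b"cat")
store.put(2, b"dog!")
store.put(1, b"kitten")  # overwrite by appending a newer record
print(store.get(1))      # b'kitten'
```

The same class works over a regular file or, in principle, a raw block
device opened in binary mode, which is the trade-off being argued about here.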

~~~
kmavm
I think the haystack folks made the right decision, here. A rule of thumb is
that it takes 3-5 years for a novel on-disk format to gel. There are always
random bugs that rely on particular sequences of allocations/deallocations,
races, etc. Using an off-the-shelf filesystem that allocates the blocks and
then stays out of the application's way, like XFS, probably saved these guys
at least one year of screwing around.

------
furburger
raw io? facebook better go ipo soon, they're running out of abstraction layers
to code through

~~~
okeumeni
Even if Facebook ends up not making any money, its contribution to the next
generation of software engineering will be important. Solving major scaling
issues makes Facebook a full-scale lab for tomorrow. They will set standards
for how to approach some unique problems.

