

Cassandra data store at Facebook - Anon84
http://glinden.blogspot.com/2008/08/cassandra-data-store-at-facebook.html

======
gaika
This is the second major infrastructure component open-sourced by Facebook.
What's next? Search?

How is Google going to respond? Should we expect BigTable to be open-sourced
any time soon?

~~~
SwellJoe
It seems to me that this has nothing to do with Google. Facebook is not a
Google competitor...and Google would see no benefit from open sourcing their
infrastructure. Google is an infrastructure company--the value of Google is in
the strength of their search and related technology. The value of Facebook is
the network of people--technically, it's simply not a difficult problem. Take
away the user community and Facebook has almost zero value. Facebook happened
to hire a lot of smart people who wanted to work on interesting things, and so
NIH has led to lots of new technologies being created.

Their solutions may be optimal for Facebook, and they may even be generally
useful to folks building large scale websites. But, their Open Source tools
are no threat to Google, and would have no impact on Google's decisions about
what to Open Source.

Hadoop/HBase has already been offering BigTable-like capabilities in an Open
Source project. This is merely another option--and frankly, one that sounds a
bit half-baked at the moment. But I could be wrong about that. And, of course,
Thrift is yet another RPC system. It makes pretty bold claims about being
uniquely lightweight and cross-language, but the RPC and data interchange
problem has been solved many times in the past in many ways. I don't think the
world has been set on fire by Thrift.

~~~
gaika
I use Thrift and it is underhyped. I looked at Cassandra, and it offers more
than HBase with a much cleaner and smaller codebase. That's not just NIH;
something really important is happening here: standard tools like
relational databases do not work for companies like Google or Facebook.
Suddenly, you can process these volumes of data without having to worry about
the back-end. That lets small startups compete on the same terms as the big guys.

For example, JS-Kit.com generates 10G-100G worth of data a day. They had to
create an equivalent of Amazon's Dynamo to handle it. Easily half of their
resources are dedicated to maintaining and extending this infrastructure. Now
imagine if they could just use Cassandra and focus on their core product. I do
not know what Disqus or FriendFeed are using (probably MySQL?), but they can
benefit from it too once they grow.

That could affect Google too: there will be more competition from smaller
startups like WebMynd or Cuil because of it. And if startups adopt Facebook's
tools instead of Google's, that would make them less attractive to Google as
an acquisition target.

