
Looking with Cassandra into the future of Atlas - jbellis
http://metabroadcast.com/blog/looking-with-cassandra-into-the-future-of-atlas?
======
jemfinch
Database system combining two published technologies battle-tested by two of
the largest Internet websites performs better under real world conditions than
database written by amateurs, news at 11!

(In all seriousness, what did they expect? Have they looked at the MongoDB
code? Do they seriously believe that the 10gen folks are smarter or better at
solving problems than the masses of engineers Google and Amazon have thrown at
this problem?)

~~~
jbellis
I appreciate the sentiment, but it's worth distinguishing between the design
and implementation here. Cassandra does take a lot of inspiration from the
Bigtable and Dynamo papers, so we benefit from the thinking of a handful of
very smart engineers at Google and Amazon, respectively, but the actual code
is our [Apache's] own and for the battle testing you need to thank companies
like Netflix, Reddit, Spotify, and others. [1]

That said, part of the reason Cassandra was attractive to me from the
beginning is that unlike master-slave designs like MongoDB (or Bigtable/HBase,
for that matter), a p2p design doesn't have the many corner cases around
failover and recovery that complicate troubleshooting so badly. This is a
primary reason Cassandra has had a very good reliability story since very
early on.

[1] <http://www.datastax.com/cassandrausers>

~~~
jemfinch
Sorry, I didn't mean to imply that Google or Amazon engineers implemented
Cassandra (though it's notable, of course, that it was initially implemented
by Facebook, adding a third Internet behemoth to its pedigree).

What I really meant to say is that it's clear that the engineers behind
Cassandra have done their research and chosen an extremely well tested design,
while the engineers behind MongoDB seem to be completely winging it, ignorant
of the literature and writing (based on my last examination) extremely
amateurish code.

------
stevencorona
Where Cassandra REALLY shines and is often overlooked is ease of maintenance.
Cassandra's ability to bootstrap new nodes, replicate, reshard and handle down
nodes (w/ hinted handoff) is almost magical. I use it in production and it
works very reliably.

Sure, it's got some cool big data stuff, but try doing any of those
"maintenance" operations on other databases without ripping your hair out. For
example, even bringing up a new MySQL slave is a huge pain in the ass, let
alone doing something non-trivial like promoting a new master.
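
The hinted-handoff behavior mentioned above can be sketched as a toy pure-Python model (my own illustration, not Cassandra's actual code): while a replica is down, the coordinator queues a "hint" for each write the replica missed, then replays those hints when the node comes back.

```python
# Toy sketch of hinted handoff (illustrative only, not Cassandra code).
# While a replica is down, the coordinator stores a "hint" for each
# write it missed and replays the hints on recovery.

class Replica:
    def __init__(self, name):
        self.name = name
        self.up = True
        self.data = {}

class Coordinator:
    def __init__(self, replicas):
        self.replicas = replicas
        self.hints = {r.name: [] for r in replicas}  # missed writes per node

    def write(self, key, value):
        for r in self.replicas:
            if r.up:
                r.data[key] = value
            else:
                self.hints[r.name].append((key, value))  # store a hint

    def node_recovered(self, replica):
        replica.up = True
        for key, value in self.hints[replica.name]:  # replay queued hints
            replica.data[key] = value
        self.hints[replica.name].clear()

a, b = Replica("a"), Replica("b")
coord = Coordinator([a, b])
b.up = False                 # node b goes down
coord.write("row1", "v1")    # only a gets the write; b gets a hint
coord.node_recovered(b)      # b catches up from its hint queue
print(b.data["row1"])        # -> v1
```

The real mechanism is more involved (hints have TTLs and live on disk), but this is the shape of why a down node doesn't require manual resync.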

~~~
pestaa
Interesting. I thought sysadmin operations like MySQL sharding and master
promotion were usually automated/scripted? Not that it makes things
conceptually easier, but it limits the opportunities to lose hair to 1.

~~~
jbellis
If it's complex enough you need to script it, it's complex enough to make
troubleshooting a bitch when something goes wrong in that script...

------
monstrado
Did you guys investigate any other choices, such as DynamoDB or HBase? I know
that Facebook (inventors of Cassandra) has moved from Cassandra to HBase for
its back-end messaging service due to Cassandra's inherent consistency
problems.

~~~
jbellis
"Consistency problems" is a red herring; Cassandra can be easily asked to
sacrifice availability for consistency [on a per-request basis], if that's
what you want: <http://www.datastax.com/docs/1.1/dml/data_consistency>
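
A rough pure-Python model of what "per-request" means here (my own sketch, not the driver API): each read or write names a consistency level, and with replication factor N that level maps to a number of replica acknowledgements.

```python
# Illustrative only: how Cassandra-style consistency levels map to the
# number of replica acknowledgements required, given a replication factor.
def required_acks(level, replication_factor):
    return {
        "ONE": 1,                               # fastest, weakest
        "QUORUM": replication_factor // 2 + 1,  # a majority of replicas
        "ALL": replication_factor,              # strongest, least available
    }[level]

# With 3 replicas, a QUORUM request waits for 2 of them:
print(required_acks("QUORUM", 3))  # -> 2
```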

Facebook never used modern Apache Cassandra, so when they tasked a team of
experienced Hadoop/HDFS engineers to build Messaging, it was natural for them
to choose HBase. But the things they've had to do to deal with the problems in
that architecture (e.g., sharding into "cells" to deal with namenode SPOF) [1]
make me think that they would have done better with Cassandra.

[1] [http://www.slideshare.net/brizzzdotcom/facebook-messages-
hba...](http://www.slideshare.net/brizzzdotcom/facebook-messages-hbase/23)

~~~
dhruba
If you have three replicas in Cassandra and you want your reads to be
consistent, you have to read from at least two replicas. This doubles the
number of iops to the disk subsystem. Most disk-based storage systems are
bottlenecked by random iops, and a solution that requires double the iops is a
non-starter.
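
The arithmetic behind that claim, as a quick back-of-the-envelope sketch (my numbers, not the post's):

```python
# Back-of-the-envelope for the iops claim above (illustrative numbers):
# a quorum read with replication factor 3 touches 2 replicas instead of 1.
replication_factor = 3
quorum = replication_factor // 2 + 1          # 2 replicas per read

reads_per_second = 10_000
disk_ops_one = reads_per_second * 1           # reading one replica
disk_ops_quorum = reads_per_second * quorum   # reading a quorum
print(disk_ops_quorum / disk_ops_one)         # -> 2.0
```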

~~~
3amOpsGuy
I don't think that's strictly true.

With Cassandra you choose where the extra reads (or writes) that ensure
consistency occur, as best suits your use case:

* At write time (write quorum) like a traditional replicated datastore

* At read time (read quorum)

* In the background (read one, with a non-zero chance of read repair)

It's quite nice to be able to bend Cassandra on a use-case-by-use-case basis
(NB: these are not cluster-wide settings; for different columns /
circumstances I can choose different patterns).
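
The trade-off among the three options above follows the usual quorum-overlap rule (my sketch, not from the post): a read is guaranteed to see the latest write whenever read replicas + write replicas exceed the replication factor, because the two sets must intersect.

```python
# Quorum-overlap rule (illustrative): with N replicas, a read sees the
# latest write whenever R + W > N, since the R replicas consulted on read
# must intersect the W replicas that acknowledged the write.
def strongly_consistent(r, w, n):
    return r + w > n

N = 3
quorum = N // 2 + 1  # 2

print(strongly_consistent(quorum, quorum, N))  # write QUORUM + read QUORUM -> True
print(strongly_consistent(1, quorum, N))       # write QUORUM + read ONE    -> False
print(strongly_consistent(1, N, N))            # write ALL    + read ONE    -> True
```

The third bullet ("read one" with read repair) is the `False` row: cheap reads, with consistency converging in the background rather than being guaranteed per request.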

------
chaostheory
I think MongoDB will no longer have a global write lock in the near future.
Still, I feel their pain: given my still-limited experience with it, MongoDB
may be great for reads, but not so much for writes.

~~~
achompas
EDIT: deleted b/c I didn't make my point clearly.

~~~
taligent
Call me stupid, but I can imagine that losing data is a problem for a database
company.

That said, MongoDB has a simple flag that lets you choose how to handle
writes: the app can 'wait' for writes to be applied to one node, all nodes,
some number of nodes, etc.
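
A toy model of that flag (my own sketch, not MongoDB's actual API): a write is considered acknowledged once some required number of nodes in the replica set have applied it.

```python
# Toy model of a tunable write concern (illustrative, not MongoDB code):
# the client "waits" until `w` of the replica set's nodes acknowledge.
def acknowledged(acks_received, w, nodes):
    if w == "all":
        required = nodes
    elif w == "majority":
        required = nodes // 2 + 1
    else:
        required = w          # an explicit node count
    return acks_received >= required

# 3-node replica set: one ack satisfies w=1 but not w="majority".
print(acknowledged(1, 1, 3))           # -> True
print(acknowledged(1, "majority", 3))  # -> False
print(acknowledged(2, "majority", 3))  # -> True
```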

~~~
achompas
Heh, that probably wasn't phrased well. Let's try again:

Dynamo (and thus Cassandra) is designed for write speed and durability. What
is MongoDB designed for?

------
einhverfr
The problem of course with Cassandra is getting results you will believe out
of it. Also I understand Cassandra to be a Trojan. Use at your own risk. When
Atlas lets you down and AJAX has gone crazy (like slaughtering all your cows
in a fit of drunken pique), and your city is in flames don't say I didn't warn
you. But you won't believe me either ;-)

(OK, sorry for the obligatory Iliad joke there.)

------
hoodoof
<http://news.ycombinator.com/item?id=3984642>

