

Cassandra: Daughter of Dynamo and BigTable - aouyang1
http://www.insightdataengineering.com/blog/cass.html

======
leef
What's interesting to note is that Dynamo/Cassandra usage was killed at both
Amazon and Facebook (DynamoDB from AWS is actually not based on Dynamo tech
except in terms of what not to do).

[1] - [https://www.facebook.com/notes/facebook-engineering/the-unde...](https://www.facebook.com/notes/facebook-engineering/the-underlying-technology-of-messages/454991608919)

edit: Added link to facebook post on hbase vs cassandra

~~~
jchrisa
Damien Katz (one of my cofounders at Couchbase, so he has a dog in the
fight) wrote this take-down of the Dynamo model, for folks wondering why:

[http://damienkatz.net/2013/05/dynamo_sure_works_hard.html](http://damienkatz.net/2013/05/dynamo_sure_works_hard.html)

TLDR quote:

 _The Dynamo system is a design that treats the probability of a network
switch failure as having the same probability of machine failure, and pays the
cost with every single read. This is madness. Expensive madness._

~~~
jedberg
I don't agree with his fundamental premise:

> Network Partitions are Rare, Server Failures are Not

Network partitions happen _all the time_. Sure, the whole "a switch failed and
that piece of the network isn't there anymore" doesn't happen a lot, but what
_does_ happen a lot is a slow or delayed connection, or a machine going
offline for a few seconds.

~~~
jchrisa
Our customers tend to be the kind who need extreme performance, so they aren't
spanning clusters across WANs. For well-tuned datacenters, rack awareness
(putting the replicas in sane places) is more useful.

For WAN replication we have cross-datacenter replication, which works on an
AP model.

~~~
jedberg
To be clear, I was in no way making a judgement about CouchDB vs. Cassandra.
I've only given Couch a cursory glance so I wouldn't be qualified to make such
a judgement.

I was simply trying to point out that while you may have a very good argument
as to why Couch is better, the network partition argument is not sound, and
you may want to look for a better argument to make.

I'm personally against single masters because they are SPOFs. With a master,
at some point there needs to be a single arbiter of truth, and if that is
unavailable, then the system is unavailable.

~~~
strmpnk
A nitpick, but an important one that I wish the Couchbase folks wouldn't let
slip as often as they do: CouchDB has very different properties from
Couchbase, and the two should be considered entirely different database
designs, regardless of the availability of a sync gateway for replication
with a number of JSON stores.

~~~
jchrisa
Different database designs, but similar document model. In fact, Couchbase
Sync Gateway is capable of syncing between Couchbase Server and Apache
CouchDB. Also our iOS, Android, and .NET libraries can sync with CouchDB and
PouchDB. Everything open source, of course. More info:
[http://developer.couchbase.com/mobile/](http://developer.couchbase.com/mobile/)

~~~
strmpnk
It's not that it can sync, nor the data model.

It's that these are fundamentally very different databases with different
trade-offs. You can't just take one, adjust some API calls, and expect
things to work in a similar way. It only confuses people when the difference
is quietly ignored and others assume that, since nobody pointed out the
mistake, the two must be the same thing.

I've had far too many conversations with Couchbase users who can't tell the
difference to write it off as general confusion. It's lax work on
Couchbase's part, and a thorn in the side of the Apache CouchDB project, that
there is no effort to clarify that they are independent and now very
different databases.

------
digitalzombie
Cassandra is a pretty good NoSQL database. I'm a big fan of it.

Of course, if your data changes often and very quickly, you shouldn't be using
Cassandra. You'll end up with tombstone death. Deletes are not real deletes;
they're soft: each one just writes a timestamped marker (a tombstone) that
will eventually be purged. Every delete creates a tombstone, so if your hash
key/column key has 50 tombstones, a read has to go through those tombstones
before getting to the values. It's a trade-off: faster writes, but shitty
reads if you update the same key a lot.
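
A toy illustration of the read cost described above. This is not Cassandra's
storage engine, just a minimal sketch: deletes append a marker instead of
removing data, and a read has to step over every marker before it reaches
live values. `ToyPartition` and its methods are hypothetical names made up
for this example.

```python
import itertools

TOMBSTONE = object()  # marker written in place of a deleted value


class ToyPartition:
    """Append-only cells for one partition key (hypothetical toy model)."""

    def __init__(self):
        self._clock = itertools.count(1)  # stand-in for write timestamps
        self.cells = []  # list of (column, timestamp, value or TOMBSTONE)

    def write(self, column, value):
        self.cells.append((column, next(self._clock), value))

    def delete(self, column):
        # A delete is just another write: it appends a tombstone.
        self.write(column, TOMBSTONE)

    def _latest(self):
        """Newest (timestamp, value) per column, tombstones included."""
        latest = {}
        for column, ts, value in self.cells:
            if column not in latest or ts > latest[column][0]:
                latest[column] = (ts, value)
        return latest

    def read_live(self):
        """Return (live columns, number of tombstones the read scanned)."""
        live = {c: v for c, (_, v) in self._latest().items()
                if v is not TOMBSTONE}
        scanned = sum(1 for _, _, v in self.cells if v is TOMBSTONE)
        return live, scanned

    def compact(self):
        """Keep only the newest live cell per column, dropping tombstones."""
        self.cells = [(c, ts, v) for c, (ts, v) in self._latest().items()
                      if v is not TOMBSTONE]


p = ToyPartition()
for i in range(50):
    p.write(f"c{i}", i)
    p.delete(f"c{i}")
p.write("keep", "value")

print(p.read_live())  # ({'keep': 'value'}, 50)
p.compact()
print(p.read_live())  # ({'keep': 'value'}, 0)
```

The toy `compact()` drops tombstones immediately. Real Cassandra only purges
them during compaction once `gc_grace_seconds` has passed, so in practice the
markers stick around much longer than this sketch suggests.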

Overall, one way of thinking of it is as a souped-up hash-key DB that can do
some relational queries, which is much more than the usual hash-key NoSQL
store. Since it's a hash-key type of database, you can see the trade-offs of
Cassandra and its siblings versus, say, MongoDB or CouchDB.

I used it for time-related stuff that has simple data. It's mostly immutable,
so Cassandra was perfect for it. An example is daily TV show release dates.

~~~
jermo
Increasing the size of memtables should reduce the tombstone overhead since
entries will get overwritten in memory.

------
salex89
Sometimes I feel that the Cassandra row length limit is not taken seriously
enough. If you have some high-frequency data, it will not take long until
you hit the limit. Also, the guide states that you should not really attempt
to get that far. I'm not sure if schemes like the one in the example are
future-proof. In a couple of years developers will start hitting the limit and
we'll see mailing lists with postmortems :). A bit apocalyptic, but I believe
we should find an idiomatic solution for it. I use rounded time periods as
part of the partition key. Yes, it makes the data a bit more complicated to
query, but I have not thought of a better solution.
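
A minimal sketch of the rounded-time-period idea, assuming one-day buckets
and timezone-aware timestamps. The function names and the bucket width are
assumptions for illustration, not anything from the article. It also shows
where the extra query complexity comes from: a time-range read has to list
every bucket it spans.

```python
from datetime import datetime, timedelta, timezone

BUCKET = timedelta(days=1)  # assumed width; pick it from your write rate
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def bucket_start(ts: datetime) -> datetime:
    """Round a timezone-aware timestamp down to the start of its bucket."""
    return ts - (ts - EPOCH) % BUCKET


def partition_key(series_id: str, ts: datetime) -> tuple:
    """Composite partition key: series id plus the rounded time period."""
    return (series_id, bucket_start(ts))


def buckets_between(start: datetime, end: datetime) -> list:
    """Every bucket a query over [start, end] has to read, in order."""
    out = []
    current = bucket_start(start)
    while current <= end:
        out.append(current)
        current += BUCKET
    return out


ts = datetime(2015, 3, 14, 15, 9, 26, tzinfo=timezone.utc)
print(partition_key("sensor-1", ts))

# A query from late on the 14th to early on the 16th touches three partitions.
start = datetime(2015, 3, 14, 23, 0, tzinfo=timezone.utc)
end = datetime(2015, 3, 16, 1, 0, tzinfo=timezone.utc)
print(len(buckets_between(start, end)))  # 3
```

Each bucket caps how much one partition can grow, at the cost of fanning a
range query out over several partitions and merging the results client-side.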

Time series databases like KairosDB are quite good, but only for simpler data
structures and things describable as a metric. Also, you may face issues
introducing relatively little-known software at your company, and you may get
locked in.

------
dimi-31
Pretty interesting read. The concept of tunable consistency was foreign to me.
Keep up the good work!
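
For anyone else new to tunable consistency, here is a rough sketch of the
usual rule of thumb: with N replicas, a read of R replicas is guaranteed to
see the latest write acknowledged by W replicas when R + W > N. The function
names are hypothetical, and the labels only loosely mirror consistency level
names such as ONE, QUORUM and ALL. The brute-force check just confirms the
overlap argument for small N.

```python
from itertools import combinations


def quorum(n: int) -> int:
    """Smallest majority of n replicas."""
    return n // 2 + 1


def reads_see_latest_write(n: int, r: int, w: int) -> bool:
    """Brute force: does every r-replica read set overlap every
    w-replica write set?"""
    replicas = range(n)
    return all(
        set(read_set) & set(write_set)
        for write_set in combinations(replicas, w)
        for read_set in combinations(replicas, r)
    )


n = 3
for name, r, w in [("ONE/ONE", 1, 1),
                   ("QUORUM/QUORUM", quorum(n), quorum(n)),
                   ("read ONE/write ALL", 1, n)]:
    # The brute-force answer matches the rule of thumb r + w > n.
    assert reads_see_latest_write(n, r, w) == (r + w > n)
    print(name, reads_see_latest_write(n, r, w))
# ONE/ONE False
# QUORUM/QUORUM True
# read ONE/write ALL True
```

The "tunable" part is that R and W are chosen per request, so you can trade
latency for stronger reads on a query-by-query basis.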

~~~
samplonius
Riak has the same sort of concept:
[http://docs.basho.com/riak/latest/theory/concepts/Eventual-C...](http://docs.basho.com/riak/latest/theory/concepts/Eventual-Consistency/)

I think Riak is richer in this area than Cassandra, because if the inconsistency
can not be resolved, Riak can keep both versions and let you deal with it at
the application level.
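
A rough sketch of what "keep both versions" can look like, using version
vectors to tell a newer write from a concurrent one. This is not Riak's API;
`descends` and `merge_siblings` are hypothetical helpers, and the node names
are made up.

```python
def descends(a: dict, b: dict) -> bool:
    """True if version vector a has seen everything that b has."""
    return all(a.get(node, 0) >= count for node, count in b.items())


def merge_siblings(siblings: list, incoming: tuple) -> list:
    """Add (vector, value) to the sibling list, dropping versions it
    supersedes and keeping versions that are concurrent with it."""
    vec, _ = incoming
    if any(descends(old_vec, vec) for old_vec, _ in siblings):
        return siblings  # incoming is stale or a duplicate
    kept = [(v, val) for v, val in siblings if not descends(vec, v)]
    return kept + [incoming]


s = merge_siblings([], ({"a": 1}, "v1"))
s = merge_siblings(s, ({"a": 1, "b": 1}, "from-b"))  # supersedes v1
s = merge_siblings(s, ({"a": 2}, "from-a"))          # concurrent with from-b
print([value for _, value in s])  # ['from-b', 'from-a']

# The application resolves the conflict and writes back a version that
# descends from both siblings, which collapses them again.
s = merge_siblings(s, ({"a": 2, "b": 1}, "merged"))
print([value for _, value in s])  # ['merged']
```

A last-write-wins store would silently pick one of the two concurrent values
by timestamp; keeping siblings pushes that decision to the application.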

See also: [https://aphyr.com/posts/294-call-me-maybe-cassandra](https://aphyr.com/posts/294-call-me-maybe-cassandra)

------
IndianAstronaut
Article isn't loading.

