
WTF is a SuperColumn? An Intro to the Cassandra Data Model - ropiku
http://arin.me/code/wtf-is-a-supercolumn-cassandra-data-model
======
StrawberryFrog
So it's not in first normal form.
<http://en.wikipedia.org/wiki/First_normal_form> What's old is new again, and
I wonder if they even know it.

What's not addressed is that this may not be an advance over Relational
databases. In fact, when relational databases were first designed, databases
like this were all too common, and RDBMS rejected this kind of data model
explicitly, for good design reasons. For instance, how do you select from or
join to parts of the complex values inside a column?
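
To make that question concrete, here is a minimal Python sketch that uses plain
nested dicts as a stand-in for a super column family. The names `user_tags`,
`rows_with_tag`, and `flatten`, and the sample data, are made up for
illustration and are not Cassandra's API. Finding rows by an inner value means
walking every row, while the first-normal-form view is a flat list of tuples
that can be filtered or joined directly.

```python
# Hypothetical stand-in for a super column family:
# row key -> supercolumn name -> column name -> value.
user_tags = {
    "alice": {"post1": {"tag": "cassandra"}, "post2": {"tag": "couchdb"}},
    "bob": {"post3": {"tag": "cassandra"}},
}


def rows_with_tag(data, tag):
    """Scan every row, because a key lookup cannot see inside the nested values."""
    return sorted(
        row_key
        for row_key, supercolumns in data.items()
        if any(cols.get("tag") == tag for cols in supercolumns.values())
    )


def flatten(data):
    """First-normal-form view: one tuple per atomic value."""
    return [
        (row_key, sc_name, col_name, value)
        for row_key, supercolumns in data.items()
        for sc_name, cols in supercolumns.items()
        for col_name, value in cols.items()
    ]
```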

~~~
jbellis
> What's not addressed is that this may not be an advance over Relational
> databases

It's not. The relational model is much better, for many applications.

BUT.

The missing context is, the relational model cannot be sanely scaled out
across multiple machines. Replication mostly works with some pain, but scaling
writes is just a nightmare -- and what you have to do to partition your data
across multiple rdbms nodes means giving up all that relational goodness. So
what you end up with is neither of {easily scaled, relational}.

So if you can't have a scalable rdbms, the next best thing is a scalable
key/columnfamily dbms. Something like Cassandra. Which, as a bonus, gives you
significantly better per-node performance on modern hardware.

~~~
harkham
I'm curious: how easy does "easily scaled" mean? How much advance planning
needs to be done about the eventual size of a Cassandra cluster?

I think I can see how you can add machines when you need more capacity, but
what about when you don't need all that capacity anymore? How do you go about
removing machines from the cluster, and how does all of the nicely scaled out
content get rebalanced when you do?
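
One way to picture the rebalancing question is a consistent-hashing ring of the
kind described in the Dynamo paper. The sketch below works under that
assumption and is not Cassandra's implementation; `Ring`, `token`,
`moved_keys`, the node names, and the choice of MD5 are all hypothetical.

```python
import bisect
import hashlib


def token(value: str) -> int:
    """Map a string to a ring position using MD5 (an arbitrary choice here)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class Ring:
    def __init__(self, nodes):
        # Each node sits at the ring position given by the hash of its name.
        self._tokens = sorted((token(n), n) for n in nodes)

    def owner(self, key: str) -> str:
        """The first node clockwise from the key's token owns the key."""
        positions = [t for t, _ in self._tokens]
        i = bisect.bisect_right(positions, token(key)) % len(self._tokens)
        return self._tokens[i][1]

    def without(self, node: str) -> "Ring":
        """Return a new ring with `node` removed."""
        return Ring([n for _, n in self._tokens if n != node])


def moved_keys(before: Ring, after: Ring, keys):
    """Keys whose owner differs between the two rings."""
    return [k for k in keys if before.owner(k) != after.owner(k)]
```

In this model, removing a node moves only the keys that node owned, and they
all land on its clockwise neighbour; every other key stays where it was.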

~~~
jbellis
> what about when you don't need all that capacity anymore

It's not really an interesting use case, for the same reason that, in every
language I can think of, hashtables grow as you insert items but don't shrink
as you remove them: if it got to size X once, it will probably do so again in
the near future.

That said, decommissioning nodes will fall out naturally from our work on
automatic load balancing for the 0.5 release.

------
cosmohh
Looks pretty interesting. Has anyone already used it and maybe compared it
with CouchDB, too?

~~~
jbellis
Cassandra is a scalable key/columnfamily database designed for supporting
low-latency applications with vast amounts of data partitioned across many
machines. (Facebook has 40TB on a 150-machine Cassandra cluster.)

CouchDB is a document database that supports two-way replication so you can
re-sync after taking part of the data offline, but it's still designed around
the concept of a single master that holds all the data.

Completely different animals, in other words.

~~~
ddiljoy
Does a Cassandra cluster stay write-available in the event of a network
partition?

If so, how does it reconcile writes when the partition heals? Last I looked,
Cassandra doesn't use vector/logical clocks - doesn't that potentially cause
data loss when the partition heals if you're using a simple last-write-wins
based on physical timestamps for a reconciliation policy? Does Cassandra use
merkle trees for anti-entropy?

From what I can tell, although Cassandra claims to be write-fault-tolerant,
the dependence on physical timestamps and the lack of the self-healing
properties that merkle trees provide make me nervous about data loss and
inconsistency when deploying it at scale.

~~~
jbellis
> Does a Cassandra cluster stay write-available in the event of a network
> partition?

The client can specify whether it wants consistency (refuse writes if not
enough write targets are there) or availability.

If it chooses availability, then Cassandra sends extra copies to nodes it
_can_ reach, with a tag that specifies who the "real" destination is. When
that node is reachable again it will be forwarded. ("Hinted handoff.")
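
A rough Python sketch of that flow, assuming a toy in-memory cluster: `Node`,
`write`, and `replay_hints` are hypothetical names for illustration, not
Cassandra's actual classes or API.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.up = True
        self.data = {}   # key -> value stored on this node
        self.hints = []  # (intended_node_name, key, value) held for unreachable nodes


def write(key, value, targets, fallbacks):
    """Write to each target; if one is down, park the write on a reachable node with a hint."""
    for target in targets:
        if target.up:
            target.data[key] = value
            continue
        holder = next((n for n in fallbacks if n.up), None)
        if holder is None:
            raise RuntimeError("no reachable node to hold the hint")
        # The hint records who the "real" destination is.
        holder.hints.append((target.name, key, value))


def replay_hints(holder, cluster):
    """Forward stored hints to their real destinations once those are reachable again."""
    remaining = []
    for intended, key, value in holder.hints:
        dest = cluster[intended]
        if dest.up:
            dest.data[key] = value
        else:
            remaining.append((intended, key, value))
    holder.hints = remaining
```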

> how does it reconcile writes when the partition heals?

As you said, last-write-wins. The experience with Dynamo showed that most apps
don't want to deal with explicit conflict resolution, and don't need it. (But,
I suspect we will end up adding it as an option for those apps that do. In the
meantime, if Cassandra isn't a good fit, we're not trying to hard-sell anyone.
:)
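
For reference, last-write-wins on physical timestamps can be sketched in a few
lines of Python. `Column` and `reconcile` are hypothetical names, and keeping
the first argument on a tie is an arbitrary choice made for this sketch.

```python
from typing import NamedTuple


class Column(NamedTuple):
    value: str
    timestamp: int  # client-supplied physical timestamp, e.g. microseconds since epoch


def reconcile(a: Column, b: Column) -> Column:
    """Last-write-wins: keep the version with the higher timestamp (ties keep `a`)."""
    return b if b.timestamp > a.timestamp else a
```

The policy only looks at the timestamps it is given, so a write stamped by a
lagging clock loses to an earlier write stamped by a faster one; that is the
data-loss scenario the question describes.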

> Does Cassandra use merkle trees for anti-entropy?

Not yet, but my co-worker Stu Hood is working on this. Should be part of the
0.5 release.
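
As background on the technique itself, here is a generic Merkle-tree comparison
sketched in Python. It is not the code planned for Cassandra; `build_tree` and
`differing_ranges` are hypothetical names, SHA-256 is an arbitrary hash choice,
and the number of key ranges is assumed to be a power of two to keep it short.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_tree(leaves):
    """Return a list of levels, from the leaf hashes up to a single root hash.

    `leaves` is a list of byte strings, one per key range (length assumed
    to be a power of two).
    """
    level = [_h(leaf) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def differing_ranges(tree_a, tree_b):
    """Walk down from the root, descending only into subtrees whose hashes differ."""
    top = len(tree_a) - 1
    if tree_a[top][0] == tree_b[top][0]:
        return []  # identical roots: the replicas agree on every range
    suspects = [0]
    for depth in range(top - 1, -1, -1):
        suspects = [
            child
            for parent in suspects
            for child in (2 * parent, 2 * parent + 1)
            if tree_a[depth][child] != tree_b[depth][child]
        ]
    return suspects
```

Two replicas that agree exchange only a root hash; when they disagree, only the
ranges under mismatching subtrees need to be transferred.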

> the dependence on physical timestamps and the lack of the self-healing
> properties

Whether the first is an issue is app-specific. As to the latter, I'm excited
to get the merkle tree code in, too.

In the meantime, Cassandra _does_ do read repair and hinted handoff, so in
practice it's what I would call "barely adequate." :)
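
The idea behind read repair can be sketched as follows, assuming each replica
is a plain dict mapping a key to a `(value, timestamp)` pair.
`read_with_repair` is a hypothetical helper for illustration, not Cassandra's
API.

```python
def read_with_repair(key, replicas):
    """Read `key` from every replica, return the newest value, and push it to stale replicas.

    Each replica is a dict mapping key -> (value, timestamp).
    """
    versions = [r[key] for r in replicas if key in r]
    if not versions:
        return None
    newest = max(versions, key=lambda v: v[1])
    for r in replicas:
        if r.get(key) != newest:
            r[key] = newest  # repair a missing or out-of-date copy
    return newest[0]
```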

