

Cassandra from a Relational World - bsg75
https://medium.com/@mustwin/cassandra-from-a-relational-world-7bbdb0a9f1d

======
dwenzek
I find this post interesting because working with Cassandra requires genuinely
different patterns compared to standard RDBMSs and even other NoSQL systems.

I recently worked on a project where we used Cassandra, and we hit so many
surprises, often learned the hard way through bugs in production, that the
team felt discouraged more than once before we succeeded in setting up a
reliable, efficient and scalable system.

Even more than the post lets on, it is really tricky in practice to get
eventual consistency and data modeling right. Not only because the concepts
are new or hard to grasp, but because Cassandra fails the principle of _least
surprise_: two features that work fine in isolation may behave in unexpected
ways once combined.

\- Getting consistency right is complex, because the actual consistency you
get is the product of the consistency levels scattered across _all_ your
queries.

\- Denormalization works fine at first, but quickly reaches a point where the
right combination of columns is missing.

\- Having a large number of columns is fine, until you try to delete some and
get tombstones.

\- Tombstones are automatically removed during compaction \- unless there are
too many of them, which leads to memory exhaustion.

\- A node joining the cluster may abort the join after a long GC pause,
caused by a greedy data model or too many tombstones.

\- We even hit a genuine Cassandra bug: counters were corrupted by a new node
joining the cluster (fixed in version 2.0).

In other words, many aspects \- data modeling, query coding and system tuning
\- are deeply tied together. This makes the whole really hard to grasp.
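The first point above \- that effective consistency emerges from the levels
chosen per query \- boils down to simple replica arithmetic: with replication
factor N, a write acknowledged by W replicas and a read from R replicas are
guaranteed to overlap on at least one up-to-date replica only when R + W > N.
A minimal sketch of that rule (plain Python, purely illustrative; no driver
involved):

```python
# Illustrative sketch of Cassandra's consistency arithmetic.
# With replication factor N, a write acked by W replicas and a read
# answered by R replicas are guaranteed to intersect on at least one
# replica holding the latest write only when R + W > N.

def is_strongly_consistent(n_replicas: int, write_cl: int, read_cl: int) -> bool:
    """True if every read is guaranteed to see the latest acked write."""
    return read_cl + write_cl > n_replicas

# N = 3, QUORUM write (2) + QUORUM read (2): 2 + 2 > 3 -> consistent
print(is_strongly_consistent(3, 2, 2))   # True
# N = 3, ONE write + ONE read: 1 + 1 > 3 is false -> may read stale data
print(is_strongly_consistent(3, 1, 1))   # False
```

The trap the comment describes is that `write_cl` and `read_cl` are chosen
independently on every statement, so one careless `ONE` anywhere silently
breaks the guarantee for that data.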

What worked for us was designing as a team. That gave us the maximum
opportunity to share experience, explore the solution space, discuss pros and
cons, and get the design right.

------
michaelmior
For some other helpful intros to Cassandra data modeling, check out [0], [1],
and [2]. I've been working on trying to automate the process of coming up with
a good data model for systems like Cassandra. You can see some pretty early
results here[3] although there's been a lot of progress since then which will
hopefully be shareable soon.

[0] [http://www.datastax.com/dev/blog/basic-rules-of-cassandra-data-modeling](http://www.datastax.com/dev/blog/basic-rules-of-cassandra-data-modeling)

[1] [http://www.slideshare.net/nkorla1share/cass-summit-3](http://www.slideshare.net/nkorla1share/cass-summit-3)

[2] [http://www.ebaytechblog.com/2012/07/16/cassandra-data-modeling-best-practices-part-1/](http://www.ebaytechblog.com/2012/07/16/cassandra-data-modeling-best-practices-part-1/)

[3] [https://dl.acm.org/citation.cfm?id=2602624](https://dl.acm.org/citation.cfm?id=2602624)

------
bluecmd
Hm. How does Cassandra actually handle rows if the author recommends storing
timestamps in the same row? It sounds like a really good way of getting
un-shardable atoms (whatever those are called in Cassandra) and hotspots.

~~~
PretzelPirate
Please forgive my formatting. It seems formatting text on HN is more difficult
than using Cassandra.

Cassandra is interesting because you really have a partition key to define a
"row" and then a set of columns (cells). When you define your clustering key,
you are really defining how your data is stored on disk.

If you had the following table (the columns are meaningless, so just ignore
that you wouldn't ever use this dumb table):

    CREATE TABLE table1 (
        personName text,
        addedtime  text,
        color      text,
        PRIMARY KEY (personName, addedtime)
    );

so the partition key is personName and the clustering key is addedtime,

and let's say you had the following rows:

    personName = "n1", addedtime = "6/16/2015 1:00AM", color = "blue"
    personName = "n1", addedtime = "6/16/2015 1:15AM", color = "black"
    personName = "n2", addedtime = "6/16/2015 1:15AM", color = "red"

would be stored like:

    RowKey = n1
      -> name = "6/16/2015 1:00AM",           value = (empty),            timestamp = <ticks from write time>
      -> name = "6/16/2015 1:00AM:addedtime", value = "6/16/2015 1:00AM", timestamp = <ticks from write time>
      -> name = "6/16/2015 1:00AM:color",     value = "blue",             timestamp = <ticks from write time>
      -> name = "6/16/2015 1:15AM",           value = (empty),            timestamp = <ticks from write time>
      -> name = "6/16/2015 1:15AM:addedtime", value = "6/16/2015 1:15AM", timestamp = <ticks from write time>
      -> name = "6/16/2015 1:15AM:color",     value = "black",            timestamp = <ticks from write time>

    RowKey = n2
      -> name = "6/16/2015 1:15AM",           value = (empty),            timestamp = <ticks from write time>
      -> name = "6/16/2015 1:15AM:addedtime", value = "6/16/2015 1:15AM", timestamp = <ticks from write time>
      -> name = "6/16/2015 1:15AM:color",     value = "red",              timestamp = <ticks from write time>

NOTE: The timestamp and addedtime values are both really stored as ticks, but
I was too lazy to convert/calculate real ticks.

Notice how the column names (denoted by "name") are actually the value of the
clustering key concatenated with the actual column name. Also notice that
cells are stored first in order of the clustering key, and that cells sharing
the same clustering key are further ordered by column name. Finally, the
first cell for each clustering key \- the bare one representing addedtime \-
has no value, because its value is already stored in the cell name.

Where in SQL you might expect each set of columns you insert to be a physical
row, in Cassandra, you really have a partition and a set of cells which are
physically sorted by clustering key and column name.

You can see this for yourself if you set up Cassandra and then use
cassandra-cli to interact with your data over Thrift.
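To make the layout above concrete, here is a rough sketch (plain Python, no
Cassandra involved; the structures are made up purely for illustration) of
how CQL-style rows flatten into one sorted cell list per partition:

```python
# Rough sketch of the storage layout described above: each partition
# (row key) holds cells sorted by (clustering key, column name), and the
# clustering-key value is folded into every cell's name.
from collections import defaultdict

rows = [
    {"personName": "n1", "addedtime": "01:00", "color": "blue"},
    {"personName": "n1", "addedtime": "01:15", "color": "black"},
    {"personName": "n2", "addedtime": "01:15", "color": "red"},
]

partitions = defaultdict(list)
for row in rows:
    pk, ck = row["personName"], row["addedtime"]
    # Bare row-marker cell: empty column name, empty value.
    partitions[pk].append((ck, "", ""))
    # One cell per non-key column, named <clustering key>:<column>.
    for col in ("addedtime", "color"):
        partitions[pk].append((ck, col, row[col]))

# On disk, cells are kept ordered by clustering key, then column name.
for cells in partitions.values():
    cells.sort()

print(partitions["n1"])
```

Everything for "n1" lands in a single sorted partition \- which is exactly
why a slice query over a clustering-key range is one cheap sequential read.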

~~~
bluecmd
I know all that; I have extensive experience with NoSQL databases like HBase.
What I'm asking is how Cassandra handles CPU hotspotting on a single row.

If you have the timestamp as a column, you will never be able to shard a
metric across servers, so access to that metric will be I/O bound on whatever
the underlying storage is.

My hypothesis is that if you instead use "metric.timestamp" as the row key,
you will get much better scalability.

~~~
PretzelPirate
The short answer is that Cassandra doesn't handle hotspotting; it's up to you
to model your data to avoid it.

The timestamp is your clustering key, and it's only the partition key that
determines how the data is sharded.

If you have a table with firstName, lastName and loginTime, with partition
key = firstName and clustering key = loginTime, all rows with firstName =
"Tim" will live on the same server regardless of loginTime. If you are often
querying for logins, you might think you should instead make the partition
key (firstName, loginTime), but that would leave your data in an almost
irretrievable state, since you would need to know the exact login times. It
would also leave you reading multiple partitions in every query, which isn't
as performant as reading a single partition.

A better partition key would be (firstName, dayAndHourOfLoginTime), so you
would get one partition per first name and hour. Your query would still have
to know which hours to look for, but that's a simple query pattern.

If you are often looking at data across more than one hour, you probably want
to rethink the key even further, because it's more efficient to read a single
partition.
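The bucketing scheme above can be sketched like this (plain Python;
partition_key and buckets_for_range are hypothetical helper names for
illustration, not driver APIs):

```python
# Sketch of time-bucketing a partition key, as described above: instead
# of partitioning on (firstName, loginTime), partition on
# (firstName, dayAndHour) so each name+hour pair forms one bounded
# partition, and a query only enumerates the hour buckets it needs.
from datetime import datetime, timedelta

def partition_key(first_name: str, login_time: datetime) -> tuple:
    # Truncate the timestamp to the hour to form the bucket component.
    bucket = login_time.strftime("%Y-%m-%d %H:00")
    return (first_name, bucket)

def buckets_for_range(first_name: str, start: datetime, end: datetime):
    """Yield the partition keys a query must touch to cover [start, end]."""
    t = start.replace(minute=0, second=0, microsecond=0)
    while t <= end:
        yield partition_key(first_name, t)
        t += timedelta(hours=1)

print(partition_key("Tim", datetime(2015, 6, 16, 1, 15)))
# All logins for "Tim" between 1:00 and 3:59 span three hour buckets:
print(list(buckets_for_range("Tim", datetime(2015, 6, 16, 1, 0),
                             datetime(2015, 6, 16, 3, 59))))
```

The trade-off is exactly the one described: the bucket caps partition size
and spreads load, but multi-hour queries now fan out over several partitions.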

~~~
bluecmd
Aha! Re-reading your previous comment I understood what you meant.

So Cassandra has a "logical" key \- the primary key \- and the actual row key
familiar from HBase-like DBs is derived from the partition key and the
clustering key. So in essence it's "metric.timestamp" as a row key, but
defined by the schema.

Nice.

