
Cassandra vs MongoDB For Time Series Data - jbellis
http://relistan.com/cassandra-vs-mongo/
======
arohner
One thing I always find interesting about these kinds of problems is that most
DBs don't describe how they're implemented. It's easy to use the wrong tool,
and then once you learn how e.g. Mongo is implemented, it's obvious, "oh,
that's why things are slow".

I'd love to see [http://eagain.net/articles/git-for-computer-scientists/](http://eagain.net/articles/git-for-computer-scientists/), but for every DB technology.

~~~
mtkd
A pattern I've seen on a lot of the negative Mongo articles has been people
using it for things they probably shouldn't.

I've yet to use it under any serious load, and am struggling to triangulate from all the articles I read whether it does or doesn't scale efficiently.

All the issues I have hit so far have been self-inflicted; it is still one of the best new technologies I've used in years, but it has taken a while to stop thinking in SQL equivalents and start thinking natively.

~~~
arohner
> whether [Mongo] does/doesn't scale efficiently

It doesn't. Three words: "global write lock". Writes block reads, reads block writes. Implications: if you run a query in production that doesn't hit an index, all traffic stops. The notablescan setting is a very, very good idea. This also means all queries must have an index, so Mongo ends up with more indexes than, say, Postgres would.
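
A minimal sketch of flipping that setting at runtime, assuming pymongo (the parameter can also be passed as --notablescan when starting mongod):

    # Hypothetical sketch using pymongo. With notablescan enabled, a query
    # that cannot use an index errors out instead of scanning the whole
    # collection while holding the lock.
    from pymongo import MongoClient

    client = MongoClient("localhost", 27017)
    client.admin.command({"setParameter": 1, "notablescan": 1})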

It's impossible to configure a clustered Mongo environment to not lose data: [http://aphyr.com/posts/284-call-me-maybe-mongodb](http://aphyr.com/posts/284-call-me-maybe-mongodb)

Sharding configuration is baroque, and limited.

~~~
snmaynard
> Implications: if you run a query in production that doesn't hit an index,
> all traffic stops

Even if Mongo did have a global write lock, which it doesn't, as has already been covered, it yields on page faults, which means that other queries are minimally impacted. See: [http://docs.mongodb.org/manual/faq/concurrency/#does-a-read-or-write-operation-ever-yield-the-lock](http://docs.mongodb.org/manual/faq/concurrency/#does-a-read-or-write-operation-ever-yield-the-lock)

~~~
arohner
Mongo does have a global write lock. A per-db lock does me very little good
when I only have one DB.

As to your linked doc, emphasis added:

> In _some_ situations, read and write operations can yield their locks.

> Long running read and write operations, such as queries, updates, and
> deletes, yield under _many_ conditions.

In practice, I've been bitten hard by this. A new feature rolls out, and users
can't log in anymore, because a query is taking 2 minutes to run.

------
snowwindwaves
I work on SCADA systems, which usually come with a time series database built in so that operators can do some basic plotting. Often these products have a 10- or 15-year legacy.

The SCADA vendors seem to be careful to avoid making the time series database and plotting tools that come with the HMI packages too powerful, as this might cut into their sales of Historian-type products.

If it wasn't a commodity already, the rash of startups that all seem to write their own tools for storing and plotting metrics from the operations of their servers and software services has certainly made the guts of a capable historian readily available for free.

For storing data:

OpenTSDB, based on HBase+Hadoop: [http://opentsdb.net/](http://opentsdb.net/)
KairosDB, based on Cassandra: [https://code.google.com/p/kairosdb/](https://code.google.com/p/kairosdb/)
TimeSeriesFramework: [http://timeseriesframework.codeplex.com/](http://timeseriesframework.codeplex.com/)

For plotting data, I am hoping to find a library that allows real-time plotting, zooming, and scrolling with the mouse wheel. So far I have found:

openHistorian: [http://openhistorian.codeplex.com/](http://openhistorian.codeplex.com/)
Kst: [http://sourceforge.net/projects/kst/](http://sourceforge.net/projects/kst/)
Veusz: [http://home.gna.org/veusz/](http://home.gna.org/veusz/)
Chaco: [http://docs.enthought.com/chaco/](http://docs.enthought.com/chaco/)
guiqwt: [https://code.google.com/p/guiqwt/](https://code.google.com/p/guiqwt/)
pyqtgraph: [http://www.pyqtgraph.org/](http://www.pyqtgraph.org/)
lots of D3-based libraries: [http://selection.datavisualization.ch/](http://selection.datavisualization.ch/)

So there are lots of tools out there, if you've got the patience to figure out which one is right for your application and glue it together.

~~~
Ecio78
What about Graphite ([http://graphite.wikidot.com/faq](http://graphite.wikidot.com/faq))? I've never tried it, but it is described as "Scalable Realtime Graphing". It seems to use an internal database, so maybe it's not OK for you.

EDIT: I read another comment from you; you said it's RRD-like, so it gets rid of old data, which is not what you're looking for.

~~~
drather19
The standard 1s time resolution on Graphite/Whisper also seemed to be a
limiting factor for use with some of these systems, where you want to observe
things on the order of milliseconds (or beyond).

------
espeed
KairosDB ([https://github.com/proofpoint/kairosdb](https://github.com/proofpoint/kairosdb)) is a time-series database for Cassandra. It's an OpenTSDB ([http://opentsdb.net](http://opentsdb.net)) rewrite that supports sub-second precision.

~~~
vosper
Have you any experience using it?

------
mnutt
My company made the same switch from MongoDB to Cassandra, a little over a
year ago. We were storing analytics counters in Mongo and wanted better
consistency guarantees.

What we found when we switched was that Cassandra had better consistency with
similar performance to MongoDB. Then a few months later, as we accumulated
more data, performance started to take a nosedive. Counter increments began
impacting other database operations and the nodes would become unresponsive.
Eventually we moved all of the counters to an in-memory aggregator that
flushed to Postgres a couple of times a second.

Counters were introduced in 0.8 (when we started using them) and are pretty half-baked. There has been some good discussion about overhauling counters, though I'm not sure when those changes are scheduled to land.
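
A simplified sketch of that kind of aggregator, assuming psycopg2 and a hypothetical counters table (the ON CONFLICT upsert additionally assumes Postgres 9.5+):

    # Sketch: accumulate counter increments in memory and flush them to
    # Postgres a couple of times a second. Assumes a table like:
    #   CREATE TABLE counters (key text PRIMARY KEY, count bigint);
    import threading
    import time
    from collections import Counter

    import psycopg2

    counts = Counter()
    lock = threading.Lock()

    def incr(key, n=1):
        with lock:
            counts[key] += n

    def flush_forever(dsn, interval=0.5):
        conn = psycopg2.connect(dsn)
        while True:
            time.sleep(interval)
            with lock:
                batch = dict(counts)
                counts.clear()
            if not batch:
                continue
            with conn, conn.cursor() as cur:  # one transaction per flush
                for key, n in batch.items():
                    cur.execute(
                        "INSERT INTO counters (key, count) VALUES (%s, %s) "
                        "ON CONFLICT (key) DO UPDATE "
                        "SET count = counters.count + EXCLUDED.count",
                        (key, n),
                    )

    threading.Thread(target=flush_forever, args=("dbname=metrics",),
                     daemon=True).start()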

~~~
gigq
The company I work for had a similar experience with Cassandra. We were
running it back in the 0.6 days. Once any of our nodes got around 500GB on
them the performance would tank.

It seems that they later fixed the performance in 1.2 ([http://www.datastax.com/dev/blog/performance-improvements-in-cassandra-1-2#comment-177166](http://www.datastax.com/dev/blog/performance-improvements-in-cassandra-1-2#comment-177166)), but by that time we had moved our data over to HBase, and we haven't had any regrets.

------
bryanh
It seems a bit out of the ordinary, but we've found ElasticSearch + facets to be a _wonderful_ way to consume time series data. You can dump in a bunch of information about an event (for example, a Unix timestamp or even response time in ms) and then request a facet over the range and get back bucket counts.
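
A minimal sketch of that kind of query against the facets API of the era, via Python requests; the index name, field, and date bounds are illustrative assumptions:

    # Hypothetical example: bucket event counts per minute over a time range
    # using a date_histogram facet.
    import requests

    query = {
        "query": {"range": {"timestamp": {"gte": "2013-08-01", "lt": "2013-08-02"}}},
        "facets": {
            "per_minute": {
                "date_histogram": {"field": "timestamp", "interval": "minute"}
            }
        },
    }
    resp = requests.post("http://localhost:9200/events/_search", json=query)
    for entry in resp.json()["facets"]["per_minute"]["entries"]:
        print(entry["time"], entry["count"])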

ES also has quite nice clustering abilities that make it pretty painless to scale out. If you are clever about your routing keys, you can even pre-shard into hundreds of shards very early on and take no performance hit on map reductions, while keeping the capacity to scale out with another node without reindexing.

We've been surprised at the Swiss-army-knife-like versatility of ES.

------
ghc
We had some major issues using MongoDB for time series data due to the write volume (real-time sensor data). The solution for us was Riak, mainly because we never need to update a vector clock, leaving us without the need for conflict resolution (and last-write-wins is fine for sensor data).
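
A minimal sketch of that setup with the official Python Riak client; the bucket and key names are illustrative assumptions:

    # Hypothetical sketch: with last_write_wins set on the bucket, Riak skips
    # vector-clock conflict resolution and the newest write for a key simply
    # wins, which is fine for immutable sensor readings.
    import riak

    client = riak.RiakClient(protocol="pbc", pb_port=8087)
    bucket = client.bucket("sensor_readings")
    bucket.set_property("last_write_wins", True)

    bucket.new("sensor42:1375315200", data={"temp_c": 21.7}).store()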

~~~
k_bx
Why wouldn't generating some hashed _id work? It would then scale for writes
easily (or, as in current mongodb, you could use hash-based indexes).

I'm just asking since Riak seemed much slower for me when I tried it.

~~~
ghc
Hashed _id's aren't the bottleneck. Disks are the bottleneck. MongoDB is
seriously lacking in the compression department and bogs down our disks
constantly.

Riak is not slow as long as you're not running your cluster on VPSs. At our
scale Riak's performance has been significantly better than MongoDB's due to
our heavy write load, and there are fewer issues with using big disks with
somewhat larger seek times.

------
abalone
I'm a little confused by the schema he describes. He says represent "periods"
in columns and "records" in rows. Are these two different units of time?

For say stock data that is sampled every second, is he saying there'd be one
row per symbol per minute (a "record"), with 60 columns holding the value for
each second (a "period")?

If so, does that mean the data is buffered in memory for 1 minute before
getting written to the DB?

 _" Cassandra is really good for time-series data because you can write one
column for each period in your series and then query across a range of time
using sub-string matching. This is best done using columns for each period
rather than rows, as you get huge IO efficiency wins from loading only a
single row per query. Cassandra then has to at worst do one seek and then read
for all the remaining data as it’s written in sequence to disk._

 _We designed our schema to use IDs that begin with timestamps so that they
can be range queried over arbitrary periods like this, with each row
representing one record and each column representing one period in the series.
All data is then available to be queried on a row key and start and end times.
"_

~~~
nemothekid
No, OP's wording can be confusing if you have never used Cassandra before.

You are right. There is one row per symbol. However with Cassandra any given
row can have any number of columns, so when you want to write a new value, you
just create a new column for that second (/period).

The writes are not buffered in memory.
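
A minimal sketch of that write path, assuming the DataStax Python driver and an illustrative ticks table (none of these names come from OP):

    # Hypothetical sketch: one partition (storage row) per symbol; every
    # INSERT with the same symbol appends one new column for that period.
    # Nothing is buffered client-side and nothing existing is rewritten.
    from datetime import datetime, timezone

    from cassandra.cluster import Cluster

    session = Cluster(["127.0.0.1"]).connect("market")  # keyspace assumed
    session.execute("""
        CREATE TABLE IF NOT EXISTS ticks (
            symbol text,
            ts     timestamp,
            price  double,
            PRIMARY KEY (symbol, ts)
        )
    """)
    session.execute(
        "INSERT INTO ticks (symbol, ts, price) VALUES (%s, %s, %s)",
        ("GOOG", datetime.now(timezone.utc), 887.25),
    )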

~~~
abalone
I see. From what I know about Cassandra, this is a much more expensive write
than doing it as a new row.

To do this he has to be using dynamic columns, and those are stored as one
serialized blob per row. So the more data you have in the row, the more
expensive the deserialization/reserialization is with each column you add. For
very large series this could be an issue.

But it sounds like this is tolerable for his app because the writes are
distributed over time in a predictable fashion.

I am a little surprised though at the author's claim that fetching a single
big row results in "huge IO efficiency" over a range of small rows. I'd expect
a small amount of overhead, but isn't it more or less the same amount of data
being retrieved? What am I missing?

EDIT: I see the author mentioned that it reduces disk seeks because it's all
serialized together already. Sort of like you're defragging the series data on
every write. I guess that makes sense.

Personally I would probably look at using SSDs and keep the schema more "sane"
and have more scalable writes, but that's just me.

~~~
nemothekid
1.) You do not have to use dynamic columns for this. Unfortunately, I've found in my own experience that as Cassandra has matured over the last year, a lot of terminology has fallen in and out of fashion, and it's hard to recognize what is actually current. Dynamic columns in CQL3 have nothing to do with the behavior OP is talking about, and dynamic columns are a sort-of deprecated feature in Cassandra 1.2. In CQL3, OP's use pattern is actually hidden if you didn't know any better.

In short, there is no deserialization/reserialization. OP's writes are append-only. I have a similar use pattern to OP, and I haven't seen any performance issues with 100,000s of columns (on SSDs).

2.) The "huge IO efficiency" is similar to what you would see in any columnar
data store. Wikipedia has a good walkthrough of it
([http://en.wikipedia.org/wiki/Column-
oriented_DBMS](http://en.wikipedia.org/wiki/Column-oriented_DBMS)). The short
story is now there is fewer meta data between his values.

--

In any case, it works out because Cassandra is far better suited to this type of use pattern than Mongo is. We migrated from MongoDB (on SSDs) to Cassandra for similar reasons. The perf killer on Mongo in this scenario is the write lock.

~~~
abalone
Thanks. I definitely need to read up on Cassandra wide rows now.

------
CurtMonash
It would be surprising if, for data with a consistent, predictable structure,
MongoDB had the best, most consistent, and most predictable performance.
MongoDB's raison d'être is that data doesn't always have consistent,
predictable structures.

That said, there's cool stuff out there in the Mongo ecosystem. E.g., TokuMX
is a whole new Mongo storage engine.

------
cnlwsu
For what its worth, we have been using Cassandra for storing time series for
about 2 years now at ~2k writes a sec. I would say every issue was self-
induced and Cassandra has been amazingly patient with us. It works amazing
with this scenario. We experimented with MongoDB a lot initially (along with
riak, hbase, etc) and found about the same thing. Turns out using a database
in a way its designed to work turns out in your favor. That said hbase did
well too, but it scared our ops team.

All the new changes in 1.2 and 2.0 with CQL really make it seem like DataStax is focused on being MySQL and ignoring the time series use case, though, which makes me nervous.

~~~
koeninger
I understand the nervousness, but I was able to convert a thrift/hector time
series model to exactly equivalent CQL3 without too much trouble. The (perhaps
non-obvious) options involved were "WITH COMPACT STORAGE" for wide rows and
"WITH CLUSTERING ORDER BY (blah DESC)" for a reversed comparator.

[http://www.datastax.com/documentation/cassandra/1.2/webhelp/...](http://www.datastax.com/documentation/cassandra/1.2/webhelp/index.html#cassandra/cql_reference/create_table_r.html)
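
For reference, a sketch of what the converted DDL can look like; the table and column names are illustrative, not the actual schema:

    # Hypothetical CQL3 DDL, held as a Python string for any CQL client.
    # COMPACT STORAGE maps the table onto a classic Thrift wide row, and
    # CLUSTERING ORDER BY (ts DESC) gives the reversed comparator.
    DDL = """
    CREATE TABLE events (
        series text,
        ts     timestamp,
        value  blob,
        PRIMARY KEY (series, ts)
    ) WITH COMPACT STORAGE
      AND CLUSTERING ORDER BY (ts DESC)
    """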

~~~
relistan
Thanks for the pointers. We are also looking at that as a future plan. We are currently using Thrift, as has been rightly pointed out, and I should have mentioned that.

------
gwu78
I'm assuming this website is being served from S3, based on the HTTP headers. As I'm typing this, I get an empty Content-Type header. Is that a configuration oversight, or is this par for the course when serving web pages from S3?

~~~
relistan
An odd oversight. Thanks for pointing it out. I'll see what's up with that.

------
astral303
I didn't see the size of the cluster described, but watch out for having timestamps as your row keys, as you are going to have hot spots (all timestamps land in one token range).

What is your replication factor and the size of the cluster?

This might be improved with vnodes, though I'm not sure how granular and automatic they are. If they are just an even split of the range (e.g. 256 vnodes across the same 00-ff range), then you will have the same problem.

This is the major reason why DataStax pushes random partitioning so much: it's easy to get into hot water with byte-ordered keys.

~~~
relistan
You're right. As I mention in the article, the row keys are not timestamps; the columns are timestamps. We use the RandomPartitioner for rows.

~~~
astral303
Sorry, I misread!

------
topbanana
We use KDB for time series data, but it is expensive.

Has anyone checked this project out? [http://www.monetdb.org/Home](http://www.monetdb.org/Home)

~~~
agentultra
My go-to for any time-series data is a column-oriented database.

FYI, Cassandra is row-oriented.

~~~
zorked
Well, but in a column-oriented database you would store your time series data
in columns. Cassandra is "row-oriented" and you store the time series in rows.
In both cases the data ends up sequentially on the disk, which is what you
want.

------
kailuowang
I am not sure if this is really an apples-to-apples comparison, because from the article it seems that the data schema in Cassandra was carefully designed, while it's not obvious whether that's the case for the MongoDB one ("it was supposed to be a temporary one").

MongoDB has many, many limitations. Data schemas have to be carefully designed with those limitations in mind; otherwise they are going to have a huge impact on performance.

~~~
nemothekid
When it comes to time series data, there aren't a lot of ways to design your schema. In my experience when working with a write-heavy load such as time series data, you are always fighting the write lock with MongoDB. AFAIK, there isn't much you can do in your schema to combat this.

------
aerolite
I'm considering using Cassandra for time series data. What exactly did you
mean by the columns were for the "period"? What is the period here?

------
wcdolphin
I find it strange that there is no discussion of the performance of writing data.

Cassandra and Mongo differ hugely in this respect, and I expect you will see huge gains in write performance. Mongo's write locking means that reads will be blocked while you are inserting data. Reads in Cassandra may trigger compaction and/or require sequential IO if the table has not yet been compacted, so the tradeoff is interesting.

~~~
rbranson
Reads in Cassandra will never trigger a compaction. Compaction is only triggered by writes. If the compaction queue gets behind due to overloaded CPUs/disks, reads will begin to slow down.

------
mtkd
It would be really useful to get more background on the before/after, and a lot more detail on the structures, the clustering, and the queries required - plus any perf improvements that were attempted previously, successful or not - and any lessons learned from making the change.

~~~
relistan
Good ideas, I'll see if I can generate some time to post more about what we
did.

------
lnkmails
We (Cloud Monitoring @ Rackspace) have built a time series data store for
persisting metrics on top of Cassandra. Check it out:
[https://news.ycombinator.com/item?id=6257401](https://news.ycombinator.com/item?id=6257401)

------
jsemrau
Makes one wonder what the performance of MySQL/Memcached would have been.

From my work experience, time-series data is quite stable in definition. I
would see this more as a business case for a relational database than a NoSQL
database.

~~~
relistan
Disk usage patterns are the problem with most relational stores (and were for
Mongo). When you do a query on time series data, you want to look up a start,
read some data, and stop when you hit the end point. Getting it laid out
largely sequentially on disk is a big win there.
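
A minimal sketch of that access pattern as a CQL range query, assuming the DataStax Python driver and a hypothetical events table:

    # Hypothetical range read over one partition: Cassandra seeks to the
    # first matching column, then reads forward sequentially on disk until
    # the end of the range.
    from datetime import datetime

    from cassandra.cluster import Cluster

    session = Cluster(["127.0.0.1"]).connect("metrics")  # keyspace assumed
    rows = session.execute(
        "SELECT ts, value FROM events "
        "WHERE series = %s AND ts >= %s AND ts < %s",
        ("sensor-42", datetime(2013, 8, 1), datetime(2013, 8, 2)),
    )
    for row in rows:
        print(row.ts, row.value)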

~~~
jsemrau
What GB/TB volume are we talking about here? Memcached would be able to keep some parts of it in RAM.

------
rishid
Isn't Graphite designed specifically for time series data? Or is there another use case where you'd use Cassandra over Graphite?

~~~
snowwindwaves
Graphite's back end is called Whisper, and it is an RRD-style datastore that throws away data as it gets older. You can see a plot of your time series from three years ago with one data point per day, but not at the one data point every 0.5 seconds you initially recorded.

------
pjmlp
It is interesting to see that most NoSQL databases, implemented in Java/Erlang, scale better than a C++ one.

------
ddorian43
Does anyone use Cassandra with the OrderedPartitioner, and is it good? (I know random is best.)

~~~
Aaronontheweb
OrderedPartitioner is definitely an expert-mode feature. You really have to have a good feel for the key proximity of reads and writes; otherwise you can create some really nasty hotspots on individual nodes. We stick with Random.

------
Finster
But is Cassandra web scale?

~~~
AHconsidered
A vague graph from this year's DataStax Cassandra conf:
[https://pbs.twimg.com/media/BMfdOagCcAAxMVU.jpg](https://pbs.twimg.com/media/BMfdOagCcAAxMVU.jpg)

------
leoplct
For time series data there is also TempoDB.

~~~
vasilipupkin
How's that? Has anybody tried using it?

