

Benchmarks of Cassandra, HBase, VoltDB, MySql, Voldemort and Redis - chillax
http://vldb.org/pvldb/vol5/p1724_tilmannrabl_vldb2012.pdf

======
jhugg
VoltDB Engineer here. It seems like the authors made some unfortunate choices
in the configuration and usage of VoltDB.

A limited number of synchronous loaders will always present scalability
problems and do a poor job of measuring the achievable throughput. Even if it
takes only 500 microseconds to round-trip a transaction from the client, each
synchronous client can do at most 2,000 transactions per second. Scaling
linearly to 12 nodes might require hundreds of these synchronous clients.
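That per-client ceiling is just Little's Law. As a back-of-envelope sketch (the 50,000 tx/s per-node target below is a made-up illustration, not a number from the paper or from VoltDB):

```python
def max_throughput(clients, round_trip_s):
    # Each synchronous client keeps at most one request in flight,
    # so its throughput is capped at clients / round-trip latency.
    return clients / round_trip_s

# One synchronous client with a 500 microsecond round trip:
per_client = max_throughput(1, 500e-6)       # 2000.0 tx/s

# Hypothetical target: 12 nodes at 50,000 tx/s each.
clients_needed = (12 * 50_000) / per_client  # 300.0 synchronous clients
```

Asynchronous clients dodge this by keeping many requests in flight per connection, which is why the parallelism of the load generator matters so much.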

VoltDB is commonly used with many parallel synchronous clients, such as web
front-ends, or with asynchronous workloads, such as event feeds. Both of these
workloads scale very well, so long as there is enough parallelism.

They also used global, consistent, multi-partition transactions for the YCSB
scan. I'm not sure if that is necessary, and it would also limit scalability.

I can't speak too directly to the work because the code and configuration
don't seem to be public. Perhaps I'm mistaken.

It would have been nice if they had reached out to us at VoltDB to verify
their configuration and client. They also could have contacted Andy Pavlo of
the H-Store project, who has some experience with YCSB on H-Store.

Still, we can always learn from something like this. One thing we're working
on in VoltDB is better performance for workloads that aren't designed to be
parallel. This means more performance for synchronous clients and for clients
that do global consistent reads. Another area we could improve is more
prominent diagnostics showing why your cluster isn't running faster. That
information is all available in our system tables and management tools, but we
could probably boil it down to a single status field telling you whether your
cluster is starved on client requests, stalled on inter-node transaction
agreement, or held up by slow procedures. In our experience, it's usually the
first one. I'll go file a ticket now.

~~~
jbellis
Hmm, there has to be more to it than that. VoltDB throughput went _down_
scaling from 1 node to 4 in every workload but RSW. This is not a sign of a
system that would handle many more requests if only there were more client
parallelism.

~~~
arielweisberg
Another Volt developer here.

Short answer: Volt establishes a global order across transactions.
Transactions can't be executed immediately because their position in the
global order isn't known when they initially arrive. The exchange of ordering
information is driven by in-flight transactions. In the absence of sufficient
in-flight transactions to drive the ordering process, heartbeats are sent
every 5 milliseconds, resulting in increased latency and lower throughput as
the system waits on heartbeats. 2-4x the concurrency should have shown
different results.
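As a rough model of that starvation effect (a toy sketch, not Volt's actual internals; the 5 ms figure is the heartbeat interval mentioned above):

```python
HEARTBEAT_S = 0.005  # 5 ms heartbeat interval

def starved_throughput(in_flight):
    # When there are too few in-flight transactions to drive the
    # ordering exchange, each batch of in-flight work effectively
    # waits for the next heartbeat, so throughput is bounded by
    # roughly in_flight / heartbeat interval, regardless of hardware.
    return in_flight / HEARTBEAT_S

low = starved_throughput(10)   # 2000.0 tx/s ceiling
high = starved_throughput(40)  # 8000.0 tx/s -- 4x the concurrency, 4x the ceiling
```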

We're finishing a different transaction initiation system that doesn't produce
a global order, so that people can do this kind of benchmark and get the
expected result. You can switch on the beta version in 2.8.

I also wonder if they had clock skew. The global ordering process is always
delayed by the amount of clock skew you have. With NTP properly configured
this is tens of microseconds, but with a typical out-of-the-box NTP config it
will be milliseconds unless you have a lot of uptime.

I don't really see this as an excuse, and it is why we are eating the pain of
a transaction initiation rewrite: so people can use Volt the traditional way,
with small numbers of synchronous client threads.

------
LeafStorm
This is like publishing the comparative performances of a Nissan Altima, a
Ford F-150, a Honda Odyssey, a Smart Fortwo, a Kawasaki Ninja, and a Gillig
Advantage.

Sure, it's possible to determine which one has the best land speed, range, and
fuel economy, but you're also sure as heck not going to use a Gillig Advantage
to do an F-150's job or vice versa.

------
llasram
With these sorts of academic benchmarks I frequently find myself questioning
what exactly got tested. The researchers obviously know what they're doing,
but that doesn't make them experts on all of the technologies under test.
Their description of their HBase configuration in particular leaves me
skeptical of how closely it reflects a well-tuned production setup. I think
publishing the actual configuration files they used for each technology would
be a big help.

~~~
linuxhansl
My thoughts exactly. The HBase numbers point to a "strange" setup.

Disclaimer: HBase developer here.

------
antirez
In a system like Redis Cluster (or any other where clients talk to the right
nodes directly), you would expect to see linear scalability as the number of
nodes goes up. This fact is not captured by the setup of this benchmark.

If you think about it, in a real-world scenario where you have 100 database
nodes, you likely have not just threaded clients but N different processes
running on M different machines, querying different nodes independently.

~~~
jbellis
Unless you're CPU-bound on the client, it's irrelevant whether you're using
multiple threads or multiple processes to generate the load, assuming of
course that you're not using a GIL-bound client, which the Java-based YCSB is
not.

~~~
antirez
If you look at the Redis graphs, there is no linear scalability with N
distinct nodes. Now, given that Redis has no proxy nor any other node-to-node
chatter in this setup, how is it possible that it doesn't scale linearly?

~~~
jbellis
I wondered the same thing, but I do know that it's not threads vs processes.
:)

------
lazyjones
Interesting, though I miss Riak and Postgres, and also a benchmark of
degradation effects after running the workload a few tens or hundreds of times
(due to on-disk / index fragmentation).

------
lucian1900
It's refreshing to see a benchmark that is somewhat thorough and doesn't even
make sweeping statements like "X is faster than Y".

Everyone who's ever written a hello world benchmark? Learn from this.

~~~
freemonoid
Well, they did say that Cassandra is better than all the others, especially
when compared to HBase, on nearly all measurements (except for latencies in
the high-write scenarios).

This makes the decision of Facebook to go with HBase for their new messaging
platform back in 2010 all the more strange. Though that was two years ago so
things might have changed in Cassandra's favor since then.

~~~
ot
> This makes the decision of Facebook to go with HBase for their new messaging
> platform back in 2010 all the more strange.

Speed isn't everything to a database. AFAIK they chose HBase over Cassandra
because of consistency guarantees: eventual consistency is a bad choice for a
messaging platform.

~~~
erichocean
_eventual consistency is a bad choice for a messaging platform_

Strange that you would mention that in the context of Cassandra, since it
allows per-read/write configuration of consistency, from "eventual" to
"strong". You get exactly what you ask for with Cassandra, whether it's
availability or consistency.

AFAIK, HBase only supports strong consistency.
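That per-operation tuning boils down to the usual R + W > N rule for Dynamo-style quorums. A toy check (not actual Cassandra code, just the arithmetic behind the consistency levels):

```python
def is_strongly_consistent(n_replicas, write_replicas, read_replicas):
    # A read is guaranteed to overlap the most recent write
    # when the read and write replica sets must intersect: R + W > N.
    return read_replicas + write_replicas > n_replicas

# Replication factor 3:
quorum = is_strongly_consistent(3, 2, 2)  # QUORUM writes + QUORUM reads -> True
one    = is_strongly_consistent(3, 1, 1)  # ONE + ONE -> False (eventual)
```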

~~~
dialtone
This is not the main reason people would choose HBase. For time-series data,
HBase is much better at load balancing the data and at storing the same data
in sequential blocks, so that in a single operation you can fetch all of the
interesting data points. Cassandra supports range queries, but the last time
I looked it wasn't great at load balancing data across nodes when using the
OrderedPartitioner. Am I remembering wrong?

~~~
jbellis
Time series works very well on Cassandra. If anything, avoiding global-
ordering hot spots is a Good Thing.

http://rubyscale.com/blog/2011/03/06/basic-time-series-with-cassandra/

http://www.datastax.com/dev/blog/advanced-time-series-with-cassandra

http://www.youtube.com/watch?v=F-fYqPu2ciQ&feature=youtu.be

~~~
dialtone
Does Cassandra support range queries with the RandomPartitioner now?

~~~
jbellis
Within a partition, yes.

------
rb2k_
> Since the Redis cluster version is still in development, we implemented our
> own YCSB client using the sharding capabilities of the Java Jedis library.

I just glanced over this, but at that point they may well have benchmarked
driver performance. I also don't see why somebody would compare Cassandra
against MySQL and against Redis.

While they all persist application data, their feature sets are so completely
different that I don't see a task for which people would actually have to
decide between them.

~~~
lucian1900
I think they wanted to compare them on tasks where someone might reasonably
choose any of them.

~~~
rb2k_
I haven't seen those use cases yet.

Either you want to scale over multiple systems or you don't. Either you have
more data than RAM or you don't.

~~~
michaelt
How about Hacker News, and more generally internet discussion forums?

Many discussion forums are SQL backed - but both Reddit and Digg are reported
to use Cassandra, a NoSQL system.

~~~
rb2k_
In this case Redis would be the odd man out. Unless you want to do manual
sharding and replication, MySQL is also targeted at a different use case. I
also don't think anybody would run a 40-node MySQL master-master cluster.

------
jdefarge
Why didn't they test MongoDB or Riak? Is Voldemort really more relevant than
Riak or MongoDB? WTF?!

_Sorry, boys, but this paper looks like the result of an ill-performed
research job, done in a hurry to get published._

------
weinbergm
Another vote for MongoDB

------
hastur
Should've tested MongoDB too!

~~~
enlighten_user
Wanna see Mongo measurements too.

