

Performance comparison: key/value stores for language model counts - cgbystrom
http://anyall.org/blog/2009/04/performance-comparison-keyvalue-stores-for-language-model-counts/

======
moe
Flagged for utter nonsense.

Author doesn't know what he's talking about, much less what he's measuring.
The submitter gave it a link-bait title that is not even related to the article.

How did this get voted up?

~~~
wheels
I know this area fairly well and the only thing that I see in the numbers that
looks surprising (or rather, wrong) is that the disk cache obviously wasn't
being flushed after writes.
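
To make the flushing point concrete, here is a minimal Python sketch. It assumes a plain append-only file stands in for a persistent store. The names `timed_writes` and `compare` and the record format are hypothetical, not taken from the article's benchmark. It times the same writes with and without an `fsync` per record:

```python
import os
import tempfile
import time


def timed_writes(path, n, sync):
    """Write n small records to path and return the elapsed seconds.

    With sync=True every record is flushed and fsync'ed, so the timing
    includes pushing data to the storage device instead of only into
    the OS page cache.
    """
    start = time.perf_counter()
    with open(path, "wb") as f:
        for i in range(n):
            f.write(b"key%08d\t1\n" % i)   # hypothetical 14-byte record
            if sync:
                f.flush()                  # drain Python's userspace buffer
                os.fsync(f.fileno())       # ask the OS to write through to disk
    return time.perf_counter() - start


def compare(n=200):
    """Return (cached_seconds, synced_seconds) for n writes each."""
    with tempfile.TemporaryDirectory() as d:
        cached = timed_writes(os.path.join(d, "cached.db"), n, sync=False)
        synced = timed_writes(os.path.join(d, "synced.db"), n, sync=True)
    return cached, synced


if __name__ == "__main__":
    cached, synced = compare()
    print(f"page cache only: {cached:.4f}s, fsync per write: {synced:.4f}s")
```

How large the gap is depends heavily on the disk and filesystem, so treat the output as illustrative only.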

~~~
moe
Which is rather a big deal when you benchmark a persistent kv-store versus a
volatile one, don't you think? The persistent one is eventually i/o bound.

Furthermore, whatever he is measuring, it is neither Tokyo Tyrant nor memcached.
Memcached doesn't break a sweat doing upwards of 10k ops/sec on moderate
hardware, especially when all you're doing is increments. He managed to get
2000/sec out of it.

So again, whatever he's measuring, it's not the "performance of key/value
stores". Both the article title and the link-bait HN title are wrong.

------
swombat
That sounds very surprising... these numbers look pretty appalling for
memcached... can anyone confirm that this is not simply due to a badly set-up
memcached? Or perhaps the author is not using memcached for its intended
purpose?

I find it hard to believe that a widespread solution like memcached would be
100 times slower than one of its alternatives.

Also, all those numbers look awfully low.

~~~
wheels
It's pretty easy to see how much of the performance drop is protocol overhead,
since the numbers are there for Tokyo Cabinet / Tyrant both with and without
the memcached protocol.

This isn't terribly surprising. One is a distributed network protocol, the
other is just moving things around in memory.
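
A rough way to see that difference for yourself is the minimal Python sketch below. It compares incrementing a counter in an in-process dict against doing one request/response round trip per increment. The line-based `incr <key>` protocol is a made-up toy, not the memcached protocol, and `socket.socketpair()` is a local socket pair, so this if anything understates real network overhead. All function names are hypothetical.

```python
import socket
import threading
import time
from collections import defaultdict


def serve(conn):
    """Toy counter server: reads 'incr <key>' lines, replies with the new count."""
    counts = defaultdict(int)
    with conn, conn.makefile("rb") as rf, conn.makefile("wb") as wf:
        for line in rf:                      # ends when the client closes
            key = line.split()[1]
            counts[key] += 1
            wf.write(b"%d\n" % counts[key])
            wf.flush()


def bench_in_process(n):
    """Increment one key n times in a local dict; return (count, seconds)."""
    counts = defaultdict(int)
    start = time.perf_counter()
    for _ in range(n):
        counts[b"the"] += 1
    return counts[b"the"], time.perf_counter() - start


def bench_over_socket(n):
    """Increment one key n times, one round trip each; return (count, seconds)."""
    client, server = socket.socketpair()
    worker = threading.Thread(target=serve, args=(server,), daemon=True)
    worker.start()
    last = 0
    start = time.perf_counter()
    with client, client.makefile("rb") as rf, client.makefile("wb") as wf:
        for _ in range(n):
            wf.write(b"incr the\n")
            wf.flush()
            last = int(rf.readline())        # wait for the reply before continuing
    elapsed = time.perf_counter() - start
    worker.join(timeout=5)
    return last, elapsed


if __name__ == "__main__":
    n = 20000
    _, local = bench_in_process(n)
    _, remote = bench_over_socket(n)
    print(f"in-process: {n / local:,.0f} ops/s, round trip: {n / remote:,.0f} ops/s")
```

Both paths do the same increment. The only difference is the request/response hop, which is the part a network protocol adds.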

The poor performance for memcachedb doesn't surprise me a lot. In my tests
I've found BDB (which it uses behind the scenes) to be frustratingly slow with
writes.

------
Maro
1. The HN title is completely misleading.
2. The linked article is reporting some odd numbers. E.g. Facebook has reported
several hundred thousand operations / second for memcached, while the article
is reporting 120. Also, the article only reports numbers for disk-based BDB,
but BDB also has an in-memory mode (google for "bdb in-memory"). I don't
actually know how it performs, but the option exists.

~~~
sanswork
The linked article is testing on one machine, whereas Facebook has a very large
cluster. He tests all of them on the same machine, so it's the relative
performance that matters in this case.

~~~
Maro
That FB number is for a single 8-core server. Also, for in-memory stores, I
assume reads and writes are not that different.

