

MongoDB: A Light in the Darkness - mattculbreth
http://www.engineyard.com/blog/2009/mongodb-a-light-in-the-darkness-key-value-stores-part-5/

======
gfodor
I've got serious "key-value fatigue." These articles are invariably glowing,
but somewhere in the comments or on Usenet you find the thread saying "we
actually tried this and it fell on its face in production." I'm tired of
articles that spend the whole time talking about features and showing
single-machine "hello world"-esque 'performance tests' while neglecting the
one thing people making IT decisions care about: does it actually work?

For us, we're using Tokyo and it explodes at 70GB of data, though I'm guessing
it's a configuration issue or something. We purge it every week now since it's
just used as a persistent cache, and I haven't looked into why. Still, it
puzzles me how it can basically break at a certain limit instead of just
starting to push things to disk.

To cut through the noise, can anyone here vouch for a simple "plug and play"
kv store that actually works as advertised, at scale, and ideally is
distributed so I can just add nodes as needed? Third party tall tales and
anecdotes don't count, I want you to explain in detail your own personal
experience running one of these things on a real, live, many-noded system.
CouchDB, MongoDB, Tokyo, Redis, HBase, MemcacheDB, Voldemort, Cassandra, the
list goes on (I realize not all of these are strict k-v stores). Who out there
other than the original authors can get up front and say these things work
well?

~~~
moe
_does it actually work?_

Someone has to step forward and try it. You did that with tokyo and noticed it
doesn't. Report your bug and after it's fixed it will work better for the next
person to try.

 _can anyone here vouch for a simple "plug and play" kv store that actually
works as advertised_

It doesn't exist. For the simple reason that most K/V stores are under 2 years
old and haven't gotten enough real world exposure to work all the corner cases
out.

If you want rock-solid then you're looking at the wrong market. Use
PostgreSQL. It's not optimized for that use-case, but it's as chuck-norris as
you'd expect after 20 years of development and production vetting. It won't
segfault under load, it won't corrupt the datastore and it won't plot
sawtooths while plowing through a large index.

And at the end of the day a K/V store is a table with two columns, right?
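
To make the "table with two columns" idea concrete, here is a minimal sketch
of a key-value interface on top of a relational table. It uses Python's
built-in sqlite3 module purely so the example is self-contained; that is an
assumption of this sketch, not PostgreSQL. The class name `TableKV`, the table
name `kv`, and the column names are all hypothetical.

```python
import sqlite3


class TableKV:
    """Key-value store backed by a two-column table (illustrative sketch)."""

    def __init__(self, path=":memory:"):
        # ":memory:" keeps the sketch self-contained; a real deployment
        # would point at a file or a database server instead.
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS kv (key TEXT PRIMARY KEY, value BLOB)"
        )

    def put(self, key, value):
        # The primary key makes this an upsert: an existing key is overwritten.
        with self.db:
            self.db.execute(
                "INSERT OR REPLACE INTO kv (key, value) VALUES (?, ?)",
                (key, value),
            )

    def get(self, key, default=None):
        row = self.db.execute(
            "SELECT value FROM kv WHERE key = ?", (key,)
        ).fetchone()
        return row[0] if row else default

    def delete(self, key):
        with self.db:
            self.db.execute("DELETE FROM kv WHERE key = ?", (key,))
```

The primary-key index is what turns `get` into a lookup rather than a scan.
Whether this holds up at the sizes discussed in this thread is exactly the
open question, and the sketch does not answer it.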

~~~
gfodor
A few things. First, the "shoot the messenger" reply was inevitable, but
doesn't really detract from my point that these types of blog posts trumpeting
features and hype are not worthwhile. We will report this as a bug if and when
we have a useful bug report to submit. As of right now we don't.

Second, we use PostgreSQL for our RDBMSes, but we haven't tried it on the
workload we give our K-V store, so I can't say whether it would work. We're
talking tens of millions to hundreds of millions of HTML documents keyed by
URL, and I'm skeptical the performance we'd get would be close to the
1000-10000 tps we experience with Tokyo. Would be a good experiment, though!
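
A rough sketch of what such an experiment could look like: write HTML-sized
blobs keyed by URL into a two-column table and report writes per second. This
is only the shape of the test. It runs against an in-memory SQLite database so
that it is self-contained, which is an assumption of the sketch, and its
numbers say nothing about PostgreSQL or Tokyo on real hardware. The function
name, table name, URL pattern, and document size are all made up.

```python
import sqlite3
import time


def measure_write_tps(n_docs=10_000, batch=1_000):
    """Insert n_docs URL-keyed blobs and return (row_count, writes_per_second)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE pages (url TEXT PRIMARY KEY, html BLOB)")
    body = b"<html>" + b"x" * 2048 + b"</html>"  # stand-in for a small page

    start = time.perf_counter()
    for base in range(0, n_docs, batch):
        rows = [
            (f"http://example.com/page/{i}", body)
            for i in range(base, min(base + batch, n_docs))
        ]
        # One transaction per batch; committing per row would measure
        # commit overhead rather than insert throughput.
        with db:
            db.executemany("INSERT OR REPLACE INTO pages VALUES (?, ?)", rows)
    elapsed = time.perf_counter() - start

    count = db.execute("SELECT COUNT(*) FROM pages").fetchone()[0]
    return count, n_docs / elapsed


if __name__ == "__main__":
    count, tps = measure_write_tps()
    print(f"{count} rows written at roughly {tps:.0f} writes/sec")
```

A meaningful comparison would need the real database, realistic document
sizes, concurrent clients, and a data set larger than RAM, which is where the
problems described in this thread tend to show up.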

------
gstar
MongoDB works very well for us with 100GB of data per collection, although we
did run into a severe bug with .count not using an index, totally killing
performance (we're talking 60 seconds to return).

Inserts and indices, however, are very, very fast, and the bug was fixed
incredibly quickly; the fix is now in trunk.

It doesn't seem quite cooked yet, but it's a very very nice start, and
promises much. I prefer it to the other KV stores that are out there right
now, anyway.

