

MongoDB Performance & Durability (2010) - wisty
http://redbeard0531.s3.amazonaws.com/mikeals_blog_backup/MongoDB+Performance+&+Durability.html

======
wisty
I'm just starting MongoDB, and I really like it. It's got nice documentation,
and a nice culture.

But it does have weaknesses, and I have just been bitten hard by one.

I had a schema something like "db.posts.insert({'tags': ['python', 'mongodb']})".

One tag just couldn't be reliably deleted (via "$pull: {'tags': 'immortal tag
that will not die'}").

It turns out I had made "tags" unique, but somehow managed to get a duplicate
tag in. (Before the unique index? After? I forget.) Trying to delete the tag
would "throw" an E11000 error, but since MongoDB doesn't check for errors by
default, it would just result in this immortal tag sitting there. Smirking.

Apparently, using a unique index on an array is the wrong thing to do anyway
(does it mean different posts can't share tags? or that you can't have 2 posts
with exactly the same set of tags? I'm not sure), and you are just meant to
use $addToSet. I'm not sure how you are meant to clean an existing array with
duplicate elements though. Perhaps you just have to check in the app layer,
then update: db.posts.update({'_id': post['_id']}, {'$set': {'tags': list(set(post['tags']))}}).
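
A minimal sketch of that app-layer cleanup, assuming the post has already been
fetched as a plain dict. The helper names dedupe_tags and build_dedupe_update
are made up for illustration, and using $set (so only the tags field is
overwritten) is my assumption about what the update should do. It keeps the
first occurrence of each tag so the order doesn't get shuffled the way a bare
set() would.

```python
def dedupe_tags(tags):
    """Return tags with duplicates removed, keeping first-seen order."""
    seen = set()
    unique = []
    for tag in tags:
        if tag not in seen:
            seen.add(tag)
            unique.append(tag)
    return unique


def build_dedupe_update(post):
    """Build the (query, update) pair that overwrites only the tags field.

    Hypothetical helper: it only builds the documents, it does not talk to
    the database.
    """
    query = {'_id': post['_id']}
    update = {'$set': {'tags': dedupe_tags(post['tags'])}}
    return query, update


# Example document shaped like the schema above (the _id value is made up).
post = {'_id': 1, 'tags': ['python', 'mongodb', 'python']}
query, update = build_dedupe_update(post)
print(query, update)
```

The resulting pair would then be handed to the driver's update call, e.g.
db.posts.update(query, update).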

Lesson 1: if something is failing a lot, or might fail in an annoying way, use
db.error() (in Python ... or whatever the last_error command is in your
client).
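
Roughly what "check the error" could look like today. This is a sketch, not
the API I was using at the time: it assumes a collection object with an
update_one() method whose result has "acknowledged" and "modified_count"
attributes, which is how current pymongo collections behave. The names
pull_tag_checked and WriteNotConfirmed are invented for the example.

```python
class WriteNotConfirmed(Exception):
    """Raised when the update was not confirmed to have changed the post."""


def pull_tag_checked(collection, post_id, tag):
    """Remove `tag` from one post and fail loudly if nothing changed.

    Assumption: `collection` exposes update_one(query, update) and returns
    a result with `acknowledged` and `modified_count`, as pymongo does.
    """
    result = collection.update_one({'_id': post_id}, {'$pull': {'tags': tag}})
    # An unacknowledged (fire-and-forget) write tells us nothing, so treat
    # it as a failure instead of silently carrying on.
    if not result.acknowledged:
        raise WriteNotConfirmed('write was sent without acknowledgement')
    # Acknowledged, but no document was modified: the tag is still there.
    if result.modified_count != 1:
        raise WriteNotConfirmed(
            'tag %r was not removed from post %r' % (tag, post_id))
    return result
```

With something like this wrapped around the $pull, the immortal tag would have
shown up as an exception on the first attempt rather than as a mystery.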

Lesson 2: don't try to use unique indexes on array elements.

Otherwise, it's a very nice db.

------
jwilliams
Keep in mind this is a year old and MongoDB has since implemented journalling
(although afaik it's not on by default yet).

It's not specifically a response to this article, but you can read 10gen's
view on durability here:
<http://blog.mongodb.org/post/381927266/what-about-durability>

~~~
wisty
As I would put it:

In the event of a single server crash, Mongo may need to be restored from
backup, or another master. CouchDB handles this much better, and can just
restart where it left off. But in the worst possible single-server scenario,
there's smoke coming out of your server and the hard drive is toast. CouchDB
and MongoDB perform about the same - if you had replication or a recent
backup, you are fine. If you didn't, you are f*cked.

CouchDB is better for small servers which crash a lot and can't use
replication. You can run it on a mobile phone. Mongo is better for logging
(where you need fast writes, and don't care about a couple of lost records).
Most uses will be somewhere in between, and you need to know the limitations
of the stack, so you can work around them.

~~~
rit
If you are running with journaling enabled, you should get a much stronger
crash recovery case.

In the event of a crash, the journal will be replayed:
<http://www.mongodb.org/display/DOCS/Journaling>

~~~
rdtsc
And that is a good feature for a database to have! Why not make it the default?

~~~
rit
When we added it, the decision was made not to flip it to default right away,
since it was a new feature and there could be bugs, unexpected behavior, etc.

As noted, it's possible that will change in future releases; there have been a
lot of improvements to journaling since its initial release.

