However, one of the great things about MongoDB is that, in some cases, you can easily afford to lose a minute's worth of inserts in exchange for huge performance gains. Large chunks of data used for analytics are a good example, since lost data likely won't affect the final aggregates/percentages.
Some of our inserts, like user registration, we run with safe=>true. Others, like audit logs (which, for us, aren't as important as they might be for others), we don't.
My more mundane problem was that I didn't know whether the database said the (insertion) operation was OK (or whether, for example, I had tried to reuse a unique key value). Without using getLastError (or, indeed, safe=True in Python), I have no idea whether any errors (e.g. a bug in my code) have occurred.
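To make the difference concrete, here is a minimal sketch using a toy in-memory collection. It is not a MongoDB driver: `ToyCollection`, `DuplicateKey`, and `last_error` are hypothetical names, and `last_error` only loosely models what getLastError would report. The point is just that a write that isn't checked can fail without the caller ever noticing.

```python
class DuplicateKey(Exception):
    """Raised by the toy collection when a unique key is reused."""


class ToyCollection:
    """In-memory model of a collection with one unique field (not a real driver)."""

    def __init__(self, unique_field):
        self.unique_field = unique_field
        self.docs = {}
        self.last_error = None  # loosely models what getLastError would report

    def insert(self, doc, safe=False):
        key = doc[self.unique_field]
        self.last_error = None
        if key in self.docs:
            self.last_error = f"duplicate key: {key!r}"
            if safe:
                # Checked write: the caller finds out immediately.
                raise DuplicateKey(self.last_error)
            return  # Unchecked write: the failure goes unnoticed.
        self.docs[key] = doc


users = ToyCollection("email")
users.insert({"email": "a@example.com", "name": "first"})

# Unchecked insert with a reused key: no exception, the document is dropped.
users.insert({"email": "a@example.com", "name": "second"})
silent_error = users.last_error  # only visible if you go and ask for it

# Checked insert with the same reused key: the error surfaces as an exception.
try:
    users.insert({"email": "a@example.com", "name": "third"}, safe=True)
    caught = False
except DuplicateKey:
    caught = True
```

After this runs, the stored document is still the first one. The second insert failed silently, and only the third (checked) insert told the caller anything.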
For data where you can ignore the occasional error (e.g. some logging, click tracking, or similar), I agree getLastError may not be needed. But I believe that skipping it is not a very good default for most users with use cases similar to mine: you have a VPS, you build a simple app on it, and you use MongoDB in it.
Here's a rewrite of the article: Don't use mongodb unless you know what you're doing and have the hardware to do it right.
> Don't use mongodb unless you know what you're doing
That can just as well be applied to anything, not just mongodb. Whenever you start using something, you're going to make mistakes.
> and have the hardware to do it right.
AFAIK, running it on two instances (master + slave, and then stop/cycle the slave for backups) should be just fine. So you don't need to have "web scale" hardware for mongodb.
MongoDB is an interesting database and can fit nicely into some use cases - by which I mean data organisation, not just scale. So I don't think it should be avoided by people running simple things with not-humongous data-sets. We just have to look out for a few things we might not have expected. That's why I didn't call them "bugs" or "problems" - just "gotchas".
I wish there were a MongoDB guru site where I could contract out some MongoDB-related maintenance activities, such as validating my MongoDB installation, making sure that I have not made dumb errors, demystifying performance issues, etc. So far I am making do with documentation and mailing lists, but I would rather contract this out to a specialist.
Anyone know of any provider like this with affordable rates?
I was looking more for guys like the contract DBAs that are available for Oracle or even PostgreSQL. Maybe 10gen could create certification programs for admins, so that we could have a pool of knowledgeable admins who could support MongoDB newbies such as myself.
No certification right now, but that's good feedback, thanks.
You can't shard an existing collection that's surpassed 50GB.
All collections are currently created on the primary shard of the database. (Although this is slated to change: http://jira.mongodb.org/browse/SERVER-939 )
If a collection already has a unique key, that has to be your shard key.
You cannot update the value of a shard key.
You learn a lot about mongodb just by playing with it. It is super simple to set up and just hammer with different tests. I'm playing around with 4 extra large instances wired to 4 drives each in RAID 0 on Amazon right now and having a blast.
I really wish that PostgreSQL (and other SQL databases) would allow one to enforce such a policy, it's a lifesaver.
"[MongoDB is] so simple and natural to use from dynamic languages"
"In my test code, I had an 'async' remove() call (i.e. I didn’t wait for it to finish) and was then inserting new entries, and the previous remove() happily removed them (all of them, or some, or none, depending on the race). Those were very confusing few hours."