Not that all of "NoSQL" doesn't, but two major "NoSQL" databases, Redis and MongoDB, both seem to care more about performance than durability.
Personally I think this is a bad foundation for data that is important. There are probably a lot of use cases where data not being on disk for n seconds (or one minute in the case of MongoDB) is ok.
Even when that is the case, I still think that is the most important question to be addressed when choosing MongoDB as a data store.
The flexible query API, schema-less document format, secondary indices... those are siren songs of rapid development.
You shouldn't be choosing SQL over NoSQL; you choose both (or none, or other stuff as your problem requires) and use them in appropriate places in your infrastructure. Sometimes you want performance over durability.
Just curious what your experiences were that made you think MongoDB isn't a reliable data store.
In practice you probably don't. Financial transactions should take another path, but for almost everything else, you can probably afford the few seconds of data loss you may encounter.
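To make the performance-versus-durability trade-off concrete, here is a minimal sketch in plain Python. It is not how MongoDB or Redis implement persistence; `append_record` and `read_records` are hypothetical helpers. The only point is that a write which returns immediately may still be sitting in a buffer, while a write followed by `os.fsync` waits until the OS reports the data has been pushed to the device.

```python
import os
import tempfile


def append_record(path, record, durable):
    """Append one line to a log file.

    durable=False: return as soon as the data is handed to Python/the OS
                   (fast, but it can be lost if the machine dies before
                   the caches are written out).
    durable=True:  flush and fsync before returning (slower, but the OS
                   has been asked to put the bytes on the device).
    """
    with open(path, "a", encoding="utf-8") as f:
        f.write(record + "\n")
        if durable:
            f.flush()             # push Python's buffer to the OS
            os.fsync(f.fileno())  # ask the OS to flush its cache to disk


def read_records(path):
    """Return every record currently visible in the log file."""
    with open(path, encoding="utf-8") as f:
        return f.read().splitlines()


# Small demo in a throwaway directory (paths are illustrative only).
demo_dir = tempfile.mkdtemp()
demo_log = os.path.join(demo_dir, "events.log")
append_record(demo_log, "page_view", durable=False)  # fine to lose
append_record(demo_log, "payment", durable=True)     # not fine to lose
print(read_records(demo_log))
```

Both records read back the same way while the machine stays up; the difference only shows after a crash or power loss, which is exactly the window of "a few seconds of data loss" being discussed.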
Reading the Google BigTable paper would also be a good idea, as it represents another major strand of work.
My blog post on Dynamo: http://untyped.com/untyping/2011/01/21/all-about-amazons-dyn...
A lot of NoSQL solutions are about solving specific problems. MongoDB is a general solution which can (and probably should) be used in 90% of the cases that you currently use an RDBMS.
As I don't use any NoSQL solutions today, can someone list a few of these cases where I am using an RDBMS and should be using MongoDB?
For people using MongoDB daily, their data is in the correct format, it is persisted to disk, it is stored reliably and it is fast. On top of that, it gives them flexibility as their web app changes and new features and data structures are added, it is easily viewed and manipulated in JSON format, and if and when the day comes that they need serious scaling, it helps there as well. Also, in the world of EC2, it is straightforward to set up replica sets to offer redundancy.
All these conversations sound eerily familiar to the Java guys bashing Ruby and Ruby on Rails about five years ago. If you want to stick with what you have, then no worries. I just think people should be excited that over the last couple of years, the "golden hammer" approach to storing data has finally been overtaken and developers have a choice about what technology best solves their data problem.
Thanks for the ad hominem attack though.
The reputation (specifically of Mongo) is probably not undeserved.
I am not denying your experience, just curious about the whole picture.
As for the data corruption, I didn't see you mention whether you were checking for a response on saves.
I think the problems you experienced were due much more to the fact that you apparently had more data than the machine could handle, not necessarily the database engine used.
Just curious what the magic incantation is for this information.
If only we could make database software that didn't hold grudges!
Why don't you check out mongly.com/tutorial/index and get a quick feel for it?
By the way, thank you for your comment. I up-voted it.
For example, currently MongoDB is a good choice if you're looking for indices, ad-hoc queries and want to get up and running quickly on e.g. a web project. Interestingly, with indices and ad-hoc queries, MongoDB becomes a lot like MySQL, only you're writing weird And() and Or() functions instead of "SELECT ... WHERE AND ... OR". Also, indices are tied to the documents themselves, so it's not clear how that scales.
Or you could go more bleeding-edge and check out ScalienDB, which is a straight key-value store built on Paxos and sharding. Nevertheless, getting started is easy:
Too many shortcuts, inaccuracies, half-truths and implied errors to list here. The core idea of learning about NOSQL through first-hand experience (& dissection) of each project is sound, though.
The emulation of secondary indexes is inefficient. In an RDBMS without built-in index maintenance you would create a table (LeaderboardId, ScoreId), not (*LeaderboardId, ScoreIds). There's no need for comma-separated fields.
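A rough sketch of the (LeaderboardId, ScoreId) layout, using Python's built-in `sqlite3` purely for illustration. The table names, the `scores` columns and the helper functions are my own assumptions, not anything from the article; the point is one row per pair, maintained by hand alongside the main table, instead of a comma-separated list of ids.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a scores table and a
    hand-maintained index table holding one (leaderboard, score) pair per row."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE scores (
            score_id INTEGER PRIMARY KEY,
            player   TEXT NOT NULL,
            points   INTEGER NOT NULL
        );
        CREATE TABLE leaderboard_scores (
            leaderboard_id INTEGER NOT NULL,
            score_id       INTEGER NOT NULL REFERENCES scores(score_id),
            PRIMARY KEY (leaderboard_id, score_id)
        );
    """)
    return conn


def add_score(conn, leaderboard_id, score_id, player, points):
    """Insert the score and its index row in a single transaction,
    so the emulated index never points at a missing score."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO scores (score_id, player, points) VALUES (?, ?, ?)",
            (score_id, player, points),
        )
        conn.execute(
            "INSERT INTO leaderboard_scores (leaderboard_id, score_id) VALUES (?, ?)",
            (leaderboard_id, score_id),
        )


def scores_for(conn, leaderboard_id):
    """Look up a leaderboard through the index table, highest points first."""
    rows = conn.execute(
        """
        SELECT s.player, s.points
        FROM leaderboard_scores AS ls
        JOIN scores AS s ON s.score_id = ls.score_id
        WHERE ls.leaderboard_id = ?
        ORDER BY s.points DESC
        """,
        (leaderboard_id,),
    )
    return rows.fetchall()
```

Adding a score to a leaderboard is then a single-row insert, rather than a read-modify-write of a growing comma-separated string.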
This was the article: http://cacm.acm.org/magazines/2011/4/106584-a-co-relational-...
1: Use Mon" CTRL-W