I would love to see a solid technical reason to choose MongoDB over any of the other NoSQL databases (CouchDB, Riak, Redis, etc.) other than "it's popular".
Geolocation, specifically GeoJSON. That's the main reason why I chose it (I started working on my app while it was at 2.0). When 2.4 came out with better geospatial indices (albeit basic compared to PostgreSQL+PostGIS) and GeoJSON support, I moved to using GeoJSON, and I am happy so far.
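For anyone who hasn't used it, here is a minimal sketch of what a GeoJSON point and a proximity filter look like in MongoDB 2.4+ terms. It only builds the documents as plain Python dicts, with no driver or server involved. The field name `loc`, the helper names and the coordinates are assumptions made up for illustration, not taken from the app described above.

```python
import json

def geojson_point(lon, lat):
    """Build a GeoJSON Point. GeoJSON orders coordinates as [longitude, latitude]."""
    return {"type": "Point", "coordinates": [lon, lat]}

def near_query(field, lon, lat, max_metres):
    """Build a filter document using MongoDB's $near operator with a GeoJSON geometry.

    The field is assumed to carry a 2dsphere index; $maxDistance is in metres.
    """
    return {
        field: {
            "$near": {
                "$geometry": geojson_point(lon, lat),
                "$maxDistance": max_metres,
            }
        }
    }

# Hypothetical stop document; the coordinates are illustrative only.
stop = {"name": "Example stop", "loc": geojson_point(28.23, -26.13)}

# Filter for documents within 500 m of that point.
query = near_query("loc", 28.23, -26.13, 500)

print(json.dumps(query, indent=2))
```

With a real driver, a dict like `query` would be passed as the filter to a find call on a collection whose `loc` field has a 2dsphere index.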
The website/app is at https://rwt.to and an example route search is: from "Milky Way, Johannesburg" to "O.R. Tambo International Airport".
I should note that I've had a look at GeoCouch and it didn't fit my use case: I'm not doing trivial 'find my 3 places near [y,x]' queries, but am traversing a pseudo-network of routes to calculate directions. Neo4j also wouldn't have worked in my case. TokuMX is based on MongoDB 2.2 as far as I'm aware, so it was out too.
That's a very good reason, and the first real one I have heard, thanks man.
Also, I used to work for MapBox, and I know we did one project on mongo which I was not involved in, and afterwards we built everything with CouchDB (which is how I got acquainted with it).
For the geo stuff we actually used a lot of sqlite and to a lesser extent spatialite. We would pre-calculate things and build them into the rendered tiles in mbtiles format, or stream the point/polygon data from the couch database for realtime client-side compositing.
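To make the mbtiles part concrete: an .mbtiles file is a SQLite database with a `tiles` table keyed by zoom level, column and row, where rows are counted from the bottom (TMS scheme). The sketch below follows that public layout using an in-memory database; the tile bytes and helper names are placeholders, and it is not meant to represent the actual MapBox tooling.

```python
import sqlite3

def tms_row(zoom, y_xyz):
    """Convert an XYZ (top-origin) tile row to the TMS (bottom-origin) row MBTiles uses."""
    return (1 << zoom) - 1 - y_xyz

# In-memory stand-in for an .mbtiles file, which is a SQLite database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tiles (zoom_level INTEGER, tile_column INTEGER, "
    "tile_row INTEGER, tile_data BLOB)"
)

# Store one placeholder tile; real tile_data would be image or vector bytes.
zoom, x, y = 3, 4, 2
conn.execute(
    "INSERT INTO tiles VALUES (?, ?, ?, ?)",
    (zoom, x, tms_row(zoom, y), b"fake-tile-bytes"),
)

def read_tile(conn, zoom, x, y_xyz):
    """Fetch a tile's bytes by XYZ coordinates, or None if it is missing."""
    row = conn.execute(
        "SELECT tile_data FROM tiles "
        "WHERE zoom_level = ? AND tile_column = ? AND tile_row = ?",
        (zoom, x, tms_row(zoom, y_xyz)),
    ).fetchone()
    return row[0] if row else None

tile = read_tile(conn, 3, 4, 2)
```

Pre-calculating means the expensive work happens once at render time, and serving a tile is reduced to a single indexed lookup like `read_tile`.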
But yeah, routing is pretty high level stuff. I think they are only now putting the finishing touches on their openstreetmap driven routing system many years later.
I would consider "It's easy to get started with" a valid technical reason.
Of all the "We moved from MongoDB to Cassandra/Riak/etc and gained massively!" stories, I've rarely seen - and it's possible that this is selection bias - companies start with the other NoSQL options.
I want to say that, unlike MongoDB, the others actually force you to think about your data and actively decide how you are going to store it. With MongoDB you can pretty much add an index on anything, but with Cassandra (maybe Riak/Dynamo too) you only get one free index before you have to denormalize and write application code to keep performance up.
Then lastly, MongoDB is good enough for most use cases. We didn't see major performance issues until we started constantly writing data to it (high write/low read); basically, we were wrestling with lock contention. I'd wager that for a significant number of MongoDB deployments, Mongo is not only easy to use, but fast enough too.
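As a rough illustration of what "denormalize and write application code" means in practice, here is a toy sketch in which the store only supports lookups by key, so the application maintains one extra lookup table per query pattern on every write. The class, field names and sample data are all hypothetical, and plain dicts stand in for the database tables.

```python
from collections import defaultdict

class DenormalizedStore:
    """Toy stand-in for a store that only supports lookups by primary key."""

    def __init__(self):
        self.users_by_id = {}                      # primary table
        self.user_id_by_email = {}                 # lookup table for email queries
        self.user_ids_by_city = defaultdict(set)   # lookup table for city queries

    def save_user(self, user):
        """Write the user and keep every lookup table in sync by hand."""
        old = self.users_by_id.get(user["id"])
        if old is not None:
            # Remove stale entries left behind by the previous version.
            self.user_id_by_email.pop(old["email"], None)
            self.user_ids_by_city[old["city"]].discard(old["id"])
        self.users_by_id[user["id"]] = dict(user)
        self.user_id_by_email[user["email"]] = user["id"]
        self.user_ids_by_city[user["city"]].add(user["id"])

    def find_by_email(self, email):
        uid = self.user_id_by_email.get(email)
        return self.users_by_id.get(uid) if uid is not None else None

    def find_by_city(self, city):
        ids = self.user_ids_by_city.get(city, ())
        return [self.users_by_id[uid] for uid in sorted(ids)]

store = DenormalizedStore()
store.save_user({"id": 1, "email": "a@example.com", "city": "Johannesburg"})
store.save_user({"id": 2, "email": "b@example.com", "city": "Johannesburg"})
# Updating user 1 has to touch all three tables to stay consistent.
store.save_user({"id": 1, "email": "a@example.com", "city": "Cape Town"})
```

In MongoDB the equivalent of the two extra tables would be two secondary indexes added after the fact, with no change to the write path in application code.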
So while the other NoSQLs are (probably) more complicated and likely more performant, MongoDB, to me, hits a sweet spot of ease of use and performance that is good enough for most applications out there.
However, considering other "raw" technical aspects like performance, durability and scaling, I've never seen anything that has shown MongoDB to be a leader.
"I've rarely seen - and it's possible that this is selection bias - companies start with the other NoSQL options."
It seems like everyone starts with Mongo, because everyone starts with Mongo.
This means that you don't have the deluge of posts from people moving from other databases, because:
a) there are far fewer of them
b) they chose them for solid technical reasons (not just because everyone does this)
So as for your perception that other NoSQL databases are "probably" more complicated, you should know that complexity is an objective measure. I think that Mongo is definitely a lot more objectively complex than CouchDB and, from what I have read around the subject, than many of the other NoSQL databases.
What Mongo could well be is 'easier', which is relative. It seems like it's more familiar to certain programmers, which is kind of echoed by the fact that there's an incredibly popular object relational mapper (mongoose), that is being used with what is supposedly a non-relational database.
That distinction between simple and easy is from a very insightful presentation by the creator of the Clojure language, and I only wrote a summary because I got sick of trying to get people to watch an hour-long video before trying to discuss systems on this level.
Mongoose adds a bit to the table. It actually adds schema validation, which mongo doesn't inherently have, and should be part of the application anyhow. I feel that's the biggest reason to use mongoose over the straight mongo driver in many cases.
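To show what application-level schema validation amounts to, here is a small sketch that checks a document against a declared schema before it would be written. This is not Mongoose's API or its exact behaviour; the schema format, field names and error messages are assumptions for illustration.

```python
# Hypothetical schema: field name -> (expected type, required?)
USER_SCHEMA = {
    "name": (str, True),
    "email": (str, True),
    "age": (int, False),
}

def validate(doc, schema):
    """Return a list of problems found in doc; an empty list means it is valid."""
    errors = []
    for field, (expected_type, required) in schema.items():
        if field not in doc:
            if required:
                errors.append(f"missing required field: {field}")
            continue
        if not isinstance(doc[field], expected_type):
            errors.append(f"{field} should be {expected_type.__name__}")
    # Flag fields the schema does not know about.
    for field in doc:
        if field not in schema:
            errors.append(f"unexpected field: {field}")
    return errors

good = validate({"name": "Ada", "email": "ada@example.com"}, USER_SCHEMA)
bad = validate({"name": 1, "nickname": "x"}, USER_SCHEMA)
```

The database itself will accept either document, which is why a check like this has to live somewhere in the application.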
I've used Mongo in a couple projects where it was a great fit. The scale wasn't huge, but having pre-shaped data for a mostly read scenario was great. I've found that it works really nicely for a lot of situations, and would definitely be a consideration.
I find that document databases work best when your data is read far more than written to, and when you can shape your data structures for simple key reads in most cases combined with indexed searches. I would consider the use of ElasticSearch or RethinkDB in most cases where you might look at MongoDB. It really depends on your needs here.
Riak and Couch offer other advantages, and like anything it really depends. Cassandra is another nice option for larger scalability, but everything has a cost.
Mongo is very reasonable, and to be honest, if you don't need more than a single server for your needs, it's really easy to get up and running quickly, and development tooling is decent enough, and the concepts are pretty easy to get up to speed with.
I can't speak for everyone, but to me MongoDB is far easier than any other NoSQL engine I've looked into. I said "probably" because I can't speak for every NoSQL database out there.
We had a 5 node cluster in Mongo that we moved to Cassandra last summer. While our experience with Cassandra is that it is by and large much more performant and cost effective than MongoDB, getting set up with Cassandra was not as easy as with MongoDB. With MongoDB you can literally start throwing data in your database, then add an index after the fact. With Cassandra we had to make sure our data was modeled correctly, and decide where we would denormalize. Riak, from what I remember, has a similar data model to Cassandra, and Redis isn't something you just "start up and go" with (mainly because it's an in-memory store).
So I know for a fact that Cassandra, Riak, Dynamo, and Redis are far more complex than MongoDB. Cassandra even requires you to run a "repair" command periodically, and that alone makes it more ops work than MongoDB. We can even throw HBase in there too, as it requires ZooKeeper nodes, NameNodes, and all that Hadoop goodness.
Now none of these databases are hard to use, but compared to them, Mongo is a cakewalk. You literally spin it up, throw JSON inside, and get JSON back. There is no query writing, and for most cases there is very little ops management. In most cases, if a query is slow, you can fix that by adding an index or moving to SSDs. Only once you have exhausted these options do you really have to consider anything else.
FullContact also has a similar story : http://www.fullcontact.com/blog/mongo-to-cassandra-migration...
tl;dr Mongo was great for getting the product up and iterating quickly, but then they moved once they thought they needed to. It's my opinion that it's far easier to get started with MongoDB than it is to get started with Postgres/MySQL.
Lastly, damn the technical reasons why it's so popular: Mongo/10gen used to be a huge marketing engine around ~1.6/8. They captured a lot of developer mindshare and I'd attribute its popularity now to that as well. It wasn't much longer after that that the naysayers and those hurt by the initial hype came out of the woodwork and we got the now infamous "MongoDB is webscale" video.
It allows you to query JSON documents in a way similar to SQL. Redis is key/val and sits in RAM; Couch requires complicated design documents to query and is better as a key/val store. I'm unfamiliar with Riak.
I think Trello uses Mongo primarily for production. Technically it's feasible, but I've found it to be more trouble than it's worth to scale -- too many machines are required per shard. I'm currently looking into RethinkDB as a replacement now though.
I think it wasn't 'ready' at that point in time, and the json based query language was closer to what we needed.
The real problem was that the data was being imported in bulk by the user, from a many-meg-sized CSV. It would grind CouchDB to a halt trying to build views, so having Elasticsearch be a separate process that could work through it made a lot of sense.
Thanks for answering. I have used Elasticsearch before and I was very impressed by it. Now I am trying to evaluate couchdb-lucene to see if it can prove to be a good alternative.
I think a lot of the reasons I've seen come down to business reasons, not technical ones. Someone wants an app fast, like now, and MongoDB is fast to set up and get running with.
I guess I'll be able to confirm once I'm forced to build something in it, but I don't think it can really be faster to set up and get running with than CouchDB.
Usually the first thing you need to do is write a REST layer on top of it, and with CouchDB that part is just done already.
Obviously there are certain kinds of data I wouldn't put in Couch, or any kind of NoSQL database.
You need to know what the right tool for the job is, but I just want to figure out when that tool is Mongo.
Why would that tool ever be Mongo? I think Mongo is a thing because people coming from Rails ORM libraries feel like "wow I can jump on the NoSQL bandwagon just by using a library that feels kind of like the ActiveRecord I'm used to".
It's only popular because there's less of a conceptual gap between mongo and the relational database tools that a lot of people are used to.
CouchDB, on the other hand, requires you to actually learn and use map/reduce, which is a pain for people who don't feel like having to learn something new. But CouchDB is MUCH MUCH better in a lot of ways, and Mongo is pretty much fundamentally flawed in my opinion.
I do wish Rackspace luck though with their offering. I think it was smart of them to create this MongoDB product for one simple reason: a good number of people are already using MongoDB, so it makes sense to help them get the most out of it.
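For readers who haven't met the model, here is a conceptual sketch of what a map/reduce view does: a map function emits key/value pairs per document, and a reduce function folds the values for each key. In CouchDB itself the functions are normally written in JavaScript inside a design document and the view index is maintained by the server; this standalone Python version, with made-up documents and function names, only mimics the idea.

```python
from collections import defaultdict

def map_doc(doc):
    """Yield (key, value) pairs for one document, like emit() in a CouchDB map function."""
    if doc.get("type") == "route":
        yield doc["city"], 1

def reduce_values(values):
    """Combine all values emitted for one key; here, count them by summing."""
    return sum(values)

def run_view(docs, map_fn, reduce_fn):
    """Apply map_fn to every document, group by key, then reduce each group."""
    grouped = defaultdict(list)
    for doc in docs:
        for key, value in map_fn(doc):
            grouped[key].append(value)
    return {key: reduce_fn(values) for key, values in sorted(grouped.items())}

# Hypothetical documents; only those with type "route" are emitted by the map step.
docs = [
    {"type": "route", "city": "Johannesburg"},
    {"type": "route", "city": "Cape Town"},
    {"type": "stop", "city": "Johannesburg"},
    {"type": "route", "city": "Johannesburg"},
]

routes_per_city = run_view(docs, map_doc, reduce_values)
```

The learning curve is mostly in thinking about queries as views defined ahead of time, rather than as ad hoc filters written at query time.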
No, I'm not trolling. I really want to know : https://news.ycombinator.com/item?id=7446919