

Ask HN: When is MongoDB the Right Tool for the Job? - AdrianRossouw

I've recently been helping a friend through the process of learning node.js to find a new job.

When it is time to teach him a NoSQL database, I have trouble recommending he learn anything other than MongoDB. The one and only reason is that all of the jobs I have seen available in my recent stint on the market have been for MongoDB.

Now, I do have my favorite toolchain, which I'll add in a comment to avoid this getting sidetracked too much. It's also not really relevant. Suffice to say, my current toolset works pretty well for the kind of thing I need it for, but mostly ... I just haven't ever seen the reason to learn MongoDB.

The only reason that seems to come up is that everybody uses MongoDB because it's popular. Tautology aside, I don't really have any idea of what its sweet spot is.

What kind of data structures, access and write patterns, and volume of data make up the situation where you would choose mongo over any of the other NoSQL db's (including things like redis)?

I'd love to get some real feedback on this so I know what to tell my friend. Also, I would prefer if people didn't just point to things written by mongodb.com, because that feels like yet another tautology.
======
wcummings
Based on my experiences w/ MongoDB:

* If you have a read-heavy workload, it should perform pretty well

* While it lacks transactions, MongoDB supports some fairly powerful atomic operations which make it fairly flexible

* It's all fun and games until you start sharding, which adds a huge amount of complexity and administrative trouble. A few gotchas I ran into: mongoc lists in mongos configs are order sensitive; if they're out of order, your cluster will fail after some indeterminate amount of time and you'll have to step down the primary to fix it. And if a mongoc instance goes down, mongos will keep querying it indefinitely, so every query gets the added latency of that timeout. The proposed workaround from the MongoDB folks was to use a script to firewall off the offending mongoc (what?).

* Rebalancing is a clusterfuck. You can either wait for every write to propagate to your slaves (could literally take weeks), orrrr... disable this "throttle" and get balls-to-the-wall rebalancing which, in my case, pushed the write lock to the point where it was dramatically affecting application performance. You basically have to write your own script to deal with this correctly. Rebalancing requires careful planning.

* There's still only db-level locking, so you'll need a db per collection in many cases, which means you'll need a connection pool per db (and even a mongos instance per db), which can add up to a LOT of connections (thousands)

* 16MB limit for aggregation framework! Because the result of EACH STAGE of aggregation is stored in a document, NO PART of your pipeline can exceed 16MB! This is a huge and very frustrating limitation (though I understand this will be addressed in 2.6). Map/reduce is an option, but far from ideal

What I like: Very flexible, easy to configure (save sharding), great for
prototyping!

Aggregation framework! Building queries as a data pipeline is very cool.
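
To make the pipeline idea concrete, here is a toy sketch in Python. The stage names `$match`, `$group` and `$sum` are real MongoDB operators, but `run_pipeline` and the `orders` data are hypothetical, and the evaluator only handles equality matches and sums. It models the shape of a pipeline (each stage feeds the next), not how the server actually executes one.

```python
from collections import defaultdict


def run_pipeline(docs, pipeline):
    """Toy evaluator: each stage consumes the previous stage's output."""
    for stage in pipeline:
        (op, spec), = stage.items()  # every stage is a one-key dict
        if op == "$match":
            # Equality-only match: keep docs whose fields equal the spec.
            docs = [d for d in docs
                    if all(d.get(k) == v for k, v in spec.items())]
        elif op == "$group":
            key_field = spec["_id"][1:]  # "$customer" -> "customer"
            groups = defaultdict(dict)
            for d in docs:
                out = groups[d[key_field]]
                out["_id"] = d[key_field]
                for name, acc in spec.items():
                    if name == "_id":
                        continue
                    operand = acc["$sum"]  # only $sum is modelled here
                    # "$field" sums that field; a number is added per doc.
                    value = d[operand[1:]] if isinstance(operand, str) else operand
                    out[name] = out.get(name, 0) + value
            docs = list(groups.values())
        else:
            raise ValueError(f"unsupported stage: {op}")
    return docs


# Hypothetical sample data, for illustration only.
orders = [
    {"customer": "ann", "status": "paid", "total": 30},
    {"customer": "bob", "status": "paid", "total": 12},
    {"customer": "ann", "status": "open", "total": 99},
    {"customer": "ann", "status": "paid", "total": 5},
]

pipeline = [
    {"$match": {"status": "paid"}},
    {"$group": {"_id": "$customer",
                "spent": {"$sum": "$total"},
                "orders": {"$sum": 1}}},
]

result = run_pipeline(orders, pipeline)
```

Against a real server the same `pipeline` list would be handed to the driver's aggregate call rather than to a local function.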

What I don't like: Does _not_ handle write load well (having had to scale
writes in production, even w/ SSDs and sharding), but I can see a read-heavy
unsharded application working splendidly. Compression seems poor, but I have
no hard numbers to back this up.

But keep in mind, mongo is still improving. I would say it's still beta-quality
(but not marketed as such! tsk tsk!), but 2.6 is literally right around the
corner (they're on RC1 now IIRC)

------
AdrianRossouw
So I have mostly used CouchDB over the last few years. I've been really happy
with it.

I mean, generally the first thing I would have to do if I had a datastore is
write a REST layer on top of it. With CouchDB, that's just done already. I
love that I can just use streams to pipe things around to and from the
database. On the simpler apps, I basically just end up writing a small node
proxy server that passes HTTP requests to the server, and optionally
filters/sanitizes the data on its way through.

I absolutely adore the _changes feed, which allows me to open a socket to it
and handle events from the db in "real-time". And the replication is just so
simple and powerful too.
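
As a rough sketch of what handling those events can look like: with `feed=continuous`, CouchDB's `_changes` endpoint sends one JSON object per line, with blank lines as heartbeats. The helper name `iter_changes` and the sample rows below are made up for illustration, and Python is used only to keep the example self-contained; in practice the lines would come from a streaming HTTP response.

```python
import json


def iter_changes(lines):
    """Yield (doc_id, seq) for each change row in a continuous _changes feed."""
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # blank lines are heartbeats that keep the socket open
        row = json.loads(line)
        if "last_seq" in row:
            break  # the server sends this when it closes the feed
        yield row["id"], row["seq"]


# Hypothetical feed contents; a real feed would be read from the socket.
sample_feed = [
    '{"seq":1,"id":"user:1","changes":[{"rev":"1-abc"}]}\n',
    '\n',
    '{"seq":2,"id":"user:2","changes":[{"rev":"1-def"}],"deleted":true}\n',
    '{"last_seq":2}\n',
]

events = list(iter_changes(sample_feed))
```

Because the function takes any iterable of lines, the same loop works on a live response object or on a saved log of the feed.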

Now I'll admit that its views have some real deep problems, but I hardly ever
use them except for the simplest of simple things. Mostly when I have any kind
of somewhat complex query I need to do, I add elasticsearch with the couchdb
river. This listens to the _changes feed and indexes the data. So I query
against the ES instance and PUT/GET against the couchdb.

I really love those tools, they are so simple and powerful, and hardly give me
any real trouble (well, nowadays).

Other than being forced to for a job, I can't see why I would use mongo
instead of couchdb for anything. But really, this question is mostly about
what I should tell my friend.

------
mmcclellan
Hey not really answering your question, but since it sounds like you're
heading for a "javascript all the way down" approach, I thought I'd suggest
the mean stack ([http://mean.io](http://mean.io)).

Digital Ocean offers a one-click mean install
([https://www.digitalocean.com/company/blog/announcing-mean-
on...](https://www.digitalocean.com/company/blog/announcing-mean-one-click-
install-application/)) that could be hosted for $5 or $10 as a showcase.

But for an answer, I'd say that MongoDB seems to make for rapid MVPs.

~~~
AdrianRossouw
it feels to me like "it makes rapid mvp's", because that's what everybody uses
so all the rapid mvp's you hear about are mongo.

anyway, it's important that he builds his own so he understands how the build
tools work.

------
bliti
Let's put it this way: There are not many jobs a regular SQL database can't
do. MongoDB is used in many types of environments, but in _my_ experience, it
is best used as a key-value store. If you are using a pure JavaScript stack,
then it might make sense to store some data in it. But don't try and fit a
complex schema into it.

~~~
AdrianRossouw
If you need a key-value store, why would you use mongo over redis?

There are jobs I wouldn't trust any NoSQL db to do though. The moment
financial data is involved I wouldn't use anything other than an RDBMS.

~~~
bliti
I would not use Mongo over Redis. TBH, I would just _not_ use it. Ever. In my
experience, it has proven itself to not be worthy of my trust. It corrupts the
data.

------
re_todd
I've heard a couple of people say they like it because you can store new types of
data without going through a bureaucratic nightmare. Some organizations can
take weeks or longer to get a new table column approved, due to red tape or
the DBA not playing well with others. With NoSQL, new data types are easy to
introduce into your document without much political drama and mountains of
paperwork.

~~~
AdrianRossouw
That's pretty much true of all of them though. And mongo seems to have more
schema requirements than most.

