
Goodbye, MongoDB - zedr
http://www.zopyx.de/blog/goodbye-mongodb
======
mitchitized
How tiring. We can litter the internet with posts like this, but it would be a
_lot_ more useful to post reasoned, factual and detailed posts when discussing
the merits or pitfalls of a given technology.

I've launched very high traffic websites using MongoDB, where it was the least
of my worries. I've also launched very high traffic websites using MySQL,
where it was my primary source of pain.

Shall I now run around screaming loudly about how badly MySQL sucks? On the
surface you might assume so, but with reasonable effort to consider all the
facts, it becomes clear that there's more to this than just "this database is
better than that one because I hate it."

This post is more of an angry payback rant from someone who got banned from
IRC for being abusive to newcomers and greenies. On top of that, it is
factually incorrect in some spots, and misleading in others. Not impressed.

That said, NoSQL databases are evolving extremely quickly, and have reached a
point of maturity that they can excel in the right scenario. Just like
relational databases, they can provide big benefits, given that the user takes
the time to learn how to use them, as well as making sure they have a use case
that merits the strengths of their chosen tech.

I really enjoy working with MongoDB and Redis, and have been a community
cheerleader of sorts for PostgreSQL for years. It _is_ possible to have
multiple tools in your toolbox, and use them when appropriate. No tool is
perfect, and no tool is perfect for every job.

Complaining bitterly that your shiny new piano makes a lousy boat, on the
other hand, simply adds to the noise.

~~~
flippyhead
would love to hear specifically what was factually incorrect

~~~
herman_toothrot
Here's one I noticed: I have a small app running on 2.0.1 and there are _two_
prealloc.X files in the journal directory, each one is 256mb. Not quite 3GB
unless my math is wrong. (There's a third file in the directory named `j._0`
which is also 256mb)

------
mark_l_watson
It seems like ranters against MongoDB often don't really understand how it
works and thus what it is good for.

My simple mental model for MongoDB is: indices (should!) fit in memory and
documents are stored contiguously on disk. That is it in a nutshell. A query
involves in-memory lookups and maybe only one disk seek. Writes in place are
usually possible.

I have been happiest with MongoDB in two different scenarios:

The first is in developing small web applications where there is no scaling
issue, and the fact is that MongoDB is so easy to develop against and simply
provides a great developer experience. As needed, I use a cron job to do a
mongodump a few times a day; or, if I really need high availability (which,
frankly, often I don't: if a system is unavailable once or twice a year for an
hour, it's no big deal), then replica sets are OK.

The second scenario where I have really liked using MongoDB was doing
analytics on a modestly large stream of social media data. A single Mongo
master on a large EC2 instance was adequate to handle writes and slaves on
other large EC2 instances each fed a different analytics application. This
setup of apps reading from a slave on the same server worked really well for
me. This was a low hassle experience.

I do have one customer with really large MongoDB setups across multiple data
centers, and I am working through some hassles right now, but we haven't
found anything else as cost effective for the customer's applications.

All that said, when I can use it, just using a single (no horizontal scaling)
PostgreSQL server is for me the most hassle free developer experience, but I
have always used PostgreSQL for small or medium sized applications - nothing
that needed to scale.

~~~
dinedal
You mention two different scenarios that use MongoDB effectively, and neither
is what MongoDB purports to be used for.

The first is a small setup. MongoDB is called that because it's supposed to be
for "humongous" data sets. Also, in what case is 3GB of data preallocated for
journalling, which is one of the points in the article, good for a small
installation?

The second is a "modestly" sized stream of social media data. Having myself
worked with a much larger stream of social media data in Mongo, I can attest
that the second you leave the land of a single Mongo server, you have a much
bigger problem: sharding. Sharding is terrible in Mongo: writes are locked per
_shard_, not per collection; your shard keys are immutable (imagine having an
indexed field you can only set ONCE); and the whole process is fraught with
data loss. Did you have a drop in network connectivity between your mongos and
your config server? You just silently lost data. Having safe mode on doesn't
matter: if the config server doesn't get the write, it can't report where the
data is for a read, even if the write was confirmed to disk.

To your point about Mongo ranters not understanding what it's good for,
MongoDB tells everyone it's good for "big" and "fast" data. However, it fails
at both of these, because it doesn't easily scale up from one node, and it
doesn't write quickly when you actually want to make sure that your data is
there. What's the point of writing data quickly to something that loses it
quietly? Might as well pipe it to /dev/null.

------
dlokshin
I feel like 1 - 2 years ago I was reading a slew of blog posts with the title
"Why We Chose MongoDB." Now it seems like all of the blog posts are some sort
of "We Just Finished Migrating off of MongoDB, Here's Why."

I know nothing about MongoDB and have never tried it. But the message seems
pretty clear.

~~~
ax
The message is, just like every technology, there's an initial period where a
vocal minority loves it and tries to use it for everything. Then, there's a
backlash where a vocal minority hates it and thinks anyone who uses it is
clearly an idiot. All the while, the silent majority go on getting work done.
It's been this way for as long as I can remember.

~~~
roqetman
I fully agree here. A couple of years ago, when I went on interviews at
startups, they were all using NoSQL DBs and were proud of it. More recently,
I've been interviewing at startups that are now bigger, need to mine the data
they've collected over the last few years, and are migrating off of NoSQL DBs
to RDBMSs (or creating strange amalgams of the two). I did see this coming,
but it was very hard to make them understand back when it was the coolest
thing.

~~~
jasonmccay
The reality is that developers are maturing in their understanding of
different technologies and they are learning how to apply them in correct use
cases.

There continues to be a "golden hammer" syndrome, a fantasy land where white
horses and unicorns run free, but no such tool exists.

Instead, the vision of "NoSQL" was to tell developers that they did not have
to use relational data for everything, but could, instead, use the right tool
at the right time. Why is this such a hard concept?

If you are a developer and you don't understand the tool you are wielding
(it's pretty clear the author of this blog didn't), then you will incorrectly
use the tool and experience pain.

That is the fault of the developer, not the tool.

~~~
fadzlan
Care to specify at which point the author of the blog post misunderstood
MongoDB? It's not clear to me.

Are you saying that one of the ways he is using MongoDB is incorrect? If yes,
which point of his reflects that?

His rants seem quite specific.

------
sdm
While some of the author's criticisms are valid, some of them are completely
wrong:

> Having no option to perform an operation comparable to UPDATE table SET
> foo=bar WHERE....

What? db.collection.update does exactly this. See:
[http://www.mongodb.org/display/DOCS/Updating#Updating-update%28%29](http://www.mongodb.org/display/DOCS/Updating#Updating-update%28%29)
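
For reference, the multi-document update the author says is missing maps onto a single update call with the `multi` flag set. A minimal sketch of the semantics, simulated over plain dicts (the collection and field names here are illustrative, not from the article):

```python
# SQL:          UPDATE table SET foo = 'bar' WHERE qty < 10
# Mongo shell:  db.table.update({qty: {$lt: 10}}, {$set: {foo: "bar"}}, false, true)
#               (the final `true` is the multi flag, so ALL matching docs update)

# The same semantics, simulated in pure Python:
docs = [{"qty": 5}, {"qty": 50}, {"qty": 7}]
for d in docs:
    if d["qty"] < 10:      # the WHERE clause
        d["foo"] = "bar"   # the SET clause

print(docs)  # only the two low-qty documents gained foo='bar'
```

Without the multi flag, the shell update touches only the first matching document, which is likely what trips people up.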

MongoDB fits a nice niche as a read-heavy, mid-scalability DB solution. Every
DB has its niche. Trying to use one outside of what it's good for is going to
get you burned. If people just did their research before blindly committing to
a platform, we'd see a lot fewer posts like this.

~~~
dreadpirateryan
Agreed. I found the mapreduce criticisms to be a little off:

> Now instead of fixing a bad implementation or fixing the underlaying
> architectural issues, MongoDB is moving to Hadoop.

I don't think that's accurate. They have a new "aggregation framework" coming
that is meant to replace mapreduce. It could be a wrapper around hadoop, but I
couldn't find anything documented about that. I completely agree that a
blocking mapreduce is annoying; however, does any framework have a
non-blocking mapreduce? I haven't tried many mapreduce implementations, so
this is a genuine question.

~~~
bunderbunder
I don't believe RavenDB blocks. Instead, it returns the results along with a
flag to indicate if any of the source data may have been modified while the
operation was running.

~~~
mattwarren
You're right, all RavenDB indexing operations (including Map/Reduce) are done
on background threads. When you query, it returns a flag (as you say)
indicating if the index is currently stale or not.

------
haberman
> Leaving memory management to the operating is nice idea - in reality it does
> not scale and does not play very well.

This is why I think that Linus's tirade against O_DIRECT is misguided:
<https://lkml.org/lkml/2007/1/10/233>

Here's the thing: the kernel is a library. It took me a long time to fully
understand this deep idea. The kernel is just a library that has a different
and more expensive calling convention (syscalls) and runs at a higher
privilege level.

It's also much less flexible than user-space libraries. Its interface is an
unholy mix of syscalls, ioctl(), /proc, vdso, etc. There is a high bar to
adding new interfaces. Removing or changing existing interfaces is basically
not allowed.

The resources that the kernel uses are much harder to account for or predict.
How can you ensure that a process always gets at least X MB of page cache, and
that some enormous "cp" that some sysadmin is running won't evict all your
MongoDB pages that are caching your database? Sure you could mlock() your
pages, but now you're basically side-stepping all of this smart kernel cache
management that was supposed to be helping you so much in the first place.

User-space management of buffers and caches is more flexible, easier to
account to its owner, and more predictable. It can't handle page faults with
Linux's current interfaces, but the L4 guys have figured out how to let pagers
run in user-space and handle page faults. I hope that someday this work
becomes mainstream. Our 20-year-old OS design is showing its age.

~~~
MattRogish
This is why most DBMS products do their own memory management. They just suck
up as much memory as you will allow and use it for their own devices.
Specialized, tunable, application-level memory management will probably always
beat a general-purpose, application-ignorant OS-based scheme.

But, there's something to be said for the simplicity - for most folks you
don't need to manage the memory yourself. When you need it though, there's
really no easy substitute.

~~~
haberman
> But, there's something to be said for the simplicity - for most folks you
> don't need to manage the memory yourself.

Yes, and most people don't need to implement printf() themselves, which is why
there is libc. Just because you're doing it in-process, in user space, doesn't
mean you're rolling your own!

------
PakG1
_There is no single way to control the memory usage using system tools except
maintaining mongod instances on dedicated virtual machines without running
further services. There are numerous complaints from people about this stupid
architectural decision from various side and 10gen is doing nothing to change
this brain-dead memory model._

Can someone explain to me why this is actually a big issue? Except for really
tiny apps, I imagine that having dedicated VMs for your MongoDB actually would
be perfectly fine? Probably even preferred?

~~~
bunderbunder
I was wondering the same. It's pretty standard for databases to not be very
good about sharing with others, memory-wise. That's why having a dedicated
server is such a popular best practice for non-puny applications. And MongoDB
says it's not designed for puny applications right there in its name.

~~~
jasonmccay
There are fairly straight-forward ways to control memory usage in MongoDB,
especially if this is a concern for you. All you need is a bit of OS know-how
and it works just fine.

We run both large (multi-shard cluster) and small-memory (500MB - 2GB)
instances of MongoDB and have no problems.

It would be good to have developers acknowledge that, perhaps, they may not
have all the information instead of declaring that something can't be done.

~~~
rbrcurtis
Care to explain how, say in linux?

~~~
moe
Either jason has something really interesting up his sleeve or he simply
doesn't know what he's talking about.

Last I checked, Linux had no interfaces to partition the page cache in a
meaningful way, short of rather extreme gymnastics involving kernel patches
or tmpfs abuse.

If that has changed then I'd certainly also be curious to hear about it.

------
PaulHoule
Hell yeah.

From what I've seen of MongoDB I'm not impressed at all. In some carefully
controlled cases, performance would be acceptable, but change anything at all
(even the order that data is inserted) and it just sucks.

For one particular application, the performance difference between MySQL and
Mongo was like the difference between the Space Shuttle and a Chevy Sonic.

~~~
jeskeca
I think it's important people understand WHY this is the case, because it's
not simply an issue for MySQL vs Mongo.

The simplest way to get really good performance for multi-row queries (even if
you fall out of cache) is to physically order your data in query order: that
is, if you are going to ask for the most recent 100 blog comments, order the
comments by (blog-post-id, reverse comment-date).

MySQL-InnoDB makes this really easy (your data is physically ordered by
primary key). In MongoDB it's not possible.

Here is a more elaborate explanation...

MySQL-InnoDB stores record data in primary-key order (it puts the data right
into the b-tree). This means that if you want to access 100+ records in
primary-key order, it's pretty darn efficient. Even if it's out of cache, it
could be just one or two disk seeks (depending on how many records fit in a
block).

MongoDB stores record data in a heap-table in a semi-random order based on
insertion and freespace, with each document receiving a "doc id". You can make
an index on whatever you want, but when Mongo fetches multiple records, it
cross-references every index entry with the doc_id. If your data is out of
cache, this means a seek for _every single document_. AFAIK, as of 2012, there
is no way around this, because there is no way to get mongodb to store the
document data directly in the b-tree. This is a big part of why Mongo is
super-slow if you fall out of cache.
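
The cost difference described above can be sketched with a toy model: count how many fixed-size pages (each page read being a potential disk seek once you fall out of cache) are touched when fetching 100 adjacent keys from a clustered layout versus a heap layout. The page size and record counts below are made up for illustration:

```python
import random

random.seed(0)
PAGE_SIZE = 16  # records per page (illustrative)

def pages_touched(physical_order, keys):
    """Number of distinct pages read to fetch the given keys."""
    pos = {k: i for i, k in enumerate(physical_order)}
    return len({pos[k] // PAGE_SIZE for k in keys})

keys = list(range(1000))
clustered = list(keys)      # stored in primary-key order (InnoDB-style)
heap = list(keys)
random.shuffle(heap)        # stored in insertion/freespace order (heap-style)

wanted = list(range(100))   # e.g. the 100 comments on one blog post
print(pages_touched(clustered, wanted))  # 7: the records are contiguous
print(pages_touched(heap, wanted))       # dozens: nearly one page per record
```

The clustered layout reads the 100 records from a handful of adjacent pages, while the heap layout scatters them, which is the "seek for every single document" effect described above.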

HOWEVER, there are some other systems that also have this problem, including
some ORMs that sit on top of MySQL. Ruby on Rails forces you to use an
auto_increment primary key for every record, which means even if you use
MySQL, you are forcing your data into a semi-random insertion order. Django
does this as well. If you want to efficiently fetch a bunch of records (like
100 comments on a blog post), then you want them to be in primary key order.

In the SQL world, Oracle, MS-SQL, and Postgres also normally use a form of
heap-table for records... This allows records to have a physical "ROWID" which
can be used to directly look them up (it's a physical block address with an
O(1) lookup). However, it also means they are in semi-random order. The ROWID
was an important join and foreign-key optimization back in the days of nightly
SQL jobs on machines with very little RAM. Today, it's not a good
optimization. B-tree indirect blocks are always in RAM, so direct ROWID
lookups have little benefit over b-trees, and the huge drawback of no natural
primary key ordering. One can work around this problem with fully-covered
indexes, table-in-index, and key-clustering -- all of which have their own
frustrating tradeoffs.

Does primary-key ordering have downsides? References to records are bigger
(They have to contain the full primary key), there is no guaranteed stable way
to reference a record (for foreign key constraints), and if you change fields
in the primary key, the entire record must be moved. In the "old days" there
was another big drawback: b-tree O(log n) lookups are much slower than ROWID
O(1) lookups. Today this is not an issue because all b-tree indirect nodes
always fit in RAM.

Bottom line: if you want the easiest way to order data properly, use
MySQL-InnoDB and choose your primary key wisely. If you are using another
storage system, study up on how you can control physical ordering, because
every system is different.

~~~
bsg75
Does MySQL(InnoDB) allow you to cluster on something other than the PK?

Big advantage in that option (Postgres, MSSQL, Oracle).

------
heelhook
"My essage to companies building applications on top of MongoDB: assigned
smart people to MongoDB and don't leave the database work to people that can
hardly spell their name or that can just count to three. Yes, this paragraph
is harsh and does not comply with diversity but it is true and reality. The
number of people that should not do any database related work, people without
reasonable background, people lacking basic skills in understanding databases
is extraordinary high."

Isn't that true for _any_ database? What point are you trying to make? That a
large MySQL deployment can be flawlessly maintained by people that can
"hardly spell their name"?

~~~
drats
I also like "essage" and "assigned smart people" instead of "assign smart
people" in a sentence talking about intelligence and spelling. He's probably
not a native speaker but that only excuses the second bit, and only partly.

------
mrkurt
This is kind of a strange list of complaints.

MongoDB memory management is a legitimate concern... but not because it's hard
to control memory usage of a single mongod.

"More granular locking" is a temporary, non-scalable solution?

I've run out of energy, actually, but really?

~~~
harryh
Ya, there are lots of well reasoned complaints about MongoDB out there. This
is...not one of them. It's weird that this is getting voted up.

------
dkhenry
This reads more like a rant than an actual discussion of problems the company
was having with MongoDB. There is a place for valid criticism, but this is the
polar opposite. I am actually more interested in the fact that this made it to
the front page so fast than in the actual content of the article. Are there so
many people upset with Mongo that even something as poorly done as this rant
can get publicity?

------
kyt
We dumped Mongo for a lot of the same reasons listed in the article. Their
python driver isn't great either.

~~~
mydigitalself
In favour of...?

~~~
drano
In my case, Redis. Really good choice for my project

~~~
jshen
I don't understand how you would switch from mongodb to redis. They don't do
the same thing, at all.

~~~
chris_wot
I think your second sentence answered your first.

~~~
jshen
How does someone mistakenly choose mongo for a job fit for redis?

------
Uchikoma
Wow, how many "Goodbye, MongoDB" stories are coming? The last few days have
seen several. Not sure if this is already a trend?

~~~
kennystone
The trend is this: any loved technology that's been around a little while
(Rails, Node, Mongo, etc) makes for dramatic farewells on hackernews. It's not
unlike supermarket tabloids.

~~~
zacharyvoase
I disagree. The trend is this: any hype has an equal and opposite anti-hype.
10gen have spent (probably) millions of dollars and countless hours
representing MongoDB and singing its praises in various media (conferences,
user groups, Web, etc.). If you do so, you need to expect equally fierce
reactions when people realise your claims are unsubstantiated (or, at the very
least, not as universal and problem-free as you originally portrayed).

~~~
bsg75
I wonder where all those millions come from? All venture capital, or is 10gen
gaining paying customers?

~~~
orthecreedence
As a former Aol employee, I know of a large support contract between them and
10gen. I think a lot of money comes from large companies who have teams that
want to use MongoDB, and the large company buys a huge support contract for
that team and any other team that might want to use it. Maybe that was just
Aol, but I'm guessing that pattern is the same at other large tech companies
too.

------
white_devil
But.. but.. MongoDB is web-scale: <http://www.youtube.com/watch?v=b2F-DItXtZs>

(Note how the video raises some of the same concerns as the blog post)

~~~
lucisferre
Because nothing adds to the cogency of one's argument like cartoon animals...

------
pooriaazimi
I'm amazed at all these (excuse me) idiotic articles. People and projects have
different requirements, so there are many databases around (relational, NoSQL,
and key/value). Just because your needs don't match MongoDB's (or MySQL's, or
...) doesn't mean the technology is useless.

------
daurnimator
The biggest problem with MongoDB, IMO, is that BSON dictionaries are ordered.
Let that sink in for a sec: the hash data structure must be ordered. The
solution most drivers run with is to just alphabetically order each
dictionary, an inefficiency I'm not really happy with.
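
The driver workaround mentioned (canonicalizing key order before encoding) can be sketched in a few lines of Python; the recursive sort on every document is the inefficiency being complained about:

```python
def canonicalize(doc):
    """Recursively sort dict keys, as some drivers do before BSON-encoding,
    so that two logically equal documents serialize to the same bytes."""
    if isinstance(doc, dict):
        return {k: canonicalize(doc[k]) for k in sorted(doc)}
    if isinstance(doc, list):
        return [canonicalize(v) for v in doc]
    return doc

# Same logical document, different insertion order:
a = {"b": 2, "a": {"y": 1, "x": 0}}
b = {"a": {"x": 0, "y": 1}, "b": 2}

print(list(canonicalize(a)) == list(canonicalize(b)))  # True: same key order
```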

------
madhadron
Look, someone else realizing that Codd was right.

~~~
weavejester
Right about what, exactly? The relational model certainly has its advantages,
but it is not without its disadvantages, either. When it comes to databases,
there's no one model that caters for every situation.

------
mangler
This is completely unrelated to the subject, and I have not used mongodb, have
no idea if it's good or bad, and I don't even know how to spell it, but isn't
it ironic that this post shows up on the day when the top HN post's title is
"Please learn to write"?

As I said, no idea how good or bad mongo is, but I'm guessing, if you are as
sloppy in your code as you are in your English, I'll be happy to give mongo
the benefit of the doubt...

------
triathlete
It's unfortunate that right now none of the 3 major document stores seem to be
doing all that well or are easy to use straight out of the box. I use and like
MongoDB, but only for prototyping. I haven't decided what to go with longer
term if my projects have a need. CouchDB is interesting but seems to be going
through some serious growing pains right now, with the Couchbase product being
very confusing to figure out and use. Riak is also interesting, but it seems
more of a specialty tool than a general-purpose one.

Kind of a bummer.

~~~
orthecreedence
I've done a lot of research and testing with Riak (not had the pleasure of
using it in production yet), and although some of the things that are easy in
other DBs are backwards (like listing all records, for instance), I have a
secret love affair with it. MongoDB claims to have simple scalability, but
I've worked with it on a large project at Aol and found this to not be true at
all. We basically had to implement our own sharding on top of it since its
auto-sharding was so poor, and only needed to shard in the first place because
of the global write lock. Riak, on the other hand, is an operations dream, and
if you're a dev and ops guy in one (like me) I stay up at night thinking about
using it for every project.

That said, a lot of people have started to use Riak for a general purpose
tool. There are some development growing pains associated with this (and you
have to think _very carefully_ about your keys and data structure) but it's
only getting easier with things like secondary indexes and Riak search. If
there was a non-expensive way to enumerate all the data in a bucket, I think
that'd be the one last item on the checklist before I jumped on it.

------
gggggood
I keep reading about how mongo's use of memory mapped files is real bad. Isn't
that the same technique used by varnish cache and that's what makes it
awesome? Can someone explain please?

~~~
moe
Varnish has a much simpler access pattern (which happens to fit the page-cache
semantics like a glove in almost all use-cases) and was developed by a
drastically more competent team.

Comparing Varnish to MongoDB is akin to comparing a precision Swiss Rolex to
a plastic Mickey Mouse watch from a gumball machine.

------
lucian1900
Actually, using mmap-ed files is a great idea. It's precisely what Varnish
does too.

~~~
stingraycharles
That was exactly what my thoughts were too: letting the OS do all the memory
management and caching is a strategy many great projects use, among them
PostgreSQL and Varnish.

However, I do feel there is something "wrong" about the approach MongoDB is
taking. They need to allocate new files in huge buffers, which completely take
up all I/O while being filled with zeroes. There is no logical hierarchy in
the files, and it just feels a bit weird.

Perhaps they should've taken the approach PostgreSQL did, which is to simply
use files and read from them instead of using mmap. The whole reason they went
for a global lock instead of more granular lock is because the whole mmap'ed
area is one big blob, and it was the most "obvious" approach.

~~~
orthecreedence
Thanks for the insight on the global write lock. I've searched and searched
and wasn't able to find anything on _why_ they have the global lock.

Out of curiosity, is there a simple way to explain why someone would mmap
instead of just reading files directly (I've never done any programming with
mmap, so I'm a bit ignorant of its use cases)?

~~~
stingraycharles
This SO entry seems to answer your question fairly well:
[http://stackoverflow.com/questions/258091/when-should-i-use-mmap-for-file-access](http://stackoverflow.com/questions/258091/when-should-i-use-mmap-for-file-access)
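
To sketch the difference in Python (a minimal illustration, not MongoDB's actual code): with mmap the file's contents are addressed like ordinary memory, and the kernel, not your program, decides when to fault pages in and what to keep cached:

```python
import mmap
import os
import tempfile

# Write a small demo file.
fd, path = tempfile.mkstemp()
os.write(fd, b"hello mongodb " * 100)
os.close(fd)

with open(path, "rb") as f:
    mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    # No explicit read()/seek() calls: the mapping is sliced like a bytes
    # object, and the kernel pages data in on demand, keeping it in the
    # shared page cache for the next reader.
    head = mm[:5]
    mm.close()

os.remove(path)
print(head)  # b'hello'
```

With plain `read()` you copy data into buffers you manage yourself; with `mmap` you give that control (and the eviction policy) to the OS, which is exactly the trade-off being debated here.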

------
pedalpete
So what is the new solution? Back to relational? Or another type of document
store?

~~~
jandrewrogers
MongoDB is fine at the conceptual level. The problem is that the architecture
and implementation are consistently poor in myriad ways. It is a case study of
what can happen when well-meaning individuals with little experience in
database architecture and implementation attempt to build a scalable database
engine.

That said, a competently engineered RDBMS can do everything NoSQL databases
can, particularly limited databases like MongoDB. The caveat is that you have
to learn how to use those databases; they are very feature rich and powerful
but that flexibility makes them more complicated. PostgreSQL is a very good
choice from the open source world and is just as fast as NoSQL du jour in the
hands of someone that knows it.

I currently design extreme-scale real-time analytical database engines, so I
have no vested interest in any particular solution (we are not really
competing with the current market). If I was going to build a large-scale web
app today and needed a backing database, I would go with PostgreSQL -- it is
very capable and well-engineered.

~~~
the_paul
...a competently engineered Turing Machine can do everything NoSQL databases
can, particularly limited databases like MongoDB. ... Emacs is just as fast as
NoSQL du jour in the hands of someone that knows it.

FTFY.

The difference between using a good competently engineered distributed
database and PostgreSQL is that when your prime concerns are horizontal
scaling and operations costs, the distributed database can be an order of
magnitude simpler and faster than PostgreSQL given the same amount of effort.

------
salimmadjd
Blame yourself before blaming MongoDB. If you've been around the software
industry you should always be mindful of fads and vaporwares. When you make a
decision to use MongoDB you better have done your homework first or do some
testing yourself. Given the low cost of renting bunch of EC2 machines for a
few hours, it's idiotic to build a business around MongoDB or any other system
that has not been fully proven without doing bunch of stress testing yourself.
Yes, and don't trust software vendors, get independent advice or test it
yourself.

------
debacle
A good read. I'm working on a relatively large project now, written in Node
(whee), and I was considering going full-koolaid with Mongo. I think I may
stick to MySQL.

~~~
getsat
Give PostgreSQL a shot if you're still considering databases.

~~~
debacle
I love Postgres, but I'm building something for OSS use, and most
unzip-and-deploy devs don't know how to set up Postgres.

Any additional information that you might be able to give me that might
convince me?

~~~
getsat
Ah. If your target is people who aren't going to be reading docs, I don't
think any DB is really preferable. :)

Any more details about the project, or is it under wraps?

~~~
debacle
It's basically a simple blogging platform, going to be released OSS.

Most devs know how to get MySQL running without too much thought, but many
don't even know what Postgres is.

~~~
chris_wot
It might be worthwhile educating them. It shouldn't be that much harder to
show them how to setup Postgres than MySQL :-)

If it's only a simple blogging platform, it might well be that MySQL will do
the job just fine though.

------
drano
My Xen VM crashed several times per day for several weeks. And this problem
disappeared when I removed my MongoDB database...

~~~
jonny8
Some kernel versions had issues with block devices causing kernel panics; a
minimum of 2.6.36 is needed.

------
petercooper
_using JSON as a query language was a bad decision. The current JSON query
language works for standard queries but the functionality of the operators is
limited._

These two things don't go hand in hand. JSON _could_ be used to elegantly
represent complex queries. A problem with the query system isn't necessarily a
problem with JSON.

~~~
macspoofing
>JSON could be used to elegantly represent complex queries

I think it could be used to represent complex queries; I just don't think
"elegantly" would describe it. I'm thinking it would be a small step up from
an XML representation.

------
dazoot
We started out with just MySQL. Then added MongoDB + replica sets. Then added
Cassandra. And now we just finished adding Elasticsearch. All of this for the
same web application. Use the right tool for the job. The pattern I've noticed
is that indeed we started migrating DATA out of MongoDB, mostly to Cassandra.

------
stewie2
is there any replacement for mongodb?

~~~
bsg75
PostgreSQL: [http://nosql.mypopescu.com/post/14114368488/postgresql-hstore-the-key-value-store-everyone-ignored](http://nosql.mypopescu.com/post/14114368488/postgresql-hstore-the-key-value-store-everyone-ignored)

------
hmans
tl;dr We LOVED MongoDB (<http://www.zopyx.de/blog/plone-using-highcharts-and-jqgrid>) but we got burnt so it's USELESS and BRAINDEAD!

~~~
paulsutter
That's the common pattern. MongoDB is easy and convenient when you first start
using it, but you hit the limitations pretty quickly. I had the same
experience.

~~~
orthecreedence
I really think if they had document-level locking, it would be a much more
successful DB. 10gen's answer to this is "just shard!!1" except that sharding
in Mongo really isn't as easy as they make it out to be...not to mention what
database makes you shard just so you can support more than a few hundred
writes per second? Maybe this has improved a bunch since I got burned by it on
v1.6, but I believe it's very misleading to claim scalability of a product
that has a global write lock. Being forced to shard when you reach
small-to-medium size is just sad.

All other aspects about it I love, though. It really makes development and
deployment so much faster. Replica sets aren't perfect, but setting up MySQL
replication w/ automatic failover on three or more machines is a recipe for
disaster unless you have a DBA to sit there and baby sit it full-time.

~~~
mathias_10gen
Locking in particular has improved substantially since 1.6. For example, 2.0
introduced yielding in some cases where MongoDB would go to disk rather than
page-faulting with the lock held[1]. This has been extended and improved for
2.2, along with increasing the granularity of the lock from process-wide to
per-db. There are plans to increase the granularity further in future
releases.

[1] To see an example of the difference this makes see
<http://blog.pythonisito.com/2011/12/mongodbs-write-lock.html>

------
scubaguy
mmap files and sharding...

It seems like the problem is that you're not using MongoDB in a sharded setup
to begin with. For good or bad, MongoDB targets the scale where you need
sharded and replicated setups. In other words, a large enough operation to
require multiple servers for data storage. If you need the opposite of that,
which is multitenancy, MongoDB is not going to be a good fit.

On the other hand, MongoDB has always been sold as a rapid prototyping and
easy to iterate datastore, which is attractive for people working on small
projects. Then they have an "oh shit" moment when they run into operational
issues.

