We've been using MongoDB in production for our main content store at Conversocial for over a year now. We've gone from 1.8 to 2.0 to 2.2 and we're happy. We have ~400GB of data and 220 million documents, all on Amazon's SSD instances.
We have definitely had our moments where we've screamed that we hate Mongo and are going to rip it out. That's normally when we've overlooked a detail of how it works... and we've had this same experience with every database technology we use, including MySQL and Redis.
The day after, when we've cleared our heads... we're happy again. The same as the other technologies.
I think that with all technologies you're going to get bitten by some detail you didn't know about or had forgotten. The trick is to mitigate these disasters by thinking about your failure cases.
They're not particularly exciting. Example: we accidentally left detailed logging enabled on a secondary server and weren't monitoring the space left on the drive for logs. The disk got full and the secondary failed. Annoying that it failed but it was our fault for not monitoring that space and also leaving detailed logging on!
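A check for that particular failure case can be very small. The following is only a minimal sketch, not what we actually run: the function names and the 15% threshold are made up for illustration, and the path you pass in would be wherever your mongod logs happen to live.

```python
import shutil


def is_low_on_space(total_bytes: int, free_bytes: int,
                    min_free_fraction: float = 0.15) -> bool:
    """Return True when the free share of a volume is below the threshold.

    The 15% default is an arbitrary illustrative value, not a recommendation.
    """
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return free_bytes / total_bytes < min_free_fraction


def check_log_volume(path: str, min_free_fraction: float = 0.15) -> bool:
    """Check the volume that holds `path` (hypothetical log directory)."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return is_low_on_space(usage.total, usage.free, min_free_fraction)
```

Run something like `check_log_volume("/path/to/mongo/logs")` from cron or your monitoring agent and alert when it returns `True`. The point is only that the failure case gets written down and checked somewhere.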
At our company we have a much bigger MongoDB dataset in production. I can relate well to your comments, since it is the same stuff that I hear from our developers as they defend Mongo after yet another horrific situation. They have a tough time admitting that Mongo is the problem, and instead always find ways to say that it was their fault.
This is such a strange thing for me to observe, but I think that it has to do with the fact that MongoDB works so smoothly initially with a default install that it hooks you in, and then later when you have a large dataset and it stops working well, it's hard to understand how what was such an amazing technology can now fail so badly.
It's also a huge problem that once you run MongoDB at scale, you desperately need experts to fix things, but it's so hard to find any so-called experts who can help. 10gen did a great job of marketing to developers, but unfortunately they seemed to spit in the face of DBAs and Ops people a long time ago by proclaiming them to be unnecessary and archaic. Running any significant MongoDB database in production requires as much expertise as someone running a big MySQL instance, but there isn't a community of database lovers around MongoDB who you can hire.
The DBAs and operations people I know dislike or even hate MongoDB, often for valid reasons that 10gen should address such as the as-yet-unfixed 'write lock' issue, code instability and inaccurate/misleading documentation. 10gen has done a great job at evangelizing to developers and making features that developers love. Now that 10gen has so much money, I hope that they can now afford to start making MongoDB a database that Ops people and DBAs can love.
As a final note, this ServerDensity guy is clearly looking at the world through mongo-colored glasses. It makes sense I suppose since he is all-in with MongoDB for his company. But we were using his service up until three months ago, and had a lot of problems with intermittent performance issues that seemed to be database-related since the site was still working and only certain pages would take 30 seconds or so to load. It's possible that the problems were temporary and no longer exist, but it brings home my point. If MongoDB experts can't make their own services 100% reliable, what hope does a regular startup have of getting MongoDB to work well at scale.
A developer may feel comfortable making the decision to go with MongoDB, but if they are wrong it won't cost them their job and they won't need to pull their hair out dealing with Ops issues all the time. If a DBA or Ops guy is being hired to manage a company's datastores, I don't see MongoDB (even 2.2) being a contender. At this point there are simply other DBs available that can perform the same or nearly the same without all the fussiness. Developers may be unhappy since nothing yet is as easy to develop on, but they'll be happier in the end when stuff 'just works'.
This is an excellent anecdote and one that others should pay attention to, regardless of the technology.
Developers often have confirmation bias for their choices—it's a common human fault. We do it for purchases, life decisions, and software decisions alike. But it's important to watch out for it and be aware when you might be in the thick of your own bias.
A friend of mine (let's call him "me") used Adobe Flex for a relatively large, computationally expensive project, and advocated for it because of many small details, its relatively quick learning curve, and the ease of UI prototyping. After we got entrenched, we started running into many shortcomings which I (er, my friend) should have realized early on were due to the nature of the platform itself, but I continued to defend Flex because it was my decision and I had put so much time into making it work.
In the end, we realized it wasn't the right fit for this specific job and moved to a more capable platform with fewer issues (aka, "any other platform than Flex").
Before making big decisions, I try to remember that one mistake, and get out of my own way first to look at it from an outside perspective. I try to differentiate between making decisions based on fact and just trying to justify patching holes in the Titanic. This "looking at the world through [insert technology]-colored glasses" idea is spot on, and everyone, not just those using or debating MongoDB, should be aware of it.
Exactly... and while I can appreciate the frustrations and tales of woe, I read the #*%#k out of the docs and build a number of prototypes before I go all in. I also read all the top questions on StackOverflow on that tech, as well as any blogs that mention it.
Also: google "[technology name] sucks" to find all the naysayers, so you know what the downsides are according to people in the field and can make sure you are prepared to deal with them.
If you do the five minute quickstart guide, and read the feature list, and throw it into the mix, well...
Of course. But this isn't a question about what is or isn't in the documentation. Of course everything you need to know should be in the documentation, and you should RTFM.
No, it's about obfuscation vs clarity, and to an extent, about how good the overall design really is. Look at it as a measure of quality. Compare two cases. In one, I have to read every word of the documentation to find the tiny part on page 18 that mentions the one flag I need to start it up with in order to ensure that one feature works the way I want. In the other, there is a sensible default, clear documentation, and even a clear design where perhaps that flag isn't required at all (think automated memory management that Just Works, versus a dozen command-line or config-file switches about memory buffer size and such).
When you run into issues, it's not a useful excuse to say "Well, you should have read the documentation, it was all there." That's like a shady credit card contract where the rate goes up 30% if you miss a payment. It's easy, even if you read it, to say "well I won't ever miss a payment," and even easier to miss that clause completely. Can you blame people for not reading the fine print? Sure, it was their responsibility. Will people still do it? Of course. Will people who read the contract fully still run into issues if they make one mistake? Probably.
And the point: is it a crappy contract? Yes. It has a crappy feature to begin with, and it's made worse by its obfuscation. This is almost unarguable: a credit card with a lower rate penalty is better. A piece of software with good defaults and sensible design is better.
The requirement for asinine and lengthy documentation isn't just a big warning sign that you should read it—it's a sign of poor design, or at least the word everyone's been using to describe Mongo: immature.
Good design includes the whole experience of using the software, and takes into account good integrated systems and human interaction. Bad design requires you to read the documentation extremely carefully. These are not hard and fast rules, but they're certainly warning signs. Really clear and obvious warning signs. And the overall point is that the well-documented issues that come up with Mongo are not just stupid people who don't read documentation—they are statistically relevant pieces of evidence pointing to some poor design decisions.
You should really think of this whenever you see a trend. You have a choice. You can blame individuals for "not reading the documentation"—or you can look at the systematic trend and statistically evaluate the problem. The former allows you to quickly dismiss issues on an emotional basis and make yourself feel better, while doing absolutely nothing to solve the issue. The latter lets you collect useful information and make real changes that have a real impact on the issue at hand. Your call.
> But unfortunately they seemed to spit in the face of DBAs and Ops people a long time ago by proclaiming them to be unnecessary and archaic.
WTF? I have never heard 10gen say anything remotely like this. And it surely hasn't come across in their marketing. I mean, seriously. Which developer thinks that in a production environment they are going to be the ones supporting the database? Nobody.
> If MongoDB experts can't make their own services 100% reliable, what hope does a regular startup have of getting MongoDB to work well at scale.
The fact that you base your impression of MongoDB based on ZERO evidence just conjecture that the slowness of their site is database related says a lot about you too. There are many other reasons it could equally be: app server, network etc.
> If a DBA or Ops guy is being hired to manage a company's datastores, I don't see MongoDB (even 2.2) being a contender.
Have you even worked at an enterprise company before? DBA/Ops aren't the ones in control. If the development team wants MongoDB installed and has a business justification, it gets installed.
> Developers may be unhappy since nothing yet is as easy to develop on, but they'll be happier in the end when stuff 'just works'.
But it does 'just work'; that's the whole point. Developers aren't stupid and MongoDB is not the only database around. You just need to (god forbid) understand how the thing works.
How on earth do you equate that statement with "we don't need a DBA when we get into production"?
Do you need a DBA to get MySQL running? No. Oracle? No. SQL Server? No. That's all it means. Normal people understand that there is a difference between getting something running and deploying it into production.
Point me to a doc (or a tiny little red note link, like the one on the 32-bit installer download page) where 10gen says that you will need a dedicated MongoDB DBA to maintain a production server. You see, the 10gen marketing machine has successfully created a perception that MongoDB is so magical that you don't have to worry about your data. New programmers with less knowledge of RDBMSs are buying this argument. And most of the time they start development right after reading a basic tutorial. This is the problem GP is trying to address.
> The fact that you base your impression of MongoDB based on ZERO evidence just conjecture that the slowness of their site is database related says a lot about you too. There are many other reasons it could equally be: app server, network etc.
Those other things are trivial to fix (e.g. add more app servers) so I too think that it's safe to assume it's probably the data store's fault.
None of what you said makes ANY sense. Have you ever actually scaled an app before?
Adding more app servers increases the number of requests you can handle. It doesn't make a slow app faster or reduce latency, and either of those could be why certain requests are slower. It's not necessarily a database problem.
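To put rough numbers on that, here is a back-of-the-envelope sketch. It is a deliberately simplified model that assumes a fixed time per request and ignores queueing entirely. The function name and the worker counts are made up; the 30 seconds is just the figure mentioned upthread.

```python
def capacity_and_latency(servers: int, workers_per_server: int,
                         seconds_per_request: float) -> tuple[float, float]:
    """Toy model: each worker handles one request at a time.

    Returns (requests per second the cluster can absorb,
             seconds a single request takes).
    """
    throughput = servers * workers_per_server / seconds_per_request
    latency = seconds_per_request  # unchanged by the number of servers
    return throughput, latency


# Doubling the app servers doubles capacity, but a page that takes
# 30 seconds to build still takes 30 seconds.
before = capacity_and_latency(servers=2, workers_per_server=10,
                              seconds_per_request=30.0)
after = capacity_and_latency(servers=4, workers_per_server=10,
                             seconds_per_request=30.0)
```

In this model the first number doubles and the second doesn't move. Whatever makes the individual request slow (app code, network, or the database) has to be fixed where it actually is.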
> Did you read the article?
Yes. I read the documentation before installing MongoDB so I haven't had any problems (so far).
There's no reason other than the data store that would make their site that slow. Yes I have experience in this area. Until they come out and tell me otherwise I'm gonna bet that the data store is at fault. I could be wrong but I'm willing to give you 5 to 1 odds against.
Could you elaborate on the things that MongoDB brings to the table versus a database like MySQL? If you assume that the entire dataset fits in memory for both MongoDB and MySQL, as well as SSD backing for both, what particular advantages does MongoDB have for you in production?
With 400GB of data I'd imagine you are going to be ripping your hair out at some point or another regardless of what database you use ... there is no way to completely avoid tech issues over the lifetime of a product/service.