First, I tried to find any client of ours with a track record like this and have been unsuccessful. I personally have looked at every single customer case that’s ever come in (there are about 1600 of them) and cannot match this story to any of them. I am confused as to the origin here, so answers cannot be complete in some cases.
Some comments below, but the most important thing I wanted to say is if you have an issue with MongoDB please reach out so that we can help. https://groups.google.com/group/mongodb-user is the support forum, or try the IRC channel.
> 1. MongoDB issues writes in unsafe ways by default in order to win benchmarks
The reason for this has absolutely nothing to do with benchmarks, and everything to do with the original API design and what we were trying to do with it. To be fair, the uses of MongoDB have shifted a great deal since then, so perhaps the defaults could change.
The philosophy is to give the driver and the user fine-grained control over acknowledgement of write completions. Not all writes are created equal, and it makes sense to be able to check on writes in different ways. For example with replica sets, you can do things like “don’t acknowledge this write until it’s on nodes in at least 2 data centers.”
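As a rough illustration of that last example, here is a toy Python model of such a rule. It is not MongoDB code: the `Node` type, the `should_acknowledge` function, and the node and data center names are all made up for this sketch, which only shows the idea of holding back an acknowledgement until the write has reached enough distinct data centers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Hypothetical replica set member, tagged with its data center."""
    name: str
    datacenter: str


def should_acknowledge(nodes_with_write, min_datacenters=2):
    """Return True once the write is on nodes in enough distinct data centers."""
    datacenters = {node.datacenter for node in nodes_with_write}
    return len(datacenters) >= min_datacenters


ny1 = Node("ny1", "nyc")
ny2 = Node("ny2", "nyc")
sf1 = Node("sf1", "sfo")

# Two nodes in the same data center do not satisfy the rule...
same_dc = should_acknowledge([ny1, ny2])
# ...but one node in each of two data centers does.
two_dcs = should_acknowledge([ny1, sf1])
```

Counting distinct data centers rather than nodes is the point of the example: two copies in one building do not protect against losing that building.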
> 2. MongoDB can lose data in many startling ways
> 1. They just disappeared sometimes. Cause unknown.
There has never been a case of a record disappearing that we have not been able to trace either to a bug that was fixed immediately or to environmental issues. If you can link to a case number, we can at least try to understand or explain what happened. Clearly a case like this would be incredibly serious; if this did happen to you, I hope you told us, and that if you did, we were able to understand and fix it immediately.
> 2. Recovery on corrupt database was not successful, pre transaction log.
This is expected. Repair was generally meant for single servers, and running a single server without journaling is itself not recommended. If a secondary crashes without journaling, you should resync it from the primary. As an FYI, journaling is the default and almost always used in v2.0.
> 3. Replication between master and slave had gaps in the oplogs, causing slaves to be missing records the master had. Yes, there is no checksum, and yes, the replication status had the slaves current
Do you have the case number? I do not see a case where this happened, but if true, it would obviously be a critical bug.
> 4. Replication just stops sometimes, without error. Monitor your replication status!
If you mean that an error condition can occur without issuing errors to a client, then yes, this is possible. If you want verification that replication is working at write time, you can do it with the w=2 getLastError parameter.
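For readers who have not seen it, a minimal sketch of what such a getLastError command document looks like, built here as a plain Python dict. The helper name `get_last_error_command` and the 5000 ms timeout are assumptions made for illustration; `w` and `wtimeout` are the getLastError options as they existed in the MongoDB versions discussed here (newer servers handle write concern differently).

```python
def get_last_error_command(w=2, wtimeout_ms=None):
    """Build a getLastError command document (hypothetical helper).

    w=2 asks the server to wait until the write has reached two members
    (the primary plus one secondary) before reporting success.
    """
    # The command name has to be the first key; dicts keep insertion order.
    command = {"getlasterror": 1, "w": w}
    if wtimeout_ms is not None:
        # Without a timeout the call can block if no secondary catches up.
        command["wtimeout"] = wtimeout_ms
    return command


cmd = get_last_error_command(w=2, wtimeout_ms=5000)
```

With a driver you would issue a document like this on the same connection right after the write. Most drivers of that era wrapped this behind a "safe" or write-concern option, so you rarely needed to build it by hand.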
> 3. MongoDB requires a global write lock to issue any write
> Under a write-heavy load, this will kill you. If you run a blog, you maybe don't care b/c your R:W ratio is so high.
The read/write lock is definitely an issue, but a lot of progress has been made, with more to come. 2.0 introduced better yielding, reducing the scenarios where locks are held through slow IO operations. 2.2 will continue the yielding improvements and introduce finer-grained concurrency.
> 4. MongoDB's sharding doesn't work that well under load
> Adding a shard under heavy load is a nightmare. Mongo either moves chunks between shards so quickly it DOSes the production traffic, or refuses to move chunks altogether.
Once a system is at or exceeding its capacity, moving data off is of course going to be hard. I talk about this in every single presentation I’ve ever given about sharding[0]: do not wait too long to add capacity. If you try to add capacity to a system at 100% utilization, it is not going to work.
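The arithmetic behind that advice can be sketched in a few lines of Python. This is a back-of-the-envelope model, not anything MongoDB computes: the function name and the overhead figures are invented for illustration, and utilization is expressed as a fraction of what the existing shards can serve.

```python
def can_absorb_migration(utilization, migration_overhead):
    """Return True if the shards have headroom for chunk migration traffic.

    Both arguments are fractions of total capacity (hypothetical numbers).
    Moving chunks costs extra reads on the donor shard, so that cost has
    to fit on top of the production load already being served.
    """
    return utilization + migration_overhead <= 1.0


# At 60% load, a migration costing 20% of capacity still fits.
early = can_absorb_migration(0.6, 0.2)
# At 100% load there is nothing left to pay for the migration.
too_late = can_absorb_migration(1.0, 0.1)
```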
> 5. mongos is unreliable
> The mongod/config server/mongos architecture is actually pretty reasonable and clever. Unfortunately, mongos is complete garbage. Under load, it crashed anywhere from every few hours to every few days. Restart supervision didn't always help b/c sometimes it would throw some assertion that would bail out a critical thread, but the process would stay running. Double fail.
I know of no such critical thread. Can you send more details?
> 6. MongoDB actually once deleted the entire dataset
> MongoDB, 1.6, in replica set configuration, would sometimes determine the wrong node (often an empty node) was the freshest copy of the data available. It would then DELETE ALL THE DATA ON THE REPLICA (which may have been the 700GB of good data)
> They fixed this in 1.8, thank god.
Cannot find any relevant client issue, case, or commit. Can you please send something that we can look at?
> 7. Things were shipped that should have never been shipped
> Things with known, embarrassing bugs that could cause data problems were in "stable" releases--and often we weren't told about these issues until after they bit us, and then only b/c we had a super duper crazy platinum support contract with 10gen.
There is no crazy platinum contract, and every issue we ever find is put into the public Jira. Every fix we make is public. Fixes have cases which are public. Without specifics, this is incredibly hard to discuss. When we do fix bugs, we try to get the fixes to users as fast as possible.
> 8. Replication was lackluster on busy servers
This simply sounds like a case of an overloaded server. I mentioned this before, but if you want guaranteed replication, use the w=2 form of getLastError.
> But, the real problem:
> 1. Don't lose data, be very deterministic with data
> 2. Employ practices to stay available
> 3. Multi-node scalability
> 4. Minimize latency at 99% and 95%
> 5. Raw req/s per resource
> 10gen's order seems to be, #5, then everything else in some order. #1 ain't in the top 3.
This is simply not true. Look at commits, look at what fixes we have made when. We have never shipped a release with a secret bug or anything remotely close to that and then secretly told certain clients. To be honest, if we were focused on raw req/s we would fix some of the code paths that waste a ton of cpu cycles. If we really cared about benchmark performance over anything else we would have dealt with the locking issues earlier so multi-threaded benchmarks would be better. (Even the most naive user benchmarks are usually multi-threaded.)
MongoDB is still a new product, there are definitely rough edges, and a seemingly infinite list of things to do.[1]
If you want to come talk to the MongoDB team, both our offices hold open office hours[2] where you can come and talk to the actual development teams. We try to be incredibly open, so please come and get to know us.
One addendum to Eliot's "both our offices hold open office hours": we (10gen) also recently opened an office in London.
Although we don't yet have a fixed office hours schedule, we typically hold them every 2 weeks. The exact dates are announced via the local MongoDB Meetup Group; we always hold the hours at "Look Mum No Hands" on Old Street.
At least one (and often several) of our Engineers make themselves available during this time to answer any questions and assist with MongoDB problems.
We've been using Mongo for almost a year now, and we've not seen any of the major issues such as data loss referred to. We've seen some of the growing pains of a quickly moving, dynamic platform, but nothing outside of the realm of what is reasonable for such a powerful solution. It's true that implementing sharding is no simple task, but with enough planning up front, you'll find yourself able to scale horizontally very quickly. After a couple of weeks of planning, we wound up making a few small changes in our codebase to migrate from master/slave to a sharded environment. Not a huge undertaking by any stretch, provided the current flexibility of our platform. Also, due to the fact that 10gen does make all bug information publicly available, we've managed to get it done with zero surprises.
> If you want to come talk to the MongoDB team, both our offices hold open office hours[2] where you can come and talk to the actual development teams. We try to be incredibly open, so please come and get to know us.
I envy how all your (potential) customers are from California.
Besides office hours in California, NY, and London, we also have user groups in many cities (http://www.10gen.com/user-groups) and frequently hold one-day, very inexpensive developer conferences (the next two are in Dallas and Seattle).
Most of the best practices/gotchas can be found by reading the online documentation. Of all the replies Eliot gives they were either plainly obvious (oh, you have a system under heavy load and you're surprised that it gets worse when you give it another task to do?) or mentioned in the documentation. If you're planning on using something - especially for a production system - I sure hope you at least read all the available documentation.
I don't think a short doc is of any help for evaluators. You shouldn't be basing your decision on 400 words and some bullet points. If you're serious about your datastore then you should treat it seriously.
When I was doing my research and came across a bunch of "Why not to use MongoDB" articles, I looked at alternative solutions to see if there was anything "better." Granted, NoSQL is the new kid on the block, but I wanted to see what my options were. Guess what I'm using: MongoDB. Why? Their documentation is fan-f'n-tastic. Their newsgroup support is just as good, with lots of folks who help troubleshoot issues, including the developers themselves.
The original story was submitted by nomoremongo, not nmongo. The original story was very detailed and identified known problems with MongoDB. This post is a one-liner.
So prove you were the one who wrote the account in pastebin.
More evidence: nomoremongo and nmongo have some differences in their writing style. nomoremongo uses semicolons properly, nmongo does not. An even bigger difference: nmongo doesn't capitalize his Is.
Yes, i am a troll, and things have gotten a little out of hand.
Just because a story was very successful at fishing for up-votes, it doesn't have to be true, people around here need to be a lot more sceptical.
And i think everyone who truly pays attention will know by now that MongoDB is the next MySQL.
Whether you are the original poster or not, you're not a troll, you're a sociopath emboldened by anonymity.
Cloak yourself in some idealistic mission if it makes you feel good- but your mission isn't to make the point that "people around here need to be a lot more skeptical"- You're a sociopath that enjoys kicking a hornet's nest just to watch the reaction.
My intention was to troll as many hipsters as possible and make them a little more aware of how easy to manipulate they are, without even providing the slightest bit of evidence.
It cracks me up that there are startups out there right now, making foolish architecture decisions based on the FUD i'm spreading.
Start thinking for yourself!
And in the process of discrediting, you might have turned many people away from MongoDB. Your actions seem irresponsible to me. Unbeknownst to you at the time of posting, I'm sure, but your blog has gone somewhat viral, and it could take 10gen a while to recover from the negative press. Did you consider this when posting?
Kudos to Eliot for coming on and answering your phony accusations. I feel sorry for him though as he has obviously spent a great deal of time in responding, when he could have been doing other important things, like fixing urgent bugs. As others have pointed out, this is the mark of a company who take very good care of their customers. Customer service is what differentiates chiefs from cowboys.
HN is an important community resource, especially for people with little startup / dev experience. I would urge you to think next time before being so irresponsible.
There are only a few comments from credible sources in this thread, and none of those had anything negative to say about MongoDB, don't believe blindly.
Interesting you characterise mongodb users as hipsters - why is that? (at the risk of engaging a troll)
We use mongodb extensively, but I get the hipster feeling also, mostly because they hold office hours at Look Mum No Hands in Old Street, which is ultra proto-hipster.
If true, you do realize that you falsely tarnished a real company and product. If this was supposed to be some lesson in verifying sources and information, I think you went about it in the wrong way. What if someone started spreading misinformation about nmongo to prove a point (even an insignificant and unrelated one), would you like that?
What exactly was a hoax? The document pasted was rather detailed and, while somewhat overblown, was obviously written by someone who knew what they were talking about. It contains a lot of criticism of design decisions by MongoDB; these are pretty common and being opinion, can't really be called a hoax.
There's also a couple of anecdotes of MongoDB supposedly failing in various ways in the author's experience. Are you saying those were fake?
Just because you submitted the document here does not mean you wrote it. Pastebin logs the document as being submitted on 5th Nov. http://pastebin.com/FD3xe6Jt
I don't buy it. I don't think nmongo wrote the doc on pastebin. Maybe I'm overrating my character-detection abilities, but it didn't smell like it was written by some immature time-wasting kid.
edit: I use mongo in prod; very much a student of the "right tool for the job" school. Not trying to add or subtract weight from the original text; ambivalence reigns supreme regarding internet nosql battles. Just saying that my possibly unreliable circuits detect quite a gulf between the original document and the OP's hysterical, caps-lock-engaged cry for attention here.
This admission has my "spider-sense" tingling also. The communication style between this guy and the author of the pastebin log seems so different.
It is plausible that someone guessed the password of nmongo's throwaway account, quickly changed that password, and then started posting that the whole thing was a hoax.
-Eliot
[0] http://www.10gen.com/presentations#speaker__eliot_horowitz
[1] http://jira.mongodb.org/
[2] http://www.10gen.com/office-hours