If you tell a database to store something, and it doesn’t complain, you should be able to safely assume that it was stored.
This has nothing to do with the 2GB limitation. Nowhere in the documentation does it mention that it will silently discard your data. What will happen with the 64-bit version if you run out of disk space? More silently discarded data?
I know a lot of you may have cut your teeth on MySQL which, in its default configuration, will happily truncate your strings if they are bigger than a column. Guess what? Anyone serious about databases does not consider MySQL to be a proper database with those defaults. And with this, neither is MongoDB, though it may have its uses if you don't need to be absolutely certain that your data is stored.
EDIT: Thanks for pointing out getLastError. My point still stands, since guaranteed persistence is optional rather than the default. In fact, reading more of the docs reveals that some drivers call getLastError by default to ensure persistence. That means that MongoDB + Driver X can be considered a database, but not MongoDB on its own.
I'm just struggling to imagine being willing to lose some amount of data purely for the sake of performance, so philosophically it's not a database unless you force it to be. Much like MySQL.
EDIT2: Not trying to be snarky here, but I would love to hear about datasets people have where missing random data would not be an issue. I'm serious, just want to know what the use case is that MongoDB's default behaviour was designed for.
EDIT3: (Seriously) I'm sure MongoDB works splendidly when you set up your driver to ensure that a certain number of servers will confirm receipt of the data (if your driver supports such an option); nowhere am I disputing that. But that number really should have a lower bound of 1, enforced by MongoDB itself. And to the guy who called me stupid: you are what's wrong with HN.
Demonstrably false. http://www.mongodb.org/display/DOCS/getLastError+Command
"MongoDB does not wait for a response by default when writing to the database. Use the getLastError command to ensure that operations have succeeded."
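To make the default behaviour concrete, here's a toy model in Python. This is not the real driver API, just an illustration of the shape of the protocol: the write call returns immediately and reports nothing, and a separate getLastError-style call is the only way to learn that a write failed. The `ToyMongo` class and its 2-document cap (standing in for the 2GB/32-bit limit) are entirely hypothetical.

```python
class ToyMongo:
    """Toy model of MongoDB's default write behaviour (illustration only)."""

    def __init__(self, capacity=2):
        self.capacity = capacity      # stand-in for the 2GB/32-bit ceiling
        self.docs = []
        self._last_error = None

    def insert(self, doc):
        # Fire-and-forget: always returns immediately, reports nothing.
        if len(self.docs) >= self.capacity:
            self._last_error = "datastore full, write discarded"
            return                    # caller is none the wiser
        self.docs.append(doc)
        self._last_error = None

    def get_last_error(self):
        # Only an explicit check reveals that the last write failed.
        return self._last_error


db = ToyMongo()
for i in range(3):
    db.insert({"n": i})               # third insert is silently dropped

print(len(db.docs))                   # 2 -- one document never made it
print(db.get_last_error())            # datastore full, write discarded
```

The point of the sketch: nothing in the `insert` call path tells you anything went wrong; you only find out if you remember to ask.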
Let's say, outside of the tech world: when you send a postcard (a cheap one) to a friend, you won't receive any delivery confirmation. You just send it and go do whatever you please, believing the postcard will get there. If it doesn't, no biggie, you'll send another on your next trip anyway. No hurt feelings.
But, let's say you need to send me a check. You want to know whether I received it or not, especially because sometimes I don't cash checks right away. Without confirmation it would be difficult for you to decide whether to cancel the previous check and send another, or do nothing, because I could be at that very moment trying to cash the check, or it could be lost somewhere. Delivery confirmation is an add-on where you receive confirmation that the envelope got there, but see, it takes time for that confirmation to arrive. It's expensive. If you are sending a $0.01 check, you can just send another if the recipient asks.
If I ask my bank why my account does not reflect my latest deposit and they say 'Sorry, I guess we didn't get it', I'm getting a new bank.
And the flaw of your argument: even if there are other, more important things for an application, let's just make everything other than the #1 feature shit.
I don't actually understand what you mean, here, but since you say it's the flaw of my argument, I'm very interested in it. Could you rephrase briefly?
I'm not saying it's not at all important that chat messages actually get sent, and if it happens every single day to a user, then they might well look for alternatives, but it's not of the same importance as losing a banking transaction. If accepting that occasional writes will be dropped on the floor allows you to get your product out in October instead of December, that could be an acceptable tradeoff. Certainly not every use case is like this, but some are.
I guess what I'm trying to say is this: You cannot ignore all the other features except the biggest one.
I take your point though, but I think consistency is still one of the most important attributes for anything that is going to store data. Why even use a database, if your data matters so little? Just throw it into memory or memcache.
If I'm logging upvotes on a post or comments on a blog, which is about as serious as 99% of these b.s. startups are doing, I think it's fair to ignore errors.
I do agree that this should be pointed out in huge blinking letters though, or be a driver flag that is on by default. The amount of people who don't know this about Mongo, but are still using it to store gigs of data, is horrifying.
> If I'm logging upvotes on a post or comments on a blog,
> [...] I think it's fair to ignore errors.
Acceptable software, particularly in the class of databases, is obligated to tell you that it didn't complete your request. This is not an option.
IMO, this is important enough information that it should be mentioned from the start, but it isn't in the tutorial, nor can I find it in the FAQ.
There's also a reference to the issue in your second link, though it's not super clear. http://docs.mongodb.org/manual/faq/replica-sets/#are-write-o...
EVERY database call should be wrapped in exception handling to make sure that any errors (e.g. connection errors) are handled appropriately. MongoDB is no different in this case.
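The discipline being described is nothing Mongo-specific. A minimal sketch, with a stand-in `db_call` function (hypothetical, since the real call would be whatever your driver exposes), showing the wrap-log-retry-then-surface pattern:

```python
import logging

def db_call():
    # Stand-in for any driver call that can fail, e.g. on a dropped connection.
    raise ConnectionError("connection reset by peer")

def save_with_handling(retries=3):
    for attempt in range(1, retries + 1):
        try:
            return db_call()
        except ConnectionError as exc:
            logging.warning("attempt %d failed: %s", attempt, exc)
    # After exhausting retries, surface the failure instead of swallowing it.
    raise RuntimeError("database unavailable after %d attempts" % retries)

try:
    save_with_handling()
except RuntimeError as exc:
    print(exc)  # database unavailable after 3 attempts
```

The key detail is the last line of the function: handled-but-unrecoverable errors still get raised to the caller rather than silently ignored.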
You can only handle the errors that you know how to handle, in this case retrying the operation may have created a bigger problem.
It's like, literally, right there in the brief manual. Takes an hour to read and understand.
Perhaps a better option would be to have an 'unsafe_write' option. But then of course, benchmarks that didn't use a function with 'unsafe' in the name would look less impressive.
[Ed: The following is an unusual default requirement]
Me: "MongoDB, did you store what I asked?"
MongoDB: "Nope! Good thing you checked!"
Me: "MongoDB, please store this: ..."
MongoDB: "Okay, I've accepted your request. I'll get around to it eventually. Go about your business, there's no sense in you hanging around here waiting on me."
Or, if you really want to be sure it's done:
Me: "MongoDB, please store this. It's important, so let me know when it's done."
MongoDB: "Sure boss. This'll take me a little bit, but you said it's important, so I'm sure you don't mind waiting. I'll let you know when it's done."
To me, the choice of performance over reliability is the hallmark of mongodb, for better or worse.
That said, I think that people really do overblow the issue and make mountains out of that particular molehill, because all the tools are there to make it do what you want. Many times, it comes down to people expecting that MongoDB will magically conform to their assumptions at the expense of conforming to others' assumptions. Having explicit knowledge of the ground rules for any piece of technology in your stack should be the rule rather than the exception.
And I say this as an old-skool C guy who does do this in critical sections of code... But for everything else I'm in a language like OCaml that behaves sanely, using a DB like Oracle that behaves sanely.
'Success' and 'Failure' are fuzzy concepts when writing to distributed databases, and you need to tell Mongo which particular definition fits your needs. The 'unsafe' default in mongo is controversial, but ranting about what a "proper database" is without even reading the docs is stupid. Instead, let's rant about what a "proper developer" should do when using a new system...
A foursquare check-in database could be an example where performance is actually way more valuable than consistency. (I have no idea what database they use)
Nice ad hominem there. MongoDB isn't DB2, just as MySQL wasn't. Both can still be used to build very good products; in fact, I'd go so far as to say they lead to better products than "proper" databases.
I'm really glad I haven't deployed mongo now in a production 32-bit system.
Response to EDIT2: Where can data loss be acceptable?
If you have a relatively speedy message system where messages are removed/outdated on receipt. I'm sure there are other specialty needs.
So by default Mongo write operations are asynchronous and you have to explicitly ask for error codes later.
It's legit to criticize a language or a database. However, it seems to me that when MongoDB gets involved, the tone is far more aggressive and defensive. What's up with that? It's just software, bits and config files. It's not like someone called your mom a harlot.
Here's what I think. New developers, for a long time, have come into the industry and become overwhelmed with everything they need to learn. Let's take typical database servers. Writing a SELECT is easy enough, but to truly be an expert you have to learn about data writing operations, indexing, execution plans, triggers, replication, sharding, GRANTs, etc. As it's a mature technology, you start out barely an apprentice, with all these experienced professionals around you.
In recent years, software development has really been turned on its head. We're not building apps using the same stack we've used for a long time: OO + RDBMS + view layer + physical hardware. The younger the technology, the better, it seems. In theory, a 3 year developer and a 20 year developer are now pretty equal when we're talking about a technology that's been around 2-3 years. That wouldn't be true if we were dealing with say, OO design patterns. (Even when new languages come along, you still get to keep your experience in the core competencies.)
Attacks on these new technologies are perceived as an assault on this new world order, and those who have walked into being one of the "newb elite" respond emotionally to what they see as a battle for the return to the old guard. Am I totally off base here?
Mongodb was very aggressively marketed; its advocates produced benchmarks comparing it directly to traditional relational databases as though the use cases were the same. I think that set the tone for future discussion in a way that's still being felt.
If you're as old as your opinions suggest you'll remember the early days of Java were very similar - Sun marketing pushed it no end, and so tempers ran high and discussions were emotionally charged in a way that never happened when talking about perl or python or TCL.
From the beginning I was a consumer of RDBMSes. Started with Access and moved on to SQL Server. There wasn't a need to know the full DB, only the pieces you needed for CRUD. Perhaps for newbs that has changed, and they have to learn the full SQL administrative experience. Personally I doubt that. Do some db migrations in Rails: you don't even need to know what SQL engine you're running on. (A good thing, IMO, but still means a lesser body of knowledge)
Good point that a lot of products try so hard to be the "new sexy" that they suggest an inaccurate comparison, or at best, implement a subset of what they're trying to replace.
This is a case where, although the ultimate complaint of the author is the behavior of the product (which is documented, but un-intuitive in nature unless you've read up on the issue), it's the way in which he chose to frame the problem that is getting people upset.
This is a known issue, even if it seems like a completely poor design decision. The issue I think most people here are taking is that because the author did almost no research on the topic, he got himself into a problem, and is trying to blame it on Mongo.
Telling somebody they are wrong is one thing, calling them moronic or stupid is quite another.
I think this is an evolution of the language wars wherein immature developers align themselves with a technology and mix up criticisms of the technology with criticisms of themselves. This seems to be part of the need humans have to be part of a community.
1. Immature in this context has nothing to do with age. Rather, it is an attitude that shows when any developer has not experienced and internalized enough technology to realize every single technology has fundamental problems, sucks in some way, yet is still usually pretty amazing nonetheless, especially within the context of its creation.
Hopefully the 20 year dev can recognize the new thing as new and possibly immature, can identify some areas of weakness when compared to tools with a successful history.
> Attacks on these new technologies are perceived as an assault on this new world order, and those who have walked into being one of the "newb elite" respond emotionally to what they see as a battle for the return to the old guard. Am I totally off base here?
2) Feel they had wool pulled over their eyes unexpectedly.
Let's talk about the wool. MongoDB was marketed initially with stupid little benchmarks (that were later removed as a policy). Those benchmarks were what people saw and showed their bosses and colleagues before deciding: "this is the one". Yes, they picked a bad tool and should have RTFM'd, I would normally say, but not for MongoDB.
They marketed themselves as a "database" while at the same time shipping with durability turned off. Yes, you can write very fast if you don't acknowledge that data has hit the disk buffers. I wasn't fooled, I saw the throughput rates and thought, something is fishy. But a lot didn't.
Most of all, I would have no problem with this design decision if there were a bright red flashing warning on the front page saying what the default settings are and what they could do to your data. There wasn't one.
As developers (programmers, whatever you want to call us), we feel that when other developers market things aimed at us, they should be somewhat more honest than, say, someone selling rejuvenating magnetic bracelets on TV at 4am. I think that is where the passionate discussion comes from.
Aside from that, though, the 32 bit limitation is clear in the documentation and present on the download page. It's fine not to read the documentation before you use something but you can't then complain that it did something you did not expect. Mongodb is a little different from other databases. So is Redis. You can't blow everything off that is conceptually different.
There are plenty of valid arguments for not using MongoDB, but this is the weakest I have seen so far.
If you're talking about Ubuntu, I can attest that the default PM there is several versions out of date for a lot of things, and thus to get the version you'd expect, you're forced to install by hand.
Also, even using the PM version, didn't you get a warning when you started the server? I thought Mongo threw up a warning at start time about this exact issue (the 2GB limitation, not the silent failures)
What does the author's complaint have to do with the version Ubuntu is distributing? Are the 32-bit limitations present in Ubuntu's version not present in the most recent version? If they are, then who cares which of them he installed?
> they can't be more explicit about it than pointing it out on the download page and giving a message upon the database startup
Uhh, yeah they can. On Debian-derived systems like Ubuntu you can make your .deb packages throw up dialogs that the user has to read and agree to before installation via debconf (http://www.fifi.org/cgi-bin/man2html/usr/share/man/man8/debc...). There's probably a way to do the same thing in RPM-based systems as well. If the warning is something that every user of the software needs to see, putting up a warning dialog and requiring the user to confirm having seen it before installation starts would probably be appropriate.
They could also write an error to the database's error log whenever data is discarded due to the 32-bit limitation. Someone mentioned above that it puts a message at the start of the log, but if that's the case IMO it's insufficient; most of the time people interact with logs by looking in them for a particular moment in time, not by reading them from the first line on. Logging the error on or near the moment the data loss happens would make the issue visible to people using logs in this manner.
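A sketch of the difference (the `ToyStore` here is hypothetical; only the `logging` module is real): a startup-only notice scrolls away, while an error record written at the moment the data is dropped is what shows up when someone greps the log around the failure's timestamp.

```python
import logging

log = logging.getLogger("toydb")
logging.basicConfig(level=logging.INFO)

class ToyStore:
    LIMIT = 2  # stand-in for the 32-bit storage ceiling

    def __init__(self):
        self.docs = []
        # Startup-only notice: easy to miss if you read logs around an incident.
        log.info("started with %d-document limit", self.LIMIT)

    def insert(self, doc):
        if len(self.docs) >= self.LIMIT:
            # Event-time logging: the discard is visible at the moment it happens.
            log.error("write discarded, %d-document limit exceeded: %r",
                      self.LIMIT, doc)
            return False
        self.docs.append(doc)
        return True

store = ToyStore()
results = [store.insert({"n": n}) for n in range(3)]
print(results)  # [True, True, False]
```

With the event-time `log.error`, the timestamp of the log line lines up with the moment the application started misbehaving, which is exactly how people usually search logs.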
Right, but the only person with the ability to do that is the Ubuntu maintainer of the package. Mongodb has no control over what they do and should not be held responsible for their actions.
Nope. In fact Fedora's PackageKit developers said the idea was broken, and it caused a controversy over supporting it for ages.
I'm sure this only bit the author because he was using MongoDB for a toy project, and in a real system he'd have done due diligence first.
I'm not a fan of MongoDB myself, but if I were to use it I know that I must read about every option available because by default MongoDB's team chose settings that are suited for speed and not reliability, durability, or (if i'm being less charitable) even sanity.
I've noticed a trend across about 20+ candidates, all of whom are smart people: people are using Mongo without actually understanding what the hell it's trying to solve by getting away from the RDBMS paradigm.
I'm not sure if this is because 10gen markets it as a general purpose tool, but I have yet to talk with a candidate who can actually describe why they were using the DB vs. a SQL database. I'm all for learning new things, but I can't help but wonder if the string of negative MongoDB posts is coming from people who pick it b/c it's new, then realise pretty far in that this is nothing like a normal DB, and "having no schema" isn't really a reason to go with a tool as foundational as a data store.
I think Mongo is great for really specific problems that its designed to solve. It's probably pretty bad for a general purpose tool, but I'd be surprised if anyone serious actually considers it one.
My observation has been that a substantial number of people pick NoSQL stores because they don't really understand RDBMSs, and can't be bothered to learn.
I don't mean this as a dig at NoSQL in general - there's perfectly valid reasons to want some NoSQL features - but the hype train does attract a lot of people who just want the new hotness.
I have talked to more than one 10gen marketing bro who insisted that MongoDB is appropriate for any and all use cases, transient to archival. It's pretty disingenuous if you ask me.
We did not have this experience when I worked at a large datamining company. It was a nightmare.
Mongo markets ease of sharding as an advantage, and if that is not the case at times, it limits the attractiveness of losing RDBMS features.
It is just a setting, you know that, right?
There is a discontinuity between the ease-of-use story and the blame-the-user story, regardless of how well documented the async insert behavior is.
And it doesn't have to be this way. There are ways of designing interfaces, APIs, and even naming that go a long way to prevent your users from shooting themselves in the foot.
Take postgres. It also supports at least a couple of kinds of async insert, one of which is part of libpq (the postgres C client library). It's called "PQsendQuery" and it's documented under the "Asynchronous Command Processing" section. It's hard to imagine a user trying to use that and expecting it to return an error code or exception. Even if the user doesn't read the docs, or reads some fragment from a blog post, they will still see that the name suggests async and that it returns an int rather than a PGresult (which means it obviously doesn't fit into the normal sync pattern).
There is no reason mongo couldn't be clear about this distinction -- say, rename "insert" to "async_insert" and have "insert" be a wrapper around async_insert and getLastError. But instead, it's the user's fault because they didn't read the docs.
Careful API design is important to reduce the frequency of these kinds of errors. In postgres, it's relatively hard to shoot yourself in the foot this badly in such a simple case. I'm sure there are gotchas, but there is a conscious effort to prevent surprises of this sort.
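The renaming proposal is easy to sketch. This is toy code, not the real driver: the `ToyClient` class and its "reject documents containing 'bad'" failure mode are invented purely to show the shape of the API. The explicit-check path becomes the primitive, and the familiar name becomes the safe wrapper around it.

```python
class ToyClient:
    def __init__(self):
        self._last_error = None
        self.stored = []

    def async_insert(self, doc):
        # Fire-and-forget primitive: never raises, never confirms.
        if "bad" in doc:                  # invented failure condition
            self._last_error = "document rejected"
            return
        self.stored.append(doc)
        self._last_error = None

    def get_last_error(self):
        return self._last_error

    def insert(self, doc):
        # The familiar name does the safe thing: write, then check.
        self.async_insert(doc)
        err = self.get_last_error()
        if err is not None:
            raise IOError(err)

client = ToyClient()
client.async_insert({"bad": 1})   # silently dropped, by explicit request
try:
    client.insert({"bad": 1})     # same failure, but now surfaced
except IOError as exc:
    print(exc)                    # document rejected
```

With this naming, a user who reaches for `insert` gets errors surfaced by default, and anyone who types `async_insert` has, by the name alone, opted into the fire-and-forget semantics.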
Because if you don't read enough of the docs to understand that 'insert' is asynchronous insert, you don't understand MongoDB and haven't done your research.
Why should 'insert' default to synchronous? Why shouldn't we instead have a sync_insert function instead? The only reason is that you're assuming familiarity for people coming from SQL/synchronous-oriented DBMS, but why should they be forced into an awkward design just because it's what people are familiar with from other DBMS?
Expecting the user to be an expert in your product from the start is simply not realistic; a well-designed system facilitates use by people of varying levels of expertise.
Not if you're choosing a system that's explicitly marked for performance over safety.
> Expecting the user to be an expert in your product from the start
The 'product' in this case is a non-relational database, not an iGadget. The user can and should be expected to be familiar with the main strengths and weaknesses of the database as a whole.
There is no way you can convince me that someone who has done a reasonable level of due-diligence in investigating MongoDB can be surprised when it behaves asynchronously.
I think you're right though: MongoDB should not be used without _lots_ of research into its limitations.
That's true about any database, not just MongoDB; nothing new here.
> then you're very much at odds with (my perception of) the 10gen marketing message.
10Gen is fairly straightforward about the original issue, having blogged openly several times about their decisions - but at the end of the day, any engineer should do research beyond the simple marketer's pitch.
I won't doubt that there are people who make snap judgements about fundamental architecture based on marketing pitches, but that's very unfortunate, and the marketers really can't be blamed, especially when they make no effort to conceal the truth or deceive you!
That's exactly the point where we started. A well-designed system fails "safe"; it should obey the principle of least surprise. Specifically: MongoDB should default to synchronous writes to disk on every commit; official drivers should default to acknowledging every network call; MongoDB shouldn't allow remote access from the network by default. Once you want higher performance or remote access, you can read about the configuration options to change and learn on-the-fly, evaluating the trade-offs as needed.
Other systems are safe by default (e.g. PostgreSQL), and their out-of-the box performance and setup complexity suffers because of it. MongoDB could ship "safe" (with the same trade-offs), but chooses not to. That sort of marketing-led decision-making has no place in my technology stack.
'Surprise' is relative to the current environment and paradigm (in this case, asynchronicity). If you find that surprising, it means you should have read the basic documentation properly.
> MongoDB could ship "safe" (with the same trade-offs), but chooses not to.
Because that's one of the main points of choosing MongoDB...
I admit that the default unsafe tuning of MongoDB becomes quite obvious when you read more of the manual, but I can hardly say 10gen is without blame for causing this confusion.
I hope you continue to explain these caveats to everyone considering MongoDB. I hope you recognize that not everyone is an expert in these limitations, and that you clearly explain to those that might not know it that MongoDB's "2GB limit" really means "data loss"; as does 'asynchronous'. Then you'll see fewer blog posts from people that didn't see through the marketing speak and were bitten by the defaults.
Right now, I think all these blog posts describing MongoDB losing data or performing poorly are getting upvoted because people are learning of these limitations for the first time.
It's because it's a reasonable assumption to make. Data loss shouldn't be a surprise, if I need speed and am willing to risk dataloss I should have the option, but should explicitly choose to use it.
You did, by choosing to use MongoDB.
(And if you chose MongoDB without being aware of that implication, you didn't choose MongoDB for the right reasons or didn't do your due diligence, because you cannot understand MongoDB's use case and tradeoffs if you were unaware of this.)
1) it does not behave exactly like SQL
2) the user didn't read any more than a Quickstart Guide
3) the user fundamentally misunderstands the aim of the new technology or the application it is intended for
Ember.js suffers from the same ignorance.
What makes it worse is all the morons who upvote without even reading the detail purely because the title reinforces some misconceived bias they already have.
'NoSQL' is part of the problem. This technology has absolutely no comparison with SQL other than it persists data.
Except that apparently under certain circumstances it doesn't persist data, which was the author's point.
Personally I wouldn't be upset about a limitation like the one described as much as I would be upset about the database not logging an error when it discards the data. Logs are a primary way you figure out what's wrong when your application isn't behaving as expected. If you open the logs and see a bunch of "32-bit capabilities exceeded, please buy a real computer" messages, you learn what the problem is. If the database error logs are empty, that implies that everything is working fine, when in this case it clearly isn't.
Almost all of the complaints against MongoDB are down to assumptions and lack of understanding about the database.
You call people "morons", yet it appears that you did not read the article yourself.
Whether SQL or not, scalable or not, old or new, or whatever... Is completely immaterial here.
When a database silently stops accepting data, and apparently has done so for 3 years, you have to at least admit that there are strange design goals at play.
Now, the entire claim of the article might be incorrect. Did you verify that yourself?
And anyone who has read more than an introduction to mongo knows that you SHOULD use getLastError to be safe. If you do that, no data will be dropped.
[root@li321-238 tmp]# dd if=/dev/zero of=./filesystem bs=1M count=128
[root@li321-238 tmp]# mkfs.ext2 filesystem
[root@li321-238 tmp]# mount -o loop filesystem myfilesystem
[root@li321-238 tmp]# dd if=/dev/zero of=myfilesystem/im_too_large bs=1M count=129
dd: writing `myfilesystem/im_too_large': No space left on device
A program that continues along without issue, only changing its behavior in some unannounced (documented or not) way, is not "limited". It's free as a bird.
With a getLastError model, you can do your work, then go check for errors when you're really ready.
I'm not saying it's a great API, but it does make sense in context. No idea why the tutorial the OP followed didn't talk about the differences, or why async is hard.
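Under the same toy-store assumptions as above (the `BatchStore` class is invented for illustration), the deferred-check pattern looks like this: issue a burst of writes without waiting, then pay the error-checking cost once at the end.

```python
class BatchStore:
    def __init__(self, capacity=100):
        self.capacity = capacity
        self.docs = []
        self._last_error = None

    def insert(self, doc):
        # Returns immediately, no acknowledgement per write.
        if len(self.docs) >= self.capacity:
            self._last_error = "capacity exceeded"
            return
        self.docs.append(doc)

    def get_last_error(self):
        return self._last_error

store = BatchStore(capacity=100)
for n in range(150):
    store.insert({"n": n})        # no per-write round trip

# One explicit check after the burst catches that something went wrong...
print(store.get_last_error())     # capacity exceeded
print(len(store.docs))            # 100
```

The trade-off is visible in the sketch: the deferred check tells you that some writes failed, but not which ones; per-write checks give you that precision at the cost of a round trip each time.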
This brings me back to the recent discussion about reading other people's code: it is almost certainly smarter to extend an existing database until it's capable of meeting your needs, rather than write one from scratch.
The fact that many programmers don't see it that way is a testament to their irrational fear of diving into other people's code.
People need to stop acting like PostgreSQL is some holy grail database. It isn't.
And making a solid, featureful, and performant database is vastly harder.
I'm hardly an inexperienced programmer. I've used Cassandra, SimpleDB, Voldemort, etc. I wrote part of the Inktomi Search Engine in the 90s, and plenty of (what today would be called) NoSQL stores over the years.
A default that's so counterintuitive for a database should be featured prominently with a huge neon sign. It wasn't in the Ruby tutorial, or in any of the many documents I read. It's buried deep in the Mongo website, and the first Google match about the 32-bit limitation is a blog post from 2009.
Sometimes you just have to admit you screwed up and didn't read the documentation. Everyone does it, we're hackers, we'd much rather play with technology than read docs.
That 2009 post is the canonical post about the issue, which is why it has such page rank. Its position is a consequence of the fact that it's linked to from all over the web, not because nobody has discussed it since.
I kinda like the TCP vs. UDP analogy. Sometimes you care more about speed than precision. A few dropped items in a log? Not a big deal. I'd rather have that than be forced to use a more expensive machine for the job.
That said, I absolutely think the default should be the TCP way.
Look, I agree that in most cases you probably want to do everything you can to make your data 100% complete. But failed writes should be really rare, and there are plenty of times I'd trade the rare missing write for cheaper/faster database servers.
However, it starts to feel like being anti-MongoDB is just considered cool today. When I see someone who has worked with MongoDB for a year, upgraded to 2.2, knows it inside out and still hates it, I'll listen, and start to worry. But until then, I'm going to keep using it, and saving time.
People who would rather not bother, can stick with their tools, work slower, and be happy.
In all seriousness, I built a 10 machine Mongo cluster, talked with a 10gen consultant a full day, went to Mongo meetup, and ran all sorts of benchmarks before ever using it in production. I still don't feel like I have the expertise to write a snarky blog post about it.
Not really following the snark there. Are you trying to compare MongoDB to MySQL's MyISAM storage engine? Like there aren't numerous other extremely valid RDBMS solutions out there, which don't do table locks during a write? (MySQL InnoDB, Percona, Maria, Aria, Postgresql, Firebird, etc...)
To give an analogy, it'd be as if someone read this post and decided to use a SQL database solely because they care about write-durability... and then complained when they "suddenly" encountered an error trying to include an extra field in an INSERT on the fly. ('You mean SQL has fixed schemas??')
If they just wrote asynch on that page somewhere most experienced programmers would immediately understand the implications. And also understand how the amazing performance was being achieved.
The MongoDB "way" is that clients know the importance of their data and can choose write strategies which make the proper trade-off between insertion throughput/latency and durability/consistency.
So, assuming you are writing an ecommerce application, here's where I think these flags come in.
- Session data: fsync = true. Wait for a response, and ensure it's written to disk
- Internal web analytics: safe = false. Who cares if it's written, I've got an application to serve!
- Orders: fsync = true. I know, RDBMS, transactions, blah blah blah.
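Assuming a driver that accepts per-write flags (the `safe` and `fsync` names follow the comment above; exact spellings vary by driver, so treat this as a hypothetical sketch rather than a real API), the strategy above is just a lookup table consulted at write time:

```python
# Hypothetical per-collection write policies for the ecommerce example above.
WRITE_POLICY = {
    "sessions":  {"safe": True,  "fsync": True},   # wait, and ensure it's on disk
    "analytics": {"safe": False, "fsync": False},  # fire-and-forget is fine
    "orders":    {"safe": True,  "fsync": True},   # durability matters most
}

def insert(collection, doc):
    # Default to the cautious setting for collections we haven't classified.
    policy = WRITE_POLICY.get(collection, {"safe": True, "fsync": False})
    # A real driver call would look something like (pseudocode):
    #   db[collection].insert(doc, safe=policy["safe"], fsync=policy["fsync"])
    return policy

print(insert("analytics", {"page": "/"})["safe"])   # False
print(insert("orders", {"total": 9.99})["fsync"])   # True
```

Centralizing the policy like this also means the trade-off per data type is written down in one place, instead of being scattered across call sites.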
People tend to look at NoSQL and wonder why it doesn't function like MySQL, then they loudly complain how bad the software is. Nobody is writing articles about how Memcached doesn't function like MySQL.
Yes, I realize that there's a "safe=True" option to my python driver. But I'm writing to a database. As others have said here and elsewhere, the default behavior of a database and its drivers should be to complain loudly when a write fails. It is ridiculous that safe!=True by default. If I want to turn off this feature to improve performance, I will.
Yes. Without question.
Is this his own fault for not reading the documentation and understanding that he should have opted for the 64-bit version outright?
Exception-throwing database drivers are a relatively new thing, not an old thing. The only thing MongoDB does differently is that writes are fire-and-forget: the database hasn't returned a response of any kind by the time the function returns.
In native code you can forget about using exceptions in a database driver because exception handling can be exceptionally broken on some platforms. SmartOS I am looking in your direction.
No excuse for not reading the docs, though.
There seem to be a number of people commenting, telling you to read the documentation, but I'm with you: that is completely counter-intuitive behaviour and should be viewed as a bug.
This reminds me of the attitude that I had to correct in developers that worked for me:
- There is a huge difference between "it works" and "it does what the user expects in a friendly way."
Steve Jobs said that if you need to read a user manual (particularly to do the most vanilla usage of a product), the problem is the product. Not you.
He's talking about consumer products, not databases that were intended for use by technology experts. There's a big difference there.
The onus is on you to understand the limitations of software before you start using it. You complain that the 32-bit warning doesn't show up in the package manager, but you still should have read the documentation before committing to a new technology. It's that simple.
Is it a flaw that mongo doesn't work well on 32 bit systems? Maybe. Probably.
Is it a flaw that you didn't do the requisite research before committing to a database and subsequently complaining about it? Definitely.
If you were working for me as a developer and had the attitude that you shouldn't have to _thoroughly_ read the manual and notes for something like MongoDB, I'd let you go. Steve Jobs was not a programmer.
Heck, I learned about error handling in Mongo in the first hour of learning it. Same for the 2GB limitation of the 32-bit version. The Mongo manual is very well done and also happens to be fully indexed by Google.
You are using a quote about UX/UI to make a point about an API/dev tool. I don't think they are, or should be, related.
Also, you expect it to work in a certain way. That's where you're going wrong.
Beyond that, I'm not sure why anyone would run a production system on a 32-bit system anymore. Sure, the failing-silently part sucks, but really this seems much more like a poor deployment than an actual bug in MongoDB being the root cause.
Another problem with Mongo I never heard anyone else raise is that there are no namespaces. If I install Mongo, all the tables/collections live in the same namespace. What if I want to use it for multiple projects? How do other people solve this problem?
These are really not points to be discovered in chapter whatever of the docs.
- Download, Brief 3rd party tutorial, Production, Break, Complain, RTFM / Complain
- RTFM, Smile, Download | Move On, Staging, Production
Seems most of the issues from this article came from a lack of reading and investigating.
In general it feels like Couch actually takes storing data seriously. Append-only and whatnot. It's slower and a little bulkier than Mongo, but it does the important things right (1.0 bugs notwithstanding.)
I'd love a follow-up blog post on your experience with Couch.
Another thing I didn't realize: because of the memory-mapped storage (which I guess is fine performance-wise), it's hard to estimate memory usage on a machine. From what I understand, there is no way to limit the memory usage, which means the only way to bound it is to keep the size of the database below your available memory. Quite important to know, IMHO.
Isn't http://www.mongodb.org/downloads an obvious place?
The problem is that MongoDB didn't complain when he was inserting data above the limit. A data store doesn't complain when it runs out of space? That should be mentioned as the biggest problem with the 32-bit version.
I understand having another node or two for failover, but I reckon that with the spec of the largest offerings from AWS or Linode, most people will never need to worry about this and can manage everything on one Postgres or MySQL db. Why complicate things before you have to?