Scaling MongoDB at Mailbox (dropbox.com)
99 points by ashishgandhi 1178 days ago | 68 comments

>one performance issue that impacted us was MongoDB’s database-level write lock.

People give MSSQL shit for having row-level locks (if you don't use their MVCC option), yet how is it that Mongo runs with a database-wide lock and people don't immediately laugh and walk away? Is the hype so powerful that people just shrug about a huge mutex?

The problems people encounter with SQL performance generally (or at least also) are on the read side where query optimization can become a pretty hard problem. Basically, the short answer is people are using SQL and noSQL very differently (as one would expect).

cue: someone telling me SQL query optimization is easy and people are just idiots.

It's a valid point taken by itself. But I would respectfully suggest that it's not a particularly useful statement, because you're not comparing SQL to anything. "Complex SQL queries are hard" because "complex queries are hard". If there was a magical tool where you could do complex querying and always have the right answer in the minimal time, then these discussions would never arise. (To a debatable extent, that is one of the things people are paying for when they choose a "better" database: the hope that they get closer to the ideal on complex queries)

Personally, I would much rather perform a complex query in SQL than NoSQL. The base-line is likely to be not-terrible, which may be good enough for single-use queries. Optimization is much easier (IMHO) - SQL is a DSL for querying and indexing. Compared to having to write code to do that (which seems to be the primary NoSQL approach), the DSL approach seems much more efficient. SQL isn't perfect, but it beats no-DSL-support or another DSL that is even worse than SQL :-)

The caveat being that in a document-oriented store (like MongoDB) the queries tend to be simple(r) - after all, you have no joins, so how complex can they be? Instead of complex queries you need to use appropriate document schemes and/or application logic.

MongoDB has its problems (for instance document schemes can't be enforced, disk space consumption concept is unintuitive,...) and relational DBs have their strengths (ACID, SQL is standard among DBs,...). But very few people ever reach a point where the global write lock becomes a problem. 10gen knows exactly which features are vital to most of their users and which are not...

But you still need to join in NoSQL, for any query that isn't along the natural hierarchy of the data.
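To make the point about joins concrete, here is a minimal sketch using Python's built-in `sqlite3` and a plain list of dicts standing in for a document collection. The `posts`/`comments` schema and all names are made up for illustration. Asking for "all comments by one author" cuts across the post-to-comments hierarchy, so the document version has to walk every post in application code, while the SQL version is a single query that an index can serve.

```python
import sqlite3

# Hypothetical document-style data: comments are nested under their post.
posts = [
    {"_id": 1, "title": "Scaling", "comments": [
        {"author": "ann", "text": "nice"},
        {"author": "bob", "text": "hmm"},
    ]},
    {"_id": 2, "title": "Locks", "comments": [
        {"author": "ann", "text": "ouch"},
    ]},
]


def comments_by_author_docs(docs, author):
    """Query against the hierarchy: scan every post and filter in app code."""
    return sorted(
        (doc["_id"], c["text"])
        for doc in docs
        for c in doc["comments"]
        if c["author"] == author
    )


def comments_by_author_sql(docs, author):
    """Same question in SQL: flat tables plus an index on the lookup column."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT)")
    conn.execute("CREATE TABLE comments (post_id INTEGER, author TEXT, text TEXT)")
    conn.execute("CREATE INDEX comments_author ON comments (author)")
    for doc in docs:
        conn.execute("INSERT INTO posts VALUES (?, ?)", (doc["_id"], doc["title"]))
        conn.executemany(
            "INSERT INTO comments VALUES (?, ?, ?)",
            [(doc["_id"], c["author"], c["text"]) for c in doc["comments"]],
        )
    rows = conn.execute(
        "SELECT p.id, c.text FROM comments c JOIN posts p ON p.id = c.post_id "
        "WHERE c.author = ? ORDER BY p.id, c.text",
        (author,),
    ).fetchall()
    conn.close()
    return rows
```

Both functions return the same `(post id, comment text)` pairs; the difference is who does the work of crossing the hierarchy.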

The relational model takes this observation and concludes that there should be no "natural hierarchy" in the logical model (the physical model is a separate question.) It's a _theoretically_ beautiful idea. The counter-intuitive _real-world_ result is that the theoretical approach also yielded faster systems than alternative philosophies. I think that's the reason why the relational model has dominated for the past 30+ years.

You might well argue that that's because a _good_ NoSQL implementation has yet to be created (the no true Scotsman argument). That may well prove to be true one day, but I would bet that a good non-relational implementation will be a derivative from the relational world (Postgres with hstore, or Google's F1 with Protobufs); not from a project whose starting axiom is to get rid of SQL/relational.

Yes, MongoDB really is that bad. Never use it.

Use RethinkDB instead. It's not perfect, but it mostly hits the same niche and does it right.

Or anything else, for that matter. Postgres, MySQL, even SQLite beats Mongo for any use-case. Cassandra, HBase if you need a distributed DB. RethinkDB when it gets mature.

MongoDB is in its own class of terribleness.

RethinkDB looks promising, but it's a bit early.

MySQL has a few engines that still have a database-level lock (MyISAM, MEMORY), and those generally perform better for unbalanced loads (i.e. very read-heavy or very write-heavy).

Do you have any source for this? While logic suggests that they should (but perhaps only very slightly), I think InnoDB has had a lot more optimization work than any other engine.

A benchmark from Oracle shows InnoDB as being much faster on a read-only workload.


EDIT: By the way the locks are table level in MyISAM and Memory.

We moved to TokuMX (http://www.tokutek.com/products/tokumx-for-mongodb/) recently and saw a 2x improvement in response times. They've swapped out mongo's backend with their fractal tree storage engine which is MVCC. Definitely worth checking out.

This is something I've considered pushing for at my company. Any pitfalls you encountered? Did it reduce db size on disk as much as their marketing claims?

My tests with it showed a drop in disk size of about 20-30%.

Unfortunately the read/write improvements weren't high enough to justify the massive engineering feat it would take to replace MongoDB with TokuMX, mostly due to the lack of commercial support for Toku.

Our disk size shrank to 10% of stock mongodb size.

The migration is easy now that they've released a tool to replicate from a stock mongo to TokuMX. They also do have commercial support that we've paid for and their team has been incredibly helpful and responsive.

Perhaps once my commercial agreement with 10gen expires in a couple of months I'll take another look.

we migrated recently (last week) using the new mongo2toku bridge. it was relatively painless.

I think that's how it was in previous versions of Mongo but it was fixed in later versions. I'm probably wrong though.

It used to be a mongod-level lock, it's now a logical database-level lock. It used to be totally laughable, now it's merely amusing.
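As a toy model of what lock granularity means here (this is not MongoDB's actual implementation, and `LockManager` and `writers_conflict` are hypothetical names), the sketch below maps a `(database, collection)` pair to the lock a writer would have to take under each scheme. Two writers serialize exactly when they map to the same lock.

```python
import threading
from collections import defaultdict


class LockManager:
    """Toy write-lock table; `granularity` picks what a writer must lock."""

    def __init__(self, granularity):
        if granularity not in ("server", "database", "collection"):
            raise ValueError(granularity)
        self.granularity = granularity
        self._locks = defaultdict(threading.Lock)
        self._guard = threading.Lock()  # protects the lock table itself

    def _key(self, database, collection):
        if self.granularity == "server":
            return ()  # one lock for the whole process
        if self.granularity == "database":
            return (database,)  # one lock per database
        return (database, collection)  # one lock per collection

    def lock_for(self, database, collection):
        with self._guard:
            return self._locks[self._key(database, collection)]


def writers_conflict(manager, a, b):
    """True if writes to namespaces a and b serialize on the same lock."""
    return manager.lock_for(*a) is manager.lock_for(*b)
```

With `"server"` granularity every pair of writers conflicts; with `"database"` granularity only writers in the same database do, which is why splitting data across databases (or clusters) relieves the contention.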

> People give MSSQL shit for having row-level locks (if you don't use their MVCC option)

Just to clarify, are you talking about MSSQL's tendency to escalate locks (e.g. from row to page to extent to table)? They do that to make locks more manageable, and while in most cases it improves performance, in some high-concurrency situations it can be a gigantic PITA (especially given that the ways to strong-arm it into not escalating are not the most helpful).

I don't understand how MongoDB can stay a shiny tool given its long list of shortcomings. It is so elastic that I think it really has no true form (object cache, mq, relational??, key-value bucket). Lately I haven't read any news about a company migrating to Mongo; most were either departing from Mongo because of some catastrophic outage or running some mission-impossible operation to handle the shortcomings, like this one. In its current state it looks more like a prototyping tool that allows you to delay some tech decisions until your product matures, rather than a production tool. Or it is just me that wants at least 5 major versions on his production database.

Marketing. They marketed the hell out of it. Most developers I know have a MongoDB mug.

> Lately I haven't read any news about a company migrating to Mongo, but rather most were either departing from mongo

They had some strange defaults to start with (to give them serious advantage in small silly benchmarks) like un-acknowledged writes. Yes you read that correctly, for years their default configuration was to throw write requests over the fence and assume they succeeded.
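The difference between the two modes can be shown with a small simulation. Nothing here talks to a real database or uses a real driver API: `FlakyServer`, `fire_and_forget` and `acknowledged` are hypothetical names, and the failure rate is an arbitrary assumption chosen to make the effect visible.

```python
import random


class FlakyServer:
    """Hypothetical stand-in for a database that drops some writes."""

    def __init__(self, failure_rate, seed=0):
        self._rng = random.Random(seed)
        self._failure_rate = failure_rate
        self.stored = []

    def write(self, doc):
        """Return True if the write was applied, False if it was dropped."""
        if self._rng.random() < self._failure_rate:
            return False
        self.stored.append(doc)
        return True


def fire_and_forget(server, docs):
    """Unacknowledged writes: send each doc and never look at the result."""
    for doc in docs:
        server.write(doc)
    return len(docs)  # what the client *believes* it wrote


def acknowledged(server, docs, max_attempts=10):
    """Acknowledged writes: check each result and retry before giving up."""
    for doc in docs:
        for _ in range(max_attempts):
            if server.write(doc):
                break
        else:
            raise RuntimeError(f"write failed after {max_attempts} attempts: {doc!r}")
    return len(docs)
```

The fire-and-forget client reports every write as done even though `server.stored` is missing documents; the acknowledging client pays for extra round trips but either stores everything or raises.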

There are some horror stories of people's databases becoming silently corrupted. Those corrupted databases were then backed up, so the backups were corrupted too.

Eventually they fixed their defaults, but their reputation as an engineering company went downhill (I speak mostly for myself here).

From my experience a lot of people using MongoDB don't know what they are using. To them 'unacknowledged writes' and 'database level locks' sounds like Latin. They see short examples, cool mugs, other cool kids talking about how easy MongoDB is and they start using it.

or - mongo is great for some problems and not so great for other problems. if there is one thing you can figure out hanging out on HN long enough - it's cool to hate on mongodb

> it's cool to hate on mongodb

Tens of thousands of computer geeks all randomly chose to hate a product. Clearly a coincidence or unlucky alignment of stars, nothing to do with said product, of course.

It could just be the case that everyone complains about their database and the HN crowd uses a lot of MongoDB. There's certainly more total MySQL/MSSQL griping; whether the ratio of gripers to users is higher for MongoDB isn't really obvious.

As I pointed out, MongoDB Inc (ex 10gen) made some decisions about how they marketed and set up their product that made many turn their heads.

I want to like Mongo, I'm all about "cutting my teeth on the bleeding edge", but I don't know what to think about MongoDB. Most conversations about it seem to go like this:

Lover: MongoDB is so much faster and easier than old school SQL!

Hater: It's fast and easy because it doesn't check if it's written data correctly, which leads to corrupt data.

Lover: Well you could always turn on the write lock.

Hater: If you turn on the write lock, it becomes slower than normal SQL databases.

Lover: Well you're just trying to use it the same way you use SQL. MongoDB is great for some problems and not so great for other problems.

Hater: What problems is MongoDB better at solving?

And then that's the end. I've never heard a convincing use case for this thing. If you can change that for me and give me one, I'd be delighted to listen.

Same reason MySQL sticks around when there are alternatives that make it harder to shoot yourself in the foot: Easy to get started, works well enough to make an MVP.

I think the premise of a prototyping tool is pretty accurate. It makes it easy to get something up and running and defer any operational knowledge until later. That being said, I don't see why people can't iterate their apps quickly on time-proven relational datastores. While having JavaScript through the stack is nice, I found writing Mongo queries a PITA with all the curly braces, etc...

Simple. There are some use cases for which MongoDB will be 5x faster than any other system. It is relatively unique among databases in that it is a document store as opposed to a relational system.

It is also very easy to use and manage.

> There are some use cases for which MongoDB will be 5x faster than any other system.

This is a really bold claim that I very much doubt is true. For one, there are numerous other document stores that target the same sort of use cases as MongoDB. Second, there are benchmarks floating around that PostgreSQL used as a key value store is faster than MongoDB as is. I wouldn't be surprised if you saw similar things with other SQL databases.

Only with unacknowledged writes. If you ask for acks, it gets ridiculously slow. Much, much slower in every single way than Postgres with a fully indexed HSTORE. That's also fully ACID.

Of course Postgres is also relational. Have you tried using a document oriented store like MongoDB? I have. For some problems it just plain rocks.

That doesn't mean I would want to store financial transactions for a bank in a MongoDB, but it has its place and that place is clearly huge (judging by the number of people using it).

I love it because it is:
- simple to use
- fast enough (with safe writes, thank you!)
- simple to use
- simple to use

Did I mention it was simple to use, which allows me to focus on app instead of wrestling with DB?

Postgres' HSTORE is as document oriented as Mongo.

I have used Mongo plenty and I really don't want to do it again. With its almost-safe writes it's extremely slow and no simpler to use than Postgres. Also much less flexible, basically a subset of Postgres' data model.

> It is also very easy to use and manage.

This is something that SQL database fans don't get. A small start-up can't afford to have a team of guys fiddling with incredibly obscure performance tuning settings.

A developer might not know how to optimize a slow-running Postgre query, but he knows how to cache data and denormalize in a manner that makes the query he wants to run fast. And a document-store plays nice with this.

Any simple relational beauty of SQL databases falls down once you get over a few million rows, and then the hideous hacks and bizarre settings start coming out. At that point you need a pile of domain-specific knowledge to make them play nice.

Mongo takes the domain-specific knowledge out of the equation and lets the developers apply their existing skill-set to the data.

If you've got a thousand employees? Get a few DBAs and Postgre.

If you've got, like, five?

A small startup that invests in some ops talent is going to run rings around a small startup that optimizes for developers.

People think that staffing up devs creates agility and speed, when in reality it just increases product features.

Scope creep without someone around to voice the needs of operations is a crazy-efficient generator of technical debt.

One thing you don't know or haven't faced yet is that Mongo fails, and sometimes it fails so hard that you would rather kick some rocks.

For example, if you start a background index generation on an eventually consistent replica set, indexing on the secondary nodes is done in the foreground. That means you only accept reads from slaves, but the slaves are unresponsive because of the index generation. In this state, if you try to do anything fancy, your data will get corrupted. The only way out is to wait through the outage (which I find pretty hard to do). This is still not solved in 2.4; waiting for 2.6.

Then there are replica sets stuck with all secondaries, unable to elect a primary because a node was lost; the mostly random primary-secondary switches that drop all connections; and, occasionally, a primary re-electing itself while dropping connections for no apparent reason. Mongo offers tin foil hats for integrity, consistency and reliability. So yeah, I'd rather examine and understand why an SQL query is slow, because that is at least deterministic, which in Mongo nothing really is.

Postgres supports free-form JSON, XML or hstore document formats, by the way, and CouchDB has its own specific features as a document DB too. I still don't see why people want to stick with Mongo so badly.

I've been running, on a five minute cron, kill -9 on the master mongo instance in our QA lab for a good long time. 24/7.

There's a program running that, 100 times per second, reads a document, MD5's the field, and writes it back. At the same time, it reads a file from the local filesystem, MD5's it, and writes it back. The document and the local filesystem file started with the same value.

After a few thousand kill -9's on the master instance, the local file and the mongo document are still identical.
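The core loop of a test like this might look roughly as follows. This is only a sketch under assumptions: a plain dict stands in for the Mongo document (no driver calls are made), the field name `field` and the helper names are made up, and the `kill -9` cron and the 100-per-second pacing are left out.

```python
import hashlib
import os
import tempfile


def md5_hex(value):
    return hashlib.md5(value.encode("utf-8")).hexdigest()


def run_chain(document, path, iterations):
    """Apply read -> MD5 -> write back to both the document and the file."""
    for _ in range(iterations):
        # "Database" side: `document` is a plain dict standing in for a Mongo doc.
        document["field"] = md5_hex(document["field"])
        # Filesystem side: read the file, hash its contents, write it back.
        with open(path, "r", encoding="utf-8") as fh:
            current = fh.read()
        with open(path, "w", encoding="utf-8") as fh:
            fh.write(md5_hex(current))


def in_sync(document, path):
    """True if the file contents and the document field are identical."""
    with open(path, "r", encoding="utf-8") as fh:
        return fh.read() == document["field"]


def demo(iterations=100, seed="seed"):
    """Start both sides from the same value, run the loop, compare."""
    document = {"field": seed}
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            fh.write(seed)
        run_chain(document, path, iterations)
        return in_sync(document, path)
    finally:
        os.remove(path)
```

In the real test the interesting part is what happens when the database side is killed mid-loop; the comparison at the end is the same.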

I've been running MongoDB in production since 2010.

It's definitely possible to use Mongo in a way that isn't safe for your particular use case. But we're doing it correctly.

I haven't lost a bit of data in more than three years of MongoDB.

Mongo has a lot of limitations. We're currently researching various 'big data' solutions, because for us, Mongo doesn't fit that.

For ease of development (in dynamic languages, where your in-program data structures and in-database documents look almost identical), safety, and lack of headaches, MongoDB has been a consistent win for me and the teams I've been on.

Is your test producing a new document value 100 times a second, or just writing the same value back over and over again?

It sounds like it might be the latter, which is not a particularly stressful test (because you can't detect data rollback).

I'm more familiar with relational database internals, but I wouldn't be surprised if a DB just optimized out the unchanged-write entirely (they'd still need to read the current row value, but they don't have to invoke any data modification code once they see the value hasn't changed).

For a good test, you really want to simulate a power loss, which you aren't getting when you do a process-level kill, because all the OS caches/buffers survive. You can simulate this with a VM, or with a loopback device. I'd be amazed if MongoDB passed a changing-100-times-a-second test then. I'd be amazed if any database passed it. I'd even be amazed if two filesystems passed :-)
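On the rollback point: if each write stores a hash of the previous value, the stored value after n writes is fully determined by the starting value, so a checker can work out how many writes actually survived. A minimal sketch, assuming an MD5 chain over a string and hypothetical helper names:

```python
import hashlib


def md5_hex(value):
    return hashlib.md5(value.encode("utf-8")).hexdigest()


def steps_applied(seed, observed, max_steps):
    """Return n such that MD5 applied n times to seed gives observed, else None."""
    value = seed
    for n in range(max_steps + 1):
        if value == observed:
            return n
        value = md5_hex(value)
    return None


def lost_writes(seed, observed, acknowledged_writes):
    """Number of acknowledged writes missing from the stored value."""
    applied = steps_applied(seed, observed, acknowledged_writes)
    if applied is None:
        raise ValueError("stored value is not on the hash chain (corruption?)")
    return acknowledged_writes - applied
```

If the client saw 10 acknowledged writes but the stored value is only 7 steps along the chain, 3 writes were rolled back. Writing the same value over and over gives no such signal.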

I'm reading the document, hashing it, and writing the hashed value back. So it changes every time.

I plan on extending this test by blocking 27017 with iptables, then doing the kill -9, then wiping out all of the database files. That'll be fun. :)

"A developer might not know how to optimize a slow-running Postgre query, but he knows how to cache data and denormalize in a manner that makes the query he wants to run fast"

And you can cache and denormalize with postgres just as easily, and it will probably perform better and not corrupt your data.

What mongo makes easy is things like replication and failover. It can go horribly wrong on that front, but until you see that, it is much easier to get up and running with replication and failover.

No, with Mongo you get a bunch of hacks you need to do to maybe get indexed reads. I had to write a query optimiser that does lots of crazy things and simply rejects some entirely reasonable-looking queries because Mongo can't do them efficiently.

Not to mention the complete lack of durability. Even Redis is more durable.

Postgres is trivial to use. Is something slow (which will happen much later than with Mongo)? Add an index, done. I'm not a DBA, you don't need one.

are there no problems with mongo 'once you get over a few million rows'?

You need consistency (distributed MVCC), row-level locking and a scale-out architecture. Clustrix might be a better fit for the workload. Clustrix customer Twoo.com has a 336-core deployment (168 master, 168 slave, 21 nodes each); they have millions of users and billions of transactions per day. Their application still thinks it's talking to a single MySQL database, they don't have a DBA, and they have never thought about shard keys etc. We in the database industry should be solving these problems for you.

Wish someone would do that for Postgres.

There's Postgres-XC [1], but it's a fork and I don't know how actively developed it is.

1. http://sourceforge.net/apps/mediawiki/postgres-xc/index.php?...

Nice work -- that's a hard problem. I've felt some of this pain as I helped Cloudant customers make hot migrations from Mongo into Cloudant. Your first two figures make it clear just how challenging hot replications are (especially master-master), not to mention handling failure scenarios. For all of the great things about Mongo, there's something very awesome to be said for CouchDB's MVCC replication model. Glad to see that you're open sourcing those tools. Maybe they could be extended to make a MongoDB <==> CouchDB replicator.

I worked on a project where we used Mongo to collect analytics and while it worked, there were always problems. We had a problem with our shard key so one shard received more data which crashed it, overloaded the other two shards and crashed the whole system. A second issue we had was trying to read from Mongo while we were storing data which caused a lot of lock contention. The question I have is, knowing what you know now, would you have still gone with Mongo?
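The comment doesn't say which shard key was used, so the following is only an illustration of one common way a shard key goes wrong: range-sharding on a monotonically increasing value such as a timestamp. The boundaries, shard count and function names are all made up. Every new event sorts after the last boundary, so one shard takes all the writes, while hashing the same key spreads them out.

```python
import hashlib
from collections import Counter

NUM_SHARDS = 3


def range_shard(timestamp, boundaries):
    """Range sharding: a key goes to the first shard whose upper bound exceeds it."""
    for shard, upper in enumerate(boundaries):
        if timestamp < upper:
            return shard
    return len(boundaries)  # everything past the last boundary lands on the last shard


def hashed_shard(timestamp, num_shards=NUM_SHARDS):
    """Hashed sharding: spread keys by a hash of the value, not its order."""
    digest = hashlib.md5(str(timestamp).encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


def distribution(keys, shard_fn):
    """Count how many keys each shard receives."""
    return Counter(shard_fn(k) for k in keys)


# Hypothetical workload: boundaries were chosen from old data,
# and all new events carry later timestamps.
new_events = range(5000, 6000)
by_range = distribution(new_events, lambda t: range_shard(t, [1000, 2000]))
by_hash = distribution(new_events, hashed_shard)
```

Here `by_range` puts all 1000 events on shard 2, while `by_hash` touches all three shards. The trade-off is that hashed keys give up efficient range scans over the original ordering.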


> ...a single Python process using gevent and pymongo can copy a large MongoDB collection in half the time that mongodump (written in C++) takes, even when the MongoDB client and server are on the same machine.

I would guess this is for the reason posted in the article: the Python code is multithreaded (sort of, gevent) but the C++ code is only single threaded.
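To show why overlapping I/O helps in a copy like this, here is a sketch that swaps in the standard library's `asyncio` for gevent and simulates the database with in-memory lists and `asyncio.sleep`. It is not the article's code: `fetch_batch`, `insert_batch` and the latency figures are hypothetical, and no pymongo calls are made.

```python
import asyncio


async def fetch_batch(source, start, stop):
    """Hypothetical read of documents [start, stop) with simulated I/O latency."""
    await asyncio.sleep(0.01)
    return source[start:stop]


async def insert_batch(dest, docs):
    """Hypothetical write with simulated I/O latency."""
    await asyncio.sleep(0.01)
    dest.extend(docs)


async def copy_batch(source, dest, start, stop):
    docs = await fetch_batch(source, start, stop)
    await insert_batch(dest, docs)


async def copy_sequential(source, dest, batch_size):
    """One batch at a time: each read and write waits for the previous one."""
    for start in range(0, len(source), batch_size):
        await copy_batch(source, dest, start, start + batch_size)


async def copy_concurrent(source, dest, batch_size, max_in_flight=8):
    """Many batches in flight: waits overlap, bounded by a semaphore."""
    limit = asyncio.Semaphore(max_in_flight)

    async def bounded(start):
        async with limit:
            await copy_batch(source, dest, start, start + batch_size)

    await asyncio.gather(*(bounded(s) for s in range(0, len(source), batch_size)))
```

Both functions copy the same documents (the concurrent one may insert batches out of order); the concurrent version spends its time waiting on several batches at once instead of one after another, which is the effect a single-threaded copy can't get.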

Sure, but that seems to suggest a meta-reason: the standard utility included with the package isn't asynchronous [sort of orthogonal to threadedness] because it would be a PITA to write an asynchronous utility in C++.

That's not true, I've had no trouble writing very fast threaded C++ programs with the mongo client.

1. That's not what I said.

2. I'm sure the Mongo devs would accept pull requests for mongodump.

1. > it would be a PITA to write an asynchronous utility in C++

2. Doubt it, and there's no reason for me to write one for them.

Asynchronous ≠ multithreaded. TFA actually tried both ways, measured the difference, and found async to be much faster.

We hear similar stories from the field here at Cloudant. Distributed databases, scaling clusters -- that's hard stuff. Nice to see such a thoughtful post.

It's funny to read these comments trashing mongodb. Perhaps people should do a little less HN ranting and a little more serious study about the problem they need to solve. The design, benefits and tradeoffs of mongodb are well understood. If they don't fit your application, don't use it.

Summary: vertical partition on collection. They wrote some new code to make moving a collection easier.

I'm aware of MongoDB issues with scaling, and I'm trying to avoid it as much as I can, but I still can't find any other alternative that offers geolocated queries.

From firsthand experience, the PostGIS extension for PostgreSQL is fantastic.

A couple of my coworkers at Basho have done geospatial work with Riak, our scalable, distributed database: http://basho.com/indexing-the-zombie-apocalypse-with-riak/

Someone below mentions PostGIS, which from my understanding is among the best tools in the field (if you've heard of OpenStreetMap, the whole DB for all their geo data runs on that, and people use dumps like others do with Wikipedia to build their own customized GIS apps on top of it).

That said, other people mention Cloudant deployments here, and that is CouchDB. If you want a Mongo-ish (they are still very different) NoSQL document store (notice I did not say database), there is an extra layer of functionality on top of CouchDB known as GeoCouch. I have never personally used it but I have been looking for a reason to.


Why does the new system have less write lock contention? Is it sharded?

It sounds like the new system is a cluster with a single collection. So writes to the other collections no longer occur on it, and therefore the write lock contention decreased.
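A rough way to see why moving one collection out lowers contention, as a sketch with made-up collection names and write rates (none of these numbers come from the article): with a database-level lock, every collection in a database queues on the same lock, so the load on a lock is the sum of the writes to everything placed behind it.

```python
from collections import Counter


def lock_load(write_rates, placement):
    """Total writes queued on each (cluster, database) lock.

    write_rates: collection name -> writes per second (illustrative only)
    placement:   collection name -> (cluster, database) holding it
    """
    load = Counter()
    for collection, rate in write_rates.items():
        load[placement[collection]] += rate
    return load


# Hypothetical workload: one hot collection and two quieter ones.
write_rates = {"hot": 900, "users": 60, "settings": 40}

before = lock_load(write_rates, {
    "hot": ("old", "app"), "users": ("old", "app"), "settings": ("old", "app"),
})
after = lock_load(write_rates, {
    "hot": ("new", "app"), "users": ("old", "app"), "settings": ("old", "app"),
})
```

In this made-up case the old cluster's lock goes from 1000 to 100 writes per second, and the hot collection no longer competes with anything else on the new cluster.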


to everyone use hnotify!

It's at the database level now.

tl;dr your system is more complicated than you realize, so scaling it will be more complicated than you think. Also, MongoDB's write lock is a pain in the ass.

two writes don't make it wrong?
