
Why MongoDB Never Worked Out at Etsy - mcfunley
http://mcfunley.com/why-mongodb-never-worked-out-at-etsy
======
bonobo
I was expecting a post describing why MongoDB specifically wasn't a fit for
their use case, but the TL;DR version is basically:

" _Before you get too excited, the reason for the failure is probably not any
of the ones you're imagining. Mainly it's this: adding another kind of
production database was a huge waste of time._ "

The blog title is misleading IMO. It could as well be titled "Why [any other
DBMS] Never Worked Out at Etsy" and the conclusion would be the same.

~~~
gfodor
Your proposed title is very confusing and meta. The current title is only
misleading if you are going into it with biased expectations about what it's
going to teach you. The article is interesting precisely because it is not
just another "NoSQL sucks" screed.

~~~
bonobo
What I meant was that the article was not about MongoDB at all. I think
thefreeman summarized it better than I did in his reply: the author is talking
about his experience running more than one DBMS in production, not about
MongoDB in particular.

I'm not even saying whether the article was interesting or not. It surely has its
merits, and it touches on a point worth discussing (the downsides of trying to cover
one DBMS's weakness by throwing another DBMS into the mix), but I was
uninterested. I was expecting to hear why MongoDB wasn't fit for his use case,
to better understand when to use NoSQL and when not to, so yes, I was indeed
biased waiting for something different—but, in my defense, I say I was biased
because the title made me think this way.

------
lincolnbryant
Soundcloud wrote a blog post
[http://backstage.soundcloud.com/2011/04/failing-with-mongodb...](http://backstage.soundcloud.com/2011/04/failing-with-mongodb/)
about the specifics of how they failed in implementing an analytics
platform in MongoDB, then went with Cassandra (and why). I have been at a
company where it was on developers to deploy mongo clusters, and setting up
logging can suck, but still there are no numbers or even application
integration specifics here. As someone pointed out, manual denormalization
can suck. There are options like MongoHQ and Heroku, though, so this shouldn't
reduce to "don't try a new data store, it's hard and possibly buggy".

------
untog
Genuine question: who is using MongoDB successfully in production, and at
scale? I'm not aware of anyone myself. I hear of it being used in hackathons
etc. because it's so quick to set up, but I'd be curious to know what people are
using it with.

~~~
ry0ohki
I run two sites, one is the perfect use-case for MongoDB -
<http://www.AUsedCar.com> , it's a used car search engine. We've seen nothing
but benefits by switching to it from MS SQL Server. Queries are way faster
etc... It's a great use case because 99.9% of DB interactions are read-only
searches.

My other site, <http://www.BudgetSimple.com> on the other hand is using SQL
Server (in the process of porting to MySQL). It would not be a great use-case
for Mongo, because there are usually as many updates, deletes, and inserts as there
are reads, and instant database integrity and a schema are important.

Anyone that claims a tool is perfect for every problem is probably wrong. You
need to figure out the best one for your use case, and load test, security
test, performance test, etc... until you have a good guess for the right
answer.

~~~
mark_story
> It's a great use case because 99.9% of DB interactions are read-only
> searches.

Did you ever consider a datasource like elasticsearch? If yes what made you
choose mongo?

~~~
ry0ohki
No, and I actually used to work for a company that made a similar type of
search engine!

That probably would have worked as well (don't think I considered that
specific solution). Mongo came out on top because of its wide use (among other
things), i.e. it's pretty easy to find support and lots of stories about how to
scale it under different scenarios.

------
vph
The bottom line is that MongoDB and MySQL are two different persistent data
structures. MySQL is a more powerful data structure that can do more things.
MongoDB is less powerful, but is more efficient at certain things. Due to
premature optimization or shortsightedness, some folks are enamored with the
efficiency of a less powerful data structure (MongoDB) and fail to realize
that their application really needs the more powerful relational data
structure.

These things should be good learning examples for all.

~~~
lincolnbryant
This also makes it sound like whoever intervened to rewrite said feature in
sharded mysql had an easy time. Usually this would not be an obvious port.
However we don't know the technical nature of the feature or specifically why
it failed.

~~~
ryoung
I'm the developer that helped migrate the data from mongo to mysql (under
mcfunley's supervision). Even a straightforward data migration becomes
complicated when you have to do it without affecting the production feature or
consumers of your public api (parallel writes to both dbs, snapshot and move
the historical data, switch reads to the new db, etc). In addition, we took
the opportunity to move the feature to a sharded architecture and to rethink
the schema. Anyway, you're right in that it wasn't exactly an obvious or easy
port.
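
For readers who have not done one of these, the sequence in parentheses above (parallel writes, backfill from a snapshot, then switching reads) can be sketched roughly as follows. This is an illustration only, not Etsy's actual code: plain dicts stand in for the two databases, and the class and method names are made up for the example.

```python
class DualWriteStore:
    """Routes reads and writes while migrating from old_db to new_db.

    old_db and new_db are plain dicts here, standing in for real databases.
    """

    def __init__(self, old_db, new_db):
        self.old_db = old_db
        self.new_db = new_db
        self.read_from_new = False  # flipped only after the backfill checks out

    def write(self, key, value):
        # Step 1: every write goes to both stores so neither falls behind.
        self.old_db[key] = value
        self.new_db[key] = value

    def backfill(self):
        # Step 2: copy historical rows, without overwriting anything the
        # parallel writes have already put in the new store.
        for key, value in list(self.old_db.items()):
            self.new_db.setdefault(key, value)

    def switch_reads(self):
        # Step 3: move reads over only once both stores agree.
        if self.old_db != self.new_db:
            raise RuntimeError("stores differ; refusing to switch reads")
        self.read_from_new = True

    def read(self, key):
        source = self.new_db if self.read_from_new else self.old_db
        return source.get(key)
```

Deletes, retries, and partial failures are left out of the sketch and would need handling in a real migration.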

~~~
lincolnbryant
Congrats then! I've seen mongo features start simple then end up containing a
bunch of embedded lists of primary keys to SQL or other data stores, rather
than what would be a bridge table or two. Not as elegant if you're not putting
everything there (as some people here say is mandatory to prevent the
overhead of running mongo on top of everything else, like you mentioned).

------
leetrout
This is a really good point that doesn't bring up any direct slams against a
particular tool; +1 to the author for that.

I've found using Mongo as a stop-gap for consuming JSON APIs extremely useful.
You could probably s/Mongo/{nosqldb} there since it's nothing earth
shattering.

However, as the only tech guy in our startup I'm always looking harder at
Redis than Mongo for most of the problems for which a NoSQL solution might be
tempting. I've recently had a lot of success with JSON in Postgres and knowing
HStore is always there if I need it has firmly cemented my opinion that I
don't need a separate NoSQL solution (yet). (Of course I am merely persisting
data in JSON format- not querying on it).

------
darrencauthon
Maybe it should be titled...

"Why MongoDB Never Worked Out Two Years Ago When We Tried to Run It For Our
First Time For One Feature, And Beside Another Database Which We Really
Considered Production."

I've seen and used MongoDB on multiple projects, big and small, and it's fine.
It's a database that stores data. Use it for that purpose and you will be ok.

~~~
gfodor
You didn't read the article because the point was that the lesson learned was
that if you are going to have two data stores the human tendency is for one to
be a second class citizen with regards to support by ops, etc.

------
jeremyjh
This is totally reasonable. MongoDB, more than any other "NoSQL" database,
directly competes with MySQL/Postgres as a general-purpose application
database. I don't see a need to have more than one for most applications - at
least as long as there is only one development/support team for that
application.

------
monstrado
Most of the production deployments I find on the internet are around 3-5
nodes. Are there any production clusters that are running 500-600+ nodes?

~~~
francesca
Disney runs over 1400 instances according to this presentation:
[http://www.10gen.com/presentations/mongosv-2011/a-year-with-...](http://www.10gen.com/presentations/mongosv-2011/a-year-with-mongodb-running-operations-to-keep-the-game-magic-alive)

Also foursquare runs a very large MongoDB deployment.
[http://www.10gen.com/presentations/mongodb-foursquare-cloud-...](http://www.10gen.com/presentations/mongodb-foursquare-cloud-bare-metal)

Craigslist: <http://www.10gen.com/customers/craigslist>

Shutterfly also has a very large deployment:
<http://www.10gen.com/customers/shutterfly>

~~~
monstrado
1400 deployed instances doesn't necessarily equate to a 1400 node cluster. It
seems to be very common for these companies to have several small to medium
sized clusters...nevertheless, still pretty large deployments.

------
stesch
Wondering why everybody is using MySQL if PostgreSQL is supposed to be better.
Are there any (startup) success stories involving PostgreSQL?

~~~
mminer
Perhaps one reason is that more hosting providers support MySQL but not
PostgreSQL. Amazon AWS, Google Cloud SQL, and numerous others offer hosted
MySQL solutions but not PostgreSQL. I'm unsure how much overall usage such
service providers account for though; it would be an interesting stat.

~~~
stesch
Here and on Reddit I can read stories about people who are free to choose
whatever they want. Nearly nobody uses Java or PHP if they can decide
for themselves.

But it's always MySQL. Starting with MySQL, going back to MySQL, staying with
MySQL.

------
lmm
I'm surprised to hear this coming from Etsy, a place I thought of as doing
deployment right.

All these things should be simple. You already have (or should have) a unified
system for dealing with logging/monitoring/graphing/init scripts/backup across
multiple services that are far more different from each other than they are
from mongodb (Sharding strategy and slow queries are probably an application-
level concern). It shouldn't be hard - in fact it should be trivial - to add
one more service. At last.fm ( _disclaimer: my experience was brief and
getting on for two years ago_ ) it felt like we were running every database
under the sun, but we had a unified system for doing
deployment/monitoring/everything, so it was no bother to add one more if an
application wanted it.

------
mrinterweb
Misleading article title's summary: We tried to use a technology that was less
mature than another technology. We had to figure some stuff out that had
already been figured out on the more mature technology. Using two technologies
was more complicated than using one.

------
emperorcezar
Oh look, another "We thought Mongo was a silver bullet and found out that was
wrong" post.

~~~
aroman
Except, as others have said, that is not what this article is saying at all.
That said, your comment seems to imply that you just read the headline, and
(perhaps understandably) didn't actually read the article.

I've learned, especially on HN, that article titles can be extremely
misleading.

~~~
emperorcezar
"Mongo tries to make certain things easier, and sometimes it succeeds, but in
my experience these abstractions mostly leak. There is no panacea for your
scaling problem. You still have to think about how to store your data so that
you can get it out of the database. You still have to think about how to
denormalize and how to index."

I read the article. That statement makes it sound very much like they thought
Mongo would be a silver bullet for that feature.

------
nickaknudson
I think the real issue here is that most people don't understand /how/ to use
MongoDB.

The best use case for MongoDB is as a document store. I can essentially cache
numerous MySQL requests into a compiled set of useful information. Especially
if the information changes somewhat infrequently, then instead of running
MySQL requests for every page load I can pull the information from MongoDB. In
most cases when I use MongoDB, it's not as a persistent data store, but as a
"compiled" data store.

MongoDB also has some useful set operations.

I for one don't believe that MongoDB is /directly/ competing with MySQL,
Postgres, etc. but rather enhances these databases.
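
A rough sketch of the "compiled" data store idea described above, under loud assumptions: a dict stands in for the MongoDB collection, `run_sql_queries` is a hypothetical callable that would run the MySQL queries, and the field names and five-minute freshness window are invented for illustration.

```python
import time


def build_page_document(user_id, sql_rows):
    """Combine the results of several relational queries into one document."""
    return {
        "_id": user_id,
        "profile": sql_rows["profile"],
        "recent_orders": sql_rows["orders"],
        "compiled_at": time.time(),
    }


class CompiledStore:
    def __init__(self, run_sql_queries, max_age_seconds=300):
        self.run_sql_queries = run_sql_queries  # hypothetical: hits MySQL
        self.max_age_seconds = max_age_seconds
        self.documents = {}  # stand-in for a MongoDB collection

    def get(self, user_id):
        doc = self.documents.get(user_id)
        stale = (
            doc is None
            or time.time() - doc["compiled_at"] > self.max_age_seconds
        )
        if stale:
            # Only here do the relational queries run; otherwise the page
            # load is served from the already compiled document.
            doc = build_page_document(user_id, self.run_sql_queries(user_id))
            self.documents[user_id] = doc
        return doc
```

The relational database stays the source of truth; the document can be thrown away and rebuilt at any time.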

------
druiid
Whenever I see articles come up like this one mentioning MongoDB, I wonder not
why people decided to go with Mongo, but why they didn't go with some of the
alternatives out there? For my part, we use Couchbase to great success and it
fixes many of the complaints against MongoDB. Then there's Riak and countless
others with well established quality installations. To me MongoDB seems the
buzzword NoSQL engine that gets used for 'play' projects, but not much in the
way of real-world implementations. Thoughts?

~~~
jeremyjh
I do not see any of those other NoSQL databases as really being equivalent.
MongoDB intends to be a general-purpose application database. It has many of
the features developers expect from MySQL/Postgres, such as arbitrary numbers
of indexed fields, partial record updates, aggregation queries (simpler than
Map/Reduce) and many others. Couchbase may be much closer in feature-set but
its developers claim they do not really compete with Mongo.

I do not see Riak or Cassandra as competing at all. In fact I would expect
most applications that use Riak or Cassandra are also using a general-purpose
database as well (such as MySQL or Mongo). You could use some of those
databases as a general purpose database but it would be more work for little
benefit. It makes more sense to me to use Riak or Cassandra for use-cases that
really need high-throughput and unlimited write-scalability and use an app
database for things like user accounts and preference management and all the
little things that can take up a lot of development time but will never have
really demanding runtime requirements (for 99.99% of internet apps).

~~~
druiid
Good points for sure. I think though I'd personally look at the different
solutions on both an architectural and feature basis. A good number of the
reasons that the original article listed as issues they came across, were
outside the realm of features available in the actual MongoDB system (more or
less) such as problems with logging, monitoring, backups, etc and were more
architectural issues. To be certain, these can (and probably will) be issues
with other systems to investigate.

------
tobyjsullivan
I'm so glad to see this post. I remember having a conversation with someone
from Etsy at one point and they made an offhand comment about MongoDB having
been a terrible idea with a hint that there was a longer story to it than we
had time for. I've been curious about the story ever since.

Finally, some closure!

