
MongoDB Finds A Major Adopter In Craigslist - DanielRibeiro
http://java.dzone.com/news/mongodb-finds-major-adopter
======
zefhous
The link should probably be to this page:

<http://blog.mongodb.org/post/5545198613/mongodb-live-at-craigslist>

------
citizenkeys
This video is a presentation by Jeremy Zawodny, who co-wrote O'Reilly's "High
Performance MySQL" ( <http://oreilly.com/catalog/9780596101718> ). If there's
anybody worth watching in a video about switching from MySQL to MongoDB, it's
that guy.

~~~
DanielRibeiro
For a direct link to the 34 min video:
<http://www.10gen.com/video/mongosv2010/craigslist>

------
ericd
A bit off topic, but since Jeremy seems to be in here, I've been wanting to
thank you for High Performance MySQL - it's one of the best technical books
I've read in recent years, with a lot of insights that had been
hard/impossible to piece together definitively by reading online articles/docs
alone. It has taken a lot of uncertainty out of things for me as a dabbling db
admin trying to scale.

------
Jak3t
It's mind-blowing to read that they could get data into MongoDB faster than
they could get it out of MySQL. I wonder if their structure was that complex,
or if the queries were complex as well, or is MongoDB just that fast? Anyway,
great to see sites like this embracing new technologies and revealing the
details.

[edit] I guess watching the video explains a lot more. Do watch!

~~~
jzawodn
Yes, it is a bit mind-blowing. The only DB boxes I can consistently get data
out of at a high rate are those with Fusion-io cards in them. And we're not
heavily normalized--just a handful of tables.

------
zmitri
I recently went to a Google Tech Talk on MongoDB, and although the
presentation wasn't very hacker oriented, I must say the setup of replica
sets and sharding is ridiculously easy
(<http://www.mongodb.org/display/DOCS/Simple+Initial+Sharding+Architecture>).
I started up a project using CouchDB and Redis, but was impressed enough with
Mongo's scalability/ease of use that I think I might switch over to MongoDB
for a bit.

------
danielrhodes
The big tests for Mongo are how well it behaves under heavy load, how easy it
is to shard/replicate data at large scale (e.g. what Zawodny was talking about
in his presentation about Craigslist having 100 MySQL boxes), and to what
degree data is recoverable when inevitable failures happen (there are some big
questions here with Mongo since it appears it trades off ACID compliance for
speed).

It's great to see bigger users using Mongo because that's where these tests
really take place. For example, it seems Cassandra got tested this way at
Facebook/Reddit/Digg etc. and didn't cut it.

~~~
EwanToo
Pretty much all distributed databases trade off ACID for speed; it's a
fundamental issue with database design.

If you've got 20 database servers, potentially on the other side of the planet
from each other, and you wanted to be ACID compliant throughout the cluster,
each write commit could be delayed by 500ms or more.

If you're running a pair of servers in the same room, as is normally the case
with a traditional "active-active" database cluster, a network delay of 1ms
often isn't a significant impact.
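
To put rough numbers on that, here is a minimal sketch in Python. It assumes a
fully synchronous commit has to wait for an acknowledgement from every replica,
so the slowest round trip sets the latency. The function name, the `rounds`
parameter, and all round-trip figures are made up for illustration; they are
not measurements from any real cluster.

```python
def sync_commit_latency_ms(replica_rtts_ms, rounds=1):
    """Estimate commit latency when every replica must acknowledge.

    replica_rtts_ms: assumed network round-trip time to each replica, in ms.
    rounds: number of message rounds the commit protocol needs
            (hypothetical knob; e.g. 2 for a prepare step plus a commit step).
    """
    # The commit cannot finish before the slowest replica has answered
    # in every round, so the maximum round trip dominates.
    return max(replica_rtts_ms) * rounds


# Two servers in the same room: about 1 ms round trip (assumed).
same_room = [1, 1]

# Replicas spread around the planet (assumed round trips).
worldwide = [1, 80, 250]

print(sync_commit_latency_ms(same_room))            # one round, local pair
print(sync_commit_latency_ms(worldwide, rounds=2))  # two rounds, distant sites
```

With these assumed inputs the local pair commits in about 1 ms, while the
spread-out cluster needs about 500 ms per write, which is why most distributed
stores relax the guarantee instead of waiting.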

------
nwmt
Not to take away from the substance of the article itself, but is anyone else
surprised that they have 2 billion "documents", which presumably means active
ads/listings? That seems like an awful lot.

~~~
HarrisonFisk
MongoDB is being used for historical archiving, not for the live site itself.
The big reason is that changing table schemas for very large sets of old
data is painful with MySQL. So the 2 billion number would be any ad/listing
older than a set amount of time.

The live data is < 1 TB and is still stored in MySQL.
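
As a rough illustration of why a document store sidesteps the schema-change
problem, here is a small Python sketch. Plain dicts stand in for archived
documents (no MongoDB driver is used), and the field names such as
`neighborhood` are hypothetical, not Craigslist's actual schema.

```python
# An older archived posting, written before a new field existed.
old_posting = {"id": 1, "title": "Couch for sale", "price": 50}

# A newer posting that carries an extra field. Nothing about the old
# document has to be rewritten to make room for it.
new_posting = {"id": 2, "title": "Road bike", "price": 120,
               "neighborhood": "SoMa"}

archive = [old_posting, new_posting]


def neighborhoods(docs):
    """Read an optional field, tolerating documents that predate it."""
    return [doc.get("neighborhood") for doc in docs]


print(neighborhoods(archive))
```

The reader code handles the missing field at query time, whereas a relational
table would usually need a schema change applied across all the old rows
before the new column could be used.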

~~~
jzawodn
Exactly.

The "set amount of time" typically hovers around 60 days, though our archiving
process has been off for several months while the migration took place. So we
have some catching up to do--somewhere in the neighborhood of 150 million
postings, last I counted.

~~~
biot
I've been hearing some good things about Riak lately and their masterless
implementation seems quite interesting. Did Riak ever make your radar and, if
so, what were the disadvantages that made you choose MongoDB?

Were I to guess based on the video, I would say the lack of a Perl client,
and that you'd probably end up having to roll too many of your own solutions
on top of it?

