

Notes from a production MongoDB deployment - dmytton
http://blog.boxedice.com/2010/02/28/notes-from-a-production-mongodb-deployment/

======
rubyrescue
If you suffer a power loss and/or MongoDB exits uncleanly then you will need
to repair your database.... it involves MongoDB going through every single
document and rebuilding it. When you have databases at the size of ours, this
would take hours.

- that seems really scary

~~~
spudlyo
Scary yet familiar. I work with many databases that have large MyISAM tables,
which is essentially the same situation.

------
jcromartie
I'm currently using MongoDB for a new system. I'm glad to see them reporting
such a good experience at that scale. My deployment won't be anywhere near as
large, so it's very encouraging. It will probably keep running in a screen
session, like they say theirs did initially :).

------
viraptor
I wonder... if they say "However, these files are allocated before MongoDB
starts accepting connections. You therefore have to wait whilst the OS creates
~37 2GB files for the logs." and then just run:

    head -c 2146435072 /dev/zero > local.$i


Can it be lazily allocated by doing something like `open(); seek(2146435072);
write("\0");`? Is a sparse file _required_ to return "0" when reading the empty
regions, or does that just happen to work most of the time?
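To make the question concrete: POSIX does specify that reads from the gap left by seeking past end-of-file return zero bytes, so it's guaranteed, not luck. A sketch of the two allocation strategies (paths and sizes are placeholders, not what mongod actually does):

```python
import os

SIZE = 2146435072  # ~2 GB, the size from the article

def allocate_eager(path, size=SIZE):
    # Physically write zeros: every block ends up allocated on disk.
    # This is what `head -c 2146435072 /dev/zero > local.$i` does.
    with open(path, "wb") as f:
        chunk = b"\0" * (1 << 20)
        written = 0
        while written < size:
            written += f.write(chunk[: size - written])

def allocate_sparse(path, size=SIZE):
    # Seek past EOF and write a single byte: the file reports the full
    # size, but the unwritten gap is a hole that reads back as zeros
    # without occupying disk blocks.
    with open(path, "wb") as f:
        f.seek(size - 1)
        f.write(b"\0")

# os.stat(path).st_blocks (512-byte units) shows the difference:
# the eager file occupies ~size/512 blocks, the sparse one almost none.
```

The sparse version is near-instant, but each hole still gets allocated on first write, so the cost is deferred rather than avoided.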

~~~
ynniv
Several comments on the article suggest using sparse files. I suspect that
this completely defeats the purpose of pre-allocating space. Initialization
would be fast, but the time saved would come back as a (probably larger)
penalty amortized over the runtime in non-obvious ways. The only way to know
for sure would be to performance-test a similarly sized database restored
under both conditions onto fresh drives.

~~~
jbellis
> I suspect that this completely defeats the purpose of pre-allocating space.

There are two reasons to preallocate. One is so that you can get by with
fdatasync rather than full fsync (that is, you don't have to sync file
metadata as well, which is usually an extra seek; file length is the most
commonly changed part of "metadata").

The other is to use mmap, since you can't change the size of an mmap'd file.
This is the only part that mongodb cares about, since they never fsync.

There may be reasons to mmap a sparse file but I can't think of any.
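The fixed-size constraint is easy to see with Python's `mmap` (a rough sketch of the general technique, not MongoDB's actual code; the path and size are made up):

```python
import mmap
import os

def map_datafile(path, size):
    # The mapping length is fixed when the mapping is created, so the
    # file must already be at its final length before we map it --
    # which is why the datafile is preallocated up front.
    with open(path, "w+b") as f:       # (re)creates the file
        f.truncate(size)               # note: this extends it *sparsely*
        return mmap.mmap(f.fileno(), size)

# Writes go through the mapping; the kernel flushes dirty pages on its
# own schedule unless mm.flush() (msync) is called explicitly -- the
# "never fsync" behaviour described above.
```

The mapping stays valid after the file object is closed, since the kernel holds its own reference to the underlying file.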

~~~
ynniv
_There may be reasons to mmap a sparse file but I can't think of any._

It sounds like you understand the forces at work here, but I am confused by
your statements. Do you think that "pre-allocating" a sparse file is a valid
alternative to writing 2GB of zeros to disk in this case?

------
mikebo
Very interesting that they have 17,810 collections (aka tables). I wonder if
it is common with MongoDB to design a data model this way? Does anybody have
more info on this tradeoff vs. a smaller number of larger collections?

~~~
dmytton
I've commented on this here: [http://blog.boxedice.com/2010/02/28/notes-from-
a-production-...](http://blog.boxedice.com/2010/02/28/notes-from-a-production-
mongodb-deployment/#comment-818)

Essentially this decision was due to the disk space requirements because of
the way MongoDB allocates files for each DB.

There are no performance tradeoffs for using lots of collections. See
[http://www.mongodb.org/display/DOCS/Using+a+Large+Number+of+...](http://www.mongodb.org/display/DOCS/Using+a+Large+Number+of+Collections)

~~~
mikebo
I'm actually not advocating for a database per customer. I was more wondering
why you have so many collections -- surely they're not all representing
different logical data. My guess is that you create some number of collections
per customer?

With the new built-in sharding support, couldn't you have a 'customer_id'
field and collapse many of those collections into larger collections, then
shard by 'customer_id'? I'm just trying to understand the tradeoff between
two types of schemas.

