

Automating partitioning, sharding and failover with MongoDB - dmytton
http://blog.boxedice.com/2010/08/03/automating-partitioning-sharding-and-failover-with-mongodb/

======
pierrefar
Boxed Ice is probably one of the larger MongoDB production deployments, and no
doubt letting the DB handle the data growth will be a good thing for them. If
anyone is going to run into issues deploying this, they will.

I have great hopes for Mongo 1.6: the automatic partitioning and failover
make it a very useful addition to the toolkit of any startup that needs this
kind of data storage. It's like having your very own Amazon SimpleDB that's
cheaper to build and that you control 100%.

BTW, David gave a good talk about how Boxed Ice uses Mongo recently in London:
<http://skillsmatter.com/event/cloud-grid/mongouk>

~~~
rb2k_
If sharding and replication are major concerns, I'd also take a look at Riak
or Cassandra. Especially as simple key-value stores, they offer more
interesting balancing and error recovery features than MongoDB. MongoDB does,
however, have a rich data schema and allows queries (although the necessary
indexes push up RAM consumption a fair bit and might become a problem).

~~~
pierrefar
Yeah I looked at them a little bit. The reason I went with Mongo for my
purposes was because it works the way I thought of the data and also it has a
very helpful community. It's also very easy to set up and get going with PHP.

------
Nitramp
Something that surprises me about MongoDB is that they always talk about how
this is for humungous data sets.

But then MongoDB does not seem to support online/hot backups without setting
the whole system to read-only. Incremental backups are not supported at all,
AFAICS. If a server crashes, you're in for some really lengthy fsck action
before your server is back up.

I don't understand how people actually make this work. If you have a huge
database, a complete backup is going to take too long to make setting your
system to read-only feasible. And if a server crashes, even if you have a hot
standby replica, if you have to do several hours of fsck before you can
restore regular operations, that doesn't sound good either.

It's nice to see movement in the database market, and MongoDB looks really
good from a front-end side (JS, indexes, queries, ...); but I think it really
is lacking some crucial features.

~~~
mathias_10gen
You don't have to set your whole system to read only for the entirety of the
backup. There are three ways to avoid it:

1) Use mongodump which queries for every object in the database. It will
produce a compacted file, but does not do a point-in-time snapshot.

2) Take a backup from a slave. You can either set it to read-only mode, or
just shut it down and copy the files.

3) Run MongoDB on top of LVM. LVM supports almost instant snapshotting because
it uses copy-on-write. Once the snapshot has been made (usually in less than a
few seconds) you can resume the server for read-write operations.

Note that all three options can be combined. You could make an LVM snapshot of
a read slave then use mongodump's --dbpath option to produce compacted backup
files.
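
As a rough illustration of that combination, here is a minimal Python sketch
that only builds the command sequence; it does not run anything. The volume
group, volume, snapshot name, size and paths are all made-up placeholders, it
assumes the slave is locked or shut down while the snapshot is taken, and it
assumes the data files sit at the root of the snapshotted filesystem. The
--dbpath option is the one mentioned above, so check that your mongodump
version has it.

```python
import shlex


def build_backup_plan(vg, lv, dump_dir, snap_name="mongo-snap",
                      snap_size="1G", mount_point="/mnt/mongo-snap"):
    """Return the commands (as argv lists) for a snapshot-based backup.

    All names and paths are hypothetical placeholders for illustration.
    """
    snap_dev = f"/dev/{vg}/{snap_name}"
    return [
        # 1. Copy-on-write snapshot of the volume holding the slave's data.
        ["lvcreate", "--snapshot", "--size", snap_size,
         "--name", snap_name, f"/dev/{vg}/{lv}"],
        # 2. Mount the snapshot so the data files can be read from it.
        ["mount", snap_dev, mount_point],
        # 3. Dump straight from the files, producing compacted output.
        ["mongodump", "--dbpath", mount_point, "--out", dump_dir],
        # 4. Clean up: unmount and drop the snapshot.
        ["umount", mount_point],
        ["lvremove", "--force", snap_dev],
    ]


if __name__ == "__main__":
    for cmd in build_backup_plan("vg0", "mongodb", "/backups/today"):
        print(shlex.join(cmd))
```

The server only has to be held still for step 1; the slow part (step 3) runs
against the snapshot.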

Please see <http://www.mongodb.org/display/DOCS/Backups> for more details.

~~~
rbranson
I second the snapshotting using the disk system. It would be nice to be able
to do this without locking the database, but LVM or ZFS snapshots take
fractions of a second to create. In the meantime, pending transactions will
just wait.
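
The pattern is roughly lock, snapshot, unlock. A minimal Python sketch of it,
where lock, unlock and take_snapshot are hypothetical callables you would
supply yourself (for example, wrappers around the database's fsync-with-lock
command and an LVM or ZFS snapshot command); nothing here is a real driver
API:

```python
from contextlib import contextmanager


@contextmanager
def writes_paused(lock, unlock):
    """Hold the database write lock for the duration of the with-block."""
    lock()
    try:
        yield
    finally:
        # Always release, even if the snapshot fails, so that the writes
        # queued behind the lock are not blocked indefinitely.
        unlock()


def snapshot_with_lock(lock, unlock, take_snapshot):
    """Take a filesystem snapshot while writes are paused."""
    with writes_paused(lock, unlock):
        take_snapshot()
```

The try/finally is the important bit: writers are waiting the whole time the
lock is held, so the unlock has to happen even when the snapshot step blows
up.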

