

Hybrid Incremental MySQL Backups - pkaler
https://www.facebook.com/notes/facebook-engineering/hybrid-incremental-mysql-backups/10150098033318920

======
dmpatierno
The real surprise here is that Facebook was still using mysqldump to perform
regular backups. Yikes, that's slow.

~~~
leef
As opposed to...?

~~~
dangrossman
Something a little more high tech, like running MySQL on top of a file system
that does synchronous replication.

You don't even need your own engineering to do better than daily mysqldumps.
Run your MySQL instance on Amazon RDS and you get synchronous replication to a
hot standby instance (that's not MySQL's async replication, no delay and no
overhead), and the ability to take consistent snapshots from the standby
server at any time. The snapshots will be incremental since they live on S3
where duplicate chunks aren't duplicated, and don't load your database server
while they're taken because they're copied off the standby.

~~~
leef
> Something a little more high tech, like running MySQL on top of a file
> system that does synchronous replication.

You can't guarantee a consistent snapshot from the file system alone for
InnoDB. You need some other logic on top, which is exactly what XtraBackup
provides.

RDS certainly isn't going to work for Facebook, but in order to achieve their
fast snapshot feature RDS is almost certainly using LVM snapshots.

LVM snapshots can be a better option than mysqldump, but they are local and
therefore require enough space to take the snapshot and keep it for the time
it would take to copy it off the box. This can be a big problem, as you can't
rely on your snapshot to always work. Also, using an LVM snapshot requires a
'FLUSH TABLES WITH READ LOCK' command to be run while the snapshot is in
progress, which can bring its own issues.
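
For anyone following along, here is a rough sketch of the ordering being
described: lock, snapshot, unlock, then copy from the snapshot. It only builds
the list of steps and does not run anything. The volume path, snapshot name,
size and mount point are made-up placeholders, and whether the lock step is
needed at all depends on your storage engines and volume layout, so it is left
as a switch.

```python
from typing import List, Tuple

Step = Tuple[str, str]  # (where the step runs, command text)


def lvm_snapshot_plan(
    origin_lv: str = "/dev/vg0/mysql",     # hypothetical logical volume
    snap_name: str = "mysql_snap",         # hypothetical snapshot name
    snap_size: str = "10G",                # hypothetical copy-on-write reserve
    mount_point: str = "/mnt/mysql_snap",  # hypothetical mount point
    lock_tables: bool = True,
) -> List[Step]:
    """Return the ordered steps of an LVM-snapshot backup; nothing is executed."""
    volume_group = origin_lv.rsplit("/", 1)[0]
    snap_dev = f"{volume_group}/{snap_name}"
    steps: List[Step] = []
    if lock_tables:
        # The lock must be held by ONE open client session until the snapshot
        # exists; it is released as soon as that session disconnects.
        steps.append(("sql", "FLUSH TABLES WITH READ LOCK"))
    steps.append(
        ("shell", f"lvcreate --snapshot --size {snap_size} --name {snap_name} {origin_lv}")
    )
    if lock_tables:
        steps.append(("sql", "UNLOCK TABLES"))
    # Some filesystems need extra mount options for a snapshot; not shown here.
    steps.append(("shell", f"mount -o ro {snap_dev} {mount_point}"))
    steps.append(("note", f"copy the files under {mount_point} off the box"))
    steps.append(("shell", f"umount {mount_point}"))
    # The snapshot keeps consuming space until it is removed.
    steps.append(("shell", f"lvremove -f {snap_dev}"))
    return steps


if __name__ == "__main__":
    for where, command in lvm_snapshot_plan():
        print(f"[{where}] {command}")
```

The point of the ordering is that the lock only needs to cover the lvcreate
call, not the slow copy that follows it.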

~~~
HarrisonFisk
You can actually use LVM on a live InnoDB instance as long as both the logs
and tablespace reside on the same volume. There is no locking required.

We actually used LVM at Facebook for MySQL backups for a while; however, as
you stated, it requires at least double the local space. In addition, it is a
real nightmare for performance while it is running, since it double-writes
data locally to disk. So basically you need to run at < 50% utilization for
some extended period of time to be able to take a snapshot successfully. We
run much higher than that 24/7.

When you aren't constrained by disk performance or space, LVM snapshots can
be a very good option.
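
To put rough numbers on the space side, here is a back-of-the-envelope sketch.
It is not how we sized anything, just a simple model I am assuming for
illustration: the snapshot has to live for as long as the copy off the box
takes, every byte written to the origin during that window costs one byte of
snapshot space (no credit for rewriting the same block twice), and the reserve
never needs to exceed the size of the origin. The function name and the
example figures are made up.

```python
def snapshot_reserve_gb(data_gb: float, copy_mb_per_s: float, write_mb_per_s: float) -> float:
    """Estimate the copy-on-write space (GB) a snapshot needs while it is copied off.

    Assumed model: reserve = write rate * time to copy, capped at the origin size.
    """
    if copy_mb_per_s <= 0:
        raise ValueError("copy throughput must be positive")
    copy_seconds = data_gb * 1024 / copy_mb_per_s    # how long the snapshot must live
    cow_gb = write_mb_per_s * copy_seconds / 1024    # data changed in that window
    return min(cow_gb, data_gb)                      # cannot exceed the origin itself


if __name__ == "__main__":
    # Hypothetical box: 500 GB of data, 100 MB/s copy, 20 MB/s sustained writes.
    print(snapshot_reserve_gb(500, 100, 20))
```

Under that model the reserve grows with the ratio of write rate to copy rate,
so a busy box that copies slowly ends up needing close to a full second copy
of local space.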

------
mleonhard
Restore time is important, too. XtraBackup backs up indexes, eliminating extra
time required to rebuild them during restore.

