
Rsnapshot: filesystem snapshot utility based on rsync - eatonphil
http://rsnapshot.org/
======
discreditable
I used rsnapshot for a long time and I can't recommend it over attic
([https://attic-backup.org/](https://attic-backup.org/)) anymore.

Rsnapshot works quite well but over time I started having problems. I was
using it to snapshot a web server with many files uploaded by users. Over time
there were so many millions of files that trying to work with the rsnapshot
directories required a lot of patience. There was one time when I needed to
copy the files to another storage medium; I let rsync run for DAYS and there
was no end in sight. There were also two instances where I had some filesystem
(ext4) corruption within my rsnapshot tree and in the end the only way I could
get it happy again was to delete some suspect backups from my chain.

If you're using it to back up anything more than /etc/, I'd strongly suggest
using attic. It has these advantages over rsnapshot:

* Block-level dedup instead of file-level. With file-level dedup, if 1 byte changes in your 1 GB file, that's another 1 GB needed in your backup.

* Compression.

* Encryption via key file, and/or passphrase.

Disadvantages over rsnapshot:

* You access your backups via a fuse mount. This isn't awful but I admit it isn't as nice as direct access.

* If you are running attic from a local host to a remote one, the remote needs attic installed. Alternatively you can store your attic repo locally and mirror it remotely in some other way.

* It can take a long time to work with large backup chains.
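
The block-level dedup point is the big one. Here's a toy sketch of the idea (content-addressed fixed-size blocks; this is not attic's actual on-disk format, which uses variable-size chunks) showing why a one-byte change costs roughly one block rather than a whole new copy:

```python
import hashlib, random

BLOCK_SIZE = 4096  # fixed-size blocks; real tools use variable-size chunks

def store_blocks(data, store):
    """Split data into blocks and file each one under its SHA-256 digest."""
    refs = []
    for i in range(0, len(data), BLOCK_SIZE):
        block = bytes(data[i:i + BLOCK_SIZE])
        digest = hashlib.sha256(block).hexdigest()
        store.setdefault(digest, block)  # only previously unseen blocks cost space
        refs.append(digest)              # the "backup" is just this list of refs
    return refs

random.seed(0)
store = {}
original = bytearray(random.randrange(256) for _ in range(1024 * 1024))
store_blocks(original, store)            # first backup of a 1 MiB file
before = len(store)

original[500000] ^= 0xFF                 # flip a single byte
store_blocks(original, store)            # second backup
print(len(store) - before)               # 1: only the changed block is new
```

With file-level dedup (rsnapshot's hardlinks), that same one-byte flip would store the full 1 MiB again.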

~~~
vog
Attic shouldn't be used anymore! [1]

You should either switch to Borg, a backwards-incompatible fork of
Attic:

[https://borgbackup.readthedocs.io/en/stable/](https://borgbackup.readthedocs.io/en/stable/)

Or you should switch to Obnam, which seems to have a cleaner implementation
than Attic/Borg, and most importantly, a well-documented archive format:

[http://obnam.org/](http://obnam.org/)

Note that both have an archive format that is incompatible with Attic's. Also
note that some years ago Obnam had poor performance compared to Attic, but as
far as I know it has become much faster since then.

[1] A serious bug which may cause data loss has never been fixed, so Attic was
removed from some distros such as Debian: [https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=802619](https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=802619)

~~~
oever
Unlike Attic, Bup, and Borg, Obnam does not use rolling checksums. Rolling
checksums are a brilliant way to do efficient deduplication. The bup DESIGN
document explains how rolling checksums can be used.

[https://github.com/bup/bup/blob/master/DESIGN](https://github.com/bup/bup/blob/master/DESIGN)
[http://obnam.org/bugs/more-details-dedup/](http://obnam.org/bugs/more-details-dedup/)
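
As a rough illustration of the idea (a toy rolling sum, nothing like bup's actual hash): content-defined chunking cuts chunk boundaries based on the data itself rather than on absolute offsets, so inserting a few bytes near the start only disturbs the chunks around the insertion point and everything downstream still deduplicates:

```python
import hashlib, random

WINDOW = 48
MASK = 0x3F  # cut a boundary when the low 6 bits match -> ~64-byte chunks on average

def chunk(data: bytes) -> list:
    """Cut data into chunks wherever a rolling sum over the last WINDOW
    bytes matches MASK, so boundaries depend on content, not offsets."""
    chunks, start, rolling = [], 0, 0
    for i, byte in enumerate(data):
        rolling += byte
        if i >= WINDOW:
            rolling -= data[i - WINDOW]  # drop the byte sliding out of the window
        if (rolling & MASK) == MASK:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])
    return chunks

random.seed(0)
data = bytes(random.randrange(256) for _ in range(4096))
a = {hashlib.sha256(c).hexdigest() for c in chunk(data)}
b = {hashlib.sha256(c).hexdigest() for c in chunk(b"xyz" + data)}  # prepend 3 bytes
print(len(a & b) / len(a) > 0.5)  # most chunk hashes survive the insertion
```

With fixed-offset blocks, prepending 3 bytes would shift every block and nothing would match; here only the chunks near the front change.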

------
beagle3
While rsnapshot and friends are useful as they provide a real file system
(e.g., on a portable drive that can just be connected to another computer at
any time, no app required), there are "bup", "borg backup" and (IIRC) "obnam",
which all provide much faster snapshotting[0], much better space utilization
(a change of 1 byte in a 1GB file takes 1GB for rsnapshot; it takes 4KB-128KB
for the alternatives), compressed storage, and more. And if you want a "live"
system, they provide a FUSE mountable system that makes every past revision
look live.

I haven't used obnam, but from the interwebs it seems like bup and borg are
much much faster, with bup's forte being in the situation where many clients
with similar files are backing up to a remote server over ssh (in this
specific scenario, borg is slow and cumbersome), and in that it's just a git
repository you can poke at. Otherwise, borg has encryption, deletion of old
versions (still experimental in bup), and many other goodies.

[0] not really an atomic snapshot, of course - they don't make any more or
less guarantees than rsnapshot and friends.

~~~
ComodoHacker
Add zpaq to the family. It also supports deduplication, encryption and offline
catalog. The latter means you don't need access to old backups to perform
deduplication. And it's cross-platform.

~~~
beagle3
I tried zpaq four years ago and it looked very well built, but exceptionally
slow (as in, 5-10 times slower than bup at the time). Have things changed?

~~~
ComodoHacker
I didn't compare with bup and can't say how much it has improved in four
years. But yes, subjectively it's slow, mostly because of compression, I
guess. Not "exceptionally" slow, though. With default settings it's a good
tradeoff IMO.

------
prohor
I also like rdiff-backup, which makes a mirror of a directory + reverse diffs,
in case you need to move back in time. [http://www.nongnu.org/rdiff-backup/](http://www.nongnu.org/rdiff-backup/)

Duplicity is another tool that uses the rsync algorithm to find and store
differences, but unfortunately it makes incremental backups. On the upside, it
can store backups on anything you can push files to (FTP, S3, IMAP, ...), and
it is the only rsync-related solution with encryption I could find:
[http://duplicity.nongnu.org/](http://duplicity.nongnu.org/)

There is also a very nice full solution, BoxBackup, which unfortunately
doesn't look very active now. It does continuous backup, with block-level
de-duplication and client-side encryption. Unfortunately it uses its own
format, whereas I strongly prefer more standard things, so you can manually
fix them if something goes wrong.
[https://www.boxbackup.org/](https://www.boxbackup.org/)

~~~
Bromskloss
> I also like rdiff-backup, which makes a mirror of a directory + reverse
> diffs, in case you need to move back in time. [http://www.nongnu.org/rdiff-backup/](http://www.nongnu.org/rdiff-backup/)

My favourite backup tool! It's a tragedy that it seems to have gone stale.

~~~
Zitrax
There seems to be some activity now in the GitHub repo, and the home page has
definitely been updated since the last time I visited it.

------
cyphar
The issue with using a setup like this is that you run the risk of having
inconsistent snapshots, where several related files were snapshotted at
different points in time. I'm not sure how this lash-up works with something
like a database, where you have to have a consistent copy of a single file as
well.

If you use Btrfs (or ZFS if you're on FreeBSD or illumos) then you can have
proper filesystem snapshots with essentially no overhead. Since both
filesystems are transactional, you're guaranteed to get a snapshot of a single
transaction (resulting in no inconsistency). Unfortunately due to partial
writes in database files, Btrfs has some performance issues when having copy-
on-write enabled (which is required if you want snapshots) for a database
file.

~~~
kijin
You can't use rsync, rsnapshot, or any other tool that manipulates files
and/or filesystems, to get a consistent backup of most databases. Databases
come with their own tools and/or commands for creating a consistent "dump".
Nothing else is guaranteed to be safe.

I use mysqldump/pg_dump/etc. to create database dumps and copy them to backup
storage, and rsync/rsnapshot to back up the app itself. There is no need to
use a single tool to back up everything. If your app is well designed, there's
not even any need to back up the database and the filesystem simultaneously.

Even SQLite has a .dump command.
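
For example, with Python's sqlite3 module (this mirrors the CLI's .dump command; the table here is just an illustration), a consistent dump can be taken from a live connection and restored elsewhere:

```python
import sqlite3

# A throwaway database standing in for a live application DB.
src = sqlite3.connect(":memory:")
src.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
src.execute("INSERT INTO users (name) VALUES ('alice'), ('bob')")
src.commit()

# iterdump() is the library equivalent of sqlite3's .dump: it emits SQL
# that recreates the database at a consistent point in time.
dump_sql = "\n".join(src.iterdump())

# Restoring the dump yields the same data.
dst = sqlite3.connect(":memory:")
dst.executescript(dump_sql)
rows = dst.execute("SELECT name FROM users ORDER BY id").fetchall()
print(rows)  # [('alice',), ('bob',)]
```

The dump file is a plain text artifact, so it's exactly the kind of thing rsync/rsnapshot can then ship to backup storage safely.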

~~~
koolba
> You can't use rsync, rsnapshot, or any other tool that manipulates files
> and/or filesystems, to get a consistent backup of most databases.

Sure you can and if you can't, then get a better database!

A well-functioning database should be able to deal with being shut down
uncleanly. If you can't take a snapshot of the disk, then what's the
difference between that and the power cord being pulled?

The main reason not to do this on a DB server is that the recovery time may be
large. That doesn't mean it can't be done, though.

> Databases come with their own tools and/or commands for creating a
> consistent "dump". Nothing else is guaranteed to be safe.

Sure they do, and you should use those as well; that doesn't mean you can't do
this too.

~~~
cyphar
The issue is not that you have an atomic copy of the state at a certain point
in time, which a bad database couldn't load. The problem is that when you make
a full copy of a file you are copying linearly, and if it's a big file, a
chunk could change while you're reading it (or multiple related chunks which
you are in the middle of copying) -- then you have a _very_ inconsistent state
which a database can't recover from, no matter whether its updates are
atomically written. The database could do some magic with unlinking, but that
would cause massive performance problems.

~~~
koolba
That's a solved problem:
[http://linux.die.net/man/8/xfs_freeze](http://linux.die.net/man/8/xfs_freeze)

Combine that with disk snapshotting and you're fine.

~~~
cyphar
Or just use a filesystem where snapshots are atomic by nature. :P

------
modeless
There are many rsync-based backup systems like this that assume you back up
many times and don't restore often, so backups are efficient (deltas, etc) but
restoring is not (just download the whole thing). I'd like to use an rsync-
like mechanism for software distribution, where the problem is reversed: the
software is updated infrequently, but there are many clients that all need to
download updates, so "restoring" should be efficient. Is there any open source
solution for that? Preferably where the server can be dumb, like an S3 bucket.

~~~
rakoo
You might want to have a look at zsync
([http://zsync.moria.org.uk/](http://zsync.moria.org.uk/)), which is exactly a
reversed rsync: the seed server calculates a signatures file over a file, the
clients download that signatures file (through any means, but usually HTTP)
and calculate a diff on their side; once they have instructions, they can
download only the parts they need through HTTP Ranges. This means that any
dumb HTTP storage that respects Ranges will work.

The big advantage over rsync, on top of being available over HTTP, is that the
server only calculates the signatures file once for the life of the content,
instead of calculating it for each and every client.
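
A stripped-down sketch of that client/server split (fixed-offset block hashes only; real zsync additionally uses rolling checksums so that shifted data still matches, and a real client would turn these into HTTP Range headers):

```python
import hashlib

BLOCK = 1024

def signatures(data: bytes) -> list:
    """Server side: computed once per published file, served as a static file."""
    return [hashlib.sha256(data[i:i + BLOCK]).hexdigest()
            for i in range(0, len(data), BLOCK)]

def ranges_to_fetch(local: bytes, sigs: list) -> list:
    """Client side: byte ranges whose blocks differ from the published file."""
    needed = []
    for n, sig in enumerate(sigs):
        block = local[n * BLOCK:(n + 1) * BLOCK]
        if hashlib.sha256(block).hexdigest() != sig:
            needed.append((n * BLOCK, (n + 1) * BLOCK - 1))  # HTTP Ranges are inclusive
    return needed

published = bytes(8 * BLOCK)          # the new version on the server
stale = bytearray(published)
stale[3000] = 1                       # the client's copy differs in one block
print(ranges_to_fetch(bytes(stale), signatures(published)))
# [(2048, 3071)]
```

The key property is in the first function's docstring: the server does its work once, no matter how many clients sync.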

I know some distributions use that, but it's not very well known... yet?

~~~
modeless
Cool! Zsync looks like exactly what I want, except that it doesn't sync
directory trees, only individual files.

------
da_n
I spent a good few weeks recently researching and testing different backup
systems that I could install on a Raspberry Pi with a big USB hard drive
attached, to act as an extra offsite backup for some remote servers. rsnapshot
was pretty much the only system I found that ticked every requirement, and it
has proven itself very robust in the month or so I have had it doing nightly
snapshots of around 5 servers. Previously I had used a Ruby backup library
which was actually really good but needed to be run from the server rather
than the backup endpoint.

~~~
stevekemp
I wonder what you included in your requirements?

For me, the things I value are encryption on the backup host and
de-duplication. That led me to obnam and attic.

rsnapshot is a nice tool, because it only needs to transmit things that have
changed, but without encryption it isn't something I'd personally want to run
again.

~~~
prohor
The de-duplication in this case is achieved by hardlinking unchanged files
between snapshots. This means deduplication happens only at the file level, so
even an append makes a new copy.
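
That hardlink trick is easy to see from Python (a sketch of the mechanism, not rsnapshot's actual code): a second directory entry pointing at the same inode costs no extra data, but any content change forces a whole new file:

```python
import os, tempfile

# Link a "snapshot" name to the original file, the way rsnapshot hardlinks
# unchanged files between its daily.0, daily.1, ... snapshot trees.
with tempfile.TemporaryDirectory() as d:
    original = os.path.join(d, "daily.0")
    snapshot = os.path.join(d, "daily.1")
    with open(original, "wb") as f:
        f.write(b"unchanged contents")

    os.link(original, snapshot)  # second name, same inode: no data copied
    st_orig, st_snap = os.stat(original), os.stat(snapshot)
    print(st_orig.st_ino == st_snap.st_ino, st_orig.st_nlink)  # True 2
```

Since both names are the same inode, appending one byte can't be stored as a delta; rsync has to write a complete replacement file into the next snapshot tree.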

I've been looking for an rsync-based tool with encryption, but really couldn't
find any. There is duplicity, which uses the rsync way of finding differences
and keeps a change log, but this is in fact normal full-snapshot & incremental
backups:
[http://duplicity.nongnu.org/](http://duplicity.nongnu.org/)

~~~
extra88
My understanding of what people looking for de-dup want is for their backup
store to contain zero files with identical contents. With rsnapshot, if you
rename a file, rename a parent directory of it, or move the file to another
directory, you will get another copy of that file in your backup.

I've used rsnapshot for about ten years, mainly to make remote backups of data
on SMB/AFP file servers, and something I had to actively discourage was staff
moving files/folders between higher-level folders as a method of project
management. One group in particular does video production, so a project might
change only a few bytes between yesterday and today, but because the project's
folder was moved to a different parent folder (e.g. Completed/), rsnapshot
would make a new complete backup of that project, which could be tens of gigs.
Filesystem snapshots like ZFS's avoid this problem, but I haven't worked with
how those are copied remotely.

------
oever
There are a few more similar projects, such as Restic (Go) and rdedup (Rust).

[https://github.com/restic/others](https://github.com/restic/others)

[https://github.com/dpc/rdedup](https://github.com/dpc/rdedup)

------
utternerd
I've used [http://backuppc.sourceforge.net/](http://backuppc.sourceforge.net/)
for going on 16 years now; it's great and still maintained, albeit by one guy.
It just works. Not sure how it stacks up to this.

~~~
gamedna
I have used backuppc for years as well, and it uses a similar strategy of
rsync/hardlinks to make point-in-time snapshots available to the users.

------
amelius
Note that ext3 and ext4 only support 64k hardlinks, so if you already use them
a lot, then snapshotting may make you run out of hardlinks even faster.

~~~
pwg
Yes, but that is not a global limit but a per-file limit. So you would need
one single file to have 64k hard links to it before you'd run into this issue.

------
aruggirello
I use Timeshift on my Ubuntu desktop which is basically the same, but wrapped
up in a nice GUI.

