
Rsync.net: ZFS Replication to the cloud is here and fast - wstrange
http://arstechnica.com/information-technology/2015/12/rsync-net-zfs-replication-to-the-cloud-is-finally-here-and-its-fast/
======
KirinDave
This is awesome, but it should also remind us of something that is easy to
forget. This is tangential to the thread, but relevant because of ZFS.

Note what they run ZFS on: "FreeBSD."

We tend to ignore it when shipping cloud services and mobile apps and websites, and
just focus on Linux based deploys. But FreeBSD is out there. It has all the
amenities you'd expect from a modern Linux and more including:

1\. Compatibility with nearly all Linux binaries.

2\. A more robust implementation of containerization (Jails), with a matching
implementation of a docker API.

3\. Proven OpenZFS support.

4\. A really robust framework for moving and standing up software.

5\. Proven performance.

In a way, tbh, it's like one of many tech stacks we've neglected for the
"mainstream" even though it has a proven track record and great features. ZFS
is also in that family, as is stuff like Mono and OCaml.

~~~
PhantomGremlin
_We tend to ignore it_

One thing that FreeBSD _doesn't_ have is a well-known public face. Every
software project (and company) needs one.

Linux is Torvalds. Of course there's a lot more to a distribution than the
kernel, but most people don't think that far.

OpenBSD is Theo. Love him or hate him.

But FreeBSD? A while ago Jordan Hubbard came to mind. But he was never "the"
guy, and he left a long time ago. I view it as being run now by a nameless,
faceless cabal.

Anyway, that's just my opinion on one reason why we tend to ignore it.

------
jlgaddis
ZFS is awesome. I never would have imagined I could ever get excited about a
filesystem, but I did. If you've never used it, I encourage everyone to check
it out. It's so completely "flexible" and you can do things that you never
would've thought of.

A while back, my personal mail server ran on FreeBSD 9.x on bare metal. I
wanted to rebuild the physical host but run everything in jails on a "clean"
10.x install instead. I created a new virtual machine on our ESXi cluster,
booted it with an mfsbsd image, and prep'd the hard disks. Back on the server,
I shut down all the services, took snapshots, and shipped them off to the VM
using ZFS send/receive. In short order, I had P2V'd my mail server and it
worked wonderfully. I left everything running in that VM for a few days while
I rebuilt the physical server and got all of my jails set up and then moved it
all back.

Allan Jude gave a cool talk at vBSDcon a few months ago entitled "Interesting
Things You Didn't Know You Could Do With ZFS" [0]. If you've never used ZFS or
you're wondering what it might have over your current filesystem of choice,
you might check it out.

[0]:
[https://www.youtube.com/watch?v=qXOZmDoy2Co](https://www.youtube.com/watch?v=qXOZmDoy2Co)

------
rsync
Obligatory:

rsync.net has always had a "HN readers discount" which is quite substantial.
Just email info@rsync.net and ask about it.

------
coldpie
I loved reading this article because it's just pure nerd joy. The guy is so
enthusiastic about rsync, measurement, and unix, just reading the article is
fun, even if I don't have a use for this particular tool.

------
swiley
Just a note to anyone new to rsync and friends: Rsync is a really fast version
of cp that also works over networks. Rsync does not go both ways /and/ handle
file deletion (you have to choose one or the other), so it can't replace
dropbox and syncthing for example.

~~~
LeoPanthera
If you do want something that goes both ways, unison is essentially a
bidirectional rsync.

~~~
moviuro
And syncthing ([http://syncthing.net](http://syncthing.net)) is a working
unison :)

(See [http://popho.be/#gift](http://popho.be/#gift) for some more details,
search unison)

~~~
rsync
Also note - rsync.net has supported Unison since 2007, although these days we
would recommend using attic or borg, which we now support.[1]

[1] As in, full, native support - not just using an sshfs mount...

~~~
LeoPanthera
Interesting. Both borg and attic are new to me. What are the differences
between them? They look very similar.

Edit: Never mind, I found a handy list of differences here:
[https://borgbackup.readthedocs.org/en/stable/#differences-
be...](https://borgbackup.readthedocs.org/en/stable/#differences-between-
attic-and-borg)

------
LeoPanthera
I'm an rsync.net customer, and I use ZFS locally, but I'm not using ZFS
replication because I can't control in what order the files in the filesystem
get backed up. (And I'm also unsure whether it's even possible to interrupt a
transfer and resume it later, if the underlying data has changed. rsync does
this no problem.)

My filesystems are pretty massive, but my internet connection is pretty
crappy. I currently am running a script I hacked together which rsyncs my
files up to rsync.net in (roughly) file size order - smallest first, largest
last. This way, changes to huge files don't spend hours/days backing up while
blocking the backup of smaller files.

I haven't yet found a better way to do this.
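
For anyone curious what a smallest-first upload might look like, here is a minimal Python sketch. It is not the script described above: the function names and the batching approach are assumptions made for illustration. As far as I know rsync orders its own transfer list internally, so the sketch only enforces ordering between batches (hence "roughly"), with each batch meant to be fed to rsync via `--files-from=-`.

```python
import os
from pathlib import Path


def files_smallest_first(root):
    """Return regular files under root, as paths relative to root, smallest first."""
    root = Path(root)
    sized = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            # Skip symlinks and anything that is not a regular file.
            if path.is_symlink() or not path.is_file():
                continue
            sized.append((path.stat().st_size, path.relative_to(root).as_posix()))
    # Sort by size, then by path so equal-sized files come out in a stable order.
    sized.sort()
    return [rel for _size, rel in sized]


def batches(items, size):
    """Split a list into consecutive chunks of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def rsync_batch_command(src, dest):
    """Build (but do not run) an rsync call that reads its file list from stdin."""
    return ["rsync", "-a", "--partial", "--files-from=-", str(src), dest]
```

Each batch could then be run with something like `subprocess.run(rsync_batch_command(src, dest), input="\n".join(batch), text=True, check=True)`, so a slow link finishes all the small files before it ever touches the huge ones.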

~~~
tjfontaine
OpenZFS has resumable send and receive -- so even if they don't have it today,
they could have it in the future.

[http://blog.delphix.com/matt/2015/03/25/resumable-zfs-
sendre...](http://blog.delphix.com/matt/2015/03/25/resumable-zfs-sendreceive/)

[https://www.illumos.org/issues/2605](https://www.illumos.org/issues/2605)

~~~
rincebrain
OZFS has it, but ZoL doesn't (yet), so I'd guess they don't yet.

[https://github.com/zfsonlinux/zfs/issues/3896](https://github.com/zfsonlinux/zfs/issues/3896)

e: Ah, they run ZFS on FreeBSD, but I don't think FBSD has it yet either...

------
nailer
Is 'rsync.net' actually Tridge [1], or just someone else trading on the name
of an Open Source project? There's no 'about us' on the site to allow us to
determine whether this is legit or not. And the stock photos look really
sketchy.

[1]
[https://en.wikipedia.org/wiki/Andrew_Tridgell](https://en.wikipedia.org/wiki/Andrew_Tridgell)

~~~
PhantomGremlin
_There's no 'about us' on the site_

I tweaked them about this a few weeks ago, and got a detailed answer here on
HN. But I guess they haven't gotten around to updating their site yet.

[https://news.ycombinator.com/item?id=10674029](https://news.ycombinator.com/item?id=10674029)

~~~
rsync
Sorry - getting to that ...

------
antman
OK so how do we set up a fresh Ubuntu install with ZFS?

~~~
moviuro
You use FreeBSD + a hypervisor on top?

Or you try ZOL (ZFS on Linux): good luck!

~~~
brunoqc
[https://wiki.ubuntu.com/Kernel/Reference/ZFS](https://wiki.ubuntu.com/Kernel/Reference/ZFS)

I'm not sure if it's usable now and if it will be good with 16.04.

~~~
rincebrain
I've been extremely astonished with how stable ZFS on Linux has been for me -
I remember when it was unusable.

------
ClashTheBunny
Combine this with [https://github.com/jollyjinx/ZFS-
TimeMachine](https://github.com/jollyjinx/ZFS-TimeMachine) and you are in a
much better place than Apple's terrible network Time Machine. It's cross-
platform (except Windows) and it's much more reliable than SMB or AppleTalk.

~~~
rsync
Just a note ...

ALL rsync.net accounts run on a ZFS infrastructure.

The 1TB+ accounts that you can send/recv to do _not_ have snapshots enabled,
since it's your zpool and you can set up whatever snapshots you want.

But, our _normal_ accounts[1] that you cannot ZFS send/recv to _do_ have ZFS
snapshots set up that we create and maintain for you, and they are _immutable_
[2].

So this means you can just do a plain old 1:1 rsync mirror to your rsync.net
account, forget all about incrementals or versions, and we will do whatever
snapshot schedule you like.[3]

So it's just like time machine, but more efficient (block level vs. hard
links) and you don't have to set it up or worry about it _and_ an attacker
that gets your rsync.net credentials cannot wipe out your historical backups,
since they are immutable here.

[1] As always, HN-readers discount - just email info@rsync.net

[2] even for our local root ...

[3] default, for free, is 7 daily, or 7 daily + 4 weekly for TB+ customers,
but you can set up any schedule you want (21 days, 8 weeks, 6 months, etc.)
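
To make the default schedule concrete, here is a rough Python sketch of which snapshot dates a "7 daily + 4 weekly" rotation would hold on a given day. This is only an illustration, not how rsync.net implements it: the function name and the choice of Sunday for weekly snapshots are assumptions.

```python
from datetime import date, timedelta


def retained_snapshots(today, daily=7, weekly=4, weekly_weekday=6):
    """Return the snapshot dates still held on `today`.

    weekly_weekday uses date.weekday() numbering (6 = Sunday), which is an
    assumption made for this sketch.
    """
    # The most recent `daily` days, including today.
    dailies = [today - timedelta(days=i) for i in range(daily)]
    # Step back to the most recent day that falls on the weekly weekday.
    last_weekly = today - timedelta(days=(today.weekday() - weekly_weekday) % 7)
    weeklies = [last_weekly - timedelta(weeks=i) for i in range(weekly)]
    # A day can be both a daily and a weekly; count it once.
    return sorted(set(dailies) | set(weeklies))
```

With these assumptions, a plain 1:1 mirror can be rolled back to any of the last seven days, and to one point per week for roughly a month.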

~~~
tekacs
Any comment on why normal accounts can't ZFS send/recv?

Is it because there isn't one filesystem per customer? Is that something that
could be done, feasibly? Would be curious to hear. :)

------
matt2000
Really fun to read article, but could someone give me a couple examples of
what kind of system you'd use this with? I guess I'm so used to S3 now that
I've forgotten what it's like to manage your own files.

~~~
seiji
ZFS supports (and excels at) two necessary things for a good file system: copy
on write snapshots and filesystem serialization. ZFS is designed to run one
filesystem _per user_, so ZFS is designed to scale crazy high with little
management overhead. It also makes it easy for snapshots to equal isolated
per-user or even per-application backups.

You can do neat things like: create a base filesystem, create a snapshot of it
(i.e. "freeze" the filesystem in time), send the snapshot to another system
(or over the network for S3 storage, etc), continue modifying files, create
another snapshot on top of your first one, now you can send/backup the new
snapshot which is only a _diff_ of the original snapshot.

Many other systems besides ZFS support snapshots/copy-on-write layers, _but_
they are mostly hacks and your performance will drop considerably during the
"snapshot existing" time. ZFS is different because the snapshot mechanism
doesn't kill your performance (unlike the 18 layer Linux file system
organization).

"Sending" a snapshot can either be directly over the network to another system
to ingest the snapshot (enabling you to stream a new filesystem/snapshot into
existence on a remote machine), or you can gzip the snapshot stream and send
it off to backup storage.

You can layer your snapshot diffs as much as you like. As long as you have all
the diffs back to the original file system, you can re-create any point-in-
time snapshot layer from your backup archive.
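
The snapshot / send / incremental-send sequence described above can be sketched as the commands it boils down to. This Python snippet only builds the command lines and does not run anything; the dataset, snapshot, and host names are made up for illustration, and real use would need error handling on both ends.

```python
import shlex


def snapshot_cmd(dataset, name):
    """Command that freezes `dataset` in time as dataset@name."""
    return ["zfs", "snapshot", f"{dataset}@{name}"]


def send_cmd(dataset, name, base=None):
    """Command that serializes a snapshot to stdout.

    With `base` set, only the difference between dataset@base and
    dataset@name is sent (an incremental stream).
    """
    cmd = ["zfs", "send"]
    if base is not None:
        cmd += ["-i", f"{dataset}@{base}"]
    cmd.append(f"{dataset}@{name}")
    return cmd


def receive_cmd(target):
    """Command that ingests a stream from stdin into `target`."""
    return ["zfs", "receive", target]


def replicate_pipeline(dataset, name, host, target, base=None):
    """Shell pipeline that streams a snapshot to another machine over ssh."""
    remote = ["ssh", host] + receive_cmd(target)
    return f"{shlex.join(send_cmd(dataset, name, base))} | {shlex.join(remote)}"
```

For example, `replicate_pipeline("tank/home", "tue", "backuphost", "backup/home", base="mon")` yields the incremental form, where only the blocks changed since `@mon` cross the wire. Swapping the ssh half for `gzip > file` gives the "send it off to backup storage" variant.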

Longer details with examples are all over, but a good overview point is:
[https://docs.oracle.com/cd/E18752_01/html/819-5461/gbchx.htm...](https://docs.oracle.com/cd/E18752_01/html/819-5461/gbchx.html)

Also, it's basically the only _sane_ way to run containers in production since
all the copy-on-write and snapshot sending isn't an 8 layer deep hack cake. In
Solaris, containers are aware of filesystems at the deepest levels and the
_networking_ is aware of containers too. All Linux-based container fads over
the past 5 years still haven't caught up to Solaris+ZFS+Zones+Crossbow from 6
to 10 years ago.

------
brunoqc
Is the data encrypted?

~~~
LeoPanthera
It runs over SSH, so yes.

~~~
brunoqc
What about on the server side? For the storage.

~~~
M4v3R
If you want server-side encryption, then Tarsnap is probably better for you.

~~~
brunoqc
Yes I use Tarsnap but an encrypted backup using file system snapshots would be
the best thing ever.

~~~
barkingcat
One way that you can do this is just to zfs replicate a volume to another
system that has an encrypted zfs mount. Probably not what you are looking for,
but voila - end to end snapshotted encryption - except it's not a "backup",
more like a hot failover volume.

------
gjem97
What other cloud providers do people use that support ZFS? Is ZFS on Linux a
viable option these days?

~~~
benjohnson
Hetzner.de works fine for my needs - they have a ~220 Euro storage server with
10 x 6TB drives that you can install FreeBSD 10.2 on.

With ZFS you can have about 35TB usable (ZFS filesystems degrade in speed at
80% full in my experience)

My cost is $.007 USD per GB.
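
The arithmetic behind that per-GB figure can be checked with a few lines of Python. The price and usable capacity are the approximate numbers quoted above; the exchange rate of 1.10 USD per EUR and the use of decimal units (1 TB = 1000 GB) are assumptions for this sketch.

```python
def cost_per_gb(price, usable_tb):
    """Price per usable GB, using decimal units (1 TB = 1000 GB)."""
    return price / (usable_tb * 1000)


raw_tb = 10 * 6                   # ten 6TB drives
usable_tb = 35                    # after redundancy and keeping the pool under ~80% full
eur_per_gb = cost_per_gb(220, usable_tb)
usd_per_gb = eur_per_gb * 1.10    # assumed exchange rate, not from the comment

print(f"usable fraction of raw: {usable_tb / raw_tb:.0%}")
print(f"EUR per GB: {eur_per_gb:.4f}")
print(f"USD per GB: {usd_per_gb:.4f}")
```

That works out to roughly 0.0063 EUR per usable GB, which lands near the quoted $.007 at the assumed rate.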

~~~
toomuchtodo
You may want to look into Backblaze's new storage offering B2. It's $0.005
USD/GB.

[https://www.backblaze.com/b2/cloud-storage-
pricing.html](https://www.backblaze.com/b2/cloud-storage-pricing.html)

