
Desktop Backup: Traditional, and Torrent-Like - shmichael
http://shmichael.com/2009/12/desktop-backup-traditional-and-torrent-like/
======
beza1e1
No mention of tarsnap? <http://www.tarsnap.com/>

~~~
shmichael
Is there any special feature of tarsnap that I'm overlooking, or is it just
another kid down the block?

~~~
cperciva
Tarsnap is unique in a few ways, but I don't know if you consider these
important enough to be worth mentioning. For example:

* Tarsnap was designed to be secure against even the most skilled attackers -- and was written by someone (myself) with non-trivial expertise in cryptography and computer security.

* Because Tarsnap is built around tar(1), it is heavily scriptable; for experienced users this makes it far more flexible than any other tool.

* Tarsnap is AFAIK the only backup system which works as a metered service -- pricing per byte of bandwidth and per byte-month of storage used, starting at a (very small) fraction of a cent per month. Where other services have fixed monthly fees, Tarsnap just looks at your usage and charges you accordingly.

* I'm not sure if Tarsnap's snapshotting model is unique, but it's certainly unusual; and once Tarsnap users follow my advice of "forget everything you know about incremental backups", they all tell me that it's far more intuitive than other approaches.
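
To make the metered model concrete, here is a minimal sketch of usage-based billing. The rates are made-up placeholders, not Tarsnap's actual prices, and `monthly_cost` is a hypothetical helper rather than anything Tarsnap exposes.

```python
from decimal import Decimal

# Hypothetical rates, chosen only for illustration (NOT Tarsnap's real prices).
STORAGE_RATE_PER_GB_MONTH = Decimal("0.25")  # dollars per GB-month stored
BANDWIDTH_RATE_PER_GB = Decimal("0.25")      # dollars per GB transferred
BYTES_PER_GB = Decimal(10) ** 9


def monthly_cost(stored_bytes: int, transferred_bytes: int) -> Decimal:
    """Charge purely for usage: there is no fixed monthly fee."""
    storage = Decimal(stored_bytes) / BYTES_PER_GB * STORAGE_RATE_PER_GB_MONTH
    bandwidth = Decimal(transferred_bytes) / BYTES_PER_GB * BANDWIDTH_RATE_PER_GB
    return storage + bandwidth


# 6 GB kept for one month, 1 GB uploaded during that month.
print(monthly_cost(6 * 10**9, 1 * 10**9))
```

With a scheme like this, a user who stores almost nothing pays almost nothing, which is the contrast with fixed monthly fees drawn above.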

~~~
shmichael
Thanks for the clarification. I've added tarsnap to the post.

~~~
cperciva
Thanks! Just one small correction: Tarsnap isn't Linux-only. It runs quite
happily on BSD, Linux, OS X, Solaris, Cygwin... basically anything UNIXy.

~~~
shmichael
Corrected.

------
lallysingh
So, what's wrong with a tape backup drive and a box (for the tapes)? My
dataset's 6 GB. I currently use ZFS incremental and full snapshots to generate
a single file per backup to save.

There are two issues I haven't seen addressed:

(1) No guarantee of privacy: all my data's on someone else's box. I haven't
seen any of them go to court to defend a person's data yet. And this isn't a
phone log or URL list, it's _everything_ _they_ _have_. Backup privacy is not
an area where you screw around.

(2) Upload bandwidth. Most network links available anywhere I've lived are
asymmetric, with a massive bias downlink-side. Uplink speeds are still
measured in 100s of kilobits/sec. $160/month bought me 1.5 Mb/s down and
768 kb/s up, with a static IP.
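
For a sense of scale, here is a back-of-the-envelope estimate of one full upload at that uplink speed. It assumes decimal units (6 GB = 6 * 10^9 bytes, 768 kb/s = 768,000 bits/s) and ignores protocol overhead and compression; `upload_hours` is a hypothetical helper.

```python
def upload_hours(size_bytes: float, uplink_bits_per_sec: float) -> float:
    """Hours needed to push size_bytes through a link of the given speed."""
    seconds = size_bytes * 8 / uplink_bits_per_sec
    return seconds / 3600


dataset = 6 * 10**9   # 6 GB, decimal gigabytes assumed
uplink = 768 * 10**3  # 768 kb/s, decimal kilobits assumed

print(f"{upload_hours(dataset, uplink):.1f} hours")  # prints "17.4 hours"
```

Only the initial full backup pays this cost; incremental snapshots would upload just the changed data.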

I'd rather run my dataset through gpg and write it out to tape.

~~~
shmichael
A tape has several downsides:

a. It is physically close to your original data's location.

b. It's a hassle to back up (remember to perform backup, insert tape, zfs,
label tape).

c. Redundancy and versioning come at extra cost.

d. Cannot be utilized for remote access to files.

As for the two issues you say haven't been addressed:

(1) The data is encrypted before uploading. In addition, it is sliced, so no
single machine should have access to the whole file, nor would it even know
what the other parts are. Moreover, requests for slices would be digitally
signed, so a different machine could not even request slices it does not
"own".

(2) Upload bandwidth is indeed an issue, but not a serious one. The general
argument goes that what works for torrents would work for backups in this
matter.
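
The slicing and signed-request idea in point (1) can be sketched roughly as follows. This is an illustration under stated assumptions, not the design of any actual system: the data is taken to be already encrypted, slices are identified by their SHA-256 digest, and an HMAC over a shared secret stands in for a real public-key signature (a real peer-to-peer design would verify against the owner's public key instead). All function names are hypothetical.

```python
import hashlib
import hmac

SLICE_SIZE = 4  # bytes; tiny on purpose so the example is easy to follow


def slice_blob(ciphertext: bytes, size: int = SLICE_SIZE) -> dict:
    """Split already-encrypted data into slices keyed by their SHA-256 digest."""
    slices = {}
    for offset in range(0, len(ciphertext), size):
        chunk = ciphertext[offset:offset + size]
        slices[hashlib.sha256(chunk).hexdigest()] = chunk
    return slices


def sign_request(owner_key: bytes, slice_id: str) -> str:
    """Owner side: authenticate a request for one slice (HMAC as a stand-in)."""
    return hmac.new(owner_key, slice_id.encode(), hashlib.sha256).hexdigest()


def serve_slice(store: dict, owner_key: bytes, slice_id: str, signature: str):
    """Storage side: return the slice only if the request signature verifies."""
    expected = sign_request(owner_key, slice_id)
    if not hmac.compare_digest(expected, signature):
        return None  # requester does not "own" this slice
    return store[slice_id]


store = slice_blob(b"encrypted-bytes!")
slice_id = next(iter(store))
print(serve_slice(store, b"alice", slice_id, sign_request(b"alice", slice_id)))
print(serve_slice(store, b"alice", slice_id, sign_request(b"mallory", slice_id)))
```

In a real deployment each slice would live on a different machine, so no single host would hold the whole `store` as it does here.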

~~~
weaksauce
I don't know if you are adequately addressing issue number 1. He is talking
about the rule of law and a court ordering the data to be disclosed, whereas
you are talking about the technical issue of one person gaining access to
another person's files. Since this is a backup solution for total or partial
failures, one would need to be able to bootstrap a machine from the backed-up
information. If all the information required to recreate a machine were on the
machine that failed, this backup solution would be an expensive waste of time.
By that logic there must be a centralized repository of information on the
server side, so the operators could in theory be forced to divulge it if there
were a court order. I am not a lawyer, though.

~~~
shmichael
I can't comment on the legal issues of this, since I too am not a lawyer, and
don't live in the US.

What I can say is that your key - whether automatically generated or your
favorite pass phrase - is known only to you, and without it all of your data
is lost.

Thus, you could either store it on the server (so you'd never lose the data)
or keep it to yourself (and never risk the data being exposed).
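
As an illustration of the pass-phrase case, a key can be derived on the client so that only the pass phrase holder can reproduce it. The sketch below uses PBKDF2 from Python's standard library; the choice of PBKDF2, the iteration count, and the `derive_key` helper are assumptions for the example, not something specified above.

```python
import hashlib
import os


def derive_key(passphrase: str, salt: bytes, iterations: int = 200_000) -> bytes:
    """Derive a 32-byte key from a pass phrase (PBKDF2-HMAC-SHA256)."""
    return hashlib.pbkdf2_hmac(
        "sha256", passphrase.encode("utf-8"), salt, iterations, dklen=32
    )


salt = os.urandom(16)  # not secret; can be stored alongside the backup
key = derive_key("my favorite pass phrase", salt)

# The same pass phrase and salt always give the same key back.
print(key == derive_key("my favorite pass phrase", salt))
```

The salt can be kept with the backup; the pass phrase cannot be recovered from it, so forgetting the pass phrase means losing the data.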

