

A Tarsnap-Alike for Private Storage - rlpb
http://www.synctus.com/blog/2011/03/a-tarsnap-alike-for-private-storage

======
tropin
As always when we mention Tarsnap, I can't help but be amazed that Mr Percival,
having all the infrastructure in place just short of a GUI, passes on the gold
mine of the millions of Windows users who couldn't care less about the mere
existence of a command line, but would still pay for a backup tool as good as
Tarsnap.

~~~
thaumaturgy
Colin Percival is another classic example of someone being brilliant in their
field, but a mediocre businessperson. This happens a lot; probably the most
common examples are great chefs that open restaurants and bakeries only to
find that actually running such things requires a set of skills that they
don't really have, or aren't interested in acquiring.

(edit: I say that even though I don't even possess the aptitude necessary to
fully respect his skills; he's a genius, and that's not a word I ever use
lightly. I just hate to see a great work of engineering, like Tarsnap,
languish in relative anonymity because it's poorly handled as a business.)

~~~
JacobAldridge
I get to work every day helping some of the many brilliant individuals you
discuss, who are mediocre business people. It's surprising how many don't
recognise that business skills are just as important as their technical skills
(though, for the most part, easier to learn).

It is super exciting sometimes running my business, helping transfer some
of those business skills that change their lives and those of their customers.

And I can't / don't speak about Colin in this comment.

------
elehack
Interesting. I've been using rsnapshot for a while for a similar purpose,
although its de-duplication is limited to keeping unmodified files as hard
links.

It'd be nice to see a brief comparison of ddar, rsnapshot, and other things
that might fit that role.

~~~
pqs
rdiff-backup also.

~~~
rlpb
I attempted a comparison with rdiff-backup here:
<http://www.synctus.com/ddar/rdiff-backup.html>

Some others are briefly mentioned under "Alternatives", including a comparison
with Tarsnap.

I only have direct experience of Tarsnap and duplicity. Additions/corrections
welcome.

------
thaumaturgy
We've been using BackupPC (<http://backuppc.sourceforge.net/>) to accomplish
something similar for a long time, and have been amazed at how it's able to
store huge amounts of backups on relatively small amounts of disk space. It's
highly configurable and has a decent web interface.

Sometimes, though, it can be challenging to get it working, and it has a few
other issues (e.g., moving backups for systems from one pool to another) that
can make it difficult in the long-run for some use-cases.

------
jayroh
Because I'm guessing others might be wondering the same thing, I'll go
ahead and ask -

How is this different from something like rsync?

~~~
mdasen
I think it's easiest to explain with an example.

So, every day at 10pm, I want to make a backup of my server to another server.
I ddar or rsync the files to another server. Monday at 10pm, I do it; Tuesday
at 10pm, I do it; Wednesday at 10pm, yep, did it. Now it's Thursday and for
whatever reason, I have to restore the system to the state that it was in on
Monday. With plain rsync, Monday's files have been overwritten by later runs.

ddar and Tarsnap are useful because they allow you to have a full backup of
each of those days, rather than just the latest. Likewise, they package it
better than the old "do a full backup and then do incrementals on top of it"
way so that you don't have to restore the original and then replay all the
incrementals. Now, you could just store full backups each night, but that
would take an incredible amount of space. ddar and Tarsnap allow you to have
the convenience of full-backup while having the data de-duplicated so that it
doesn't take up too much disk space.

Rsync is more about saving bandwidth while transferring large amounts of
files.
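
To make the de-duplication part concrete, here is a toy sketch in Python. It is
only an illustration of the idea, not how ddar or Tarsnap actually store data:
the `ChunkStore` class, the 4-byte chunk size, and the snapshot names are all
made up for the example.

```python
import hashlib

CHUNK_SIZE = 4  # tiny on purpose so the example is easy to follow


class ChunkStore:
    """Toy content-addressed store: each unique chunk is kept only once."""

    def __init__(self):
        self.chunks = {}     # SHA-256 hex digest -> chunk bytes
        self.snapshots = {}  # snapshot name -> ordered list of digests

    def backup(self, name, data):
        digests = []
        for i in range(0, len(data), CHUNK_SIZE):
            chunk = data[i:i + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(digest, chunk)  # store only if not seen yet
            digests.append(digest)
        self.snapshots[name] = digests

    def restore(self, name):
        # Any snapshot is restored directly; no incrementals to replay
        return b"".join(self.chunks[d] for d in self.snapshots[name])

    def stored_bytes(self):
        return sum(len(c) for c in self.chunks.values())


store = ChunkStore()
monday = b"AAAABBBBCCCCDDDD"
tuesday = b"AAAABBBBCCCCEEEE"    # last block changed
wednesday = b"AAAAXXXXCCCCEEEE"  # second block changed
store.backup("monday", monday)
store.backup("tuesday", tuesday)
store.backup("wednesday", wednesday)
```

Each of the three nights can be restored in full with `store.restore(...)`, yet
`store.stored_bytes()` is 24 rather than the 48 bytes that three separate full
copies would need.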

~~~
jonhohle
Are you aware of the --link-dest option for rsync? It seems to do the same
thing. I've been using this (via dirvish) for years for doing full nightly,
deduped backups.
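
For anyone who hasn't seen the hard-link approach, here is a rough Python
sketch of the idea. It is not rsync's implementation: the `snapshot` function
and the directory names are invented for illustration, and it decides that a
file is "unchanged" with a byte-for-byte compare, whereas rsync has its own
rules for that.

```python
import filecmp
import os
import shutil
import tempfile


def snapshot(source, previous, target):
    """Copy source into target, hard-linking files unchanged since previous."""
    for root, _dirs, files in os.walk(source):
        rel = os.path.relpath(root, source)
        os.makedirs(os.path.normpath(os.path.join(target, rel)), exist_ok=True)
        for name in files:
            src = os.path.join(root, name)
            dst = os.path.normpath(os.path.join(target, rel, name))
            old = os.path.join(previous, rel, name) if previous else None
            if old and os.path.isfile(old) and filecmp.cmp(src, old, shallow=False):
                os.link(old, dst)       # unchanged: both snapshots share one inode
            else:
                shutil.copy2(src, dst)  # new or modified: store a full copy


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "src")
    os.makedirs(src)
    for name, text in [("a.txt", "same"), ("b.txt", "v1")]:
        with open(os.path.join(src, name), "w") as f:
            f.write(text)

    monday = os.path.join(tmp, "monday")
    tuesday = os.path.join(tmp, "tuesday")
    snapshot(src, None, monday)

    with open(os.path.join(src, "b.txt"), "w") as f:
        f.write("v2")  # modify one file between the two runs
    snapshot(src, monday, tuesday)

    a_shared = os.path.samefile(os.path.join(monday, "a.txt"),
                                os.path.join(tuesday, "a.txt"))
    b_shared = os.path.samefile(os.path.join(monday, "b.txt"),
                                os.path.join(tuesday, "b.txt"))
    with open(os.path.join(monday, "b.txt")) as f:
        monday_b = f.read()
```

Here `a_shared` is true (one copy on disk, visible in both snapshots) and
`b_shared` is false: the modified file is stored again in full, while Monday's
version stays intact.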

~~~
cperciva
Hardlinks are only useful if you have files which are unmodified. Tarsnap (and
I assume ddar) will take advantage when _part_ of a file is unmodified.
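
A small Python sketch of that difference, under loudly stated assumptions: the
helper names are made up, and cutting chunks at newlines is a stand-in for the
rolling-hash boundary selection that real chunking de-duplicators typically
use. It is not a description of Tarsnap's or ddar's actual algorithm.

```python
import hashlib


def digest(data):
    return hashlib.sha256(data).hexdigest()


def fixed_chunks(data, size=64):
    # Boundaries depend on byte offsets, so an insertion shifts every chunk
    return [data[i:i + size] for i in range(0, len(data), size)]


def content_chunks(data, delimiter=b"\n"):
    # Boundaries depend on content: cut right after each delimiter
    chunks, start = [], 0
    while start < len(data):
        end = data.find(delimiter, start)
        end = len(data) if end == -1 else end + len(delimiter)
        chunks.append(data[start:end])
        start = end
    return chunks


def new_chunk_count(old, new, chunker):
    """How many chunks of `new` are not already known from `old`."""
    known = {digest(c) for c in chunker(old)}
    return len({digest(c) for c in chunker(new)} - known)


old = b"".join(b"line %03d\n" % i for i in range(100))
new = b"new\n" + old  # same file with four bytes inserted at the front

whole_file_reused = digest(old) == digest(new)
fixed_new = new_chunk_count(old, new, fixed_chunks)
content_new = new_chunk_count(old, new, content_chunks)
```

With whole-file comparison nothing is reused (`whole_file_reused` is false, so
a hard link is not possible). With content-defined boundaries `content_new` is
1: only the inserted piece has to be stored, and the unmodified part of the
file is shared with the earlier backup.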

------
RexRollman
Has there ever been a filesystem that does de-duplication on the fly? Or would
something like that be too slow for real world use?

~~~
getsat
ZFS does dedup.

~~~
wazoox
And so do LessFS, Opendedup and some others.

~~~
RexRollman
Thanks for the info guys.

------
res0nat0r
This looks very cool. I'm trying to use the examples, I've created a ddar
directory with its db file and sha objects, but when I try a 'ddar tf
/path/to/ddar/directory' I only seem to get either a timestamp or the -N name
I've passed. Is this right? How do I see the filenames I've archived?

~~~
rlpb
As far as I can tell you've done it right so far.

ddar is generic and just stores whatever you put in. To read the data back
out, you need the same tools you used to format it on the way in.

If you've stored files using tar, then you want `ddar xf
/path/to/ddar/directory|tar t` to see the files. Or if you've also gzipped
them, then `ddar xf /path/to/ddar/directory|tar zt`.
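
If you'd rather do the listing from a script, the same idea can be sketched in
Python. The `list_tar_members` helper is hypothetical, and the in-memory
archive below only stands in for the byte stream that `ddar xf` writes to
standard output.

```python
import io
import tarfile


def list_tar_members(stream):
    """Return the member names found in a tar byte stream."""
    # Mode "r|*" reads sequentially without seeking and detects compression
    with tarfile.open(fileobj=stream, mode="r|*") as archive:
        return [member.name for member in archive]


# Build a small gzipped tar in memory to stand in for the extracted stream
buffer = io.BytesIO()
with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
    for name, body in [("etc/hosts", b"127.0.0.1 localhost\n"),
                       ("notes.txt", b"hi\n")]:
        info = tarfile.TarInfo(name)
        info.size = len(body)
        archive.addfile(info, io.BytesIO(body))
buffer.seek(0)

names = list_tar_members(buffer)
```

Because the stream mode never seeks, the same function should work on a pipe,
for example the standard output of a `ddar xf /path/to/ddar/directory`
subprocess.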

Does this help?

~~~
res0nat0r
Ahh that is exactly it. I am just not using the correct syntax. Thanks! This
looks to be very cool, looking forward to trying it :]

------
weaksauce
Thanks for putting this up, it looks really interesting! Is there a git repo
somewhere that we can look at or are we stuck with the tarball?

~~~
rlpb
Just the tarball for now, though it will probably make its way to GitHub soon.

------
gte910h
"It just so happens that I’ve been working on de-duplication for Synctus"

OP should clarify that Synctus has no issue with this project, prove he has
rights to release, etc, and stick that in the license.

This feels like a copyright morass (although probably isn't).

~~~
rlpb
I'm sorry this isn't clear. I wrote Synctus, am owner/director of the company
that owns it, and have released ddar under the GPL having authorised myself
through my company to do this.

~~~
gte910h
Right right, I actually expected that to be the case, but you won't
necessarily always be the owner of Synctus, etc.

You should probably do the short amount of paperwork required:
<http://www.gnu.org/licenses/old-licenses/gpl-2.0-faq.html>

"If you think that the employer or school might have a claim, you can resolve
the problem clearly by getting a copyright disclaimer signed by a suitably
authorized officer of the company or school. (Your immediate boss or a
professor is usually NOT authorized to sign such a disclaimer.)"

Stick everyone's mind at ease.

Neat project (which is why I care enough to get the ducks in a row).

~~~
rlpb
A disclaimer doesn't quite make sense for me. Let me explain: I fully own True
Blue Logic Ltd and am sole director. Both Synctus and ddar are projects owned
by this company. Copyright on both Synctus and ddar is held by this company.
Synctus is not open source. This ddar release is open source (GNU GPL 3) as of
today.

So my company retains copyright but licenses ddar to the public under the
terms of the GNU GPL 3. Thus there is no disclaimer to sign, since my company
still claims copyright.

Of course, now that ddar is released under the GPL, it will be forever (in the
same manner as any other open source project).

None of this affects Synctus, which is not open source. If anyone contributes
code to ddar, those contributions would not be permitted to be used in Synctus
without a further licence from the contributor. I've considered this carefully
and decided that this won't be a problem for me.

~~~
gte910h
>So my company retains copyright but licenses ddar to the public under the
terms of the GNU GPL 3. Thus there is no disclaimer to sign, since my company
still claims copyright.

A disclaimer isn't needed now that it's clear in the COPYING file that TBL Ltd
(the copyright holder) is the one publishing it. Thanks for clearing the air.

------
ch0wn
Oh wow, I started using tarsnap yesterday night!

~~~
JacobAldridge
And they've already launched a major addition. Imagine how incredible the
company will be by a week from next Thursday!

------
RexRollman
Will this compile on Cygwin like Tarsnap does?

~~~
rlpb
It should only need a couple of minor tweaks. I am looking into it (as well as
OS X support) and will post on my blog when it's ready.

~~~
RexRollman
Thanks for the information. I have a combination of Linux and Windows machines
at home and it would be handy to be able to use the same tool on both.

