Important differences between bup and multitape: I used C, not Python; I used a more sophisticated chunking algorithm; I used tape names rather than just numbering them.
Hopefully you don't view bup as competition for tarsnap. There are a lot of situations, such as Xen backups, where I really want to dump things to a local backup machine, then off to tape. I'm not interested in any form of hosted service, but tools that make binary diffs efficient and easy would certainly be welcome.
It might also be interesting to expand bup (or multitape/multitar, if it is public), to use the method described in your thesis paper- http://www.daemonology.net/bsdiff/
I apologize for potentially touching on a delicate subject. I certainly wouldn't want to come across as advising someone to use your theories to steal food out of your mouth, but local vs. remote backups seem to be sufficiently different markets.
No; I never really considered it to be useful except as a step towards Tarsnap.
Hopefully you don't view bup as competition for tarsnap
Not really, no. Of course, the author could follow the same path as I took, of integrating this with tar code, and end up producing a competitor to Tarsnap.
It might also be interesting to expand bup (or multitape/multitar, if it is public), to use the method described in your thesis paper- http://www.daemonology.net/bsdiff/*
That doesn't really work. Binary diffs are about comparing old and new files to produce a small patch; snapshotting compares the new file to a list of parts of the old file. It's a tradeoff between needing more local state (with binary diffs you need to have the old file to compare against) and having larger deltas (with snapshotting, you have a new chunk even if only part of it changed).
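To make the snapshotting side of that tradeoff concrete, here is a minimal Python sketch. It assumes a toy rolling-sum boundary rule and an in-memory dict as the chunk store; the names `chunks`, `snapshot`, and `restore` are made up for illustration, and this is not the algorithm bup, multitape, or Tarsnap actually uses.

```python
import hashlib

WINDOW = 16   # bytes in the rolling window (illustrative value)
MASK = 0x3F   # cut when the low 6 bits of the window sum are all ones

def chunks(data: bytes):
    """Split data at content-defined boundaries using a toy rolling sum."""
    start = 0
    rolling = 0
    for i, byte in enumerate(data):
        rolling += byte
        if i >= WINDOW:
            rolling -= data[i - WINDOW]   # drop the byte that left the window
        # Cut only once the chunk is at least WINDOW bytes long.
        if i + 1 - start >= WINDOW and (rolling & MASK) == MASK:
            yield data[start:i + 1]
            start = i + 1
    if start < len(data):
        yield data[start:]                # trailing partial chunk

def snapshot(data: bytes, store: dict) -> list:
    """Add unseen chunks to `store`; return the list of chunk hashes."""
    recipe = []
    for chunk in chunks(data):
        digest = hashlib.sha256(chunk).hexdigest()
        store.setdefault(digest, chunk)   # only new chunks take up space
        recipe.append(digest)
    return recipe

def restore(recipe: list, store: dict) -> bytes:
    """Rebuild a file from its list of chunk hashes."""
    return b"".join(store[digest] for digest in recipe)
```

Only the list of hashes has to be kept locally, not the old file. The cost is the one described above: a one-byte edit still stores the whole chunk that contains it.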
That said, my experience with bsdiff was certainly useful in terms of shaping how I think about efficient deltas and compression, so even though none of the ideas translate directly it definitely helped me in writing Tarsnap.
Test case: The last word of this sentence -- sans any trailing punctuation -- has the markup
Err... nope: It looks like it's only when wrapping a URL in the markup http://www.google.com/* a̶n̶d̶/̶o̶r̶ ̶w̶h̶e̶n̶ ̶t̶h̶a̶t̶ ̶U̶R̶L̶ ̶i̶s̶ ̶a̶t̶ ̶t̶h̶e̶ ̶e̶n̶d̶ ̶o̶f̶ ̶a̶ ̶p̶a̶r̶a̶g̶r̶a̶p̶h̶ ̶̶h̶t̶t̶p̶:̶/̶/̶w̶w̶w̶.̶g̶o̶o̶g̶l̶e̶.̶c̶o̶m̶/̶̶
Looks like I need to break some continuing italicization now.
Trying the same sort of markup but also wrapping a trailing space produces http://www.google.com/ , which is better.
Where ❄ represents an asterisk.
- Applications could register with the back-up utility, so bup (for example) knows to get a hotcopy from the svn repository and a dump from the database.
- Applications could be told to dump to specific locations on the file system on a regular basis.
- Unix magic could be used, so reading from certain parts of the file system would trigger a dump from the appropriate applications. I'm not quite sure if this is possible (I'm a Unix weenie).
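The first option (applications registering with the backup utility) could look roughly like the sketch below. Everything here is an assumption for illustration: bup has no such registry as far as this thread says, and `REGISTRY`, `register`, and `dump_all` are invented names.

```python
import subprocess
from pathlib import Path

# Hypothetical registry: application name -> command that writes a
# consistent dump to stdout.
REGISTRY = {}

def register(name, command):
    """Record how to get a dump out of an application."""
    REGISTRY[name] = list(command)

def dump_all(dest_dir, run=subprocess.run):
    """Run every registered command, saving stdout to dest_dir/<name>.dump."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    written = []
    for name, command in sorted(REGISTRY.items()):
        target = dest / f"{name}.dump"
        with open(target, "wb") as out:
            # check=True makes a failed dump raise instead of being backed up.
            run(command, stdout=out, check=True)
        written.append(target)
    return written

# Example registrations (repository path and database name are made up):
# register("svn", ["svnadmin", "dump", "/srv/svn/repo"])
# register("db", ["pg_dump", "mydb"])
```

The backup tool would call `dump_all` first and then back up the resulting directory like any other set of files.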
I don't care which is best (they would all work). The real solution would have the following features:
- Automated nags (SVN-style) about dirty-looking locations.
- A white-list of locations to suppress nags on (the same way SVN can be set to ignore the /bin directory, and the .pyc files).
- A way to resolve the nags (i.e. telling the backup server what commands to run in order to back up certain applications).
- A file format for backup hints (left in a hidden file called .bup in the program's main directory), so applications could automatically give hints to the backup program on how to get them to dump.
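A rough sketch of how the hint file, the nags, and the white-list could fit together. The INI layout, the `[backup]` section, and the `dump_command` key are all assumptions made up for this example; nothing like this exists in bup as described here.

```python
import configparser
import fnmatch
from pathlib import Path

HINT_FILE = ".bup"  # hidden hint file proposed above; the INI layout is assumed

def read_hint(app_dir):
    """Return the dump command from app_dir/.bup, or None if there is none."""
    path = Path(app_dir) / HINT_FILE
    if not path.is_file():
        return None
    # interpolation=None so a '%' in a shell command is taken literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(path)
    return parser.get("backup", "dump_command", fallback=None)

def nags(app_dirs, ignore_patterns=()):
    """List directories that have no hint and are not white-listed."""
    pending = []
    for app_dir in app_dirs:
        if any(fnmatch.fnmatch(str(app_dir), pat) for pat in ignore_patterns):
            continue                      # white-listed: never nag
        if read_hint(app_dir) is None:
            pending.append(str(app_dir))  # no hint yet: nag about it
    return pending
```

A hint file under this assumed layout would be two lines: a `[backup]` header followed by `dump_command = svnadmin dump /srv/svn/repo` (path made up). Resolving a nag then just means writing that file or adding a pattern to the white-list.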
A nice GUI that auto-suggests back-up commands (with shell integration like TortoiseSVN) would be cool, but not essential.
I've tried to use "applications" consistently in the post. It could mean a database, a repo, a website server, or anything, as long as the "application" has some sane way to be backed up.
And yes, I do know that talk is cheap.
I generally keep complete copies from the past, since there's no easy way to say "Use this 20G image file as the base, then store the changes in this .iterativebkp file."
bup looks like a nice way to do that, but it'll need to mature a bit more (pruning backups, for example) before I could deploy it, even as a test system.
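For what it's worth, the "base image plus changes" idea can be sketched in a few lines of Python, assuming fixed-size blocks and regular files (or `io.BytesIO` objects) opened in binary mode. The function names and the in-memory delta format are invented for this example; this is not how bup stores data.

```python
BLOCK = 4096  # illustrative block size

def make_delta(base, new, block=BLOCK):
    """Compare two binary file objects block by block.

    Returns (new_length, {block_index: changed_block}).
    """
    changed = {}
    index = 0
    length = 0
    while True:
        new_block = new.read(block)
        if not new_block:
            break
        base_block = base.read(block)     # b"" once the base runs out
        if new_block != base_block:
            changed[index] = new_block
        length += len(new_block)
        index += 1
    return length, changed

def apply_delta(base, delta, block=BLOCK):
    """Rebuild the new image from the base file object and a delta.

    Builds the result in memory for brevity; a real 20G image would be
    streamed to a file instead.
    """
    length, changed = delta
    out = bytearray()
    index = 0
    while len(out) < length:
        base_block = base.read(block)
        out += changed.get(index, base_block)
        index += 1
    return bytes(out[:length])
```

This only helps when the image changes in place, which is typical for disk images. An insertion shifts every later block and makes them all look changed.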