Sustaining git-annex development (joeyh.name)
187 points by zrail on July 15, 2013 | 34 comments

We (rsync.net) are supporting git-annex by offering all git-annex users a heavily discounted rsync.net account.

This was announced a few days ago, but not on HN in any way:


We've been explicitly supporting git-annex on our platform since our friend Jason Scott first showed it to us[1] and we will be contributing to the new campaign.

[1] http://ascii.textfiles.com/archives/3625

That's nice :)

And just so people know without having to check the site:

> git-annex users may sign up for our full featured offsite filesystem at a rate of 10 cents per GB, per month. An annual payment is required, and the minimum account size is 50 GB. There are no usage/bandwidth charges, no signup fee, and no contract to sign.
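As a rough check of what the quoted terms imply, here is a minimal sketch of the arithmetic. The rate and the 50 GB minimum come from the quote above; the function name and the choice to reject sizes below the minimum are assumptions made for illustration, not rsync.net's billing logic.

```python
from decimal import Decimal

# Figures taken from the quoted offer.
RATE_PER_GB_MONTH = Decimal("0.10")  # 10 cents per GB, per month
MIN_ACCOUNT_GB = 50                  # minimum account size

def annual_cost(gb: int) -> Decimal:
    """Yearly price in dollars for an account of `gb` gigabytes.

    Hypothetical helper: sizes under the stated minimum are rejected,
    since the quote does not say how they would be handled.
    """
    if gb < MIN_ACCOUNT_GB:
        raise ValueError(f"minimum account size is {MIN_ACCOUNT_GB} GB")
    return RATE_PER_GB_MONTH * gb * 12  # annual payment covers 12 months

if __name__ == "__main__":
    print(annual_cost(50))   # smallest account
    print(annual_cost(100))
```

So the smallest account works out to $60 for the year, paid up front.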

Wow, that looks great. With the discount it matches S3's storage pricing, but without bandwidth charges, with a more convenient interface (an ssh-reachable regular filesystem), and full-service support. Plus an interesting attempt at a "warrant canary": http://www.rsync.net/resources/notices/canary.txt

That canary thing is an interesting trick, previous discussion here: https://news.ycombinator.com/item?id=702247. (Four years ago!)

We've been operating the warrant canary since early 2006.

In the interests of full transparency, I've been looking for alternatives to rsync.net. I haven't found any - although tarsnap comes very close.

I have an existing account, how can I move to this?


For those of you wondering whether this is worth supporting, the answer is Yes.

I work with a lot of smart engineers, and periodically ask them what they do for personal backup of images, video, etc. I don't think I've yet talked to someone who was satisfied with their system.

Git annex (with the assistant) brings me closer to my ideal system than I've been before: a drive at home, one in my desk drawer at work for fire-proof backups, easy automatic shuttling of data between them using my phone or USB sticks, and pluggable cloud storage if I decide I want it.

Oh, and a friendly, responsive and excellent developer behind it, developing software in a cabin, in the woods. :-)

rdiff-backup to coalesce several data sources (hg, databases, email, photos..) on my desktop workstation into one local copy with retention / reverse-incremental consolidation,

rsync to take it to a second box in the house,

rsyncrypto to take it offsite to two separate cheap lowendbox VPSes (was originally rsync'ing a truecrypt container)

I might rearrange a couple of these steps to move the retention stage slightly further away from my main workstation, to better protect against the kind of malware that encrypts all your files and extorts a ransom

Time Machine to a local drive for immediate restores. Backblaze for disaster recovery.

Time Machine could easily be replaced by cron and tar or some other local backup option. Other backup providers are easily found.

This is not to say anything about git-annex's usefulness or the value of supporting the developer. It just surprises me when engineers are still struggling with backups. I can kind of understand "regular folk", but that's not really the git-annex market, either.
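For the "cron and tar" idea above, a minimal sketch using Python's standard `tarfile` module in place of the `tar` command. The function name, paths and archive naming scheme are made up for illustration. Note that this writes a full copy on every run, so it is not an incremental backup the way Time Machine is.

```python
import tarfile
import time
from pathlib import Path

def make_backup(source: Path, dest_dir: Path) -> Path:
    """Write a timestamped .tar.gz of `source` into `dest_dir`.

    Hypothetical helper: a full (non-incremental) archive per run.
    """
    dest_dir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    archive = dest_dir / f"{source.name}-{stamp}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        # arcname keeps paths inside the archive relative to the source folder
        tar.add(source, arcname=source.name)
    return archive

if __name__ == "__main__":
    # Example paths only; adjust to your own layout.
    print(make_backup(Path.home() / "Documents", Path("/tmp/backups")))
```

Scheduling it is then a matter of a crontab entry that runs the script, with `dest_dir` pointing at a different drive than the source.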

As I understand it, HFS+ can silently corrupt your data; so, Time Machine could potentially propagate that "corrupt" data all the way through your backup disks (and possibly wipe out older, "correct" data!)

We need integrity checking both at file-system level and at backup-solution level. If I'm not mistaken, with Time Machine on an HFS+ volume, we don't have the former, and the latter is file-based and not block-based (making it not very efficient, space-wise).

> I can kind of understand "regular folk", but that's not really the git-annex market, either.

I haven't used it yet, but the git-annex assistant is trying to target that niche, with a goal of an easy-as-Dropbox user experience that hides the technical aspects of git-annex under the hood: http://git-annex.branchable.com/assistant/

Can you use Time Machine to shuttle data via a USB stick? How well does it work when you have more than two drives that you want to keep mostly synchronized?

Time Machine does the one thing it does really well, which is incremental, local (or network-local) backups and easy restores.

It's not for snapshotting or mirroring. I used to use CarbonCopyCloner to clone my boot drive to an external USB drive every morning at ~4am, so if the internal drive on my iMac died I had an immediate bootable replacement.

(And when the internal drive DID die, I just rebooted the machine and took nearly a year to get around to replacing the internal drive!)

The best part of git-annex assistant is that the developer managed to deliver, or even over-deliver on, what he was pitching in the Kickstarter campaign. There wasn't a single week in the whole year where he didn't do some real work on the project. You can check his development blog [1] for details.

[1] http://git-annex.branchable.com/design/assistant/blog/

    "Git-annex now auto-syncing photos from my android phone 
    to a Tor hidden SSH service I control (via 
    @guardianproject's Orbot) #prismbreak"
Interesting. I'd love to see how that is all set up. I'm wondering if that can be done with a Drobo. If not, what is required to get a mini-server set up that connects to the Drobo over iSCSI?

I donated and I don't often donate (though I really should). It's not just how awesome the software is that compels me but his dedication to his work and his frugal lifestyle that leaves me wanting to support this guy with all his projects.

His personal finances are really impressive. Selling his full-time dedication for 12k/year...

His arrangements are interesting, to say the least, see http://joey.hess.usesthis.com/ . The whole thing is worth a read, but here's a sample:

> So the whole house runs on 12 volt DC power to avoid the overhead of an inverter; my laptop is powered through a succession of cheap vehicle power adapters, and my home server runs on 5 volt power provided by a USB adapter.

There are also his notes to a caretaker: http://joeyh.name/blog/entry/notes_for_a_caretaker/

I don't recall where I read about it, but my understanding is the developer lives a very frugal lifestyle in an off-grid environment, something akin to homesteading but with an internet connection. His costs are as minimal as he can make them, and in turn he can both spend more time on his work and be fully funded by campaigns like this.

I think he was also involved with the Debian GNU/Linux installer.

> I think he was also involved with the Debian GNU/Linux installer.

Yes he was (might still be, too). That's where I'd first heard of Joey.

Not that it’s his case, but $12k/year is quite a reasonable amount of money in a lot of parts of the world. As an example, here in the Czech Republic you can live a decent life for that amount with all the usual comforts.

How does git-annex compare to bittorrent-sync?

(I'm a big fan of Joey's other work, so I bet this is top notch, there just seems to be more momentum behind bittorrent-sync.)

Use cases overlap partially, but are generally quite distinct.

I think they could potentially complement one another, especially if bittorrent-sync used an open protocol, so I could adapt one of the Haskell bittorrent clients (which seem to make excellent use of Haskell's concurrency from what I've read) to use it.

I like git-annex better, because it doesn't automatically keep copies of the files in all the directories you've shared. When you need a file that's not in your local directory you can just pull it from a different place; git-annex keeps track of where the file is stored.

bittorrent-sync is closed source.

I can't believe I missed that. Thanks!

Can anyone give me an example of their usage of git-annex? The project looks interesting, but the most immediate usage I see would be to add big files (e.g. a database dump, PDF documentation, etc.) to a git repository. What else do people use it for?

I use it to manage my "archive" folder, which contains large binary files that rarely change. This includes music, some TV series, operating system .iso images, backups of retired computers, etc.

The great thing about git annex is that each clone of the repository has the entire tree structure of the repository, but by default has none of the data. So if I'm going on a trip I can just cd into the right folder on my laptop and type "git annex get ." or "git annex get Windows*.iso". Being able to tab-complete all the files in the annex even though you don't have a copy of most of them makes it very convenient.

The numcopies constraints also help enforce redundancy on the data. I could use RAID to have local redundancy, but that only protects against a hard disk crashing. If I have four repositories in different physical locations and set numcopies to three, then git annex helps make sure there are always enough copies of a file (in three different physical locations), so I won't lose data even if my house burns down.
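To make the numcopies idea concrete, here is a toy model of the rule as described above: a repository may only drop its copy of a file if enough copies remain elsewhere. This is a sketch of the concept only, not git-annex's implementation; the `can_drop` function, the `locations` mapping and the repository names are all hypothetical.

```python
from typing import Dict, Set

def can_drop(locations: Dict[str, Set[str]], key: str,
             repo: str, numcopies: int) -> bool:
    """Return True if `repo` may drop `key` under the toy numcopies rule.

    `locations` maps a file name to the set of repositories that hold
    its content (a stand-in for location tracking).
    """
    holders = locations.get(key, set())
    if repo not in holders:
        return False  # this repository has no copy to drop
    remaining = holders - {repo}
    # The drop is allowed only if at least `numcopies` copies survive elsewhere.
    return len(remaining) >= numcopies

if __name__ == "__main__":
    # Four repositories in different places, numcopies set to three.
    locations = {"photo.jpg": {"home", "work", "laptop", "cloud"}}
    print(can_drop(locations, "photo.jpg", "laptop", 3))
    locations["photo.jpg"].discard("laptop")  # the laptop drops its copy
    print(can_drop(locations, "photo.jpg", "home", 3))
```

With four holders, one of them can free up space; once only three are left, further drops are refused, which is the "always three physical locations" guarantee from the comment.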

I can't remember why I didn't switch to git-annex from Unison, although it did put me onto bup (which still doesn't quite do what I want either).

I think my biggest problem was that, by and large, I want a versioning sync tool, and worry a lot less about managing what's on my devices.

I think you can automate most or all of that with the git annex assistant now (and I think most of its functionality is exposed as commandline commands as well, so you don't have to use the GUI if you don't want to).

Is there a German cloud storage provider that I can use as a git annex remote?
