Hacker News
Ask HN: Open-source, centralised, zero-knowledge file syncing software?
10 points by PuffinBlue on Feb 28, 2018 | 7 comments
I'm looking for an open-source version of Tresorit or SpiderOak. I don't really want peer-to-peer like Syncthing.

I'd like to be able to throw up a server anywhere online and sync files across my personal devices, using the server to keep things in sync but safe in the knowledge that the server operates with zero-knowledge and cannot decrypt files.

Is Nextcloud the only hope?

Use restic. It’s what I use to back up my servers to Backblaze B2. Here are its pros:

* It supports a variety of common backends, including B2, AWS, GCP, Azure, a remote SFTP server, a remote REST server, a local directory, etc. You can adjust common parameters like bandwidth, concurrent threads, timeouts, etc.

* It’s open source and actively developed: https://github.com/restic/restic. The author wrote it in Go, but if you don’t want to compile from source or install from your OS package manager, you can install binaries for each release. The author is responsive to issues, and there is an active community of folks involved with feedback and development. The author also solicits feedback from the community from time to time, such as when he was deciding what to prioritize for the move to the new remote repository format.

* It’s very well documented: https://restic.readthedocs.io/en/latest/. You should be able to get going with a backend of your choice after 10-20 minutes of reading and configuration. Basically, you’ll install Restic on your machine from source or a binary, then you’ll set up a remote repository and a client-side generated password for securing files sent to it. After the initial backup to the repository you can set it and forget it in a cron job. The CLI is expressive and has many options; personally, I like to house my repository’s password in a file with very restricted permissions, then house all file/directory exclusions in another file, and include both of these files as CLI flags in a sudo-run restic command from cron each day. It’s very flexible and adaptable.

* The native model is end-to-end (client-side) encryption implemented with very sane cryptography. Restic uses authenticated encryption implemented via AES-CTR and Poly1305-AES. Personally, I would have preferred one of the NaCl-style constructions like XChaCha20-Poly1305, but AES-CTR is fine. AES-GCM would be better optimized (especially for CPUs supporting AES-NI instructions), but the author chose AES-CTR because he understood it better (this was before AES-GCM had an optimized Go implementation). In any case, there’s nothing obviously wrong here, it’s just conservative. The author also uses SHA-256 for hashing files, and has rejected/postponed proposals to move to BLAKE or Keccak (SHA-3), which also indicates to me that he is very cautious (this is a good thing). Restic uses encrypt-then-MAC with separate keys for AES-CTR and Poly1305-AES, and finally, client-side passwords use scrypt as the key derivation function. These are all sane, well-considered choices, if a little old-fashioned in some respects.

* Restic has an explicit, documented threat model, which is fantastic: https://restic.readthedocs.io/en/latest/100_references.html#.... It’s not typical for cryptographic software to include a threat model, which significantly elevates it (alongside these other points) from being a “random project.” More generally, the section containing the threat model also meticulously documents how file locking works, the deduplication methodology, everything I’ve said about cryptography in greater detail, how the snapshot and repository model works, etc.
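
To make the "scrypt plus encrypt-then-MAC with separate keys" point concrete, here is a minimal Python sketch of the general pattern. It is not restic's code or its on-disk format: the function names are hypothetical, HMAC-SHA256 stands in for Poly1305-AES because the Python standard library has no Poly1305, and the cipher step is left out entirely (the ciphertext is treated as opaque bytes) because the standard library has no AES either. The scrypt parameters are illustrative assumptions, not restic's.

```python
import hashlib
import hmac
import os


def derive_keys(password: bytes, salt: bytes, n: int = 2**14, r: int = 8, p: int = 1):
    """Stretch a password with scrypt and split the output into two keys.

    Returns (enc_key, mac_key), 32 bytes each, so the cipher and the MAC
    never share key material. The cost parameters are illustrative only.
    """
    material = hashlib.scrypt(password, salt=salt, n=n, r=r, p=p, dklen=64)
    return material[:32], material[32:]


def mac_ciphertext(mac_key: bytes, nonce: bytes, ciphertext: bytes) -> bytes:
    """Encrypt-then-MAC: the tag covers the nonce and the ciphertext.

    HMAC-SHA256 is a stand-in here for the Poly1305-AES MAC named above.
    """
    return hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()


def verify_before_decrypt(mac_key: bytes, nonce: bytes, ciphertext: bytes, tag: bytes) -> bool:
    """Check the tag in constant time; only decrypt if this returns True."""
    expected = mac_ciphertext(mac_key, nonce, ciphertext)
    return hmac.compare_digest(expected, tag)


if __name__ == "__main__":
    salt = os.urandom(16)
    enc_key, mac_key = derive_keys(b"correct horse battery staple", salt)
    nonce = os.urandom(16)
    ciphertext = b"opaque bytes produced by the cipher"  # placeholder, not real AES output
    tag = mac_ciphertext(mac_key, nonce, ciphertext)
    print("tag verifies:", verify_before_decrypt(mac_key, nonce, ciphertext, tag))
```

The point of the ordering is that a server (or anyone in between) that flips bits in the stored blob gets caught at the MAC check, before any decryption happens.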

At this point I’ve backed up around 40TB or so using Restic, and I’m fairly confident in it. The one downside I would admit is that you have to keep up with updates yourself if you don’t install it through your OS package manager, and at least on Ubuntu, the apt-get version is pretty out of date. But I went through what you’re doing now about a year ago, hoping to find something like Arq for Linux servers. I settled on Restic and never looked back.

For your particular use case, here is my concrete recommendation. Set up two servers, A and B, with a syncing directory between them (Google Drive, Dropbox, or whatever else you’d like). Install Restic on both of these servers (and any other server you’d like to sync to), such that the password securing the files is available on each server you control, but on none of the intermediaries. Using Restic on server A, initialize a new local repository in the syncing directory, secured with your client-side password. From then on, whenever you change a file that should reach the other servers, write the change in your workspace, then back it up from your workspace to the syncing directory using Restic, which will store it encrypted and deduplicated as a snapshot (don’t put the new version of the file in the syncing directory without Restic, i.e. not in the clear). The encrypted data will then be synced to all other servers sharing the directory, including server B. You don’t need to initialize a new repo in the directory on server B; Restic should recognize the repo, so just restore the files from it on server B using your password. Voila.
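
Roughly, the steps above could be scripted like the Python sketch below. The paths are hypothetical placeholders, and the helper names are mine; the restic subcommands and flags used (`init`, `backup`, `restore`, `--repo`, `--password-file`, `--exclude-file`, `--target`) are the ones I rely on from the CLI, but check them against the docs for your installed version. The helpers only build argument lists; `run` is the one place that actually invokes restic.

```python
import subprocess
from typing import List, Optional, Sequence


def restic_base(repo: str, password_file: str) -> List[str]:
    """Common prefix: which repository to use and where the password lives."""
    return ["restic", "--repo", repo, "--password-file", password_file]


def init_cmd(repo: str, password_file: str) -> List[str]:
    """Run once on server A to create the repo inside the syncing directory."""
    return restic_base(repo, password_file) + ["init"]


def backup_cmd(repo: str, password_file: str, paths: Sequence[str],
               exclude_file: Optional[str] = None) -> List[str]:
    """Snapshot the workspace into the repo (encrypted and deduplicated)."""
    cmd = restic_base(repo, password_file) + ["backup"]
    if exclude_file:
        cmd += ["--exclude-file", exclude_file]
    return cmd + list(paths)


def restore_cmd(repo: str, password_file: str, target: str,
                snapshot: str = "latest") -> List[str]:
    """Run on server B once the sync tool has copied the repo over."""
    return restic_base(repo, password_file) + ["restore", snapshot, "--target", target]


def run(cmd: Sequence[str]) -> None:
    """Execute a restic command, raising if it exits non-zero."""
    subprocess.run(list(cmd), check=True)


if __name__ == "__main__":
    # Hypothetical paths: adjust to your own layout.
    repo = "/home/me/Dropbox/restic-repo"
    password_file = "/home/me/.restic-password"
    print(init_cmd(repo, password_file))
    print(backup_cmd(repo, password_file, ["/home/me/workspace"], "/home/me/.restic-excludes"))
    print(restore_cmd(repo, password_file, "/home/me/workspace-restored"))
```

One caveat I have not tested: let the sync tool finish copying before you read or write the repo from another machine, since I don't know how Restic's locking behaves when the repository is replicated by a third-party sync client.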

This looks perfect. I'm going to read up on this. Thank you.

On Linux you can use ecryptfs and just rsync/syncthing/whatever the encrypted data.

On any OS, you'll be able to rsync/syncthing/whatever a VeraCrypt volume (though you'll have to unmount it first to make sure that the image is 100% consistent).

Also, borg backup will let you do a remote-is-encrypted backup, which might be what you're after: sync the remote backup instead of the files, and fuse-mount it to read. No Windows support for now, AFAIK.

You should look at Seafile

Looked at that one. It's not open source (yet). Plus it's still peer-to-peer really.

They have a non-open-source "enterprise server", but everything else is open source. And what do you mean by "peer-to-peer really"?

Huh, I am an idiot.

I've confused this software with something else, or misread the feature set.

I visited the site and forums just yesterday. Taking another look now it's obvious Seafile should do what I want.

Clearly a case of looking at too many things in one go and screwing up.

Thanks for making me take a second look.
