So you're looking for a collection of clients that can make secure backups of their data and store those backups in a central location with de-duplication, without having to trust the central location with keys? That's certainly feasible, though it seems this particular software won't achieve it.

Off the top of my head, the scheme is simple. Break the data into chunks as usual. To encrypt a chunk: chunk_encryption_key = KDF(HMAC(shared_secret1, plaintext)). Encrypt. chunk_id = SHA256(ciphertext). Upload the ciphertext to the central repo, which can store and index it by SHA256(ciphertext). When all chunks are encrypted and uploaded, build the backup archive by listing all files, their properties, paths, etc., plus their respective chunks. Chunks are referenced in the archive as (chunk_id, chunk_encryption_key) pairs. Encrypt the archive using a per-client asymmetric key, with authentication. Upload the encrypted archive to the central repo, stored by name or however you want to handle that.
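
A minimal sketch of that chunk path (in Python for brevity, not the Rust the author uses); HKDF and AES-256-GCM stand in for the unspecified KDF and cipher, and none of these names are a real backup tool's API:

    # Illustrative sketch only: stdlib hmac/hashlib plus the "cryptography" package.
    import hashlib
    import hmac

    from cryptography.hazmat.primitives import hashes
    from cryptography.hazmat.primitives.ciphers.aead import AESGCM
    from cryptography.hazmat.primitives.kdf.hkdf import HKDF

    def encrypt_chunk(shared_secret1: bytes, plaintext: bytes):
        # Convergent step: the key depends only on the shared secret and the
        # plaintext, so identical chunks encrypt identically and de-duplicate.
        key_material = hmac.new(shared_secret1, plaintext, hashlib.sha256).digest()
        chunk_key = HKDF(algorithm=hashes.SHA256(), length=32, salt=None,
                         info=b"chunk-encryption").derive(key_material)
        # A fixed nonce is acceptable only because each key encrypts exactly one
        # plaintext; any key reuse would demand unique nonces.
        ciphertext = AESGCM(chunk_key).encrypt(b"\x00" * 12, plaintext, None)
        # The repo indexes by the ciphertext hash, which reveals nothing about
        # the key or the plaintext.
        chunk_id = hashlib.sha256(ciphertext).digest()
        return chunk_id, chunk_key, ciphertext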

When it comes time to decrypt, the client can pull the archive from the repo, authenticate, decrypt, and then begin extracting files. When it needs a chunk, it just asks the repo using the chunk_id. It can verify integrity and authenticity using the chunk_id (the chunk_id itself is already authenticated because we authenticated the archive it came from). Decrypt, and you're done.
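
The restore side of the same sketch; fetch_chunk stands in for whatever "ask the repo for this chunk_id" call exists, an illustrative assumption rather than any real API:

    import hashlib

    from cryptography.hazmat.primitives.ciphers.aead import AESGCM

    def restore_chunk(fetch_chunk, chunk_id: bytes, chunk_key: bytes) -> bytes:
        ciphertext = fetch_chunk(chunk_id)
        # chunk_id came out of the authenticated archive, so matching it against
        # the returned ciphertext proves the repo handed back the right bytes.
        if hashlib.sha256(ciphertext).digest() != chunk_id:
            raise ValueError("repo returned a corrupted or substituted chunk")
        # AES-GCM's tag check would also fail if the ciphertext were tampered with.
        return AESGCM(chunk_key).decrypt(b"\x00" * 12, ciphertext, None)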

The central repo doesn't need any of the keys. The clients all share shared_secret1, and each also has its own asymmetric key pair. Clients can't upload corrupted data to mess with each other, because a corrupted chunk would hash to a different chunk_id. They also can't pull random chunks and decrypt them, because decryption requires the chunk_encryption_key, which can't be derived from the chunk_id. They can only decrypt chunks that they've used in a backup.

Garbage collection can be handled by tagging archives on the central repo with a list of used chunk_ids. That's still secure, since chunk_ids alone reveal neither keys nor plaintext.
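
A toy mark-and-sweep over those tags might look like this; the repo works purely on opaque chunk_ids, and both inputs are hypothetical repo-side listings:

    def collect_garbage(archive_tags: list[set[bytes]],
                        stored_chunk_ids: set[bytes]) -> set[bytes]:
        # Mark: every chunk_id still referenced by at least one archive.
        live = set().union(*archive_tags) if archive_tags else set()
        # Sweep: whatever is stored but unreferenced can be deleted.
        return stored_chunk_ids - live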

The only caveat: you'll need to trust at least that clients won't attempt to exploit the known weakness of convergent encryption (a client holding shared_secret1 can confirm guesses about common documents by brute-forcing their fields and re-deriving chunk keys). Though there are ways to mitigate that in certain scenarios.

My WIP secure backup tool, written in Rust, does most of this (https://github.com/fpgaminer/preserve). The only difference is that I decoupled the chunk_id from the ciphertext's hash, as part of an effort to reduce archive size (archives only need to store 32 bytes per chunk, rather than 64 bytes). That means it doesn't handle the case of malicious clients uploading corrupt chunks. That can be mitigated, though.
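
One plausible way to get down to 32 bytes per chunk (purely an assumption for illustration, not necessarily what preserve does) is to store only the chunk key and derive the chunk_id from it one-way:

    import hashlib
    import hmac

    def chunk_id_from_key(chunk_key: bytes) -> bytes:
        # One-way, so the repo still can't recover the key from the chunk_id,
        # but the id no longer commits to the ciphertext, hence the
        # corrupt-chunk caveat above.
        return hmac.new(chunk_key, b"chunk-id", hashlib.sha256).digest()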



