
My personal Internet connection is a bit too slow to wait for re-uploading all the data, and my vserver doesn't have enough disk space to temporarily store all of it, so I pretty much do everything in a streaming fashion: I use a Firefox extension that gives me the wget command (including cookies, etc.) when triggering a local download from Google Takeout, then patch that command to stream to stdout. That output is tee-piped to a Python script that decompresses the data on the fly and dumps the hash of each file into a log; the other copy goes to "age" for encryption and then to s3cmd, which uploads the encrypted data to Wasabi.
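
Roughly, the pipeline looks something like this (a sketch, not the exact commands: the real wget invocation with cookies comes from the extension, and the script name, age recipient, and bucket/object here are just placeholders; it also assumes s3cmd's stdin upload via "put -"):

    # placeholders: hash_and_log.py, the age recipient, and the bucket/object
    # are made up; the real wget command (URL, cookies) comes from the extension
    wget --load-cookies cookies.txt -O - 'https://takeout.google.com/...' \
      | tee >(python3 hash_and_log.py --log hashes-new.log) \
      | age -r age1examplerecipient... \
      | s3cmd put - s3://takeout-backup/takeout-new.tar.gz.age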

For the comparison I pretty much only use the logged hashes, which let me figure out whether any hashes (and their associated files) are missing from the new version of the backup. This isn't a perfect solution yet, as a few things aren't detected: for example, Google Takeout bundles mails into mbox files, and I currently don't check for missing individual mails. It would be better to convert the mbox files to a Maildir first so that the comparison can be done on a per-mail basis.
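
The comparison itself is then roughly a set difference over the two hash logs, something like this (assuming each log line starts with the hash, followed by the file path; the log file names are made up):

    # hashes present in the old backup's log but missing from the new one
    comm -23 <(cut -d' ' -f1 old-hashes.log | sort) \
             <(cut -d' ' -f1 new-hashes.log | sort)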


