
Ask HN: Secure automated backup? - jamesknelson
Hi HN,

After seeing all the news about lost data recently, I need to get my arse into gear and get an automated backup set up properly.

I'm using a Mac, so I looked into the Time Capsule. That said, if one of the data loss scenarios is a well-written ransomware worm, it feels like the Time Capsule is going to be just as vulnerable as my main machine.

What approach would you recommend to back up data, with both hard drive failure and ransomware in mind? I'm open to cloud-based solutions if that actually makes more sense.
======
2bluesc
I use borg[0] to create local space efficient encrypted backups and rclone[1]
to mirror the archives to Google Drive. I wrote a short script to automate it
and schedule it to run every night.

[0]
[https://borgbackup.readthedocs.io/en/stable/](https://borgbackup.readthedocs.io/en/stable/)

[1] [https://rclone.org](https://rclone.org)
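
A nightly script along these lines could be sketched as follows. This is only an illustration, not the actual script: the repository path, source directory, rclone remote name and archive naming are all made-up placeholders, and it assumes the borg repository was already initialised with encryption and that the passphrase is supplied through the environment (e.g. `BORG_PASSPHRASE`).

```python
import datetime
import subprocess

# All of these are placeholder values, not taken from the comment above.
BORG_REPO = "/backups/borg-repo"        # hypothetical local borg repository
SOURCE_DIRS = ["/home/me/Documents"]    # hypothetical directories to back up
RCLONE_DEST = "gdrive:borg-repo"        # hypothetical rclone remote and path


def build_commands(today: datetime.date) -> list[list[str]]:
    """Return the borg and rclone command lines for one nightly run."""
    archive = f"{BORG_REPO}::nightly-{today.isoformat()}"
    return [
        # Create a new deduplicated archive in the local repository.
        ["borg", "create", "--stats", archive, *SOURCE_DIRS],
        # Make the remote copy identical to the local repository.
        ["rclone", "sync", BORG_REPO, RCLONE_DEST],
    ]


def run_backup() -> None:
    """Run both steps in order, stopping at the first one that fails."""
    for cmd in build_commands(datetime.date.today()):
        subprocess.run(cmd, check=True)
```

Calling `run_backup()` from cron or launchd would give the nightly schedule. Note that `rclone sync` makes the destination match the source, so damage to the local repository would be mirrored on the next run.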

~~~
y4mi
You don't have the Google drive app installed anywhere?

If you do, this setup doesn't help recovery after a cryptolocker. The
encrypted backup would also be unusable.

------
olalonde
I'm surprised that no one mentioned Tarsnap yet, it's run by a well known HNer
(cperciva): [http://www.tarsnap.com/](http://www.tarsnap.com/)

It's not exactly noob friendly though.

~~~
msh
Well it's quite expensive compared to other solutions.

~~~
zie
It really isn't. I mean it _looks_ expensive, until you take into account all
the dedup and everything it does to reduce the cost. I put in $10 a few years
ago, and I still have a balance of almost $4 ($3.91).

~~~
riobard
How much data are you storing?

~~~
zie
Not much, obviously; I am only storing critical data in Tarsnap. Like others
have said, if you are storing many GBs or TBs of data, rsync.net or
something similar may be a better solution.

------
_cjk7
There's a common rule called the 3-2-1 rule, it states that you should:

\- Have at least three copies of your data.

\- Store the copies on two different media.

\- Keep one backup copy offsite.

Personally, I'd recommend:

Copy 1: Your Mac.

Copy 2: A local NAS (my personal choice) or hard disk.

Copy 3: A remote backup, stored on a hard drive in a desk drawer at work,
Backblaze, Google Drive, Amazon Cloud Drive or whatever other solution suits
your needs.
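
As a rough illustration, the rule can be written down as a small check. The `Copy` class and the example plan are hypothetical names made up for this sketch, and "media" is simplified to a free-text label.

```python
from dataclasses import dataclass


@dataclass
class Copy:
    name: str
    medium: str    # e.g. "internal disk", "NAS", "cloud"
    offsite: bool


def satisfies_3_2_1(copies: list[Copy]) -> bool:
    """At least 3 copies, at least 2 distinct media, at least 1 copy offsite."""
    return (
        len(copies) >= 3
        and len({c.medium for c in copies}) >= 2
        and any(c.offsite for c in copies)
    )


plan = [
    Copy("Mac", "internal disk", offsite=False),
    Copy("NAS", "NAS", offsite=False),
    Copy("Backblaze", "cloud", offsite=True),
]
print(satisfies_3_2_1(plan))       # True
print(satisfies_3_2_1(plan[:2]))   # False: only two copies, none offsite
```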

In terms of software, I personally use rsync + ZFS/BTRFS snapshots (NAS -
local, NAS2 - remote) and rclone (cloud). I haven't really used fancy
solutions like Attic and Borg due to their need to write dead (i.e. not
mountable without a performance penalty) data to local disk or SSH. No
affordable storage that I've found offers this (rsync.net offers it but is too
expensive).

It's getting to the point where I'm seriously considering buying an LTO6/7
tape drive though...

I'll also add because I haven't seen it elsewhere: _verify your backups_. A
backup is pointless unless you _know_ you can restore it. The best way to test
this is by doing it. It should get to the point where you don't fear a
restore. It shouldn't be painful. There should be no worry. It should be no
more than an inconvenience. When something goes wrong, you don't want there to
be even the smallest hint of doubt that there's something wrong with your
process.

As such, I _strongly_ recommend having an easily accessible backup. I'd go for
a spare HDD sitting in a desk drawer at home before going for cloud backups
just so that you can test it frequently.
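
One way to make a test restore less hand-wavy is to compare checksums of the restored files against the originals. The following is a minimal sketch with hypothetical function names; it assumes you restore into a scratch directory and that the originals have not changed since the backup was taken (files edited in the meantime will show up as differences).

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """SHA-256 of a file, read in chunks so large files don't fill memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def compare_trees(original: Path, restored: Path) -> list[str]:
    """Return relative paths that are missing or differ in the restored tree."""
    problems = []
    for src in sorted(original.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(original)
        dst = restored / rel
        if not dst.is_file():
            problems.append(f"missing: {rel}")
        elif file_digest(src) != file_digest(dst):
            problems.append(f"differs: {rel}")
    return problems
```

An empty list from `compare_trees(original, restored)` means every original file came back byte-for-byte identical.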

~~~
simonhorlick
It's also worth thinking about time to restore. If you have hundreds of GB
worth of backups it could take a very long time to restore everything from the
internet. Keeping an easily accessible backup around is really worth it.
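
For a back-of-the-envelope feel, the download time is just size divided by bandwidth. The figures below are made-up examples, and the estimate ignores protocol overhead and any throttling by the provider.

```python
def restore_hours(size_gb: float, mbit_per_s: float) -> float:
    """Hours to download size_gb (decimal gigabytes) at mbit_per_s megabits/s."""
    bits = size_gb * 1e9 * 8
    seconds = bits / (mbit_per_s * 1e6)
    return seconds / 3600


# Hypothetical figures: 500 GB of backups over a 50 Mbit/s downlink.
print(round(restore_hours(500, 50), 1))   # 22.2
```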

------
goerz
I use Arq ([https://www.arqbackup.com](https://www.arqbackup.com)) with Amazon
Drive (unlimited data for $60/year) for this

~~~
AdamGibbins
I also use Arq but send to rsync.net with reduced pricing
([http://rsync.net/products/attic.html](http://rsync.net/products/attic.html))
in addition to SFTPing to a personal (offsite) server.

Additionally, I run Backblaze and use Carbon Copy Cloner roughly once a week
to clone my entire drive to an external drive.

For personal servers I use borg with the same reduced rsync.net pricing.

~~~
camgregg
rsync.net is fantastic.

You don't get the read-only snapshots with the reduced borg or attic accounts
though (they expect you to manage all increments using those programs).

If you are prepared to pay the standard rate, the read-only snapshots can't be
destroyed by any hacker or ransomware.

------
jmathai
I have a setup which works really well for my photos and videos
[1][2][3][4][5]. It automatically keeps a copy of each file in 3 locations; my
laptop, a Synology NAS and Google Drive / Photos.

[1] [https://medium.com/@jmathai/introducing-elodie-your-
personal...](https://medium.com/@jmathai/introducing-elodie-your-personal-
exif-based-photo-and-video-assistant-d92868f302ec)

[2] [https://medium.com/@jmathai/understanding-my-need-for-an-
aut...](https://medium.com/@jmathai/understanding-my-need-for-an-automated-
photo-workflow-a2ff95b46f8f#.dmwyjlc57)

[3] [https://medium.com/@jmathai/my-automated-photo-workflow-
usin...](https://medium.com/@jmathai/my-automated-photo-workflow-using-google-
photos-and-elodie-afb753b8c724)

[4] [https://medium.com/@jmathai/one-year-of-using-an-
automated-p...](https://medium.com/@jmathai/one-year-of-using-an-automated-
photo-organization-and-archiving-workflow-89cf9ad7bddf#.97qsvo3cq)

[5] [https://medium.com/vantage/how-to-protect-your-photos-
from-b...](https://medium.com/vantage/how-to-protect-your-photos-from-bit-
rot-9d0c6998121f)

------
sidmitra
I used to use Crashplan, which had unlimited storage and was fairly cheap (like
$4/month or something) for a family plan.

You might want to check it out. [https://www.crashplan.com/en-
us/features/](https://www.crashplan.com/en-us/features/)

Also it was one of the few services that had a client that worked on Linux

~~~
JoshTriplett
"used to use" is an interesting endorsement. Why don't you use it anymore?

~~~
sidmitra
My plan had expired, and I ended up not renewing (since I was cutting down on
all 3rd party services that I use). So I just moved to an external hard disk
and "Back in Time" on Linux.

But I've been looking at re-subscribing to it.

------
Sidnicious
Here are some options that I have experience with:

\- Time Machine with offline disks: Since Time Machine supports multiple
backup destinations, you can use a Time Capsule or hard drive that's always
connected to your Mac, and also have one or more additional hard drives which
you connect periodically and otherwise leave in a drawer.

Pros: Free, built into macOS, can browse file versions directly from many
apps.

Cons: Needs ongoing manual intervention (i.e. plugging in the offline drives).
Some reliability issues… but I've experienced the most problems backing up to
my own SMB/AFP shares, so a Time Capsule might be OK.

\- Backblaze ([https://www.backblaze.com/](https://www.backblaze.com/)) or
CrashPlan ([https://www.crashplan.com/](https://www.crashplan.com/)): Both of
these online backup services have $5/month unlimited plans, and both let you
specify your own encryption key (in the form of an additional password), which
isn't shared with the backup provider. Note: In my experience, Backblaze's
client is much lighter on system resources/battery on Mac.

Pros: Inexpensive, off-site storage, low-maintenance.

Cons: Ongoing cost, requires trust (In theory, the client software could be
sharing the encryption key with the company/the NSA/your nemesis).

\- Arq ([https://www.arqbackup.com/](https://www.arqbackup.com/)): Paid
desktop software which can back up to many different destinations, including
S3, Google Drive, or your own server via SFTP. You specify an encryption key
for each destination.

Pros: Full control. Option to back up to another machine that you own (so no
ongoing cost for hosting).

Cons: Up-front cost. Support is less straightforward than hosted solutions
since Arq doesn't provide storage.

~~~
tedmiston
An unlisted con of Backblaze is that they delete the backups of external drives
that are not plugged in for at least 6 consecutive hours every 30 days. It can
be a huge pain if you travel regularly or otherwise don't want to leave your
computer on all night.

~~~
Baeocystin
Is this a new policy? Coincidentally I just restored something off a year-old
backup from a dead machine's external drive, and it wasn't an issue. Maybe
because the machine itself hasn't connected in a while?

~~~
tedmiston
The machine not connecting at all puts it in some kind of exempt from deletion
state. They claim it is a six-month limit in their docs.

Note the point about having all external drives connected for that first boot
or their backups will be wiped out. Very easy to shoot yourself in the foot.

[https://help.backblaze.com/hc/en-
us/articles/217664898-What-...](https://help.backblaze.com/hc/en-
us/articles/217664898-What-happens-to-my-backups-when-I-m-away-or-on-
vacation-?mobile_site=true)

~~~
Baeocystin
I can see how that would be a footgun. Thanks for the heads up.

------
Faaak
Most importantly: it must be the backup server that has to log into your
computer to backup, and not the other way around. That way, if your
computer/server is compromised, the backups are still there. If you make the
error to connect to the backup server, a hacker could also log into it and
delete everything.

My backup server uses rsnapshot, and you can only log into it with ssh + key
+ OTP.

------
liareye
[https://www.arqbackup.com](https://www.arqbackup.com)

~~~
whitepoplar
Arq is such a wonderful piece of software. If you go this route, I'd say to go
for Amazon Cloud Drive as Arq's datastore--it's $59/year for unlimited data.

------
znpy
[http://duplicity.nongnu.org/](http://duplicity.nongnu.org/)

~~~
Sami_Lehtinen
Duplicati is newer and better.

~~~
maturz
Been using this for a while now, seems to work great
[https://github.com/duplicati/duplicati](https://github.com/duplicati/duplicati)

The features I like are:

\- Encrypted cloud backup

\- Block-based backup (only backs up changes)

\- Restore files from a certain day

------
SCdF
I use Time Machine, Arq and Amazon Cloud Drive:

\- I have an external HDD partitioned in half: One half is for large external
files that don't change much (raw files, archived data etc); and one half is a
dedicated partition for Time Machine

\- Time Machine backs up my laptop. If I lose my computer but not my hard
drive, I can get a new one and seamlessly get the computer back to exactly how
it was when I last backed it up, open tabs and all

\- I also have Arq running, attached to Amazon Cloud Drive (cheapest external
storage I know of). It backs up both selected portions of my laptop's disk, as
well as the external hdd's non-timemachine partition (due to how TM works you
can't really back it up to the cloud[1]) to "the cloud"

This leaves me with:

\- Three copies of my laptop data: in the laptop, in an external hdd and in
the cloud

\- Two copies of larger data that can't fit, in the external hdd and in the
cloud. My external HDD lives at home.

[1] Time Machine backs up once an hour, and stores backups as a simple
directory structure on disk of your entire hard drive, except using hard links
to old backups to avoid duplication. It keeps the last 24 hours of hourly
backups, the last 7 days of daily backups, and then weekly backups until it
runs out of room.

This format simply doesn't work with the kind of backup where it scans a
directory to see what's changed, because it effectively looks like you're
adding hundreds of gigs of data each hour.
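
To illustrate the hard-link idea, here is a minimal sketch. It is not how Time Machine is actually implemented; the `snapshot` function is a hypothetical helper, and it ignores symlinks, permissions and deleted files. Each snapshot looks like a full directory tree, but unchanged files share their data with the previous snapshot.

```python
import filecmp
import os
import shutil
from pathlib import Path
from typing import Optional


def snapshot(source: Path, previous: Optional[Path], new: Path) -> None:
    """Copy `source` into `new`, hard-linking files unchanged since `previous`."""
    new.mkdir(parents=True, exist_ok=True)
    for src in source.rglob("*"):
        rel = src.relative_to(source)
        dst = new / rel
        if src.is_dir():
            dst.mkdir(parents=True, exist_ok=True)
            continue
        dst.parent.mkdir(parents=True, exist_ok=True)
        old = previous / rel if previous is not None else None
        if old is not None and old.is_file() and filecmp.cmp(src, old, shallow=True):
            os.link(old, dst)        # unchanged: share data with the old snapshot
        else:
            shutil.copy2(src, dst)   # new or changed: store a fresh copy
```

A tool that simply walks the backup disk sees every snapshot directory as a complete set of files, which is why the data appears to grow by the full disk size each hour.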

------
bedros
I second borg backup, I use it on my linux/mac machines

for windows I use reflect backup
[https://www.macrium.com/products/home](https://www.macrium.com/products/home)

I tried Acronis backup, but the disk restore failed; absolutely horrible
software. Then I tried Reflect, and the disk restore was very smooth.

------
feelix
For local bootable backup I use Mac Backup Guru, which I also wrote:
[https://macdaddy.io/mac-backup-software/](https://macdaddy.io/mac-backup-
software/) It's useful because it's the only software on OS X besides Time
Machine which makes versioned (incremental) backups using hardlinks.

For remote backup I use Arq, but I have found that to be very buggy. I'm
considering switching to rclone: [https://rclone.org/](https://rclone.org/)

With both of those backup solutions in place I should be ready for pretty much
everything.

~~~
zapu
Can you elaborate how Arq is buggy? I'm considering switching to Arq from
Crashplan, because I supply my own storage for it anyway.

~~~
feelix
I have had to restore a backup once, so I restored the most recent version,
which turned out to contain an old version of the file. I ended up frantically
restoring all the different versions of the backup, and one of them held the
most recent version. But it was considered an old version by Arq; I'm not sure
precisely which one, but it definitely wasn't in the top 3 most recent ones.

And more recently I made a backup to glacier, and it is trapped in "in
progress", even though it appears that it may (or may not) be complete.

I haven't used it much, maybe 6 times, and 2 of the times it's had these
catastrophic problems I'm talking about. I'm going to switch to another
solution. I was considering rclone (like I mentioned) or Crashplan,
ironically.

~~~
zapu
Crashplan worked well for me, sometimes I have to restore a file that I
overwrite or that ends up being corrupted because of buggy (in-house) software
and I never had any problems with that.

My main issue is that I went with free plan + my own server because transfers
to their servers were really bad. But to supply your own storage, you have to
install Crashplan on the server, which uses a significant amount of RAM. So this
rules out backing up to my NAS (not enough memory there, even though it's x86
and technically people have run Java on it).

The other issue is that the file format is (AFAIK) closed, and you also need an
account and a connection to the Crashplan service, so I'm not completely sure
that you really own the data. I didn't do much research here, though.

~~~
tedmiston
Was the Crashplan client performance better when backing up to your own server
vs to theirs?

~~~
zapu
If you mean client performance in general, I never had problems with that,
it's just the connection to the server overseas was not that great. I have my
own dedicated server in Europe to which I have much faster connection.

------
gtf21
I quite like Crashplan:

\- very reasonably priced: I pay around £10 pcm for unltd storage for my whole
family

\- zero-knowledge encryption: I have the encryption keys, and everything is
encrypted on my machine before it's sent up

\- relatively low bandwidth: only ships changed files (pretty standard tbh)

It's saved my bacon a few times, e.g. I've used it to rescue my sister's
dissertation when she wiped her laptop thinking it was in Dropbox when it
wasn't. I was amazed by how easy it was for me to rescue the file from the
archive.

~~~
2bluesc
I used crash plan on Linux for years, but stopped because their Java client is
a train wreck.

It would consume gigabytes of RAM and every year or so it'd meltdown when
trying to install an update without using the system package manager.

~~~
tedmiston
This is the same reason I left Crashplan. If they could get the Java client
under control it would be a worthy contender, probably even the best consumer-
friendly option.

------
5_minutes
I have all my data on Dropbox with revisions activated, and have that backed up
by Crashplan. You'll have double automated backups and 0 hassle managing it.

------
iamcreasy
I have a related question. I want to back up certain folders to a
portable USB HDD every night. Can anyone recommend a simple solution for
that?

I don't need encryption or any extraneous features. I just need the selected
directories to get mirrored to a backup location.

Currently, I am using SyncToy by Microsoft, but I was looking for a cross
platform solution.

~~~
iamcreasy
Found a list here :
[https://en.wikipedia.org/wiki/Comparison_of_file_synchroniza...](https://en.wikipedia.org/wiki/Comparison_of_file_synchronization_software)

------
satai
3-2-1 rule.

I would use Time Machine with a Time Capsule and periodically (weekly?) connect
an encrypted external drive and make a Borg backup to it. The next week a
second drive, the third week the first one again...

Always keep one of these drives off-site.

This is just one of many ways to get reasonably safe (I use almost this setup,
just with deja-dup instead of Time Machine).

~~~
ottobonn
I also use deja-dup but lately it seems to choke when trying to determine what
to back up (I have about 175 GB of files, which doesn't seem outrageous). Have
you had any issues with the speed? It could be that I have a long history
saved on my backup drive and it's trying to apply too many incremental diffs.

~~~
satai
I have no idea. Anything in the logs? Maybe move the backups to another
location and start a new backup repo to see what happens?

------
brandonhall
I've used the following method for years and it's really simple. Get an
external hard drive and partition it as needed. One for your Time Machine
backup and another for data. Use Google Drive to mirror the data and use Arq
as your Time Machine in the cloud.

------
mattbillenstein
I don't backup end-systems -- but I do have a directory with important data
sync'd to several systems and the cloud using syncthing. The rest of the data
I care about is in git -- everything else on the system is basically
disposable.

------
pfarnsworth
I think Amazon Cloud Drive is $60/yr if you have Prime. You can hook up your
account to your Synology NAS and have it automatically back things up as soon
as you copy it over. Also Synology can encrypt it on the fly as well.

------
allan_wind
attic ([https://attic-backup.org/](https://attic-backup.org/)) encrypts data
in transit via ssh, and deduplicates and encrypts data at rest. I have come to
appreciate both how easy it is to restore data, and the control you have over
pruning which backups are kept around. You either need to be able to install
attic on the remote host, or be able to mount the file system (i.e. fuse).

------
rogeralan
Do you think Google Drive (and others) are secure enough to store personal
financial data like Quicken? Sometimes all you need is an email and password
to get in.

~~~
tedmiston
You can always put extra secure documents inside password protected encrypted
disk images.

------
vladimir-y
[https://github.com/duplicati/duplicati](https://github.com/duplicati/duplicati)

------
Baeocystin
I see it's already been mentioned, but allow me to second Backblaze. I use it
for hundreds of clients, and they have consistently been the most reliable of
the backup services I have tested [*]. Since they also have versioning, you
can recover from CryptoLocker variants comparatively easily as well.

(Almost any cloud-based backup system can help detect Locker variants. If you
notice your daily backup data set suddenly shooting up in size, time to start
checking for background encryption.)

([*] Backblaze, Carbonite, Crashplan, Mozy, Acronis)
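
The "sudden jump in backup size" check can be automated with something as crude as the sketch below. The function name, the sample sizes and the factor of 5 are arbitrary assumptions for illustration, not values from any backup product.

```python
from statistics import median


def looks_suspicious(history_mb: list[float], today_mb: float, factor: float = 5.0) -> bool:
    """Flag today's upload if it exceeds `factor` times the median of recent days."""
    if not history_mb:
        return False          # nothing to compare against yet
    baseline = median(history_mb)
    return today_mb > factor * max(baseline, 1.0)   # floor avoids a zero baseline


recent = [120, 95, 140, 110, 130]       # hypothetical daily upload sizes in MB
print(looks_suspicious(recent, 150))     # False
print(looks_suspicious(recent, 40000))   # True
```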

------
Gatsky
What sort of upload speeds do people doing cloud backups have?

I could never contemplate uploading 100s of MB each day with my crappy ADSL2+.

~~~
Baeocystin
I have basic cable-provided internet. 50 down, 3 up.

Initial backups take a long time, no doubt. But the daily diffs aren't so bad.
As long as you use something smart enough to only upload the deltas, you're
fine.

You can also snail mail drives to many cloud providers for your initial backup
state if you want. I've never done so, but I've heard from friends that have
that it worked out fine.

~~~
Gatsky
That's very helpful info, thanks!

------
aussieguy1234
I've used spideroak before. Zero knowledge encryption, even the NSA _probably_
can't access it

------
Mathnerd314
Just stick it all in Google Drive, pay the $2/mo for 100GB or $10/mo for 1TB,
done. It stores the last 100 versions of each file so ransomware shouldn't be
a problem, although apparently there's no way to restore a folder at a time,
you'd need to do it individually for each file...

~~~
sidmitra
Is there a good linux client? I couldn't get the ocaml client to work well
enough.

I'd be fine with even a one-way backup that backs up my entire ~/
(excluding some specified dirs/files) as snapshots. I don't even need the
two-way sync.

~~~
ktta
I've used rclone[1] with great success (it can sync specific folders which
looks like what you want) but I've since then moved on to a commercial service
called Insync[2] because I was having rate limiting problems with the Google
Cloud API key that rclone needs.

I do wonder whether I should move back to rclone, because I don't like a third
party having access to my Google Drive. It goes without saying: use a good
encrypted backup solution. I like cryfs[3] for encrypting data.

[1] [https://rclone.org/](https://rclone.org/)

[2] [https://www.insynchq.com/](https://www.insynchq.com/)

[3] [https://www.cryfs.org/](https://www.cryfs.org/)

------
serverguy
For desktop backups, I think backblaze.com should do the job.

~~~
sliken
Desktops not running linux.

~~~
mdekkers
They now offer B2, which is really awesome.

