Ask HN: What's your backup setup? Manual or Automated?
50 points by cdvonstinkpot on May 21, 2015 | 62 comments



1) I use a Time Machine drive at my desk. This is useful for quick recovery.

2) I use Backblaze (with encryption) for continuous backups to the cloud.

Backblaze was super useful once when I had a drive fail as I was traveling. I was able to retrieve my SSH keys immediately and continue working in the cloud. It also provides reassurance that, if there were an event such as robbery or fire at my apartment, losing my laptop and time machine drive together would not be catastrophic.

A final tip - I use http://tripmode.ch to disable Backblaze / Dropbox / Box / Google Drive / Spotify traffic while tethering, on bad wifi, on airplanes, etc. I find this tool essential for managing all of the services that continually sync in the background.


Here's what happens with manual backups:

Day 1: Let's test this! Hey, it works.

Day 2: Right, I should run this.

Day 3: This takes a long time.

Day 4: Maybe I should do this weekly.

Day 15: Right, time to do a backup.

Day 23: Weekly backup. This takes a really long time...

Day 45: I should do a backup next time I have time.

Day 421: Where are my backups?

Aesop says: only automated backups keep happening. Only backups that send you a message when anything at all goes wrong are worth having.

I say: nobody wants backups. Everyone wants restores.
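
For what it's worth, the "send a message when anything goes wrong" part can be as small as a cron wrapper along these lines (a minimal sketch; the backup command, log path and address are hypothetical):

    #!/bin/sh
    # /etc/cron.daily/backup-wrapper: run the backup, mail the log only on failure.
    LOG=/var/log/backup.log
    if ! /usr/local/bin/run-backup > "$LOG" 2>&1; then
        mail -s "BACKUP FAILED on $(hostname)" you@example.com < "$LOG"
    fi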


Attic (https://attic-backup.org/) backs up to my home NAS, then the NAS backs up to CrashPlan. I keep a certain number of daily/weekly/monthly snapshots; the backup script prunes once a day and verifies once every other day.

I also encrypt my backups before uploading them to CrashPlan. Each machine has a randomly generated key, and the list of keys is then encrypted with a master key. I back up the list of keys to CrashPlan as well, but not the master key. Instead, I split the master key using Shamir's secret sharing and left copies of the component keys in a variety of places. If something ever totally destroys my home and I lose my local copy of the master key, I can recover it by recombining a certain number of the component keys.
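
As a rough sketch of the prune/verify and key-splitting pieces (repo path, retention counts and share thresholds are hypothetical; ssss is the common Shamir's-secret-sharing package):

    # Daily prune, keeping a fixed number of daily/weekly/monthly archives;
    # verify the repository every other day.
    REPO=/mnt/nas/backups/$(hostname).attic
    attic prune -v "$REPO" --keep-daily=7 --keep-weekly=4 --keep-monthly=6
    attic check "$REPO"

    # Split the master key so that any 3 of 5 shares can reconstruct it.
    echo -n "$MASTER_KEY" | ssss-split -t 3 -n 5
    ssss-combine -t 3    # later: prompts for 3 shares and prints the secret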


I have been looking into both of these technologies recently.

Attic has the most attractive feature set of the backup management tools I have looked at (deduplication and compression, support for remote repos, built-in encryption, the ability to mount a repo as a FUSE file system). The main downsides are that it is fairly new (and so just hasn't had as much testing as some of the older options) and that its development model is not ideal: it is maintained solely by the original developer as a personal project, so when he is busy, development stops. Recently, a handful of active community contributors started a fork (https://borgbackup.github.io/) because of this. The original developer still seems committed to working on Attic when time permits, but I'd prefer to see a team of capable developers maintaining something I was relying on for the integrity of my data.
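
The FUSE mount feature in particular is handy for spot-checking backups; roughly (the repo and archive names are hypothetical):

    attic mount /mnt/nas/backups/laptop.attic::2015-05-20 /tmp/restore
    ls /tmp/restore              # browse / copy individual files
    fusermount -u /tmp/restore   # unmount when done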

CrashPlan was the most attractive offsite backup solution because its price was competitive (similar to several other options, ~$6/month to back up one computer with no data size limit) and it provided cross-platform support for Windows, OS X, and Linux. I also like that it has a few different encryption options, including the option to generate your own encryption key that you do not share with CrashPlan.

I am curious about your decision to encrypt your data before uploading with CrashPlan. Are you trying to avoid storing any unencrypted data on your NAS because it is more exposed to the internet? Or do you not trust CrashPlan with your data? I would think that using CrashPlan with a custom key would be fairly secure. If you don't trust CrashPlan in that case, you probably shouldn't install it on your machine at all. The CEO of CrashPlan has commented on double encrypting data (http://superuser.com/a/589686).

What is the restore process like when combining Attic and CrashPlan? I don't think CrashPlan has any file system mount option, so you would need to restore an entire Attic repo to recover any files from it, correct? I guess this is okay since you have the NAS to restore from for individual file backup and you only need to restore from CrashPlan in the case of catastrophic failure (the NAS dies entirely) when you would need to do a full restore.

My current backup solution is an rsync script backing up to a remote machine I own. I previously used Bitcasa for offsite backup but was unhappy with the service and with the way the company kept changing its business model (shifted away from consumer cloud storage to a business / app market). I'd like to replace my rsync script with something more sophisticated like Attic and use another offsite option like CrashPlan.
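
For anyone curious, the rsync-to-a-remote-machine approach is typically a one-liner run from cron, roughly like this (the host and paths are hypothetical):

    # Mirror the home directory to a remote box over ssh, deleting files
    # that no longer exist locally.
    rsync -az --delete -e ssh /home/me/ backup@backuphost.example.com:/srv/backups/me/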


I encrypt the data before shipping it off to CrashPlan mostly because it's the most flexible. I have a small amount of "sensitive" data related to my research work. It's not really anything that is actually worth protecting IMO, but I'm obligated to meet certain requirements when storing it. Encrypting it before uploading lets me have more fine-grained control over how it's encrypted. I also like to use different keys for different machines. I do actually trust CrashPlan, but I like doing it this way. The NAS isn't exposed to the internet, so that's not the issue.

Restoring from CrashPlan is basically as you describe. Assuming my home NAS isn't destroyed, I'd just restore from the appropriate Attic repo for the machine off the NAS. If the NAS was destroyed too, well, I'd have to restore from CrashPlan. My backups are targeted though, so a full restore is at most about a TB of data (basically my documents, configuration files, and music collection).


I use BackupPC (http://backuppc.sourceforge.net/) on an always-on 8-watt Fit-PC2 (http://www.fit-pc.com/web/products/fit-pc2/). BackupPC grabs twice-daily backups of my servers and once-daily backups of my personal machine, and emails me if there's any trouble. The backups aren't dependent on any third-party service, I don't have to worry about who else has access to the files and data on my personal system, and I habitually check its web interface for extra peace of mind.

I think it's hands-down the best cross-platform backup system currently available, but for some reason it's not a popular approach.


Seconded for BackupPC. I've been using it since I first saw a release announcement for it on Slashdot ~15 years ago. I combined it with automated weekly archival uploads to an off-site server (that I was also using for other purposes), as well as a manual monthly transfer I would make to an external hard drive (that would otherwise remain powered off, in case someone/something managed to get access to all of my live machines).


If your backups are automated, you also need an automated backup checker to ensure your backups are reliable and will work the very day you need them: https://github.com/backupchecker/backupchecker


But who checks the automated backup checker?


$ cat /etc/cron.daily/x_mailbackup

#!/bin/sh

# Mail a listing of backup files modified within the last day;
# an empty message body means no fresh backup was written.
find /media/wd1T/backup -mtime -1 | mail -s "Backup Data 1 day old alt mdorf/valun " xx@xxxx.com


"Build Failing"

Well, that's not right... lol


Thanks for sharing this code!


No one at the time of posting has mentioned keeping a physical copy of backups stored off-site.

Low-probability, high-impact events do happen.

Copying my hard drive to USB and storing it in the same room doesn't protect against a lot of other failure modes - fire, theft, and other social or natural disasters.

For a small company or startup: code is often versioned. Is everything else? Perhaps buy a cheap USB stick for a dump of all the admin files that could have an impact should they get corrupted. Make a copy each week onto a new USB stick and put it in your car's glove box (which hopefully fills up, as you accumulate a stack of backups). This is super important for audits: as a manager you should be able to trace changes in office documents easily, something someone more naïve would assume is already covered - I've seen it happen.

Copying to the cloud puts a lot of trust in the cloud being there in the event of a failure. If the failure modes of my business's connectivity and of the cloud are independent, this makes a lot of sense. It makes less sense when passwords can be easily shared, floppy disk controllers introduce rooting vulnerabilities, etc.

An off-site physical medium makes a lot of sense for backups. Encrypt a USB stick and keep it in your car. If you have an office with multiple sites, send a USB/HDD/SD card containing backups between sites every couple of weeks.
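
A minimal sketch of the "encrypt a USB stick" part with LUKS/cryptsetup on Linux (the device name and paths are hypothetical; other platforms have their own equivalents):

    sudo cryptsetup luksFormat /dev/sdX1            # one-time: encrypt the stick
    sudo cryptsetup open /dev/sdX1 backupstick      # unlock it
    sudo mkfs.ext4 /dev/mapper/backupstick          # one-time: create a filesystem
    sudo mount /dev/mapper/backupstick /mnt/backup
    rsync -a ~/admin-files/ /mnt/backup/            # copy the files you care about
    sudo umount /mnt/backup
    sudo cryptsetup close backupstick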

Low-probability, high-impact events do happen. And they can have an irrecoverable impact.

Have a routine. Then, as others have said, you have a recovery solution, not just a backup.


MacBook Pro (mine), MacBook Air (wife's). Both back up important data to CrashPlan's servers and also use the CrashPlan app to back up everything to my home server.

The home server is running Windows 8.1 with an SSD for the OS drive and a 2TB drive for other data (movies, TV shows, photos, home videos, downloads, Dropbox, etc.). Four more drives of various sizes and models (gradually moving to 3TB+ WD Reds) are pooled together using DriveBender with duplication. This drive pool is where the local CrashPlan backups are stored, as well as occasional full disk images taken before OS upgrades. The server's own OS and data drives are also backed up to the drive pool with CrashPlan and to CrashPlan Central.

All home photos and videos are also backed up to Google+ Photos. All documents are scanned, stored in Evernote, and shredded.

All of the computer backup is automated.

The photo backup is automatic, but organization takes my intervention. My phone and my wife's both upload 2048x2048 images to our Google accounts. They also back up full-res images to my Dropbox. Once a month or so, I transfer the full-res images to the home server and upload any good ones (of our son) to Google+ in full res to share with our families.

Document backup is a pain in the ass. I just collect a bunch of mail and documents on my desk, and every so often I use my Fujitsu ScanSnap to scan it all; it automatically uploads to Evernote. Sometimes I label them by date, but I've been too lazy to do that lately, so they get thrown in an archive notebook and I rely on search.

I think this pretty well follows the 3-2-1 rule.


Arq backs up my OS X machine and dumps the files into Amazon Glacier. Arq is inexpensive, and so is the Glacier storage, so this setup works for me.

Arq has failed a few times without telling me, so I am not going to maintain a solely Arq-based setup in the long term, but it's fine for now.

In the future, I'd like to add some redundancy, and backup to a location that I have physical control over. For now, however, I am pretty confident in this setup.


I have a PowerEdge server at home with two different sets of hot-swap drives, a 6x500GB in RAID6 and 2x2TB in RAID1. It also has one top bay for easy hot-swap.

The RAID6 array is for nested-VM experiments with OpenStack, ESXi, etc., whereas the 2x2TB drives are in RAID1 and exposed via NFS and Samba. My Mac uses this share to store its Time Machine data, plus just general crap I need to put somewhere and a full sync of my Dropbox and SpiderOak directories. I've got a cron job on the box that detects when a drive's been plugged into the hot-swap bay and rsyncs from the RAID1 array to that disk around 3AM. It sends me a Pushover message when it's done the sync, and sometime during the week I cycle that drive out to my safe-deposit box and plug its alternate in. Occasionally I verify these backups by plugging them into my desktop's hot-swap SATA bay, but for the most part I'm confident in their success at this point, so that's not an every-week thing.
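
A sketch of what such a hot-swap-detecting 3AM job could look like (the disk label, mount point and Pushover tokens are hypothetical):

    #!/bin/sh
    # Runs from cron around 3AM: if a rotation disk is in the hot-swap bay,
    # mirror the RAID1 array onto it and send a Pushover notification.
    if [ -b /dev/disk/by-label/OFFSITE ]; then
        mount /dev/disk/by-label/OFFSITE /mnt/offsite
        rsync -a --delete /mnt/raid1/ /mnt/offsite/
        umount /mnt/offsite
        curl -s -F "token=APP_TOKEN" -F "user=USER_KEY" \
             -F "message=Offsite rsync finished on $(hostname)" \
             https://api.pushover.net/1/messages.json
    fi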

I do cloud architecture and stuff at work, but for my own data I'm more comfortable knowing where, physically, my canonical copies are.


Manual: Time Machine using 2 hard drives (different models, to lower the probability of both going down at the same time), one at work, one at home (I sync mainly when Time Machine nudges me every 10 days), which takes care of the "something burns down" scenario. Also, a separate hard drive for sizable personal media (raw photos/videos I don't intend to use soon), which is the only weak point (I should research Amazon Glacier or a similar service).

Automated: all important work also lives online (GitHub for code, Google Drive for documents, Gmail keeps all conversations).

Apart from the media hard-drive, I think this follows the 3-2-1 rule for back-up:

- Have at least three copies of your data. (3 full copies, fragmented data online)

- Store the copies on two different media. (2 HD, 1 SSD, online)

- Keep one backup copy offsite. (2 full copies at different sites)


Both my server and my dev machine have RAID-1. In addition:

- My server does an automated full backup once a week to OVH's free backup space (on a separate NFS drive), and an incremental backup once a day (sketched at the end of this comment).

- My dev server (a synology NAS) does a daily backup of my important stuff (basically all my code and documents) to a backup on the RAID drive.

- Once every 2 weeks I manually download a full backup from my main server to my NAS.

- Occasionally I upload the "important stuff" backup from my NAS to my main server.

I don't regularly test my backups, but occasionally I need to extract something (because I delete something by mistake), so that tests them. I also check my logfiles regularly, so I would notice any RAID failures, failed backups, etc.

I think my procedures are sufficient for my purposes. Having RAID really helps with peace of mind!
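
The weekly-full plus daily-incremental pattern from the first bullet can be sketched with GNU tar roughly like so (the snapshot file, NFS mount point and paths are hypothetical):

    #!/bin/sh
    SNAR=/var/backups/home.snar      # tar's incremental state file
    DEST=/mnt/ovh-backup             # the NFS-mounted backup space
    if [ "$(date +%u)" = "7" ]; then
        rm -f "$SNAR"                # forget state so the next run is a full backup
        tar --listed-incremental="$SNAR" -czf "$DEST/home-full-$(date +%F).tar.gz" /home
    else
        tar --listed-incremental="$SNAR" -czf "$DEST/home-incr-$(date +%F).tar.gz" /home
    fi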


Time Machine backup to my Synology NAS, then a time-honored offsite backup -- I clone my drive and physically leave the latest copy at my parents' house.

Every time I visit, I bring a newer drive and swap them.

Simple, cost effective, foolproof.


Automated Portion: Every morning at 7:30am, a cron job runs to rsync my main computer's /home directory to my NAS that is sitting in a closet. The NAS has two drives that are mirrored to guard against hard drive failure. Every Saturday morning, a homebrewed script will separately check to make sure all my backups are current and will email me to advise me that everything is current and OK. If there is a problem, it will tell me which days are missing.

Manual Portion: Once every couple months, I backup the portion of my /home directory that is not heavy media like video to a large USB stick, which I keep at my office.

I don't love this setup, particularly because (1) an offsite backup does not happen automatically, (2) my video is not backed up offsite, and (3) it is not resilient to natural disasters in my area, since both my home and office are within a half-mile of each other in downtown New York City.

Moreover, because I keep my NAS mounted even when no backup is in progress, my backups are vulnerable to malicious code executed even at the non-superuser level. For example, a while back there was a bug in some gaming software (Steam?) where a badly written shell script ended up running rm -rf /*. If that had happened to me, it would have wiped my /home directory and my NAS backup.

Eventually, I'd like to set up a Raspberry Pi running in another area of the country and have rsync push daily backups over ssh to that offsite computer.


Cron job backs our two PCs up to an encrypted USB drive overnight, every night, using rsnapshot.

Backs up Linux PC (where the drive is attached) and Windows PC (via rsync courtesy of Cygwin).

Rotate the USB drive with its off-site partner every Wednesday (cron job emails me a reminder that morning.)

Every January I replace the external USB drives with new ones.

That's the PCs.
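
For anyone wanting to replicate the PC side, an rsnapshot setup of this shape boils down to a config excerpt plus two cron lines, roughly like this (hosts, paths and retention are hypothetical; rsnapshot.conf fields must be TAB-separated):

    # /etc/rsnapshot.conf
    snapshot_root   /mnt/encrypted-usb/snapshots/
    retain  daily   7
    retain  weekly  4
    backup  /home/                                  linuxpc/
    backup  rsyncuser@windowspc:/cygdrive/c/Users/  windowspc/

    # crontab
    30 2 * * *      /usr/bin/rsnapshot daily
    0  4 * * 0      /usr/bin/rsnapshot weekly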

For our phones, contacts are handled in OwnCloud, using CardDAV[1] to two-way synchronise with the phones. This is also really nice, as I can manage and edit contacts on the desktop; I like to avoid data entry on pokey little phone screen keyboards if I can help it. (I also use the contact data for other things, such as caller ID info popping up on the TV when the home phone rings, so a single source that updates everywhere is handy.)

I use Titanium Backup[2] to back up the phones daily, and FolderSync[3] to SFTP said backups to the desktop PC periodically (along with two-way sync of photos on the phone).

[1] https://play.google.com/store/apps/details?id=org.dmfs.cardd...

[2] https://play.google.com/store/apps/details?id=com.keramidas....

[3] https://play.google.com/store/apps/details?id=dk.tacit.andro...


I used to use BackupPC for onsite backups (it's a great product!), but I wasn't comfortable with my lack of an offsite copy. I've since moved to CrashPlan (http://www.code42.com/crashplan/), mostly because they have a Linux client as well as Windows, and also allow peer-to-peer backups.


I use Time Machine. One thing I learned recently is that there are a bunch of files that Time Machine doesn't back up - http://apple.stackexchange.com/questions/131399/what-folders...

In my case, I use nvALT as a note-taking tool. nvALT stores interim changes (ones that have not yet been persisted to the database) in ~/Library/Caches or something, which is on the exclusion list for Time Machine. Since I never really restart my machine or close the programs I use daily, those changes never really got backed up (I had months of notes that were not persisted). Long story short, I had to restore from a Time Machine backup, which mostly worked, but I did lose some stuff.

I keep the Time Machine exclusion list pretty slim now because, frankly, I don't care if my backup has caches and other garbage in it. I'd much rather recover things in a stateful manner that's least disruptive to my work.
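
On a related note, the user-managed exclusion list can be inspected and trimmed from the terminal with macOS's tmutil (paths here are hypothetical; the system's built-in exclusions, like the Caches folder, are separate and can't be removed this way):

    tmutil isexcluded ~/Library/Caches          # check whether a path is excluded
    sudo tmutil addexclusion ~/VirtualMachines  # exclude something you don't care about
    sudo tmutil removeexclusion ~/SomeFolder    # stop excluding something you do care about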


I wish Apple did backups to the cloud (which would make iCloud indispensable and a lot more useful).

I hope one of the storage giants (Dropbox, Google Drive, Box, etc.) rolls out an automatic backup service that makes backing up your computer to the cloud really cheap. This could probably also be accomplished with a quick Python script.


Why don't you use Backblaze or CrashPlan?


Wow didn't know those existed, thanks! I think I will try Backblaze :)


I used https://www.backblaze.com/ when I was mostly using Windows. $5/mo or $50/year per computer with no limit on file size! Seriously, I've had multiple terabytes on there and it never complained. And the upload speeds keep getting faster.

One limitation of Backblaze is a limited file history. If a file gets modified and you don't notice, the old version will eventually disappear. So I have my "important" stuff in a Dropbox account with "packrat" enabled for infinite history.

My Linux-based data is on a RAID-1, which is not a backup. But I can't find a cheap enough service to hold another 7.5TB of data.

(Also I'm going to plug Archive Team's backup of the Internet Archive. http://archiveteam.org/index.php?title=INTERNETARCHIVE.BAK/g... If you have a Mac or Linux box, put your spare hard disk space to good use.)


I use a combination of LVM thin volume snapshots and btrfs snapshots. While I could do fine without btrfs snapshots, I like having the ability to browse old versions online in order to diff or restore them. The thin volume pool is mirrored using LVM's rather new raid1 volume type (not the mirror type). If the filesystem fails (which has happened twice already), I can merge the working LVM snapshot back and lose at most a day's worth of work.

However, if the LVM fails, I have an offline copy with the same setup (LVM and btrfs) where I use btrfs send/receive functionality. If that happens I will lose some days of work; if both LVM VGs fail, I'm lost. The offline copy also holds more btrfs snapshots than the online one, in order to reduce disk usage on the online storage.

I do it all manually, but I am planning to automate the btrfs snapshots. The LVM snapshots won't ever happen automatically though, as I don't want to snapshot a bad filesystem and I am unsure how to reliably detect filesystem errors without using the system.
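
Roughly what the btrfs half of this looks like (subvolume paths and snapshot names are hypothetical):

    # Take a read-only snapshot of the data subvolume, named by date.
    btrfs subvolume snapshot -r /data /data/.snapshots/$(date +%F)
    # Replicate it to the offline copy, sending only the delta from a
    # previous snapshot that both sides already have.
    btrfs send -p /data/.snapshots/2015-05-20 /data/.snapshots/2015-05-21 \
        | btrfs receive /mnt/offline/.snapshots/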


Windows desktop: has script to sync a handful of directories to Linux server. Run manually. Desktop has another drive that has images of the other drives and backups of game service downloads (Steam, GOG, etc.) that aren't anywhere else, to save space.

Linux server: has a script to sync the data drive to an encrypted external drive. Run manually because the drive needs to be manually connected and the password entered (there's probably a way to work around this, but I'm lazy). I have 3 external drives that are swapped out roughly every 1-2 weeks to work, and every 1-4 months to another state.

It might seem too manual, but I have a workaround. I host a podcast[0] where I say "don't forget that today is international backup awareness day[1], so backup your stuff" to remind me.

0: http://thenexus.tv/category/cs/

1: http://blog.codinghorror.com/international-backup-awareness-...


Mozy for personal (non-prod) computers. It's like $60/year. Set and forget: my computer was stolen last year, and I was up and running a day later, most of which was spent purchasing the new computer (a MacBook).

Prod is more complicated, my company works with several multi-TB data sets (MongoDB) consisting of billions of swiftly changing kilobyte-sized documents. This can't be "backed up" in the traditional sense because it's impossible, barring custom hardware/great expense, to take a consistent snapshot of a multi-TB dataset distributed across dozens of machines. So we do the usual distributed replication across datacenters, put RAID underneath, etc. I worry more about corruption due to application errors than losing a disk for this stuff, though.


Automated. My stuff syncs automatically with Seafile (I used to use ownCloud but strongly recommend you don't, since it eats files, and this is a known problem there) every night when I get home. And every night in the middle of the night, a cron job on my home server runs a partial, encrypted backup to S3, plus monthly full, encrypted backups. I have been bitten before, so something like this is extremely important.

On the business side, something similar, but nightlies and full weekly backups to S3 of our private docker registry and Gitlab content.

Basically, if a fire or something (knock on wood) destroyed everything I owned, including for some reason all the servers in all the datacenters, I could be back up and running in a couple of hours, I think.


Automated:

Mac:

- Arq (Highly recommended) to Google Drive (Unlimited) and Amazon Glacier

- TimeMachine to two disks and a QNAP NAS

Servers:

- Tarsnap

- duplicity (https://blog.notmyhostna.me/automated-and-encrypted-backups-...)


On servers I run a couple of homebrew shell scripts to do backups. On some I use the backup gem (https://github.com/backup/backup).

On my workstation/notebook (both Apple computers) I use Time Machine with different external hard drives that I rotate weekly.

All setups run automated, but sometimes I trigger them manually (usually right before and/or after some major change in configuration, or when lots of new data lands on the drives, e.g. when I copy all the holiday photos onto my laptop).

I run regular checks on server backups, e.g. checking that they "are there" and that I could restore them properly if needed.


Backblaze: backs up main hard drive automatically

Arq: backs up photo library to S3 Glacier automatically

Carbon Copy Cloner: clones main hard drive to external once every two weeks and photo library to external (stored at parents' house) once every month


I use rsyncbtrfs [1] (which I'm also the author of) to auto backup my OS, home and pretty much everything, daily. It's saved my ass on numerous occasions where I've deleted a file by mistake and could go back to the previous night's backup to restore it. Backup is where btrfs really shines. I have some 1200 timestamped directories, each of which represent a snapshot of my data at a certain date. I can just cd into them and look at the files.

[1]: https://github.com/oxplot/rsyncbtrfs


I use BitTorrent Sync to sync smaller devices with a home desktop computer. The desktop computer has read-only access to the smaller devices to keep it one-way. The desktop computer has a Windows storage pool in a RAID 1-style configuration. That then syncs with Backblaze so I have an offsite backup. This setup doesn't deal with bitrot properly, and my large photo collection doesn't fit well with it either, so it's still a work in progress.

I'm thinking of maybe doing something with Camlistore, and eventually taking advantage of the 'unlimited' storage you get with OneDrive.


I wrote my own backup system, called Snebu (http://www.snebu.com hosted on github at http://www.github.com/derekp7/snebu).

It works a lot like rsnapshot, but uses an SQLite3 DB for metadata storage and also does streaming compression (an encryption filter module is also coming soon). Client-side requirements are light -- just a system with GNU tar and GNU find, along with a bash shell for running the client-side script.


Automated.

Macs: Time Machine to a NAS, plus Backblaze.

Windows: Genie to NAS, plus Backblaze. (Genie sort of sucks performance-wise, but all of the backup solutions on Windows seem to be terrible. I miss the built-in one from Windows 7.)

Linux: rdiff-backup to NAS.

Servers: Linode backup.


I don't use Windows myself, but I've often found https://www.bvckup2.com/ quite satisfying when setting up backups for Windows users among friends & family.


Thanks! Looks really slick, I'm giving it a try. It's too bad there are no snapshots or restore/recovery, but that's probably fine; on Windows those tend to be pretty flaky anyway.


I'm currently on Heroku, so my code is all in git and Heroku provides free automated backups for Postgres. I'm eventually moving everything over to Digital Ocean though, in which case I'll probably have a bash script that runs as a cron job to rotate backups. Something like https://wiki.postgresql.org/wiki/Automated_Backup_on_Linux

I don't have massive amounts of important data to manage though so a simple solution like this is all I need.
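
The rotating-backup cron script meant here is usually just a few lines (the database name, paths and retention are hypothetical):

    #!/bin/sh
    BACKUP_DIR=/var/backups/postgres
    # Dump in custom format (compressed, restorable with pg_restore).
    pg_dump -Fc mydb > "$BACKUP_DIR/mydb-$(date +%F).dump"
    # Keep two weeks of dumps, drop anything older.
    find "$BACKUP_DIR" -name '*.dump' -mtime +14 -delete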


Manual: Every now and then I run an rsync script to backup my stuff to an external drive.

Automated: I have SpiderOak set up to auto-backup select folders. I have about 1 TB of offsite storage for $12/month.


Crashplan. Automatic incremental backups to local and the cloud.


We set ours up for daily backups for all our employees. We set it to only use 25% of their bandwidth and it just runs in the background, invisibly to the end-user.

If there is a problem running the backup (three consecutive days missed), they get an automated email letting them know. And IT also gets notified, so we can reach out to the end user proactively to solve the problem.

Having automated backups has saved our bacon on more than one occasion!

Rich BakerRisk


A script to open/mount an encrypted external hard drive, an rsnapshot script to do the deed, and a third script to close/unmount the external drive. Run every three days or so. There's a fourth script to mirror to a second external drive, run every two weeks. Overall, it's three one-word aliases, one password input, and a cron job.


jwz backup:

http://www.jwz.org/doc/backups.html

I have a script that makes this a little more complicated and gives me incremental backups:

https://github.com/krupan/incremental_backup


Automated. A cron job on the ownCloud box ships PGP-encrypted backup nuggets to S3 via Duplicity. Gentle reminder for everyone here - TEST YOUR RESTORES! Nobody gives a rat's ass about backups; they only care about restores.

http://www.taobackup.com/
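
As a concrete sketch of that kind of Duplicity-to-S3 cron job, plus the restore test it preaches (the bucket, key ID and paths are hypothetical):

    # Incremental by default; force a fresh full backup once a month.
    duplicity --encrypt-key ABCD1234 --full-if-older-than 1M \
        /var/www/owncloud/data s3+http://my-backup-bucket/owncloud
    # And actually exercise the restore path now and then:
    duplicity restore s3+http://my-backup-bucket/owncloud /tmp/restore-test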


Since Attic is fast and deduplicating, I do hourly backups via cron at work. At the end of the day I prune them all and do a daily backup via systemd on shutdown. For offsite, I manually sync via USB stick and unison, and merge at home, where daily backups are done as well. Rarely, I update a backup HD stored at my parents'.


CrashPlan. Easy to set up and forget. Unlimited storage with history. The cheap family plan for 10 computers means I don't worry about rescuing my mother-in-law from a ransomware trojan. Really happy about not having to fiddle with any custom setup like I did before.


I use a shell script which I call with a keyboard shortcut at the end of the day; it does the following (a rough sketch is below):

- rsync-based backup of a sub-home directory where I keep sources, documents, and other things I want to be saved in case of disaster

- backup of all Trello cards via a custom python script

- machine shutdown
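
Such a script might look roughly like this (paths, the Trello exporter script and the shutdown command are hypothetical placeholders):

    #!/bin/sh
    # End-of-day backup, bound to a keyboard shortcut.
    rsync -a --delete ~/work/ /mnt/backup/work/            # sources, documents, etc.
    python ~/bin/export_trello_cards.py ~/backups/trello/  # hypothetical custom exporter
    systemctl poweroff                                     # shut the machine down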


Automated. Hourly to two separate USB drives, nightly to NAS, both with the superb https://bvckup2.com. The drives go to the safe-deposit box at the bank when I leave for long trips.


Personally, I use Macrium Reflect incremental backups on my Windows machines, Time Machine on my MacBook, and tar scripts on Linux. These are copied to a NAS, which is further synced with Dropbox.


Automated

Backblaze for persistent cloud backup.

Semi-Manual (I need to dock my laptop and plug in the drives)

Weekly copy to external HDs (at home) using SyncBack

Manual

I back up my FileZilla profiles, Google Docs, and project-management web app (an XML file) by hand.


Syncthing to synchronize everything between the laptops and two storage servers (one local, one remote).

Bup to snapshot the synchronized files every 5 minutes.
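
For reference, the every-5-minutes Bup snapshot is typically a cron entry like this (paths and branch name are hypothetical; assumes `bup init` has already been run):

    # Index the synced tree and save it under the "sync" branch every 5 minutes.
    */5 * * * * bup index ~/sync && bup save -n sync ~/sync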


I've been meaning to setup a backup plan for like the last 20 years. Maybe over the Memorial Day weekend.


Automated. Fully scheduled, so users aren't required to worry - the backup is done in the background. Seetha


Time Machine, plus Arq (which backs up to amazon S3 glacier storage).


Same, except I recently added Google's new Glacier competitor (forgot the name) for additional redundancy.


> Google's new Glacier competitor

Nearline storage. (https://cloud.google.com/storage/docs/nearline)


Automated.

I have a Raspberry Pi that is always running, and I use it as a Time Capsule.


Clone copy with SuperDuper whenever I remember.


Automated.



