Ask HN: How do you Backup your Personal Files?
25 points by davidbrent on Aug 10, 2012 | 54 comments
Call me a child of the American media's fear tactics, but this well-publicized story of Mat Honan's hack has me questioning my backup techniques for personal files (pictures, music, important family legal docs, tax data, etc.).

Currently I have one hard drive on my home Windows 7 desktop machine, with Dropbox sync set up to mirror certain folders, and I use Google Drive for most of my pictures and other media. Now I am thinking I need a local option, something fully under my control (and unfortunately my responsibility).

With so many options out there, I wanted to see what the users of Hacker News were doing with their personal files, outside of the plethora of cloud solutions. Thanks!

First line of defense is a Time Machine backup to a 1TB drive connected to an Airport Extreme. No-brainer if you're on a Mac, in my opinion. Tip: excluding my Chrome profile from the backups reduced the size of most backups by an order of magnitude.

Second, CrashPlan. It's the only decent solution I could find with support for backing up network drives without being prohibitively expensive for large amounts of data. The client is a bit resource hungry, but I'm hoping the situation will improve as soon as the first run is complete.

I also have a Dropbox account which I mainly use for syncing and sharing, but I've included the most important files there as well for extra redundancy and ease of access on other devices.

In addition, all my code is pushed to at least one remote repository, either on GitHub or a server. I also run my own mail server which is rsynced to another server, so there are at least three separate copies of my mail folders.

I have almost the same setup as you, except for the mail system: I just use either Gmail or iCloud for that. Time Machine and CrashPlan work really well and do their job entirely in the background without any visible impact on my work. I really like CrashPlan and initially went for a 3-year unlimited plan: you can usually get a really good deal around Black Friday, when they drop their yearly unlimited prices.

All my Macs are now set up with CrashPlan.

I tried CrashPlan on my MacBook Air. The big thing I'm noticing right now is it takes freaking forever to do the initial backup. I'm really hoping that subsequent backups don't take nearly as long.

The initial backup can be long, yes... I usually leave my Mac on for a couple of days while that initial backup is happening. After that first backup is done, all the others are incremental and usually fast... unless of course you just created a multi-gigabyte movie that needs backing up.

I know you are looking for a non-cloud option, but Tarsnap (an excellent service by HN's cperciva - http://www.tarsnap.com/) is impervious to the sort of issues raised by Mat Honan's hack, as long as you properly secure your private keys. Of course it is still susceptible to the other weaknesses presented by the cloud, which is why I also use an rsync'd external drive.

Same here. Tarsnap + external drive, with the occasional burning of DVDs for Very Important Files™(with parity data).

I have a 1TB external drive, and occasionally plug it in and run a script that calls `rdiff-backup --exclude-other-filesystems foo /mnt/backup/bar` for multiple values of foo and bar.
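A wrapper script of this kind could look roughly like the sketch below. It is only an illustration: the source and destination paths are hypothetical stand-ins for "foo" and "bar", and the only rdiff-backup option used is the one quoted above.

```python
import subprocess
from typing import List, Tuple

# Hypothetical source/destination pairs; the comment above only says
# "multiple values of foo and bar".
PAIRS: List[Tuple[str, str]] = [
    ("/home", "/mnt/backup/home"),
    ("/etc", "/mnt/backup/etc"),
]


def build_command(source: str, dest: str) -> List[str]:
    """Return the rdiff-backup invocation quoted in the comment above."""
    return ["rdiff-backup", "--exclude-other-filesystems", source, dest]


def run_all(pairs: List[Tuple[str, str]], dry_run: bool = True) -> List[List[str]]:
    """Build one command per pair and, unless dry_run is set, execute them."""
    commands = [build_command(src, dst) for src, dst in pairs]
    if not dry_run:
        for cmd in commands:
            # check=True raises on the first failing backup instead of
            # silently carrying on to the next pair.
            subprocess.run(cmd, check=True)
    return commands
```

Calling `run_all(PAIRS, dry_run=False)` would run the backups for real; the default dry run just returns the command lines so they can be inspected first.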

It's not complete enough that I could wipe my root partition, restore the backup and start running again, but I'm not going to lose anything important if my laptop gets stolen.

I don't use cloud backups, mostly out of laziness. (I would have to think about what is actually worth backing up, and organise things so that it's easy to back those things up and not other things.)

I run a Tahoe-LAFS grid that spans a couple drives in my home machine and my office workstation. The majority of "archival" stuff I need to store that's large and relatively infrequently accessed (music, ripped DVD backups, hi-res TIFF scans of my artwork, RAW files for my photography) just lives in that grid. "Backups" are implicit. I even wrote my own music player that just streams out of Tahoe instead of off disk.

I actually had a (mostly full) 2TB drive go south just the other week and I was able to pull it out and replace it without any data loss. The rest of the grid was even still available while I replaced the drive. I put a new drive in, made a new Tahoe node on it, brought that node into the grid, then ran a "tahoe deep-check --repair" and it repaired and rebalanced all the files that had had shares on the broken drive.

My source code is vitally important, but small. I use git so I just always maintain a couple repos. Generally one on my workstation, one on our office git server, one on github, and there might be copies on my laptop, home machine, and a personal Rackspace server.

For my personal websites, I have a cron job that dumps data out nightly and drops it into Tahoe.

I've spent a lot of time thinking about this... I don't trust Drobo/Synology etc. because if the unit dies I need another one to retrieve my data. Jungledisk adds up if you have 100+GB. Backblaze is the best of these services, but both are proprietary, and imagine if either were cracked and your life ended up on a torrent server somewhere... In the end I decided upon an HP Microserver running Ubuntu with disks in RAID5 + a cheap UPS. I then have an rsync cronjob on my laptop. I tried FreeNAS but it was slow. I may consider an offsite backup of the Ubuntu server (critical files (photos etc.)) to AWS with Duplicity and PGP... So far I feel the safest and happiest I've been in a while about my backup situation.

I have multiple layers of redundancy. This is Mac-centric, but you can do something similar in other operating systems:

* I use a Time Machine to do incremental backups of my system drive which also contains any files that I don't want to lose.

* I periodically make a complete image of all my important drives. I have two copies of this. One copy stays off-site and I never have both at the same location as my originals. The other copy is local but left disconnected. I periodically rotate a copy to the off-site location. I use SuperDuper to make the images.

* I use DropBox for things I am working on and various other important files I want backed up and shared across my computers. This isn't big enough to contain the bulk of my media files (music, photos, video), but anything small enough goes here. A few of my super-important media files I keep here, like my wedding video and photos. I started using DropBox's photo syncing feature for my phone camera and am thinking of using it for all my cameras as well.

* Anything in my DropBox gets backed up by Time Machine as well on two different computers at two different locations.

I think it's really important to have at least two backups in the case that one of them fails. And one of them absolutely should be off site since a burglary, fire, hurricane, etc can potentially take them all out if they are all in the same location.

Also, it should not be possible for a remote attacker to erase all of your backups. For example, if you used a combination of Time Machine and cloud backup, an attacker could potentially gain access to your local machine, do a secure erase of all your drives including your Time Machine backup, and gain access to your online cloud backup and erase that. In a situation like that, you better hope that your cloud backup provider keeps enough backups such that you can recover erased files.

If you want to be even safer it's good to have 3 backups. One local, one off-site, and one off-site in a different city (e.g. Internet backup or shipping your drives somewhere). Now this is considering a severe worst-case scenario, but having an off-site backup in the same city isn't enough, because a hurricane or earthquake could potentially damage your drives in both locations.

I have a similar backup system as this, but I temper my use of DropBox for super important things because it is a hot storage solution. If I (or someone) happens to delete it from one DropBox location, it's deleted everywhere.

For that reason, I also SFTP my files up to Amazon S3 as a cold storage solution.

Multiple backups. Redundancy is the name of the game (redundancy and paranoia)!

1) Backblaze - fire-and-forget cloud backup, running in the background [constant]

2) Time Machine drive - what if I have no network access and need to recover work? [daily, when I'm at my desk]

3) Carbon Copy Cloner - what if I suffer a catastrophic boot disk failure? [weekly]

The drives for Time Machine and Carbon Copy Cloner are in separate geographical locations (fire, theft, etc).

CCC creates bootable backups, so if my internal SSD drive dies I can boot from USB or just switch the drive out, then restore any changes from TM. Minimal downtime!

I have a 6-drive Netgear ReadyNAS Probusiness that is currently at about 3TB. It supports Time Machine, so I back up my 2 Mac laptops through Time Machine only. I back up my desktops and all my other files on the ReadyNAS. I have a UPS on all my desktops and the ReadyNAS to prevent any power outages from causing data corruption. This came in handy a few months ago when a tree branch fell and took out my power lines.

I have two 2TB drives that I switch between every couple of months that I back up my ReadyNAS to. Because this data is basically my entire digital life (my photos, documents going back to about 1993, emails, etc.), I don't trust a straight file copy, so I wrote my own simple bit-by-bit file comparison program. An engineer at NetApp told me all the horror stories that he's been involved with in terms of hard drive reliability, including things like the hard drive reporting to the OS that it wrote everything correctly, when in fact it didn't, so I've become pretty paranoid. The last thing I want is to make a backup, and then find that what was written on my backup hard drive was corrupt.
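For anyone wanting to do the same kind of check, a byte-for-byte comparison is short to write. This is a minimal sketch, not the program described above; the function names and the 1 MiB chunk size are arbitrary choices.

```python
from pathlib import Path
from typing import List

CHUNK_SIZE = 1024 * 1024  # read 1 MiB at a time so memory use stays flat


def files_identical(a: Path, b: Path) -> bool:
    """Return True only if both files contain exactly the same bytes."""
    if a.stat().st_size != b.stat().st_size:
        return False
    with open(a, "rb") as fa, open(b, "rb") as fb:
        while True:
            chunk_a = fa.read(CHUNK_SIZE)
            chunk_b = fb.read(CHUNK_SIZE)
            if chunk_a != chunk_b:
                return False
            if not chunk_a:  # both files reached end of file together
                return True


def find_mismatches(source_root: Path, backup_root: Path) -> List[Path]:
    """List relative paths that are missing from, or differ in, the backup."""
    bad = []
    for src in sorted(source_root.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source_root)
        dst = backup_root / rel
        if not dst.is_file() or not files_identical(src, dst):
            bad.append(rel)
    return bad
```

One caveat: a comparison run immediately after the copy may be served from the operating system's cache rather than from the disk itself, so it is probably more meaningful after the backup drive has been unmounted and reconnected.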

I have a couple of 500GB hard drives that contain some older archived data from 3+ years ago, so my older photos and documents have about 4 different copies. The only thing I don't implement, which is probably my biggest weakness, is that I don't store any copies offsite, so if there's a fire, I'm screwed.

Except for database and photos, my data growth has slowed down considerably. I would guess that my database data grows about 100 GB every year (I collect stock quotes every day), and my photos and videos are, on average, 10-20GB. This year was an anomaly because I went on a road trip where I used 2 GoPros to take time lapse photos and videos of the entire road trip, and that data itself is about 100 GB, and 99% of it is useless (and embarrassing).

I've had certain personal data wiped from a hard drive many years ago, so I am a bit paranoid:

USB thumb drive for active files, 2 copies on external hard drives for long-term storage (which I migrate once a year to new drives), and two physical copies of critical documents (legal, tax) kept in two locations (usually my parents' house and a safe deposit box).

Hourly remote-initiated rsnapshot to MyBook Live NAS on internal network.

Then the NAS performs a nightly sync of selected folders from the latest hourly backup with tarsnap, keeping backups from the last 7 days, last 4 weeks and last 3 months.
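The "7 days, 4 weeks, 3 months" rule can be read in more than one way. The sketch below shows one possible interpretation (every archive from the last 7 days, plus the newest archive in each of the 4 most recent ISO weeks and 3 most recent calendar months). It is not taken from the setup above, does not call tarsnap, and all names in it are made up for illustration.

```python
from datetime import date
from typing import Iterable, Set


def archives_to_keep(dates: Iterable[date], today: date,
                     days: int = 7, weeks: int = 4, months: int = 3) -> Set[date]:
    """Choose which dated archives survive pruning under the assumed policy."""
    keep: Set[date] = set()
    seen_weeks: Set[tuple] = set()
    seen_months: Set[tuple] = set()
    for d in sorted(set(dates), reverse=True):  # newest first
        if d > today:
            continue  # ignore anything dated in the future
        if (today - d).days < days:
            keep.add(d)  # daily window: keep everything
        week = tuple(d.isocalendar()[:2])  # (ISO year, ISO week number)
        if week not in seen_weeks and len(seen_weeks) < weeks:
            seen_weeks.add(week)
            keep.add(d)  # newest archive of this week
        month = (d.year, d.month)
        if month not in seen_months and len(seen_months) < months:
            seen_months.add(month)
            keep.add(d)  # newest archive of this month
    return keep
```

Anything not in the returned set would be a candidate for deletion; a real script would map each date back to an archive name before removing anything.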

I have some items that I would not like to lose - photographs of my son.

These are copied to multiple machines, in different houses. They are uploaded to 2 different providers (facebook (natch) and dropbox.)

I also create rars with par2 redundancy data and burn them to good quality CDs which are stored in tyvek sleeves in a firesafe.

That's perhaps a bit over the top, but I really don't want to lose those photos.

Passwords and serial numbers are printed out and kept in the firesafe. That's perhaps a bit insecure, and I need to arrange an "in case of death" list.

Everything else used to be rsynced to a different machine in the same house, but now it's time machined.

A firesafe is a great idea, and I might follow your example. Do you know if the temperature inside the firesafe remains low enough so that things like CDs and hard drives don't get corrupt? I know that they work for things like papers, but are they deemed safe enough for electronics?

I'm a Linux user. I have a laptop that's my main machine and a fileserver for storage. I try not to keep anything on the laptop, but have my programming projects & config files on it (so I can work if I'm away from home).

The fileserver is where most of my stuff lives; it gets NFS mounted to the laptop. It's got two drives as RAID1, and backs itself up to an external hard drive using rdiff-backup. It also backs itself up offsite using duplicity to a server on another continent. Everything's encrypted, so if anything gets stolen I only have to worry about the material loss.

Dropbox syncing to 3 machines, one of which is a linux server in my closet. The linux server makes snapshots of the folder and saves rolling daily, weekly and monthly snapshots to a RAID1 array. (see http://neverusethisfont.com/blog/2010/07/how-to-back-up-drop...) The RAID1 array is then backed up to a separate RAID1 array nightly.

This has been working pretty well for me and is a good balance between automation and sufficient redundancy.

Arq backup works great. It does encrypted incremental snapshot backups to your personal S3 account (unlike most services which hold your backups hostage). It has an open-source recovery program. It has a documented format.

For local backups, I use a poor-man's Time Machine — rsync with hard links. It's how Time Machine works under the hood anyway, and I used it long before Apple baked it into the OS. Works great.
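The hard-link trick is easy to see in a few lines. rsync has a `--link-dest` option for this kind of snapshot; the sketch below is not rsync, just a small illustration of the same idea for regular files, with function and directory names invented for the example.

```python
import filecmp
import os
import shutil
from pathlib import Path
from typing import Optional


def snapshot(source: Path, dest: Path, previous: Optional[Path] = None) -> None:
    """Copy source into dest, hard-linking files unchanged since previous."""
    dest.mkdir(parents=True, exist_ok=True)
    for src in sorted(source.rglob("*")):
        rel = src.relative_to(source)
        target = dest / rel
        if src.is_dir():
            target.mkdir(parents=True, exist_ok=True)
            continue
        target.parent.mkdir(parents=True, exist_ok=True)
        old = previous / rel if previous is not None else None
        if old is not None and old.is_file() and filecmp.cmp(src, old, shallow=False):
            # Unchanged since the last snapshot: add another name for the
            # same data on disk, which costs almost no extra space.
            os.link(old, target)
        else:
            # New or modified: store a fresh copy, keeping timestamps.
            shutil.copy2(src, target)
```

Each snapshot directory then looks like a full copy of the source, but only changed files take up new space, and deleting an old snapshot does not affect the files that newer snapshots still link to.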

The OP uses Windows, though, so I honestly have no idea what equivalents to these options exist which would work with, e.g., NTFS.

Mark me down as another vote for Arq ... total set it and forget it setup. Very easy to use and inexpensive option for offsite backups.

Time Machine also uses directory hard links for when an entire tree is unchanged, as a pretty big optimization.

I have a cheap box at a webhosting company with ssh access.

I use svn over ssh and check in all of my personal stuff there.

This is really nice because I can share linux dotfiles, etc. between my work and home machines.

Just be careful those of you using sync too much, I've definitely seen people sync across deletes and other errors.

I personally back up about 1TB of home movies, photos, music, etc. I keep 1 local HD + CrashPlan + CrashPlan remote at a friend's house + Google Drive for docs I use daily. I also keep an offline drive just in case an error is replicated across my live drives.

I also used to use Backblaze but dropped them; they play tricks to get out of storing a lot of content for too long.

Can you expand on the 'Backblaze plays tricks' comment? I am a current user and am curious.

At the end of each day I run this Allway Sync app that makes both two way and one-way syncs of my folders to an external 320GB Samsung HDD. I don't propagate deletions for my photos and videos, so they accumulate forever. For all the other folders, I propagate deletions and modifications. Every couple of months or so I mirror this daily use external HDD with another one I have just to avoid any risk involving hardware failure.
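The "don't propagate deletions" setting is the important part of that setup. As a rough illustration (this is not how Allway Sync is implemented, and the function name and flag are made up), a one-way sync with that switch might look like this:

```python
import shutil
from pathlib import Path


def sync_one_way(source: Path, dest: Path, propagate_deletions: bool = False) -> None:
    """Copy new or changed files from source to dest; optionally mirror deletions."""
    dest.mkdir(parents=True, exist_ok=True)
    for src in sorted(source.rglob("*")):
        target = dest / src.relative_to(source)
        if src.is_dir():
            target.mkdir(parents=True, exist_ok=True)
        elif (not target.exists()
              or src.stat().st_mtime > target.stat().st_mtime
              or src.stat().st_size != target.stat().st_size):
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, target)  # copy2 also carries over the timestamp
    if propagate_deletions:
        # Visit the deepest paths first so files go before their directories.
        extras = sorted(dest.rglob("*"), key=lambda p: len(p.parts), reverse=True)
        for extra in extras:
            if not (source / extra.relative_to(dest)).exists():
                if extra.is_dir():
                    shutil.rmtree(extra)
                else:
                    extra.unlink()
```

With the default `propagate_deletions=False`, a photo removed by accident from the source still survives on the external drive, which matches the "accumulate forever" behaviour described above.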

I know this is controversial in light of their recent breach, but I use Dropbox in conjunction with a symlink into my docs folder. I don't have photos or videos, and my documents easily fit into Dropbox (I also have a lot of bonus space, from Dropquests, job fairs, and referrals). Everything automatically backs up, is accessible from any device, has revision history, and won't get destroyed if my house burns down.

I'm finding I generate very few "personal files" these days, compared with, say, ten years ago. Most personal files are produced/archived with Google Docs.

Yes, that is scary.

I also have 100 GB or so of archived photographs and videos that I back up with a mixture of Ubuntu One (videos and assorted files from 10 years ago) and Rackspace Cloudfiles (photos). I wrote my own Python scripts for backing up and verifying photos on Cloudfiles.
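The verification half of such a script usually comes down to comparing checksums. The sketch below is a generic local version, not the scripts mentioned above: it deliberately leaves out any Cloudfiles calls, and the function names are invented for the example.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def sha256_of(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large videos do not fill memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> Dict[str, str]:
    """Map each file's relative path to its SHA-256 checksum."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify(manifest: Dict[str, str], backup_root: Path) -> List[str]:
    """Return the relative paths that are missing or whose checksum differs."""
    problems = []
    for rel, expected in manifest.items():
        copy = backup_root / rel
        if not copy.is_file() or sha256_of(copy) != expected:
            problems.append(rel)
    return problems
```

Storing the manifest next to the originals means a downloaded copy can be checked later without re-reading both sides at once.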

I've also been working on fewer files locally. Everything is sitting in Google Docs. Or worse, I've mailed it to Gmail and it sits there unread.

I rsync between my two *nix boxes. My wife's Windows box I do every now and again, mostly video and pictures.

Dropbox for scripts I need across the *nix boxes.

And I do a manual copy to an external, my folder structure is the same on all boxes but I like to watch this one.

Personally I have been very lazy with backups, but I put important documents on a USB stick which I carry with my keys everywhere I go.

I should probably start backing up my personal hobby projects, though: I have lost a ton of them over the years because I only kept one working copy of each, and I have been too lazy to create a backup plan which I would actually follow.

1 TB Time Capsule for my most regular main backup (2 Mac laptops).

iCloud backup for iPhone and iPad (upgraded storage).

Arq backup direct to S3 for really important stuff like photos (http://www.haystacksoftware.com/arq/). I have about 200GB on reduced redundancy storage.

Rsync cron to my RAID 6 NAS box. Photos get sync'd to 3rd party service. Thumbdrive for critical passwords (keepass).

- Dropbox for documents and archived log files.

- Photos and videos are backed up to a truecrypt drive manually using FreeFileSync twice a week or so.

- Backblaze running constantly.

- Pogoplug (+1 external drive attached) offsite at my parents' house with Arch Linux installed. Eventually I'll attach another drive and have rsync mirroring them using a cron job.

I live exclusively in the Apple ecosystem, so I:

  1. back my Mac up with Time Machine;

  2. do a weekly Carbon Copy Cloner dd of my disk;

  3. keep copies of my data (documents, music, photos) on Dropbox.

iCloud handles syncing. This is not bulletproof, but it works for me.

I just keep everything I care about in my Dropbox. I use enough machines on a day-to-day basis that this leaves me with a decent local redundancy, and having it all on Dropbox means that I'll have access to it even if all of my machines go down at once.

I fear Dropbox may be more like RAID than like backup: what happens when a file is deleted, causing it to be removed from all of your redundant machines?

You have 30 days to recover it using the free version, and unlimited time if you pay for it: https://www.dropbox.com/help/115/en

Also if one machine is not updated (due to no network, for example), the file still exists there.

My first line of defense: Dropbox

I also have CrashPlan running in the background but hardly ever have to use it

I have an Apple Time Capsule as well but, like CrashPlan, I try not to have to use it and it's more of a worst case scenario backup.

Dropbox all the way.

Apple Time Machine and two external 2 TB hard drives for the iMac. Additionally, a daily backup of my MacBook Air to a 128 GB USB flash drive, which I copy over to the external hard drives once a week.

Mine is similar: Time Machine all the time, and a periodic drive image using Super Duper to one of two external (actually enclosure-less internal) drives, one of which is always offsite.

What will you do if your house burns down?

Time Machine via a drive hooked to an Airport Extreme along with Backblaze.

I've been wrestling with this same issue, ever since I read Google's EULA. I'm thinking of using Carbonite, which no one's mentioned yet.

1TB Touro Mobile USB, always sitting on top of my desktop. I just have to do a manual backup every so often using SyncToy.

I also use Dropbox etc free.

BackupPC handles all computers. Then...

Photos and documents (basically small stuff) -> rsync.net every day.

Large stuff (music, video) -> external USB disk.


Maybe this post should be converted to a poll, though?

Over 400GB backed up on an external drive and CrashPlan; this includes all my personal files and photos throughout my life.

* CrashPlan, family plan that has unlimited backup for $5/month

* Drobo FS for local backup, can go up to 5 HDDs as RAID.


Time Machine backup (daily) + bit-by-bit copy of hard disk using SuperDuper (on another external drive).

rsync to RAID 5 externals.

Crashplan running in background.

Google Drive for Docs and working files.

Github & other git services for projects.

Github + Dropbox + CrashPlan

rsync cronjob to my external 8TB :)

I don't know why the comment below this "What will you do if your house burns down?" is marked as dead ... it's a very valid question.

Backing up onsite is almost as bad as no backup at all. You are protecting against hardware failure ONLY.

If your house burns down you are screwed. Also, if someone breaks in and steals your computer, do you think they aren't taking your external drives as well?

Offsite backup is mandatory if you really care about not losing your data.
