
Ask HN: How do you backup your files without depending on a third party service? - sbjs
I'm using GitHub, Dropbox and iCloud, but this makes me nervous about my data's privacy and longevity for a number of reasons. Among the three services, I have ~700GB of data.

But I got to thinking: aren't USB thumb sticks reliable and big enough nowadays to fit this much data? I could just buy two 512GB sticks and use rsync to back up to them a few times per week. This way I wouldn't have to lug around an awkward hard drive if I'm ever traveling.

What strategy do you use to keep your data safe and private for the long term? Do you have a portable solution, or do you recommend something else?
======
old-gregg
Synology or QNAP NAS. They collect backups from all computers in the house,
encrypt them, and ship the encrypted blobs to AWS Glacier.

Synology in particular is a fantastic platform that does way more than a
traditional NAS. It's quite a polished personal cloud platform (far more so
than FreeNAS) and includes hundreds of apps; even the backups into the cloud
can be done in several different ways.

I also place a premium on the fact that Synology is a piece of hardware. The
software comes built in and can be taken for granted. What this means is that
it's been sitting in a dark corner of my house since 2012, self-updating and
never bothering me with upgrades or an expired credit card / subscription,
etc. It doesn't care whether I run Linux/macOS/Windows; it's just dumb storage
that all of these OSes can back up to using built-in tools.

Another tool I like is Resilio Sync [1]. It's a cloudless Dropbox, and it's
perfect for syncing /Documents and /Desktop across computers at home and at
work. It's not a backup tool, but it's closely related: if I put a file on my
desktop at work, I know it will end up on the Synology at home and eventually
get encrypted and backed up to Glacier, and yes, without any 3rd-party servers
involved.

[1]
[https://www.resilio.com/individuals/](https://www.resilio.com/individuals/)

~~~
iooi
I would stay away from QNAP, their quality is terrible. I had a QNAP QVR and
it just reeked of insecurity.

My friend has one of the smaller Synology boxes with 20TB and it is pretty
nice. You can think of it as the Apple of the NAS systems out there:
everything just works.

However, if you like to tinker with your NAS a bit more, FreeNAS is excellent.
It's pretty comforting running ZFS instead of standard RAID, and you get a
FreeBSD system where you can run anything you want in native jails.

~~~
lsh
Agreed, the QNAP OS/distro/apps just feel shoddily made and maintained. If you
maintain systems for a living you'll go berserk trying to track down exactly
what it is doing and why, and keeping up with the constant (weekly?) firmware
security updates.

I haven't tried Synology in a long time. I'm looking to get my few TBs off of
the QNAP and sell it in favour of something custom built.

------
galadran
I have a 4TB hard drive attached to an OpenWRT router (Raspberry Pis are also
a popular choice). The router exposes a locked-down SSH port (SFTP only)
through a cheap $5-a-month VPS.

I also use Jottacloud [1] who provide unlimited (yes unlimited) storage for
7.5 EUR a month. Based in Norway, they keep an up to date warrant canary.

Client-side, I back up to both of these services using Duplicati [2], which
offers client-side encryption, block-level deduplication and excellent
integration with Jottacloud and SFTP. For synchronisation, I use Syncthing [3]
since I don't want to expose plaintext documents to Jottacloud. All in all,
I'm very pleased, and it has required no maintenance since I set it up a year
ago.
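In case it helps anyone replicate the router/VPS part: the usual trick is a reverse SSH tunnel from the router out to the VPS. A minimal sketch (hostnames, ports and the account name are placeholders, not my actual setup):

```shell
#!/bin/sh
# Sketch: expose the home router's SFTP through a cheap VPS via a reverse
# SSH tunnel. "cheap-vps.example.com" and port 2222 are placeholders.
cat > tunnel.sh <<'EOF'
#!/bin/sh
set -eu
# Run on the router: forward VPS port 2222 back to the router's own sshd.
# (sshd on the VPS needs GatewayPorts enabled for outside clients to reach it.)
ssh -N -R 2222:localhost:22 tunnel@cheap-vps.example.com
EOF
chmod +x tunnel.sh
# Backup clients then point Duplicati/sftp at cheap-vps.example.com port 2222.
```

A supervisor like autossh or a systemd restart policy keeps the tunnel up across drops.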

[1] [https://www.jottacloud.com/en/](https://www.jottacloud.com/en/)

[2] [https://www.duplicati.com/](https://www.duplicati.com/)

[3] [https://syncthing.net/](https://syncthing.net/)

~~~
ioddly
I just started experimenting with Syncthing because Dropbox no longer meets my
needs.

So far I'm pretty happy and impressed; it's a decentralized piece of software
that Just Works (TM). My plan for backups is to spin up a VPS somewhere and
have them back it up.

I also love that it can use LAN; I can finally sync up my giant mp3
collection.

~~~
JeanMarcS
+1 on syncthing.

I have one on my computer, one on our Raspberry Pi media box (with a dedicated
drive), and one on a dedicated VPS.

The new version using inotify really helped the Pi!

------
eltoozero
Go by the 3-2-1 rule: 3 copies of your data, 2 different media, 1 offsite.

Similar to the military mantra: three is two, two is one, one is none.
(Meaning your "spare" should already be considered in production or broken, so
you need a "spare spare" to really have a spare.)

Time Machine your Macs to a drive and also perform an image backup to a second
drive, keep them both in separate places and implement a rotating image or
Time Machine backup.

On your PCs File History is probably adequate, and you can image the drive
with something free like Acronis.

On Linux there are a hundred ways, but the O'Reilly 'crocodile' book (Backup &
Recovery) has many good solutions.

Also as a tip you can get solid LTO5 mechanisms for a good price these days on
eBay and they're pretty robust and hold a good chunk of bits.

Synology NAS are good machines.

------
koolba
If you’re tech savvy and want the easy answer then check out Tarsnap[1].

If you’re tech savvy and want to save a few bucks by doing it yourself, check
out borg[2]. You’ll need to set up an external server for it but at scale (TB)
it’s an order of magnitude cheaper than Tarsnap.

Regardless of off-site storage, I'd suggest also buying two large USB drives
and formatting them with LUKS. Keep them offline and only sync them
occasionally, as live storage isn't really a backup.
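The occasional-sync routine is only a handful of commands. A sketch for reference (not to run blind: /dev/sdX is a placeholder for the USB drive, and luksFormat destroys whatever is on it):

```shell
#!/bin/sh
# Sketch of the LUKS + occasional-sync approach; saved as a script for
# reference. /dev/sdX and the mount point are placeholders.
cat > offline-sync.sh <<'EOF'
#!/bin/sh
set -eu
DEV=/dev/sdX                      # the offline USB drive (placeholder)
# One-time setup (DESTROYS existing data on $DEV):
#   cryptsetup luksFormat "$DEV"
#   cryptsetup open "$DEV" backup && mkfs.ext4 /dev/mapper/backup
cryptsetup open "$DEV" backup     # prompts for the passphrase
mount /dev/mapper/backup /mnt/backup
rsync --archive --delete "$HOME/" /mnt/backup/
umount /mnt/backup
cryptsetup close backup           # drive is offline again
EOF
chmod +x offline-sync.sh
```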

If you’re not tech savvy I’m convinced what you’re asking for is impossible.
The best a layman can do is pick a reputable provider (like the ones you’ve
listed).

[1]: [https://www.tarsnap.com](https://www.tarsnap.com)

[2]:
[https://borgbackup.readthedocs.io/en/stable/](https://borgbackup.readthedocs.io/en/stable/)

~~~
otterley
Tarsnap is a third-party service which in turn depends on S3. The author
specifically requested solutions which _do not_ depend on a third party.

~~~
koolba
I know and I still gave the answer I thought was best.

Unless you're going to run your own servers in multiple locations (or convince
your friends to let you host a NUC at their homes), you're going to rely on a
third party. All that changes is the type of service you're receiving, i.e.
whether it's an object store (S3, B2, etc.), a virtualized server, or even a
rack in a data center.

------
nickjj
I use an external USB HD and automate my backups with rsync and cron. Works
great.

Here's a blog post showing everything from which drive I use to the backup
script: [https://nickjanetakis.com/blog/automatic-offline-file-backups-with-bash-and-rsync](https://nickjanetakis.com/blog/automatic-offline-file-backups-with-bash-and-rsync)

I've been using this strategy for about 20 years and it hasn't failed yet.
Although in the older days I had a slaved HD and also used CDs on occasion.

I would never trust a consumer grade USB flash stick. I've had so many fail on
me over the years.
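The rsync-plus-cron pattern boils down to a few lines. A minimal sketch (paths are illustrative, not the ones from the linked post):

```shell
#!/bin/sh
# Minimal mirror script for an external USB HD; paths are illustrative.
cat > usb-backup.sh <<'EOF'
#!/bin/sh
set -eu
SRC="$HOME/"
DST="/mnt/usb-hd/backup/"
# -a preserves permissions/times; --delete makes DST an exact mirror.
# The [ -d ] guard skips the run if the drive isn't mounted.
[ -d "$DST" ] && rsync -a --delete "$SRC" "$DST"
EOF
chmod +x usb-backup.sh
# cron entry (crontab -e): run every night at 01:30
#   30 1 * * * /path/to/usb-backup.sh
```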

------
cascom
I switched to Arq ([https://www.arqbackup.com/](https://www.arqbackup.com/)) a
while back after a near miss with a hard drive failure, and I think it's an
amazing product. The key elements to me are:

1. It is storage-location agnostic (back up to a USB stick, external HD,
network HD, Dropbox, Amazon, etc.)

2. Your backup data is encrypted locally before being backed up to any medium.

What's great is that I can back up one set of data to an external hard drive
manually as needed, another set to, say, Dropbox on a nightly basis, and
another set to a B2 bucket on a monthly basis (all of which are encrypted, so
I retain my privacy/data security).

Edit: I suppose that it is a "third party service", but I think it reliably
solves the lock-in and privacy problems you reference.

~~~
tothrowaway
I use Arq as well. Some things I don't like about it:

1. There's little information about what it is doing (which files are
currently being backed up, the current upload rate, the estimated completion
time, how much data is left, etc.)

2. Large files (like VM images or crypto containers) block small files from
being backed up. Essentially, your backup frequency can be no more than
largest_file/upload_rate (well, it's more complicated than that because of
deduplication).

~~~
cascom
#1 the information is limited, but it is available
[https://goo.gl/images/9zCMNR](https://goo.gl/images/9zCMNR)

#2 is an issue I've encountered as well

------
corv
Two FreeNAS servers on premises and in different locations with ZFS
replication.

Then use SMB, NFS, Time Machine, rsync, Syncthing, sftp or anything else you
like to push data to one of the FreeNAS machines.

Periodic ZFS snapshots prevent common cases of PEBCAK and ransomware.

Fully redundant, encrypted & open-source solution that can saturate 10GbE!
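The snapshot-plus-replication loop is roughly the following (a sketch assuming a pool named tank and a second box reachable as backup-host; FreeNAS can also drive this from its UI):

```shell
#!/bin/sh
# Sketch of periodic ZFS snapshot + incremental replication to a second box.
# Pool, dataset and host names are examples.
cat > zfs-replicate.sh <<'EOF'
#!/bin/sh
set -eu
DS=tank/data                        # dataset to protect (example name)
# newest existing snapshot becomes the incremental base
PREV=$(zfs list -t snapshot -o name -s creation -H "$DS" | tail -1)
NOW="$DS@$(date +%Y%m%d-%H%M)"
zfs snapshot "$NOW"
# send only what changed since the previous snapshot
zfs send -i "$PREV" "$NOW" | ssh backup-host zfs recv -F tank/data-replica
EOF
chmod +x zfs-replicate.sh
```

Run it from cron (or a periodic task in the UI) and both boxes keep a matching snapshot history.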

~~~
gravypod
If you have ecc memory and two locations this is probably the best option.

~~~
corv
If you don’t have a second location you can still replicate to
[https://www.rsync.net](https://www.rsync.net) with all the benefits ZFS
provides.

ECC is highly recommended but even a low powered CPU will do.

------
wristmittens
A lot of the replies either rely on third-party services (albeit encrypted) or
forget about the third part of the holy backup trinity: having something
offsite.

It sounds like you're thinking of local backup while you're traveling, which
is great. The portability of USB sticks seems perfect for your use case,
although I think a 1TB stick is still a stretch; maybe look into a small
self-powered external SSD. If using macOS you can encrypt it and use Time
Machine to auto-backup when plugged in (easy) or just use an rsync script
(manual).

But don't forget to ship these backups offsite once you're home. If you get a
couple of 2TB drives to account for growth, you could get into the habit of
depositing the current backup at the offsite location, grabbing the old backup
from offsite, and just rotating every couple of weeks. Offsite could be a
local bank safe deposit box, a work office, or out-of-town family. (I consider
family second-party, not third-party.) A local backup means nothing if you
have to evacuate due to flood (east coast hurricanes), fire (California
wildfires), or any unforeseen disaster.

I use this method to back up 10TB of photography that I can't do online
without a fiber uplink. Thankfully 10TB platter drives make it affordable
these days for me to flat-rate mail them to my parents across the country.
(And yes, Synology for local backup; that seems to be the hit here.)

~~~
sbjs
The common wisdom always says have an off-site backup. But what if the backup
is small enough to fit in your pocket? I always have a Swiss Army knife on me
anyway so if I have a backup thumb drive that’s the same size, it should fit
just fine, and be as safe as my house keys. I really think we’re in a new age
of self sufficiency because of advances in technology and physical storage
size.

~~~
wristmittens
One of the biggest benefits of off-site backup is stability. That's why
something like a bank's safe-deposit box or my parents' firesafe are my
preferred locations. A backup in your pocket is great until you lose your
keys, your pants get a hole in the pocket, you get pushed into a pond, etc.

------
opencl
You can buy USB3 M.2 SSD enclosures for ~$20. They are larger than normal
flash drives at about 5" long but still reasonably pocketable and much higher
performance than normal flash drives. Drop in a 1TB SSD for ~$200 and you're
good to go.

I had a friend in college that backed up all of his files on a flash drive but
he used git rather than rsync.

~~~
LeifCarrotson
> Drop in a 1TB SSD for ~$200 and you're good to go.

O_o ... For some reason, my mental model of SSD prices was stuck in 2014 at
$0.50/GB. As I type this, my laptop is busy moving a VM image over to the NAS
to free up some space on my 256 GB SSD.

But no, according to [https://pcpartpicker.com/trends/price/internal-hard-drive/](https://pcpartpicker.com/trends/price/internal-hard-drive/) and
[https://pcpartpicker.com/products/internal-hard-drive/](https://pcpartpicker.com/products/internal-hard-drive/), you can
indeed find SSDs at $0.20/GB (the low end currently being dominated by Crucial
MX500 drives at $0.18/GB). Looks like it's long past time to upgrade!

~~~
ta658567959
After my last SSD died, I'd have a hard time trusting SSDs again for backup
purposes (though this was probably 8-10 years ago, I suppose).

It was more the way it failed - I had one or two boots where a few errors
started to pop up, and then the next boot, nothing - the disk had disappeared
to the BIOS' eyes and any other tools - just dead as a doornail with no easy
way of getting any of the data off it.

I assume they've improved in that regard, or are spinning platters still a
safer proposition over the long term?

~~~
mmt
8-10 years ago is a very long time, especially for what was still a relatively
new, low-volume product in the consumer market.

> I assume they've improved in that regard, or are spinning platters still a
> safer proposition over the long term?

I don't think you need to assume; just check out what statistics are out
there. Sadly, I don't know of anyone publishing numbers for consumer SSDs like
Backblaze does for spinning disks.

------
toyg
The issue with "simple" backups is that, should your house/office burn down,
you lose everything, including the backups. Any alternative (like bringing the
USB sticks to a safe-deposit box, etc.) is much more cumbersome than standard
cloud-based solutions.

I have a SpiderOak subscription for 2TB that is private enough for my needs,
and that's where I back up pretty much every machine I have. I also keep a
low-power Linux server at home (a CuBox) with two 1TB disks in RAID-1, which
acts as a Time Machine target for the occasional quick restore. Most of my
code lives in GitLab or Bitbucket, and some business stuff in OneDrive. Family
pics and videos, which are massive, get pushed every few months to Amazon
Glacier.

~~~
sbjs
Not if I carry the USB sticks in my pocket at all times!

~~~
toyg
That's a recipe for losing data even more frequently...

~~~
sbjs
I haven't lost my house keys once. I'm sure I can manage to do the same with a
USB stick? Especially if it has a keychain hook so I can put it on my house
keys keychain!

~~~
toyg
I guess that's a strategy. It wouldn't work for me, I lose everything
(including house keys a few times).

------
0xbxd
Rclone ([https://rclone.org/](https://rclone.org/)) synchronizes (similar to
rsync) directories with a multitude of providers and allows for client-side
encryption. I use a combination of Google Cloud Storage (for really important
files and pictures) with the "crypt" extension and a USB HDD hub (4 disks)
with RAID 5 hooked to a Raspberry Pi for less important data. The backup
scripts start on boot. It's a charm. Oh, and rclone is written in Go, so it
works on all platforms, including Windows, macOS and Linux (and also ARM).
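A typical script along these lines might look like the following sketch (remote and path names are examples; the crypt remote is created interactively with `rclone config` and wraps the plain cloud remote):

```shell
#!/bin/sh
# Sketch: important files to a client-side-encrypted cloud remote, bulk data
# to the local USB RAID. "gcs-crypt" is an example remote name.
cat > rclone-backup.sh <<'EOF'
#!/bin/sh
set -eu
# encrypted copy of the important stuff (crypt remote wraps Google Cloud Storage)
rclone sync "$HOME/important" gcs-crypt:backup/important
# less important bulk data to the RAID hanging off the Raspberry Pi
rclone sync "$HOME/media" /mnt/raid/media
EOF
chmod +x rclone-backup.sh
```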

------
cik
I maintain three sets of backups - two personally, and one with a third party
service.

For the personal backups, I keep one at home (second location) and another in
the office. In both cases, I rely on rsnapshot
([http://rsnapshot.org/](http://rsnapshot.org/)). It's an oldie but a goodie;
I've been using it for over 15 years. It solves the rotating backup problem,
and thanks to the glory of hard links it reduces my disk usage.

I _also_ use duplicity to gpg-encrypt files that I save into SpiderOak (but
feel free to use S3 buckets). I very much like having encrypted backups just
work. In both cases I have my scripts notify me via a webhook when things fail,
and every 7 days regardless, just so that I don't have a heart attack
wondering if anything happened.

SpiderOak is new - about 8 years. The webhook is newer, about 4. The rest is
ancient and continues to work across everything, always.
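The duplicity-plus-webhook part sketches out to something like this (the GPG key ID, SFTP target and webhook URL are placeholders, not my real ones):

```shell
#!/bin/sh
# Sketch of gpg-encrypted duplicity backups with a webhook notification.
# Key ID, host and webhook URL are placeholders.
cat > duplicity-backup.sh <<'EOF'
#!/bin/sh
set -eu
notify() { curl -fsS -X POST "https://example.com/hooks/backup" -d "status=$1"; }
# encrypt to a GPG key so only that key's holder can read the backups
if duplicity --encrypt-key ABCD1234 "$HOME/documents" \
     "sftp://me@backup-host//backups/documents"; then
  notify ok
else
  notify failed
fi
EOF
chmod +x duplicity-backup.sh
```

A second cron job that always calls the webhook covers the "every 7 days regardless" heartbeat.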

------
nextos
I love simple software that doesn't do too much magic.

For this reason, I use rsync to a btrfs volume (both local, on a USB stick,
and remote over SSH).

Then I take a btrfs snapshot, so I have versioned and deduplicated backups.

Furthermore, rsync supports gitignore-like exclude files, so you can control
what does and doesn't get backed up in a very fine-grained way.

Tarsnap is also a great option.

For your particular use case, I'd go for both a remote server and a local pair
of USB sticks to have duplicated backups in different locations all the time.
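The rsync-then-snapshot combination fits in a few lines. A sketch, assuming /mnt/backup is a mounted btrfs volume with a 'current' subvolume:

```shell
#!/bin/sh
# Sketch: mirror into a btrfs subvolume, then freeze it as a read-only
# snapshot. Paths and the exclude file are examples.
cat > btrfs-backup.sh <<'EOF'
#!/bin/sh
set -eu
# --exclude-from takes gitignore-like patterns, one per line
rsync -a --delete --exclude-from="$HOME/.rsync-exclude" \
      "$HOME/" /mnt/backup/current/
# read-only (-r) snapshot: a versioned, deduplicated point in time
btrfs subvolume snapshot -r /mnt/backup/current \
      "/mnt/backup/snap-$(date +%Y%m%d)"
EOF
chmod +x btrfs-backup.sh
```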

~~~
newnewpdro
> I love simple software that doesn't do too much magic.
>
> For this reason, I use rsync to a btrfs volume

If btrfs were simple software redhat/fedora wouldn't have abandoned it in
favor of xfs.

~~~
nextos
Then use another filesystem with snapshot support.

My point is to use utilities that do one thing, and compose well.

~~~
newnewpdro
Ok, but this approach is pushing the complexity down the stack into the
kernel/filesystem. And a general-purpose performance-oriented snapshotting
filesystem is going to be making a lot of compromises that aren't really
ideally suited to backup duties.

Plan9's Venti/Fossil immediately come to mind as an example of simplicity in
this vein.

------
solox3
* "Syncthing" for keeping multiple copies of the current files, across devices in different locations, that will never malfunction at the same time. This is usually enough.

* "Back in Time" for keeping older file versions in external hard drives.

The two were chosen mainly because they have decent GUIs, both support diffs,
and the backups can be viewed using nothing but a file explorer.

~~~
mrmattyboy
Syncthing also provides backup versions (if updated on a different machine).
So if it pulls a newer version from another machine, it will back it up first
:)

Edit: Granted, time machine provides a MUCH more 'slick' interface

------
msravi
I store my data in 3 places:

1. A 2TB WD MyBook Live

2. A $60-per-year 1TB Amazon Drive account (contemplating moving to Backblaze)

3. A $100-per-year 1TB Google Drive account

I use rclone to sync with Google and odrive to sync with Amazon. Stuff on my
phone (including photos) gets synced to Amazon and Google using the respective
photos apps for photos and foldersync for other stuff.

------
TheWiseOne
I use Bvckup 2 ([https://bvckup2.com/](https://bvckup2.com/)) to backup on to
a 4 TB external drive. I also have Duplicati
([https://www.duplicati.com/](https://www.duplicati.com/)) running to backup
those backups onto OneDrive and Backblaze.

~~~
rl3
Just chiming in here to say bvckup2 is awesome. It's one of those rare pieces
of software that has rock-solid engineering. My heart warms with glee every
time I think about it.

I've been using it in place of robocopy for all my Windows backup needs for a
while now, with rsync for Linux. It's _fast_.

------
eps
Hourly backups to a NAS and one of two local USB3 SSD drives that I rotate
weekly.

You can get around 400 MBps write rate on the latter [1], which is pretty
decent by itself, but I also use bvckup2 that does block-level delta updates
and that cuts down backup time to around 10 minutes. That's on a bit less than
1TB of data mostly comprising very large encrypted file containers and VMs.
Worked pretty well so far, went through a couple of restores just fine, so no
complaints.

[1] [https://ccsiobench.com/s/fxkH0x](https://ccsiobench.com/s/fxkH0x)

~~~
babo
What's your use case for such a strict backup regime?

~~~
eps
I wouldn't call it strict. It still means an average of 15 min of work data
lost in case of primary storage collapse.

------
kencausey
Certainly local backup has benefits, but where it falls down in practice is
that it requires manual intervention from the user. Also, if you don't store
the backup media in a safe place (preferably in more than one location), then
an event that causes loss of the original may also destroy the backup.

------
dbatten
Windows user here. I use the built-in robocopy utility to sync files to one of
2 external hard drives. One hard drive stays at work, the other at home, and I
switch them periodically (usually, right after downloading new family photos
onto the computer).

With this method, a virus or something like a fire that only affected one
location would not be able to take out all copies of my data and, at most, I'd
lose the most recently backed-up information.

It's not 100% bullet-proof, but it's good enough for my personal purposes.

------
gesman
Not exactly answering your requirement (not using a 3rd party), but I use
CrashPlan for multiple reasons:

- They support all platforms (Mac, Linux, Windows): an agent running on your
system detects changes, encrypts data and transparently sends it to the cloud.

- They support client-side encryption (key or password). If you lose the
password or key, you're screwed.

- They support unlimited versioning of backed-up files. If you screw up your
PPT or DOC, you can restore previous version(s).

- They support unlimited destination size (I'm already at 0.5TB and counting).

- They keep deleted files.

- They support removable media attached to your system (which helps me
greatly as I do lots of hi-res photography where each image is 50MB. I already
have a 5TB external drive ready to be synced).

- Everything is $120/yr.

Disadvantage:

- The above price is _per device_. So if you need to back up 2 separate
computers, you'll need to pay twice, etc.

Conclusion: I've been through enough semi-shitty geeky solutions, and I love
CrashPlan because it's a no-brainer to set up and forget. Of course, if they
suddenly go out of business or their cloud explodes or whatever, then you're
screwed. But IMHO chances are this backup will work for me better than
anything I tried before.

------
turc1656
I have always kept it simple. A storage array in JBOD (non-raid) setup. I buy
two of whatever drive(s) will contain the duplicated data and just do a clean
mirror using whatever backup software you desire. I happen to use 4TB HGST
drives in a Mediasonic PROBOX 4 tower hooked into a PCIe card in my computer
that connects them all through one multichannel eSATA cable. The software I
use is Allway Sync (free for up to a certain amount of backed up data per
month, after that a one time fee for lifetime updates).

I do one additional thing, and it's for third-party storage. So if you're dead
set on cutting out a third party, I understand. But hear me out. If the data
is encrypted on the third-party system, it's largely a non-risk in terms of
privacy, unless your filenames themselves put your privacy at risk. I am fine
with my filenames being seen but the content being encrypted. So what I do is
use Boxcryptor to encrypt the "destination" drive and then I use pCloud as my
third-party storage provider. Both pCloud and Boxcryptor show up as physical
drives on your computer, which I find convenient. So I have my primary source
drive synced with the Boxcryptor drive/location and Boxcryptor is set up on my
machine to encrypt the data on the fly as it is copied (surprisingly, this is
not the default setting). Then the pCloud software is set up to automatically
sync that Boxcryptor directory where the files show in their encrypted format
(the "drive" version shows them unencrypted because everything is done on the
fly both ways) to my pCloud storage. In the end, I have 3 versions - the
unencrypted source, the encrypted Boxcryptor copy, and the pCloud backup which
contains the encrypted copy.

Also important to note: pCloud is based in Germany and thus subject to EU data
protection laws, and even without that I find pCloud to be a much lower risk
in terms of data privacy than any company in the US.

------
chaz6
Just because nobody has mentioned it yet: there is an archive format called
zpaq [1] which is incremental and journaling, with good support for Linux, Mac
and Windows. It supports multi-threaded compression and error detection and
recovery.

[1] [http://mattmahoney.net/dc/zpaq.html](http://mattmahoney.net/dc/zpaq.html)
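Basic usage is just add/extract against a single journaling archive; each add appends an incremental version. A sketch (paths are examples):

```shell
#!/bin/sh
# Sketch of incremental zpaq archiving; every run appends a new version,
# and unchanged files are deduplicated against earlier versions.
cat > zpaq-backup.sh <<'EOF'
#!/bin/sh
set -eu
zpaq add /mnt/backup/docs.zpaq "$HOME/documents"
# list versions:      zpaq list /mnt/backup/docs.zpaq
# restore version 3:  zpaq extract /mnt/backup/docs.zpaq -until 3 -to /tmp/restore
EOF
chmod +x zpaq-backup.sh
```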

------
kencausey
While we are on the subject, I can imagine a device that would be ideal for
some of my clients: a small (deck-of-cards or paperback-book sized)
battery-powered device that connects to the office wireless network and is
left on the user's desk or in a drawer overnight (when network usage is low),
where it communicates with the office NAS and stores a backup. The next
morning the user sticks it in her purse or briefcase and leaves another for
the next night. The one taken home is plugged in to charge for the next time.

This still requires the user to remember this process but it is about as
minimal as I can imagine for a local backup that is mostly kept offsite.

I'm thinking about this in particular for a local non-profit that has privacy
concerns with storing their data with a 3rd party.

I'm considering putting something together using a Raspberry Pi or other SBC.
Having the result come in under $100 USD with 128GiB-256GiB of storage would
probably be necessary for it to be acceptable.

~~~
kencausey
Another thought that might make it even simpler (for the user) is to have the
device download the data at a throttled rate so it could be done during the
workday without impacting the network too much. This way it would be
unnecessary to leave it overnight or perhaps even take it out of the purse,
backpack, or briefcase. It would still be necessary for the user to remember
to recharge it and maybe switch it out or rotate it with other devices.

------
TheAceOfHearts
I have a home server with a lot of storage; it's running ZFS with full
redundancy. If you're in the US, wait a few months and you can buy an 8TB
external HDD when it goes on sale again. WD EasyStore HDDs regularly go on
sale [0].

For documents and important files I usually keep an extra copy with two or
more cloud storage providers. It's unlikely that these companies would suffer
data loss, so the most likely cause for losing access to those files would
probably be if they closed my account. Using multiple services reduces the
chances of losing access to your files, since it's unlikely you'd be banned
from all of them at the same time.

[0]
[https://www.reddit.com/r/DataHoarder/comments/7fx0i0/wd_easy...](https://www.reddit.com/r/DataHoarder/comments/7fx0i0/wd_easystore_8tb_compendium/)

------
luke0016
I have a core file server with a 9TB RAID-Z array in it. This server uses
key-based authentication to connect to all of my other hosts (my desktop, VPS,
work systems, etc.) and uses rsync to copy any changes to a locally maintained
copy. It also attempts to back up laptops, if they are left on overnight.

Every night this system also reaches out to a remote system in my office that
has an 8TB RAID-5 array. It "pushes" any changes, again using rsync, to this
remote host. This time it uses the --link-dest argument to create nightly
"snapshots." If a file hasn't changed, it simply gets hardlinked. This backup
volume is also encrypted, and only the core file server has the key to decrypt
it.
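The --link-dest mechanism in miniature (a sketch; host, dates and paths are examples, not my actual layout):

```shell
#!/bin/sh
# Sketch of nightly --link-dest snapshots: unchanged files become hardlinks
# to yesterday's copy, so each "full" snapshot costs only the changed bytes.
cat > nightly-snapshot.sh <<'EOF'
#!/bin/sh
set -eu
BASE=/backups/host1
TODAY="$BASE/$(date +%Y-%m-%d)"
YESTERDAY="$BASE/$(date -d yesterday +%Y-%m-%d)"   # GNU date
rsync -a --delete --link-dest="$YESTERDAY" \
      me@host1:/home/me/ "$TODAY/"
EOF
chmod +x nightly-snapshot.sh
```

Every dated directory then looks like a complete backup, which is what makes the purge-by-age scheme below simple: deleting a directory only frees blocks no other snapshot links to.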

For the remote system, I have a script that runs periodically that purges
backups according to my own specification. Right now it keeps unlimited yearly
backups, three years of monthly backups, six months of weekly backups, and 14
days of daily backups.

Once a month I also run a script to back up the core file server to an
external 8TB hard drive, again with --link-dest, creating snapshots. This hard
drive is stored in a fireproof safe. It is also encrypted.

I have various levels of redundancy elsewhere, too. (Yes, RAID/redundancy is
not a backup). For example, my desktop system has a RAID1 array in it.

In 2000, I installed Windows ME on my desktop system. It corrupted my hard
drive somehow, and I lost all of my work prior to then. In case you can't
tell, this scarred me for life, and I have worked hard to make sure it cannot
happen again.

I'll add, every single storage device that I use is encrypted. Sometimes it's
just with a key stored on a separate boot disk (usually an SSD). The rationale
being if I ever have to RMA a disk, I don't want the manufacturer to be able
to see any of my data.

On another tangent, the remote system mentioned above is actually a rather old
GX260 with a SATA expansion card and 3 4TB disks. The boot disk is an ancient
EIDE 40GB drive. I keep it around mostly for entertainment at this point.
Here's why:

    
    
      9 Power_On_Hours          0x0012   001   001   001    Old_age   Always   FAILING_NOW 123756
    

That's right, the drive has been spinning for a little over 14 years.

Unfortunately I've recently hit a problem where e2fsck runs out of memory
(it's a 32-bit CPU, so only 2GiB per process) checking the file system. I
tried setting a scratch file, but that bombed for some reason too. So, I may
finally have to upgrade the thing to a 64-bit system.

Oh well.

------
joveian
I've done exactly that with a couple of Transcend USB sticks and managed to
kill one by deleting some files while the rsync was running (to free up enough
space for the rsync to finish). I think the controllers on the sticks are just
not as good as SSDs'. At least it was an obvious problem (the drive was
completely unresponsive after that), but I wouldn't make that your only copy.

The main thing I have been doing is using old hard drives, each in an
inexpensive case[0] that can easily be attached to a USB-to-SATA/IDE
adaptor[1] (there are others to consider if you don't have IDE drives, but
that one seems to work well and not all of them do; for SATA-only, look at
UASP). You might think that old drives are more likely to die randomly, but I
haven't had that happen even once yet (still, it's good to keep a couple of
copies of stuff you care about).

[0] [https://www.fasttech.com/p/1645000](https://www.fasttech.com/p/1645000)
[1] [https://www.fasttech.com/p/8292100](https://www.fasttech.com/p/8292100)

I sometimes use rsync but also have an archive script that a) runs mtree with
SHA512 per file and leaves the file in the directory being archived, b)
creates a .tar.xz with a SHA256 checksum in the xz (except for
already-compressed stuff, where I either rsync or use gzip with a separate
SHA256 sum), and c) encrypts most things via scrypt -t 5 before they hit the
unencrypted external disk (ext2 with limited features for maximum
portability). Not exactly push-a-button convenience, but it works OK for me.
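Those steps translate roughly to the sketch below. It substitutes sha512sum for mtree (mtree is mostly a BSD tool); scrypt here is the standalone encryption utility from the Tarsnap project, and the paths are examples:

```shell
#!/bin/sh
# Sketch of the checksum -> tar.xz -> scrypt pipeline described above.
cat > archive.sh <<'EOF'
#!/bin/sh
set -eu
DIR="$1"   # directory to archive
# a) leave per-file checksums inside the directory (sha512sum in place of mtree)
( cd "$DIR" && find . -type f ! -name SHA512SUMS -exec sha512sum {} + > SHA512SUMS )
# b) compress; xz stores its own integrity checks
tar -cJf "$DIR.tar.xz" "$DIR"
# c) encrypt with a passphrase before it touches the unencrypted disk
scrypt enc -t 5 "$DIR.tar.xz" "$DIR.tar.xz.enc"
EOF
chmod +x archive.sh
```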

If you just use basic rsync you can end up copying corrupted files from your
system over the backup before noticing that they are corrupted. I think there
are various ways to avoid this; so far I just have multiple copies that I
update at different times, but there are likely better ways.

Tarsnap seems like a good option for stuff you wouldn't want to lose in a
fire, but I haven't used it so far.

------
monster_group
I use Back In Time on Linux (which uses rsync) to back up data to a USB
external drive. I back up critical data on 2 USB sticks, one of which I carry
with me at all times, the other kept in a bank locker (this version of the
backup runs a little behind since I have to make trips to the bank). Not the
most convenient way to make backups, but I find it an acceptable trade-off.

Pros:

1. No data in the cloud.

2. Multiple copies of the backup, one of which is outside the house (to guard
against fire etc.)

3. I have the most critical data with me at all times.

Cons:

1. Data still in the same city (does not guard against a major earthquake,
flood etc.)

2. Manual backups (not automated on a schedule - though that can be set up on
a home network, I just haven't bothered).

------
Cieplak
ZFS.

ZFS snapshots and clones are like git for your entire filesystem. ZFS
replication over ssh is like rsync but more efficient. ZFS supports multiple
compression methods so you can optimize for latency or compression ratio.
Zpools let you add new disks and grow your filesystem without having to mess
around reformatting. Zpools support various redundancy configurations so you
can survive disk failures. It also supports checksumming so you can monitor
and prevent data corruption.

[https://www.freebsd.org/doc/handbook/zfs-
zfs.html](https://www.freebsd.org/doc/handbook/zfs-zfs.html)
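For illustration, the snapshot-and-replicate cycle looks roughly like this
(pool, dataset, and host names are all assumptions):

```shell
zfs snapshot tank/home@2018-01-01        # instant copy-on-write snapshot
zfs send tank/home@2018-01-01 | ssh backuphost zfs receive pool2/home

# Later, replicate only the delta between two snapshots:
zfs snapshot tank/home@2018-02-01
zfs send -i @2018-01-01 tank/home@2018-02-01 | \
    ssh backuphost zfs receive pool2/home

zfs set compression=lz4 tank/home        # pick latency vs. ratio per dataset
zpool scrub tank                         # re-verify every block's checksum
```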

------
davchana
A 3TB main hard disk, HD01. It has everything of mine from 1998 on: photos,
docs, code, media (self-generated), books, everything. All things sorted by
type into folders (Media, Own Data, Books, Video, Cloud (Dropbox, Yandex,
Mega, Drive)).

An exact mirror of HD01, labelled HD02. The mirror is updated every 20-30
days.

An exact mirror of HD01 in the cloud (Yandex & Mega), updated every 20-30
days. This does not get downloaded back to the PC, to avoid recursion (cloud
onto hard disk & hard disk into cloud).

A few working folders live on the PC; work in progress is mirrored back to
its original place on HD01 every week or 10 days.

Stuff from two phones & one tablet gets synced to Dropbox, Yandex & Mega
daily or hourly via the FolderSync app (the WhatsApp backup is synced daily,
whereas the camera folder is synced as soon as possible). This gets downloaded
to the PC via the Dropbox and other services' Windows clients. This whole
cloud set gets synced to HD01 every week or 10 days. The PC's Chrome download
folder also gets mirrored to the cloud very often.

Stuff on GitLab, GitHub & Tumblr gets downloaded on the 1st of every month to
Google Drive via a Google Apps Script, with an email report of the log. This
also gets downloaded to the PC via the Drive Windows app. This Drive folder is
synced to HD01's Cloud folder.

All sync profiles on PC are setup in FreeFileSync with Drive Labels,
Exclusions, Inclusions, Filters & limits.

HD02 is at different house, brought in only for updating monthly.

------
Sir_Cmpwn
At your scale, Blu-ray discs are probably the most affordable per GiB-year
(looks like 2.5TB spread out across 100 discs is about $80 right now and
should last 15-25 years or more). Double or triple your data and you should
just use cold hard drives; at 10x or more you should probably use magnetic
tape. Mirror everything 2-3x and do yearly integrity checks (I just write the
SHA of my discs on the disc label) and your data should last a good long time.
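The label trick can be sketched like this (the file here is a stand-in for a
mastered disc image):

```shell
#!/bin/sh
# Sketch of hash-on-the-label archival: record a SHA-256 at burn time,
# re-check it at the yearly integrity pass. Paths are illustrative.
set -e
IMG="disc-2018.iso"
echo "stand-in for the mastered disc image" > "$IMG"

sha256sum "$IMG" | tee "$IMG.sha256"   # write this hash on the disc label
# A year later, read the disc back and compare against the label:
sha256sum -c "$IMG.sha256"
```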

------
hinkley
Thermaltake makes what I lovingly call a hard drive toaster. It's a
SATA-to-USB disk caddy that's vertical, so gravity holds the drive in. I use
all my older 0.75-2TB drives with Time Machine, which you can set up to
alternate backups across multiple drives. I _should_ rotate them out to a
deposit box, but most of my best stuff is out on the internet anyway.

~~~
bluedino
We used a similar setup - for some reason the drives failed very often.

------
snow_mac
* Two 1TB hard drives, one for each MBP; I use Time Machine

* A dozen or so flash drives containing our wedding pictures and current copies of our taxes in locked/encrypted volumes

* Facebook & Google Photos for our favorites from vacations

* 4 hard drives scattered throughout the house and 1 in our safe

* Typically I bring a flash drive with me when we travel and leave one in the car at the airport

------
0kto
I recently converted all my Linux boxes to BTRFS filesystems. That allows
for snapshots of each subvolume, and you can even send them (incrementally)
to another btrfs drive. Last weekend I also set up btrbk, which gives you an
environment to automate the process. It lets you snapshot any subvolume on a
schedule you configure, and then send backups to any destination you specify
via ssh, or even USB drives. Since these are incremental, it uses very little
bandwidth. It even has an archive function for long-term storage, and just
like the other backups you can define individual retention policies for each
location. Did I mention it also supports non-btrfs targets, encryption, and
both host- and server-initiated backups? Awesome. Next weekend I'll set up
encrypted archives to the cloud...
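Underneath btrbk, the manual version of that cycle is roughly as follows
(the subvolume paths and the NAS hostname are assumptions):

```shell
btrfs subvolume snapshot -r /home /home/.snapshots/home.2018-01-01
btrfs send /home/.snapshots/home.2018-01-01 | ssh nas btrfs receive /backup

# Next round: send only the difference from the previous snapshot.
btrfs subvolume snapshot -r /home /home/.snapshots/home.2018-02-01
btrfs send -p /home/.snapshots/home.2018-01-01 \
          /home/.snapshots/home.2018-02-01 | ssh nas btrfs receive /backup
```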

------
rahimnathwani
If you do that (just rsync every day), what happens when your primary storage
gets corrupted, and you rsync the corrupted copy over your USB backup?

Much better to do some sort of incremental or differential backup.

As someone else said, borg is a good way of doing this.

Given you're using cloud storage, you could use rclone to sync (one way, from
cloud to local) to a local disk, and then use borg (on your local machine) to
backup the local copies to a second drive.

Of course, it's best to have this rclone+borg setup in two locations (e.g.
second location can be parents' house or kids' house) so that it's very very
unlikely that the borg repo(s) get corrupted due to user error or software
malfunction.
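A rough sketch of that rclone+borg pipeline (the remote names, mount points,
and retention numbers are all assumptions):

```shell
# One-way mirror: cloud -> local (never the reverse)
rclone sync dropbox:/ /srv/mirror/dropbox

# Once: create an encrypted borg repo on the second drive
borg init --encryption=repokey /mnt/drive2/borg

# Each run: deduplicated incremental archive, then thin out old ones
borg create --stats /mnt/drive2/borg::{now} /srv/mirror
borg prune --keep-daily 7 --keep-weekly 4 --keep-monthly 12 /mnt/drive2/borg
borg check /mnt/drive2/borg    # repository/archive integrity check
```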

~~~
blablabla123
borg is awesome. I once happened to talk to its developer. He said sometimes
people report bugs to him, and it turns out they are hardware errors that
borg's integrity check found...

------
mynegation
I have an encfs folder in Dropbox, which works reasonably well, but I do not
have any automatic process writing to it from two different machines, as that
may cause trouble. That gives me privacy and keeps the most important files
always backed up.

For a bulk backup, once per month I bring one of the 3 portable HDDs from
another place, attach it to my server and start my custom backup script that
saves encrypted zips of sensitive directories and rsync-backup of everything
else to it. At any time 2 or 3 HDDs are outside the house.

------
wink
HP MicroServer with FreeNAS at home; it pulls down backups from my vservers
and pushes encrypted parts of local data back to them.

Every few months I manually sync most of the NAS onto an external, encrypted
HDD (just bought a fresh 3 TB one) and that will be stashed in other parts of
town/out of country with family. It's my last line of defense should the
apartment burn down or something. Data might be half a year old, but the last
20 years are mostly there. Also it's not too inconvenient.

------
throwawaydata
Many options here for backups, but most of them don't touch on backup
integrity verification, or on losing information through bad backups (any of
the storage devices/locations suffering from bitrot or other corruption that
silently persists and gets propagated), which may not be noticed until it's
too late (at restore time, which may happen quite rarely).

So, how do you know your backups are good and that what you're backing up is
good?

------
avar
I use git-annex to do pretty much what you're describing. It takes some
getting used to, but it's much better than your suggested method of rsyncing
to some USB sticks.

Under the hood it's basically doing that, but everything is SHA-256
checksummed so you can check its validity, and all copies of your data also
sync metadata telling them who else has copies; it's easy to enforce policies
that some content must have >N copies, etc.
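For illustration, a minimal git-annex setup along those lines might look like
this (the paths, repo names, and file are invented):

```shell
# Create an annex and add a large file (stored under its SHA-256 key)
git init ~/annex && cd ~/annex
git annex init "laptop"
git annex add big-video.mov
git commit -m "add big-video.mov"

# Make a USB stick a second repository, and register it as a remote
git clone ~/annex /media/usb1/annex
(cd /media/usb1/annex && git annex init "usb1")
git remote add usb1 /media/usb1/annex

git annex copy --to usb1 big-video.mov
git annex numcopies 2            # refuse to drop below two copies
git annex fsck                   # verify checksums of local content
git annex whereis big-video.mov  # lists which repos hold a copy
```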

------
sam0x17
I use GitHub for all my private code, and then I back up my entire workspace
as an encrypted 7-Zip file to Google Drive 2-3 times a year (about 10 gigs).
There are plenty of services that do client-side encryption + cloud storage,
but Google Drive is probably the only one you can use completely for free.

------
Endy
As a private user, I just use either Clonezilla Live or an old version of
Norton Ghost and burn the image to media (DVD-RW lately; it used to be CD-RW
until that became unfeasible, and before that, floppies). I prefer the RWs
because that means I can reuse the discs from 2 or 3 images ago.

------
Spooky23
Do it the old fashioned way. Full backup to a hard drive, store that offsite
and do incremental backups on USB sticks. There are dozens of good tools.

Personally, I manage media separately from normal files, and use hard drives
for media, sticks for files.

It’s easier, cheaper and more secure than any backup service.

------
scarface74
I’m just going to speak to longevity. If you are worried about the longevity
of your data I would keep it simple.

1\. Primary data storage

2\. Local online backup

3\. Off-site backup (rotate/sync regularly)

4\. Online backup - Backblaze

------
nastypants
All hosts backed up to a NAS w RAID 1. I have an extra disk that gets hot
swapped every 3-6 months and it goes into my safety deposit box. No encryption
although I might add that after reading this thread.

------
s17tnet
I use duplicity to create encrypted backups. To store them I use two USB SSD
drives (in two separate locations); for off-site I use AWS Glacier.

The backups are scheduled as jobs on self-hosted Jenkins instances.
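As a sketch, a duplicity job of that shape could be (the source path, target
URL, and retention window are assumptions; encryption uses your default GnuPG
key unless configured otherwise):

```shell
duplicity full ~/documents file:///mnt/usb-ssd/backup      # initial full
duplicity ~/documents file:///mnt/usb-ssd/backup           # incremental
duplicity verify file:///mnt/usb-ssd/backup ~/documents    # compare contents
duplicity remove-older-than 6M --force file:///mnt/usb-ssd/backup
```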

------
ramtatatam
We use a Synology NAS (with OpenVPN support). It offers hardware RAID 1 as
well as encryption. We have backed up to it for ~4 years and it has served
its purpose.

------
slipwalker
Roughly: my $HOME (400GB) is rsync'ed to an external RAID enclosure
([https://www.amazon.com/Oyen-Digital-MobiusTM-FireWire-
Enclos...](https://www.amazon.com/Oyen-Digital-MobiusTM-FireWire-
Enclosure/dp/B00CH94GMK/)), my preferred media is burned to DVD-R, a few
specific files go to Dropbox and/or Google Drive, and all my source code sits
on Bitbucket.

------
bsdpunk
[https://www.jwz.org/doc/backups.html](https://www.jwz.org/doc/backups.html)

------
mbesto
Backblaze for my desktop backup.

Then for large archival files I use: [https://www.backblaze.com/b2/cloud-
storage.html](https://www.backblaze.com/b2/cloud-storage.html)

And just use Cyberduck to move the files over. Super cheap and gets the job
done. Beauty of B2 is you can use a plethora of clients to move files:
[https://www.backblaze.com/b2/integrations.html](https://www.backblaze.com/b2/integrations.html)

------
borplk
My upload bandwidth (~1 Mbps) is low enough to make offsite backup
impractical. Anyone else with the same issue? How do you handle it?

~~~
15DCFA8F
You can have two encrypted external HDDs and keep one with you and the other
at a friend's or family member's house. Rotate frequently.

------
garmaine
A bunch of RAID disks at home. If you value your data, you should own your
data.

~~~
isostatic
RAID isn't a backup

~~~
garmaine
Context. The question was "how do you backup your files." My answer: on my
RAID system at home. The RAID system IS the backup.

------
t312227
how about a good old hard disk!?

------
lowry
restic to Google Cloud Storage. Their Coldline storage costs a few euros per
month for ~1TB. And I can easily change backends if I want; restic supports
many options.

The only problem is low upload speeds for residential customers.
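Per the restic docs, the GCS setup is roughly as follows (the project,
bucket, and key path here are placeholders):

```shell
export GOOGLE_PROJECT_ID=my-project-id
export GOOGLE_APPLICATION_CREDENTIALS=$HOME/gcs-service-key.json

restic -r gs:my-backup-bucket:/ init            # once
restic -r gs:my-backup-bucket:/ backup ~/docs   # incremental after first run
restic -r gs:my-backup-bucket:/ check           # verify repo integrity
# Changing backends later is just a different -r, e.g. -r b2:bucket:path
```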

------
wendy0x2
I have multiple harddrives synced on a daily basis.

~~~
kennxfl
Are the hard drives stored in the same location?

------
sfkjlkfagfj
Onsite NAS Google Drive

~~~
mbowcutt
So.. a NAS? AFAIK you can't self-host google drive.

~~~
retor
I mentally added a comma between “on-site NAS” and “Google Drive”. Might be
wrong.

