
I lost my data trying to back it up - h4myio
https://hamy.io/post/0010/how-i-lost-my-data-trying-to-back-it-up/
======
coldtea
> _When it comes to backing up the full OS, I don’t believe in online backup
> solutions._

And how did that work out for you?

If you did believe in online backup solutions, in addition to whatever else,
you'd have your data now.

(I understand worrying about quick recovery, booting, etc. But as an
additional method besides hard local copies, it makes sense, since having an
extra off-site backup of the data is important. Better to have to spend 3 days
manually downloading and restoring a bootable image than to not have the data
at all. And online backup services also allow for automated backups of your
backup -- backup-ception.)

~~~
simcop2387
This is how I treat it too. Always have a local backup, but an online or
offsite one is also important. The mantra I've often heard repeated is "Two is
one, and one is none".

~~~
et-al
"Two is one; one is none" is more of a military slogan. Backups should follow
the 3-2-1 mantra, full stop:

3 copies, on 2 different media, and 1 offsite.

~~~
tehmantra
I have heard so many different definitions for the 3-2-1 rule, I have no idea
which is correct anymore.

It seems that you should have 3 copies -- 1 being offsite.

But I have heard the 2 as 2 formats, 2 devices, 2 mediums, 2 storage types, 2
technologies.

I don't think the idea of 2 is reasonable for anyone except the most hardcore.
Am I really expected to buy a tape drive for my local personal backups?

I think this rule could be updated to something like 2 different cloud
providers, or 2 different geographic regions. Drop the 3 and the 1.

~~~
extra88
If one of the copies is local and another is through an online backup, it's
going to be different in almost every way from the local copies, so you get
the "2" by any definition.

If all your backups are to a set of tapes or, say, 3 hard drives, and you have
a rotation to keep one tape offsite, every copy shares too many traits. A fair
number of Mac people use Time Machine to do versioned backups to an external
drive (or drive array) and use separate software to simply mirror their drive
to external drives, periodically swapping the onsite and offsite mirrored
drives. To me, physically moving drives around is hardcore (I don't trust
anything not automated), but it seems to be not that rare. I'm fairly lazy,
but my machine is a work laptop, so I have a backup drive in the office, one
at home, and a cloud backup service; that gives me 3-2-1 without much thought
or effort.

If backups are solely on hard drives, it's probably best to not use the same
make/model purchased at the same time, for fear that they'll all fail within
the same time frame.

------
olavgg
Getting backup right is extremely important, which is why you need proper
tools. For the last 10 years I have been using ZFS snapshots and replication.
I keep 180 days of snapshots on the remote backup and 30 days locally. I use
it to back up virtual machines and various databases, including running SQL
servers. And of course important files, documents and photos.

I've also configured a daily mail with a log of how the
backup/snapshot/replication went, even if it went fine. So if I don't get the
mail, I know something is up and can fix it ASAP.

I have never slept so well since doing this with ZFS. Even backup of a Windows
SQL Server with iSCSI volumes works great!

More people should do backup with ZFS!
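
For reference, the core of such a setup is only a couple of commands. A
minimal sketch, assuming a dataset tank/data and a reachable host "backuphost"
(both names illustrative):

    
    
      # take a dated snapshot
      zfs snapshot tank/data@20181201
      # first replication: full send to the backup box
      zfs send tank/data@20181201 | ssh backuphost zfs receive backup/data
      # every day after that: snapshot again, then send only the delta
      zfs snapshot tank/data@20181202
      zfs send -i tank/data@20181201 tank/data@20181202 | ssh backuphost zfs receive -F backup/data
    

Retention then just means destroying snapshots older than 30 days locally and
180 days on the remote.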

~~~
chatmasta
Is it safe to back up a database by just straight copying its files? Wouldn't
it be better to use a specific backup tool, e.g. pg_dump for Postgres or
mongodump for Mongo?

(Hopefully you've tested recovery procedures, because as the saying goes,
"your backups are only as good as your last restore")

~~~
chousuke
With Postgres, it's perfectly safe and supported if you just do it according
to the documentation using pg_start/stop_backup and also archive your WALs
properly. Most databases have something similar.
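
Roughly like this, assuming a PostgreSQL 9.x/10-era server (the function names
changed in newer releases, and pg_basebackup automates most of it; paths and
host are illustrative):

    
    
      # postgresql.conf: ship each completed WAL segment somewhere safe
      #   archive_mode = on
      #   archive_command = 'cp %p /mnt/wal_archive/%f'
      # with that in place, a file-level copy of the data directory is
      # valid between these two calls:
      psql -c "SELECT pg_start_backup('nightly');"
      rsync -a /var/lib/postgresql/data/ backuphost:pg_base/
      psql -c "SELECT pg_stop_backup();"
    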

~~~
lettergram
Note, you do need to track the version of Postgres. So a backup from PG 8 may
need to be reloaded into PG 8 before upgrading to PG 9, 10, 11, etc.

------
zaarn
I think the moral of the story is: always use software RAID.

Hardware RAID (even fake ones like Intel's) is a time bomb: either you get fun
configurations like this, or the controller dies and no replacement exists.

Unless you absolutely need HW RAID, I'd go for a software RAID that mounts and
works natively in Linux. I can put the hard drives into a completely foreign
computer and it will work.
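
For example, since mdadm keeps the array metadata on the member disks
themselves, recovery on a foreign machine is typically just (device and mount
point are examples):

    
    
      # scan the attached disks for md superblocks and assemble what's found
      mdadm --assemble --scan
      mount /dev/md0 /mnt/recovered
    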

~~~
shifto
Depends on the use-case really. I use raid (if any) for performance reasons,
so I prefer hardware raid with a nice controller and lots of spindles where
flash is not an option. I would never implement this with a software raid
solution, for reasons.

~~~
tomatocracy
Likewise. If it’s not offline, verified and off-site, it’s not a backup.

Raid is useful for availability as well as performance though.

Also, while we're discussing raid: the price/GB of a mirrored set of larger
disks vs a raid 5/6/10 of smaller disks (for the same total capacity) is
always worth checking. In my experience it usually comes out in favour of the
mirrored pair, which also has better performance (especially when resilvering)
and resilience, and lower energy costs.

~~~
tivert
> Raid is useful for availability as well as performance though.

Also for data integrity, with something like ZFS.

------
simonh
There's no way he could have possibly known that simply booting the Ubuntu
live disk would do this. He could have done that without any intention of
making a backup. However, regular tried-and-true backups are the way to go,
preferably automated.

My worst horror story wasn't really mine. I did desktop support in my first
job. The library had their own system and their own support arrangements but I
had a good relationship with them and would advise from time to time. One day
they said the PC running their archive management system wouldn't boot. Could
I take a quick look before they called the company on their support contract?
Sure enough the PC wouldn't boot; the hard drive had died. I asked if they had
recent backups. Oh yes, the backups were fine; in fact it had got much better
recently. Before, the backups used to take an hour, but for the last few weeks
they only took 2 minutes. They only had 2 weeks of tapes on rotation.

Needless to say, the backups hadn't been running at all and failed silently,
probably due to early symptoms of the hard drive failure. They lost
everything.

~~~
Filligree
Did it end up going to data recovery? Most of the time it should be possible
to retrieve _most_ of the data, at least.

~~~
simonh
I don't actually know, I didn't directly support them and they were so
embarrassed about the whole thing they clammed up.

------
diggan
Similar story but way shorter. I was working at Typeform in its early days,
years ago, and we had some issues in production and wanted to take a look at
the logs. The logs were in a tar archive, so I just needed to untar it. I
mixed up the arguments to the tar command and instead overwrote the logs with
an empty archive. Doh. In the end, lesson learned: be 100% sure your arguments
are correct if you're stupid enough to run terminal commands in production
environments. The other lesson, obviously: duplicate the data (pull the logs
down locally) if you're about to read something with a tool that can possibly
write instead of just read.
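
In concrete terms (paths illustrative):

    
    
      # the trap: tar -cf (create, clobbers the archive) is one letter away
      # from tar -xf (extract), so work on a copy, never the original
      scp prod:/var/log/app/logs.tar /tmp/
      mkdir /tmp/logs && tar -xf /tmp/logs.tar -C /tmp/logs
    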

~~~
alexpetralia
Linux commands: with great power comes great responsibility. We're all seduced
by the power, but quickly learn the responsibility part!

~~~
tyingq
There was a time when tar did not remove a leading / in a tar archive when
unrolling it.

That required some great caution.
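
(Modern GNU tar strips the leading slash on extraction by default; the old
behavior is now opt-in:)

    
    
      # an entry stored as /etc/passwd extracts to ./etc/passwd by default
      tar -xf suspect.tar
      # -P / --absolute-names re-enables the old, dangerous behavior
      tar -xPf suspect.tar
    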

------
alkonaut
Takeaways

\- Your data is at most risk from yourself. Other risks such as hackers,
burglars, fire or hardware failures aren't nearly as dangerous to your data as
you are.

\- Backup is HARD. Have someone else do it.

\- Mirroring is fine for a first copy, whether that copy is raid or just
another NAS or something, but there should always be a proper BACKUP too.

\- To be a proper backup it should 1) be write-only/snapshotting, so even if
you remove all your local files, the deletions aren't mirrored; 2) have long
retention, so when you mess up a document you can notice 2 years later and
just fetch that version; and 3) be off-site, so theft and fire don't put the
primary copy and the backups at risk together.

~~~
wetpaste
\- Get backups sorted out before you consider using it as a production system.

If OP had iterated on having a backup system BEFORE adding important data,
there would have been no consequences to trying to implement it. At the very
least, do an online, filesystem-level backup of the things that are important
before trying to block-level disk copy things. Run a database dump, rsync it
to an external location, put important things in version control and push to
a remote origin. And rsync or tarball your home directory and any other
important directories (the whole system if possible) and push it out
somewhere. Then and ONLY then should you feel semi-comfortable starting to
mess around with RAID settings/LVM/fdisk etc.
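
A sketch of that bare-minimum pass, with host and paths as placeholders:

    
    
      pg_dump mydb > /tmp/mydb.sql                      # or your database's dump tool
      rsync -aAX ~/ backuphost:backups/home/            # home dir, attrs and ACLs included
      tar -czf /tmp/etc.tar.gz /etc                     # system config
      rsync -a /tmp/etc.tar.gz /tmp/mydb.sql backuphost:backups/
    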

------
herpderperator
You should always favor mdadm (software raid) over fake RAID solutions.
Sometimes it's better to use this even over real RAID cards (which are
proprietary) because you don't have to find an exact clone of the card if it
fails; mdadm is all software. The only times it would be appropriate to use a
real RAID card are when the performance gain is absolutely necessary, or when
you need to expose a logical device to the BIOS or OS that maps to the RAID
array without additional setup.
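
For reference, setting one up is short (device names are examples, and this
destroys whatever is on them):

    
    
      # create a two-disk mirror and put a filesystem on it
      mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdb1 /dev/sdc1
      mkfs.ext4 /dev/md0
      # persist the array definition so it assembles at boot
      mdadm --detail --scan >> /etc/mdadm/mdadm.conf
    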

~~~
bashinator
I once had a particularly bad time where the 3ware RAID card had a bad
interaction with the controller on our hard drives, preventing a rebuild after
a disk failure. Thank goodness I had marked that system as best-effort, no-
guarantees when asked for cheap local mass storage. This wasn’t even fake
motherboard RAID - but it was a screwdriver DIY box.

Mdadm is far more resilient, and I would never go with hardware RAID again
without full vendor support.

------
jondubois
The last time I lost almost all of my data was when I tried upgrading Ubuntu
to the next major version using the upgrade manager in Ubuntu. I thought that
since the UI was so friendly looking, it must be the safest route. WRONG. It
was the second time that I trusted Ubuntu's UI for doing something critical.
Never again. Now I always back up all the files I need onto an external drive
and then I install the OS from scratch. Only trust the command line.

~~~
EForEndeavour
Definitely don't even trust the command line. I'm as big of a threat to my own
data as a poorly designed upgrade manager.

------
tracker1
Raid is not backup... Your data isn't backed up unless there are at least 3
full copies in 2 physical locations. I have to admit I'm slightly paranoid
about it... 20-something years ago I was working in tech support for Iomega
(mostly OS/2 and Jaz calls). Nothing taught me the importance of true backups
more than that job.

I do leverage a few things though... for the most part, I keep stuff on my
NAS. Projects are in git remotes as well as several local copies. Serial
numbers, access keys etc. are encrypted and stored in Dropbox and copied to
Google Drive.

These days since most of my work is in source control, I'm less worried about
that aspect. That said, however, RAID is not backup.

~~~
arghwhat
I agree with your idea, but your definition is overkill.

A _backup_ is a copy of data on hardware separate from the live system.

A _good_ backup is a backup in a remote location.

A backup _strategy_ is to maintain a _good_ backup.

A _good_ backup strategy is to have routine restore tests, to make sure that
the backups serve their purpose.

A _great_ backup strategy requires at least 2 _good_ backups.

~~~
tracker1
I understand... by 3 full copies, that can include the active in-use copy. I
just don't consider something backed up short of having the data in a separate
device and location.

~~~
arghwhat
It has to be on a separate device to be a backup indeed, just like a spare
wheel isn't a spare if you're driving on it.

The location thing is trickier, and while you can't have any backup on the
same device, you _can_ have a bad backup in the same location. A bad backup is
better than no backup by a long shot.

~~~
tracker1
I'm fine with a backup on-site... I just feel that data isn't safely backed up
unless it's off-site too. Not that everything needs to go that far. But it's
usually useful to think in those terms imho.

------
sbov
I'm always paranoid about this stuff.

There are lots of fancy tech words in that article. Before you use ANY piece
of backup tech you do not 100.0000000000% understand, back up your data with
tech you COMPLETELY and UTTERLY understand how to use.

This likely means a simple copy to an external drive.
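
e.g. nothing more exotic than (mount point is an example):

    
    
      rsync -a --progress ~/ /mnt/external/home-backup/
    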

After that, set up your first automated backup solution. Test that it works.

After that, set up your second automated backup solution. Test that it works.

For the especially paranoid (me), you can still do manual backups using tech
you completely and utterly understand.

------
zeckalpha
I think this is clear to the author, but readers should note that RAID isn’t
backup.

Separately, having at least a partial online backup would have minimized the
issues caused by the offline backup.

------
bklaasen
Did you log a bug against `dmraid`? You're right, it shouldn't make changes to
the file system without user intervention.

~~~
h4myio
Not yet. I suppose I should do so via a Debian bug report? Because the last
update to the upstream source seems to be around 8 years ago.

------
cellularmitosis
I think the key mistake here was allowing the complexity of his higher-level
filesystem arrangements (raid, etc) to leak into his lower-level backup
routine.

Making a system backup like this should just be about imaging the raw disks,
regardless of OS, filesystem, raid arrangement, etc. sda is sda is sda,
period.

Keep it simple and you drastically reduce the likelihood of making a mistake
like this. A copy of Knoppix and dd will do the job just fine. Paragon
Software, Clonezilla, ntfs-3g, mdadm -- distractions.

Even better, if you keep the fundamental units of your system simple (a raw
image of sda), you maximize tool compatibility / availability and your chances
of recovering when things aren't going to plan.
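
A sketch of that, run from a live environment with the disk unmounted (paths
are examples):

    
    
      # raw image of the whole disk: boot sector, partition table and all
      dd if=/dev/sda of=/mnt/backup/sda.img bs=4M status=progress conv=fsync
    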

------
NKosmatos
Sorry for your loss. Here are my two cents...

For complete system backup, on Windows use Acronis [0], on macOS use Carbon
Copy Cloner [1] and on Linux Clonezilla [2].

Also never forget the 3-2-1 rule of backups [3]: 3 copies, 2 local copies on
different mediums and 1 copy offsite (cloud, remote) :-)

[0] [https://www.acronis.com/en-us/personal/computer-
backup/](https://www.acronis.com/en-us/personal/computer-backup/)

[1] [https://bombich.com/](https://bombich.com/)

[2] [https://clonezilla.org/](https://clonezilla.org/)

[3] [https://www.backblaze.com/blog/the-3-2-1-backup-
strategy/](https://www.backblaze.com/blog/the-3-2-1-backup-strategy/)

~~~
guitarbill
For macOS, Time Machine can do automatic backups to a mounted NAS volume over
WiFi. So my Macbook backs up to a ZFS 3-way mirror. Important files are also
stored in DropBox. That's how I get relatively hassle free 3-2-1 backups.

For Windows, a similar strategy is possible, but I consider my Windows box
disposable (only gaming), so don't have much recent experience.

~~~
HenryBemis
Just a risk I see here: you have everything IN your home... laptop, backup.

Unless your NAS is somewhere else (not on your home's internal wifi network
but across a WAN), consider a Carbonite (or similar) that has infinite backup
space to make sure you have the 'offsite' covered.

~~~
guitarbill
Dropbox solves that to a certain degree, too, and provides cross OS sync for
the files I care about most.

Honestly though, low effort is very important for me. After all, any solution
is simply reducing the odds, not completely eliminating them.

------
arendtio
Well, it is not like I haven't done any stupid things in the past, but booting
into a different OS (live system) with such a setup is obviously risky.

I am a bit conservative when it comes to backups and I do not trust myself
very much. So I am doing backups at the block-device level, to avoid missing
certain file types, file attributes or whatever feature the filesystem in
question possesses.

In order to avoid risky OS reboots, I am using lvm snapshots like this:

    
    
      # volume names are set elsewhere, e.g.:
      lv="root"
      snapshot="${lv}-snap"
      # sync the filesystem to disk
      sync
      # create a snapshot with a 10GB buffer
      lvcreate -L10G -s -n "$snapshot" "/dev/vg/$lv"
      # write the backup to a backup disk
      dd if="/dev/vg/$snapshot" of="/mnt/backups/lvm-vg-${lv}_backup-$(date +%Y%m%d).dd" bs=64k
      # remove the snapshot
      lvremove -f "/dev/vg/$snapshot"
    

That method is certainly not perfect, as you are still taking the backup with
a mounted FS in place, and the snapshot size must be adapted to the use-case.
But in the end, this is a way to create backups which are complete
(independent of the FS), easily accessible (e.g. mount via loop device) and do
not interrupt the service.
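
For instance, since each image is a plain filesystem image of a single LV,
pulling a file back out is just (filename follows the naming scheme above):

    
    
      mount -o ro,loop /mnt/backups/lvm-vg-root_backup-20181201.dd /mnt/restore
    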

------
RcouF1uZ4gsC
After reading several of these I lost my data stories, one thing that seems to
popup over and over is RAID. For me, it is not worth using RAID other than
RAID 1 (Mirroring). My data and my time are far more valuable than the extra
storage gained. With RAID 1, I can take any working drive, mount it and
instantly access it. In addition, there are fewer moving parts and fewer areas
for screwups. Just say no to fancy RAID.

------
giomasce
To me the takeaway is that the backup system should be designed together with
the storage system, while the storage system is still empty of valuable data.
If you change the backup system, you should treat that as a change of the
whole storage system: first run the former backup, archive it securely, set
up the new system, check it, and only then release the former backup as no
longer critical.

------
empath75
I guess since I do devops sorts of things for a living where we treat servers
as ephemeral, I don’t really even consider full drive backups any more.

Generally when I set up a computer, I use automatic configuration tools, and
any sort of data store will have its own automatic backup and restore solution
— or even better, won’t be on that server at all.

In the case of a home computer, I’ll install everything using ansible and
homebrew (or the App Store or steam) and backup documents and photos to
Dropbox and/or iCloud or s3, and all my code and dotfiles of course are on
github.

I’ve spilled a soda on my laptop and been back up and running with a
replacement in 10 or 15 minutes.

My whole house could burn down and, in terms of data, I'd be at worst mildly
inconvenienced.

For people that don’t trust cloud, it’s way more likely that your physical
copy is going to be lost or ruined than that multiple cloud services are going
to lose your data.

------
tbyehl
If you have the resources to run your own backup server, I heartily recommend
UrBackup[1]. It does Windows right by default. It can handle Linux ARM
devices. Macs, too. Setup of the server and clients is super simple.

10/10, would install again.

[1] [https://www.urbackup.org](https://www.urbackup.org)

------
anon4lol
RAID is not a backup strategy.

I've lost data with Intel onboard raid several times, so I refuse to use the
Intel raid implementations; if you are going to utilize RAID, I've learned the
hard way to pay for a proper RAID card and cage (or use the tried and true
software raid setups).

If you lose the RAID array, take a break before you start fixing it. A client
had their Subversion repositories, with over a decade of work, on a Windows
server, depending on a RAID setup. It fell down, and it turned out they didn't
have any good backups. The IT guy tried to fix it and nuked the superblock and
did who knows what else.

They ended up packing up the entire server, sending it to a firm that
specialized in data recovery, and spending a small fortune.

------
moviuro
_Redundancy_

I don't have backups myself, but I have 5 copies of my data at 3 different
locations + snapshots everywhere: desktop, laptop (linux, BTRFS), NAS at home
(FreeBSD, ZFS); off-site dedicated server at OVH (FreeBSD, ZFS); at school
(FreeBSD, ZFS/netapp, managed by school IT).

Syncthing [0] + whatever handles local snapshots [1,2] work wonders [3]

[0] [https://syncthing.net](https://syncthing.net)

[1] [https://github.com/zfsnap/zfsnap](https://github.com/zfsnap/zfsnap)

[2] [https://gitlab.com/moviuro/butter](https://gitlab.com/moviuro/butter)

[3] [https://try.popho.be/securing-home.html](https://try.popho.be/securing-
home.html)

------
vagab0nd
Slightly off topic, but I've learned (the hard way) that the most important
thing in a backup/restore process is the _restore_ part.

It doesn't matter how reliable your backup is. If you can't restore it you've
done nothing. So far I've managed to:

\- Back up encrypted content without saving the decryption key

\- Back up everything to a remote server without backing up the ssh key to
that server

So do yourself a favor: don't just back up. Try restoring some of your content
from time to time (or at least imagine how you would do it).

Another scenario I've been thinking about recently: if the house burns down
tonight and you lose everything, phones/computers/etc., would you be able to
get your data back?

EDIT: formatting

------
mobilemidget
(many others including myself, have been there, different way though for me)

The bright side is, you'll never make this mistake again. The downside is, for
me at least, that I end up with triple+ backups of everything...

~~~
h4myio
I can defiantly see myself ending up with triple+ backups of everything...

~~~
tubbs
Not to be a grammar Nazi, but I believe the word you're looking for is
_definitely_. I wouldn't have normally picked it out, but I did see you use it
like that in the article as well.

It was a good read! Thanks for taking your time to write it up.

------
sn
The recommendation to only wipe the first and last parts of the drive before
reusing it is inadequate. You're best off wiping the entire drive with either
zeros or hdparm security-erase-enhanced. The reason is that not all metadata
is kept just at the beginning or end of the drive, especially if you have
multiple partitions on the drive.

If that is untenable, mdadm has an option --zero-superblock, lvm has pvremove,
and wipefs will remove file system signatures, which you can run before
changing the storage layout.
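
Concretely (device names are examples):

    
    
      # targeted metadata removal before changing the storage layout
      mdadm --zero-superblock /dev/sdX1
      pvremove /dev/sdX1
      wipefs -a /dev/sdX
      # or the blunt instrument: zero the entire drive
      dd if=/dev/zero of=/dev/sdX bs=1M status=progress
    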

~~~
h4myio
Thanks for pointing this out.

Do you have an example (or rather a source) of RAID metadata outside of the
first/last 1MiB of the drive?

------
rootusrootus
I will say, when I was reading that, I got nervous as it went along. You are
(were!) less paranoid than I am. Welcome to the club ;-). This is how you get
more backups...

------
cik
I back up all my personal data from each associated desktop via SpiderOak.
For personal server data I rely on a combination of rsnapshot, rsync, and
SpiderOak.

I also rsync to an external drive bay that I swap out weekly.

At the end of the day, every single machine in my network of things, including
my desktop, can light itself on fire - and I don't care. If my office or house
is bombed, I don't care. Now, if my city were bombed... I'd argue that I
probably have other fish to fry.

------
evanweaver
> By doing so, I effectively set off a time bomb. That single sector stayed
> around, lurking, waiting for the right moment.

Technically this is a booby trap, not a time bomb. ;-)

------
aritmo
There was a recent example with ASRock motherboards that are supposed to have
compliant RAID support:

[https://news.ycombinator.com/item?id=18541493](https://news.ycombinator.com/item?id=18541493)

Should open source software fill up with quirks for the mistakes of other
software? This looks more like a bug report for the other RAID software, which
should clear up any obsolete flags and enforce a sane state of the RAID.

------
fipple
Damn, this kind of screwing around with RAID is not what you do with your
actual personal data. It’s like building your own car for your daily commute.

~~~
porknubbins
Yeah, when I first got into building PCs it was fun to do the most
cutting-edge, overclocked, every-possible-feature setups I could, and then 3
years later have no memory of how things worked when I had to replace or fix
something. Now I do absolutely bone-stock, simple, idiot-proof builds and get
90%+ of the performance.

------
HenryBemis
Many years ago I was the go-to guy among friends and family for recovery. I
would try my best to recover as much as I could, then charge them €100, go
buy them 100 blank DVDs and an external DVD writer, and have them burn their
"My Documents" every week. It was a time when CrashPlan and Carbonite were
not around.
------
z3t4
These commands look a bit dangerous:

    
    
        # dmsetup remove_all
        # dmraid --activate y --format isw
    

Is that y a "yes" flag!? E.g. answer yes to any question?

In my experience, working with anything besides simple mirrors in a RAID
config will eventually result in loss of data, and most data loss is because
of human failure rather than disk failure.

~~~
zaarn
No, IIRC the --activate flag accepts either 'y' or 'n', so you can activate
as well as deactivate the array (LVM has similar flags).

------
cm2187
That sounds like a very complicated way to do a full OS backup.

Dead easy, 10-minute, can't-fail alternative for home projects: use a VM
instead. Switch off the VM, copy the disk file, switch the VM back on. Use
NVMe SSDs and, depending on the size, you can probably do an automated daily
backup with minimal downtime.
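
A sketch, assuming KVM/libvirt and a VM named "homelab" (both assumptions):

    
    
      virsh shutdown homelab
      # wait for it to power off, then copy the disk image
      cp /var/lib/libvirt/images/homelab.qcow2 /mnt/backups/homelab-$(date +%F).qcow2
      virsh start homelab
    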

~~~
sokoloff
Anyone have a solution that can snapshot a running VM that has a PCI or PCIe
card passed through to the guest OS? I use snapshots on ESXi, but they don't
work for guests that have a card passed-through.

~~~
sn
Are you asking for a recommendation on how to do this with ESXi or a
recommendation for an alternative to ESXi?

~~~
sokoloff
Either is fine. If it's within-ESXi, that's a little easier, but I'm not
tightly/inextricably wed to that hypervisor.

------
saint_abroad
It's commendable that the OP was looking to back up, but has the reliability
of RAID against disk failure contributed to this being a "black swan" event?
Do we reduce failure rates to "acceptable" at the expense of neglecting
recovery?

------
hjek
> All it took for my RAID set to become invalid and me losing my data, was
> booting up Ubuntu Desktop 18.04 ISO image.

That's quite a claim. Was there really no way to recover anything, even with
testdisk or photorec?

------
erichocean
The only time I've ever lost any data is when trying to back it up.

------
bndw
"Digital data doesn't exist unless it's backed up in at least 3 locations".

I had a professor back in 2008 who hammered this into my brain, and I share it
every chance I get.

------
asaph
And... I'm backing up my data right now...

------
anc84
RAID, not even once. At least for me.

------
jeffrallen
Bummer, dude.

------
kkarakk
why would you try this out with "live" data? why not incrementally connect
drives and do it that way?

i'm basically asking what the impetus was for "just going for it", which is
probably a reason that other people share...

~~~
jacobush
Incrementally?

~~~
tomschwiha
I assume "incrementally" means having a base backup and storing the
differences in between.

Personally I prefer full backups - they feel more complete.

~~~
mnw21cam
I have a server with 300TB of data. Incremental is the only way that my
backups are going to be able to run every night.

But also, there are different styles of incremental. Some are dodgier than
others.

~~~
tomschwiha
I agree for this volume of data incremental is the best choice.

My projects so far ranged from 1-100GB, so a full daily backup is done quite
quickly, and it's easier to recover from compared to applying incremental
backups individually.

Well, always choose the right tool for the task.

~~~
Izkata
> compared to applying incremental backups individually.

At least one style of incremental backups can be restored in a single copy - I
use rsync to create hardlinks to the previous day if the file is unchanged
(based on modified time). This results in around 400M of additional disk usage
per day due to changes, for around 90G of data in my home directory.
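
That technique is essentially rsync's --link-dest: unchanged files become
hardlinks into the previous day's tree, so every day looks like a full backup
but only the changes cost disk. A sketch (paths illustrative):

    
    
      today=$(date +%F)
      yesterday=$(date -d yesterday +%F)
      rsync -a --link-dest="/backups/$yesterday" ~/ "/backups/$today/"
    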

