
Man accidentally 'deletes his entire company' with one line of bad code - bhartzer
http://www.independent.co.uk/life-style/gadgets-and-tech/news/man-accidentally-deletes-his-entire-company-with-one-line-of-bad-code-a6984256.html
======
nickpsecurity
What's most epic about this is it's in the UNIX Hater's Handbook. One of its
rants was how better-designed systems would warn you if you were going to nuke
your whole system. The reason is that a command to wipe the whole system was
more likely a mistake than a developer's or admin's intent. UNIX would do it
without blinking. Inherently unsafe programming and scripting combined with
tools like that meant lots of UNIX boxes went kaput.

And today, over _two decades later_, a person just accidentally destroyed his
entire company with one line, without warnings, on a UNIX. History repeats when
its lessons aren't learned. This problem, like setuid, should've been
eliminated by design fairly quickly after it was discovered.

[http://esr.ibiblio.org/?p=538](http://esr.ibiblio.org/?p=538)

EDIT: Added link to ESR's review of UNIX Hater's Handbook which links to UHH
itself. Nicely covers what was rant, what was fixed, and what remains true.
Linking in case people want to work on the latter; it also explains my sour
relationship with UNIX. :)

~~~
vinceguidry
One of the more fascinating aspects of human history is how much effort we're
willing to devote to creating varied senses of safety, even if that safety is
only an illusion.

Now, we can call this a failure of design, but really, people who rely on
technology they don't understand can't be saved by good design. Sure, this
particular case could be fixed by disallowing the recursive flag on the file
system root, but safety is never going to be the primary design concern of
any technological system.

Imagine if a sword were made with safety as a first-class concern. You can't
design a sword that can be used safely by the untrained. No weapon can be;
training with a weapon is a prerequisite for safely using it. Similarly, every
technology has to be understood by those using it. If you don't understand it,
you're just inviting trouble.

For a business using technology, the needs are actually fairly
straightforward. You need an understanding of what needs to be backed up, and
a process for performing the backups. If you've picked the former right
(backing up human-readable information rather than data only readable by
software programs that might go away in a crash), then risk is minimized.

~~~
bryik
In the UNIX Hater's Handbook, defenders of rm consider accidental deletion a
"rite of passage" and remark that "any decent systems administrator should be
doing regular backups" (see page 62). The author's response is funny:

 _“A rite of passage”? In no other industry could a manufacturer take such a
cavalier attitude toward a faulty product. “But your honor, the exploding gas
tank was just a rite of passage.” “Ladies and gentlemen of the jury, we will
prove that the damage caused by the failure of the safety catch on our
chainsaw was just a rite of passage for its users.” “May it please the court,
we will show that getting bilked of their life savings by Mr. Keating was just
a rite of passage for those retirees.” Right._

I'm surprised how relevant parts of this book are 22 years later.

[http://www.vbcf.ac.at/fileadmin/user_upload/BioComp/training...](http://www.vbcf.ac.at/fileadmin/user_upload/BioComp/training/unix_haters_handbook.pdf)

~~~
erik14th
Is there an alternative to allow scripted destructive actions without the risk
of deleting important stuff?

Modern OSes will warn you if you try to delete stuff, but you can still
ultimately do it anyway; I don't see this as something particular to UNIX.

The only similar problem I had was on Windows - 98, I guess - when I deleted
all my files that weren't read-only by fiddling with a .bat script.

~~~
whitegrape
Have an immutable filesystem, where "deletes" are recoverable by going back in
time. At least until you do a scheduled "actual delete" that will reclaim disk
space.

Another option (though last time I tried it, it didn't work..) is something
like libtrash:
[http://pages.stern.nyu.edu/~marriaga/software/libtrash/](http://pages.stern.nyu.edu/~marriaga/software/libtrash/)
Deletes become moves and you can really delete when you like.

Practically speaking, if you're quick, an 'rm' isn't totally destructive even
without backups. There's a good chance your data is still there on the disk;
it's just not associated with anything, so it could be overwritten at any
point. Best to mount the disk read-only and crawl through the raw bits to find
your lost data (I recovered a week's worth of code this way several years
ago).
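
That raw-bits crawl can be sketched with grep -a, which forces grep to treat
binary input as text (disk.img and the 'lost code' string below are fabricated
stand-ins; on a real disk you'd image the partition read-only with dd first):

```shell
# Fabricated stand-in for a dd image of a partition; in real recovery
# you would mount the disk read-only and image it before searching.
printf 'junk\0\0lost code: hello()\0more junk' > disk.img

# -a treats the binary image as text, -o prints only the matching bytes.
grep -a -o 'lost code: [a-z()]*' disk.img
# → lost code: hello()
```

The same -a trick works directly on a block device if you'd rather not stage
an image, at the cost of racing against any writes to the disk.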

~~~
toomuchtodo
> At least until you do a scheduled "actual delete" that will reclaim disk
> space.

And then your "actual delete" is where the data loss occurs :D

~~~
Avshalom
Right, but if you delete your entire file system there won't be anything left
to come along and do the "actual delete", so you're safe until someone comes
along with a rescue disk or otherwise mounts it on a system that knows how to
deal with this.

At the very least, when you rm important-file.txt instead of importanr-file.txt,
you have a chance.

------
AdmiralAsshat
This question that went unanswered in the replies bears repeating:

 _Any idea why the command actually ran? If $foo and $bar were both undefined,
rm -rf / should have errored out with the --no-preserve-root message._

 _The only way I can think of that this would have actually worked on a CentOS 7
machine is if $bar evaluated to *, so what was run was rm -rf /*._

As the above notes, I'm pretty sure recent versions of Redhat/CentOS actually
protect against this sort of thing.

On the off chance you're _not_ running a recent server, however, this could
also be avoided by using `set -u` in the bash script, which causes references
to undefined variables to error out.
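
A minimal sketch of the difference (the variable name is illustrative):

```shell
# Without -u, an unset variable silently expands to an empty string,
# which is exactly how "rm -rf $foo/$bar" can collapse to "rm -rf /".
bash -c ': "$undefined_dir"' && echo "default: silently expands to nothing"

# With -u, referencing the unset variable aborts before any rm can run.
bash -c 'set -u; : "$undefined_dir"' || echo "set -u: aborted with an error"
```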

~~~
harryf
I believe those variables were not handled by the shell but rather in an
Ansible "playbook" - see
[http://docs.ansible.com/ansible/playbooks_variables.html](http://docs.ansible.com/ansible/playbooks_variables.html)

i.e. the variables were being expanded in a Jinja template and, because they
were undefined, rm -rf {{foo}}/{{bar}} was transformed by the template engine
into rm -rf /

~~~
Torgo
The playbook will fail if there are undefined variables, so I find the story
suspect.

~~~
weaksauce
Is there a version of ansible that doesn't have this behavior?

~~~
harryf
Or perhaps the variables were defined like

    foo = ""

Or were set via some function that could return a null

    bar = getValueOrNull()

------
mgbmtl
I'm a bit surprised the newspaper did not validate the source. They're
basically quoting a Super User / Stack Exchange thread which was probably a
troll.

If a hosting company had deleted 1535 client accounts, wouldn't we have heard
other stories about it from angry clients?

~~~
ufmace
It's kinda strange that major "mainstream" news publications are publishing
articles with web board posts as the primary source - indeed, as the entire
story itself - with basically zero extra work. They could at least try to
contact the guy, interview him, and make sure he at least seems legit.

~~~
CM30
That's not too surprising, unfortunately. With the internet and the obsession
with getting news out as quickly as possible (because hey, being the first to
report something gives you a lot more backlinks and clicks), the standard of
proof for a story has gone from 'a lot of evidence gathered through actual
investigation' to 'someone said this on an internet forum somewhere'.
Basically, the internet and social media rewards quick reporting, not accurate
reporting or verification.

It's still better than a lot of gaming news sites though, where 'some guy
mentioned something on Twitter/Reddit/4chan' is suddenly front page news
within ten minutes.

------
sp332
This felt a bit unlikely, but what really convinced me it was a troll was a
follow-up comment where he said he accidentally switched "if" and "of" in a dd
command.

~~~
chris_wot
If he mounted his backup media and wiped it, what makes you think he couldn't
cockup the dd command?

However, under Linux rm -rf / needs --no-preserve-root to work, right?

~~~
aroch
Depends on which version of `rm` you have. Newer versions (rm from coreutils
8.xx, I believe) require --no-preserve-root

~~~
profmonocle
And the question is tagged centos 7, which is from 2014 - far too recent to
not have --no-preserve-root.

~~~
JdeBP
[https://git.centos.org/log/rpms!coreutils.git/refs!heads!c7](https://git.centos.org/log/rpms!coreutils.git/refs!heads!c7)
indicates that CentOS 7 has coreutils 8.22.
[http://git.savannah.gnu.org/gitweb/?p=coreutils.git;a=commit...](http://git.savannah.gnu.org/gitweb/?p=coreutils.git;a=commitdiff;h=34e3ea055721ecc72e6f636700e8ba7b58069c65)
shows that GNU coreutils gained that option in 2003.

------
maus42
Does this article do anything other than paraphrase the Serverfault thread? On
first reading I thought they had contacted the poor fellow to confirm his
identity or something, but rereading, it would appear that they didn't: no
further information than the original source.

[https://serverfault.com/questions/769357/recovering-from-a-r...](https://serverfault.com/questions/769357/recovering-from-a-rm-rf)

(The user who asked that question now uses a nick, but had the real-sounding
name mentioned in the article when I first read that Serverfault question
earlier this week.)

edit. ...I really hope there isn't a real Marco Marsala someone pretended to
be. Search engine results for that name are not great ATM.

------
aleden
I accidentally set executable permissions on the following Makefile:

[https://github.com/samalba/acdcontrol/blob/master/Makefile](https://github.com/samalba/acdcontrol/blob/master/Makefile)

While typing quickly I tab-completed to 'Makefile' and hit enter. Although it
was a Makefile, it was executed as a bash script. bash ignored the incorrect
syntax and executed line 10:

rm -rf $(DIRNAME)/*

If make had parsed the file, $(DIRNAME) would have been nonempty. But it was
empty under bash.

--no-preserve-root did not protect against this, because the target of the
command was '/*'
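
The failure can be reproduced harmlessly. In make, $(DIRNAME) is a variable
reference; to bash, the identical text is command substitution, and since no
program named DIRNAME exists, it expands to nothing:

```shell
# bash runs a (nonexistent) command called DIRNAME and substitutes its
# output, so the expansion is empty and the rm target degrades to '/*'.
echo "the rm target becomes: [$(DIRNAME 2>/dev/null)/*]"
# → the rm target becomes: [/*]
```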

~~~
scintill76
How/why does this work without "#!" at the beginning of the file? I just
tested with fish shell, and I get "Exec format error. The file './x' is marked
as an executable but could not be run by the operating system." But, from bash
or dash it does execute commands from the file.

~~~
Tiksi
It'll fall back to sh, the default shell (or whatever it's symlinked to):
[http://paste.click/QxWUMG](http://paste.click/QxWUMG)

In my case that's bash; Debian-based systems use dash.

~~~
cyphar
That shouldn't happen: if you look at fs/exec.c (search_binary_handler), there
isn't a "fallback to shell" option, and fs/binfmt_script.c doesn't fall back
to shell either. Are you sure you don't have some weird binfmt_misc hook
enabled?

~~~
JdeBP
The "it" in what you are replying to is not the kernel, but the shell.

* [http://pubs.opengroup.org/onlinepubs/007908799/xcu/chap2.htm...](http://pubs.opengroup.org/onlinepubs/007908799/xcu/chap2.html#tag_001_009_001_001)

* [http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3...](http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_09_01_01)

~~~
cyphar
Well, that's just dumb. Why on earth should "source this random file as a
shell script" be the default?

~~~
JdeBP
Because that's how Unix originally worked, to put it simply. This is a whole
discussion subject in its own right, of course.

------
ams6110
Years ago, I worked in an investment bank and we had a programmer put a batch
program into production that executed the following as a shell command:

      rm -rf foo /

It was supposed to be:

      rm -rf foo/

It didn't run as root, but still managed to wipe out all the business data
files. What saved us was that the servers were configured with RAID 1 and
before the start of the nightly batch cycle, the mirror was "split" and only
one copy mounted.

So we just had to restore the missing files from the other half of the mirror
to revert to the start of the batch window and rerun the entire night's jobs.

~~~
gnarbarian
What happened to the programmer?

~~~
ams6110
Nothing. She continued to work there until after I moved on.

~~~
mfoy_
After a scare like that, I'll bet she never makes that mistake again, and
triple-checks for typos in dangerous commands.

------
twvisitavisitb
Since I didn't find any links in the article, here's the original post:

[http://serverfault.com/questions/769357/recovering-from-a-rm...](http://serverfault.com/questions/769357/recovering-from-a-rm-rf)

~~~
kamjam
It's in the 2nd paragraph, linked from "called Server Fault" text.

------
mpdehaan2
NOTE: this is a hoax:
[https://news.ycombinator.com/item?id=11514455](https://news.ycombinator.com/item?id=11514455)

------
spriggan3
I don't believe that's the truth for a second. Of course The Independent
didn't look at the company in question to see if there was any litigation
between this guy and his customers.

~~~
chris_wot
The Independent also didn't look up -r, which stands for recursively removing
directories...

~~~
dave2000
They said:

"the r deletes everything within a given directory"

which is what it does. Non technical readers aren't going to understand
"recursively remove directories".

~~~
chris_wot
Fair point.

------
CPLX
The real news here is that the Independent will write a feature story on a
successful forum troll. Where were they back in the days of the Fucked Company
message board when we could have used their help?

------
oluwie
Reminds me of the time I accidentally typed in 'crontab -d' instead of
'crontab -e'.

Those two letters are eerily close to each other.

~~~
wahnfrieden
This is one of the reasons why we have infrastructure as code now, so system
changes can be reviewed and tested just like application code, and more types
of accidents can be reverted via source control :)

~~~
djsumdog
In the article the guy is using ansible. He even had off-site backups, but
they were mounted before his ansible playbooks ran, wiping them out as well.

------
vbezhenar
A competent specialist will be able to help that guy. rm -rf / is easily
fixable if you don't touch the disk afterwards. Backups usually have a
recognizable format, so it's possible to recover the backup files and then
restore everything from them.

------
l0c0b0x
...and this is why we include a 'backup technology' question in our technical
interviews--where 'offsite backup' must be followed by something like "possibly
the most important type of backup because..."

~~~
AdmiralAsshat
You know the sad thing is that even this isn't idiot-proof and needs to be
qualified. One of my customer's brilliant "cost-saving" measures was to have
an offsite backup solution that was basically an rsync script that ran every
15 minutes.

So when someone on their end did something catastrophic to their data and it
took them an hour to notice, they were incredulous that we couldn't help them
restore their data even though it was "backed up offsite!" because their
"backup" solution had already caught up and duplicated the broken data.

~~~
ansible
And that's why if you're using rsync, you ought to be using rsnapshot instead,
and have generations of backups so that you are not overwriting your most
recent one.

~~~
SmellyGeekBoy
I find rdiff-backup is great as a drop-in replacement.

------
bhartzer
He even deleted the backups.

> the code had even deleted all of the backups that he had taken in case of
> catastrophe. Because the drives that were backing up the computers were
> mounted to it, the computer managed to wipe all of those, too.

~~~
drzaiusapelord
If you're not doing offsite and cold backups, then you're just asking for
trouble. If not crap like this, then a fire or a ransomware infection or a
malicious employee, etc.

~~~
jasonjei
He actually was doing a remote backup (although probably not a cold backup).
Unfortunately, he had used mount instead of rsync over ssh, making it
vulnerable to the rm -rf command.

~~~
zyxley
That's not a backup, it's a mirror.

~~~
jasonjei
Are you suggesting that you can't back up with rsync? Because you can do full
and incremental backups with rsync.

In fact, Time Machine on OS X looks like it does backups in this manner...

~~~
cyphar
Just using rsync to make copies isn't a backup. If you use rsnapshot (which
stores each copy separately) then you have a backup. Copies are not sufficient
if you find out that something broke three weeks ago.

------
jo909
While, as others have already pointed out, this story seems a little fishy, it
serves well as a prompt to reflect on whether something like this could in
theory happen to your infrastructure.

Do you have your backup servers in the same configuration management software
(ansible, puppet, ssh-for-loop etc) as the rest of the servers? One grave
error (however unlikely) in your base configuration really can take down
everything together in one fell swoop.

How "cold" are your backups? If the backup media are not physically
disconnected and secured, you can most likely construct a scenario where the
above, malware, a hacker or a rogue admin could destroy both the backups and
the live data.

I will certainly suggest some additional safeguards for our backups.

~~~
ansible
Yep, that's what I hope everyone will be doing... thinking about their own
backups and infrastructure.

We have backups off-site on disconnected media, so that alone prevents the
kind of accident we're talking about.

We use btrfs send / receive to send OS images from the primary container host
to the backup container host. The snapshots are read-only, so I'm fairly sure
I can't just 'rm -rf' them, I'd have to actually 'btrfs subvolume delete
foobar' them.

I should try that though on one of the test servers...

------
castratikron
The bash -e and -u options might have saved him here:

[http://redsymbol.net/articles/unofficial-bash-strict-mode/](http://redsymbol.net/articles/unofficial-bash-strict-mode/)

~~~
giovannibajo1
This. All my scripts begin with "set -euo pipefail", and my editor's linter
complains loudly if that line isn't there.

I wish distros would migrate, over the years, to making those settings the
default. Even if it took a while, I think it would be priceless
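
A quick illustration of what pipefail adds on top of -e and -u (no real
commands harmed):

```shell
# By default a pipeline's status is its *last* command's status, so an
# early failure in the pipe is silently swallowed.
bash -c 'false | true' && echo "default: pipeline reported success"

# Under the strict header the same pipeline fails, and -e stops the
# script before anything after it can run.
bash -c 'set -euo pipefail; false | true; echo unreachable' \
  || echo "strict mode: stopped at the broken pipeline"
```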

------
nihonde
Any script that includes rm -rf followed by variables in a path is an accident
waiting to happen. Mounting the backup volumes is just icing on the cake for
this extremely incompetent web hosting provider.

It made me nervous to type rm -rf in this comment form. Those letters are dark
magic.

~~~
smegel
> Mounting the backup volumes

That sounds more like an accident waiting to happen than a single line of bad
code.

------
brador
Why is the data not recoverable?

Maybe things have changed, but rm doesn't zero out the drive. And since the
backup was rm'd too, it should all be recoverable. Or am I missing something?

~~~
Qantourisc
Not directly, no, but some FSes make it hard to recover the file structure,
which in some cases is a big problem. You could probably recover files, but if
the backups aren't stored in a tar/zip/... file, it will be hard to recover
both the data and the structure.

------
redbeard0x0a
Too bad he wasn't running Illumos (OpenSolaris) based servers (or even some
Solaris versions), which would have just flat out refused to run rm -rf /

~~~
neerdowell
It's required by POSIX 1003.1-2013 that rm refuse to remove the root
directory[0].

[0]
[http://pubs.opengroup.org/onlinepubs/9699919799/utilities/rm...](http://pubs.opengroup.org/onlinepubs/9699919799/utilities/rm.html)

------
Xcelerate
Toy Story 2 was almost entirely deleted because of this same problem:

[http://thenextweb.com/media/2012/05/21/how-pixars-toy-story-...](http://thenextweb.com/media/2012/05/21/how-pixars-toy-story-2-was-deleted-twice-once-by-technology-and-again-for-its-own-good/#gref)

------
ksenzee
He says he's recovered almost all the data. FWIW.
[https://serverfault.com/questions/769357/recovering-from-a-r...](https://serverfault.com/questions/769357/recovering-from-a-rm-rf#comment970897_769400)

------
thinkmoore
As part of my PhD research, I developed a shell scripting language
(shill-lang.org, previously on HN:
[https://news.ycombinator.com/item?id=9328277](https://news.ycombinator.com/item?id=9328277))
with features that provide safety belts against this sort of error. From
speaking to administrators and developers, we believe these types of errors
cause much more worry and cost much more time than they are worth.

Now that I'm graduating, we've started the process of refining Shill into a
product that we can offer to administrators and developers to make their lives
simpler. If this sounds like a tool you wish you had (or if you wish a similar
tool existed for your platform of choice), we'd love to hear from you.

------
drinchev
According to a comment on the ServerFault website, he actually managed to
recover the data [1]. He consulted a data recovery company and they gave him a
list of the files that they managed to save [2].

1 : [http://serverfault.com/questions/769357/recovering-from-a-rm...](http://serverfault.com/questions/769357/recovering-from-a-rm-rf#comment970897_769400)

2 : [http://serverfault.com/questions/769357/recovering-from-a-rm...](http://serverfault.com/questions/769357/recovering-from-a-rm-rf#comment971005_769400)

------
ikeboy
Reminds me of [https://archive.is/9R2j8](https://archive.is/9R2j8)

(Original thread has been deleted.)

~~~
bpchaps
That's awesome.

A company I left a while back recently had two servers accidentally rebooted
through some sort of automated task (probably puppet). The fine, I'm told, was
one billion dollars.

Someway, somehow, he still works there. :)

~~~
ikeboy
Details, company name?

~~~
bpchaps
It's a very large French bank. I don't know anything except that it was a
paired batch-processing server, and honestly I didn't push further questions.
For some reason I found absolutely nothing in the news about it, but my source
of information is credible. It doesn't surprise me for a second that it
happened, since, for example, I spent months trying to get these guys to fix
their literally useless MQ DR failover scripts, but nothing ever came of it,
since they didn't have anywhere to test.

With the way they treat their employees - fucking good riddance. There was a
giant mess when Disney forced their NOC to train their replacements, and yet
these guys did the exact same thing, plus some, and there was no public
awareness during or after it. The best part was their push to move everyone to
Montreal. Lower pay, no guaranteed extension, and you're forced to move?
Okay.

The AMRS CTO actually left about a month after he got the position and took me
along with one other person over to a new company. Goldman's head of tech
actually just left to go to the same place. Not gonna lie, it sounds
incredibly suspicious, especially considering the kinds of shenanigans that
went on there... thankfully I'm no longer working there.

It's a very, very strange place in finance.

------
OSButler
I once had an incident with a server which triggered notification alerts about
a failing httpd service. While I was looking into the issue, the mail service
suddenly stopped working, then the database service went down - it was like a
slow cascading failure, affecting all services on the server one after the
other. I finally noticed the 'rm' command in the process list and asked the
client if he ran any custom commands as root on the machine. Turns out he
followed the instructions on a website to install some custom software without
checking any of the commands and just copied & pasted them into the prompt. He
basically managed to "rm -rf" on / and deleted his own server.

Luckily recent backups were available, so the damage was rather small, but it
was interesting to see someone just pasting & executing commands without
knowing what they actually do, especially when logged in as root.

------
wilkystyle
Did he change his ServerFault username? I see commenters referencing
@MarcoMarsala, but the OP's name appears to be bleemboy at the moment[0]

[0] [http://serverfault.com/questions/769357/recovering-from-a-rm...](http://serverfault.com/questions/769357/recovering-from-a-rm-rf)

~~~
maus42
Looks like he did; the nick was MarcoMarsala earlier this week.

------
amelius
+1 for snapshotting filesystems.

~~~
creshal
Or any backup solution worth its salt, including "RAID1 where you yank out and
replace one of the drives every other week".

~~~
robinson-wall
RAID is not a backup.

... Especially when your RAID is busy rebuilding for N hours every other week.

~~~
daveguy
Well RAID 1 would be a backup if you yanked out and replaced a drive every
week. In that case it would be a weekly snapshot.

~~~
DDub
Unless you're mirroring across more than 2 drives, you have an AID setup.

~~~
daveguy
You are incorrect. RAID 1 is a mirror setup. There are two drives with exactly
the same information. One of the two drives is redundant. RAID 1 does not
include striping and only requires 2 drives for redundancy.

~~~
Sanddancer
I think what DDub is getting at is that there is no redundancy for the data
received while the disk is mirrored to the new twin. For that, you'd need a
mirrored pair plus a drive to yank out as the backup.

~~~
daveguy
Ah, that's a good point. When you first put the fresh drive in it would be AID
for a while... and no one likes AIDs.

------
blaze33
Instead of doing system administration as root, couldn't we have a system
user with the same privileges as root except without write access to the files
of some users (like your clients)?

So you could still rm -rf / all you want, delete everything, but still have
/home or /var/www content untouched.

We run certain programs with limited privileges to mitigate risks (bugs,
exploits, etc.), so why shouldn't we also limit the privileges of root to
mitigate the risk of buggy system administration?

Obviously, having actual backups and testing your code before applying it to
production is good practice, but I feel like doing system administration as
root while having potential bugs in your sysadmin code (as in any other
software) leaves the door open to the next catastrophic failure.

------
jmiserez
Nothing to see here, it was a hoax/troll:
[https://meta.serverfault.com/questions/8696/what-to-do-with-...](https://meta.serverfault.com/questions/8696/what-to-do-with-the-rm-rf-hoax-question)

------
ausjke
With 'set -u' he could have stayed safe. Bash probably should never have
treated undefined variables as valid but empty values; it's so dangerous.

------
sickpig
[http://serverfault.com/questions/769357/recovering-from-a-rm...](http://serverfault.com/questions/769357/recovering-from-a-rm-rf#comment970897_769400)

"luckily we recovered almost all data!"

------
coldtea
> _Together, the code deleted everything on the computer, including Mr
> Masarla’s customers ' websites, he wrote. Mr Masarla runs a web hosting
> company, which looks after the servers and internet connections on which the
> files for websites are stored. _

And he has no backups? Including rolling backups in unconnected storage?

> _Mr Marsala confirmed that the code had even deleted all of the backups that
> he had taken in case of catastrophe. Because the drives that were backing up
> the computers were mounted to it, the computer managed to wipe all of those,
> too._

Then the company probably deserved to die. Sorry for the customers though...

------
miles
"Most users agreed that it was unlikely that Mr Marsala would be able to
recover any of the data."

Perhaps they are unfamiliar with extundelete?
[http://extundelete.sourceforge.net](http://extundelete.sourceforge.net)

------
TheCams
Sorry if that's a stupid question, but does that mean he was running his
script as root?

~~~
smhenderson
He was running with full administrative permissions and thus file access
permissions on files were ignored.

"root" is the name of the default administrative account on Unix and Unix like
systems.

~~~
TheCams
Thanks :)

------
callesgg
Once I did sudo rm -R . when I was in /var.

By the time I discovered what I had done and stopped it, /var/www was already
gone.

Luckily we had backups, but that sure did teach me a lesson about rm.

These days I look very carefully before using rm -R, and I also type the
entire path.

~~~
rm_-rf_slash
Sometimes seeing certain usernames can get you to accidentally write the wrong
thing.

~~~
goda90
Reminds me of the guy who chose the Xbox Live gamer tag 'XBOX TURN OFF'

------
Schwolop
I managed to sudo chown -R {useless_user}:{useless_user} {foo}/ with foo
undefined, whilst simultaneously distributing that command with dsh to our
entire cluster of 10 machines. This was after testing that everything worked
on the development machine. So of course, I retraced my steps to find out what
went wrong, and killed the development machine too.

The upside is that we knew we had issues, and with everything broken the onus
is on the right people to ensure they're fixed before we get distracted by the
next shiny feature.

Sometimes, setting your servers on fire _is_ the solution to technical debt.

------
Bahamut
This made me think of this quote:
[https://twitter.com/devops_borat/status/41587168870797312](https://twitter.com/devops_borat/status/41587168870797312)

More seriously, this isn't the first time I've heard of rm -rf backfiring. One
of my friends said that at one place he worked, an IT guy walked out one day &
never came back after trying to fix a co-worker's computer. He found out
afterwards, by investigating on his co-worker's computer, that the IT guy must
have run rm -rf as root & wiped out everything.

------
mofle
There are many ways you can safeguard [0] `rm`, but in the end, it's better to
just use a tool that moves files to the trash [1] instead.

[0]: [https://github.com/sindresorhus/guides/blob/master/how-not-t...](https://github.com/sindresorhus/guides/blob/master/how-not-to-rm-yourself.md#safeguard-rm)

[1]: [https://github.com/sindresorhus/trash-cli](https://github.com/sindresorhus/trash-cli)
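
The deletes-become-moves idea is small enough to sketch as a shell function
(the trash name and TRASH_DIR location are illustrative; real tools like
trash-cli also handle name collisions and restore metadata):

```shell
# Minimal "rm" stand-in: relocate instead of unlink. Name collisions and
# cross-filesystem moves are ignored here; real trash tools handle both.
trash() {
  local dir="${TRASH_DIR:-$HOME/.trash}"
  mkdir -p "$dir"
  mv -- "$@" "$dir"/
}

# Usage: the file leaves its directory but remains recoverable.
touch precious.txt
trash precious.txt
```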

------
jchrisa
I lost the private key to one of my AWS servers after it had had a traffic
spike due to blog coverage[1]. It was a toy system so it was using local
storage, but then it became sort of popular. Luckily I had a process monitor
set up so it managed months of uptime before something happened that I
couldn't do anything to fix.

[1]
[http://waxy.org/2008/04/exclusive_google_app_engine_ported_t...](http://waxy.org/2008/04/exclusive_google_app_engine_ported_to_amazons_ec2/)

------
bashinator
I would like to point out that requiring `set -u` at the top of all your
production bash scripts will prevent this kind of disaster - the script will
fail if unassigned variables are referenced.

~~~
cpeterso
If anyone knows bash, it's bashinator! :)

------
megacity
Is there ever a situation where someone would want to rm -rf / ?

~~~
__david__
Nope (almost never), which is why GNU rm requires the '--no-preserve-root'
flag if you actually want to do that for some reason.

~~~
timseal
In the manpage for rm, I see "--no-preserve-root do not treat ‘/’ specially
(the default)". For real? The default is to do the worst thing?
~~~
cyphar
No, the default is to do nothing. The (default) refers to "treat root
specially" not the flag.

------
austinjp
Yeah so I typed rm -f * the other day after typing rm -f *~ repeatedly in a
few different directories. In the 2 seconds it took me to realise, I lost a
lot of data. First time I've made that particular typing slip-up in many
years. Thankfully I had backups to restore from. Real heart-sink moment.

Sure, there should have been aliases for rm -i and I shouldn't have used -f
etc etc etc. But sometimes this stuff is going to happen.

------
Animats
This is what comes from treating undefined variables as empty, rather than as
errors. Bad language design in the shell.
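
The bug in question is exactly this pattern (the variable names here are made up): with both variables unset, the path silently collapses to the filesystem root.

```shell
#!/usr/bin/env bash
# Both variables unset: the shell substitutes empty strings rather
# than raising an error, so the argument becomes just "/".
unset SERVER_ROOT APP_DIR
path="${SERVER_ROOT}/${APP_DIR}"
echo "the command becomes: rm -rf $path"
```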

------
xlm1717
One take-away from this is that it's probably better to save your backups
somewhere where you can't delete them. Make sure that nothing using rm touches
your database backups. Also, try to keep them backed up in multiple places.
For example, store backups on a server you own, and on a cloud server, like on
S3.
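
A sketch of the idea, using temporary directories as stand-ins for the two independent targets (in practice the second copy would go to remote storage such as S3, ideally somewhere the production box has no delete permission):

```shell
#!/usr/bin/env bash
set -eu

src=$(mktemp)            # stand-in for the database dump
echo "dump contents" > "$src"
target_a=$(mktemp -d)    # stand-in for a server you own
target_b=$(mktemp -d)    # stand-in for cloud storage

# Date-stamped copies to two independent locations.
stamp=$(date +%F)
cp -- "$src" "$target_a/db.dump.$stamp"
cp -- "$src" "$target_b/db.dump.$stamp"
echo "backed up to two independent locations"
```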

~~~
shadeless
Another way this could have been avoided is if he had used the
"--one-file-system" flag, which wouldn't have deleted the backups, as they
were mounted on a separate filesystem.
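
A quick illustration of the flag on a throwaway directory (harmless here, since everything is on one filesystem; on the machine in the article, a backup volume mounted underneath would have been skipped):

```shell
#!/usr/bin/env bash
set -eu

scratch=$(mktemp -d)
mkdir -p "$scratch/data"
touch "$scratch/data/file"

# --one-file-system (GNU rm) refuses to descend into directories
# that live on a different filesystem than the starting argument.
rm -rf --one-file-system "$scratch"
[ ! -e "$scratch" ] && echo "scratch tree removed"
```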

------
pmlnr
Nearly did the same thing once by messing up the ordering of flags. Thankfully
this was before devops tools were present, so a ctrl-c stopped the wiping
before it got too deep, but a Friday afternoon downtime is still bad.

Tape/Blu-ray disc backups can come in really handy in these cases, since they
aren't easy to wipe.

------
cha5m
I was expecting something much more subtle.

I guess the best course of action to prevent this would be to alias rm to a
custom script that parses the arguments to make sure the root directory is
never recursively deleted, and then calls the real rm.
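
A minimal sketch of such a wrapper (the name safe_rm is made up; in practice you'd save it as a script and point the alias at it):

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: refuse the filesystem root, delegate
# everything else to the real rm.
safe_rm() {
    local arg
    for arg in "$@"; do
        case "$arg" in -*) continue ;; esac   # skip option flags
        # realpath -m normalizes "//", "/./", etc. without requiring
        # the path to exist (GNU coreutils).
        if [ "$(realpath -m -- "$arg")" = "/" ]; then
            echo "safe_rm: refusing to remove '/'" >&2
            return 1
        fi
    done
    command rm "$@"   # `command` bypasses the alias
}

safe_rm -rf / || echo "blocked"
```

Anything that doesn't resolve to the root falls through to the real rm unchanged.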

------
anotheryou
He got it recovered:

[https://serverfault.com/questions/769357/recovering-from-
a-r...](https://serverfault.com/questions/769357/recovering-from-a-rm-
rf#comment970897_769400)

------
noonespecial
OR "Man learns the value of backups because sometimes things go wrong."

~~~
Raphmedia
From the article: "All servers got deleted and the offsite backups too because
the remote storage was mounted just before by the same script (that is a
backup maintenance script)."

~~~
cyphar
That's not a backup. If your "backup script" requires mounting the backup on a
production machine, then it's merely a second copy of your data.

------
kilroy123
How is this even remotely possible..?

- No developers have a local copy of code on their machines?

- No backups at all?

Worst-case scenario, couldn't you attempt to retrieve the data from the hard
drive? Though, the database(s) would likely not be retrievable.

~~~
npolet
This is what I thought. I would have to go out of my way to completely nuke
the servers I work on. I'm trying to understand what structure this guy's
company had if everything can be mistakenly deleted without any chance of
recovery.

Maybe I misread the article and he runs a niche hosting company that has
different requirements, but it seems strange to me to be able to completely
remove your online body of work in a matter of minutes.

------
BinaryIdiot
According to the thread they were able to recover almost all of the data so
far. So the whole _"deletes his entire company"_ claim no longer seems
accurate. Still pretty crazy.

------
chiph
Not a Unix admin, but can you swap the rm command for a different executable
that prompts you when the -rf option is specified?

~~~
wutbrodo
You can just alias rm to a script of yours that does just that with like, one
extra line of bash. I've done this for a couple of commands where I prefer
default behavior that isn't specifiable by flags.

------
justinlardinois
It's weird to me that commenting on a Stack Exchange question could get you
quoted on several major news websites.

------
rurban
Shouldn't the proper fix be implying -i on 'rm -rf /'? At least in some fork
of the coreutils.

------
beloch
Correct me if I'm wrong, but rm doesn't wipe data out, it just deallocates the
disk space devoted to it. If you actually managed to wipe out your entire file
system with rm you could likely still recover your data with a recovery tool.

This story smells a wee bit fishy to me.

~~~
milkey_mouse
It's just hard to get the filesystem entries for the file back. rm doesn't
specifically wipe, you're right, but the filesystem entries are deleted, which
means you basically have to grep the disk for bits of the file you want with
known contents.

------
adultSwim
Oops

