Unix Recovery Legend (1986) (ryerson.ca)
138 points by ColinWright on June 14, 2014 | 52 comments



"rm -rf /" was how I advanced in my career.

A long time ago, in a job far, far away, I was a lowly CAD operator making technical drawings of electrical diagrams and switchgear enclosure plans. This was on Tektronix terminals, each attached to its own Tektronix server (which, as I recall, ran BSD at first and then switched to SVr4).

A fellow lowly CAD operator decided to try out the "rm -rf" command (at that time we each had superuser access to our servers - innocent days) on his server. In those days you could hear the hard drive access each sector, and so it was I heard the faint but distinctive click-click-click of the read/write head slowly but surely going down the sectors as the rm command took hold, followed by my cow-orker's face gradually but increasingly turning white, going through to green, as he realised that a major project's 2 months of un-backed-up CAD drawings were gurgling down the virtual drain (again innocent times - no backups).

Being a bit of an unofficial Unix nerd (I whiled away my spare time there learning how these boxes worked), I had the chance to show my knowledge, explain what went wrong, and how we could prevent this in the future. The result, after 2 weeks of overtime for all to re-draw and catch up with the 2 months of lost project work, was removal of root access to the systems for all but myself and other key personnel, and a tape backup system with a rigidly kept backup schedule. We had no loss of data after that.

I eventually took charge of the CAD dept, years later ended up as manager of IT at a different company, and now run my own IT company ;)


LOL One day working on a very old machine many years ago in a galaxy far far away at the end of a 12 hour coding binge I accidentally typed:

   rm -f *.c
instead of:

   rm -f *.o
I then typed:

   make
I saw tons of missing files, and in literally maybe 200ms I hit the reset button on the machine's console! I happened to be sitting right there, and the only thing I could think was that these blocks were still on the free list and maybe the disk hadn't even synced yet (did I mention this was Xenix, and 1985?).

I rebooted the machine from an 8" floppy that had a copy of fsdb(ADM) and started sniffing around. Sadly the file changes had been sync'd, but I found my code on the free list in the filesystem! I then wrote a C program to dump the free blocks and proceeded to reassemble them into the original C file I needed to recoup my 12 hours of work. Luckily most of the files were the same as in a prior backup; just one file had a ton of changes from that session. It took me a couple of hours to get perfect, but it was better than rewriting everything I'd done in the past 12 hours.

Needless to say I then added a "make clean" target and I got much better at backing up my code every couple of hours instead of being a code zombie and waking up 12 hours later without backups!

With today's modern filesystems I have no clue how you could pull off a hat trick like this but that was then and I used the tools I had at my disposal to recover from my own blurry eyed mistake.

A friend of mine once said to me:

  It's not how bad you mess up it's how well you recover.
  And you're a professional at recovery.
I'm still not sure if that was a compliment or an insult...


With Linux, I seem to recall grepping the disk under /dev/ for the contents of the deleted file. It's actually why I still religiously put the name of the file at the top of the file, even though git has kept me from accidentally deleting anything in recent years.
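
A rough sketch of that kind of search, assuming GNU grep and using an ordinary image file where a real system would have a partition device. The helper names find_offsets and dump_at are made up for illustration; the marker would be something you know was in the file, such as its name in a header comment.

```shell
# Hypothetical helper: print the byte offset of every occurrence of a
# fixed marker string inside a raw image or block device.
#   -a  treat binary data as text      -b  print the byte offset
#   -o  print only the match itself    -F  marker is a literal string
find_offsets() {
    grep -a -b -o -F -- "$1" "$2" | cut -d: -f1
}

# Hypothetical helper: dump COUNT bytes starting at OFFSET for inspection.
dump_at() {
    dd if="$1" bs=1 skip="$2" count="$3" 2>/dev/null
}
```

On a real machine the second argument to find_offsets would be the device holding the filesystem, read as root, ideally with the filesystem unmounted or mounted read-only so the freed blocks are not reused while you search.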


I worry that svn and git have done more harm than good, because I've now got into the habit of using rm -rf (it's the only way to delete .git/.svn) where I used to use rm -r.


There is a great story in Ed Catmull's new book Creativity, Inc. that tells how they handled the time when rm -rf ~/* was run on the Toy Story 2 servers, deleting 90% of the finished film weeks before the release. Not a problem, until they realized the backup system had failed. I won't spoil how it ended, but they recovered (not a technical solution), and it shows how to handle random screw-ups. The book is a great read in general, as it showcases their technical achievements as well as how to manage a creative culture.


For more details, here's my answer in a Quora thread from 2 years ago... http://qr.ae/sjteB


Yep - nice animated piece with interviews here: https://www.youtube.com/watch?v=EL_g0tyaIeE


And a link to one of the previous threads here on HN... https://news.ycombinator.com/item?id=3972798


At $work, I walked an admin through changing some settings of some software I had co-developed. After we were done, we deleted some caches:

cd /var/lib/$application/cache

find -type f -delete

But that application was actually load-balanced on a second machine; we didn't need to change the configuration there, but still had to delete the caches. So he ssh'd (as root) right into it, and copy&pasted the 'find -type f -delete' right away, and pressed enter while I cried NOOOOOOO. Too late he realized he didn't change the directory first.

He was about to log out, and instead ssh into the backup server to recover the lost files, but I stopped him from logging out - the previous operation had erased the .ssh/authorized_keys file from /root.

In the end, no harm was done, the backups were only 8 hours old, and nobody noticed any interruption.
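
One way to make a cleanup line like that safe to paste anywhere is to give find an explicit starting path instead of relying on the current directory. A minimal sketch, where clear_cache is a hypothetical helper name and -delete is assumed to be available (it is in GNU and BSD find, but is not POSIX):

```shell
# Hypothetical helper: delete regular files under an explicitly named
# cache directory, regardless of the shell's current working directory.
clear_cache() {
    dir=${1-}
    # Refuse to run unless the argument names an existing directory.
    [ -d "$dir" ] || { echo "clear_cache: not a directory: $dir" >&2; return 1; }
    find "$dir" -type f -delete
}
```

Called as clear_cache "/var/lib/$application/cache", it does the same thing whether or not the cd happened first, and does nothing at all if the path is wrong.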


> And the final thing is, it's amazing how much of the system you can delete without it falling apart completely.

We once had a server running a memory-only cache lose its hard-drive. It puttered along for weeks, still serving from memory, before someone got around to replacing it.


Watching rm recursively burn through your filesystem is a rite of passage.

I had a genius intern. Straight out of high school, his understanding of asynchronous design put plenty of my peers to shame. Course, his unix experience, especially permissions in this case, was a bit light.

I decided to use OSX's iChat to share my screen with him. As you know, iChat used to allow full control of the local machine to the remote viewer. My bright-eyed genius intern realized this:

    Oh cool! I can move your mouse!
...he says, as he starts clicking around...

    Can I type?!
...and he types "rm -rf /" in my open terminal.

In panic, I smashed every key on my keyboard, fortunately scoring a CONTROL+C relatively quickly. I hoped the damage was minimal, mostly praying it did not touch my home directory. I watched as every application I tried to open failed to start...

Every application I had installed, up to somewhere near X in the alphabet, had been deleted without prejudice before I managed to kill the command. Fortunately my home directory was safe; thank you "/Users" on OSX.

My intern's response was:

    That command won't do anything, I didn't sudo it!
That's the day I set up Time Machine on that system.


Oh man. If there is anything I have learned about the command line, it's: don't fuck around with rm.


Apple once did this to a few unsuspecting Mac users way back in the day. They released an iTunes installer that deleted the old copy of iTunes using this command:

    rm -rf $2Applications/iTunes.app 2< /dev/null
$2 held the path to the drive containing the copy of iTunes to be deleted. If the name of that drive began with a space, chaos ensued....
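
The mechanism is ordinary word splitting of an unquoted expansion in a POSIX-style shell such as bash. A small sketch that prints the arguments instead of deleting anything; show_args, unquoted, quoted and the example volume path are all made up for illustration:

```shell
# Hypothetical helper: print each argument it receives on its own line.
show_args() { printf '[%s]\n' "$@"; }

# Same shape as the installer line, with the second positional parameter
# holding the drive path. Unquoted, the value is split on spaces.
unquoted() { show_args $2Applications/iTunes.app; }

# Quoted, the whole path stays a single argument.
quoted() { show_args "$2Applications/iTunes.app"; }

unquoted pkg '/Volumes/ Disk/'
quoted   pkg '/Volumes/ Disk/'
```

With a drive name that starts with a space, the unquoted form hands the command /Volumes/ as its own argument, which is exactly what you do not want next to rm -rf.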


I think Mac OS X was still very new at that time. 10.1 had been just recently released.


As a very novice user coming from an MS-DOS background, I habitually logged in as root, even though I had heard dire warnings not to do that.

I had lots of dot-directories in /tmp and my disk didn't have a lot of space left. So I typed in

        rm -rf .*
thinking that I was very smart, and would save time by not having to remove each dot-directory separately.

After a while, I wondered why it was taking so long and discovered I had very little of my root filesystem left, so I was forced to do a complete reinstall.

Needless to say, I always do my day-to-day work now as a lowly common user, and NOT as root.


These days, whenever I do a recursive remove, I always include the parent directory name, just to force the habit. "rm -rf ../project/" (or . in your case). Same effect as ./*, but much clearer - to me, anyway.


heh, HN syntax ate the stars...


For future reference, "rm -rf .??*" will prevent traversing up the filesystem.
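
A quick way to see what a dotfile pattern will and won't pick up before trusting it with rm. The matches helper is hypothetical. Note that .??* needs at least two characters after the dot, so it skips names like .a; .[!.]* is a common alternative that catches those (but not names beginning with two dots). What a bare .* matches varies between shells and versions, so it is left out here.

```shell
# Hypothetical helper: list the names inside directory $1 that match the
# glob in $2. The pattern is expanded inside a subshell after the cd.
matches() {
    (
        cd "$1" || exit 1
        for f in $2; do
            # An unmatched pattern stays literal, so filter on existence.
            if [ -e "$f" ]; then printf '%s\n' "$f"; fi
        done
    )
}
```

For example, matches /tmp '.??*' lists the hidden entries in /tmp that the pattern would hand to rm, without touching the parent directory.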


I have almost experienced a disaster like this with a script containing something like -

rm -rf "$foo/$bar"

- when the variables were undefined due to a bug in the script. Fortunately I was not running it as root.


This is why i've been writing:

set -u

in every script i've written since i did exactly the same thing.
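
A small sketch of the difference, printing the path rather than removing it. The function names are made up; foo and bar are the variables from the parent comment. It also shows the ${var:?} expansion, which guards a single use and, unlike set -u, rejects variables that are set but empty.

```shell
# Default behaviour: unset variables silently expand to nothing.
target_plain()   { ( set +u; printf '%s\n' "$foo/$bar" ); }

# With set -u, expanding an unset variable is an error. The subshell keeps
# the option and the failure from affecting the caller.
target_nounset() { ( set -u; printf '%s\n' "$foo/$bar" ) 2>/dev/null; }

# ${var:?} fails if the variable is unset OR empty.
target_guarded() { ( printf '%s\n' "${foo:?}/${bar:?}" ) 2>/dev/null; }
```

With both variables unset, target_plain prints a bare "/", while the other two print nothing and return a non-zero status.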


There are some bad things that can happen in the UNIX shell.

Ofc. unset (empty) variables "rm -rf ${variable}/" ... I guess paths are given without a slash at the end (also for that reason).

But also other commands: "hostname", if you type "hostname help" then your new hostname is set to "help". For us that means a full rebuild/reinstallation of the machine is necessary.

This surely has been discussed before, but why not (again): Let me know of your horror stories, if you like.


A really simple typo - but it's always stayed with me. Many years ago on a Very Important Solaris Box(TM) I wanted to check when it had rebooted a few months before, so typed "last | grep reboot". Only I forgot to put in the 'grep'... (and if you're not a Solaris person - 'reboot' is pretty brutal on it - you may as well just pull the power cable out the back).

What a fun day that was....


Being primarily used to, and mainly using Linux.

Then using the killall command on a Solaris box, which just happened to be running the company's Oracle database...

Cue the support line phone ringing off its hook.

Cue BOFH Excuse : "Hmm it appears the server running the database has crashed. We're rebooting it now", etc etc.

Lesson learned : killall on a Sun/Solaris box behaves VERY differently from the killall on Linux :D


The similarity of "rm -rf *" and "rm -rf /" makes me nervous any time I "cd" somewhere and then issue the first.

There is no trash can in my Unix, deleted is gone forever.

A move (mv) or copy (cp) just overwrites files with the same name (as default)

Right clicking (&executing everything (every line) that just got pasted from the current clipboard/copy-buffer)


Why would you cd somewhere and then do rm -rf * when you can just rm -rf somewhere?

Having your shell expand * is just asking for trouble; on most systems you'll have no problems creating more entries in a directory than you can specify arguments on the command line, so rm -rf * will not work from 'somewhere' but rm -rf somewhere should still work fine.


> if you type "hostname help" then your new hostname is set to "help". For us that means a full rebuild/reinstallation of the machine is necessary.

What the hell? Why not just change it back?


I think the actual problem was that someone typed "hostname " (that is "hostname" and 2 blanks), which did set the hostname to " " (a blank). Yeah sure changing it back may work (it didn't for the colleagues that told me that).

Here is another one: I once shut down a server because I wanted to know how to shutdown & reboot. I typed "shutdown -h" (I thought that would display help-pages :)

It was the option to "halt" immediately.

We needed to open a ticket to start the machine up again.


I started using these recently:

set -o nounset

set -o pipefail

set -o errexit

set -o xtrace

I added them to all my scripts, and I also added all except errexit to ~/.bashrc

The first one would save you from unset variables.
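
The second one is less obvious, so here is a minimal sketch of what pipefail changes, assuming bash is installed. Each case runs in its own bash process so the option does not leak into the calling shell; the function names are made up.

```shell
# Without pipefail, a pipeline's status is that of its last command,
# so the failure of `false` is hidden by the successful `cat`.
status_default()  { bash -c 'false | cat; echo $?'; }

# With pipefail, the pipeline reports the failure of any stage.
status_pipefail() { bash -c 'set -o pipefail; false | cat; echo $?'; }
```

The first function prints 0 and the second prints 1. Combined with errexit, that is what lets a script stop when an early stage of a pipeline fails.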


The other classic rm: "rm * .o"

That's what taught me the importance of backups.


The best protection against this is ... to create a hundred million files in /aaa/ ?


Surely " ", as space is the lowest ASCII character you can reliably put in a filename. But only shell globbing is going to sort the output. Saving something from rm -rf of the superdirectory would require ensuring that the name pops out of readdir/getdents first, which is filesystem-specific and not always possible.


The best protection is to do chattr +i on all binaries and shared objects in /bin, /lib and /usr, and only lift it for upgrades. Possibly also for everything under /etc except /etc/mtab. And for everything under /boot, preferably mounting it read-only if it's on a separate fs.

It might get really annoying really fast though and only iron discipline will prevent you from forgetting to chattr +i something back after chattr -i to do an upgrade. And if you have such a good discipline, you're extremely unlikely to ever do a rm -rf /*.

It would be nice if you could have some hook in the package manager so you could write something to automatically chattr -i relevant files and then chattr +i them back. And a vim macro to do the same when you save a file in /etc.


And a complete copy of /bin in /z.


For some reason when I go to this site, my Dell monitor seems to do a humming noise. It gets slightly louder when I scroll to the bottom.



Reminds me of the "most stupid thing" I ever typed into Terminal.app. I somehow managed to create a subdirectory named "~", and decided it was a good idea to do this:

    rm -rf ~
It took me about 5 seconds to realize the command should have returned immediately, and frantically press Ctrl-c. But that was all that was needed to ruin two weeks' worth of code.

If you want to try a fun little experiment, type this command into your colleague's shell and see what happens (and how long it takes):

    mkdir ./~


Yep, did that in 2007. Let it run for minutes because I wasn't at my Mac. By the time I got back, I had a lot of free disk space and the stubborn '~' folder was still there since it was outside my home directory. Since then I used Time Machine or now CrashPlan to ensure I don't ever lose local files, accidentally deleted, again.


> If you want to try a fun little experiment, type this command into your colleague's shell and see what happens (and how long it takes):

> mkdir ./~

  $ time mkdir ./~

  real	0m0.007s
  user	0m0.000s
  sys	0m0.004s
I don't get it. What's fun about this experiment?


The user will ls their home directory, see this file, and attempt to rm it. The shell will instruct rm to remove their home directory instead.
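
What happens here is tilde expansion: an unquoted ~ at the start of a word is replaced by the home directory before rm ever sees it. A tiny sketch using a made-up show helper that only prints its argument:

```shell
# Hypothetical helper: print the single argument a command would receive.
show() { printf '%s\n' "$1"; }

show ~        # unquoted: the shell substitutes the home directory
show '~'      # quoted: the literal one-character name
show ./~      # path prefix: also literal, since ~ is not at the start of the word
```

Either of the last two spellings refers to the directory actually named "~".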


Ah. I feel a bit stupid for taking that so literally.

So, when i delete a directory, whenever possible, i use the rmdir command. rmdir refuses to delete a directory which isn't empty. It's a really handy safety check. If a colleague dropped a tilde directory on me, the process would go:

Well, i'll just delete that.

  $ rmdir ~
  rmdir: /home/twic: Directory not empty
Oh bugger, of course. Better quote it.

  $ rmdir '~'
Right, now i will proceed to give my colleague a Chinese burn.


Y u no source version control?


Local git repos, mostly. But I learned my lesson. Ever since that incident every LOC is pushed to GitHub, no matter how silly/hacky/not-yet-ready it is.


It's not always possible, but whenever I'm going to delete things with rm, I first test the pattern with an ls.

I have a coworker who never does this and it drives me crazy. He just blindly trusts his first attempt. And yes, he has accidentally deleted stuff, but thankfully nothing that was irrecoverable... so far.
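
For anyone who wants to force the habit, a rough sketch of a wrapper that does the ls first and then deletes exactly the list it showed. confirm_rm is a made-up name, not an existing tool.

```shell
# Hypothetical helper: preview what the arguments expand to, then ask
# before deleting exactly that list.
confirm_rm() {
    [ "$#" -gt 0 ] || { echo "confirm_rm: nothing to remove" >&2; return 1; }
    ls -d -- "$@" || return 1          # preview; fails if any name is missing
    printf 'Remove these %d item(s)? [y/N] ' "$#"
    read -r answer
    case $answer in
        [Yy]) rm -r -- "$@" ;;
        *)    echo "aborted"; return 1 ;;
    esac
}
```

Because the glob is expanded once by the shell and the same arguments go to both ls and rm, what you see in the preview is what gets removed.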


I like to use "trash" (i.e. "put in the Trash") instead of "rm" (i.e. "remove permanently").

For Linux there's a suite called "trash-cli" (on Ubuntu the binary is called "trash-put" so you will probably want to set up an alias). I guess there's something similar for OSX?


There is also libTrash.so, that when loaded replaces rm entirely.

Yet, I've lost interest since I got a nice backup routine.


There's rmtrash, which is in Homebrew.


you could just use: mv * ~/.Trash


I have an alias set for rm as follows, for the root account:

# alias rm

alias rm='rm -i'

It helps to prevent some stupidity, sometimes. Better than nothing.


zsh, or at least oh-my-zsh, asks you for confirmation if your rm command looks dangerous.


Very classic indeed and very inventive way of solving it.

The article should have [2006] in the title though.


> This classic article from Mario Wolczko first appeared on Usenet in 1986.

An argument could be made that [1986] would be more accurate.


Done that - thanks.



