
Wow, an awful bug -- and brings back memories of a very similar bug that we had back in the late 1990s at Sun. Operating system patches on Solaris were added with a program called patchadd(1M), which, as it turns out, was actually a horrific shell script, and had a line that did this:

  rm -rf $1/$2
Under certain kinds of bad input, the function that had this line would be called without any arguments -- and this (like the bug here) would turn into "rm -rf /".
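To make the failure mode concrete, here is a minimal sketch of how empty arguments collapse the path to "/", plus one common guard. The function names and the example path are hypothetical and this is not the actual patchadd code; it only prints the path that rm would have received, so it is safe to run.

```shell
#!/bin/sh
# Hypothetical helpers, not the real patchadd script. Nothing is deleted:
# each function only prints the path that "rm -rf" would have been given.

# Mirrors the buggy pattern: with no arguments, "$1/$2" expands to "/".
unsafe_target() {
    printf '%s\n' "$1/$2"
}

# Guarded variant: ${var:?msg} aborts when var is unset or empty.
# The subshell keeps the abort from killing the calling script.
safe_target() {
    ( printf '%s\n' "${1:?missing base directory}/${2:?missing patch id}" ) 2>/dev/null
}

unsafe_target                        # prints "/"
unsafe_target /tmp/patches 123456-01 # prints "/tmp/patches/123456-01"
safe_target || echo "refused: empty argument"
```

The guard is one option among several; checking the argument count up front would work just as well.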

This horrible, horrible bug lay in wait, until one day the compiler group shipped a patch that looked, felt and smelled like an OS patch that one would add with patchadd(1M) -- but it was in fact a tarball that needed to be applied with tar(1). One of the first systems administrators to download this patch (naturally) tried to apply it with patchadd(1M), and fell into the error case above. She had applied this on her local workstation before attempting it anywhere else, and as her machine started to rumble, she naturally assumed that the patch was busily being applied, and stepped away for a cup of coffee. You can only imagine the feeling that she must have had when she returned to find that patchadd(1M) was complaining about not being able to remove certain device nodes and, most peculiarly, not being able to remove remote filesystems (!). Yes, "rm -rf /" will destroy your entire network if you let it -- and you can only imagine the administrator's reaction as it dawned on her that this was blowing away her system.

Back at Sun, we were obviously horrified to hear of this. We fixed the bug (though the engineer who introduced it did try for about a second and a half to defend it), and then had a broader discussion: why the hell does the system allow itself to be blown away with "rm -rf /"?! A self-destruct button really doesn't make sense, especially when it could so easily be mistakenly pressed by a shell script.

So we resolved to make "rm -rf /" error out, and we were getting the wheels turning on this when our representative to the standards bodies got wind of our effort. He pointed out that we couldn't simply do this -- that if the user asked for a recursive remove of the root directory, that's what we had to do. It's a tribute to the engineer who picked this up that he refused to be daunted by this, and he read the standard very closely. The standard says a couple of key things:

1. If an rm(1) implies the removal of multiple files, the order of that removal is undefined

2. If an rm(1) implies the removal of multiple files, and a removal of one of those files fails, the behavior with respect to the other files is undefined (that is, maybe they're removed, maybe they're not -- the whole command fails).

3. It's always illegal to remove the current directory.

You might be able to imagine where we went with this: because "rm -rf /" always implies a removal of the current directory which will always fail, we "defined" our implementation to attempt this removal "first" and fail the entire operation if (when) it "failed".
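For illustration only, the "check first, fail the whole operation" idea can be sketched as a shell pre-check. This is an assumption about the general shape, not the Solaris rm source (which is C); the function name refuses_root is hypothetical, and the sketch only catches operands made up entirely of slashes, so spellings like "/." or "/tmp/.." slip through.

```shell
#!/bin/sh
# Hypothetical pre-check, not the Solaris implementation: return non-zero
# if any operand names the root directory, before anything is removed.
refuses_root() {
    for operand in "$@"; do
        case "$operand" in
            -*) continue ;;   # skip options in this simplified sketch
        esac
        # Strip trailing slashes; a path made only of slashes becomes empty.
        stripped=$(printf '%s' "$operand" | sed 's:/*$::')
        if [ -n "$operand" ] && [ -z "$stripped" ]; then
            echo "rm of / is not allowed" >&2
            return 1
        fi
    done
    return 0
}

# Usage: only proceed when the check passes (nothing is deleted here).
refuses_root -rf /tmp/scratch && echo "would proceed"
refuses_root -rf / || echo "refused"
```

A wrapper like this would run the check before handing the same arguments to rm, so a single bad operand fails the whole command.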

The net of it is that "rm -rf /" fails explicitly on Solaris and its modern derivatives (illumos, SmartOS, OmniOS, etc.):

  # uname -a
  SunOS headnode 5.11 joyent_20150113T200918Z i86pc i386 i86pc
  # rm -rf /
  rm of / is not allowed
May every OS everywhere make the same improvement!



>A self-destruct button really doesn't make sense

Bryan--you mean Star Trek didn't get it right? Well, at least they didn't allow shell scripts.

It's an interesting philosophical question though. At what point do you decide that the user can't really want to do this, even though they've said that they do?


rm -rf is not a self-destruct button. It's just a "take everything out of the closets, rip off the labels, and throw it on a heap".

A real self-destruct button would ensure you couldn't recover the data anymore.

So at least try "dd if=/dev/random of=/dev/sda" or, more elegantly when available, throw away the decryption key.


GNU rm defaults to failing on /, but this behavior can be overridden with the --no-preserve-root flag.



