
If you're a sysadmin long enough, you will eventually execute a destructive command on the wrong machine. I'm fortunate that it happened to me very early in my career, and I made two changes to how I work at the suggestion of a wiser SA.

1) Before executing a destructive command, pause. Take your hands off the keyboard and perform a mental check that you're executing the right command on the right machine. I was explicitly told to literally sit on my hands while doing this check, and for a long time I did so. Now I just remove my hands from the keyboard and lower them to my sides while reconsidering my action.

2) Make your production shells visually distinct. I set up staging machine shells with a yellow prompt and production shells with a red prompt, with the full hostname in the prompt. You can also color your terminal window background. Or use a routine such as: production terminal windows are always on the right of the screen. Close/hide all windows that aren't relevant to the production task at hand. It should always be obvious what machine you're executing a command on and especially whether it is production. (edit: I see this is in the outage remediation steps.)
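
For the colored prompt in 2), here is a rough sketch of what could go in a bash startup file. It is not my exact setup. The function name prompt_color_for_host and the rule that production hostnames contain "prod" and staging hostnames contain "stag" are assumptions for illustration, so swap in whatever naming your machines actually use.

```shell
# Map a hostname to an ANSI color code.
# ASSUMPTION: production hosts contain "prod" and staging hosts contain
# "stag" in their names. Adjust the patterns to your own fleet.
prompt_color_for_host() {
    case "$1" in
        *prod*) printf '31' ;;   # red for production
        *stag*) printf '33' ;;   # yellow for staging
        *)      printf '0'  ;;   # default attributes everywhere else
    esac
}

# HOSTNAME is set by bash; fall back to uname -n in other shells.
prompt_host=${HOSTNAME:-$(uname -n)}
prompt_color=$(prompt_color_for_host "$prompt_host")

# \[ and \] mark the escape sequences as zero-width so line editing
# stays aligned. \u is the user, \H the full hostname, \w the directory.
PS1='\[\e['"$prompt_color"'m\]\u@\H:\w\$ \[\e[0m\]'
```

The color is reset at the end of the prompt so only the prompt itself is colored, not the commands you type after it.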

One last thing: I try never to run 'rm -rf /some/dir' straight out. I'll almost always rename the directory and create a new directory. I don't remove the old directory till I confirm everything is working as expected. Really, 'rm -rf' should trigger red-alerts in your brain, especially if a glob is involved, no matter if you're running it in production or anywhere else. DANGER WILL ROBINSON plays in my brain every time.
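
The rename-then-recreate habit can be wrapped in a small shell function. This is only a sketch under my own assumptions: the name retire_dir and the ".old.TIMESTAMP" suffix are made up for illustration, and the new directory gets default ownership and permissions, which you may need to fix up by hand.

```shell
# retire_dir DIR: move DIR aside under a timestamped name and create a
# fresh empty DIR in its place. Prints the backup path on success.
# Hypothetical helper; nothing is deleted here.
retire_dir() {
    dir=${1%/}    # drop one trailing slash, if any
    if [ ! -d "$dir" ]; then
        echo "not a directory: $1" >&2
        return 1
    fi
    backup="$dir.old.$(date +%Y%m%d%H%M%S)"
    if [ -e "$backup" ]; then
        # Refuse to move into a backup that already exists.
        echo "backup already exists: $backup" >&2
        return 1
    fi
    mv -- "$dir" "$backup" && mkdir -- "$dir" && echo "$backup"
}
```

Usage would look like `retire_dir /some/dir`. Once everything is confirmed working, the printed backup path is the only thing left to remove, and that removal still deserves the hands-off-the-keyboard pause.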

Lastly, I'm sorry for your loss. I've been there, it sucks.





