
Count to ten when a plane goes down - UltraMagnus
http://johncbeck.tumblr.com/post/92074597917/count-to-ten-when-a-plane-goes-down
======
noisy_boy
Firstly, firing the intern doesn't make sense - it was a mistake waiting to
happen and he just happened to do it at the wrong time.

Secondly, the punishment meted out should be:

1\. Proportional to the degree of carelessness (in this case not that much
since he accidentally hit a wrong key adjacent to the right one, didn't mow
down anybody while driving drunk)

2\. Inversely proportional to the likelihood of the error (in this case the
likelihood was very high since the reset key was a. uncovered/single-press b.
right next to single reset key).

3\. Proportional to intention (this was a completely unintentional error)

If you say, that the punishment should also be dependent on the degree of
damage, I would say that the responsibility of managing the risk of such
damage wasn't his but of the person responsible for implementing such a high
risk design. If such a person is not around, find the person who approved such
a design. Government departments are usually very good with paper trail.

~~~
nathanb
I think you're wrong to ignore the consequences of the actions as an input to
the punishment. A small amount of unintentional carelessness that causes huge
damage could still be punished. One could argue that a certain degree of
mindfulness in critical situations is a job requirement, and casually making a
careless -- though unintended -- mistake demonstrates a lack of mindfulness
indicating that the person is not properly qualified for the job.

I understand that we don't want a culture that fires people for making the
sort of mistake anyone might make. But to be so careless on a day that is
clearly an exception where something more important than standard business
procedure is going on, can't you at least see why firing the intern for such a
lack of mindfulness might at least make sense, even if you disagree with it?

I interned for a government organization that maintains hydroelectric dams and
the software that controls them throughout the Southeastern US. A careless
mistake could -- in the worst case -- cause blackouts, cost the company
millions of dollars, or even cost lives (if the data-control feedback loop
caused a turbine to spin up at the wrong time or to fail to shut off in an
emergency). And, as is quite common in organizations with non-software-
engineers running the show, the development processes were entirely haphazard.
The environment was such that it would be really easy for me to push
unreviewed code, or to make a stupid deployment mistake, or to be careless in
a number of ways that the system didn't protect me against.

But it was OK, because they hired smart, competent people who understand the
need to triple-check, if necessary, before committing. People who understood
the gravity of the situation, and who didn't phone it in if they weren't
feeling it that day. If I demonstrated that I wasn't one of those people, I
would fully expect to be fired.

~~~
woodchuck64
> I think you're wrong to ignore the consequences of the actions as an input
> to the punishment.

This equivocates on "consequences" of actions, though. It's obvious that the
consequences of hitting F7 before the incident were understood by all
responsible to be low enough that any intern could be expected to make the
right decision. After the incident, the consequences of hitting F7 were
sharply increased such that no future intern would ever be allowed to make
that decision. But then you can't make an argument that assumes "consequences"
were the same at both points in time.

We make this fallacy all the time probably because we're designed by evolution
to reassess the morality of an action based on consequences. It works as a
social heuristic for shaming or rewarding people but it makes no rational
sense that the morality of an action should retroactively change based on
future consequences. You can see similar behavior in our rewarding athletes
for profound genetic advantages, or punishing criminals for profound genetic
deficits. The consequences somehow redeem or condemn, and they should do
neither.

~~~
bonaldi
No, the consequences were the same before and after the incident: a total
system reboot. The varying factor here was temporal: it was usually a low-risk
action when the office was empty, a high-risk one when the office was full.

The negligence on the intern's part was to make decisions and act without
regard for risk as if he was in the low-risk window despite the evidence he
was actually in the high-risk one (all the already-active PCs).

It makes perfect sense that the punishment should reflect inappropriate regard
being given for known consequences. That's what negligence is.

~~~
woodchuck64
> No, the consequences were the same before and after

I'm talking about the perceived consequences, not the actual consequences. The
fallacy here is to perceive low consequences at one point in time, perceive
high consequences at a later time and then try to change history such that low
consequences were never really perceived.

> The negligence on the intern's part was to make decisions and act without
> regard for risk as if he was in the low-risk window despite the evidence he
> was actually in the high-risk one (all the already-active PCs).

He was in a perceived low-risk window. The perceived consequence of accidental
reboot was already figured in and was already perceived to be low. Else why
would the F7 key be next to F6? It is certainly unfair to expect someone to
perceive high-risk when everyone else perceives low-risk.

> It makes perfect sense that the punishment should reflect inappropriate
> regard being given for known consequences. That's what negligence is.

The perceived consequences were low-risk, therefore the known consequences
were low-risk.

~~~
bonaldi
> He was in a perceived low-risk window.

... _because_ he was negligent. "Oh, all the computers are already on? That
only happens when Washington's waiting on something. Oh well, I'll carry on
like this was any other low-risk morning"

> Else why would the F7 key be next to F6?

Same reason why "rm -rf " is the one keystroke away from disaster. Perceived
risk has nothing to do with it.

~~~
woodchuck64
> "Oh, all the computers are already on? That only happens when Washington's
> waiting on something. Oh well, I'll carry on like this was any other low-
> risk morning"

Because those situations were also low-risk mornings. He only saw that pattern
when people left late. He had no reason to expect that people would be working
early in the morning because that situation had never occurred. Further, a
secretary playing a computer game in the morning suggests business as usual,
no one working.

> He was in a perceived low-risk window. ... because he was negligent

No, someone else set up the computers and software with F6 and F7 command
functions side by side and then evaluated the entire network as low-risk for
interns under all situations. It is perfectly reasonable for an intern to take
the same low-risk perspective as his superiors.

> Same reason why "rm -rf " is the one keystroke away from disaster. Perceived
> risk has nothing to do with it.

Perceived risk has everything to do with it. It is inconceivable today that an
intern would have unrestricted access to a company's file system and be
literally a few keystrokes from disaster. The key reason for that is because
perceived risk now is much closer to actual risk. In 1983, no one had a clue
about the kinds of things that could go wrong. Understanding real risk is a
painstaking process requiring time, trial and error.

------
mikegreco
The author states they felt it was appropriate when they were fired. In what
world would it be appropriate to get fired for a single, simple, incredibly
easy to make mistake? Doubly insane when there were exactly zero safeguards in
place to prevent the mistake from being made.

~~~
bonaldi
The world where you know that a) you have an incredibly powerful key with
no safeguards at your fingertips and b) you might be in a breaking news
situation and nonetheless you go for the key _right next to_ the dangerous one
carelessly enough that you miss?

Think about it: Unix is equally as "insane". If you're the guy on the console
who meant to clean out some crap dir and accidentally typoed "rm -rf /" and
then caused an international crisis you're going to get fired too.

Then years later HN will call for Dennis Ritchie to get fired instead.

~~~
baddox
But typing "rm -rf /" is significantly harder to do accidentally than typing
F7 instead of F6.

~~~
graylights
Not really, a lot of novice unix users are in the habit of removing files with
the -rf switch. I cringe every time I see it.

The command "rm -rf ~/blue/" is just a single space key from being equivalent
to "rm -rf /" with "rm -rf ~/blue /"
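To make the point above concrete, here is a minimal sketch of how a shell-style tokenizer splits the two command lines. It uses Python's standard `shlex` module as a stand-in for the shell's word splitting; `rm_targets` is a hypothetical helper written for this illustration, and nothing here deletes anything.

```python
import shlex


def rm_targets(command: str) -> list[str]:
    """Return the path arguments an rm command line would receive."""
    args = shlex.split(command)
    # Skip the program name and any option flags such as -rf.
    return [arg for arg in args[1:] if not arg.startswith("-")]


# The intended command names one directory.
print(rm_targets("rm -rf ~/blue/"))
# One stray space turns it into two separate targets, one of them the root.
print(rm_targets("rm -rf ~/blue /"))
```

The second call yields two paths, `~/blue` and `/`, which is why the extra space matters.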

~~~
jonreem
On any modern system it's actually "sudo rm -rf / --no-preserve-root" and
then entering your password while staring at the command.

"rm -rf ~/blue /" will not come close to deleting / unless you are in the
habit of running every command as sudo, even ignoring the presence of
--no-preserve-root

~~~
daurnimator
Except when: (these are terrible lessons to learn)

1\. You type it into the wrong system (D'oh)

2\. You have run `mount --bind / /somewhere/else` then `rm -rf /somewhere` a
week later

:(

~~~
colanderman
It boggles my mind that --one-file-system is not the default :/

------
ck2
How about when Russia returned the data recorders after years of refusing to
South Korea - made a press spectacle of it - and then South Korea discovered
the recorders were empty and missing the data tapes when the press was gone.

Or the US navy crew who received medals after shooting down the Iranian
airliner.

Once there is loss of life, it is 100% politics afterwards with little to no
practicality, just look at all the mass shootings where there were zero
changes afterwards. We simply do not value life, it is politics first.

~~~
nness
> Or the US navy crew who received medals after shooting down the Iranian
> airliner.

You make it seem like they received the medal for having shot down the plane.
In reality, those who were awarded medals were awarded Tour of Duty medals
for their time spent in a combat zone. I believe the distinction is important,
particularly since that class of medals is routinely awarded to individuals
during their time in the military.

~~~
ck2
If a police officer shot and killed innocent bystanders, should they get
achievement awards for doing their job otherwise?

My answer would be no, you failed at your job regardless.

Same thing with military.

~~~
maaku
That's a very narrow minded position.

Gosh I hope you never screw up even once, cause you'll never live it down.

~~~
72deluxe
Would you say that making a mistake in software is equal to taking a human
life?

~~~
maaku
Depends on what the software does.

------
jackschultz
The issue for most news stories is that even when the truth comes out, the
great majority of people will never hear the actual facts. One issue is
that news stations move on from caring about the story quickly. The bigger
issue, in my opinion, is that people won't believe the new, correct facts
since the old ones will have been ingrained in their heads. Solving both
these issues would be really helpful for society, but they are obviously damn
hard to solve since we haven't really gotten anywhere in this space.

~~~
Aardwolf
When there is such a chaotic news story, I usually switch from news to
Wikipedia. That has all the facts and continues the story even after all media
have lost interest.

------
justizin
It's a sign of poor management that someone has to be fired when something
goes wrong. Outages are learning situations for all involved, and it is widely
held that the person who took the action that caused an outage is not
solely responsible, but that all involved are responsible.

See John Allspaw's Swiss Cheese Theory:
[http://www.kitchensoap.com/2012/02/10/each-necessary-but-onl...](http://www.kitchensoap.com/2012/02/10/each-necessary-but-only-jointly-sufficient/).

[ Edit: I guess it's not Allspaw's model, but he applies it to systems
engineering rather well -
[http://en.wikipedia.org/wiki/Swiss_cheese_model](http://en.wikipedia.org/wiki/Swiss_cheese_model)
]

"Accidents emerge from a confluence of conditions and occurrences that are
usually associated with the pursuit of success, but in this combination—each
necessary but only jointly sufficient—able to trigger failure instead."

The person who pushed the button is not at fault, the manager is not at fault,
the guy who designed the button is not at fault - all are jointly responsible.

Blaming the intern does, however, reflect extremely poorly on Itoh and
everyone else in the chain of command. A superior who demands retribution for
a simple mistake that happened to cause him or her pain is basically
worthless.

But, I forget, we're talking about Ronald Reagan.

------
tokenadult
I remember this time sequence very well because I was living in Taiwan when
the incident happened. Yes, people who lived in east Asian time zones saw news
reports that appeared to be based on knowledgeable sources that the plane
might have landed safely with all passengers alive. This explanation of why
the Western-aligned diplomats and military officials based in east Asia didn't
have complete information when they were interviewed by the press is quite
interesting, and explains puzzling memories I have from that day.

------
jere
>And let’s hope that there is no stupid 23-year-old with his finger on an
important keyboard in this information chain.

No. This is something you would read in _Design of Everyday Things_ where Don
Norman would totally shame the engineers who made that system. Software
shouldn't be designed with the assumption that no one makes errors.

~~~
ak39
What I find incredible is that this problem could have happened
without the F7 erroneous keystroke by a human. A simple power outage could
have resulted in this exact same catastrophe.

Why didn't the backups work? System wasn't "robust" enough. (Did I just use
the word "robust"?)

------
kosei
Really brave of the author to share this story. I know most people would be
afraid to admit this kind of a public "mistake".

------
dictum
Further reading for those who want to be disabused of the concept of _human
errors_ :

[https://en.wikipedia.org/wiki/The_Design_of_Everyday_Things](https://en.wikipedia.org/wiki/The_Design_of_Everyday_Things)

------
abcd_f
> _That Korean announcement and the slow response by the US President — both
> caused by delayed real information — caused decades of conspiracy theories._

I appreciate that the OP was a part of the situation, but conspiracy theories
were _not_ caused by this.

It was time of _very_ high tension between the US and Soviet Union. So when a
plane veers off the course into not just Soviet airspace, but into an
explicitly cordoned off top secret area, ignores all communication attempts,
ignores the presence of fighter jets and just keeps on flying, then the
situation itself is a fertile soil for conspiracy theories.

~~~
brudgers
'It was a time of very high tension' doesn't quite capture how different it
was.

Through the glass of a yellow newspaper box, I read the _Miami News_ headline
that the Soviets had shot down a plane carrying a Congressman. My first thought was
"This is the war." Not 'a' but 'the'. The primary stance of the US military
was squared off against the USSR and had been for more than 30 years.

------
steven2012
Wow. I got goosebumps when I read that article. I'm old enough to actually
remember when KAL 007 was shot down, and while I wasn't old enough to hear
about the conspiracy theories, I do remember the thing about people being safe
and landing in Russia. To think that this was just a small mistake on the part
of someone, which caused international ripple effect, and who later blogged
about it is really something incredible.

------
userbinator
"With great power comes great responsibility."

Incidentally, "features" like this are why I don't trust systems that have
some centralised control - IMHO giving any one individual (or organisation, in
many cases these days) such power over others is not a good thing.

------
disputin
Scapegoat. The ritual expulsion of the evil spirits wrapped neatly in a little
parcel to appease the elders and thereby prevent them blaming each other -
harmony continues in the hall of power. Meanwhile the problem was in the
process, not the employee, so nothing has been fixed, and the guy who had
learned the lesson is no longer there, and so the problem will recur with the
next lamb to the slaughter.

------
ryanobjc
I disagree that it was appropriate that you were fired, but interesting story
all around.

~~~
kysol
For security reasons I think that it might have been justified, considering
the events that had just taken place. Still a crappy way to go out though.

~~~
brown9-2
What security reasons? Firing the author didn't change what had already
happened.

~~~
kysol
People were bat shit crazy in the middle of that Cold War. If someone randomly
decided to turn off machines without notice, even if they said "whoops
accident, my bad", their actions would have instantly been thought of as sabotage.

I'm not agreeing with the outcome.

------
lotsofmangos
I often suspect that most of the work involved in keeping a power hierarchy
going, is involved with trying to pretend that this kind of shit doesn't
happen all the time.

~~~
lamontcg
And then conspiracy theorists latch onto this kind of shit, but believe that
it must be malicious silliness...

Why didn't Reagan respond immediately? Well, he was waiting to hear from
Chancellor Gorkon that the KAL flight had been successfully beamed aboard and
was en route to Pluto, of course... Clearly they'd have their shit together
better so it couldn't have been a 23-year old rebooting all the computers
accidentally and wiping out hours of critical work -- that would just be
ridiculous...

~~~
Loughla
Conspiracy theorists are just 20th and 21st century prophets, really. They
search for meaning in an all too often meaningless world.

It's comforting to think that people can control the direction of every choice
in the world, and that someone is at the helm.

It's uncomfortable to think about the daily series of random, unconnected
decisions that drive the direction of our species.

~~~
lotsofmangos
I'm not sure that it is more comforting to think that there is someone at the
helm, as much as anyone who aspires to be considered to be at the helm has to
keep pushing that story, so it gets repeated more often and with better
special effects than the story about there not being anyone at the helm.

Actually being in control of stuff is very difficult, but convincing people
that you are in control of stuff is pretty easy as we are all suckers for
narrative. The main ways to disrupt a power narrative are to spread other
narratives or for a situation to occur that upsets the existing narrative,
getting people to make up new ones. This explains why totalitarian governments
can collapse so quickly, which wouldn't be possible if the people running them
were actually in control of anything.

------
privatedan
My first thought after reading the article was that it was ridiculous to
fire/scapegoat the author for hitting the wrong key, too. This has happened to
me before, where a single keystroke ( in my case, a line break in a config
file ) caused me to take down a production system. My punishment? Designing a
more robust system that would protect itself from a badly formatted config
file. To this day, ten years later, a similar error has not been repeated,
despite several attempts of people to push bad config files to our production
systems. If I had been instead fired, no doubt a similar, but perhaps not
exact, error would have been repeated every year or so.

If I had made the same mistake twice without any attempts to fix the situation
long term, then, yes, I think that would have been a fire-able offense.

If you're working with people who care primarily about their own positions and
egos without regard to the team as a whole, well, be prepared to be thrown
under the bus when it comes time for those people to protect themselves.
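The kind of self-protecting config handling described above might look something like the sketch below. It is only an illustration under stated assumptions: the `key = value` file format and the `parse_config` and `load_or_keep` names are hypothetical, not the actual system from the comment. The idea is simply that a malformed file (for example, one with a stray line break) is rejected and the last known-good settings stay in effect.

```python
def parse_config(text: str) -> dict[str, str]:
    """Parse 'key = value' lines; raise ValueError on anything malformed."""
    settings: dict[str, str] = {}
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        # Blank lines and comments are allowed and ignored.
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep or not key.strip() or not value.strip():
            raise ValueError(f"line {lineno}: expected 'key = value', got {raw!r}")
        settings[key.strip()] = value.strip()
    return settings


def load_or_keep(candidate: str, last_good: dict[str, str]) -> dict[str, str]:
    """Use the candidate config if it parses, otherwise keep the old one."""
    try:
        return parse_config(candidate)
    except ValueError:
        # A bad push must not take the running system down.
        return last_good


current = {"host": "db1", "port": "5432"}
# A stray line break splits "port = 6543" across two lines and is rejected.
current = load_or_keep("host = db2\nport =\n6543", current)
print(current)
```

Here the broken push is ignored and `current` still holds the previous settings.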

------
Schwolop
Thanks for posting this. I found
[https://news.ycombinator.com/item?id=8062683](https://news.ycombinator.com/item?id=8062683)
yesterday but yours appears to be the direct link to the author's blog, which
I had missed.

------
joewaltman
Great story....thanks for your willingness to share.

------
peterwwillis
_> On this day, I highlighted her workstation and hit the F6 key to reset. But
my screen went temporarily black and then seemed to be starting again. I
realized that I had mistakenly hit F7 and reset all the workstations in the
embassy._

Ugh.

Those with automation capabilities: keep this lesson in mind, because it will
happen to you in production one day. 'dsh -a reboot' is incredibly easy to
type and can have disastrous effects. Creating abstraction layers around
common admin tasks can help catch simple mistakes and give prompts before
dangerous behavior.
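As a rough sketch of such an abstraction layer: the wrapper below lets a single-host reset through unprompted but makes the operator type the exact host count before anything wider proceeds. The `guarded_reboot` name and its parameters are hypothetical, and the function only returns the list of hosts that would be rebooted rather than calling `dsh` or any real tool.

```python
from typing import Callable, Sequence


def guarded_reboot(
    targets: Sequence[str],
    all_hosts: Sequence[str],
    confirm: Callable[[str], str] = input,
) -> list[str]:
    """Return the hosts to reboot, or an empty list if the operator backs out."""
    if not targets:
        return []
    # A single-host reset is the routine case and goes through without a prompt.
    if len(targets) == 1:
        return list(targets)
    # Anything wider requires typing the exact host count, which is hard to
    # do by accident with one stray keystroke.
    answer = confirm(
        f"About to reboot {len(targets)} of {len(all_hosts)} hosts. "
        "Type the number of hosts to proceed: "
    )
    if answer.strip() != str(len(targets)):
        return []
    return list(targets)
```

Passing `confirm` as a parameter keeps the prompt testable, for example `guarded_reboot(hosts, hosts, confirm=lambda _: "y")` returns an empty list because "y" is not the host count.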

------
jheriko
I hope you fought that firing...

... incompetence like that comes from having F6 next to F7 and no checks or
authorisation needed for a potentially dangerous action etc. Processes should
be designed for people to make the common mistakes... it's what they do.

~~~
jheriko
nm. just seen the follow up. :)

------
hyperliner
"My boss, a >> Japanese << computer engineer named Itoh, poked his head in the
door. "

hmmm, I am pretty sure Mr. Itoh was not Japanese working in the American
embassy. I am pretty sure he was American.

~~~
billmalarky
If he was a first gen American his culture would have been greatly shaped by
Japanese culture.

------
joshuaheard
I thought the headline meant to count to ten when a plane goes down...while
you are in it!

~~~
instakill
Me too. I still don't know how counting to 10 will help you press the right
key on your keyboard though.

------
feld
Reset all computers in the embassy with F7? No warning prompt?

Fire the idiot who wrote that function.

~~~
smacktoward
In fairness, it was a different world back then. There were so few people
administering computer networks that you could generally assume someone who
was doing so had been thoroughly trained; and the thing about highly trained
people is that they tend to view things like failsafes and safeties as
pointless time-wasters.

"I know what I'm doing when I hit F7, but the damn system makes me sit there
for _30 seconds_ before it does what I told it to do! Piece of junk."

The result was that software in that era tended to come with a lot more sharp
edges. The age of the Recycle Bin that would save you from yourself didn't
arrive until administering systems became something the general public was
expected to do.

~~~
viraptor
> you could generally assume someone who was doing so had been thoroughly
> trained

No amount of training can prevent something like this. It's like today's
browsers where the tab can be closed with ctrl+w and the whole window with
ctrl+q. It doesn't matter how many times you've done it and how used you are
to the position of the 'w'. One day you will close the whole window by
accident.

~~~
jessaustin
_...the whole window with ctrl+q._

OMG I've never done that but now that I know about it I'm very afraid. If I do
it tomorrow I'm blaming you.

~~~
sushid
Thankfully, Chrome has a built-in feature to prevent this from happening (on
OSX at least). Just go to Chrome > Warn before Quitting and make sure there's
a checkmark next to the option.

Now, if you accidentally press Cmd + Q, it should prompt a "Hold Cmd + Q to
Quit" instead of actually quitting.

~~~
tremendo
Or, Settings > On Startup... "Continue where you left off". This will restore
your tabs after launching Chrome.

