
Normalization of Deviance - r4um
https://foone.wordpress.com/2019/02/14/normalization-of-deviance/
======
aasasd
> _We had built a system that generated thousands of lines of logs for every
> test, with lots of “failures” recorded in them. Things like “tried to
> initialize FOOBAR_CONTROLLER: FAILED!!!,” but we just ran that code on all
> machines, even ones without the FOOBAR_CONTROLLER hardware. So no one
> noticed when another 5 lines of errors popped up in a 2000-line log file._

This right there is a big red flag. The whole bendy business is bad enough,
but here you're actively training people to ignore the wolf cries.

Don't allow false failures in tests. The entire test suite needs to be binary:
either everything works, or it fails.
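
A minimal sketch of that idea in Python, assuming a hypothetical check for the FOOBAR_CONTROLLER from the quote: a machine without the hardware reports a skip rather than a failure, so any FAILED result actually means something and fails the whole run. The names `Result`, `check_controller`, and `suite_passed` are made up for illustration.

```python
from enum import Enum


class Result(Enum):
    PASSED = "passed"
    SKIPPED = "skipped"
    FAILED = "failed"


def check_controller(installed_hardware, init_ok):
    """Check the (hypothetical) FOOBAR_CONTROLLER on one machine.

    installed_hardware: set of hardware names present on the machine.
    init_ok: whether initialization succeeded (only meaningful if present).
    """
    # Absent hardware is not an error: record a skip, not "FAILED!!!".
    if "FOOBAR_CONTROLLER" not in installed_hardware:
        return Result.SKIPPED
    return Result.PASSED if init_ok else Result.FAILED


def suite_passed(results):
    """The suite is binary: a single FAILED result fails the whole run."""
    return all(r is not Result.FAILED for r in results)
```

With this shape, a log line containing a failure is never expected noise, so five new ones cannot hide among the old ones.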

~~~
taneq
Likewise, this is why it's important not to let compiler warnings slide. If
your build process spews out a bazillion lines of cruft then that one new
warning which indicates a serious error will get lost in the noise. I don't go
so far as to enable warnings-as-errors or anything but I do build with -Wall
and generally won't check in code which builds with warnings.

~~~
munk-a
As a company you can decide where to fall on this scale. I applaud you for
-Wall'ing, but for most it's too much. The important thing is to make sure
that everyone has agreed which warnings are acceptable and that you refuse
code that violates that.

We're using PHP (so runtime warnings, instead of compiler issues) but we have
a zero-error policy: if anything raises a warning or error we refuse to merge
(or ticketize fixing, since a bunch of stuff is legacy and we're still paying
it down).

~~~
stdclass
The standard in PHP-frameworks like Laravel is to automatically throw
Exceptions when a Notice happens, which at first I thought was overly
aggressive but I would never go back to writing PHP where a Notice is thrown
and ignored.

~~~
munk-a
It depends on the era of PHP, to be honest. Pre-PHP 5.2 (IMO), notices were
mostly for pretty small stuff that was accepted as being necessary for sanity,
i.e. trying to read a value out of $arr[$dem1][$dem2][$dem3] would require
three separate isset checks to do safely. I always leaned toward building a
user-space recursive isset function, but it came with a performance penalty.

Starting in 5.4 and getting far more prominent in 5.6, PHP started breaking
bad backwards compatibility, and failing to respect notices could easily lead
to errors when bumping the version up to 7.

~~~
aasasd
As I mentioned in the sister comment, by ignoring notices you're setting
yourself up for garbage values being passed through your code and into user-
facing functionality, including handling of money. That doesn't depend on the
PHP version.

------
tonto
The turn in the middle of the article about taking care of yourself is
interesting. If you only skimmed the first part of the article you might miss
that positive message.

~~~
Topolomancer
I was very surprised to discover this part. It also led me to discover a
rabbit hole (spurred by the 'Spoon Theory' example) for which I did not have a
name previously. It is always amazing how certain articles suddenly give a
name to a phenomenon that I had to use so many words to describe.

This kind of content is what I come to HN for!

~~~
nradov
Spoon theory is an attractive concept, but it appears to be largely based on
the ego depletion hypothesis. And that one hasn't held up very well now that
scientists have attempted to reproduce the original experiments. So I'm
skeptical whether it's of much practical use.

~~~
krinchan
Ego depletion isn't really the same thing. Ego depletion is about doing things
you don't want to do or not doing something you want to do. It implies we have
limited bits of self-control.

Spoons theory is more of a metaphor, trying to describe the penalty in time,
energy and capacity that a disability imposes on your day-to-day life, to
people who have an excess of all three to handle the mundane.

One problem I have is that spoons theory says, "You have 20 spoons per day, I
have 5." While the end result is the same, it's more that getting dressed in
the morning requires 1 spoon for me and 4 spoons for someone with a
disability. Similarly, other mundane activities require four times the amount
of time and energy.

Also you can borrow spoons from the next day, either by skipping sleep or just
outright pushing through an activity well beyond your disability's limit.
However, then you have less time and energy the next day.

It's definitely in line with normalizing deviance. Most people who can hold
down a job and an independent life with a disability are operating at their
limits. They have no spare spoons for emergencies or unexpected events. The
people around them normalize that and then become surprised when "one small
thing" blows everything out of whack for two days. That person had no more
spoons to spare. They borrowed from the next day, and now their Fibro has them
bedridden, the stress triggered a manic episode, etc.

~~~
nradov
I've seen people increase their number of "spoons" day after day by just
grinding it out and refusing to quit. Some folks with severe, crippling
disabilities have accomplished amazing feats largely through determination and
force of will.

~~~
travisjungroth
Cool. And a lot of people’s disability is non-adaptive, meaning no matter how
hard you try, it never gets any easier.

------
cwmma
How speed limits are enforced in America always bothers me, because there is
this great disconnect between planners and everybody else.

Planners think of the speed of a road as being intrinsic to its design, so
if people are going too fast on a road, you need to change it by narrowing it
or putting in bumps or something.

The other side of this is to think of the speed of a road as based on what's
around it, so if there are a lot of houses on a road, people should go slower
so they don't hit anyone, so you put in speed limits.

But the problem with limits is that since the road feels faster than the
speed limit, people just go faster than the limit. And since changing the road
is a lot more expensive than just putting up speed limits, that tool is used a
lot less frequently.

~~~
jnty
It is bizarre that we are discussing making cars entirely autonomous as a way
of improving safety, but refuse to consider the much more feasible option of
technologically restricting speed on public roads.

~~~
lotsofpulp
It might be technically feasible, but not politically feasible.

While it is a no brainer to say every vehicle should be traveling sufficiently
far behind another such that they have enough time to stop before they hit the
one in front, this spacing out of vehicles will result in longer travel times
as you are effectively reducing road capacity by having each vehicle take up
more space on the road. Instead, it's (politically) easier to have each person
take individual risks of getting into a collision.

However, with autonomous vehicles, it should be politically easier to force
this constraint because the liability will be shifted from individuals to
presumably the manufacturer so the manufacturer isn't going to stick their
neck out so the individuals can save time on their commute.

~~~
andyjpb
Slowing down traffic can, in fact, increase the throughput of the road,
especially when it is busy.

This is for two reasons:

1) slower traffic requires a smaller safety gap between each vehicle.

2) the dominating factor in traffic jams is often people responding to the
brake-lights of the people in front of them. People overcompensate and
therefore a "braking bubble" (soliton:
[https://en.wikipedia.org/wiki/Soliton](https://en.wikipedia.org/wiki/Soliton)
) forms in the traffic which causes it to get even more spaced out.

Keeping traffic flowing is more important to throughput than keeping it
flowing fast. Constant braking and speeding up on a busy road means that
the flow rate is constantly disrupted and this causes the traffic to tend
towards clumping and congestion.

You're not "in traffic"; you "are traffic".
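
To put rough numbers on point 1, here is a sketch under a deliberately simple assumption: every driver leaves a gap equal to reaction distance plus full braking distance. The car length, reaction time, and deceleration are illustrative guesses, not measured values, and real traffic models are more involved.

```python
import math


def flow_per_hour(speed_mps, car_length_m=5.0, reaction_s=1.0, decel_mps2=5.0):
    """Vehicles per hour past a point in one lane at a steady speed.

    Assumes each driver keeps a gap of reaction distance plus braking
    distance: gap = v * t + v**2 / (2 * a).
    """
    gap_m = speed_mps * reaction_s + speed_mps ** 2 / (2 * decel_mps2)
    road_per_vehicle_m = car_length_m + gap_m
    return 3600.0 * speed_mps / road_per_vehicle_m


def best_speed_mps(car_length_m=5.0, decel_mps2=5.0):
    """Speed that maximizes flow_per_hour under the same assumptions.

    Setting the derivative of v / (L + v*t + v**2 / (2*a)) to zero gives
    v = sqrt(2 * a * L); the reaction time drops out.
    """
    return math.sqrt(2 * decel_mps2 * car_length_m)
```

Under these assumptions `flow_per_hour(10)` is higher than `flow_per_hour(30)`, because the braking term grows with the square of speed. If drivers instead keep only a fixed time gap regardless of speed, the squared term disappears and this simple model no longer favours the slower speed.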

~~~
lotsofpulp
It's inevitable that people will over- or under-compensate, and differences
in the acceleration of different vehicles mean the rubber banding is
inevitable, especially if you add any elevation climbs to the equation. Add in
lane changes, merges, exits, and there's no situation where I would expect a
slow and steady flow of traffic.

Maximum throughput at a fast speed with minimal spacing between vehicles is
more than maximum throughput at a slower speed with minimum spacing between
vehicles. I see this in every urban area, people sacrifice safety for speed.

------
masto
There's a version of this in SRE where the performance your system delivers
becomes the performance people expect. And then they build their systems to
depend on that performance, regardless of what your actual SLA is.
Paradoxically, delivering better than the performance you're actually capable
of sustaining can set things up to break very badly when something fails
"within SLA".

~~~
NickNameNick
There was a post here a little while ago about a Google service (I think it
was 'locky') that the SREs would deliberately disable if they had over-met
their SLA. This forced the developers of dependent systems to handle outages
correctly.
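
A rough sketch of the arithmetic behind such a policy, assuming a simple availability target over a fixed period. The function names, the `min_drill_s` threshold, and the numbers are illustrative, not taken from any real service.

```python
def allowed_downtime_s(target_availability, period_s):
    """Downtime the SLA permits over the period, in seconds."""
    return (1.0 - target_availability) * period_s


def unused_budget_s(target_availability, period_s, observed_downtime_s):
    """Permitted downtime not yet used; zero once the budget is spent."""
    budget = allowed_downtime_s(target_availability, period_s)
    return max(0.0, budget - observed_downtime_s)


def should_schedule_outage(target_availability, period_s, observed_downtime_s,
                           min_drill_s=60.0):
    """Suggest a deliberate outage only if the leftover budget covers it."""
    leftover = unused_budget_s(target_availability, period_s,
                               observed_downtime_s)
    return leftover >= min_drill_s
```

For example, a 99.9% target over 30 days (2,592,000 seconds) permits about 2,592 seconds of downtime, roughly 43 minutes. A service that has used none of it has room for a planned outage; one that has already been down for 2,580 seconds does not.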

~~~
nsm
It’s Chubby, and it is a lock server :)

------
KineticLensman
The article draws on the definitive text in this area by Diane Vaughan[0].
Read her work on the Challenger Launch Decision - it goes into the details of
why the deviance was normalised. Even down to the level of how important
decision making conference calls marginalised technical inputs.

[0]
[https://en.wikibooks.org/wiki/Professionalism/Diane_Vaughan_...](https://en.wikibooks.org/wiki/Professionalism/Diane_Vaughan_and_the_normalization_of_deviance)

------
jancsika
> They put their passwords in their wallet and in their phone.

The author is underplaying the problem here. There were tests that showed
burns through the o-rings and the reports rationalized the danger-- not by
normalization of deviance but through deceptive language.

It's a lot more like having an audit that shows that no users were observed
writing a password on a sheet and putting it in their wallet. And since extant
password sheets stored in wallets don't match an idiosyncratic definition of
"written down" they pass the audit.

That's not to say that normalization of deviance didn't happen. Obviously both
it and a more direct type of corruption happened. But I get the sense the
author here is trying to cram everything into the former to make a tractable
problem out of a messy political situation.

------
jki275
[https://fastjetperformance.com/podcasts/how-i-almost-destroy...](https://fastjetperformance.com/podcasts/how-i-almost-destroyed-a-50-million-war-plane-when-display-flying-goes-wrong-and-the-normalisation-of-deviance/)

------
gnuvince
Dan Luu has an article titled exactly the same:
[https://danluu.com/wat/](https://danluu.com/wat/)

~~~
eadmund
> There's the company where I worked on a four person effort with a multi-
> hundred million dollar budget and a billion dollar a year impact, where
> requests for things that cost hundreds of dollars routinely took months or
> were denied.

I’m so used to seeing this, but I still don’t get it. Seriously, we spend
$X/week on some nice-but-unnecessary luxury, but won’t spend $X/16 once for
some really useful thing.

I suspect that it has to do with how corporate budgets are designed, but …
maybe they could be designed better?

~~~
pjc50
Power and control.

It's rarely quite so concrete, but it is because money is so _countable_ and
is explicitly subject to control that it is seen most keenly. Most companies will
have some sort of "financial controller" role, and it all flows downhill from
there.

It's where the false positive / false negative phenomenon comes in, too. The
person doing the controlling has an incentive to reject as many requests as
possible, because they get criticised for any _retroactively_ identified as
"waste" - but they don't get to see, and can't count, all the time wasted
dealing with the process and opportunities lost as a result.

I once worked for a small company that would let you buy anything under £100
on the company card so long as you sent in an explanation by the end of the
month, preferably identifying a client it could be billed to. This worked very
well because when you put "£100k engineer time, £10k custom electronics, £10
misc stationery" on the same invoice _no sane person is going to question the
stationery_.

------
alexpetralia
One of the best essays I've read in a while. A new mental model to keep in
mind. Thanks for this.

------
christophilus
Here is an excellent talk on this subject[0]. It's one of the few
presentations I like to watch every once in a while.

[0]
[https://www.youtube.com/watch?v=Ljzj9Msli5o](https://www.youtube.com/watch?v=Ljzj9Msli5o)

------
killjoywashere
This would be a good speech for everyone a few years after graduating college.

------
aelmeleegy
This is a great piece! I started reading it and was 100% captured by how
informed the arguments are.

Thank you!

------
lifeisstillgood
'''The crew probably survived in the reinforced cabin until it struck the
ocean.'''

I went cold reading that.

I assumed the explosion took the whole shuttle out instantly.

------
gumby
BTW this is about deviation, not deviance.

------
thegeomaster
Great, well written piece with an eloquently explained and useful, positive
message. Awesome!

------
petermcneeley
This unintelligible screed doesn't even get the Challenger disaster correct.
The Challenger disaster has almost nothing to do with engineering but has
everything to do with management and politics.

~~~
pjc50
For a project of more than one person size, such as a space programme, you
cannot separate the management and politics of engineering from the
engineering itself.

~~~
petermcneeley
Right, but the article describes the event as though the engineers on the
ground were just normalizing deviance, which is simply not the case.

~~~
detaro
Where does it say that about the specific engineers on the ground vs the
entire org doing/causing it?

~~~
petermcneeley
"But you’ve launched at 40F and it was fine, and then one day you had to
launch at 35F and it was fine, and then on a particularly bad day you had to
launch at 30F and you’re fine. So you normalize this deviance. You can launch
down to 30F, if you really have to. But then one day you’ve missed a bunch of
launch windows and it’s 28F and the overnight temperatures were 18F but you
did a quick check of the designs and specs and you probably have enough safety
margin to launch, so you say GO."

Now contrast this with the wiki entry "NASA managers also disregarded warnings
from engineers about the dangers of launching posed by the low temperatures of
that morning, and failed to adequately report these technical concerns to
their superiors."

~~~
mannykannot
From what I recall of the inquiry's findings, that statement is a reasonable
(if simplified) synopsis of the managers' (specifically, the managers being
referred to in your wiki quote) reasons for dismissing the engineers'
concerns.

Here's another quote, from the beginning of the article: "... I think it’s too
easy to think of it as just a random-chance disaster _or just space/materials
engineering problem_ that only has lessons relevant to that field. And that’s
not really the most important lesson to learn from the Challenger disaster!"
[my emphasis.]

~~~
petermcneeley
You're not reading that line correctly. What that line is talking about is the
physical nature of the device, which I agree is not the focus of the article.
The article is talking about the human side of engineering.

Now this is suppose to sound like an engineer "But then one day you’ve missed
a bunch of launch windows and it’s 28F and the overnight temperatures were 18F
but you did a quick check of the designs and specs and you probably have
enough safety margin to launch, so you say GO." But in reality the engineers
never said that. The managers made the call in opposition to engineering.

If you want to have an example of when there is a Normalization of Deviance in
engineering you need to have the engineers actually say "GO" and for there
to be a disaster. You can't have the managers "dismissing the engineers'
concerns" and then turn around and suggest that engineering Normalized
deviance. That's simply the wrong lesson here.

~~~
mannykannot
"Now this is suppose [sic] to sound like an engineer..."

You just made that up. No-one else is reading it that way, and for a good
reason: _it isn't written that way._

Also (following on from what pjc50 wrote above), most (if not all) of the
managers referred to in your wiki quote were also engineers - so, not only did
the author not make the claim you made up, it could arguably be justified if
he had done so.

~~~
petermcneeley
"check of the designs and specs and you probably have enough safety margin to
launch" I hope to god this is an engineer (or scientist), otherwise you have
non technical people making technical calls.

~~~
mannykannot
Read the second paragraph of the post you are replying to - and, while you are
about it, maybe you could finally respond to detaro's point, above.

Your criticism of the article fails on at least four grounds: 1) the managers
in question were engineers, as is often the case in large, highly technical
projects; 2) engineers can make mistakes, and did so here; 3) engineers can
disagree (especially when some of them make a mistake), and did so here; 4)
the article does not actually make the claim you say it does.

~~~
petermcneeley
Thank you for all of your replies.

The primary issue here is the choice of example (Challenger disaster) to
explain "Normalization of Deviance".

It's like if you were an SSE working on a product that had some big security
issues. You tell your manager not to launch the product because there is a
high risk of a very bad security violation. The manager under pressure to get
it done decides to ignore your warnings and launch the product. There is a
huge security violation just after the product launch and it causes a scene.

Why reach for some pet theory (Normalization of Deviance) to explain this?
This is just the ongoing tension between the desires of higher management and
the reality that those on the ground know.

I think the reason why I am so insistent on this is because I am worried about
the wrong lesson being learned. Politics and Power play far more of a role
than we like to admit. This hierarchical structure is also very good at
deflecting and distributing blame. Is Normalization of Deviance just another
excuse to explain the commonality of bad management?

~~~
mannykannot
It is something of an insult to the accident investigators to suggest that
someone just "reached for some pet theory" to explain the Challenger crash
(and that of Columbia, for that matter.) There was a very thorough
investigation that clearly established that there had been a normalization of
deviance leading up to the final showdown over whether to launch. The
commission would have come to a shallow and unsatisfactory conclusion if it
had looked at that showdown alone and not recognized the normalization of
deviance that had set the stage for it.

I encourage you to read both reports, available from NASA:

[https://history.nasa.gov/sts51l.html](https://history.nasa.gov/sts51l.html)

[https://www.nasa.gov/columbia/home/CAIB_Vol1.html](https://www.nasa.gov/columbia/home/CAIB_Vol1.html)

The term 'normalization of deviance' was coined precisely because accident
investigators have found a recurring pattern. The recognition of a recurring
pattern is more helpful in accident prevention than simply attributing each
incident to "just the ongoing tension between the desires of higher management
and the reality that those on the ground know."

Here's another example of it, not involving engineers, and where explicit
management pressure was not an issue:

[http://www.rapp.org/archives/2015/12/normalization-of-devian...](http://www.rapp.org/archives/2015/12/normalization-of-deviance/)

~~~
petermcneeley
Thanks for your reply. I would agree that the Columbia and Bedford disasters
are a much closer fit for this theory.

~~~
mannykannot
Ha! You are not, of course, agreeing with me (or the accident investigation
board) so long as you dismiss normalization of deviance as a key issue in the
Challenger crash.

