
The Space Shuttle Challenger Explosion and the O-ring - sethbannon
https://priceonomics.com/the-space-shuttle-challenger-explosion-and-the-o/
======
nemild
I'm the author of this piece, happy to answer questions.

I grew up with stories of the Challenger after my father - a statistician -
and two of his co-authors were selected by the National Academy of Sciences to
study whether the danger could have been predicted beforehand. They showed
that the likelihood of failure was 13% at the launch temperature, but would
have been negligible if NASA had waited just a few hours. (His co-author, Ed
Fowlkes, was dying of AIDS at the time, and considered this paper one of his
life's great achievements.)

Bad statistical inferences were a huge part of the launch story, and you can
see more in Richard Feynman's critiques:

[https://en.wikipedia.org/wiki/Rogers_Commission_Report](https://en.wikipedia.org/wiki/Rogers_Commission_Report)

Secondly, the effect I highlight (a biased data sample) is a key issue with
news/social media - and can lead us to heavily flawed inferences if we don't
correct for it.

I'll dig deep into this in future posts with a substantial amount of data and
visualizations.
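To make the biased-sample point concrete, here's a toy sketch in Python. The numbers are made up to mimic the rough shape of the published flight record - they are not the real dataset - but they show how dropping the damage-free flights (all of which happened in warmer weather) hides the cold-weather signal:

```python
# Biased-sample illustration: correlation between launch temperature and
# O-ring damage, with and without the damage-free flights.
# NOTE: these numbers are invented for illustration, not the actual data.
import math

damage_flights = [(53, 3), (57, 1), (58, 1), (63, 1), (70, 1), (70, 1), (75, 2)]
clean_flights = [(t, 0) for t in (66, 67, 67, 67, 68, 69, 70, 72,
                                  73, 75, 76, 76, 78, 79, 80, 81)]

def correlation(pairs):
    """Pearson correlation between temperature and damage count."""
    n = len(pairs)
    mt = sum(t for t, _ in pairs) / n
    md = sum(d for _, d in pairs) / n
    cov = sum((t - mt) * (d - md) for t, d in pairs)
    vt = sum((t - mt) ** 2 for t, _ in pairs)
    vd = sum((d - md) ** 2 for _, d in pairs)
    return cov / math.sqrt(vt * vd)

r_biased = correlation(damage_flights)                # damage flights only
r_full = correlation(damage_flights + clean_flights)  # all flights

print(f"damage-only sample: r = {r_biased:+.2f}")  # weak, easy to dismiss
print(f"full sample:        r = {r_full:+.2f}")    # clear cold-weather signal
```

With only the damage flights, the correlation is weak enough to wave away; put the damage-free flights back in and the temperature effect is hard to ignore.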

~~~
kevinpet
Feynman's actual observations are well worth reading for anyone who builds
anything that may even vaguely be considered engineering.

[http://science.ksc.nasa.gov/shuttle/missions/51-l/docs/roger...](http://science.ksc.nasa.gov/shuttle/missions/51-l/docs/rogers-
commission/Appendix-F.txt)

The key items for me were:

1) While they had no expectation of erosion, and the design did not call for
the o-rings to erode, once they observed them eroding, they retroactively
invented a "margin of error" based on what fraction the o-rings eroded. This
was not based on an actual understood process, and is akin to saying "well,
the bridge didn't break when we drove that truck over it, so it must be okay"

2) The engineers actually knew the risk (~1% chance of loss per launch, not
specific to the o-rings, compared with two actual losses of the shuttle over
~130 missions). Management used entirely invented numbers for the risk which
were not justified.

~~~
avar
Your paraphrasing of Feynman's bridge quote is inaccurate. From Appendix F[1]
of the report:

    
    
        [..]  In spite of these variations from case to case, officials behaved as
        if they understood it, giving apparently logical arguments to each
        other often depending on the "success" of previous flights. For
        example, in determining if flight 51-L was safe to fly in the face of
        ring erosion in flight 51-C, it was noted that the erosion depth was
        only one-third of the radius. It had been noted in an experiment
        cutting the ring that cutting it as deep as one radius was necessary
        before the ring failed. Instead of being very concerned that
        variations of poorly understood conditions might reasonably create a
        deeper erosion this time, it was asserted, there was "a safety factor
        of three." This is a strange use of the engineer's term, "safety
        factor." If a bridge is built to withstand a certain load without the
        beams permanently deforming, cracking, or breaking, it may be designed
        for the materials used to actually stand up under three times the
        load. This "safety factor" is to allow for uncertain excesses of load,
        or unknown extra loads, or weaknesses in the material that might have
        unexpected flaws, etc. If now the expected load comes on to the new
        bridge and a crack appears in a beam, this is a failure of the
        design. There was no safety factor at all; even though the bridge did
        not actually collapse because the crack went only one-third of the way
        through the beam. The O-rings of the Solid Rocket Boosters were not
        designed to erode. Erosion was a clue that something was wrong.
        Erosion was not something from which safety can be inferred.
    

His point about NASA's nonsensical use of the "safety factor" is not that you
could drive over a bridge, and look, it didn't break, so it must be OK!

It's even worse: you drive a truck over it, afterwards 1/3 of the steel is
cracked, and you conclude that it must be able to safely accept 3x the weight.
Nonsense! This is the sort of moronic engineering that killed the crew of the
Challenger.

1\.
[http://science.ksc.nasa.gov/shuttle/missions/51-l/docs/roger...](http://science.ksc.nasa.gov/shuttle/missions/51-l/docs/rogers-
commission/Appendix-F.txt)

~~~
masklinn
> This is the sort of moronic engineering that killed the crew of the
> Challenger.

Note that it wasn't even engineering. Reading the story in full (it's great,
and covers the software side, for which Feynman had nothing but praise),
Feynman repeatedly noted that engineers were fairly realistic[0] and had been
ringing alarm bells pretty much all along; this was entirely manglement
mangling.

[0] unless the spectre of manglement was involved, at least for some of them

~~~
avar
It's been some time, but I've read the entire report cover-to-cover. While
yes, the general conclusion is that NASA's dysfunctional management structure
and institutional optimism driven by moneyed interests were the primary
culprits, the report never really tries to perform a root cause analysis of
how something like the O-ring "safety factor" problem arose.

Something which, as summarized by Feynman's quote above, should be patently
obvious to any engineer as bullshit.

Feynman's appendix is the only part that even tries, but it doesn't go far
enough through no fault of Feynman's, he had no resources to pursue this line
of inquiry. It was a struggle just to get that appendix into the report.

They should have interviewed every single person even remotely involved in
that O-ring decision, found out whether they objected to it, and, if they
didn't, what monetary/institutional/social obstacles prevented them from
doing so.

Did some engineer actually sign off on the aforementioned "safety factor"? We
don't know, but somehow I doubt that's language management came up with on
their own; and even if they did, I doubt there was no way for an engineer to
spot it and report "wtf? The system doesn't work like that!".

Reading between the lines, some engineer actually did come up with that
estimate, but that engineer was likely where he was because NASA had a culture
of promoting mindless yes-men.

~~~
masklinn
> It's been some time but I've read the entire report cover-to-cover.

I meant Feynman's later recounting of the whole affair (in "What do you care
what other people think"), rather than just the report.

> Did some engineer actually sign off on the aforementioned "safety factor"?
> We don't know, but somehow I doubt that's language management came up with
> on their own

That doesn't mean they were fed that by an engineer, only that they'd
encountered the term before.

> and if they did that there was no way for an engineer to spot that and
> report "wtf? The system doesn't work like that!".

And then what? Upper management uses "safety factor" in a completely bullshit
manner, an engineer spots that (because they're masochistic and read
management reports?), tells their direct manager it's inane, and then what?
You think it's going to go up the chain to upper management, which will fix
the issue? Because IIRC (I don't have my copy of What Do You Care on me so I
can't check) Feynman noted that engineering concerns systematically got lost
somewhere along the management ladder as one middle manager decided not to
bother _their_ manager with a mere engineer's (or worse, a technician's!)
concerns or suggestions.

> Reading between the lines some engineer actually did come up with that
> estimate, but likely that engineer was where he was because NASA had a
> culture of promoting mindless yes-men.

That's really not what I read behind the lines considering engineers had
failure estimates in the % range and management had estimates in the per-
hundred-thousand range.

~~~
avar
> I meant Feynman's later recounting of the whole affair

I've read that too. You're dangerously close to getting me to re-read
everything Feynman's written, again. I don't know whether to curse you or
thank you :)

> And then what? [...]

I feel we're in violent agreement as to what the actual problem at NASA was;
yes, I'm under no illusion that if some engineer had raised these issues it
would have gone well for him. This is made clear in the opening words of
Feynman's analysis:

    
    
        [...] It appears that there are enormous differences of opinion as to the
        probability of a failure with loss of vehicle and of human life. The
        estimates range from roughly 1 in 100 to 1 in 100,000. The higher
        figures come from the working engineers, and the very low figures from
        management. What are the causes and consequences of this lack of
        agreement? Since 1 part in 100,000 would imply that one could put a
        Shuttle up each day for 300 years expecting to lose only one, we could
        properly ask "What is the cause of management's fantastic faith in the
        machinery?"
    

I'm pointing out, not to disagree with you, but just to use your comment as a
springboard, that to an outside observer this whole process led to some
"moronic engineering". Engineering is the sum of the actual construction &
design process and the management structure around it.

The real flaw in the report is that it didn't explore how that came to be
institutional practice at NASA; Feynman is the only one who tried.

> That's really not what I read behind the lines.

Regardless of what sort of dysfunctional management practices there were at
NASA, they couldn't have launched the thing without their engineers. If the
engineers were truly of the opinion that shuttle reliability was three orders
of magnitude worse than management thought, perhaps they should have refused
to work on it until that death machine was grounded pending review.

Of course that wouldn't have been easy, but it's our responsibility as
engineers to consider those sorts of options in the face of dysfunctional
management, especially when lives are on the line.

~~~
dzdt
I think the engineers (and astronauts) accepted 1 in 100 odds of failure as
the price of being part of the project. That is not a "death machine", just a
risky and exciting one. For comparison, that risk is roughly equivalent to
working five years in a coal mine in the 1960s.
[https://www.aei.org/publication/chart-of-the-day-coal-
mining...](https://www.aei.org/publication/chart-of-the-day-coal-mining-
deaths-in-the-us-1900-2013/)

~~~
avar
Yes, which is fair enough, and personally I think that's fine. With odds like
that you'll still get people to sign up as astronauts, and it'll be easier to
advance the science. In the grand scheme of things it's silly to worry about
those deaths and not, say, deaths from traffic accidents.

The real issue was that that's not how NASA presented it outwardly. I doubt
the teacher who died aboard Challenger was told about her odds of surviving in
those terms.

As human launch vehicles go I think the shuttle's reliability was fine. The
reason I called it a death machine is that if you make a vehicle that explodes
1% of the time you better advertise that pretty thoroughly before people step
on board. NASA didn't.

------
seliopou
Tufte wrote an essay on how the data available suggested that there was a high
likelihood of O-ring failure, but that the data and findings were poorly
communicated. This led to the decision to launch, the subsequent failure of
the O-rings and loss of life. This essay appears in the booklet "Visual and
Statistical Thinking"[0], among other publications, along with another essay
on how the source of cholera was traced to contaminated drinking water in 19th
century London by John Snow. He plotted cholera cases on a map, and looked at
where the outbreaks were most frequent.[1] This also led to the discovery of
the vector of cholera, which up until then was unknown or at least
misattributed. Both are great reads.

[0]:
[https://www.sfu.ca/cmns/courses/2012/801/1-Readings/Tufte%20...](https://www.sfu.ca/cmns/courses/2012/801/1-Readings/Tufte%20Visual%20and%20Statistical%20Thinking.pdf)

[1]:
[https://en.wikipedia.org/wiki/1854_Broad_Street_cholera_outb...](https://en.wikipedia.org/wiki/1854_Broad_Street_cholera_outbreak#/media/File:Snow-
cholera-map-1.jpg)

~~~
justin66
Some of the engineers at Thiokol whose work Tufte criticized have responded
over the years. There's at least one painfully academic paper out there,
although if you wanted to you could start here:

[http://www.onlineethics.org/Topics/ProfPractice/Exemplars/Be...](http://www.onlineethics.org/Topics/ProfPractice/Exemplars/BehavingWell/RB-
intro/RepMisrep.aspx#misrep)

[https://eagereyes.org/criticism/tufte-and-the-truth-about-
th...](https://eagereyes.org/criticism/tufte-and-the-truth-about-the-
challenger)

It's been longer since I read Feynman but I recall his assessment as being a
lot more grounded, and fairer to the engineers.

~~~
masklinn
> It's been longer since I read Feynman but I recall his assessment as being a
> lot more grounded, and fairer to the engineers.

Feynman laid the vast majority of the blame on management ("NASA officials" in
his Appendix F), noting that engineers had fairly realistic views of the
matter (and failure rate estimates) and IIRC that they'd tried to raise
concerns but those had gotten lost climbing the manglement ladder.

The one unit for which he had nothing but praise was the Software Group:

> To summarize then, the computer software checking system and attitude is of
> the highest quality. There appears to be no process of gradually fooling
> oneself while degrading standards so characteristic of the Solid Rocket
> Booster or Space Shuttle Main Engine safety systems.

Noting that they had to constantly resist manglement trying to mangle:

> To be sure, there have been recent suggestions by management to curtail such
> elaborate and expensive tests as being unnecessary at this late date in
> Shuttle history.

~~~
alephnil
> Noting that they had to constantly resist manglement trying to mangle:

Such a failure to uphold standards was the main reason for the failure of the
first Ariane 5 launch. They reused a subsystem from the Ariane 4 rocket, and
it failed on Ariane 5 because of a numeric overflow. This happened because the
much more powerful Ariane 5 was much further along its trajectory, at a
greater angle, during the time this subsystem ran, producing a value large
enough to overflow - something that had apparently been proven could not
happen on the Ariane 4 rocket.

When it was decided to reuse the subsystem and its software on Ariane 5, they
did not even run the subsystem against the projected trajectory of the new
rocket. If they had, the problem would have been found prior to launch.
Luckily, this was not a manned mission.
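A toy sketch of that failure mode in Python (the velocity values here are invented for illustration; the real code was Ada converting a 64-bit float to a 16-bit integer):

```python
# Sketch of the Ariane 5 failure mode: a value that always fit in a 16-bit
# signed integer on the old trajectory overflows on the new, faster one.
# The numeric values below are made up purely for illustration.

INT16_MIN, INT16_MAX = -32768, 32767

def to_int16(value: float) -> int:
    """Convert a float to a 16-bit signed integer, like the reused alignment code."""
    n = int(value)
    if not INT16_MIN <= n <= INT16_MAX:
        raise OverflowError(f"{value} does not fit in 16 bits")
    return n

ariane4_velocity = 30_000.0  # hypothetical: within range, "proven safe"
ariane5_velocity = 40_000.0  # hypothetical: the faster trajectory exceeds it

to_int16(ariane4_velocity)   # fine on the old trajectory
try:
    to_int16(ariane5_velocity)
except OverflowError as e:
    print("unhandled on Ariane 5:", e)
```

On the real flight the exception was unhandled, the inertial reference system shut down, and the rocket was lost.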

~~~
Baeocystin
>When the subsystem and its software was decided to be used on Ariane 5, they
did not even run the subsystem with the projected trajectory of the new
rocket.

On one hand, I find that incredibly hard to believe. Yet on the other I am old
enough to have seen exactly that kind of thinking often enough that I don't
find it hard to believe at all.

------
cvoss
My university has a course (maybe in the business or statistics department?
I'm not sure, as I didn't take it) with a project where the professor gives
this very pre-launch data to groups of students in a totally different
context: it is presented as Formula One race data, and the task is for the
students (the racing team) to decide whether or not to pull out of an
important race based on current weather conditions and the potential safety
implications for the driver.

The next class period, only after the teams have proposed their course of
action, the professor reveals where the data really came from. I imagine it's
quite jarring, especially for those who decided to proceed, albeit with
different risks in mind.

~~~
nemild
This is often taught in business schools and one case is called Carter Racing:

[http://heller.brandeis.edu/executive-
education/maine-2012/ma...](http://heller.brandeis.edu/executive-
education/maine-2012/may/pdfs/BHLP-102-READING-Carter-A.pdf)

~~~
LeifCarrotson
That's only the first file in the series. There are -B.pdf and -C.pdf files as
well.

[http://heller.brandeis.edu/executive-
education/maine-2012/ma...](http://heller.brandeis.edu/executive-
education/maine-2012/may/pdfs/BHLP-102-READING-Carter-B.pdf)

[http://heller.brandeis.edu/executive-
education/maine-2012/ma...](http://heller.brandeis.edu/executive-
education/maine-2012/may/pdfs/BHLP-102-READING-Carter-C.pdf)

And here's a convenient link to all three, apparently an Apache feature I
didn't know about: [http://heller.brandeis.edu/executive-
education/maine-2012/ma...](http://heller.brandeis.edu/executive-
education/maine-2012/may/pdfs/BHLP-102-READING-Carter-D.pdf)

> _Multiple Choices_

> _The document name you requested ( /executive-
> education/maine-2012/may/pdfs/BHLP-102-READING-Carter-D.pdf) could not be
> found on this server. However, we found documents with names similar to the
> one you requested._

> _Available documents:_

> _[http://heller.brandeis.edu/executive-
> education/maine-2012/ma...](http://heller.brandeis.edu/executive-
> education/maine-2012/may/pdfs/BHLP-102-READING-Carter-A.pdf) (mistyped
> character)_

> _[http://heller.brandeis.edu/executive-
> education/maine-2012/ma...](http://heller.brandeis.edu/executive-
> education/maine-2012/may/pdfs/BHLP-102-READING-Carter-B.pdf) (mistyped
> character)_

> _[http://heller.brandeis.edu/executive-
> education/maine-2012/ma...](http://heller.brandeis.edu/executive-
> education/maine-2012/may/pdfs/BHLP-102-READING-Carter-C.pdf) (mistyped
> character)_

> _Apache Server at heller.brandeis.edu Port 80_

Also, each document has the following item in the footer. I suspect that
Brandeis.edu is violating their license agreement by hosting these, and also
that the license agreement was designed by someone who really doesn't like the
Internet or computers:

> _Not to be reproduced, modified, stored, or transmitted without prior
> written permission of the copyright holder or agent._

~~~
socalnate1
You are correct. If you want to use this for team training, you should really
buy it; it isn't expensive:

[https://www.deltaleadership.com/store/shopexd.asp?id=15](https://www.deltaleadership.com/store/shopexd.asp?id=15)

~~~
nradov
It seems like that case is missing a piece. Where is the final analysis
showing the results of choosing to race or not? Are there separate instructor
notes somewhere?

~~~
jowiar
Having had this in Policy school (albeit a decade ago), I remember the follow-
on class basically being: "So, this happened in real life. Except it wasn't
racing -- it was the Challenger". (A room of 70 promptly headdesked). What
jumped out in the class discussion was how focused the conversation was on
economic risks and rewards (x% chance of y payoff, etc), and how "life of the
driver" was basically never mentioned as one of the risks of proceeding.

Anyway, eventually you learn to play "spot the Challenger graph" from a mile
away. I think it showed up 4 times in assorted courses I've taken over the
years (re: Data Visualization and Organizational Design).

~~~
LeifCarrotson
> how "life of the driver" was basically never mentioned as one of the risks
> of proceeding.

To be fair, a failed engine rarely causes the driver to die.

------
tps5
Richard Feynman, on NASA's attitude toward the space shuttle program: "For a
successful technology, reality must take precedence over public relations, for
nature cannot be fooled."

~~~
stcredzero
The gaming industry needs to understand that this applies to coding for games
as well. It especially applies to multiplayer.

~~~
flamedoge
multiplayer? We do fool clients all the time with extrapolation, just to shave
half of the latency. And when we get it wrong, we "correct" it retroactively.

~~~
stcredzero
The laws of nature are different in this context, but you have to follow them
with the same diligence, and the same harshness awaits when you fail. In
games, it's not about what is logical or mathematically correct; it's about
what feels correct.

------
amoruso
I'll add another recommendation for Tufte's writings on the Challenger
explosion. It should be required reading for all engineers. People who
criticize Tufte for oversimplifying miss the point entirely: it's not about
analysis, it's about communication. It's one thing for a domain specialist to
have a complex, multi-dimensional understanding of their specialty; extracting
a relevant summary for non-experts is something else entirely. If you've ever
been in a meeting where you had trouble getting your point across, you should
read this. Make diagrams like Tufte does to get your point across, and keep
more detailed ones as backups in case you need to dive into the details.

[http://williamwolff.org/wp-content/uploads/2013/01/tufte-
cha...](http://williamwolff.org/wp-content/uploads/2013/01/tufte-
challenger-1997.pdf)

~~~
7952
I think there is a tendency to trivialise the difficulty of communication on
large projects. I have had people tell me that "it is not rocket science".
Well actually, maybe it is much harder than that. A small team can design a
rocket engine; getting a small team of managers to know all the right facts is
very hard, and on many projects seemingly impossible. And that is on projects
where you can have very large margins of error, which is obviously not always
possible when building things with strict mass limits.

------
dispose13432
It's an interesting example of how something being done by the "public" sector
doesn't automatically make it safer than the same thing done privately.

Everyone knows about regulatory capture (when companies manage their safety
regulators). Normally it's private organizations pushing public safety
regulators.

Here, on the other hand, it was a public organization that took risks.

The reason is simple. Every organization has certain needs. Boeing needs to
make planes (make $), Ford needs to make cars (make $), etc. Safety is an
annoying thing they need to get over with ASAP to get to their primary purpose
(make $).

NASA needed to launch, and (at least for the managers) that became an
acceptable risk. If it flies 20 times and blows up once, they win.

So should there be a NASA and a NASAOC (NASA oversight committee, to check on
them)?

Then the organization on top of both (Congress, the President) will choose
which one to listen to.

This is the general problem of self-policing.

And the only way to get around this is by having multiple, independent,
providers. So if NASA doesn't think SpaceX is safe enough, they can shut down
the contract while still having access to space.

If NASA had had that in 1986, the Shuttle would have been (rightfully)
decommissioned then and there. Unfortunately, it took _another_ accident
before anything moved.

And the lesson applies to matters outside space as well.

------
Pigo
I was only 5 when this happened, but I remember what a blow it was to my
school. Were past missions, like Apollo, dismissive of concerns like this as
well and just lucky? Or were the shuttle missions just more complex, with more
points of failure to be concerned about?

~~~
dbcurtis
My understanding is that the attitude of NASA management changed from the
Apollo era's "Prove to me we are good to go." to one of "Prove to me we can't
go." Some might put it: Gene Kranz retired.

Source: anecdotes from an old friend who is a quality assurance engineer. He
was one of the boys on the ground in Houston who brought the Apollo 13 crew
home.

~~~
aidenn0
Another comment in this thread linked to a paper[1] whose narrative is that
the "Prove to me we can't go" attitude was specific to this launch:
specifically, it claims that since NASA didn't want to ground all shuttles for
two years, it instead accepted the recommendation that no launches be made
outside the environmental envelope of previous launches, but that decision was
then reversed specifically for Challenger.

Not in the paper, but from my own memory: the launch was high-profile due to
the first civilian on a NASA mission, and had been repeatedly delayed by the
time they launched. In fact my family had tickets to the launch, and we ended
up getting various tours of the space center instead since they kept delaying
it.

[edit]

Also, my understanding is that Kranz's hard-line began after the Apollo 1
tragedy. Was your friend there early enough to comment on that?

1:
[https://people.rit.edu/wlrgsh/FINRobison.pdf](https://people.rit.edu/wlrgsh/FINRobison.pdf)

------
mavhc
Of course if politics hadn't caused parts of the shuttle to be built far away
from the launch pad, they wouldn't have needed O-rings in the first place

~~~
greenshackle2
Yes, it is well known that American engineers do not use O-rings.

In fact, Americans could have built the entire shuttle without using any parts
at all.

~~~
snrplfth
The Shuttle Solid Rocket Boosters were mainly built in Utah, and many people
assert that this choice of location was due to political patronage. Because
the boosters were built inland, far from barge-capable waterways and distant
from the launch site, each one arrived at the Kennedy Space Center as four
pieces, rather than being completed as a single large piece at the factory,
and those pieces were then joined together with the O-rings in question. This
made them vulnerable to a blowout of the kind experienced with Challenger;
solid rocket boosters made in a single piece are generally much less
vulnerable to this kind of failure.

~~~
greenshackle2
Fair enough, I don't know what I'm talking about. Thanks for explaining.

(I thought OP meant no O-rings would've been used in the entire shuttle, not
just these particular ones.)

------
joering2
Just so we're on the same page: if you read enough about this sad story, you
know damn well it wasn't the explosion or the faulty, frozen O-ring that
killed those brave souls - it was a horrible amount of bureaucracy and
ignorance toward the engineers, who rang warning bells long before
Challenger's liftoff.

------
Splines
This is why I always have my visualizations overlay both failure rates and
usage.

A given failure rate (or even worse, a failure count) doesn't tell you much
about the system without also including the totals of both success and
failure.

I'm sure I could be more rigorous about this, though. Is there a way to
express a given failure rate in terms of certainty? As in: we have sampled the
failure rate of a component with fixed parameters a, b, c, and we are x%
certain of the failure rate? (Maybe I'm wording this wrong - I don't have much
of a stats background.)
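One way to formalize that question - a sketch, assuming you're happy to treat each trial (e.g. each launch) as an independent Bernoulli trial - is a binomial confidence interval around the observed rate, such as the Wilson score interval, which behaves reasonably for small counts:

```python
# Wilson score interval: confidence bounds on a binomial failure rate.
import math

def wilson_interval(failures: int, trials: int, z: float = 1.96):
    """Return (low, high) bounds for the true failure probability.

    z=1.96 gives an approximately 95% confidence interval.
    """
    p = failures / trials
    z2 = z * z
    denom = 1 + z2 / trials
    center = (p + z2 / (2 * trials)) / denom
    half = (z / denom) * math.sqrt(p * (1 - p) / trials + z2 / (4 * trials ** 2))
    return center - half, center + half

# Example: 2 shuttle losses in ~130 missions, figures from this thread
lo, hi = wilson_interval(2, 130)
print(f"observed rate {2/130:.1%}, 95% CI roughly {lo:.1%} to {hi:.1%}")
```

The interval comes out wide (well under 1% up to several percent), which is exactly the point: a couple of failures in ~130 trials pins down the true rate only very loosely.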

~~~
acidbaseextract
> This is why I always have my visualizations overlay both failure rates and
> usage.

Do you have an example? I love having more viz tools in my toolbox!

~~~
Splines
Sorry for the late reply - where I work we use an in-house platform that uses
Kendo for rendering charts.

[http://demos.telerik.com/kendo-ui/bar-
charts/column](http://demos.telerik.com/kendo-ui/bar-charts/column)

What I do isn't anything particularly sophisticated: I usually do a multi-axis
chart combining a line and a column series, with one axis corresponding to
usage and the other to failure rate. Imagine a basic multi-axis chart in
Excel, except rendered in a browser, and you're 90% of the way there.

------
lordnacho
Am I reading this wrong or did they launch the thing at a temperature way
outside the range where they normally launch shuttles?

Wouldn't there be a whole bunch of different stats measured, which would all
say that you should be cautious when trying a region far away from what you
know?

In any case, a very good example of how unintuitive statistics can be. I
hadn't guessed about the missing "no error" data until I read it. I'm sure
there are many more little things like that - Simpson's paradox, those kinds
of things.

~~~
masklinn
> Am I reading this wrong or did they launch the thing at a temperature way
> outside the range where they normally launch shuttles?

No, that's correct: the launch was the coldest yet, and reached temperatures
at which the O-rings had lost their flexibility and couldn't spring back fast
enough to seal. In fact, an iconic scene from the Challenger hearings was
Feynman demonstrating (on TV!) the loss of resilience after dunking an O-ring
in ice water.

------
mixmastamyk
I remember that day vividly, I was in jr. high school. It's hard to describe
how traumatic it was; the school teacher on board made it ten times worse. I
would compare it to 9/11 for those that aren't old enough to remember.

Once we learned that the rubber O-rings failed in cold weather, the solution
was always framed as simply not launching in those conditions.

But why not use a different material, or design away the O-rings, to avoid the
problem in the first place?

Edit: question is already answered here:
[https://news.ycombinator.com/item?id=13239241](https://news.ycombinator.com/item?id=13239241)

~~~
iamatworknow
>I remember that day vividly, I was in jr. high school. It's hard to describe
how traumatic it was; the school teacher on board made it ten times worse. I
would compare it to 9/11 for those that aren't old enough to remember.

It's interesting (albeit depressing) to think about how every generation in
the television (and now internet) age seems to have at least one of these
events occur right on the border of adulthood: you've become old enough to
have a sense of the world, and the personal impact of the event kind of jolts
you into reality, so to speak. For my dad it was JFK's assassination and for
me it was 9/11.

It's also notable that when Columbia disintegrated upon re-entry in 2003 the
media and public at large didn't seem to pay it much attention at all (or at
least I don't remember it being such a big deal).

~~~
dispose13432
>It's also notable that when Columbia disintegrated upon re-entry in 2003 the
media and public at large didn't seem to pay it much attention at all (or at
least I don't remember it being such a big deal).

Probably because it wasn't broadcast live.

~~~
sharksauce
It certainly was broadcast live -- every major TV news organization in the US
broadcast the re-entry. Of course, Columbia was at very high altitude when it
broke apart, and the cameras (fortunately, in my opinion) couldn't get very
close-in shots.

There's a point to be made that people were more prepared for a possible
disaster given the reports of damage to the shuttle that the crew provided
days ahead, compared to the absolute surprise and shock at Challenger, and the
horrifyingly clear camera footage provided at a lower altitude.

------
keithwinstein
The Challenger explosion is a great case study. But in focusing on that chart
from the Rogers Commission report, this piece reinforces what is basically a
well-told fable about the Challenger disaster.

The piece says: "Below is the key graph of the O-ring test data that NASA
analyzed before launch" and reproduces the famous chart. It continues, "NASA
management used the data behind this first graph (among many other pieces of
information) to justify their view the night before launch that there was no
temperature effect on O-ring performance [...] But NASA management made one
catastrophic mistake: this was not that chart they should have been looking
at."

I think these statements are pretty misleading without some major caveats.

Tufte ("Visual and Statistical Thinking: Displays of Evidence for Making
Decisions"; [https://blogs.stockton.edu/hist4690/files/2012/06/Edward-
Tuf...](https://blogs.stockton.edu/hist4690/files/2012/06/Edward-Tufte-Visual-
and-Statistical-Thinking.pdf)) writes:

"Most accounts of the Challenger reproduce a scatterplot that apparently
demonstrates the analytical failure of the pre-launch debate. The graph
depicts only launches with O-ring damage and their temperatures, omitting all
damage-free launches (an absence of data points on the line of zero incidents
of damage). First published in the shuttle commission report (PCSSCA, volume
1, 146), the chart is a favorite of statistics teachers. [...] The graph of
the missing data-points is a vivid and poignant object lesson in how not to
look at data when making an important decision. But it is too good to be true!
First, the graph was _not_ part of the pre-launch debate; it was _not_ among
the 13 charts used by Thiokol and NASA in deciding to launch. Rather, it was
drawn _after_ the accident by two staff members (the executive director and a
lawyer) at the commission _as their simulation_ of the poor reasoning in the
pre-launch debate. Second, the graph implies that the pre-launch analysis
examined 7 launches at 7 temperatures with 7 damage measurements. That is not
true; only 2 cases of blow-by and 2 temperatures were linked up. The actual
pre-launch analysis was much thinner than indicated by the commission
scatterplot. Third, the damage scale is dequantified, only counting the number
of incidents rather than measuring their severity. In short, whether for
teaching statistics or for seeking to understand the practice of data
graphics, why use an inaccurately simulated post-launch chart when we have the
genuine 13 pre-launch decision charts right in hand?"

(For a response to Tufte's essay, see
[https://people.rit.edu/wlrgsh/FINRobison.pdf](https://people.rit.edu/wlrgsh/FINRobison.pdf),
also cited elsewhere here.)

~~~
aidenn0
That Robinson paper is amazing, and I hadn't seen it before.

TL;DR:

The engineers said "Make these two fixes" and got 1.

The engineers said "Don't launch until the O-rings are redesigned" and were
informed that 2 years of no launches was unacceptable.

The engineers said "Okay then, at least don't launch with an O-ring colder
than any previous launch[1]" and this was accepted until a high-profile launch
was repeatedly delayed.

Finally they were told "We will launch unless you can prove to us it's not
flight ready" and due to natural uncertainties and a small number of data
points they could not meet this burden of proof.

1: Well, actually they weren't focused just on temperature, so it was really
more "outside the envelope of previous launches".

------
rietta
The Challenger explosion is one of my first vivid memories from my childhood,
as my mom was letting me watch the launch on TV as a 3-year-old. I already had
some understanding of death at the time, as I asked my mom if they were in
heaven (which was as advanced as my understanding got back then). This is a
fascinating read 30 years later.

------
skmurphy
The full data chart reminds me of Abraham Wald asking the question "where do
we never see damage on a returning plane?" to work out that damage in those
areas meant complete loss of the aircraft, and that they would benefit most
from more armor.

------
babesh
Institutional failure: don't forget the second shuttle disaster, on re-entry
(Columbia), with the known issue of tiles getting knocked off by foam. A badly
engineered contraption.

