On one hand, the mistake was the result of time and schedule pressure. Some of that pressure was real, but some was illusory (as shown by the fact that Los Alamos could stop doing those experiments entirely and still deliver). They chose the approach they did because it was the easiest. But not only that - at least in Slotin's case, he chose the approach he did because he didn't believe he would make a mistake. He'd done this many times before. The danger had become routine, an idea captured in Diane Vaughan's work as "normalization of deviance", and in economics and sports research as the familiarity heuristic ("I know this, so I'm safe").
On the other hand, the experiment itself set them up to fail. Minor tweaks with near-zero cost impact, like bringing the tamper up from the bottom rather than down from the top, would likely have saved both men. My understanding is (and it's hard to get clear evidence of this) that both men designed their own experiments. Both were in a position to do them in a safer way at no additional cost. With hindsight, both likely would have. In Slotin's case, it seems like the commitment heuristic ("I want to be consistent with my past actions") played a role in him doing something he knew to be dangerous "one last time". In Daghlian's case, there seems to have been some role for the scarcity heuristic ("if I don't get this done tonight after the party, I might not get another chance").
This all goes to show how fallible we are. Not just these two, both extremely smart people. We take irrational risks all the time. The big takeaway here is that systems need to be safe even though people are going to make bad decisions for bad reasons. One extremely effective way to do that is to separate design from implementation, using our rational decision-making processes to make the hard decisions ahead of the moment. Then write the decisions down. Then follow them in the moment.
Hypothesis: at that point you feel comfortable, but you haven't accumulated the full spectrum of experience.
I'd expect the relevant statistic would be accidents per person, by age. But everything seems to be normalized per mile driven (where no such effect is apparent) or against the total accident rate.
>The big takeaway here is that systems need to be safe even though people are going to make bad decisions for bad reasons.
This is why I don't like the phrase "operator error". All too often, it's used to excuse systemic failings and avoid investigating the deeper causes of safety incidents.
Completely agree. But I don't think avoiding the term "operator error" (or human error, pilot error, medical error, whatever) is the solution to that way of thinking.
Instead, I think we need to keep repeating that humans, no matter how experienced, careful, knowledgeable or well-meaning, are going to make mistakes. The job of system designers is to find ways to make systems robust in the face of those mistakes, and to help make it easier for humans to do the right thing.
Checklists are one powerful tool there, for example. Another one is making systems that behave like people's mental models predict. I wrote about that idea here: http://brooker.co.za/blog/2019/08/12/kind-wicked.html
I'm not sure how comprehensive / modern it is (not my subfield), but I enjoyed it. And it provided at least one framework to think about error.
Specifically, that most errors can and should be categorized by the states that make them possible. The unique characteristics of each state (of which there are many) suggest very different approaches to resolving or eliminating them. To bring it back to the example in question here, remediating the procedure to eliminate a lack of knowledge of failure cases or risk would not have prevented either of these accidents (both men were well informed). However, technical solutions to physically prevent unacceptably risky "bypass" procedures would have.
There's a tendency to simply say "If we had more procedures, and they were followed, then this wouldn't have happened." But that seems like a dodge. More often than not (in my experience), more procedure is simply seen as overly burdensome and actively subverted by users (in the service of laziness, performance, compensation, or deadlines). As you say, willfully making bad decisions for bad reasons.
Sometimes you have the benefit of almost absolute control over people, e.g. SUBSAFE (or Navy or Air Force maintenance QA programs). But most organizations don't have that kind of training budget.
Consequently, the most effective procedure is the safest one that people will actually follow. So strike that balance.
If you can get hold of the high-level documentation, SUBSAFE is an excellent example of error mitigation in practice -- https://en.wikipedia.org/wiki/SUBSAFE
> "In the winter of 1945–1946, Slotin shocked some of his colleagues with a bold action. He repaired an instrument six feet under water inside the Clinton Pile while it was operating, rather than wait an extra day for the reactor to be shut down. He did not wear his dosimetry badge, but his dose was estimated to be at least 100 roentgen. A dose of 1 Gy (~100 roentgen) can cause nausea and vomiting in 10% of cases, but is generally survivable."
Just because I choose to jump a motorcycle through a flaming hoop doesn't indicate that I think my body is impervious to flame.
It means I weighed the risks and made a choice.
In other words, ignorance, chance, and bravery are different things. And accidents are some mix of all of them.
It's available on Amazon Prime.
A sphere of metal, a bit larger than a basketball, it's just the right size and shape to be manipulated by hand tools.
I wonder if that contributed to normalization of deviance. It doesn't look weird enough to kill you.
I'd like to hear a little more about the "bad luck" that caused a bottle of muscatel to get laced with antifreeze...