
Fatal Dose – Radiation Deaths linked to AECL Computer Errors (1994) - agopinath
http://www.ccnr.org/fatal_dose.html
======
lostlogin
>>As a result of the Therac-25 accidents, the FDA now requires documentation
on software for new medical and other products: a paper trail, in other words,
that can be examined by an independent body and retraced for flaws.<<

Anyone have any idea if this can be looked at by the end user? I'm not a
radiation technologist of the flavour mentioned in the article, I'm on the
diagnostic side. I use an MR scanner with numerous software bugs that I have
reported but which remain. Similarly, the scanner can be made to display data
which it says it is going to use in the next scan, but which it isn't. I
suspected a bug and found the way to reproduce it. My last email listed 24
similar bugs (I've found more since) but other than a "thanks, we will forward
this on" there has been no reply or comment. It is hard to imagine when this
could be a safety issue, but it is a waste of valuable time, it is a waste of
money and it's frustrating when I have gone to the trouble of working out the
exact way of creating the issues. If anyone is interested, the interface is so
god awful that instead of having an on off button or switch interface, the
scanner gets the user to type 1 or 0 for on and off into a text field. Some
fields take other values like 1, 2 and 3. Some take decimal values like 0 to 1
in 0.1 increments. There is no pattern to what the user is expected to type.
Yuck. This data is not properly sanitized either, and you can make the scanner
say it's "doing" something it's not. Type in 1.999, and an error message appears,
the field corrects to 2.0 but the scanner does the thing that a setting of 1
would produce. These sorts of bugs occur all over the place.
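
One way a symptom like the 1.999 case can arise is when the value shown in the field and the value sent to the hardware are derived separately from the same text, one by rounding and one by truncating. This is only a guess at the mechanism; nothing is known here about the scanner's actual code. A minimal Python sketch, in which `ALLOWED_MODES`, `buggy_handle` and `strict_handle` are all hypothetical names:

```python
from decimal import Decimal, InvalidOperation

# Hypothetical set of values one field accepts; not taken from any real scanner.
ALLOWED_MODES = (Decimal("1"), Decimal("2"), Decimal("3"))


def buggy_handle(text):
    """Reproduce the symptom: display and applied value are computed separately."""
    value = float(text)
    displayed = f"{round(value):.1f}"  # the field "corrects" 1.999 to "2.0"
    applied = int(value)               # the applied setting truncates 1.999 to 1
    return displayed, applied


def strict_handle(text):
    """Parse once, reject anything outside the allowed set, reuse one value."""
    try:
        value = Decimal(text.strip())
    except InvalidOperation:
        raise ValueError(f"not a number: {text!r}")
    if value not in ALLOWED_MODES:
        raise ValueError(f"{text!r} is not one of the allowed values")
    mode = int(value)
    # The displayed text is derived from the same value that gets applied.
    return f"{mode:.1f}", mode
```

With these definitions, `buggy_handle("1.999")` returns `("2.0", 1)`, which matches the mismatch described above, while `strict_handle("1.999")` raises `ValueError` instead of silently picking a setting.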

Edit: The "thanks" email is the most positive I've ever got, my previous
reports were met with statements like "we have some very experienced users who
haven't had this issue" when there were clear safety problems with earlier
scanner implementations (The scanner was producing axial slices at a location
different to where I asked for them to be, on a spine patient due in theatre -
good luck operating on the correct vertebral level). It's FDA approved and it's
on the latest software release. I have undergone manufacturer training and
have had additional training half a dozen times at my request and at the
manufacturer's request after my bug reports were met with "you're doing it
wrong". I'm not, the software is buggy and I have some excellent and amazing
screen shots and camera phone video of the bugs in action.

~~~
tokenadult
My son the hacker used to work in the medical device industry as a summer
employee while he was a student. The code he wrote for a medical device user
interface was to be submitted for a line-by-line code review by the FDA. He
estimated that the product would actually come to market more than three years
after the summer he worked on it. And maybe that is what you are encountering
--the person at the company who built in the bugs you have discovered has
moved on, and doesn't work at the company anymore, and the other employees
there are trying to figure out how to debug that old code and fix the problem.
(Similarly, my son groused about the code in the device he was working on,
which was acquired by his company from another company that had originally
developed the device.) Always comment your code. You never know how long after
you wrote it someone else will have to fix it, especially if the code is
embedded in a medical device.

~~~
lostlogin
Thanks - this has been in the back of my mind and is a reason I'm trying to be
patient. A 2 line message saying what was happening would remove my
frustration. Usually I get a corporate speak reply with a suggestion it is my
fault though. What does the FDA code review do? If it isn't catching bugs that
take the scanner offline for hours at a time, what is the point?

~~~
kohanz
I've worked on several FDA-regulated products and have never had the FDA
review my code. I would guess this only happens in extenuating circumstances.
The FDA does not have the resources to do this for most products out there.

We are required, however, to review our own code and maintain records of those
reviews.

~~~
vardump
Exactly. FDA doesn't review code!

If there are complaints, what the FDA does sometimes review is the mountain of
device-related documentation. Design, assembly, maintenance, end user manuals,
etc. Checking the paper trail. Is the paperwork done correctly, signed by a
competent employee and reviewed by appropriate persons? There also needs to be
a watertight trail of employee training. Failure to have that does not end well!

Traceability (both physical and code) is another thing you better get right as
a medical company. You need to know where, when, etc. each major component of
the device came to be.

Medical companies literally generate so much paperwork, that separate storage
facilities are needed for it. While you'd obviously have it in digital format
for yourself, all of it is also printed out and signed.

------
kohanz
The Therac-25 case study is a tragic one, but fortunately it is not forgotten.

I work on medical devices (and have worked on radiotherapy devices previously)
and the standards for quality systems and regulatory hurdles (which I
occasionally see bemoaned here on HN) are there with good reason. In fact,
Therac-25 is often cited when training new hires on quality (as required with
any ISO-13485 compliant QMS).

~~~
lostlogin
Diagnostic imaging guy here - we point our recruits to this, balding patients
when doing diagnostic tests shouldn't happen.
[http://www.ajnr.org/content/31/1/2.full](http://www.ajnr.org/content/31/1/2.full)

------
icco
One of the more infamous classes in Computer Science at Cal Poly SLO is
"Professional Responsibilities", taught by Dr. Clark Turner. The class
delves into Therac-25, and similar cases that have happened since. I found the
class really interesting because it does make you question and think about the
ethics of what you are building and what others have built.

Knowing about, and thinking about, the ACM Code of Ethics, Stuxnet, Therac-25,
the Windows Security Patch Policy, and other problems our programming culture
has come across is important. Realizing that the code we write can affect
people in both positive and negative ways on a long and short term scale is
something that can change both your product and how you build a product.

------
mariodiana
There was an article that appeared in the NY _Times,_ a few years ago, that
discusses the malfunctions of radiology equipment. There was one story, in
particular, that stood out for me. It describes a, reportedly not unusual,
malfunction/crash of a linear accelerator used for Intensity Modulated
Radiation Therapy (IMRT):

"An error message asked [the medical physicist operating the device] if she
wanted to save her changes before the program aborted. She answered yes."

How many programmers read that and cringe? I know I did. My guess is that the
operating system being used for the device is some standard OS (Windows CE,
maybe?) that is being repurposed to run the application and provide the GUI
for the device. It's not that this is necessarily bad, but I would think the
most important thing to do would be to strip the OS (or UI) of the various
"user conveniences" that in a life or death situation could have all kinds of
unintended consequences.

If a person is coding or doing graphic design -- or typing up cooking recipes
-- and a crash happens, it's a good thing to have the opportunity to save your
work. If 1 teaspoon of butter gets changed to 1 tablespoon because of some
kind of data corruption, big deal. So your cookies come out terrible!

It's quite a different matter if the application is coordinating 120 moving
parts to direct a radiation beam onto a human body.

The article is here:

[http://www.nytimes.com/2010/01/24/health/24radiation.html?pa...](http://www.nytimes.com/2010/01/24/health/24radiation.html?pagewanted=all&_r=0)

------
gnaffle
Thanks, I hadn't read this before.

For those that haven't read it, here's Leveson's article on the Therac-25:
[http://sunnyday.mit.edu/papers/therac.pdf](http://sunnyday.mit.edu/papers/therac.pdf)

~~~
thaumasiotes
This one made me think about public outrage against tobacco companies.

One minor theme in this article is that AECL denied knowledge of any reports
of Therac-25 malfunctions even when, looking at a timeline of publicly-known
events, such ignorance might be described as "implausible".

They don't seem to have been punished for this, and while I agree that it
isn't laudable I also agree that it's not the greatest infraction. AECL really
did care about the proper functioning of their machine. They really did look
for problems. They cooperated with the FDA to a very great extent. It's hard
to fault them for not thinking of testing "what if we enter incorrect
configuration information, and then correct it within 8 seconds?"

But tobacco companies are routinely vilified for sitting on cigarette
mortality data, as if this was by itself enough to make them irredeemable.
They didn't even get off with a light punishment, much less the zero
punishment AECL received. I suspect the difference, in the minds of many, is
that AECL was a benign company advancing a useful purpose, while tobacco
companies sold a product whose only use was to kill the operator. But that was
legal then and remains legal today -- how can it be the justification for
punishing them extra-hard for otherwise minor problems? AECL's
misrepresentedly-unsafe product didn't even kill the operator; it killed
random sick people who trusted the hospitals.

~~~
gnaffle
Didn't the tobacco companies also spend money to discredit scientists and
peer-reviewed articles and seed misinformation about the real risks of
smoking, all while they were sitting on that mortality data? I think that was
the real problem.

~~~
thaumasiotes
All of what you've said is also legal today.

------
jacobparker
Related reading:
[http://www.amazon.ca/Set-Phasers-Stun-Design-Technology/dp/0...](http://www.amazon.ca/Set-Phasers-Stun-Design-Technology/dp/0963617885)

------
blabby
At the time of Therac-25, FDA was only budgeted to investigate 6 percent of
device applications.

Currently, the same mistakes made in the eighties with Therac-25 are being
made in many radiation therapy devices. The two NY Times articles (Pulitzer
Prize winning) in 2010 and 2011 describe some of the newer cases.

What's shocking to me is that the incidents are always reported in isolation.
People become briefly outraged, then the furor dies down until the next death.

Many of the comments in this thread suggest that people can't or won't face
the fact that this is a current, ongoing problem of great complexity.

A couple of comments mentioned the coverage of Therac-25 in schools. Very
little of what is taught in schools makes it into the programming of radiation
therapy devices. History has shown that schooling is not a sufficient
solution.

Other comments claim (erroneously) that the FDA is attending to the problem.
The FDA has been carefully defanged by the medical device lobby. The FDA has
gotten smarter, but has nowhere near the funding to keep pace with its charge
and never will.

I wish I could say that I see some hope but I don't see it.

~~~
lstamour
Why hasn't the hardware failsafe for overdoses become mandatory? Why don't we
apply defense-in-depth to all worst-case scenarios involving deadly things?

Of course, sometimes hospitals aren't logical, air circulation between rooms
comes to mind. And here, I'm sure everyone just trusts the machines because
they paid a lot of money for them and it's always worked in the past ...

~~~
kalleboo
> Why hasn't the hardware failsafe for overdoses become mandatory? Why don't
> we apply defense-in-depth to all worst-case scenarios involving deadly
> things?

Because money

------
spectre256
"A professor in computer engineering at the University of Toronto told me
that, as a matter of course, his undergraduate students are warned about the
risks of incrementing numbers in a computer program."

As someone with a computer science degree who was warned of such risks and
studied the Therac-25 in my classes, this sentence made me realize how far we
have to go as professionals. Something seemingly so simple as incrementing a
number, one of the most common things done in a program, can cause serious
problems (of course we have more help with this now than in the mid 80's).
Other people must read things like that and cement any distrust they have in
computers and computer programmers. And they're probably right to.
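
Leveson's report (linked elsewhere in this thread) describes one such failure: a one-byte flag that was incremented on each pass instead of being set to a fixed nonzero value, so it rolled over to zero every 256th pass and a safety check was skipped. Below is a minimal Python sketch of that general pattern. The function name and structure are hypothetical and have no relation to the actual Therac-25 source; the `& 0xFF` mask stands in for one-byte storage, since Python integers do not overflow.

```python
def passes_with_zero_flag(use_increment, passes=1000):
    """Return the pass numbers on which a flag meant to stay nonzero reads as zero."""
    flag = 0
    skipped = []
    for n in range(1, passes + 1):
        if use_increment:
            flag = (flag + 1) & 0xFF  # emulate a one-byte counter that wraps at 256
        else:
            flag = 1                  # assign a fixed nonzero value instead
        if flag == 0:                 # zero is read as "no check needed"
            skipped.append(n)
    return skipped
```

Calling `passes_with_zero_flag(True)` returns `[256, 512, 768]`, while `passes_with_zero_flag(False)` returns an empty list: the increment version fails rarely and only on specific passes, which is what makes this kind of bug hard to find by testing.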

------
merraksh
_The Therac-25's software program, relatively crude by today's standards,
probably contained 101000 lines of code. At one error for every 500 lines,
that works out to the possibility of twenty errors._

I'd say 200, not twenty.

~~~
pathdependent
I think the article was OCR'd. There were a few other mistakes that were
clearly misinterpreted characters. I think the 1 in the thousands place is
actually a comma on the source.
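
The two readings can be checked with a quick calculation. The one-error-per-500-lines rate comes from the quoted passage; reading the figure as 10,000 lines is only the OCR guess above, not a confirmed number. The names in this sketch are hypothetical:

```python
LINES_PER_ERROR = 500  # rate quoted in the article


def expected_errors(lines_of_code):
    """Estimated error count at one error per LINES_PER_ERROR lines."""
    return lines_of_code / LINES_PER_ERROR


as_printed = expected_errors(101000)  # 202.0, i.e. "about 200"
ocr_reading = expected_errors(10000)  # 20.0, matching the article's "twenty"
```

Only the 10,000-line reading agrees with the article's own figure of twenty errors, which supports the OCR explanation.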

~~~
merraksh
Good point. There are many hints that the article was OCR'd, for instance an
".4ECL" instead of "AECL".

------
davidrusu
I had a professor read this case study in a lecture.

It amazes me that merely one programmer was trusted with building the software
for a radiation beam cannon.

~~~
snowwindwaves
I think a manager for a product would give as much work and responsibility to
one person as possible if they say they can do it, and sometimes even when
they say they don't know if they can do it but they'll try. An experienced
manager might know how realistic the workload is and downsides of having only
one person on task x but every manager sees the upside, fewer people = less
cost.

------
JoshTheGeek
101000 lines, 500 lines per error gives about 200 errors, not 20

------
noddingham
We covered this in my CS courses as well. I feel bad if anyone comes out of a
CS program and isn't exposed to the Therac-25 incident even if superficially.

