

NASA code formally proven as deadlock free; later crashes due to a deadlock - timClicks
http://www.youtube.com/watch?v=_gZK0tW8EhQ

======
timtadh
Here is a paraphrase of what went wrong:

"The answer turned out to be twofold:

1) We just got unlucky. In the post-mortem analysis we figured out exactly
what happened, and it was an event that was very crucially timing-related. The
clock on the flight computer was slightly different from the clock on the bench
(ground) computer that all the tests had been run on. The sequence of events
which led to the deadlock turned out to happen with a higher probability with
the skewed clock.

2) How is it possible that the situation existed after the proof of
correctness? One possibility is that the proof system had a flaw. That turned
out not to be the case. It turned out to be much more prosaic: some
assumption was wrong. Only the kernel of the system was proven to be deadlock
free. However, the applications were not. And one application in particular
had to work around some limitations of the kernel-imposed programming model.
That workaround was the source of the deadlock, since it was not analysed by
the proof system.

"
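To make point 2 concrete, here is a toy sketch of how a check that covers only part of a system can pass while the whole system can still deadlock. This is not the NASA code or its proof system. The task names, the two locks "A" and "B", and the `find_deadlocks` helper are all made up for illustration, and the locks are assumed to be non-reentrant. The checker just explores every possible interleaving of the tasks it is given and reports states where nobody can make progress.

```python
# Hypothetical tasks: each is a list of (operation, lock) steps.
VERIFIED_TASK = [
    ("acquire", "A"), ("acquire", "B"), ("release", "B"), ("release", "A"),
]
# Illustrative stand-in for an unanalysed workaround: it takes the same
# locks in the opposite order.
UNVERIFIED_WORKAROUND = [
    ("acquire", "B"), ("acquire", "A"), ("release", "A"), ("release", "B"),
]


def find_deadlocks(tasks):
    """Explore every interleaving of `tasks`.

    Returns the set of program-counter tuples at which some task is
    unfinished and no task can take its next step.
    """
    # State = (program counter per task, frozenset of (lock, owner) pairs).
    start = (tuple(0 for _ in tasks), frozenset())
    seen = {start}
    stack = [start]
    deadlocks = set()
    while stack:
        pcs, held = stack.pop()
        owners = dict(held)  # lock name -> index of the task holding it
        successors = []
        for i, task in enumerate(tasks):
            if pcs[i] == len(task):
                continue  # this task has already finished
            op, lock = task[pcs[i]]
            if op == "acquire":
                if lock in owners:
                    continue  # blocked: the lock is currently held
                new_held = held | {(lock, i)}
            else:  # "release"
                new_held = held - {(lock, i)}
            new_pcs = pcs[:i] + (pcs[i] + 1,) + pcs[i + 1:]
            successors.append((new_pcs, new_held))
        finished = all(pc == len(t) for pc, t in zip(pcs, tasks))
        if not successors and not finished:
            deadlocks.add(pcs)  # stuck: work remains but nobody can move
        for state in successors:
            if state not in seen:
                seen.add(state)
                stack.append(state)
    return deadlocks


# Checking only the "verified" part finds nothing wrong.
print(find_deadlocks([VERIFIED_TASK]))
# Checking it together with the workaround exposes the stuck state.
print(find_deadlocks([VERIFIED_TASK, UNVERIFIED_WORKAROUND]))
```

Only one interleaving out of many ends up stuck (each task grabs its first lock before the other grabs its second), which is also why point 1 matters: whether you ever hit that window depends on timing.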

------
timClicks
The salient discussion is between minutes 37 and 47.

