
Software bug made Bombardier planes turn the wrong way - sohkamyung
https://www.theregister.co.uk/2020/05/29/bombardier_missed_approach_bug/
======
redis_mlc
I understand the bug, and it's one of the worst imaginable.

Doing an unauthorized departure turn can fly you into terrain. At night, the
pilots may not even notice it.

Source: commercially-rated pilot.

~~~
lutorm
It seems pretty far from "worst imaginable". I mean, it's not like it goes
into an uncontrollable dive.

~~~
dehrmann
If you still have control surfaces, is a dive at sufficient altitude ever
uncontrollable? Flat spins are way scarier.

~~~
redis_mlc
The main issues are:

\- dive with power causing overspeed

\- dive causing controls to lock due to transonic shock waves

\- improper recovery from a spiral dive will over-G

\- MCAS-style confusion

\- bottoming out on a phugoid oscillation

\- unrecoverable spins, usually in jets

But in general, if you unload the wings (reduce G), then nothing breaks.

------
dghughes
Four months ago there was a post of another article from theregister.co.uk
about another bug. The cockpit/flight deck displays went blank if a Boeing 737
landed at any runway that was oriented at 270 degrees true.

[https://news.ycombinator.com/item?id=21991087](https://news.ycombinator.com/item?id=21991087)

~~~
t0mas88
That sounds more dangerous than this one, but flying the wrong missed approach
or departure is far more dangerous than any display/instrument failure because
the crew wouldn't be aware of the problem. If the flight display fails you
look at the backup instrument and keep flying, every crew is trained for that.

But if the missed approach or departure procedure is wrong the crew has a high
probability of not noticing it (this all happens in a very high workload
situation). If you don't notice you can't fix it. What makes it worse is that
this bug happens in situations where the procedure requires a turn "the long
way around". They wouldn't design the procedure that way unless it's really
necessary, so there is a big chance there is terrain or an obstacle on the
other side.
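
To make the "long way around" point concrete, here is a minimal Python sketch of the difference between the shortest turn and a turn in the direction a procedure specifies. It only illustrates the geometry and is not how any real flight management system works; the function names and the example headings are made up for illustration.

```python
def turn_degrees(current_hdg: float, target_hdg: float, direction: str) -> float:
    """Degrees of turn needed to reach target_hdg when turning 'L' or 'R'."""
    if direction == "R":
        return (target_hdg - current_hdg) % 360
    if direction == "L":
        return (current_hdg - target_hdg) % 360
    raise ValueError("direction must be 'L' or 'R'")


def shortest_direction(current_hdg: float, target_hdg: float) -> str:
    """Direction of the shorter turn (ties go to 'R')."""
    return "R" if turn_degrees(current_hdg, target_hdg, "R") <= 180 else "L"


# Hypothetical procedure: flying heading 090, "turn RIGHT heading 360".
print(turn_degrees(90, 360, "R"))   # 270: the long way around, as published
print(turn_degrees(90, 360, "L"))   # 90: the shortcut
print(shortest_direction(90, 360))  # L
```

Anything that silently falls back to the shortest turn would go left in this example, toward whatever the published right turn was designed to avoid.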

Source: I'm a commercial pilot

------
baybal2
Another reverse polarity incident:

[https://www.airliners.net/forum/viewtopic.php?f=3&t=1424233&...](https://www.airliners.net/forum/viewtopic.php?f=3&t=1424233&sid=6b6dc652243a92d64d2d4359a3f5bad5)

------
trhway
>Most bugs in airliners tend to be unforeseen memory overflows

the 21st century, planet Earth.

------
rrmm
Another turn-the-wrong-way issue happened on the early versions of the 737.
The rudder would go hard over in the opposite direction to the one commanded.
It happened so forcefully that it was difficult (if not impossible) for a
pilot to counteract it using the pedals (they would basically have to stand on
the pedal).

It killed at least 157 people. The culprit in this case iirc was a combination
of a flaw in the hydraulic cylinder design with large temperature swings. The
story of the guy who finally figured it out is a fun one.

[https://en.wikipedia.org/wiki/Boeing_737_rudder_issues](https://en.wikipedia.org/wiki/Boeing_737_rudder_issues)

------
thePunisher
I keep noticing that more and more aviation and space missions fail because of
software problems. It seems to me that the new generation of engineers are
generally less competent or companies see software as an afterthought which
can be outsourced to lower-wage countries.

The Boeing MAX and Starliner come to mind, but the failed Moon missions by
Israel and India are also examples of this trend.

Cost cutting in software development is costing companies dearly. Boeing may
even go bankrupt because of this.

~~~
cryptonector
(The MAX was not so much a software issue as an architecture issue (starting
with insufficient redundancy). So that's not a good example of software
causing problems for airliners.)

There are two reasons why you can expect software to be more and more the
cause of airliner safety issues:

\- software is eating the world

\- software is getting more complicated

The first is a long-term trend now. Look under the hood of any automobile from
before the 80s: no computer to be found. Look under the hood of any automobile
from the past 30 years: computers abound. The reason for this is that many
problems are easier to address in software than in hardware. Of course, you go
from N hardware problems to some possibly smaller set of possibly simpler
hardware problems at the cost of gaining a set of software problems -- but
this trade-off usually pays off. In some cases this trade-off enables
functionality that would be infeasible to create otherwise.

The second problem is also a long-term trend: CPUs, systems, operating
systems, and applications have all tended to get more complex. In embedded
systems the trend has been less strongly towards ever-increasing complexity,
but even in embedded systems things have gotten more complex.

Whether the problem is less competence among today's programmers is hard to
establish here. First, we need much more software, which means we need many
more programmers, which means the quality of programmers you get probably does
decrease, though then again, we do have more programmers overall as more
people (competent and otherwise) are attracted to the industry. But more
importantly, the increase in complexity of today's systems could very well be
enough to make yesteryear's competent programmers incompetent today -- you
can't really compare software development 40 years ago to software development
today.

(I object to this idea that lower-wage programmers necessarily can't be
competent, though that isn't quite what you wrote. It's true that a lax
process for outsourcing can mean you get less competent programmers, and it's
probably true that higher GDP/capita correlates with availability of competent
programmers. But it doesn't follow that there are no competent lower-wage
programmers in India, say.)

~~~
thePunisher
The problem seems to me that management doesn't appreciate the need for
competent (and therefore expensive) software engineers in critical projects
like defense, aviation and space. The whole idea of trying to save a few bucks
on something as critical as a fly-by-wire system or end-to-end testing seems
totally ludicrous to me.

Boeing has been cutting too many corners since the MBAs took over and started
reorganizing things to maximize profitability. There's a good chance the
company will fail because of this.

------
abductee_hg
not the 1st time bombies did this:
[https://www.sueddeutsche.de/wirtschaft/deutsche-bahn-ic-1.47...](https://www.sueddeutsche.de/wirtschaft/deutsche-bahn-ic-1.4773380)

------
908B64B197
Turning off the feature doesn't sound so bad considering the CRJ-200 first
flew in 1991. It took 26 years to identify the bug, so I assume it's not used
frequently at all.

~~~
throwanem
It's an avionics bug, so very likely the affected equipment is aftermarket.

~~~
MaxBarraclough
How's that? Are most avionics issues due to aftermarket equipment?

~~~
throwanem
I don't know about that, but it'd be a surprise to see an airframe of that age
in commercial service that hadn't had its avionics upgraded, since that's a
relatively simple way to gain new flight management capabilities. It'd also be
a surprise to see so severe a bug go undetected for so long, if it was part of
the original equipment.

------
parkovski
This reminds me of a meetup I attended last fall, where they were talking about the
Spectre/Meltdown issues. I asked the presenters if anything in chip
manufacturing/verification processes had changed as a result of that and they
seemed surprised.

To me, when a software bug shows up in a critical system, that means you
actually have a logistics bug. Airplane control software should not be allowed
to have bugs. CPUs should not be allowed to have bugs. And OS's should not be
allowed to crash (looking at you Microsoft).

When one of these things happens, in my opinion the correct response is _not_
to just release fixes and workarounds and then say "we'll try really hard to
not let it happen again." You do that, sure. But the first time you see
airplane software malfunction, that means you need to change the way the
software is written and released so that the whole class of issues will not
ever happen again. You don't stop at a public apology, you don't fire the
person that unintentionally wrote the bug. If you have to hire mathematicians
to formally prove the critical paths of the software, you do that. If it costs
10x more to release bug-free software, oh well, you do that.

All of these corporate people thinking they can save money by spending less on
quality are extremely naive. You can do a financial analysis of this, but
they're doing it wrong. Did you ever consider what the cost of a whole
generation just not trusting air travel at all would be?

~~~
Veserv
You are correct, but airplane companies already do that for the most part and
much much more.

The difference in reliability between normal software and airplane software is
so vast that "best practices" from normal software can not be applied to
airplane software since that would be gross criminal negligence. To explain,
in the 10 years prior to the 737 MAX problems there were 50,000,000 flights
and software was not implicated in a single passenger air fatality. The
average flight is ~5,000 km, which is ~4-5 hours. So, counting the two MAX
crashes, in ~250,000,000 flight-hours there were two crashes due to software.
A plane takes ~3 minutes to fall from cruising altitude, so we can model this
as a downtime of 6 minutes per 250,000,000 hours, which gives us a downtime of
1 in 2,500,000,000 or a 99.99999996% uptime (yes, that is 9 9s). In contrast,
I think most software people would agree that AWS is high quality. The AWS SLA
specifies a 99.99% uptime (1 in 10,000 downtime). So, by this metric, airplane
software is 250,000x more reliable than normal high quality software.
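
The back-of-the-envelope model above can be written out as a short Python sketch. It only restates the comment's own assumptions (two crashes, ~3 minutes of "downtime" per crash, 250,000,000 flight-hours, a 1-in-10,000 baseline for AWS); the function and constant names are made up for illustration.

```python
from fractions import Fraction


def downtime_one_in(flight_hours: int, crashes: int, fall_minutes: int = 3) -> Fraction:
    """Return N such that the modeled downtime is '1 in N' of total flight time."""
    downtime_minutes = crashes * fall_minutes
    return Fraction(flight_hours * 60, downtime_minutes)


def uptime_percent(one_in: Fraction) -> float:
    """Convert a '1 in N' downtime into an uptime percentage."""
    return float(100 * (1 - 1 / one_in))


AWS_ONE_IN = Fraction(10_000)  # 99.99% uptime, as quoted in the comment

fleet = downtime_one_in(flight_hours=250_000_000, crashes=2)
print(fleet)                  # 2500000000
print(uptime_percent(fleet))  # roughly 99.99999996
print(fleet / AWS_ONE_IN)     # 250000
```

Exact fractions are used so that the ratios come out as whole numbers instead of picking up floating-point noise.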

The point of this is that the standard for airplanes is almost inconceivably
high compared to normal software. To think that they are incompetent or
suggest that all they need to do is adopt X or Y common-sense/best-practice is
a gross misunderstanding of what is being done and what needs to be done to
improve. It would be like someone trying to tell a civil engineer making a
50-story skyscraper that they really need to adopt high quality wood
construction techniques from makers of doghouses. To actually improve it, you
need to consider practices 250,000x better than "best practices" and go from
there.

To put it another way, the solutions are actually really really good,
unfortunately the problems are really really really really hard.

~~~
jfim
Not to detract from your point that aeronautical industry software is reliable
(it is), but the 737 MAXes that crashed were all new planes. There weren't even
24 months between the first delivery of a MAX and the model being grounded.

The issues with the MAX were also clearly preventable and there were multiple
failures of the systems (regulators, internal reviews, etc.) that were in
place to catch these kinds of issues.

But as you point out, the aeronautical industry has an excellent track record
for software reliability, if you evaluate reliability by hull losses. By other
metrics, it's a bit more debatable (eg. the integer overflow for Dreamliners
such that they need to be restarted at least every 248 days), but still keeps
people moving safely.
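
For reference, the 248-day figure is what you would get from a signed 32-bit counter that ticks every hundredth of a second. That is a commonly cited explanation and is assumed here, not something stated in this thread; the names in this Python sketch are made up for illustration.

```python
SECONDS_PER_DAY = 86_400
INT32_MAX = 2**31 - 1  # largest value of a signed 32-bit integer


def days_until_overflow(ticks_per_second: int, max_value: int = INT32_MAX) -> float:
    """Days of continuous running before a counter starting at 0 exceeds max_value."""
    ticks_available = max_value + 1
    return ticks_available / ticks_per_second / SECONDS_PER_DAY


# Assumption: the counter increments 100 times per second (centiseconds).
print(days_until_overflow(100))  # a little over 248.5 days
```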

~~~
Veserv
Yes. I included the MAX because otherwise the number of software-related
fatalities over the last 10 years is 0. If you do just the MAX, the low end in
terms of flights is ~200,000 with an average of 3 hours per flight. Using the
same basis as above (two crashes, 6 minutes of downtime), that is 1 in
6,000,000 or 99.99998% uptime, which is 600x better than AWS by my previously
used metric. The software of an unconscionable deathtrap is 600x better than
extremely high quality server software.
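
The MAX-only numbers work out the same way. A minimal Python sketch, using only the figures quoted in this comment (the function name is made up for illustration):

```python
def max_only_one_in(flights: int = 200_000, hours_per_flight: int = 3,
                    crashes: int = 2, fall_minutes: int = 3) -> int:
    """Return N such that modeled downtime is '1 in N' of MAX flight time."""
    flight_minutes = flights * hours_per_flight * 60  # total time in the air
    downtime_minutes = crashes * fall_minutes         # modeled "downtime"
    return flight_minutes // downtime_minutes


one_in = max_only_one_in()
print(one_in)            # 6000000
print(one_in // 10_000)  # 600, relative to a 1-in-10,000 (99.99%) baseline
```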

My primary point is that many people look at these failures and incorrectly
conclude that the processes in place are objectively terrible and below
average. This leads to them discounting the processes in these systems in
favor of policies from vastly less reliable systems that they think are
quality-focused or "best practices" because they, fairly, think "bad" in a
safety-critical context means the same as regular "bad", so regular "amazing"
is clearly better. In truth, "unconscionable deathtrap" and "gross criminal
negligence" in the airplane world is more of a synonym for "amazing beyond
belief" in the rest of the software industry. The correct takeaway is
understanding that regular "amazing" is actually orders of magnitude worse
than "unconscionable deathtrap" and is thus completely inadequate for the job.
As a corollary, if you do not think you are doing "way better than amazing"
you are probably not doing an adequate job in these contexts.

To reiterate, the solutions are really really good, unfortunately the problems
are really really really really hard.

------
kohtatsu
I just flew out of this airport yesterday, only 5 passengers onboard a
Bombardier Q400.

Thankfully there are no nearby hills for this bug to kill anyone there.

Unrelated, but how many carbon offsets do I buy?

~~~
markvdb
You'd have consumed about 0.45 l of kerosene per flight km per passenger.
Calculation based upon [0]. That means you'd need to offset CO2 emissions of
about 1.11 kg CO2 per flight km [1] for yourself.

[0] [https://www.flyradius.com/bombardier-q400/fuel-burn-consumpt...](https://www.flyradius.com/bombardier-q400/fuel-burn-consumption)

[1] [https://www.engineeringtoolbox.com/co2-emission-fuels-d_1085...](https://www.engineeringtoolbox.com/co2-emission-fuels-d_1085.html)

~~~
kohtatsu
Wow that's great, thank you for finding these and doing those calculations.
I'm glad to read it's one of the more fuel efficient aircraft.

Flightaware puts the actual distance at 760km, so it'd be ~844kg of CO2.
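
Spelled out as a quick Python sketch, using only the per-km figures quoted in the parent comment and the 760 km distance (the constant and function names are made up for illustration):

```python
LITRES_PER_KM = 0.45  # kerosene per flight km per passenger, from the parent comment
KG_CO2_PER_KM = 1.11  # CO2 per flight km per passenger, from the parent comment


def implied_kg_co2_per_litre() -> float:
    """Emission factor implied by the two quoted per-km figures."""
    return KG_CO2_PER_KM / LITRES_PER_KM


def trip_emissions_kg(distance_km: float) -> float:
    """CO2 to offset for one passenger over the given distance."""
    return distance_km * KG_CO2_PER_KM


print(round(implied_kg_co2_per_litre(), 2))  # 2.47 kg CO2 per litre
print(round(trip_emissions_kg(760)))         # 844 kg
```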

I found this great guide by the David Suzuki Foundation which assesses
different carbon offset vendors with a few different measures:

[https://davidsuzuki.org/wp-content/uploads/2019/10/purchasin...](https://davidsuzuki.org/wp-content/uploads/2019/10/purchasing-carbon-offsets-guide-for-canadians.pdf)

Pages 42-49 go into the different criteria, Page 50 is the table scoring them
all.

I decided to go with [https://carbonzero.ca](https://carbonzero.ca), it was
$19.89CAD for 0.88t. Thank you for your help :)

