
How the Boeing 737 Max Disaster Looks to a Software Developer (2019) - ciconia
https://spectrum.ieee.org/aerospace/aviation/how-the-boeing-737-max-disaster-looks-to-a-software-developer
======
merricksb
If curious, see this big discussion from the time of publication in Apr 2019:

[https://news.ycombinator.com/item?id=19694207](https://news.ycombinator.com/item?id=19694207)

(528 points/299 comments)

------
lqet
This is one of the best paragraphs of this great article:

> I believe the relative ease—not to mention the lack of tangible cost—of
> software updates has created a cultural laziness within the software
> engineering community. Moreover, because more and more of the hardware that
> we create is monitored and controlled by software, that cultural laziness is
> now creeping into hardware engineering—like building airliners. Less thought
> is now given to getting a design correct and simple up front because it’s so
> easy to fix what you didn’t get right later.

Personally, I have noticed this laziness creeping into areas that are not
controlled by software at all after they are finished, like building or road
construction.

~~~
dv_dt
This "laziness" is not really a function of the people doing software work, or
as you point out any work in any area, it is imho, an expression of our broken
leadership and management culture that puts short term concern for profits
before quality.

~~~
lqet
Indeed, but with software, you can actually have both (short-term profits with
low-quality software, and long term quality by fixing the hacks afterwards),
so "move fast and break things" might actually be good leadership in software.
It doesn't work so well with building planes or houses.

------
sgt
Finally a comprehensive article about the 737 Max issues that I can
understand.

------
ncmncm
The following corrections to fundamental factual errors should not distract
one from the meat of the article:

1\. Jet engines, like car engines, are not described by the "Carnot cycle",
and their efficiency is not governed by it. The Carnot cycle describes closed-
system equipment like refrigerators and Nuke plants. Jet engines, like (our
now obsolete) cars, start with new working fluid -- air -- at the intake, and
discard it at the end -- exhaust. Their theoretical limit of efficiency is
technically 100%, not the 30-40% typical of Carnot equipment. In practice,
they don't get close, but those huge ship engines do achieve up to 55%: they
all take in air much, much cooler than their exhaust.

2\. The change to the thrust vector is _not_ a consequence of moving the
engines forward. It is a simple result of the bigger engine diameter making
the center farther below the wing. Moving the engines forward and up _reduced_
the size of the change to the thrust vector -- just not enough so to match the
old one. But the move had other effects, besides...

3\. The added aerodynamic lift of the engine nacelles has _nothing_ to do with
more power being produced by the bigger engines. It is purely a product of the
angle relative to the direction of air moving past. Bigger barrels have more
area, and more lift, and moving them forward gave them more lever arm to apply
their lift to rotate the plane instead of just holding it up, as wing lift
does.

4\. The change to the thrust vector is what makes punching the throttle _also_
apply a force to rotate the airframe to a higher pitch angle. So there are two
independent forces -- off-center lift, and off-center thrust -- both causing
the same effect: pitching the plane up more than the pilot asked for. How much
you get of each is a very complicated product of all the details of how the
plane is flying.

5\. On all 737s of any vintage, the controls really are still directly,
physically connected, by steel cables, to the control surfaces -- the hinged
bits at the back edges of wings and wing-shaped things. But there are
hydraulics that also yank on the cables, and other stuff yanks on the control
column, to doctor the pilot's experience.
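
Points 3 and 4 both come down to a force acting at some distance from the
center of gravity. A toy calculation, with every number invented purely for
illustration (these are not 737 figures, and `pitching_moment` is a
hypothetical helper), shows how the two independent nose-up moments add:

```python
def pitching_moment(force_n: float, arm_m: float) -> float:
    """Nose-up moment (N*m) from a force acting arm_m from the center of gravity."""
    return force_n * arm_m

# All numbers below are invented for illustration only.
nacelle_lift_n = 10_000.0   # lift generated by the nacelles at high angle of attack
nacelle_arm_m = 4.0         # how far ahead of the center of gravity that lift acts
thrust_n = 100_000.0        # engine thrust
thrust_arm_m = 1.5          # how far below the center of gravity the thrust line sits

lift_moment = pitching_moment(nacelle_lift_n, nacelle_arm_m)
thrust_moment = pitching_moment(thrust_n, thrust_arm_m)
total_nose_up = lift_moment + thrust_moment

print(lift_moment, thrust_moment, total_nose_up)
```

A longer arm raises the moment even if the force stays the same, which is the
"more lever arm" effect in point 3. In a real aircraft both forces and both
arms change with speed, attitude and loading.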

------
ppod
Great article. I just wonder if this bit is really true:

>Most of those market and technical forces are on the side of economics, not
safety.

People seem to think about airline safety a lot, like, an irrational amount
given the risk. I would guess that over the decades consumer pressure was a
huge driver of safety improvements, and the increasing automation seems to
have been a big safety benefit on balance.

~~~
nix23
>like, an irrational amount given the risk.

The risk that 300 people can die at once? I think that's very rational.

~~~
3pt14159
No. It's rational to focus on things like sugary drinks that lead to diabetes
or to work on improving road conditions for motorists and cyclists. So few
people die per unit distance traveled by plane that we're probably over
investing in airline safety. It's that when an accident happens it's front
page news, so the feedback to the overall system is more pronounced than when
a person in their 50s has a heart attack.

~~~
nix23
>sugary drinks

What's there to think about? Don't drink it (or not too much); that's where
YOU are responsible for YOUR life and not the lives of others. The airline and
pilots are responsible for your safe travel.

>road conditions for motorists and cyclists

In Germany 445 people died in bicycle accidents in 2018; that's about 1.5
aircraft. And please just wear a helmet, but sure, you are right. But neglecting
air safety to make roads safer sounds stupid... why not do both?

43% don't wear helmets and 37% are against mandatory helmets on bicycles,
again that's YOUR responsibility:

[https://www.thelocal.de/20161207/should-germans-be-made-
to-w...](https://www.thelocal.de/20161207/should-germans-be-made-to-wear-bike-
helmets-cycling-germany-australia)

>we're probably over investing in airline safety.

No, we hold airlines responsible for it; if you don't do that, this will happen
over time:

[https://edition.cnn.com/2020/06/25/business/pakistan-fake-
pi...](https://edition.cnn.com/2020/06/25/business/pakistan-fake-pilot-intl-
hnk/index.html)

>50s has a heart attack

True, because it's not something you or I can change. Maybe he/she had bad
habits (drinking, drinking sugar, etc.), maybe it was genetic, maybe he/she
could not afford insurance (and this is something we should and could change).

~~~
3pt14159
> What's there to think about? Don't drink it

There are social costs to obesity and diabetes. They should be taxed like
alcohol and tobacco and selling sugary drinks to children should be curtailed.
We should have warning labels and government funded commercial campaigns
highlighting the issue. We should subsidize healthier foods like vegetables
and promote a healthy and active lifestyle. We should make nutritional
education a core part of education.

> But neglecting air safety to make roads safer sounds stupid... why not do both?

We presently do both, but we underfund road safety and overfund airline
safety. It's a matter of economics. If we're spending $10m per life saved in
airlines and $300k per life saved on the road then we could save more lives by
adjusting the expenditures.

> No, we hold airlines responsible for it

And I'm happy that we do, but we can still overfund health and safety in one
part of the economy at the expense of the other.
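
To make the arithmetic concrete, here is a rough sketch using the hypothetical
per-life figures above ($10m in aviation, $300k on the road). The $30m shifted
budget and the `lives_saved` helper are illustrative assumptions, and the
sketch assumes cost per life stays constant, which it would not in practice:

```python
def lives_saved(budget_usd: float, cost_per_life_usd: float) -> float:
    """Lives saved by a budget at a constant (assumed) cost per life saved."""
    return budget_usd / cost_per_life_usd

AIR_COST_PER_LIFE = 10_000_000   # hypothetical figure from the comment
ROAD_COST_PER_LIFE = 300_000     # hypothetical figure from the comment

shifted = 30_000_000             # illustrative amount moved from air to road safety

lost = lives_saved(shifted, AIR_COST_PER_LIFE)     # lives no longer saved in aviation
gained = lives_saved(shifted, ROAD_COST_PER_LIFE)  # lives newly saved on the road
net = gained - lost

print(f"lost={lost:.0f} gained={gained:.0f} net={net:.0f}")
```

Under those assumptions the shift trades 3 lives for 100. The real question is
whether the marginal costs look anything like those two numbers.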

~~~
nix23
>They should be taxed like alcohol and tobacco and selling sugary drinks to
children should be curtailed.

Absolutely no problem with that, but it has nothing to do with airline security.

>but we underfund road safety and overfund airline safety

I don't know where you live, but here 'WE' pay nothing for airline safety;
every passenger does, and flying is still too cheap.

>but we can still overfund health and safety in one part of the economy at the
expense of the other.

I really don't know which country that is, but here (Switzerland) every
passenger pays a tax to the airport, which is responsible for air traffic
control and security, and the airport only lets airlines land when they meet
international security standards. It's not public spending but private spending
(by the passenger). And I think it's exactly the same in your country.

------
jnxx
So, if I read this right, Boeing's unique feature compared to Airbus was a
more direct and manual mode of control, which they advertised as an advantage,
and they removed this feature without telling clients and pilots?

------
commandlinefan
Isn't this the company that still distributes updates via floppy disks?

~~~
Jtsummers
The context for that would be useful if you want to use it as a way to judge a
company. I was involved, in just the last few years, in distributing updates
via tape. Why? Because the target system that needed updating used tape drives
to receive the updates, because they were built in the 1980s or 1990s (but
designed in the 1970s or 1980s). You may think it's crazy to keep using that
mechanism, but replacing old hardware is a non-trivial challenge in the
aviation world.

~~~
edgriebel
Right! It costs millions to certify something for flight even if it's been in
commercial use for years. Even for something as simple as swapping a spinning
disk for an SSD. And it has to be verified in-situ with the other components
of the system. And separate testing and documentation is required for each
airframe variant ("Type certificate"), so a swap for a 747 and 737 even though
they're both Boeing products requires separate testing and documentation.
There could be THOUSANDS of pages of documentation for a non-trivial product.

------
throwaway0a5e
The author should stick to software and stay out of airframes.

>I’ll say it again: In the 737 Max, the engine nacelles themselves can, at
high angles of attack, work as a wing and produce lift. And the lift they
produce is well ahead of the wing’s center of lift, meaning the nacelles will
cause the 737 Max at a high angle of attack to go to a higher angle of attack.
This is aerodynamic malpractice of the worst kind.

>Apparently the 737 Max pitched up a bit too much for comfort on power
application as well as at already-high angles of attack. It violated that most
ancient of aviation canons and probably violated the certification criteria of
the U.S. Federal Aviation Administration.

Pure bullshit. Many low wing passenger aircraft have this tendency somewhere
in their flight envelope. You can tone it down with some aerodynamic trickery
but it's just a fundamental trade-off you get when the engines are below and
forward of the wing, the load is above the wing and you pin the throttle. The
reason we got the MAX debacle was because the new 737 had this tendency to a
degree and in a portion of the flight envelope different than the old 737
which meant it couldn't ride on the old 737's cert without a bandage over
that. The 737 MAX8 handles perfectly reasonably for an airborne bus. What it
does not do is handle perfectly reasonably for a 737-800 hence the reason they
overlaid half baked software taking inputs from less than adequate hardware
systems (well they were adequate for the less critical systems they were
designed to feed into).

>Let’s review what the MCAS does: It pushes the nose of the plane down when
the system thinks the plane might exceed its angle-of-attack limits; it does
so to avoid an aerodynamic stall.

No. It (when it's not malfunctioning and trying to put you in a dive) pushes
the nose down when you're getting into the portions of the flight envelope
where the MAX flies differently than the old 737 in order to fudge the same
handling as the old 737 thereby skating by on the old cert.

Big disasters require many small things to go just right. If the software was
more robust this wouldn't have happened. If the hardware was more robust this
wouldn't have happened. If the UX had been compatible with the pilots' existing
knowledge this wouldn't have happened. If the airlines were willing to pony up
for type certs this wouldn't have happened. If the FAA had been hard asses
this wouldn't have happened. To play it off as "Boeing tried to fix a bad
plane with software and failed, whoopies" is exactly the kind of naive surface
level postmortem that results in nothing being learned and these kinds of
failures happening again. I know that opinions that deviate from "software
engineers are angels that never do anything wrong and corporate is the devil
also those hardware guys gave the software guys a crap situation to deal with"
are not going to be appreciated on HN but the fact of the matter is that any
one of the five-ish parties here (FAA, customers, and engineering, software
and management within Boeing) could have broken this chain resulting in a
different outcome and to prevent future events like this we need to understand
the decisions that happened at the margin that prevented the chain of events
from being broken.

~~~
dTal
This and other sources disagree with you:
[http://www.b737.org.uk/mcas.htm](http://www.b737.org.uk/mcas.htm)

>This abnormal nose-up pitching is not allowable under 14CFR §25.203(a) "Stall
characteristics". Several aerodynamic solutions were introduced such as
revising the leading edge stall strip and modifying the leading edge vortilons
but they were insufficient to pass regulation. MCAS was therefore introduced
to give an automatic nose down stabilizer input during elevated AoA when flaps
are up.

The 737 Max was not certifiable without MCAS at all.

~~~
throwaway0a5e
The law states:

>it must be possible to produce and to correct roll and yaw by unreversed use
of the aileron and rudder controls, up to the time the airplane is stalled.
_No abnormal nose-up pitching may occur._ The longitudinal control force must
be positive up to and throughout the stall. In addition, it must be possible
to promptly prevent stalling and to recover from a stall by normal use of the
controls.

Pretty much all commercial airliners are gonna pitch up if you're nose high
and have the throttle pinned. Arguably the degree to which the MAX does this
is novel. I think it likely handles like shit in that part of the flight
envelope but does it really handle like such shit that it's not safe? I don't
think so. Remember, this is basically a company truck. You just have to know
it's an issue and be trained to operate it properly. Most people would be
appalled at the braking characteristics of unladen semi-trucks but it's a non-
issue because everyone who has to drive one knows that in those specific
circumstances it will handle like crap. At the end of the day it is a matter
of degree and opinion. Yes, you can win some quick virtue points by taking a
stand in the name of safety and saying that nothing more than the most minor
pitch up is acceptable but I think that had this aircraft gone for a FAA
certification as a new type it would have passed just fine.

~~~
Swenrekcah
Not the one you responded to but:

I don't know enough about airplanes to say if the MAX could have certified as
its own type. I'd like to hope it could because if not then that would show a
very sorry state of affairs indeed.

But the big issue with the MAX is not that it handles differently, it is that
Boeing lied to everyone about that fact. Then they made another semi-
unforgivable error by masking the handling with software, ok, but didn't even
take the most basic steps available to ensure some redundancy in that
software.

It is just so very very bad engineering and mindset that it's hard to
describe.

Would I fly on a MAX if and when they get clearance again? Maybe after it's
been flying for a year or so, and depending on what some trusted people and
institutions say about it.

~~~
throwaway0a5e
>But the big issue with the MAX is not that it handles differently, it is that
Boeing lied to everyone about that fact. Then they made another semi-
unforgivable error by masking the handling with software, ok, but didn't even
take the most basic steps available to ensure some redundancy in that
software.

Which is exactly what I'm saying and the opposite of what the author and dTal
are saying.

>It is just so very very bad engineering and mindset that it's hard to
describe.

I don't think it's even a mindset. It's that sort of "nobody is responsible
for a large enough part of the system to be able to stop things and force
things to be done right" situation that a lot of big-cos get into that results
in them crapping out half baked products.

>Would I fly on a MAX if and when they get clearance again? Maybe after it's
been flying for a year or so, and depending on what some trusted people and
institutions say about it.

For me the decision would hinge on what training the pilots get.

~~~
Swenrekcah
>I don't think it's even a mindset. It's that sort of "nobody is responsible
for a large enough part of the system to be able to stop things and force
things to be done right" situation that a lot of big-cos get into that results
in them crapping out half baked products.

I agree that plays a part.

But the people in charge of the flight computer software should know better
than to have no redundancy.

And the executives definitely are able to stop things and force them to be
done right, and instill in their employees a mentality of safety-first. But
their bonuses might have suffered, so...
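
For what "some redundancy" could mean in code, here is a minimal sketch that
assumes two angle-of-attack readings are available and refuses to act when they
disagree. The thresholds and function names are hypothetical, chosen only for
illustration; this is not Boeing's logic, original or revised:

```python
from typing import Optional

DISAGREE_LIMIT_DEG = 5.0    # hypothetical disagreement threshold
ACTIVATION_AOA_DEG = 14.0   # hypothetical activation angle


def validated_aoa(left_deg: float, right_deg: float) -> Optional[float]:
    """Return the average of the two readings, or None if the sensors disagree."""
    if abs(left_deg - right_deg) > DISAGREE_LIMIT_DEG:
        return None
    return (left_deg + right_deg) / 2.0


def should_command_nose_down(left_deg: float, right_deg: float) -> bool:
    """Only allow an automatic nose-down input when both sensors agree."""
    aoa = validated_aoa(left_deg, right_deg)
    if aoa is None:
        return False  # sensors disagree: inhibit the automation, leave it to the pilots
    return aoa > ACTIVATION_AOA_DEG


print(should_command_nose_down(70.0, 15.0))  # one sensor reading wildly off
print(should_command_nose_down(15.0, 15.4))  # both sensors agree on a high angle
```

With two sensors you can detect a disagreement but not tell which one is wrong,
so the only safe response is to switch the automation off and tell the crew.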

------
pulse7
So they wanted to make a "backwards compatible airplane" with new engines so
that pilots could use the existing certificates...

~~~
hn3333
let me finish your summary:

... but the aerodynamics were changed in the process and they fixed it with
software, which was bad.

~~~
throwaway0a5e
Would have been fine if they hadn't fixed it with crap software that was
incompatible with the pilots' existing runaway trim checklist.

~~~
raxxorrax
I bet if we look at the requirements analysis of MCAS, the developers just met
the specification. I think the article makes too many assumptions about the
state of the industry and developer hubris, but the decision of system design
of the plane was probably decided before it reached developers. They might
also just have a perspective of this one system, so the result is as expected.

~~~
throwaway0a5e
Yup. The UX guys didn't appreciate how critical the software was and how
half-baked it was going to be (and that pilots would therefore need to interact
with it, which requires good UX). The software guys didn't know that the pilots
weren't gonna know how to turn off the software so they didn't know they
couldn't half bake it. The engine guys didn't know the software guys were
gonna half bake it. The AOA guys retired in the 1990s and had no idea the
system was gonna be used for anything more than warning lights.

Everyone did their job as it exists on paper but...
[https://youtu.be/452XjnaHr1A?t=20](https://youtu.be/452XjnaHr1A?t=20)

------
raxxorrax
Very interesting article. Seems like Boeing used the Apple approach to
computing here in the quest to make more money.

------
jiofih
(2019)

------
ppod
This article makes a really good case overall, but it must be deliberate that
it chooses not to mention the stabiliser trim cut-out switch. In the part
about the Cessna he says:

>There are instructions on how to detect when the system malfunctions and how
to disable the system, immediately. Disabling the system means pulling the
autopilot circuit breaker; instructions on how to do that are strewn
throughout the documentation, repeatedly. Every pilot who flies my plane
becomes intimately aware that it is not the same as any other 172. This is a
big difference between what pilots who want to fly my plane are told and what
pilots stepping into a 737 Max are (or were) told.

Which really implies that no such procedure existed for the Max, and that's
just not true. There may have been problems with the procedure, as far as I
know it's not fully known yet why that procedure didn't prevent the accidents,
possibly inadequate pilot training.

[https://www.flightglobal.com/safety/trim-cut-out-puzzle-
emer...](https://www.flightglobal.com/safety/trim-cut-out-puzzle-emerges-from-
ethiopian-crash-probe/132175.article)

~~~
roelschroeven
> it's not fully known yet why that procedure didn't prevent the accidents

The only way in the 737 Max to disable MCAS is to disable electric trim
altogether. That means only manual trim is available. When MCAS has trimmed
all the way to one of the extremes (as happened in the Ethiopian flight), it
requires many revolutions of the trim wheel and superhuman strength to return
it to a more appropriate setting (see e.g.
[https://youtu.be/aoNOVlxJmow?t=750](https://youtu.be/aoNOVlxJmow?t=750)).

Previous generations of the 737 had two buttons related to the electric trim:
one disabled the automatic trim system, the other one disabled electric trim
altogether. The 737 Max still has two buttons, but they both have the same
effect: disable the electric trim altogether. The old system could easily have
prevented the crash by disabling MCAS but leaving the electrically assisted
trim enabled.

Boeing's procedure to make manual trim adjustments possible is to put the
airplane in a dive, because that lowers the aerodynamic forces on the trim
surface. When you're only a few hundred feet above the ground, that's not
really an option. The pilots re-enabled the electric trim in the hope they
could use it in their fight against MCAS. It's the only thing they could think
of in the stress of the moment, faced with near-certain death and an airplane
actively trying to crash.

In hindsight, after reading accident reports and thinking about it in the
comfort of a chair that's not crashing, there were things they could have done.
Lowering the throttle would have lessened the forces on the trim surface,
activating the flaps once the speed was low enough would have disabled MCAS. I
can't blame the pilots for not coming up with that in the heat of the moment.
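
The difference between the two switch arrangements described above can be
written down as a small truth table. This is a simplified model of my
description, not of the actual wiring; the class, function and parameter names
are all made up for illustration:

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class TrimState:
    automatic_trim: bool       # automatic inputs (such as MCAS) can move the stabilizer
    pilot_electric_trim: bool  # the pilot's electrically assisted trim still works


def older_737(auto_cutout: bool, main_cutout: bool) -> TrimState:
    """Earlier 737s as described: one switch for automatic trim, one for everything."""
    if main_cutout:
        return TrimState(automatic_trim=False, pilot_electric_trim=False)
    return TrimState(automatic_trim=not auto_cutout, pilot_electric_trim=True)


def max_737(switch_a: bool, switch_b: bool) -> TrimState:
    """737 Max as described: either switch cuts all electric trim."""
    powered = not (switch_a or switch_b)
    return TrimState(automatic_trim=powered, pilot_electric_trim=powered)


wanted = TrimState(automatic_trim=False, pilot_electric_trim=True)
switch_positions = list(product([False, True], repeat=2))

print(any(older_737(a, b) == wanted for a, b in switch_positions))
print(any(max_737(a, b) == wanted for a, b in switch_positions))
```

In this model the state "automatic trim off, electric trim still available" is
reachable with the older arrangement and unreachable with the Max one.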

~~~
ppod
All I said was that as far as I know it isn't fully known. The final report
isn't out yet. I'm just an aviation enthusiast, but it's notable that in many
discussions anything that could possibly be perceived as even possibly
implying any pilot fault is kind of taboo. I'm not saying the pilots did
anything wrong, I'm just saying that leaving the stab cut-out switch out of
the story seems strange to me.

~~~
roelschroeven
I don't really understand what you mean by "leaving the stab cut-out switch
out of the story". Can you point me to a report that ignores the stab cut-out
switch?

~~~
ppod
I meant the OP IEEE article that these comments are on.

