
737 Max Explanation by a Software Engineer - paulsutter
https://twitter.com/trevorsumner/status/1106934369158078470?ref_src=twsrc%5Etfw%7Ctwcamp%5Etweetembed%7Ctwterm%5E1106934369158078470&ref_url=https%3A%2F%2Fwww.zerohedge.com%2Fnews%2F2019-03-17%2Fbest-analysis-what-really-happened-boeing-737-max-pilot-software-engineer
======
JorgeGT
I'm an assistant professor of aerospace engineering and I find this analysis
quite spot on, in that this is representative of a much larger issue of
economic and regulatory negative incentives, rather than just a "software
issue" as some news outlets have reported. What I find downright criminal is
this:

> _Boeing sells an option package that includes an extra AoA vane, and an AoA
> disagree light_

The fact that the redundancy of a sensor on which a system capable of sudden,
large control inputs relies is an optional package to be purchased
separately... I simply have no words.

How was this package advertised in the brochure? Pay extra and when the
airplane nosedives at high speed, this useful indicator will helpfully warn
you it's because of an AoA reading disagreement?

~~~
Animats
I was amazed at that. Boeing used to be known for overdesign for safety. The
B-747 had four redundant hydraulic systems. Here's a 787 doing aerobatics at
the Farnborough air show to show that it can operate way outside the normal
passenger aircraft flight envelope.[1]

Boeing used to be an engineering-first company. HQ was at Boeing's own airport
near Seattle. Then they got new management and moved corporate HQ to Chicago.

[1]
[https://www.youtube.com/watch?v=vzr313wSY_Y](https://www.youtube.com/watch?v=vzr313wSY_Y)

~~~
tyingq
Similar video with a test pilot pushing a 707 pretty hard. Includes a barrel
roll, which isn't hard on the aircraft, but unusual to see with a passenger
aircraft.

[https://youtu.be/Ra_khhzuFlE](https://youtu.be/Ra_khhzuFlE)

The 707 was pretty similar to the 737...same main fuselage dimensions, similar
pax capacity, etc.

~~~
nickgrosvenor
Love the 4k footage.

~~~
tyingq
Ah, well, it is from 1955.

------
gok
I really wish people would wait for the report before drawing conclusions like
this. These investigations take a long time, and it's often not the issue that
gets circulated on Twitter.

AirAsia 8501 was widely suspected to be caused by a thunderstorm. Wired [1]
and WaPo [2] still have articles up blaming the weather. When the
investigation came out a year later, it turned out to have nothing to do with
weather. The fly-by-wire system malfunctioned and the pilots got confused.

[1] [https://www.wired.com/2014/12/airasia-qz8501-thunderstorms/](https://www.wired.com/2014/12/airasia-qz8501-thunderstorms/)

[2] [https://www.washingtonpost.com/news/capital-weather-gang/wp/...](https://www.washingtonpost.com/news/capital-weather-gang/wp/2014/12/28/the-challenging-weather-conditions-when-airasia-flight-8501-disappeared/)

~~~
the_mitsuhiko
The analogy does not make much sense because the majority of what is in this
twitter thread is not new information or disputed. We also know that boeing is
fixing it by a software patch already.

~~~
losvedir
Eh, we still don't really know if MCAS is the cause of the Ethiopian crash,
though. Some things point to it (flight fluctuating up and down, jackscrew
found with full nose down trim), but some things are different, too, like
crazy acceleration and handling issues right from takeoff, when the flaps
would still be up and MCAS inactive.

~~~
spectramax
Even if Egyptian Air didn't crash their plane, the Lion Air investigation alone
exemplifies systemic negligence, not from the software standpoint, but from
the top-down executive level and the negligence of the FAA.

So, your point that we need to wait for Egyptian Air's investigation is valid,
but misplaced because of the aforementioned argument.

~~~
croisillon
*Ethiopian

------
rdiddly
This is a "Twitter sucks" off-topic rant comment, so if you're not interested
in that, just move along. At no point in my reading on this topic or any
other, did I say to myself "Boy this thing would be great if it were broken up
into a series of small brainfarts and served up one at a time on a bloated,
slow-as-molasses web platform." I'm embarrassed every time someone tries to
express a complex thought on Twitter. It's like a machine that turns your
thoughts instantly into listicles. And every time I go view something there,
I'm astonished all over again at the dismal user-experience people put up with
in exchange for "access" to a "network." (Facebook is worse... it looks like
some crap I built for my big-company employer. sic. I'm not much in the front-
end department, so yes, my UI sucks balls. But _my_ users _have_ to use my
app, and they get _paid_ for doing so. Facebook users, I can only weep at the
thought. But I digress. This was supposed to be about Twitter.)

~~~
dilap
I'll take a contrary opinion -- forcing each thought into a tweet is a nice
constraint that compels people to get to the point. This would probably be
less well-written as, say, a Medium article.

~~~
rdiddly
I actually agree with that - it's an interesting exercise to communicate
concisely within a limit. (Sort of like Vine, which Twitter destroyed, but
oops there I go getting smart-assy again.)

It's just, if that's your game, stick to the game, don't cheat by sprawling
across 14 of those. It fails as an instance of the Tweet artform because
cumulatively it's too long, and it fails as a longer-form piece because it's
all broken up.

If I strung together 1780 Vines to make Fellowship of the Ring (and yes,
nerdily, the math works out there), what have I proven? My powers of
conciseness and economy? My respect for rules and limits? My ability to choose
the right tool for the job?

------
VBprogrammer
Often times as a Software Developer I encounter a bug which has an obvious two
line fix. Rather than implementing that though I often spend another few
minutes digging into how and why that bug was introduced. Often times I'm left
with a greater understanding of the problem or encounter a requirement that
the previous developer was trying to implement that my fix would have broken.

Other developers will simply assume the previous developer was an idiot and
bash in the fix.

I feel like in this case a lot of people are assuming the engineering team
were idiots, or criminally trying to make an aircraft which didn't pass safety
standards. Rather than taking a look at what caused the bug in the first
place.

~~~
rlabrecque
I deal with this constantly. Someone gets a bug report for a crash, let's say
a null pointer dereference, so often I see:

> if (pPointer == nullptr) { return; }

> Crash is fixed!

I mean sure... but that's not the problem. Why was pPointer null here in the
first place? So few people take the time to understand that :(
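
A rough Python analogue of the same pattern, as a sketch only: `Session`, `greet_band_aid` and `load_session` are made-up names for illustration. The first function hides the missing value, the second fails where the missing value originates, which is usually where the real question ("why is this missing?") can be answered.

```python
class Session:
    """Hypothetical object that some caller expects to exist."""
    def __init__(self, user):
        self.user = user


def greet_band_aid(session):
    # The "crash is fixed" version: silently does nothing when the
    # session is missing, so the underlying bug stays invisible.
    if session is None:
        return None
    return f"hello {session.user}"


def load_session(store, key):
    # Root-cause version: fail at the point where the value goes missing,
    # with enough context to find out why it was never created.
    try:
        return store[key]
    except KeyError:
        raise LookupError(f"no session for {key!r}") from None


print(greet_band_aid(None))               # band-aid: no crash, no clue
print(greet_band_aid(Session("alice")))
```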

~~~
aphextron
>So few people take the time to understand that :(

Because "fix this null pointer exception" is ticket number 14 this week, and
your PM just wants it checked off. They don't want to hear that you need
another week of digging through layers of spaghetti to track down the source;
that doesn't bode well for their KPI goals.

This is a systemic issue in the way software companies function.

~~~
quickthrower2
We can blame management but sometimes the developer just doesn’t give a f* or
it doesn’t fit their agenda. I guess both are ultimately management issues,
but it’s a shared responsibility.

~~~
thatoneuser
Ultimately it doesn't matter if your engineers don't give a fuck if management
will sabotage their efforts when they do care. Only if you have management who
won't forgo quality for ticks can you really blame the dev.

------
userbinator
More information about the MCAS than you probably ever wanted to know:
[http://www.b737.org.uk/mcas.htm](http://www.b737.org.uk/mcas.htm)

That page includes this noteworthy and unusual design decision:

"MCAS is implemented within the two Flight Control Computers (FCCs). The Left
FCC uses the Left AOA sensor for MCAS and the Right FCC uses the Right AOA
sensor for MCAS. Only one FCC operates at a time to provide MCAS commands.
With electrical power to the FCCs maintained, the unit that provides MCAS
changes between flights. In this manner, _the AOA sensor that is used for MCAS
changes with each flight_."

~~~
mannykannot
> the AOA sensor that is used for MCAS changes with each flight.

My first thought was that, in the Lion Air case, it happened both on the crash
flight and the one before - but an attempt was made to fix the problem between
flights, so the FCC may well have been powered down (alternatively, maybe both
sensors were faulty.)

------
rwhitman
One of the trends I find most disturbing in business over the last few years
is the nonchalant passing of the buck on hard business problems, down the food
chain to software engineers.

The Silicon Valley mantra of "software can change the world!" has infected
every corner of our lives but frequently people misinterpret this as "software
can solve anything! (so I don't have to)".

Software engineers also tend to eagerly say "yes" to solving every problem
with code, when sometimes a problem just can't be solved with code. Thus
compounding the issue.

I'd argue that many of the macro problems in our world right now stem from
this cycle.

My PSA to all devs - if someone asks you to patch a major business problem
with software, push back. Sometimes a puzzle to solve, isn't _your_ puzzle to
solve. Send it back up the food chain. You don't have to say yes to
everything.

~~~
thatoneuser
That's easy enough to say while talking about crashing airplanes. Harder when
your H1B or family's dinner relies on you keeping your job.

I think everything keeps pointing to more punishment for management and
corporate decisions. I mean management doesn't really do the work, they should
at least be responsible. Otherwise it's just a system to attenuate blame.

------
WalterBright
The failure of the MCAS system does not indict using automatic controls to
adjust the flight envelope of the airplane. Lots of systems do that already:

1\. The autopilot

2\. The feel computer

3\. The device that reduces elevator authority at high speeds

4\. The stall stick pusher

5\. Hydraulically boosted controls

Modern jets would not be flyable without these, and the net effect of them is
to make the jet much safer.

The failure of the MCAS system does not indict the purpose of the MCAS system,
either. The problem with it was it continued operating with a failed sensor.
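
To make "don't keep operating on a failed sensor" concrete, here is a minimal sketch of a two-sensor cross-check. It is not Boeing's logic: the function names and the 5-degree disagreement threshold are assumptions picked purely for illustration.

```python
from typing import Optional

# Hypothetical threshold for illustration, not a real certification value.
DISAGREE_LIMIT_DEG = 5.0


def cross_checked_aoa(left_deg: float, right_deg: float,
                      limit_deg: float = DISAGREE_LIMIT_DEG) -> Optional[float]:
    """Return the averaged AoA, or None when the two vanes disagree."""
    if abs(left_deg - right_deg) > limit_deg:
        # With only two sensors we can detect a fault but not tell
        # which vane is wrong, so we refuse to produce a value.
        return None
    return (left_deg + right_deg) / 2.0


def auto_trim_allowed(left_deg: float, right_deg: float) -> bool:
    """The automatic function is inhibited whenever the cross-check fails."""
    return cross_checked_aoa(left_deg, right_deg) is not None


print(auto_trim_allowed(4.0, 6.0))    # vanes agree within the limit
print(auto_trim_allowed(4.0, 22.0))   # vanes disagree: inhibit and alert the crew
```

Two sensors only allow detecting a disagreement; picking the correct reading would need a third source to vote with.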

------
mcguire
"Hey, Bob, we need you to write the software for this system. It's based on
one, non-redundant sensor and can move the elevator trim to an extreme
position. Sound good?"

"Sure, no skin off my nose."

Isn't software engineering a wonderful field to be in?

~~~
JustSomeNobody
Except that's not really how it works

Bob to team: "What should happen when the two AoA sensors disagree?"

Team: "We should alert the pilot"

Manager: "We can't alert the pilot because the manual will need to change.
What if we only use one sensor?"

Bob: SMH

Team: "That's not a good idea. We need redundancy."

Manager: "Well, we're not alerting the pilot. Use the one sensor."

Bob: Writes the code to use only one sensor.

~~~
TheHypnotist
This shouldn't happen in a vacuum. Who's writing the requirement? The test
case?

------
sunnyP
AoA = Angle of Attack
[https://en.wikipedia.org/wiki/Angle_of_attack](https://en.wikipedia.org/wiki/Angle_of_attack)

------
credit_guy
Just in case anyone is wondering why more efficient engines are bigger: the
energy is quadratic in speed (mv^2/2) while the momentum is linear (mv). For a
given amount of energy (which comes from burning fuel) you can choose to push
the airplane forward by pushing air back in 2 ways: 1. less mass, more speed,
2. more mass, less speed. It turns out 2 is better, for example you can push 4
times as much mass for half the speed, which results in twice the momentum of
the air pushed backwards. Now the amount of air you can push is the amount of
air you can get, and that's proportional with the front area of the engine.
So, you always want to have as large an engine as possible. Bonus: the larger
the engine, the slower the air moves through it, and so the less noise it
produces. When you read that engines have become both more efficient and more
quiet over the years, the second part was just a nice side-effect of the
first.
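
A quick numeric check of the "4 times the mass at half the speed" example, as a sketch. The mass and speed values are arbitrary placeholders, not real engine data.

```python
def kinetic_energy(mass, speed):
    """Energy needed to accelerate the air: m * v^2 / 2."""
    return 0.5 * mass * speed ** 2


def momentum(mass, speed):
    """Momentum given to the air (what pushes the plane): m * v."""
    return mass * speed


m, v = 1.0, 100.0                 # small, fast stream of air (arbitrary units)
big_m, slow_v = 4.0 * m, v / 2.0  # four times the mass at half the speed

energy_ratio = kinetic_energy(big_m, slow_v) / kinetic_energy(m, v)
momentum_ratio = momentum(big_m, slow_v) / momentum(m, v)

print(energy_ratio)    # same energy spent
print(momentum_ratio)  # twice the momentum delivered
```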

~~~
Gibbon1
This is one of the reasons I don't poo poo hybrid electric aircraft. With
electric you can drive two or more fans off one turbine. Which allows you to
increase the bypass ratio. As you mentioned the gains from that are quadratic
where the efficiency penalty is linear.

Notable is that using larger-diameter, high-bypass-ratio engines is what led to
the 737-MAX design compromises.

------
abalone
Sure, it’s a system failure not strictly a software failure, but I don’t think
the Boeing software engineers are off the hook here. Software is where the
whole system comes together. Software is what can mitigate sensor failures.
Software is the top of the stack that gets certified for reliability.

A good safety culture will not have even a whiff of a “not my job” attitude.
The software team should never have signed off if they noticed that a single
sensor failure could cause their “correct to spec” program to crash the darn
plane (if that’s indeed what happened).

~~~
autopilotsw
I think when people say software error it is in a general sense. It means the
problem is in the software as opposed to hardware. There are different types of
software errors. A software error can be a coding error or a bad requirement.
In this case the requirements are the issue. In my career we had safety,
systems and software engineers. An experienced software engineer might have
challenged the requirement in this case but the design safety would fall more
on the system and safety engineers.

------
djvu9
I think a lot of the discussions are missing the point. The mcas system itself
is indeed just a duct tape for a _known_ design defect, ie using a new engine
on an old body. It is like you replace a part in your car, find it over
heating, and put an ice bag on it. The planned software “fix” is something
like changing the volume of ice. I think it is a dead end and it is scary.

~~~
macspoofing
>ie using a new engine on an old body.

Would you ever be surprised if an old car got brand new tires? No? Then why do
you find it so surprising that engine manufacturers would build new engines
for existing airliner designs?

~~~
sundvor
That's more like fitting oversized wheels / tires that will rub into the well
/ body every time you hit some proper bumps. Sooner or later they will fail,
spectacularly.

~~~
macspoofing
Obviously the analogy breaks down once you start unpacking it.

Question to you though, what makes you so sure that this is in fact what
happened here?

------
aphextron
All of this stems from a pointy-haired marketing decision to push the MAX as
an upgrade to current technology requiring no new training for airlines. If
they had made the ethical decision to seek a new type rating and force every
pilot to be trained in the MCAS system, 400 people would still be alive. Those
executives have blood on their hands, and they know it.

~~~
Xixi
The ET pilots were trained with the MCAS system.

But to speculate: ET pilots seem to have experienced issues before the MCAS
would be active, so it's not unlikely that there was another issue with the
plane, compounded by the MCAS kicking in when pilots were already fighting
the aircraft...

------
linuxftw
No new information here, just another person pretending to be some
authoritative source on this.

There's 0 proof the software worked correctly or that it's fit for purpose
whatsoever.

------
CodeSheikh
Fixing an "aerodynamic" problem with a "software" solution is already cutting
over to a different problem domain and it will lead to unforeseen
circumstances. What can go wrong, will go wrong.

People at Boeing who made decisions for this project whether it is a team lead
or a test lead or a project manager or a sales exec or a CEO; are all equally
blamed for this. These deaths are on their conscience.

~~~
macspoofing
>Fixing an "aerodynamic" problem with a "software" solution is already cutting
over to a different problem domain and it will lead to unforeseen
circumstances.

I have a hard time parsing this. A modern airliner is a conglomerate of
physical aerodynamic design, electronics and software. I am not convinced that
something like MCAS is so out of the norm from modern aviation design
principles.

>People at Boeing who made decisions for this project whether it is a team
lead or a test lead or a project manager or a sales exec or a CEO; are all
equally blamed for this.

Maybe. Or maybe there is no actual underlying problem. Or maybe the problem
has nothing to do with the MCAS system. Let's wait a little and see how it
plays out.

------
gnulinux
What's the reason people write long stuff like this on Twitter? Literally
unreadable.

------
ksajadi
With this line of argument pretty much nothing is a software issue since
software is mostly there to compensate for something else: speed, errors,
efficiency, manual labour, etc?

Highlighting the facts behind the design decisions of 737Max 8 is good for
general knowledge but doesn’t help with much else in this context.

To follow this line of argument, I’d claim that this is the fault of old
airports that didn’t have jetways so 737 had to be designed with lower body to
allow folding stairways and so on...

~~~
D_Alex
Yes... and furthermore: it seems that a key problem was:

> MCAS can make huge nose down changes

This, to me, is really odd. All the hardware changes could not, AFAICT,
require that. It seems really dumb that the MCAS system was made to be
capable, in principle, of completely overpowering pilot input.

Is it a software problem...? Well, if the MCAS was limited to making only
small changes in the stabiliser position, that could be counteracted by
pilot's input to the elevators, these accidents would not have happened.
AFAICT. It does seem that software contributed to the accidents.
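
A sketch of what "limited to making only small changes" could look like: a limiter that caps the cumulative automatic trim no matter how often the function re-triggers. The class name and the 2.5-degree budget are hypothetical, chosen only to illustrate the idea, and say nothing about the real system.

```python
class TrimAuthorityLimiter:
    """Caps the total nose-down trim an automatic function may command."""

    def __init__(self, max_total_deg: float):
        self.max_total_deg = max_total_deg  # hypothetical overall budget
        self.applied_deg = 0.0              # trim granted so far

    def request(self, delta_deg: float) -> float:
        """Grant as much of delta_deg as the remaining budget allows."""
        remaining = self.max_total_deg - self.applied_deg
        granted = max(0.0, min(delta_deg, remaining))
        self.applied_deg += granted
        return granted


limiter = TrimAuthorityLimiter(max_total_deg=2.5)
grants = [limiter.request(2.0) for _ in range(3)]
print(grants)  # repeated activations cannot exceed the overall budget
```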

------
newscracker
I read the entire thread, but the summary is that this is a harsh indictment
of Boeing, its handling of this aircraft and the accidents. It describes
Boeing as cutting corners in many places, and makes it seem like what has
happened was inevitable (in retrospect).

~~~
RandomTisk
We don't have the official word on what happened yet.

------
gabrielblack
I'm a frequent flier and I'm scared. I try to avoid companies with low standards
by reading all the news about incidents related to the use of used or
counterfeit parts, lack of maintenance, etc. But now it's clear that, in
times of low-cost companies, cheap airplanes are in demand and even the
redundancy in critical subsystems is sacrificed both by the producer and the
airline that didn't pay for an "optional" that actually is a lifesaver.
How many critical subsystems lack redundancy to reduce the cost of
airplanes? Maybe some regulation in this market is needed to avoid other
disasters like this one, imposing standards for critical systems and denying
routes to airplanes that do not meet the specifications. I don't think
that the market can play with human lives.

~~~
7952
This kind of compromise is made all the time. In the past you needed 4 engines
to cross oceans, now it is normal to just have two. It saves lots of money,
and has been a massive success in terms of safety. The industry generally
seems to get these compromises right when you look at the amazing safety
record.

------
scoutt
_"we're ... called on to fix the deficiencies of mechanical or aero or
electrical engineering"_

As an embedded and firmware developer, I can tell you that this happens almost
every day. If you ask how it is even possible to fix mechanical issues with
software, know that it is true.

But, you know, this time the electrical engineer screwed up the power supply
and there are noisy glitches everywhere, _we can just fix it with software_ they
say. Or the mechanical engineer designed the plastic cover with the wrong
material and the LED light comes out _ugly_: no problem, _let's arrange the
weirdest PWM sequence with SW so it looks nice_.

This time, people died. Don't throw badly designed systems at us so easily
because _it's just software_.

------
selimthegrim
This makes Boeing sound like VW - “We don’t want people to have to refill
AdBlue except at oil changes”

~~~
Iwan-Zotow
Do you mean DEF Blue?

~~~
selimthegrim
Yes

------
blackrock
I think this is perhaps a serious design flaw with the plane.

Boeing wanted to make the 737 more fuel efficient, but they didn't want to re-
certify the frame, and design a new body. So, they put bigger engines on the
wings. This sounds simple enough.

Except that the engines were too powerful for the frame to handle. So on take
off, these extra powerful engines would push the nose of the plane up, to such
an extreme angle that it could cause the plane to stall, and risk falling out
of the sky.

In order to compensate for this, they introduced software and sensors that
would mechanically adjust the flaps of the plane, in order to help "level out"
the plane. This is probably ok for inherently unstable fighter jets, but for
commercial aviation, a single crash is devastating.

So, this issue is not just a software defect, that can easily be fixed with
code. This is a serious design flaw, where the planes are a death trap just
waiting to happen. There is a mismatch between the geometric placement of the
powerful engines, in relation to where it should be on the plane, in order to
achieve balanced flight, without the need for software to auto-correct for an
excessive nose-up pitch. It was probably only a matter of time, before sensors
start to fail, and the software can no longer handle the situation.

~~~
godson_drafty
This is incorrect. The engines are not too powerful for the airframe. The
problem is that the engines themselves create lift at high Angles of Attack,
pitching the nose of the plane up.

"This new location and size of the nacelle causes it to produce lift at high
AoA; as the nacelle is ahead of the CofG this causes a pitch-up effect which
could in turn further increase the AoA and send the aircraft closer towards
the stall."

[http://www.b737.org.uk/mcas.htm](http://www.b737.org.uk/mcas.htm)

------
yingw787
I wonder how this will affect pre-sales and sales of the Boeing 797.
Apparently, they're going to pull the trigger on whether to build it this
year:
[https://en.wikipedia.org/wiki/Boeing_New_Midsize_Airplane](https://en.wikipedia.org/wiki/Boeing_New_Midsize_Airplane)

I think it would be a good decision to do this. Not only because the 757
design is 50 years old, there are no planes Boeing offers that easily substitute
for the 757, it would fit well alongside the business direction of the 787
(which has proven itself out quite well), but also because it would be a
completely new plane, with few to no band-aids. I would trust a 797 over a 757
refresh, because Boeing would be much more terrified of a new plane with so
much invested capital never achieving market acceptance than an older plane
that has already been sold with money in the bank.

I would also hope Boeing's sales/marketing department understands planes
falling out of the sky is bad for current and future sales growth, and now
appreciates the difference between a properly safe plane and an unsafe plane
with lots of band-aids.

~~~
Xixi
The 757 production line doesn't exist anymore, so a 757MAX is completely off
the table. Boeing even refused to build more passenger 767s even though the
line is still running (for freighters and KC-46). The 797 as it is currently
shown to prospective airlines is closer to a 767 replacement than a 757
replacement.

The real kicker is: if 737MAX becomes a hard case with lots of cancellations,
or Boeing simply cannot sell it any further without cutting the price too
much, Boeing will have to build a replacement from scratch sooner rather than
later. The nickname for this project is NSA (New Single Aisle, I think). Or
Boeing could try to build both at the same time (similar to what they did with
757 and 767).

The 797 is in an interesting situation: I believe Boeing is sitting on an
incredible plane from a technical perspective, but the business case is hard
to close. Of course the engineers want to build it: it's an incredible plane.
But in my completely uninformed opinion it would be a mistake: no matter how
great an airliner is from a technical perspective, and how alluring it is to
engineers, it should not end up being a perfect solution looking for a
problem.

Delta really wants the 797, but the design might be a little bit too US-
centric, as if I understand correctly the capacity to haul cargo is sacrificed
to keep flying costs low. But that makes it a complete no-go in the asian
market, and is arguably not very forward looking (assuming ongoing rise of
cargo needs). If the business case is hard to close, Boeing should just move
on and build the NSA.

Airbus made that mistake with the A330NEO. They didn't have a clear business
case, but a couple of customers and lessors kept pushing because they really
wanted it, so eventually Airbus agreed. At least it's a "cheap" mistake,
compared to a clean sheet design...

~~~
ggm
Do you have pointers to A330Neo problems which put it into this fail bucket? I
found stuff about delayed delivery, and I found some scuttlebutt about the RR
engines, but I can't find something which says it's a fundamentally flawed
idea. Bearing in mind that the 787 did not exactly have a stellar launch,
having an Airbus A330 in the space feels to me logical: Many airlines have
pilots trained in the A330.

Oh wait.. Is that what you mean? That there may be lurking differences in the
flight envelope in a NEO to any prior experience on 330?

~~~
Xixi
The problem is very simple: the A330NEO is not selling well at all. It was
supposed to be a cheap alternative to the 787: not quite as good, but much
cheaper. The problem is that Boeing has managed to reduce the manufacturing
cost of the 787 so much that you can essentially buy a 787-9 for the same
price as an A330-900.

The A350 is also suffering from the cheap price of the 787: it is too
expensive, so Airbus has to work hard to lower the manufacturing cost...

~~~
ggm
This is declaration by fiat. Do you have pointers to back this up?

Web searches are much more equivocal. Many pro Boeing but not all.
Observations that by type and training and flexibility an a330 fleet with a
mix of ranges can suit.

Emirates has made big orders.

------
mastazi
If you have 20-ish minutes to spare, I suggest this video about the same topic
by Mentour Pilot who is a 737 NG captain:
[https://www.youtube.com/watch?v=TlinocVHpzk](https://www.youtube.com/watch?v=TlinocVHpzk)

It's not very technical, but very easy to understand. Assumes you have some
basic aviation knowledge e.g. what a stall is and how weight & balance affects
flying.

------
tzs
> If the pilots had correctly and quickly identified the problem and run the
> stab trim runaway checklist, they would not have crashed.

I'm curious how long it takes to run that checklist, and how much altitude
would be lost while doing this? How long does it take to reach sufficient
altitude to have time for this?

Also, I have a question about stall recovery and altitude. Are there any
altitudes for which it is better to go ahead and stall and fall flat out of
the sky than to nose down and risk flying into the ground at above terminal
velocity? If so, do any automatic systems on any planes recognize you are in
such a "must crash" situation and try to pick the least worst crash?

~~~
gvb
Video of training for runaway stabilizer trim:
[https://youtu.be/3pPRuFHR1co](https://youtu.be/3pPRuFHR1co) (time 2:45)

* The clanking sound is the stabilizer trim "runaway". In the video, it starts while the video is zoomed in; when the video zooms out you can see the trim wheels (next to their legs) spinning.

* The trainer (left hand seat) says "rudder", but he means "stabilizer" (he says it correctly later in the video)

* The pilots in a real plane would likely not hear the noise because they will have noise canceling headsets on, but the manual trim adjustment is the big wheel next to their leg that spins very visibly and they would feel the trim pushing the plane's nose down

* The stabilizer trim adjustment is relatively slow - it takes just under 10 seconds to travel end-to-end, so runaway time is going to be at least five seconds.

------
OJFord
URL has a load of junk on the end of it.

Suggest change to:
[https://twitter.com/trevorsumner/status/1106934369158078470](https://twitter.com/trevorsumner/status/1106934369158078470)

------
zyngaro
I used to think the aerospace industry was the most safety-conscious industry
because people trust manufacturers with their lives, but now Boeing is selling
an essential feature like sensor redundancy as an option to make extra money.

~~~
throwaway808080
Kind of how Mixpanel used to sell single sign on security at an extra price
and free users didn’t get it.

Then they got hacked and shit blew up on their face.

Safety and security aren’t add-ons. It seems that in the name of making a bit
of $$$, Boeing cut corners and led to loss of life.

------
zyngaro
Given the relative shortage of talented software engineers in a world
where software is eating the world, I find it worrying that aircraft are
increasingly relying on software systems to make them airworthy.

------
nbevans
Whilst this is an interesting read and is almost certainly largely true it
does not entirely square with the fact that Boeing are working on (and the FAA
expect to certify) a software fix by April - as noted in their press release.

Software bugs or not - it does seem a major factor was the lack of an extra
"AoA sensor" and a "sensor disagreement indicator". Presumably a very low cost
option in reality that Boeing should have made standard fitment at least for
the first year or so whilst they worked out any kinks in the MCAS system.

~~~
ncallaway
The fact that Boeing is working on a software fix doesn't contradict the
thread. It's explicitly mentioned in the thread.

[https://mobile.twitter.com/trevorsumner/status/1106934422249...](https://mobile.twitter.com/trevorsumner/status/1106934422249582593)

> Nowhere in here is there a software problem. The computers & software
> performed their jobs according to spec without error. The specification was
> just shitty. Now the quickest way for Boeing to solve this mess is to call
> up the software guys to come up with another band-aid.

(some related follow up tweets)

> I'm a software engineer, and we're sometimes called on to fix the
> deficiencies of mechanical or aero or electrical engineering, because the
> metal has already been cut or the molds have already been made or the chip
> has already been fabed, and so that problem can't be solved.

> But the software can always be pushed to the update server or reflashed.
> When the software band-aid comes off in a 500mph wind, it's tempting to just
> blame the band-aid.

> Follow @davekammeyer if you want to dig in.

~~~
buchanan
I don’t get this thinking, if you’re in the bandaid making business, maybe
make sure it doesn’t cause an infection ?

In this case the software was developed to compensate for the system
characteristic; it did not fully do that. Of course, it is immensely
frustrating that software is always called upon to do the papering over, but
that is another issue.

~~~
ncallaway
Sure, I'm not necessarily endorsing the thread.

I'm just saying the existence of an in progress software patch in no way
contradicts the thread itself.

It's part of the premise of the thread itself.

~~~
buchanan
I was looking at it from the narrow view that it was to do A (the papering
over), and it did not do that (fully).

On second thought, it’s more a systems engineering issue not to take that case
into account. Software engineering doesn’t get off scot free though, as they
are an important voice.

With the presumably tight engineering controls that are practised, I can
speculate that it may have fallen into “the pilot disables and takes over
control” branch. The gap would then be that they did not think that the
airlines would be given the option to “not” install the sensor failure
warning.

------
toomuchtodo
[https://threadreaderapp.com/thread/1106934362531155974.html](https://threadreaderapp.com/thread/1106934362531155974.html)

------
NikkiA
What worries me is if we're going to see some kind of relevant similarity with
the 777X which afaik is 'type rated as the same as the existing 777', despite
having new mechanical processes to fail (wing fold) and entirely new
undercarriage.

Still, the 777X is still some way from hitting customers, so maybe Boeing will
spend some time contemplating the way they gamed type rating with the MAX
before hitting customers.

------
jshowa3
What do you expect? Boeing is so integrated with the government that it's
hardly surprising that poor regulatory decisions influenced the crashes. In
fact, that's like every project. Nobody willing to assert themselves and say
no. So they start integrating a bunch of unnecessary systems to compensate for
flaws until you get one, big giant mess that you can't control with a deadline
looming.

------
throw0101a
Whenever there's talk about "causes" of things, it makes me wish that more
people had studied Aristotle and his four causes:

* [https://en.wikipedia.org/wiki/Four_causes](https://en.wikipedia.org/wiki/Four_causes)

This was a pretty good example of material, formal, efficient, and final
causes.

------
cletus
This story keeps getting worse and I was already shocked and stunned beyond
belief.

\- Boeing recycles the 737 airframe, moving the engines. This seems largely about
reducing costs, decreasing time-to-market and (importantly) maintaining a
common type rating.

\- To compensate for the engines moving, which could cause the nose to dip,
they add a software solution (MCAS) that could dip the nose without really
telling any pilots or airlines. Worse, it's based on a single input (well, one
of two but it only listens to one at a time), this being the AoA sensor.

\- Blaming pilots for the Lion Air disaster. Whatever the truth, that's
certainly premature.

\- Boeing refusing to ground the aircraft after the second crash.

\- The FAA apparently complicit in this until it finally capitulates to the
inevitable and grounds the plane after Europe and several others already have.

\- The hubris of not wanting to appear wrong or like they're capitulating to
public pressure, Boeing sticks to their guns til the bitter end.

\- An AoA sensor upgrade as an option for what is arguably a critical system.

What's also fascinating is all the Boeing apologists who have come out of the
woodwork (eg [1]). I've seen comments about how the airlines "demanded" the
737 MAX. There might be a demand for a low-cost narrow body passenger jet and
I'm sure that's the reason the 737 MAX was developed. Anecdotally, it seems to
be terrible for passengers (eg [2]), which would certainly be compatible with
the idea that this is a low-cost solution.

It's also worth mentioning the rudder issues of the 737 that were posted here a
few days ago [3].

I honestly don't understand how Boeing's management can be so reckless with
the hard-earned reputation for safety. They've done so much damage to their
brand with this that if it wasn't for the fact that hundreds of people have
died here, Airbus would be laughing all the way to the bank (or at least it
would take the edge off the giant A380 boondoggle).

As much as pilot error has been a significant cause of air disasters (eg
experienced pilots pulling the plane up to cause a stall as in the Air France
crash), you get a sense of how hard it would be to fully automate piloting a
plane. What I find disturbing is how hard overriding automated systems seems
to be. When a plane's automated systems fail, shouldn't a pilot be able to
easily take full manual control? I would've thought so. You see examples of
this like Qantas Flight 72 [4].

And flying a plane is in some ways a much simpler problem than driving a car.
You take off, you fly a predetermined route and you land. There are some
adjustments for weather and other factors and occasionally you have to turn
around or deviate and make a landing. I'm obviously oversimplifying here but
cars seem to have so many more corner cases here. People seem to think
autonomous cars are right around the corner. I'm not so sure.

[1]
[https://news.ycombinator.com/item?id=19389791](https://news.ycombinator.com/item?id=19389791)

[2] [https://thepointsguy.com/2017/11/first-look-aa-boeing-737-ma...](https://thepointsguy.com/2017/11/first-look-aa-boeing-737-max-8/)

[3]
[https://news.ycombinator.com/item?id=19385980](https://news.ycombinator.com/item?id=19385980)

[4]
[https://en.wikipedia.org/wiki/Qantas_Flight_72](https://en.wikipedia.org/wiki/Qantas_Flight_72)

~~~
jacquesm
> To compensate for the engines moving, which could cause the nose to dip

The engines could cause the nose to go _up_ , leading to a higher chance of
stalling the plane. The nose would pitch up because the engines are below the
center of gravity, which is the point the plane rotates around when engine
power increases. To offset that they came up with the idea of changing the
trim.

------
dandare
* Management problem. The senior executive who smooth-talked every department into bending their own rules, using phrases like "working together as a team", "focusing on the solution, not the problem", "agile" and "MVP", was hailed as a hero and financially rewarded.

------
evilotto
It's not a software problem. It's a software _engineering_ problem. It's the
attitude of "it met the specifications, so I did my job and it's not my fault"
that separates this kind of software "engineer" from the likes of William
LeMessurier and Bob Ebeling.

------
tbeutel
Are attitude indicators or gyroscopes used as inputs to automated systems?
Or are only external sensors used?

------
jacurtis
If you found it difficult to read the twitter thread, this is the same thing
in blog format.
[https://threadreaderapp.com/thread/1106934362531155974.html](https://threadreaderapp.com/thread/1106934362531155974.html)

------
swiley
Yet more deaths because people aren't looking at what the software controlling
their lives actually does (in this case, ignoring extra sensor readings that
could indicate a failure of one of the sensors.)

I feel like it's getting better in most industries but not in things like
aerospace.

------
platz
Likely, the reason the 737max story receives so much attention is because
software devs (consciously or not) feel this could affect our industry.

There may even be some guilt involved (justified or not), if it involves
software in any meaningful way.

Various community members have been warning for some time that we'll face
regulation sooner or later; all that needs to happen is a sufficiently large
disaster. The dependence between life-critical hardware and software will only
increase in scale.

Whether or not this begins our "Iron Ring" moment, I think it's something devs
implicitly feel, and is culturally resonant for them.

\---

> my brother in law @davekammeyer, who’s a pilot, software engineer & deep
> thinker.

> I'm a software engineer

The thread does feel a little defensive, no?

I'm not saying that software was the cause, or even the main cause; though
even if the other causes appear to be the precipitating factors, we should be
on the lookout for defensiveness, without knowing the _whole story_.

* I don't care that in this case, the software was not to blame - that is not the main point I am making.

~~~
sbarre
I wouldn't say defensive, but rather informative.

The main point of that thread is not actually to "solve the mystery" of the
crash, or even to point fingers at where the fault lies.

The point I took away from the thread is to show that these issues are
complex, and there is never one single thing you can point to as "the
problem". In most cases, it is really a series of interconnected events.

Our media (and I think many of us - so I'm not singling them out) loves to
simplify problems in an effort to make them understandable by the average
person, and while that may be necessary for them to get people to pay
attention, it does us all a disservice in the long-term I think.

~~~
mcguire
...as long as you aren't pointing at the software engineer, whose products did
exactly what they were supposed to do.

~~~
platz
Ah yes.. the code meets the requirements. It follows the spec. No further
involvement necessary.

And if Boeing does release a 'software upgrade', what is there to say about
why the software wasn't required to be that way in the first place?

------
systemBuilder
"EVERY HARDWARE BUG IS _FIXED_ IN SOFTWARE" \- Motorola Iridium management in
the late 1990s. The hardware was so bad when they upgraded from 68040 cpus to
powerpc 603 they got a 0% improvement in performance despite the ppc603 being
2x faster ...

------
zakki
So someone built a (traditional) weight scale. Then he can’t find the balance.
Thus he added a gyro sensor and developed software for it. And they call it a
modern scale. When somehow the software didn’t deliver the balance, they say we
will fix the software.

------
cmurf
This is why speculation needs to come with a warning label in advance, or
people retreat to their corner and start attacking to defend.

A student pilot is hyper aware of the economics of aviation. It's frigging
expensive. The only thing cheap and plentiful is the opinion of another pilot.

The tweet author's shotgun spray of possible things to blame other than
software, is fine speculation, because aviation accidents are rarely simple.
Likely more than one thing happened. But the author rapidly falls into a trap
of making claims that are not yet in evidence.

Alpha sensor failure? There's a categorical statement the sensor failed? How?
Was it transient? What was the range of the readings? How did the software
interpret it? Could the sensor data be corrupted in between the sensor and the
interpreting software, i.e. in the communication pathway?

Author says Lion Air pilots weren't informed of MCAS. Southwest Airlines
pilots weren't informed either. I haven't heard for sure whether AA pilots
were informed. Of the pilots who were informed, what exactly were they told?

 _If the pilots had correctly and quickly identified the problem and run the
stab trim runaway checklist, they would not have crashed._

It is not a fact in evidence they failed to do this. We have no idea what the
positions of the stab trim switches were, or the autopilot, and we don't know
when they got into those positions, or in what sequence.

Central to troubleshooting system failures in-flight is buying time. And
buying time means stabilizing the plane. Uncommanded behavior can have dozens
of causes. It's supreme hubris and insulting to propose pilots did something
wrong without evidence, to propose they were incompetent, without evidence. To
assume the failure should have been recognized, without evidence.

What will cause a consistent uncommanded roll? Could be a powerplant failure,
could be control surface failure (could be any of several or a combination).
Could you get a control surface failure that will kill you before you know
what failed? Absolutely. Your job is to rely on training to manage the failure
with a reaction that will stabilize the airplane, to buy a little extra time
so that you can figure out what's failing so you can properly mitigate the
failure. The proper mitigation will inevitably be different than the initial
stabilizing reaction.

What happens when the failure is intermittent? That means you can't trust your
stabilization routine, and therefore you haven't bought time, and therefore
you're switching from death defying reaction to having to think critically.
Both are being perturbed, panic is likely. The _normal_ behavior of MCAS nose
down is not described as consistent if you're not aware of it in advance; you
might need a few minutes to recognize the pattern. Is this for sure recognized
as runaway trim? We have circumstantial evidence that it may not be.

 _Nowhere in here is there a software problem. The computers & software
performed their jobs according to spec without error._

That's a concluding statement for which there's insufficient evidence. It's a
reasonable assumption, because computers and software are expected to be
deterministic, in particular in industrial applications. Obviously the code is
not changing on the fly. And this had to be demonstrated for certification.
But it's still an assumption until all the facts are available, and we have a
high confidence explanation for all of the facts.

Further, the author totally ignores public statements from Boeing and the FAA
that a software "fix" or update, is expected for MCAS, the software routine
under discussion, by the end of April. If there's no software problem, why
update it? Perhaps it's a work around for some other design deficiency. But
could it be fixing a non-deterministic software bug, however unlikely?

This is the folly of speculating on airplane crashes.

------
quickthrower2
Am I the only one who sees a tiny bit of irony that this is posted as a
twitter thread?

------
keymone
Am I missing something? Who claimed it was a software problem, and where?

From what I read it’s always framed as a sensor issue (and a non-redundancy
issue).

It boggles my mind that a plane can be certified to fly with a non-redundant
sensor of such importance. Boeing should go bankrupt paying off victims, and
the management that let that happen should be jailed.

------
krm01
What a wonderful breakdown analysis of the chain reaction one small change can
have.

------
xmly
Let us make it simple.

It is a Boeing problem!

~~~
sundvor
That we know about. I keep thinking, what other gremlins have they got hiding
in the closet?

------
bjowen
But - but - but - software is indistinguishable from magic!

------
caberus
in such a complex system, software is the first thing to save your ass if
there is a problem, and also the first thing to blame

------
sanj
The nose wheel was made 8in higher.

------
dosy
I really don't appreciate this attempt to shift blame away from any group and
onto any other group. It's unprofessional and suggests that working out who we
can point the finger at, and convincing people not to point the finger at our
group, is more important than the tragedy that happened and trying to work out
ways to take responsibility for that. I'm sure all systems involved in the
failure could be improved in some way. To emphasize how one system is not
responsible is not a very empathetic response.

~~~
ww520
The group you are complaining about, the software people, are already being
falsely blamed. This is a rebuttal to that. Just refuse to be a doormat.

~~~
evilotto
Accepting responsibility is not being a doormat. No matter what systemic
faults were in play, the software was a part of it, and if the software
engineers had made different choices - such as refusing to allow a flawed
system to go forward - then the outcome would have been different.

~~~
speedplane
> if the software engineers had made different choices - such as refusing to
> allow a flawed system to go forward - then the outcome would have been
> different.

You're assuming that the software engineers had sufficient information to
identify the system as flawed.

The MCAS problems appear to stem from faulty sensor data; we don't yet know
much more. However, suppose, for example, that the software engineers were
told by the sensor manufacturer that when the sensor had an error, it would
shut off entirely and no signal would be sent. If that was the case, it would
be difficult for the software developers to foresee and account for _incorrect_
sensor data, rather than just no data.
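
To make that distinction concrete, here is a minimal Python sketch. Everything in it is hypothetical: the function name, the "`None` means the sensor failed" contract, and the fallback value are invented for illustration and say nothing about how MCAS or any real sensor interface works. It only shows that a consumer written against a "failure means no data" contract handles a missing reading but passes a wrong one straight through.

```python
from typing import Optional

# Hypothetical contract: the sensor driver returns None when the sensor fails.
FALLBACK_AOA_DEG = 0.0  # illustrative placeholder, not a real procedure


def aoa_for_control(reading_deg: Optional[float]) -> float:
    """Return the angle of attack that downstream logic will act on."""
    if reading_deg is None:
        # The only failure mode the stated contract mentions.
        return FALLBACK_AOA_DEG
    # Nothing here can tell a healthy 4.0 from a stuck or offset 74.5.
    return reading_deg


print(aoa_for_control(None))   # missing data: caught by the None check
print(aoa_for_control(74.5))   # incorrect data: accepted as if it were valid
```

The code is correct with respect to the contract it was given; the gap is in the contract itself.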

In something as complex as a commercial airplane, no one person can know all
the systems. There has to be information "hand-offs", and it's understandable
that the person receiving the information would rely on it.

It's not that different in more prosaic software development. If an API has a
bug in it, it's hard to blame the API users for not accounting for the bug.
You generally trust that the API does what it says it does.

~~~
evilotto
> You're assuming that the software engineers had sufficient information to
> identify the system as flawed.

No, I'm assuming that the software engineers had sufficient information to
know what the gaps in their knowledge might be.

Following your example, if the engineers were told by the manufacturer that an
error in the sensor would result in no data rather than bad data, there should
immediately be followup questions: What is the redundant source of data? What
is the valid range of data? Is there a positive way to detect and identify
errors? How should detected errors be handled? The answers to these questions
should be provided by the manufacturer. It may not be the software engineer's
responsibility to double-check all the answers, but they do need to check that
they were answered in the first place.
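
As a rough illustration of what answers to those questions could look like in code, here is a small Python sketch. The range limits, the disagreement threshold, and every name in it are assumptions made up for this example, not values from Boeing, the FAA, or any sensor manufacturer. It checks for missing data, a valid range, and agreement with a redundant source, and it reports a fault instead of a value when any check fails.

```python
from dataclasses import dataclass
from typing import Optional

# All limits below are made-up illustration values.
VALID_RANGE_DEG = (-20.0, 40.0)
MAX_DISAGREEMENT_DEG = 5.0


@dataclass
class AoaResult:
    value_deg: Optional[float]  # None means "do not act on this"
    fault: Optional[str]        # None means no fault was detected


def in_range(x: float) -> bool:
    low, high = VALID_RANGE_DEG
    return low <= x <= high  # also False for NaN


def validate_aoa(left_deg: Optional[float],
                 right_deg: Optional[float]) -> AoaResult:
    """Cross-check two hypothetical angle-of-attack readings."""
    if left_deg is None or right_deg is None:
        return AoaResult(None, "missing reading")
    if not (in_range(left_deg) and in_range(right_deg)):
        return AoaResult(None, "out of range")
    if abs(left_deg - right_deg) > MAX_DISAGREEMENT_DEG:
        return AoaResult(None, "sensors disagree")
    return AoaResult((left_deg + right_deg) / 2.0, None)


print(validate_aoa(4.0, 5.0))
print(validate_aoa(4.0, 22.0))
```

With only two sources the check can detect a disagreement but cannot say which sensor is wrong, so "how should detected errors be handled?" still needs an answer from outside the code.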

There absolutely needs to be information hand-offs; blindly accepting such a
hand-off does not absolve you of responsibility.

~~~
speedplane
> No, I'm assuming that the software engineers had sufficient information to
> know what the gaps in their knowledge might be. ... if the engineers were
> told by the manufacturer that an error in the sensor would result in no data
> rather than bad data, there should immediately be followup questions ...

Yes, of course there should be due diligence with any hand-off. However, let's
assume for the sake of argument that there was, and the engineers using the
sensor data received appropriate answers to their questions, and yet still the
sensor did not perform as specified.

It's hard to blame someone who did their due diligence, did everything right,
and relied on ultimately inaccurate information.

~~~
evilotto
My point (poorly made) is that there is a difference between _blame_ and
_responsibility_. "Blame" is answering the question "who screwed up";
"responsibility" is answering the question "who is going to make this better".

The original tweet-stream post was making the argument that the software (and
so naturally the software people) did nothing wrong and thus was not to blame,
but also made the argument that since everything else was wrong ("not my
fault!") that there was nothing the software people could have done to make it
better, i.e., they have no responsibility.

------
revskill
That's why software integration testing is necessary for any critical
feature. Treating the airplane as software, I think no integration tests were
run for this case before delivering the actual product to customers.
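
A fault-injection test is one shape such a check could take. The Python sketch below is a toy under loud assumptions: `nose_down_trim` is an invented stand-in control law, the thresholds are arbitrary, and a real integration test would exercise the actual flight computer with hardware in the loop rather than a ten-line function. It only shows the idea of deliberately feeding in a failed sensor and asserting on the system's response.

```python
DISAGREE_LIMIT_DEG = 5.0   # made-up threshold
HIGH_AOA_DEG = 15.0        # made-up threshold


def nose_down_trim(left_deg: float, right_deg: float) -> float:
    """Toy control law: trim nose-down only if both sensors agree AoA is high."""
    if abs(left_deg - right_deg) > DISAGREE_LIMIT_DEG:
        return 0.0  # inhibit on disagreement
    if min(left_deg, right_deg) > HIGH_AOA_DEG:
        return 1.0  # arbitrary unit of nose-down trim
    return 0.0


def test_stuck_high_sensor_does_not_trim() -> None:
    # Injected fault: left vane stuck at 74.5 while the right reads a normal 4.0.
    for _ in range(100):
        assert nose_down_trim(74.5, 4.0) == 0.0


def test_genuine_high_aoa_still_trims() -> None:
    assert nose_down_trim(18.0, 17.0) == 1.0


test_stuck_high_sensor_does_not_trim()
test_genuine_high_aoa_still_trims()
print("fault-injection checks passed")
```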

------
punnerud
On top of this you have the legal system in the US, where you only (?) have to
prove that it is not unsafe, compared to Europe where you have to prove that
it is safe.

You can see how this plays out in how the nations handled the news of the
accidents. Europe put the planes on the ground as a precaution; the US needed
proof that it was unsafe.

------
jorblumesea
Ultimately, the core issue here is corporate greed. Redesigning an entire
plane is expensive and time consuming. Slapping hacks and patches onto an
existing airframe with modifications is cheaper. To use a software analogy,
Boeing hasn't paid down its technical debt. Except instead of broken services
or customer outages, it's people's lives. It's the same reason that many
companies don't pay down their tech debt, it only indirectly helps the bottom
line and it's hard to get business leaders to understand why engineering needs
to redesign something.

~~~
lutorm
It's not really about corporate greed, it's about _people wanting to fly as
cheap as possible_. Redesigning an entire plane is indeed expensive and time
consuming, and that cost would be passed on to the passengers. The airline
industry is very cost-sensitive.

Gradually refining an airframe is done all the time and is not a cause for
concern. If corners were really cut in analyzing the implications of such
modifications, that's where the problem lies. But this describes the vast
majority of airframes in use today, civilian and to an even greater extent
military, and they were often designed in the 70s. (And some very successful
planes, like the C-130 and B-52, were designed in the _50s_.)

~~~
aphextron
> It's not really about corporate greed, it's about people wanting to fly as
> cheap as possible. Redesigning an entire plane is indeed expensive and time
> consuming, and that cost would be passed on to the passengers. The airline
> industry is very cost-sensitive.

Completely regardless of a redesign, the MAX is a perfectly safe aircraft (
_WITH_ the proper additional MCAS training). The reason these people died is
because Boeing didn't want to have to get a new type rating for their modified
airframe, thus making it an easier sell to airlines. Corporate greed killed
those people.

