Boeing Built Deadly Assumptions into 737 Max, Blind to a Late Design Change (nytimes.com)
478 points by razin 10 months ago | 270 comments

I posted this as a comment the other day in another Boeing and MCAS discussion.

In my research into the topic the saddest bit of information I've seen is the image of the black box data for the flight (the first crash): https://i.imgur.com/WJuhjlO.png You can see from the graph that in the final minutes and seconds, the pilot put insane amounts of force on the control column (aka the yoke) to try to pull the plane out of the dive - to save the 189 people on board. But no, MCAS was overpowering and lacked the documentation for the pilot to try anything else.

Also interesting to see is the number of times the pilots bring the nose up, only for MCAS to kick in and force the nose back down. 26 times.

All data from this Seattle Times article, which was written before the second crash occurred: [1] https://www.seattletimes.com/business/boeing-aerospace/black...

If the pilots were putting that much force on the yoke it is pretty apparent that the pilots really truly wanted the plane to do what they asked of it.

I am curious what scenarios drove the designers of the plane to assume that the human in the seat has no idea what they are doing. Was ignoring the wishes of the pilot an attempt to prevent a crazed, irresponsible and unlicensed idiot from doing something? These are trained humans; why does the computer totally ignore their efforts?

Perhaps related crash[0] where the autopilot was disabled because it was listening to user input, in this case a child.

[0] https://en.wikipedia.org/wiki/Aeroflot_Flight_593#Accident

That was because the pilots were not aware that the autopilot had been disabled.

Pulling on the yoke 26 times, as hard as you possibly can, is a bit different.

Oooof. Well the real lesson there is that unqualified people (especially children) shouldn't be in the cockpit of an airliner at all. Autopilot can't be made smart enough to determine if the person operating the controls actually knows what they're doing, and thus to follow/ignore them as appropriate.

I mean, pick your poison. Either the humans are more reliable or the computers are. Either way is subject to some level of fallibility.

The wikipedia article calls out that the compounding sequence of issues started when autopilot was silently disabled, so the human system was in control when the pilots did not intend it to be.

I think the critical issue in both cases is that the wrong system was in control against the pilots' wishes.

So, the pilots can pick my poison based on their training and best judgement (and maybe not put their children in control?).

That was interesting and horrific.

You make it sound like the plane has some kind of agency. Though a plane is very complicated, the treacherous MCAS system is actually very simple.

If it only were that simple. Airbus in particular has a lot of systems which PREVENT the human from doing things.

One plane actually crashed because the prevention system disabled itself and the pilots believed it was still there to protect them from bad actions on their part:

> caused the autopilot to disconnect, after which the crew reacted incorrectly and ultimately caused the aircraft to enter an aerodynamic stall, from which it did not recover


AF447 is fascinating for other reasons as well - on Airbus it is possible for two pilots to provide directly opposite control inputs because there is no mechanical linkage between the sidesticks, unlike on yokes in most other airplanes. It does produce a warning (dual input) but it is not tactile - you cannot feel what the other crew member is doing.

https://en.wikipedia.org/wiki/Armavia_Flight_967 https://en.wikipedia.org/wiki/Indonesia_AirAsia_Flight_8501 https://www.youtube.com/watch?v=8F5bRxTIn24

A pretty extreme UX issue.

It seems a bit rough that you're getting downvotes. [Edit: The parent was grey when I wrote this].

AF447 seems like a pretty important case study for figuring out where to put the balance of trust. Along with the very final stage of Aeroflot 593 (after the child had left the seat), also mentioned.

There's a few more: Colgan Air 3407, AirAsia 8501, Korean 501, etc, where flight crew ignored or overrode systems. Which is not to deny that in the majority of cases the flight crew are the best judge of what is needed, just that one surely has to look critically and honestly at counter examples.

There is of course also Germanwings 9525, though as the GP implies, systems protecting against malicious piloting will probably be counter-productive.

I liked the part where it would turn off the stall warning when the nose was too high, then turn it back on again when the nose dropped, which may have convinced the copilot that holding the stick back was helping.

Direct link to image bypassing imgur’s awful interface: https://i.imgur.com/WJuhjlO_d.jpg?maxwidth=1640&shape=thumb&...

Thank you, forcing everyone onto a landing and wasting loads of data just rendering the garbage page is the worst UX on the planet.

I disagree. It is a free image hosting platform for sharing images. There is nothing wrong with linking to it.

If you upload or want to share what is on the platform, they would prefer if you link it that way, as is their right.

It is a business after all. And if people want to hotlink directly to an image they should pay up and host it themselves.

I don’t disagree that if people want to be able to hot-link images they should pay for the hosting themselves (it’s something I do myself: instead of relying on imgur I just pay to host the files on S3).

I just find it amusing that imgur was born out of being a better image host than the competition, as linking to images on reddit at the time sucked, and now (IMO) has turned to “the dark side” and is doing the things that made the competition so crappy to start with.

I’m not saying they shouldn’t show ads, but I’ve seen and reported countless bad ads on the site (forced redirects away from the site, unannounced “would you like to open the App Store” dialogs, APKs just auto downloading). It seemed at the time that they or their ad network would accept any old crap of an advert (this was quite a while ago; hopefully they are choosier about the ads they run now).

Also their UI is a pain in the arse (IMO) on mobile Safari it’s a pain in the arse to pinch and zoom (not that it will do any good as the image on mobile UI has been resized) and getting to the full sized image is another pain in the arse. For images like above it makes it impossible to read. They said look at the force applied but the wording on the scale is just a blur making the graph hard to understand.

It’s their site and they can do what they like with it. I just find it amusing that (IMO) they have become what they hated. At least they managed to include reddit style communities/comments before reddit started hosting images themselves.

> they have become what they hated

I think that's the fate of all image hosts: current offerings have a lot of (crappy) ads to pay for hosting => a competitor arrives, starts with fewer ads to incite onboarding => the competitor gets a market share => the competitor now has to turn a profit => add more ads (or ask money for hosting)

Thing is imgur was profitable (before reddit became its own image host, dunno what their status is these days, I just know their submission numbers dropped sharply after reddit did that, but maybe their own community was able to recover from that).

Back in the day you used to be able to pay to upgrade your imgur account to allow larger file uploads (which was great for gifs), disable the ads you saw site wide and disable the ads people saw coming to your submissions. I guess the revenue they were generating off ads back in 2015 was outstripping the revenue generated from Pro accounts (esp. with the ability to prevent ads on your submissions; if you were an imgur Pro user you were either trying to support the platform or were a power user getting a fair few visits to your submissions), because that’s when they dropped them, and they were reporting as profitable back then. Which is odd (to me) because it’s been since then that the service (IMO) has gotten worse. To me they had a winning formula and the user base to make it work but fucked it up (or not, they are still around today. So maybe they did the right thing, I’m just an old grumpy user who doesn’t like change. Now get off my lawn, I have clouds to shout at...)

Interesting that there are two angle of attack sensor readings in the black box image when the major issue with MCAS seems to be that it was relying on only a single sensor?

MCAS use(s/d) one sensor at a time, rotating which one is used.
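To make the "one sensor at a time" point concrete, here is a minimal sketch of the difference between trusting whichever sensor is active and cross-checking the two. This is not Boeing's logic; the function names and the 5 degree disagreement limit are made up purely for illustration.

```python
from dataclasses import dataclass

# Hypothetical limit, chosen only for illustration.
DISAGREE_LIMIT_DEG = 5.0


@dataclass
class AoaReading:
    left_deg: float
    right_deg: float


def single_sensor_aoa(reading: AoaReading, use_left: bool) -> float:
    """Return the active sensor's value; the other sensor is ignored."""
    return reading.left_deg if use_left else reading.right_deg


def cross_checked_aoa(reading: AoaReading):
    """Return the mean of both sensors, or None if they disagree too much."""
    if abs(reading.left_deg - reading.right_deg) > DISAGREE_LIMIT_DEG:
        return None  # caller should inhibit automatic trim and alert the crew
    return (reading.left_deg + reading.right_deg) / 2.0


faulty = AoaReading(left_deg=22.0, right_deg=2.0)
print(single_sensor_aoa(faulty, use_left=True))  # the bad value passes straight through
print(cross_checked_aoa(faulty))                 # the disagreement is caught
```

With a single active sensor, recording both values in the black box does nothing to stop a bad one from driving the system; the comparison has to happen in the control logic itself.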

Apart from at the start (when the plane is on the ground), both sensors seem to report the same data?

So why did it crash?

I really see this as a failure of the Systems Engineering process. With so many people unaware of the impacts of the changes, it’s up to the systems types to have the big picture view and make sure these sorts of things are taken into account.

Especially if as the article says a failure of the AOA sensor on the system would be Hazardous (looks like it was Catastrophic when paired with MCAS in retrospect), that would have made the functional Design Assurance level for this system DAL B, which adds enough rigour not only in the software development process but so much before you even get to that in terms of Safety Assessments and ESPECIALLY change impact analyses when the function changes.
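For anyone unfamiliar with the terminology, the severity-to-DAL relationship mentioned above can be sketched as a simple lookup. The mapping follows the usual ARP4754A / DO-178C summary; the `needs_reassessment` helper is a hypothetical illustration of a change impact check, not a real process step.

```python
# Failure-condition severity -> Design Assurance Level, as commonly
# summarised from ARP4754A / DO-178C guidance.
SEVERITY_TO_DAL = {
    "Catastrophic": "A",
    "Hazardous": "B",
    "Major": "C",
    "Minor": "D",
    "No Safety Effect": "E",
}


def dal_for(severity: str) -> str:
    """Look up the assurance level for a failure-condition severity."""
    return SEVERITY_TO_DAL[severity]


def needs_reassessment(old_severity: str, new_severity: str) -> bool:
    """Hypothetical check: True when a change moves a function to a more
    demanding level ("A" sorts before "B", so a smaller letter is stricter)."""
    return dal_for(new_severity) < dal_for(old_severity)


print(dal_for("Hazardous"))
print(needs_reassessment("Hazardous", "Catastrophic"))
```

The point of the second function is that a change in what a function can do should prompt the question of whether the original severity classification still holds.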

For sure there may have been pressure from management to keep MCAS out of the manual, but it’s not really up to the regulatory agency to be experts on the aircraft design. If things are being hidden by the company then I’d consider this bordering on professional misconduct on the part of the engineers overseeing this work.

I say this as a Professional Engineer working as an aerospace systems engineer.

Thinking about this a little more, I also see a failure here in terms of naming things, which I have noticed in my career can be scoffed at but is so important.

As the article says, the function of MCAS changed and its operational envelope was greatly expanded. What if, internally at least, the new system was referred to as MCAS2?

This is somewhere that things can get political, as was exactly the case with the Max, where they did not want anything to be considered a change to the aircraft type, let alone knowing the MCAS system existed in two majorly differing versions.

This is so true. Naming to me is one of the most important decisions. When I see someone who frets over naming decisions for hours, or will sit down with me and really think through what we're going to name something - I know they get the gravity of the situation. I trust those kinds of people much more for big picture architectural or project lead decisions.

Do you have examples off the top of your head of really great names that have come as a result of this process?

Perhaps something entirely different than what you're talking about, but...

I believe changing the aircraft type would trigger regulatory events carrying rather gargantuan costs.

Avoiding those seems to be the entire reason for the existence of the 737 MAX.

Naming things and communicating changes are all core parts of the Systems Engineering process.

Failing to clearly communicate precisely how big this change was, and not making it extremely clear to all stakeholders what was happening and doing analyses of how these failure modes have changed, is really awful.

Admittedly I have 0 familiarity of the internals of a corporate electrical engineering environment, but this is a logically sound idea that in many other industries is standard. OS 10.02, Apache license v. 2, Archer router xyz Rev 02. What's the limiting factor here in the case of MCAS?

There are definitely different versions of software, each with unique identifiers, such as typical version numbers. There’s also the idea of “flight test ready” software, which is able to go onto a flight, vs software that is only to be used in a lab that is not truly flying in the air and controlling the plane.

As a former Systems Engineer who worked on avionics, I would tend to agree; a major change to a system should have at least triggered the integration-level engineers (called the Platform team at my former job) to think hard about potential broad impacts.

It should have resulted in that team actively seeking buy-in, and clear communication that all subsystems comprehended precisely what was changing.

It likely should have triggered some kind of additional scrutiny from the safety organization.

That it didn’t is heartbreaking. It seems like either some common practices were not followed or were rushed.

I can’t see Boeing keeping their CMMI certification level after this news breaks. Certainly some major steps were skipped in the Systems Engineering process.

>Especially if as the article says a failure of the AOA sensor on the system would be Hazardous (looks like it was Catastrophic

From my understanding, this was an intentional decision, as the only way they could certify the airframe without simulator training being required was to feed the MCAS with only 1 sensor.

The only way they could do that is keeping the system overall rated as hazardous, as the Catastrophic rating would require multiple redundancy plus the training.

This can be corroborated from the Australian 60 minutes expose.

>For sure there may have been pressure from management to keep MCAS out of the manual, but it’s not really up to the regulatory agency to be experts on the aircraft design. If things are being hidden by the company then I’d consider this bordering on professional misconduct on the part of the engineers overseeing this work.

That is my conclusion as well.

Thanks for the tip on the AUS 60 Minutes expose. For anyone else interested: https://youtu.be/aO7_indbfME

> engineers overseeing this work

There's the thing: nobody wants to pay engineers on par with managers. Maybe one cannot manage something he/she cannot understand.

Given all the environments I've worked in, including midsize construction projects I'm willing to bet my hand on the fact that some engineers or third party planning bureaus actually did spot these flaws and reported them and then were turned down and ignored, but not before having been told that it's not their decision/problem.

That's always the moment when I'm happy to have documented the decisions.

It’s what happened at NASA...

Just to be clear, are you referring to https://en.wikipedia.org/wiki/Linda_Ham ?

> There's the thing: nobody wants to pay engineers on par with managers

I used to think like that but I learned in time that the core issue is that engineers rarely know how to present their job in terms of monetary benefit during salary negotiations.

i.e. as an automation engineer I led an initiative that saved a previous employer more than 60k€ yearly using automation and optimizing other validation workflows. as principal engineer at my current job I saved them more than 30k€ yearly by replacing a licensed component with an open source one, filling the feature gaps myself outside of work hours.

these things get noticed, not just the stuff done but the ability to think in money, and that unlocks the full engineering potential in salary negotiations

> engineers rarely know how to present their job in terms of monetary benefit

The point is, they shouldn't have to. The managers should be handling that fairly. Otherwise, it's an adversarial game where only one team knows how to play.

Engineers should be experts in engineering. Holding them at a disadvantage because they are not good at the other feels wrong.

but even in such a scenario how a manager would know how to assign merit? they aren't expert in engineering either.

someone in between has to connect the dots, but there is no such figure and the incentives for doing that communication work are all on the engineers' side

addendum: I do however agree it sucks to have to compete with other fellows

I would think the manager would have a bigger picture view of the value added, but it probably varies with particular cases.

>>> by replacing a licensed component with an open source one, filling the feature gaps myself outside of work hours.

Your achievement is to save money by doing all the work for free in your spare time?

While I agree that engineers should negotiate better and keep track of their achievements, that is not one at all.

> that is not one at all.

I'm significantly ahead of all my peers in job position and salary so I must have been doing something right.

I've a family now and such investments made earlier are helping immensely.

all my peers know I'm dependable and all previous managers know I do get things done.

the large reference network is both a source of work opportunities and has been a safety net during times of need.

why would people choose the misery of strictly contractual relationship when you can build partnerships and friendship wherever you go with just very little effort.

Sometimes an initial sacrifice for future mutually beneficial dividends is warranted, if it’s likely to be reciprocated. Keeping things purely transactional is better if you expect your counterpart to try to take advantage of you, of course, but I don’t think this is the case for all professional relationships, always.

I assume these must be small examples on a long list of accomplishments, because 30k/year doesn't seem like it's enough to offset your salary. I have to imagine the impact of a principal engineer should be measured in the 10-100x of their salary?

That's some pretty impressive ingrained bias towards rent-seekers. Why not 1.1x or the board can try doing the work?

Consider a fast food business. Should the manager be satisfied with making 10% profit on french fries?

10% on raw materials would be losing money, but 10% on total expense, with the difference going to the workforce is fine to me. I don't see why fast foods shouldn't be cooperatives, they're not exactly capital intensive or high risk.

To be clear, I wasn't referring to the board having a choice in the matter.

Sorry I both agree with you but reckon 10x is about the correct ratio. It’s not rent-seeking, it’s “sharing wealth” and “diminishing inequalities”. A good engineer needs to pay for: - Administrative employees, - Office rental, - Junior members of the teams who are growing up towards seniority, - Women who are non productive but required to pass the diversity requirements and make the company look good.

You may claim it’s unfair that productive members are exploited to pay for nonproductive ones, but that’s the neo-conservatives’ argument, and has been rejected by most of the Silicon Valley. I honestly think that the current situation is already the middle ground between “pay proportional to your added value” and “diversity, equality and social responsibility requires adjusting the salaries of highest-paid engineers.”

it was just a bunch of examples of stuff that was specifically done on top of my other job resps. also, not every place is the silicon valley and not every company is a fortune 500, plus some perspective about relative worth also applies - i.e. the automation stuff was calculated in time saved by the Indian based testing team, which reduces the raw impact.

Leaky systemic process abstraction.

I don't know if one could say a failure of the Systems Engineering process. One could also say it is a situation where management, regulatory and market constraints became so unwieldy that there was simply no way Systems Engineering could possibly satisfy the requirements. Sure, it's a failure to push back against this but that seems a bit different.

Management and "the market" were providing faulty feedback that engineering couldn't push back against, no matter how much force was applied.

Systems Engineering -- The art of working on the overall design while not overlooking any detail that later turns out to have been important.

This whole plane sounds like an ugly hack. They slapped very different engines on an existing airframe. Then, when it inevitably exhibited undesirable behaviour, they tried to paper over the cracks. Then they hid this information from their customers, regulatory agencies and the pilots.

It makes me wonder if there are other issues with the Max that the public doesn't know about yet.

I hope a thorough review of Boeing's internal communications is already underway. If there is proof that these decisions were made for financial gain, they should face criminal charges.

IMO, whether it was greed or just general incompetence, Boeing has demonstrated that they are not responsible enough to self-certify their aircraft.

The sad thing is that we'll probably see this plane fly again by the end of the year, because the millions of dollars in retrofits will still be cheaper than having to scrap all existing planes.

We have no idea what other potentially lethal corners have been cut. What if they go back into service, after several months of retrofitting all of them at Boeing maintenance hangars, and then the following year there are two more deadly crashes from some other overlooked hack?

Really, these planes need to be scrapped. The engines, equipment, seats, etc can all be stripped and used in other planes, but the air frames will need to be recycled and this line of planes should end here.

Even if it doesn't (probably won't), I highly doubt we'll see another generation of 737s. They did survive the rudder problems way back from the... 80s? or 90s? .. So their reputation might recover, but they still can't make the types of planes airlines want and keep that name/certification.

I had forgotten about the rudder problems! Worth a read: https://imgur.com/a/5wcFx8M

> Perhaps the single most complex, insidious, and long-lasting mechanical problem in the history of commercial aviation was the mysterious rudder issue that plagued the Boeing 737 throughout the 1990s. Although it had long been rumoured to exist, the defect was suddenly thrust into the spotlight when United Airlines flight 585 crashed on approach to Colorado Springs on the third of March, 1991, killing all 25 people on board. The crash resulted in the longest investigation in NTSB history, years of arduous litigation, and a battle with Boeing over the safety of its most popular plane.

For anyone who enjoyed the write up, it's from reddit user AdmiralCloudberg who has a bunch of them, and they're all good: https://www.reddit.com/r/AdmiralCloudberg/comments/a4ckhv/pl...

Those images are blocked for me at work, would have been helpful if they had added the text itself to the Reddit posts!

In what world is scrapping the airframes due to a (serious) software fault the best and most sensible solution? Do you believe there could be undiagnosed problems with the wings, fuselage, tail, hydraulics, electrics, fueling system, gear, etc.?

> In what world is scrapping the airframes due to a (serious) software fault the best and most sensible solution?

A world in which the faulty software was required to fix faulty hardware.

The hardware isn't faulty. The problem is the way Boeing tried to achieve a zero training delta so pilots wouldn't have to get a second type rating.

the airplane did not have stable flight characteristics because of its physical design.

that’s a hardware problem.

MCAS only exists because of that hardware problem.

the fact that boeing also did not train or tell pilots about MCAS, in order to make the airplane more financially appealing by retaining the 737 type rating, is a separate (also bad) issue.

All jetliners are unstable at altitude in cruise, and require augmentation.


Aircraft design is a giant bag of compromises between desirable characteristics, most which are in conflict with each other.

Best-stated comment so far. We seem to forget that this whole problem exists in an edge domain only.

an INOP yaw damper doesn’t cause a hull loss crash

Fun fact. The original 737 actually had two yaw dampers because they expected it to have diverging yaw stability. Just by chance the airframe is actually positively stable, so it can operate safely with no yaw dampers, and one was removed.

>the airplane did not have stable flight characteristics because of its physical design.

This is a gross oversimplification of the problem. In most of the flight envelope the aircraft is stable.

At high alpha the aircraft has pitch problems.

There are myriad ways to address this, and MCAS was one (bad) choice of many available to Boeing.

That's the devilish thing, I think. Making the plane unstable then patching things up with MCAS, or something else, isn't a fault in any straightforward way. But it is taking a risk, and the source of risk isn't simply the risk that eg. MCAS will fail on any given flight, it's also the higher-order risk that eg. the MCAS system will have a faulty design. (A bit like rewriting code in C for performance, or keeping on generally unloved code to satisfy a paying customer, presumably.) I assume that makes the ethical and safety decisions a lot more murky for the engineers than having a specific fault to point to, especially in an organisation afflicted with go fever.

I believe the airplane is aerodynamically stable even at high alpha. What it fails in certification is the requirement to have continually increasing pitch feedback forces. I believe the pitch feedback forces are still in the stable region, just too low. This is not a flying wing or intentionally unstable airplane (as the F-16).

> I believe the airplane is aerodynamically stable even at high alpha.

That would put you at odds with Boeing's test pilots.

No it wouldn't (or "citation needed" if so).

The issue is not that the airplane shows negative or neutral aerodynamic pitch stability, it's that it does not exhibit an increasing stick force gradient, as required by certification rules.
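The distinction can be shown with a toy check. This is only a sketch of a simplified reading of the requirement (pull force must keep rising as angle of attack rises), and the numbers below are invented for illustration, not measured 737 MAX data.

```python
def has_increasing_stick_force(samples):
    """samples: list of (aoa_deg, pull_force_lbf) pairs sorted by AoA.

    Returns True if the pull force rises strictly with every step in AoA.
    """
    forces = [force for _, force in samples]
    return all(later > earlier for earlier, later in zip(forces, forces[1:]))


# Invented numbers: the pilot always has to pull (force stays positive),
# but in the second case the force lightens near the top of the range.
well_behaved = [(2, 10.0), (6, 20.0), (10, 28.0), (14, 35.0)]
lightening = [(2, 10.0), (6, 20.0), (10, 28.0), (14, 27.0)]

print(has_increasing_stick_force(well_behaved))
print(has_increasing_stick_force(lightening))
```

In the second data set the airplane still resists being pulled nose-up, so it is not "unstable" in the aerodynamic sense, yet it fails the monotonic-force check.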


>> [...] In most of the flight envelope the aircraft is stable.

>> At high alpha the aircraft has pitch problems.

> I believe the airplane is aerodynamically stable even at high alpha. [...] This is not a flying wing or intentionally unstable airplane (as the F-16).

In some ways that's likely worse news than an airplane that's inherently unstable in general, no? A corner case, and one that evidently isn't actually all that uncommon. If you're building something like an F-16 you know that you absolutely have to make the fly-by-wire correct and robust, and the ground crews similarly know that if anything affects its performance the plane isn't fit to fly.

Citation needed.

FAR 25 applies to all transport category aircraft; see the section on stability (§§ 25.171 - 25.181). In exactly what manner is the airplane not stable, with or without MCAS?

From TFA

But a few weeks later, Mr. Wilson and his co-pilot began noticing that something was off, according to a person with direct knowledge of the flights. The Max wasn’t handling well when nearing stalls at low speeds.

Insufficient. Stability requirements in FARs are clear, and not handling well might mean "substantially different from prior 737s" not unstable.

i haven’t looked at the fars in quite a few years, but i’m pretty sure there would be stuff in there that references stuff like pitch stability (which is how i’d define the “hardware problem”), is that not the case? i’m afk atm.

i still maintain my macro point, either way.

making the airframe on a pax airliner aerodynamically stable during normal takeoff and landing operations seems like basic “good engineering” to me.

I'm not so sure that it is good engineering to be inherently stable during normal takeoff and landing operations.

Modern fighter planes are inherently unstable because this is required for better maneuverability. Passenger planes can certainly benefit from that too.

Consider gusty crosswinds during a landing. With an inherently unstable aircraft, there is greater capability to compensate. You can have a computer stabilize the plane, preventing the tail or wing tips from striking the ground. When wake turbulence threatens to flip the plane or when a microburst threatens to pound the aircraft into the ground, a fast response is possible. Stability would deaden the performance of the needed response.

The extreme example is probably wings that are low-mounted anhedral and forward-swept, with the bending controlled by rapidly actuated aerodynamic surfaces near the tips.

A common fallacy I'm afraid - modern fighter planes are unstable for improved supersonic performance - reduced drag. So relaxed stability may benefit civil aircraft in terms of fuel efficiency. There is an argument that an inherently unstable aircraft makes manoeuvrability worse or harder. What many forget is that the issue with manoeuvrability is actually at the end of the manoeuvre. The handling qualities aim is to point the nose in a new direction, and an unstable aircraft makes it a harder design challenge to stop the aircraft at the position required.

It's not stable in a "crashes into the ground and kills everyone on board in spite of the best efforts of its pilots" manner.

In exactly what manner is it not obvious this is not an acceptable design outcome?

You are confusing stability, which is a specific aerodynamic term, with two examples of catastrophic outcomes. Reread the original post I replied to, none of the first three sentences are true: unstable flight characteristics, instability is a hardware problem, software routine only exists to paper over the hardware problem.

The problem results from an edge case or it would be happening a lot more often. That it's an edge case doesn't mean it isn't serious or shouldn't be fixed or that it's not a design flaw. But it is not a stability problem, it's the wrong word to use.

It misdirects the conversation from where it should be. The airplane aerodynamics are the distraction. The central problem is that when perturbed, this feature becomes a saboteur; 2.5 degrees of deflection in 10 seconds is asinine at Vmo. A human pilot acting on all the same information the flight computer has available would be considered a maniac to correct for a clearly bogus angle of attack value with 40 degrees of nose down. It's that insane. And Boeing knew about the possibility, classified it as hazardous, and yet somehow there was no further exploration of what would happen upon arrival at such a hazardous event (MCAS upset) by any team at Boeing or 3rd parties or the FAA. It's mindboggling.

Meanwhile some people prefer distractions from those issues by using the wrong terms: it was designed badly, and the whole plane should be scrapped. With the above systemic problems at Boeing and FAA, who knows what kind of airplane they'd design to replace it and what sorts of problems it would or could have.

The whole impetus of the 737 MAX was a race against time to compete. If they had faced a much longer time frame for a whole new model, the pressure to cut corners is even higher. The opportunities to make mistakes are even higher.

I'm one of the chief repeaters who has harped on the stability issue and the control stick force curve; I usually end up dropping a post or two about it in each MAX thread.

You are right on target, but I do wish to point out the aerodynamics are still a problem, and a problem that has caused a great deal of grief in aviation history.

Take a trip down memory lane, and give the D.P. Davies interview from the Royal Aeronautical Society a listen. Specifically, the one revolving around the 727 certification.

There seem to be two schools of thought on aircraft design. One is the test pilot's wet dream: simple layout, well behaved, neutral stability, or minimal bad behavior up to the corners of the flight envelope, then easily discernible and recoverable stall behavior.

The other school is the realm of the Engineer. The Tricky Sick school if you will. Apply enough computer and piloting aid to the properly shaped brick, and it can be flown like a 737! Or Airbuses version of "let the plane fly itself, just tell it where to go."

Even as far back as the certification of the 727, test pilot's saw the shift away from the meutrally stable machine that "just flew" to an ever increasing complex mishmash of complex systems working in the background to male unstable airframes fly like naturally (neutrally) stable ones. Which is all fine and good until something goes wrong, and those systems fail, leaving a pilot in uncharted waters.

The control stick force stuff is not a distraction, just another link in the chain of normalization of deviance that resulted in a departure from "building an airworthy frame" to figuring out how to mask the "unairworthiness" of a frame sufficiently to get it past the regulators.

That's not to say it can't be done, but one approach is definitely inherently riskier than the other, and requires increased levels of communication among everybody involved.

Point being: this has been building since as far back as the 60's. See the 727 certification in the above-mentioned interview, the many difficulties that the MD-11 ran into with its LSAS, and note the similar less-than-stellar results that emerged from trying to optimize for fuel efficiency at the cost of having to implement increasingly complex control system hacks to maintain parity with regulations/previous airframes.



Even the 737 Max variants are statically stable in pitch, at least up to the stall. The issue that MCAS was intended to address was the handling characteristics, again in the flight regime prior to the stall. I do not know whether the airplane could have been certified, as a separate type, without some sort of augmentation, but it seems that Boeing did not think it could be certified under the common type rating that covers prior variants.

Maybe we should just scrap all airplanes due to the inherent risk of strapping jet engines on metal tubes and forcing them tens of thousands of feet into the air.

I think a better alternative would be to deny it a shared type certification with the prior 737 and require recertification (and pilot retraining), thus removing the incentive to use shortcuts.

>>> The sad thing is that we'll probably see this plane fly again by the end of the year, because the millions of dollars in retrofits will still be cheaper than having to scrap all existing planes.

In the US maybe. In Europe and Asia, I don't think so.

The planes have been grounded pending investigation. Given the speed of investigations and the political ramifications of all of this, I bet they won't be ungrounded anytime soon.

It will fly only to be voted down with customers' feet. I know a lot of people who looked up the plane type for the first time during their holiday bookings. This plane is finished.

Is there a requirement to fly the plane advertised? In other words, if I book specifically a non-737 800 Max plane, and the airline switches, can I get my money back, or make them fly me on a different airline?

Why did you put the 800 in there? The 737-800 is a separate plane from the 737-MAX, and doesn't suffer from these problems (and is still operating; I just flew one cross-country last week). You might be referring to the 737-MAX 8 (not 800), which is the only variant of the 737-MAX delivered so far.

My mistake.

Interesting question. I have never had this happen, and I think I'll be asking a lot of questions if it does.

It's not something an airline can do hastily, as different aircraft (well, at least those of differing type rating) require different crews to operate. So the airline will have to grab a totally different crew to match their replacement aircraft.

I assume it happens regularly. I had a couple of flights myself on other planes than initially advertised. I always assumed that the airline canceled/delayed other flights because of it (taking plane a and crew a from flight a to do flight b with plane a).

Before the 737 Max got grounded (after the 2nd crash), I remember reports that some airlines allowed passengers to rebook their flights for free (to end up on a different plane). I doubt they will continue doing it once the 737 Max is allowed to fly again. Eg, Southwest is too dependent on it.

When it has happened to me, the crew came with the plane.

I think this is false thinking. The 737 is by now a very well tested airframe. Yes, you change some things and there are unexpected results, but the core is sound. Starting over on a completely new design, everything has to be debugged from scratch.

This reasoning reminds me of the tendency to argue "this code needs to be thrown out and rewritten from scratch" among software engineers. It's easy to see the flaws, but not so easy to see all the things that have been fixed. See https://www.joelonsoftware.com/2000/04/06/things-you-should-...

Please, no. This is not the same thing at all.

This is like taking your MySQL app and running it against MongoDB with a bunch of hacks to translate the SQL into mongo calls.

And then claiming you did all this so you could "avoid a rewrite".

Some architectural decisions are far reaching and leaky. This is not the ideal but it often is the ideal trade off. The alternative is decoupling to the nth degree resulting in a hunk of junk impossible to change that won't get off the ground.


Changing the position of the engines means it's no longer the same 'tried and tested airframe'.

I don't think it's corelous (is that a word?). It's not really a lift and shift to a new platform.

However the analogy to "rewrite all this" politics is good I think. Often you could walk into a building and say "all this needs to go" but truly the costs of doing so and starting from scratch are hidden in code and engineering design that's survived a trial period.

No matter how nice and neat everything might look, that's only because it doesn't yet account for 10,000 other edge cases you're not able to simultaneously consider...a system as complex as a commercial plane is still going to care about these.

Engines were mounted to make a stable configuration inherently unstable in not so uncommon circumstances. The pitch was used to control the most critical axis of the plane - something done normally with the rudder. This gave an automated system the authority to override the pilot with a wide margin. The behavior of the plane differs wildly from the old 737 as soon as the surprisingly narrow corridor of normal flight situations is exceeded.

"Other issues" is what worries me most about this as well. We now know that MCAS is safety-critical, so it's going to be reviewed, but what else got missed/minimized during the design and review process? The fact that Boeing is still trying to minimize the danger of MCAS does not fill me with confidence in their willingness to raise and fix other problems.

They don't seem to have learned from this, which means it's likely to happen again.

> This whole plane sounds like an ugly hack. They slapped very different engines on an existing airframe. Then, when it inevitably exhibited undesirable behaviour, they tried to paper over the cracks. Then they hid this information from their customers, regulatory agencies and the pilots.

Don’t forget, the system they used to paper over the cracks had a single point of failure.

*One of the systems that we are aware of

Indeed. The article makes it sound like a foul-up late in the design process, but this plane was corrupt from the very beginning when Boeing set out to dodge the requirement for re-certifying the airframe.

To be fair, it's not like this isn't standard industry practice. Airframes evolve and change, with incremental changes typically only affecting limited areas of impact.

OTOH seeking a non-aerodynamic solution to a significant stability-degrading airframe modification was IMHO a bridge too far. If the pitch stability (not the pitch feel) problems couldn't be dealt with aerodynamically without busting type certification, then perhaps the whole concept was just too much of a stretch for the venerable old 737.

It surprises me, though, that this couldn't have been engineered out with enhancements to the horizontal stabilizer, such as tip fences or a span increase to offset the lift from the engine nacelles.

If it could have, but software was cheaper, then that's an even darker indictment of Boeing's engineering incompetence.

Re-engineering the rudder may also have helped avoid employing pitch control (with its inherent delay and hard-to-undo actions). But that would have made it obvious that this is a different plane...

If you make the plane wider you run into issues finding room for it at the gates.

The FAA needs to grow a pair and declare this type certification and all other certifications older than ~30 years EOLed. That doesn't mean you can't continue operating those planes, it just means you can't retrofit some changes onto a deeply legacy model and still call it the same type.

I think it's not about retrofitting and more about retrofitting (engines in this case) in the wrong place.

Say what now? It sounds like you want the FAA to revoke the A320 type certificate?

Yes. The A320neo should be the last new design allowed under the same certification. They'll be forced to re-certify whatever they do next anyway, so they won't have to try to shoehorn things into the older design like Boeing did with the 737.

But Airbus shoehorned the same thing - a CFM LEAP - into an older design, just like Boeing did.

It's not the case that they won't have to try, they already did it years ago.

It's an ugly hack because the design tries to stay within what The Regulator considers the same airplane type.

This in turn is because getting a new airplane type to market would cost some (I assume, facts are welcome!) unholy amount of money and time to get approved.

Whether you consider that decision "greed" or "a rational response to perverse regulatory incentives" is I suppose a personality test as good as any :)

Those regulatory incentives are there for a reason. They are written in blood for the most part.

It's one thing to scoff and believe "Oh, it's nothing! You're just holding us all back!"...

...Right up until a plane load of freight or people plunges out of the sky.

Free market economics optimizes for one thing, and one thing only, as a first-order optimization. That's why we regulate: to ensure that all those nuisance secondary facets are accounted for by everyone equally, so that market forces' natural race to the bottom doesn't compromise the central tenet of air safety; that everyone and everything that goes up comes back down, safely, controlled, and alive.

The business part, if you think about it, is secondary to the capability to make and safely deploy a new plane. A nice bonus.

Sacrificing the quality of the final product for the sake of looking better on the balance sheets is a cardinal sin. Plain and simple. Based on testimony from inside, that sin seems to have been SOP at Boeing for the better part of the last decade.

I think this sort of excuse is always superficially applicable and therefore meaningless. There is no malfeasance that can't be described as "a rational response to perverse regulatory incentives", but "rational" falsely implies a singular possibility chosen objectively. It's incorrect to defend a specific failure as being compelled, because one hasn't explained why this failure and not the near-infinite number of other possible ones.

The day someone very high up the corporate ladder truly gets held responsible for this type of greed & negligence and is put away for a long prison sentence would be a good day for society. But I am not holding my breath...

But I hope that the CEO Dennis Muilenburg deep down understands he seriously fuxxed up real bad and every now and then is having a hard time falling asleep in his $10M mansion knowing that he is ultimately responsible for hundreds of people's unnecessary deaths due to his failed values as a leader.

I agree. I think the FAA failed in its mission also.

Their credibility outside the US is shattered. Their word & opinion carries zero weight anymore since they have proven themselves to be morally corrupt.

They have outsourced certification (their job!) to the manufacturer (Boeing) since the 787. My confidence in that plane thus isn’t sky high either.

its reputation has been destroyed in the international aviation community.

which is extremely sad because it was really hard won.

classic story of gutting a gov organization, and regulatory capture.

i know some really good people at faa and the situ makes their blood boil. mine too.

I haven't seen commentary from international pilots who are worried about flying Boeing planes.

Even with the two MAX planes that went down, fatal crashes are enormously rare (for Airbus/Boeing/CRJs/ERJs). This one is only getting so much attention because it was that extraordinarily rare "design defect" rather than a maintenance issue or pilot issue.

Now better?

> https://www.thedailybeast.com/pilots-complained-to-feds-abou...

More specifically, the complaints reportedly referenced issues pertaining to a takeoff “autopilot system” and situations where the plane is “nose-down” while trying to gain altitude. One pilot reportedly wrote that it was “unconscionable” that Boeing and federal authorities allowed pilots to fly the planes without fully describing how the 737 Max 8 was different than other planes. “The fact that this airplane requires such jury-rigging to fly is a red flag,” the same pilot wrote.

His response to the question of "should you step down?" was "no, absolutely not, people's lives depend on me leading this company." No one is that important, and 346 people died unnecessarily on his watch from negligent business processes. He should be forced out.

Your comment makes it sound like you believe there was a single person with nefarious intentions or criminal negligence who chose to put lives at stake in exchange for profits.

That is almost certainly not what happened. It is more likely a system of procedures and policies which failed. The company should take the hit, but unless an investigation reveals otherwise, I see no reason a single individual should take the blame for all of this.

> That is almost certainly not what happened. It is more likely a system of procedures and policies which failed. The company should take the hit, but unless an investigation reveals otherwise, I see no reason a single individual should take the blame for all of this.

Aren't the insane compensations of executives justified by their "great responsibility"? So I think it makes them eventually responsible for what their companies do.

I heard someone say this and I can't remember who, but it seems very appropriate. One of the key responsibilities of the CEO (and by extension upper management) is to acknowledge reality. When the company is given permission to act on what it already knows, then employees have the right incentives to do the right things and not just the things that won't get them fired.

In that sense, it absolutely is the fault of Boeing management for not acknowledging reality.

I for one miss the concept of full responsibility and ownership. What happened to “the buck stops here”?

How MCAS slipped through certification process was not the main issue (mistakes in complex products can happen). The main issue was Boeing not caring that MCAS was dangerous even after discovering it.

After the Lion Air crash, it was very apparent to Boeing that MCAS was not safe. This whole article focuses on how MCAS slipped through development+certification - but really even after Boeing knew the dangers of MCAS, the MAX still was allowed to fly.

It was hidden and dangerous. Then it was open and dangerous but was still defended by Boeing. Damning.

How MCAS slipped through certification process was not the main issue

I think it's a huge issue, but perhaps not criminal. The hiding/lying/etc is a criminal issue in my view.

Great article. But for me there's a huge question being left unanswered, like the elephant in the room:

Why exactly did the engineers/test pilots feel the need to "enhance" the original MCAS with the new, more powerful version that worked at lower speeds? What did they know? I doubt they did it for the hell of it. And therefore, what has changed that that enhanced functionality is now no longer necessary, and it's fine that MCAS is being returned to its original, more subtle implementation?

These things just don't add up for me and Boeing's constant pronouncements that they did nothing wrong, everything was fine, and now they're fixing it so everything will be even more fine ring very hollow indeed. I would almost like to see everyone involved in this subpoenaed so the public can learn the truth of what, exactly, took place.

Until we have some answers, especially to my main one - what was so bad about the airframe's handling that it was necessary to massively increase the power of the MCAS system, but is now apparently not necessary anymore and it's fine for them to nerf it - I don't think I'll be flying on a MAX.

The answer to why they wanted to “enhance” MCAS is that they wanted it to be certified as a 737 like all previous versions, which means pilots need to be able to fly it exactly like previous 737s without additional training, and a technical hack which “corrects” pilots' actions facilitates that.

Is it only me, or does all this one-vs-two AoA sensor talk seem like some kind of diversion from the real problem with this plane?

I mean, if one-sensor based MCAS failed twice so early in the life span of the plane model, what is the probability that a two-sensor model will fail pretty soon as well? The math should be simple, we have all the data needed: combined hours flown by all planes of the type and number of failures (at least two known, which can help us to estimate an MTBF of the sensor).

The problem isn’t failure, but detecting failure.

If the sensor had just stopped responding, there wouldn’t have been any problem. The planes would keep flying, the sensors would get replaced, and everyone would be fine.

What happened was that the sensor gave erroneous readings. The MCAS system reacted to those erroneous readings and crashed the plane.

With two sensors, you can detect failure. It’s very unlikely that both would fail simultaneously. If they did, it’s very unlikely that both would provide the same erroneous readings.
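A minimal Python sketch of that kind of cross-check, for illustration only. The threshold value and every name here are assumptions made up for the example, not anything from Boeing's actual logic:

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical threshold: how far apart (in degrees) the two AoA readings
# may be before they are treated as disagreeing. Not a real Boeing figure.
DISAGREE_THRESHOLD_DEG = 5.0


@dataclass
class AoaCheck:
    valid: bool               # True when the two sensors agree closely enough
    aoa_deg: Optional[float]  # averaged reading, or None when they disagree


def cross_check(left_deg: float, right_deg: float,
                threshold_deg: float = DISAGREE_THRESHOLD_DEG) -> AoaCheck:
    """Compare two AoA readings and refuse to produce a value on disagreement."""
    if abs(left_deg - right_deg) > threshold_deg:
        # Failure detected, but we cannot tell WHICH sensor is wrong.
        return AoaCheck(valid=False, aoa_deg=None)
    return AoaCheck(valid=True, aoa_deg=(left_deg + right_deg) / 2.0)


def automatic_trim_allowed(left_deg: float, right_deg: float) -> bool:
    """Fail passive: any automatic nose-down trim is inhibited on disagreement."""
    return cross_check(left_deg, right_deg).valid
```

With these assumed numbers, `automatic_trim_allowed(4.0, 22.0)` is `False`, so a single bad vane disables the automation instead of driving it. The check only catches disagreement: two vanes that are wrong in the same way, e.g. `(22.0, 22.5)`, would still pass.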

It’s very unlikely that both would fail simultaneously.

Birgenair 301 crashed into the Atlantic because mud dauber wasps built nests in both pitot tubes while the plane was on the ground. It happens.

Airspeed is required for safe flight. The failure on that flight was detected immediately, it just couldn’t be handled. AoA on a 737 MAX is not required for safe flight and the system just needs to refrain from taking any action if it fails.

But MCAS was added because the plane doesn't handle well in some situations.

I haven't heard of it ever activating except in the incident/accident flights. It's required for certification, but you would either have to be mishandling the plane or get in some extreme weather for MCAS to activate.

Think about it like the Antilock brakes on your car. Suppose the wheel position sensor fails. It's fine if the car puts up a warning light and says that you don't have antilock brakes anymore. You can drive fine without them until you can get them fixed with a minor safety impact. It's not fine if the wheel position sensor fails and this causes the car to slam on the brakes going 65mph down the highway.

I haven't heard of it ever activating except in the incident/accident flights.

You wouldn't hear about it unless the activation was triggered by egregious pilot error and you're scouring aviation news sites.

It's not fine if the wheel position sensor fails and this causes the car to slam on the brakes going 65mph down the highway.

Been there, done that. It's an unpleasant failure mode, but it is survivable.

ABS isn't the best example because it does prevent lots of accidents in its own right, including a >50% prevention rate of some types of accidents in rainy, snowy, or icy weather. The overall fatal accident reduction is 15% for cars and 27% for trucks and light trucks. Source: https://crashstats.nhtsa.dot.gov/Api/Public/ViewPublication/...

And, anecdotally, I've had ABS kick in in some occurrences for which I was very thankful.

Having a safety system that rarely fails to prevent an accident is much different statistically than a safety system that rarely causes an accident.

Suppose the crash probability on a normal flight is 1/1E7, but without MCAS it's 100x more dangerous, or 1/1E5. Suppose MCAS failure probability is 1/1E6; then the probability of an additional crash due to the failure of MCAS is 1/1E11, which is acceptable.

The problem is that in practice, the crash probability if MCAS fails is empirically 2/3 instead of 1/1E5, because MCAS actually causes the crash rather than merely failing to prevent a crash.
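The arithmetic above can be checked with a few lines of Python. This is only a sketch that reuses the supposed numbers from this comment (they are illustrative, not measured rates), and the variable names are made up for the example:

```python
from fractions import Fraction

# Supposed figures from the comment above, per flight.
p_crash_without_mcas = Fraction(1, 10**5)  # crash probability if MCAS is simply absent
p_mcas_fails = Fraction(1, 10**6)          # probability that MCAS fails

# Case A: MCAS fails passively, i.e. it just stops helping.
# Extra crash risk = P(MCAS fails) * P(crash without MCAS).
extra_risk_passive = p_mcas_fails * p_crash_without_mcas

# Case B: the failure itself drives the crash. The 2/3 is the comment's
# empirical figure for crashes given an MCAS failure.
p_crash_given_active_failure = Fraction(2, 3)
extra_risk_active = p_mcas_fails * p_crash_given_active_failure

# How much worse the active failure mode is than the passive one.
ratio = extra_risk_active / extra_risk_passive

print(f"passive failure: {float(extra_risk_passive):.1e} per flight")
print(f"active failure:  {float(extra_risk_active):.1e} per flight")
print(f"active is about {float(ratio):,.0f}x worse")
```

Under these assumptions the passive case comes out at exactly 1/1E11, while the active case is tens of thousands of times higher, which is the gap between "fails to prevent" and "causes".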

The first thing I read in your link seems to be the opposite of what you wrote:

"ABS has close to a zero net effect on fatal crash involvements."

Further down, there is a chart which seems to show that ABS is associated with a huge increase in fatal accidents in inclement weather, and a large increase in side impact accidents, both fatal and non-fatal.

Maybe there's something obvious I'm missing?

It handles fine, except it doesn't handle like a Classic/NG 737. This is a problem because pilots were trained to expect it to handle like a Classic/NG. The discrepancy between reality and pilot expectations is the gap MCAS was meant to fill, such that it wasn't necessary to correct pilot expectations.

It sounds like in normal flight you could cope without it, not easily enough that you'd trust an entire fleet without it, but enough that in the rare case of a detected AoA sensor failure, you could land the thing in a controllable way.

Doesn't handle well is not equivalent to "is uncontrollable", though.

If I recall, it made it more prone to stall in certain conditions. I'd like to call that hazardous, but without test pilot feedback I'm not sure how to classify it. My recollection from articles a few months ago is that they added MCAS not due to simple differences in handling, but due to a potentially hazardous difference in handling.

Mentour Pilot on youtube, a 737 instructor, has a few videos about it. The gist is that in any plane when you're close to stall, you want to pitch down and increase airspeed over the wings. Depending on the precise circumstances you do this by pitching down and throttling up. However in the case of every 737 and in fact any aircraft with engines slung below the wings, the act of throttling up the engines causes the airplane to pitch up. The reason is because the engines are not in line with the center of mass. Imagine rowing a row boat with only a single oar.

This is something all 737 pilots are trained for, so they have to balance that pitch up from the engines throttling up by pitching the plane down more than they would otherwise need to. The precise relationship between throttling up and pitching up was changed when the 737 was modified to create the 737 MAX. Namely, on the 737 MAX the pitch-up is more extreme. It is not so extreme that it makes the airplane a bad design, but it is extreme enough that pilots need to not expect the legacy 737 behavior.

Had pilots been trained for the performance characteristics of the 737 MAX specifically, it would have been perfectly fine. But they weren't, and instead MCAS was meant to paper over the difference so pilots could be kept in the dark (which is cheaper but borderline homicidal...)

To put this another way, in some hypothetical universe where the 737 MAX was the first 737 ever, it would have been introduced without MCAS and pilots would have said it handles like a dream. Then, when the 737 NG was introduced after the 737 MAX, there might have been a reverse-MCAS system implemented to make the NG handle like a MAX. That reverse-MCAS system may have then failed catastrophically.

MCAS was not designed to make the 737 Max behave like the 737 NG or Classic. It was designed to make the 737 Max certifiable at all.

The problem with the 737 Max is that the engine nacelles are further forward and larger than previously, so the lift that the nacelles make at high angle of attack is greater. This results in less stick force being necessary to maintain a high angle of attack than at lower angles of attack. This is uncertifiable behavior. Hence, MCAS, to augment the maneuvering characteristics of the 737, to increase stick forces at high angles of attack.

In a hypothetical universe where the Max was the first 737, it'd be fly by wire and all the stick forces would be synthetically generated anyway.

It handles differently, yes. That could be handled with a bit of training.

It will not satisfy the current handling requirements, hence either making it completely unairworthy or a special category with vastly different handling from usual passenger-rated airplanes.

According to Wikipedia, only one of the pitot tubes was blocked:

“The investigation concluded that one of the three pitot tubes, used to measure airspeed, was blocked.”

No, the accident report mentions only a failure of the air speed indicator on the pilot side. The copilot and backup indicators were working, but the crew failed to recognize that.

I think the point is at least both sensors wouldn't fail literally at the same exact second. There would be a period of time where the sensors are disagreeing, and that would alert the pilot about what is going on.

> It’s very unlikely that both would fail simultaneously. If they did, it’s very unlikely that both would provide the same erroneous readings.

They don't have to fail simultaneously in a flight. And they don't have to fail because of internal sensor problems. There are many cases in which they can fail simultaneously and give the same readings; the article even mentioned such types of events:

>> That probability may have underestimated the risk of so-called external events that have damaged sensors in the past, such as collisions with birds, bumps from ramp stairs or mechanics’ stepping on them.

And AF447 gives an example of what such erroneous readings combined with pilot errors may lead to.

If they’re damaged on the ground, surely it’ll be noticed that the sensors are claiming an extreme AoA while just sitting there, and they’ll be fixed.

AF447 is an example of a fly by wire system that has to keep working no matter what happens, thus a bunch of redundant systems and a series of alternate modes the system can fall back on to operate in a degraded state.

MCAS, in contrast, is not a critical system. It could shut down with no problems at all. These crashes happened only because it didn’t shut down when faced with a failed sensor, because it couldn’t detect the failure.

> If they’re damaged on the ground, surely it’ll be noticed that the sensors are claiming an extreme AoA while just sitting there, and they’ll be fixed.

AoA sensors by design don't work reliably at low speed and they don't work at all on the ground.

> AF447 is an example of a fly by wire system that has to keep working no matter what happens, thus a bunch of redundant systems and a series of alternate modes the system can fall back on to operate in a degraded state.

AF447 is a good example that two AoA sensors can simultaneously have the same erroneous readings. It's not that unlikely.

AF447 had two working aoa sensors that never disagreed. It didn’t have an aoa indicator in the cockpit though. You can infer what the aoa is by the vertical airspeed and engine output though. The inputs from left and right stick canceled each other and it power on stalled out of the sky landing belly first. This was pilot error because they had unreliable airspeed indication due to pitot tube icing.

Airspeed is a critical system while aoa is not. (Unless aoa is tied to mcas like in the max jets)

Can you elaborate on AF447? I’ve never heard of the AoA sensors being connected with that crash, and a quick search indicates that they were working fine.

I've confused AF447's Pitot tubes with AoA sensors. But I think the point is still valid: two sensors _can_ simultaneously have the same erroneous readings and we have to be sure pilots can handle such situations.

First: I said it’s very unlikely, not impossible. Second: the failure of AF447’s pitot tubes was detected immediately and the system switched to an alternate law as a result; this contributed to the crash because the pilots were less familiar with how the system operated in that mode. Third: AoA sensors operate by a completely different mechanism so even if this was what happened (it wasn’t) this demonstrates nothing relevant.

You’re just badly arguing details while ignoring the actual point here.

You’re just badly arguing details while ignoring the actual point here.

What was the point? Two sensors (in this case alpha vanes) can fail at the same time and in the same way on an Airbus:


> With two sensors, you can detect failure. It’s very unlikely that both would fail simultaneously. If they did, it’s very unlikely that both would provide the same erroneous readings.

> First: I said it’s very unlikely, not impossible.

> You’re just badly arguing details while ignoring the actual point here.

My point here was that with two AoA sensors you can't reliably detect failure. They can both fail simultaneously and provide the same erroneous readings. And because it's not that unlikely we have to be sure that pilots can handle MCAS problems when two AoA sensors fail.

> Second: the failure of AF447’s pitot tubes was detected immediately

Because the A330 has three pitot tubes, right? Not two out of two sensors?

I'm not arguing about the AF447 case; I gave it as an example that two sensors _can_ have the same erroneous readings. Airbus engineers were not sure that could happen in real life, so the stall warning issue was a real surprise for them.

Increased redundancy in airborne systems is very unintuitive. Double redundancy can be more dangerous than having a single system (at least when you're talking engines or sensors anyway).

Triple redundancy is the norm for the specific reason that it's highly likely for symmetrically placed sensors to be prone to failing in the same way not long after each other, but having a third differently placed can keep you flying.

Although there's at least one instance where an Airbus plane had two AoA sensors malfunction at the same time and outvote the last remaining sensor.
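As a rough illustration of why three sensors help and how two bad ones can still win the vote, here is a minimal mid-value-select sketch in Python. The function name and the readings are made up for the example, and this is not a description of any real aircraft's voting logic:

```python
from statistics import median
from typing import Sequence


def mid_value_select(readings_deg: Sequence[float]) -> float:
    """Return the middle of three AoA readings (simple median voting)."""
    if len(readings_deg) != 3:
        raise ValueError("expected exactly three sensor readings")
    return median(readings_deg)


# One sensor fails high: the two healthy sensors outvote it.
one_bad = mid_value_select([4.1, 4.3, 21.0])

# Two sensors fail the same way: they outvote the healthy one,
# and the voter confidently reports the wrong value.
two_bad = mid_value_select([21.0, 21.2, 4.2])

print(one_bad, two_bad)
```

The voter tolerates one arbitrary failure, but it has no way to know that the majority is wrong when two sensors share a failure mode.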

This is why critical systems are built with higher degrees of redundancy and graceful degradation of operational envelope in mind.

Training on how to deal with unaided flight is also absolutely essential. Many Airbus accidents happened when pilots were caught off guard because the automation that kept them from breaking out of the operating envelope failed.

Long story short: Boeing has put themselves in the unenviable position of having delivered a product in ways that are not only illegal, but deadly, and short of pilots accepting a significant burden in the form of being as good as or better than the MCAS system at this point, a lot of man hours and capital have been expended to end up in a situation where every MAX is at a not inconsiderable risk of being scrapped.

It is not unlikely. I don't remember the details, but I remember a case when two broken sensors outvoted a working sensor.

The problem is that in order to save a tiny amount of money Boeing made the plane rely on unreliable sensors.

It’s not unlikely... because of some vague memory you have?

Assuming that’s true, the fact that it has happened before does not mean simultaneous failure is not incredibly unlikely. Unlikely things do happen. One might call it extraordinarily bad luck.

Maybe you're thinking of QF72 where erroneous data was "too wrong" to be smoothed out by the computer?

> " it’s very unlikely that both would provide the same erroneous readings. "

You're assuming that faulty sensors will tend to have random output. But since we're talking about a real-life mechanism, it seems likely it has some erroneous states that are more likely to occur than others. For instance, if the mechanism often fails up against one of its mechanical limits, the sensor might erroneously read out the limit position every time.

You can't actually say anything about the distribution of failure states for a sensor without evaluating that particular sensor.

> It’s very unlikely that both would fail simultaneously.

Ice, insects, birds, and volcanic ash are all things that tend to cause the pitot and the static tubes to become blocked. When you encounter ice, insects, birds, and volcanic ash, it is often the case that you get multiple simultaneous blockages. Blockages of the various tubes are not statistically independent events in practice.
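A rough way to put numbers on "not statistically independent" is the beta-factor model from reliability engineering. This is a minimal sketch, not anyone's certification math: `p_single` and `beta` are made-up illustrative values, and the model assumes a fraction `beta` of each sensor's failures comes from a shared cause that takes out both sensors at once.

```python
def p_both_fail(p_single: float, beta: float) -> float:
    """Probability that both of two sensors fail on the same flight.

    Beta-factor assumption: a fraction `beta` of each sensor's failures is
    due to a common cause (icing, ash, a flock of birds) that hits both
    sensors together; the remaining failures are independent.
    """
    if not (0.0 <= p_single <= 1.0 and 0.0 <= beta <= 1.0):
        raise ValueError("p_single and beta must be probabilities in [0, 1]")
    p_common = beta * p_single          # shared-cause failure, takes out both
    p_indep = (1.0 - beta) * p_single   # independent failure of one sensor
    # Either the common cause strikes, or it does not and both fail on their own.
    return p_common + (1.0 - p_common) * p_indep ** 2


# Hypothetical per-flight failure probability of one sensor: 1 in 10,000.
independent = p_both_fail(1e-4, beta=0.0)   # fully independent failures
correlated = p_both_fail(1e-4, beta=0.1)    # 10% of failures share a cause
print(independent, correlated, correlated / independent)
```

With these illustrative numbers, even a modest common-cause fraction dominates the result: the double-failure probability ends up close to `beta * p_single` rather than `p_single ** 2`.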

Pitot and static tubes are irrelevant here.

Would it trouble everyone to get a basic grasp of what we’re talking about before replying? I’m getting a little tired of constantly correcting basic stuff in the replies.

Well, the other issue is that the system used that bad sensor data to automatically correct the pilots with no indication of why, or of how to even turn it off. The pilots knew the system was wrong and couldn't force the plane to correct.

> With two sensors, you can detect failure

You get a reading of 20 on one sensor and a reading of 34 on the second: which one is correct? To achieve reliability, a minimum of five sensors needs to be used: four primary and one backup. If three primary agree, the system is normal. If two primary disagree, switch to the backup.

If you get a reading of 20 on one and 34 on the other, you disregard both and disable the system.

There’s a big difference between a system which must work and a system which must not go wrong. For example, the fly by wire system in an Airbus must work. A failed sensor must not disable the system. Thus, you need at least triple redundancy to keep functioning in the event of a failure.

Boeing’s MCAS system, on the other hand, doesn’t need to work. The plane flies just fine without it. It merely needs to not go crazy. Two sensors is sufficient.

Yup, the difference between Fail Safe and Fail Operational.
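The fail-safe approach described above (compare two sensors, disable the function if they disagree) can be sketched in a few lines. This is only an illustration of the idea: the function name and the 5-degree threshold are hypothetical, not taken from any Boeing design.

```python
from typing import Optional

# Hypothetical disagreement threshold in degrees, for illustration only.
DISAGREE_LIMIT_DEG = 5.0


def failsafe_aoa(left_deg: float, right_deg: float,
                 limit_deg: float = DISAGREE_LIMIT_DEG) -> Optional[float]:
    """Return a usable angle of attack, or None if the function must shut off.

    With two sensors we cannot tell which one is wrong, so on disagreement
    we do not pick a side: the dependent system is simply disabled.
    """
    if abs(left_deg - right_deg) > limit_deg:
        return None  # fail safe: no automatic trim commands from bad data
    return (left_deg + right_deg) / 2.0


print(failsafe_aoa(20.0, 34.0))  # disagreement: None, system disabled
print(failsafe_aoa(4.0, 5.0))    # agreement: 4.5
```

A fail-operational system cannot take the `None` branch, which is why it needs a third sensor to vote with.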

I've read several of these articles about the MAX and I'm not seeing the explanation for how allowing MCAS to fly the plane only on input from AOA sensors (1, 2 or 5) is different from asking pilots to fly the plane with a fogged-up windscreen. Why not cross-check against the true horizon, for example? Doesn't seem safer to unnecessarily disregard context.

MCAS only exists to paper over a small handling deficiency. Apparently nobody (at least nobody with the power to force a change) thought that it could pose a safety problem. It’s not safety critical, so who cares if it fails? Except that it can fail in a way that crashes the plane.

> MCAS only exists to paper over a small handling deficiency.

Per the article MCAS was originally intended to handle uncommon edge cases but was extended to cover additional (low speed) deficiencies. This expanded scope is what made MCAS as problematic as it is because it did away with the second input (accelerometer) and expanded the authority dramatically (from something like 0.6 degrees to 2.4 degrees of stabilizer movement).

The problem occurred in that that sensor had a privileged (unoverridable) pipeline to the horizontal stabilizer.

The pilots knew something was going wrong. That wasn't the issue. The issue was that the bloody thing could mistrim the plane to the point of nigh irrecoverability, and no one knew enough about it until two planes full of people plunged out of the sky.

The plane may be able to fly just fine; but the way this thing was developed and brought into mainstream use had critical problems in terms of essential information being communicated.

All the decisions and motivations behind this lack of communication have to some extent been traced back to trying to circumvent regulations in order to prop up the share price by scoring sales of a new airframe of comparable efficiency to the A320neo.

True horizon has nothing to do with angle of attack. Angle of attack is the direction the wind is coming from relative to the aircraft. It's possible to have a nose up attitude relative to the horizon, and have the actual aircraft motion be downwards at 10,000 feet per minute.

> There’s a big difference between a system which must work and a system which must not go wrong. For example, the fly by wire system in an Airbus must work. A failed sensor must not disable the system. Thus, you need at least triple redundancy to keep functioning in the event of a failure.

Fly-by-wire Boeings still only have two alpha vanes. Go ahead, take a look at the next 777 or 787 you come across.

Presumably the AoA sensors are not required for that system to function.

AoA sensors are not required for Airbus FBW systems to function either. But they are required for the flight envelope protection system to function.

> and disable the system

When you do that you now have an aircraft the pilots aren't certified to fly.

> When you do that you now have an aircraft the pilots aren't certified to fly.

It would increase risk. But for that increased risk to materialize into harm, the plane would also need to experience an unlikely, near-edge-of-flight-envelope situation that the working MCAS was intended to handle.

This would be comparable to a plane with any other mechanical defect that is discovered in-flight. If the above situation is expected to be too-risky to continue the flight and repair on the ground, then it would give cause for an emergency landing.

> the plane would also need to experience an unlikely, near-edge-of-flight-envelope situation that the working MCAS was intended to handle.

Failure of the AOA sensor and edge-of-the-flight-envelope events can't be assumed to be uncorrelated.

Would you not need three sensors? With only two, wouldn’t it be difficult to determine which is correct?

You don’t need to determine which is correct. MCAS is not a safety critical system and can just shut down if the sensors disagree.

That assumes the sensors are likely to disagree if they're broken, which may not be true.

That's why you need 5 sensors or so on something this mission-critical. Enough that you can have a clear democratic majority if one or two goes on the fritz.

My point is that it’s not mission critical. You can lose MCAS and be just fine. That’s why two sensors would suffice.

... or two sensors and a pilot who is in control.

> what is the probability that a two-sensor model will fail pretty soon as well

It's actually higher than the probability that a one-sensor version will fail. With two sensors, you have an effective failure if either sensor fails, and the probability of that happening is roughly twice the probability that a single sensor will fail (assuming failures are independent, which is not necessarily a valid assumption).

However, with two sensors you can tell when one has failed (even though you may not know which one it was) and so the consequences of the failure might be less severe.
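The "roughly twice" figure follows from the complement rule. A minimal sketch, assuming independent failures (which, as noted, may not hold) and a made-up per-flight failure probability:

```python
def p_any_fails(p_single: float, n_sensors: int) -> float:
    """Probability that at least one of n independent sensors fails."""
    return 1.0 - (1.0 - p_single) ** n_sensors


p = 0.001  # hypothetical failure probability of one sensor per flight
print(p_any_fails(p, 1))  # one sensor
print(p_any_fails(p, 2))  # two sensors: slightly under 2 * p
```

The exact value for two sensors is `2p - p**2`, so for small `p` it is just under `2p`. The gain from the second sensor is not fewer failures but detectable ones.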

The problem is: now pilots need to be prepared to fly the plane with a failed sensor, which is to say, without MCAS. To do that, they will need additional training. Avoiding that was the whole point of MCAS in the first place. That's the reason it's taking so long to sort this out. Technically, it's an easy problem to solve. It's the economics that are daunting.

If MCAS is disabled for some reason because of sensor failure, how does that factor into the common type rating? Same goes for if they significantly lower how much input it provides.

The AOA sensors are effectively a consumable, and would undergo regular replacement over the life of the aircraft, the odds of BOTH of them failing at the same moment in the same flight is very very small.

If they are replaced simultaneously the chance of them simultaneously not working the next take off is non negligible.

Ok, that makes sense. But is the replacement interval an order (or several orders) of magnitude shorter than the time in which a two-sensor failure can occur? I hope it is calculated.

The point is that, as safety-critical equipment, you can't fly the plane if one is broken. So you'd need to have two fail within a single flight, and fail in the same way, in order to cause an incorrect activation of MCAS. With just one sensor, it's much more likely.

Note that Airbus uses three of these sensors on their planes, so that when one fails you know which one it is, and can still rely on the signals from the two remaining good ones. Then you replace the failed sensor before the next flight.

This works until two sensors fail in the same direction:

>The aircraft's computers received conflicting information from the three angle of attack sensors. The aircraft computer system’s programming logic had been designed to reject one sensor value if it deviated significantly from the other two sensor values. In this specific case, this programming logic led to the rejection of the correct value from the one operative angle of attack sensor, and to the acceptance of the two consistent, but wrong, values from the two inoperative angle of attack sensors. This resulted in the system's stall protection functions responding incorrectly to the stall, making the situation worse, instead of better.
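The failure mode in that quote is easy to reproduce with a generic mid-value voter. This is a sketch of the general technique only; the actual Airbus logic is not described in this thread, and the 3-degree rejection threshold and the readings are invented for illustration.

```python
from statistics import median


def vote(readings_deg, reject_deg=3.0):
    """Vote among three AoA readings.

    Readings further than `reject_deg` from the median are rejected; the
    voted value is the mean of the rest. Returns (voted_value, rejected).
    """
    mid = median(readings_deg)
    kept = [r for r in readings_deg if abs(r - mid) <= reject_deg]
    rejected = [r for r in readings_deg if abs(r - mid) > reject_deg]
    return sum(kept) / len(kept), rejected


# Normal case: one vane drifts, the two good ones outvote it.
print(vote([5.0, 5.5, 12.0]))

# Two vanes stuck at the same wrong angle, one vane still reading correctly:
# the voter keeps the two wrong values and throws away the right one.
print(vote([4.2, 4.2, 12.0]))
```

Voting only protects against failures that disagree with each other. Two sensors that fail the same way look exactly like a healthy majority.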


That was my original question. What is the probability of them failing on the same flight in the same way (say they got frozen at the same angle, hit by a blizzard, etc.)? My intuition is that the chances are high given their low MTBF.

Not necessarily a diversion, but certainly not the only cause in a proper failure analysis.

I have the impression that people are overlooking the sensors. They are supposed to be very, very reliable. Two different planes got wrong readings from the sensor on the same side, which seems like a red flag to me. I wonder which end of the sensor cable the problem is on.

They’re not expected to be that reliable. They’re small vanes sticking out the side of the nose, vulnerable to bird strikes. The article mentions hundreds of reported failures over the years. The way to make the system reliable is redundancy.

The article says there were 122 failures due to bird strikes plus 85 unnamed problems in about 30 years of data.

Considering the number of flights, that does sound reliable to me.

I still think two failures in the same sensor, in the same airplane, under the same condition, in less than one year did not happen by chance.

You have a very poor understanding of probability. At a mean failure rate of 0.66 per year, it is quite probable for 2 failures to be spaced 4.5 months apart.
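For anyone who wants to check the arithmetic: if failures are assumed to arrive as a Poisson process at the rate quoted in the comment above (the 0.66 per year is that comment's figure, not something verified here), the gap between consecutive failures is exponentially distributed.

```python
import math


def p_gap_within(rate_per_year: float, gap_years: float) -> float:
    """P(next failure occurs within `gap_years` of the previous one),
    assuming a Poisson process with the given mean rate."""
    return 1.0 - math.exp(-rate_per_year * gap_years)


# Rate taken from the comment above; 4.5 months expressed in years.
print(p_gap_within(0.66, 4.5 / 12))  # about 0.22
```

Under that assumption, roughly one gap in five is 4.5 months or shorter, so two failures that close together is not in itself evidence of a common cause.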

I want to know why the Boeing flight computer needs pitot tube input at all. Modern ublox GPSes can easily obtain 3D lock on multiple satellite constellations within a minute of booting. Several of these in parallel for redundancy if you are paranoid. Flight controllers on fixed wings don't even need a magnetometer to stabilize. Just GPS path heading. If all else fails, solid state accelerometers are very reliable. Accelerometer only based dead reckoning works great. If all else fails, a single accelerometer should be sufficient to get the plane relatively stable. A barometer can help too, but doesn't seem necessary. These systems can be easily combined with fallback logic to keep the plane in the sky. I just don't understand what is so hard about this for Boeing. I understand airspeed is not the same as ground speed, but this should provide enough information to the flight computer to keep the plane in the air or at least stable.

If all you are using is ground-based position/speed, then you are ignoring the very real possibility that the air you are flying through is not stationary relative to the ground. In actual fact, especially at high altitudes, the air can be moving very fast, and the difference between ground speed and airspeed can be the difference between flying and stalling.

Also, your GPS measurements give you position, direction, and speed, but they don't give you orientation. You would have to have another instrument to feed that into the system (such systems exist).

But yes, it would be a sanity check.

Can a near-stall condition be detected solely with some combination of GPS, accelerometer, and barometer?

[Disclaimer: not an aeronautical engineer or pilot.]

Unfortunately, no. Stalling is a function of the wing's _angle_ relative to the flow of air, not of speed. If the angle is too sharp the air can't follow the curve of the wing. The critical angle is (pretty much) independent of speed. For example: if you stick your hand out of the window of a car traveling at 60 MPH, and hold it almost flat to the wind (say 80 deg.), then the air can't follow down the back of your hand. All of the "push" is backwards, and there's no push up. If you hold it at 30 deg. then the air flows around your hand, which deflects the air down and your hand up, very strongly.

Even if you're only traveling at 5 MPH, if you hold your hand at 30 deg. the air will flow around your hand and deflect it upward; it will just be a very weak effect.

The angle between the wing and the air flow is what is called the "angle of attack", and what the AoA sensors measure. The only other instrument that comes close is the Attitude gauge (the globe thing). However, it measures the plane's angle relative to the horizon, and air moving relative to the plane usually isn't parallel to the ground in conditions where the AoA matters.

Wikipedia article, with much detail, pictures, etc.: https://en.m.wikipedia.org/wiki/Angle_of_attack
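To make the difference between attitude and angle of attack concrete, here is a simplified sketch. It assumes wings level, no sideslip, still air, and ignores the wing's incidence angle, so that angle of attack is just pitch minus flight path angle; the numbers are illustrative.

```python
import math


def angle_of_attack_deg(pitch_deg: float,
                        horizontal_speed: float,
                        vertical_speed: float) -> float:
    """Approximate AoA under the simplifying assumptions stated above.

    pitch_deg: nose angle relative to the horizon (what the attitude
               indicator shows).
    horizontal_speed, vertical_speed: components of the aircraft's velocity
               through the air, in the same units; negative vertical speed
               means descending.
    """
    flight_path_deg = math.degrees(math.atan2(vertical_speed, horizontal_speed))
    return pitch_deg - flight_path_deg


# Nose 5 degrees above the horizon, but sinking at 50 m/s (about 10,000 ft/min)
# while moving forward at 100 m/s.
print(angle_of_attack_deg(5.0, 100.0, -50.0))  # about 31.6 degrees
```

The attitude indicator shows a gentle nose-up 5 degrees while the wing sees the air arriving from far below, which is why the horizon cannot stand in for an AoA vane.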

You'll need the air speed and direction.

Normally, speed is from a tube aimed into the air. Normally, direction is from a little fin that can spin.

There are lots of alternatives:

Direction can be via multiple tubes aimed into the air, each with slightly different direction.

Speed can be from a hot wire. Weather stations sometimes use this.

You can get both via lidar. You just need to make it sensitive enough to pick up a response from minute particles of dust or ice.

I think I just invented a new way: do a short-duration high-power pulse of an electron source or an EUV laser, causing the air to fluoresce at enough distance from the aircraft to be clear of the boundary layer. Track the motion of the fluorescing air with multiple cameras.

Yay, yet another way to accidentally fry people on the ground if you accidentally switch on the wrong system. Radar already provides a way to do that.

Unless you limit yourself to flying very near the ground and very near sea level, the speed of an aircraft is more complex than a single number. In fact four different speed numbers are commonly used: indicated airspeed, calibrated airspeed, true airspeed, and ground speed.

* IAS is the raw airspeed reading from the pitot tube.

* CAS is IAS corrected for instrument errors, e.g. if the plane is at an angle that disrupts air flow around the pitot tube.

* TAS is basically CAS adjusted for altitude and air pressure. It’s the aircraft’s speed relative to the air around it.

* Ground speed (or speed over the ground) is TAS adjusted for the wind. This is the number that GPS is going to give you.

IAS and CAS are particularly important for describing performance characteristics - if an aircraft stalls at 100 knots CAS, then it always stalls at that CAS. If you try to describe the stall speed in terms of TAS you go from a single data point to a graph of speed and altitude.
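A rough sketch of how those numbers relate, using the standard-atmosphere density model for the troposphere and the low-speed approximation TAS ≈ CAS / sqrt(density ratio). It ignores compressibility, so treat it as an illustration of the trend rather than something to fly by; the speeds and wind are made-up inputs.

```python
import math

# International Standard Atmosphere constants (troposphere only).
T0_KELVIN = 288.15        # sea-level temperature
LAPSE_K_PER_M = 0.0065    # temperature lapse rate
DENSITY_EXPONENT = 4.25588  # g / (R * lapse) - 1


def density_ratio(altitude_m: float) -> float:
    """Air density relative to sea level, valid from 0 to 11,000 m."""
    if not 0.0 <= altitude_m <= 11000.0:
        raise ValueError("model only covers the troposphere (0 to 11,000 m)")
    return (1.0 - LAPSE_K_PER_M * altitude_m / T0_KELVIN) ** DENSITY_EXPONENT


def true_airspeed(cas: float, altitude_m: float) -> float:
    """Low-speed approximation of TAS from CAS (compressibility ignored)."""
    return cas / math.sqrt(density_ratio(altitude_m))


def ground_speed(tas: float, tailwind: float) -> float:
    """Ground speed along track; a headwind is a negative tailwind."""
    return tas + tailwind


cas = 100.0                            # knots, the hypothetical stall speed
tas = true_airspeed(cas, 10000.0)      # roughly 1.7x CAS at 10 km
print(tas, ground_speed(tas, 60.0), ground_speed(tas, -60.0))
```

The same 100 knots CAS corresponds to very different GPS readings depending on altitude and wind, which is why ground speed alone cannot tell you how close the wing is to stalling.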

"Why [does] the Boeing flight computer need pitot tube input at all?" If there is a strong tailwind, the plane needs a much higher ground speed to avoid stalls.

If these accidents prove anything, it's that we need a computer that takes many different inputs (GPS from the tail and the nose, pitot, barometer, AoA indicator, input from the pilot, engine RPM, etc) and put them into a mathematical model of the airplane before overriding the pilot.

Additionally the AOA sensor - which is basically a weather vane - does not output usable data before the airflow around the airplane has reached certain velocity (it needs air flowing around it). Which is reported... by the pitot tubes.

Boeing engineers did consider [MCAS activation due to failed sensor] in their safety analysis of the original MCAS. They classified the event as “hazardous,” ... could trigger erroneously less often than once in 10 million flight hours.

The incuriosity of all parties to an event categorized as hazardous is astonishing. Boeing says it's a system that's completely transparent to the pilot, and therefore there is no need to describe a failure that they say would be hazardous. What part of that passes a reasonable smell test? It's safe unless it fails, which would be rare, but if it fails people could die? But meh, it's rare so let's not even find out what would happen if it happened?

Boeing must be compelled to show their work for this probability computation, because it is clearly wrong. And both Boeing and the FAA have to answer why there's no mandatory testing of hazardous events. At least what does a simulator think will happen in various states of perturbed sensor data, and how does a pilot react when not expecting such an event?

Oh, and the part about depending on a single sensor is not, per Boeing, a single point of failure because human pilots are part of the system? That's a gem. The pilots are the backup? This poisonous form of logic is perverse.

If the pilots had received training, then they could be a backup. So probably whoever did that safety analysis was assuming pilots would know how and when to turn off the system, but the pilots in fact didn't know this system existed at all.

Also don't forget that they also jiggered the switches to not be able to shut off MCAS alone, only together with automated trim.

Administrative mitigation like pilots are usually the least preferential ways of mitigating hazards. Humans are often the least consistent, most fallible part of a system. If there were engineering solutions available I would hope Boeing would implement them.

There are many examples of automated systems not accounting for novel or rare situations that the original designers didn't plan for or ignored. This is why manual override should always be available as a last resort if possible. No automated system we can design today is perfect. While protections and automatic mitigation should be implemented, taking away agency from pilots or whoever else is a recipe for disaster.

I wasn't implying that humans should be taken out of the loop. I was more referring to the hierarchy of mitigation. Most preferable are to design the hazard out of the system, followed by engineering controls, and lastly procedural/administrative mitigation.

Too often systems are designed with procedural mitigation as the primary way of controlling a hazard without realizing all the human factors that come into play. Maybe the pilot is distracted because she just had a fight with her spouse. Maybe her co-pilot had a bad night's sleep. Or maybe he isn't physically capable of generating the force necessary to move the trim wheel.

I think too often designs can over rely on administrative mitigation because the engineering controls seem too costly or difficult to implement. In some cases, this rationalization that a person "just" has to do XYZ activities to control the outcome falls short because we don't acknowledge all the factors that person is dealing with in the moment.

In this case, to someone like me without intimate knowledge of the Boeing process, it looks like they failed at their hazard analysis. They did not design the hazard out of the system (airframe design), the engineering controls were inadequate (MCAS), and the administrative controls were poorly managed (pilots did not understand the procedures for disabling MCAS or the procedures were not capable of being executed effectively). In other words, they did not apply appropriate hazard analysis and mitigation. Hindsight is easy, I know, but when schedule pressure hits a lot of these processes are rushed.

This was premature automation caused by not fully understanding the context. It results in less friction at the cost of enabling a black swan. Bad trade-off. Take the Viking Sky cruise ship, which was 1 minute away from releasing its damage potential of about 1300 people: 4 engines stopped simultaneously to protect themselves, risking the entire ship in one of Norway’s most dangerous waters during harsh weather. There are so many similar examples: tank turrets self-protecting and killing a soldier during peacetime; the automatic gearbox on a military vehicle self-protecting against overheating although the vehicle is under enemy fire, because the sensor can’t know that. We need to rethink how “security automation” should work. How do you know if an override is relevant? How do you train the operator?

It’s like these examples of security automation are designed to have the exact opposite effect as chaos engineering.

Whatever it's worth, this whole thing has traumatized me so much it makes me fearful of flying at all. But one thing's for sure, if I have any say, I'll probably never fly a 737 MAX again.

I'm sure there are many people who will do the same. In fact, every flight I do go on now, I check to make sure it is not a MAX.

I doubt there will be enough people who think this way that it would cause a problem economically for any airlines that carry this line, and I'm sure with time, people will forget, but I sure as hell will do my best not to.

"As part of the fix, Boeing has reworked MCAS to more closely resemble the first version."

Be very wary if pilot training is not part of the "fix" to getting the Max back up in the air. If MCAS is being "rolled back" then certain situations such as "The Max wasn’t handling well when nearing stalls at low speeds." come back.

Anything short of admitting "we fundamentally screwed up, and are rethinking the poor decision to pair this engine with this airframe" as well as "we are reviewing all our design processes and how the FAA oversees every step of the process" is unacceptable. MCAS is just the horrific bloody bandage that is peeling away, it's not actually the problem here.

This probably won't happen of course, all they seem to want to do is fix as little as possible as quickly as possible while denying they ever knew anything.

If I were someone powerful like a pilot union leader I would start throwing conniption fits in public and refuse to let my people fly on Max's at all.

> Anything short of admitting "we fundamentally screwed up, and are rethinking the poor decision to pair this engine with this airframe"

Can you cite the basis for this often-expressed sentiment? There's absolutely no reason why a properly-designed and -vetted MCAS system wouldn't have been a perfectly acceptable solution to any handling irregularities caused by the engine configuration.

The idea was fine. The fault was 100% in the implementation.

And no, downvotes are not a valid citation.

It's not impossible to make it work, and in the future I'd expect more and more automated systems in planes for sure.

But you have to recognize the whole engine hack is just a convoluted workaround to avoid as much pilot training as possible. The entire goal of the project seems to be to avoid ever training pilots for as long as possible. It's a brand new plane, the newest plane on the market, and the first thing you need to do to take off is turn off the cabin air conditioning. Why? Because that's what we had to do 50 years ago in the first 737.

God forbid this plane start up any way besides turning off the cabin air conditioning. If we changed that, we'd have to... *gasp* retrain pilots!

The problem with training pilots for a new machine isn't the training itself but rather, that a pilot is rated for one machine type only. If the MAX had a different type rating, MAX pilots would no longer be rated for the non-MAX 737. There are some larger US carriers which are 737 only, partly so that all pilots are trained for all of the machines. Having to split the fleet into two types would have a huge impact on business. Most likely these carriers would avoid getting any MAX as long as possible.

I don't know what is the correct answer to the problem, but clearly good safety regulations are trapping some carriers and Boeing. Sooner or later Boeing will have to build a true successor to the 737 (and I guess, they now wish they had sooner)

>The problem with training pilots for a new machine isn't the training itself but rather, that a pilot is rated for one machine type only.

Citation needed. I've never heard this before, except for some other person on a message board, and I've been involved with aviation and known pilots with multiple type ratings.

> There's absolutely no reason why a properly-designed and -vetted MCAS system wouldn't have been a perfectly acceptable solution

Surely that was exactly the MCAS that was installed in both of the planes that crashed.

But the fact it failed so badly suggests it might in fact be a rather difficult system to get right.

If you're arguing that MCAS was "properly designed and vetted," you're the only voice crying in that particular wilderness.

Boeing's implementation will be a mainstay in engineering ethics classes for the next 100 years, right next to the Therac-25 and the Kansas City Hyatt.

My prediction: no matter what they do, nobody is ever going to fly as pax on the 737 MAX, and FedEx, UPS, and (lol) Amazon are going to get really sweet deals on brand new fleets of freighters.

> It never tested a malfunctioning sensor, according to the three officials.

That one popped out to me. Man. Lots to learn.

> Boeing continued to defend MCAS and its reliance on a single sensor after the first crash, involving Indonesia’s Lion Air.

Also...how? So many non safety critical services use a load balancer and at least a couple of servers because who can trust just one thing working perfectly all the time?

Another one that popped out: test pilots were pushed to simulator-only by management, with simulators apparently being incomplete with regards to MCAS behavior. Bean counting and incompetence abound in safety critical areas of airplane design - I will think twice before stepping in another recent model of Boeing.

How would you design your bureaucracy so that this kind of thing can't happen? I see this type of failure all the time in organizations big and small. Sometimes things are just too complex to have an auteur that can understand the entire system and when every department strives to optimize for its specific goal shit can really hit the fan.

Don't disincentivize defect reporting?

Don't restrict your definition of "actionable" to only pertain to fixes that don't come with a monetary cost?

Don't get rid of your Quality people. Definitely don't get rid of them for raising too many defects.

Don't stop focusing on "the box" (I.e. the plane) because customers already assume it will be "high quality", and reengineer a physical engineering/design firm into some ungodly act of "financial innovation".

Don't treat regulations as something to be worked around.

Don't skimp on Acceptance Testing of outsourced software deliverables.

Make sure your CEO and Sales staff understand there are things you can not (and should not) sell.

Listen to your Unions. Don't try to work around them.

These aren't hard. They are all also things that by not doing them, Boeing set the stage for this cascaded failure of epic proportions.

Pity that American manufacturing and engineering firms never (in my experience) took W. Edwards Deming seriously. His 14 points are a hell of a good start.

Eventually, artificial intelligence.

Maybe not in the near future, but as technology progresses and every manufacturer strives to optimize their designs with the latest features, it will become an insurmountable task to oversee every aspect of it (efficiently). I'm not talking about actively designing, but rather about warning of or flagging potential errors. In very complex enterprises like global transport or building skyscrapers there is a lot to learn from experience and little human time, but it might be very cost-effective to train an all-observing, self-learning AI to look over everyone's shoulder and warn you about using the right type of bolts, or how the coming heavy rains in Guatemala might affect your supply chain.

It's not that far-fetched when you realize it doesn't need to really understand anything, just be very good at playing word association and micromanaging.

Preventing problems like this seems to me to be what AI as we know it is least suited to - failures of human organizations generally seem to be failures to choose the right context due to no individual grasping it, and my impression of AI today is that it doesn't even engage with the problem. You can have a computer program that recognizes cats, or plays Go, but nobody even thinks about "how do we make this same program spontaneously respond to a spilt glass of milk, or a hostage situation, or a fire alarm...", let alone take into account everything in the world while doing so. It feels to me like people have tunnel vision that is getting worse, and the "intelligent" software is inheriting that.

AI is only as good as its training data and goals/success conditions.

Yes, AI is designed in very particular situations to fit very specific tasks. I never meant there to be a single mind controlling the whole world. Today you could program an assisting AI that told you when "you missed a spot" when painting your house. It's simply not cost effective. But eventually driving and medical diagnosis AI, while imperfect, will have a better success rate than humans. Do you really think that won't apply to industrial production eventually, say in a hundred years?

I've never understood why they make planes with 1-2 sensors for a crucial reading like airspeed.

Why not have 20 airspeed sensors of 5 different types? It's an obvious failure mode that your one sensor will fail and then the pilots and the computer will be left in a state of dangerous uncertainty about the situation.

I am surprised that I haven’t seen anyone make the connection to “normal accidents” [1] yet, but feel it is quite relevant in this case.

[1] https://en.wikipedia.org/wiki/Normal_Accidents

All this talk about how MCAS was not designed properly or how it could be prevented from failing is erroneous.

Good safe airplane design is about a neutral flying design without the need for complex systems.

This plane is fundamentally flawed because the engines are in the wrong position because the landing gear is too short to fit them in the correct position.

The test pilot was clear about very poor flying characteristics at slow flying speeds requiring MCAS to be more aggressive.

This plane should not be flying with this engine configuration, as it fails the most fundamental principle of good aeroplane design: neutral handling.

FTA: The Max wasn’t handling well when nearing stalls at low speeds.

In a meeting at Boeing Field in Seattle, Mr. Wilson told engineers that the issue would need to be fixed. He and his co-pilot proposed MCAS, the person said.

It is not clear this translates into a fundamentally flawed design. It's a serious assertion, even though at the same time it's vague. Why did it need to be fixed? To avoid pilot training? Or to pass a FAR 25 airworthiness certification requirement? We can't tell from this reporting. Months after these accidents, people are still asking this question. The difference matters.

I'm very skeptical that software can legally be used to paper over aerodynamic flaws, as I read FAR 25. In fact, neutral design is not adequate; the aircraft must exhibit positive static and dynamic stability in all three axes. Fly-by-wire software doesn't make a plane with negative stability behave as if it has positive stability; the software provides various safeguards in a layered manner.

Sounds to me like the main failure here is that Boeing went too far with optimising cost, in the sense that MCAS was not properly designed.

I'm certain correctly designed software can safely control critical functions, otherwise failure in a large category of aircraft systems would result in many more accidents unrelated to MCAS.

This particular MCAS control philosophy seems to be a flawed control system. With reference to the graph (link provided by obituary_latte):


With only one sensor being "looked at" at any time, and with the system not having the sense to know to stop commanding pitch down after 26 times with attempted pilot overrides, it would seem almost beyond belief that any competent team of on-the-ground engineers (as per Boeing) would not see that the system is flawed.
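As a rough illustration of the two checks that seem to be missing (comparing both sensors, and backing off after repeated pilot overrides), here is a minimal sketch. It is emphatically not Boeing's logic: the class `TrimGuard`, its method, and every threshold in it are hypothetical values chosen only to show the shape of such a guard.

```python
from dataclasses import dataclass


@dataclass
class TrimGuard:
    """Hypothetical guard around an automatic nose-down trim command.

    All thresholds are invented for illustration; none come from a real aircraft.
    """
    max_disagree_deg: float = 5.5  # assumed allowed gap between the two AoA vanes
    aoa_trigger_deg: float = 14.0  # assumed angle of attack that triggers trim
    max_overrides: int = 2         # assumed pilot overrides before giving up
    overrides: int = 0
    inhibited: bool = False

    def step(self, aoa_left, aoa_right, pilot_overrode_last):
        """Return True if automatic nose-down trim may be commanded this cycle."""
        # Check 1: if the two sensors disagree, trust neither and latch off.
        if abs(aoa_left - aoa_right) > self.max_disagree_deg:
            self.inhibited = True
        # Check 2: count pilot overrides and stop fighting after a few.
        if pilot_overrode_last:
            self.overrides += 1
            if self.overrides >= self.max_overrides:
                self.inhibited = True
        if self.inhibited:
            return False
        # Sensors agree and the pilot has not objected enough: use the average.
        return (aoa_left + aoa_right) / 2 > self.aoa_trigger_deg


if __name__ == "__main__":
    guard = TrimGuard()
    print(guard.step(15.0, 16.0, False))  # sensors agree, high AoA
    print(guard.step(15.0, 16.0, True))   # first pilot override
    print(guard.step(15.0, 16.0, True))   # second override latches the guard off
```

The point of latching is that once either check trips, the automation stays out of the way until something outside this sketch resets it, rather than re-engaging again and again.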

Would be interesting to see if this was the case, and how the likely good engineering decision was overridden by the commercial aspect.

With increased tech comes increased scope for this kind of cost optimisation, and we must be careful in many more industries, e.g. self-driving cars in automotive.

This article reads to me like Boeing and the FAA have gotten their stories straight with each other and in naming names have settled upon someone who is no longer associated with either in an effort to take the heat off of both.

Why didn't they just bring back the already-certified 757 instead of stretching the 737?

The problem isn't the certification of the plane, it is the certification of the pilot. A 757 or a true successor to the 737 would have meant that pilots would have to be certified for the new plane - and that means they lose their certification for the plain 737. The MAX is targeted at carriers which already have large fleets of 737s, and the idea was that the same pilot could fly both.

How confident is everyone on all the other changes to the 737?

They’ve found the MCAS issues, but with a procedure this lax I’d expect several other issues to have gotten through.

Empires destroy institutions. They hollow them out until they are barely clothes for the one figure residing within. Proving something to an institution is hard. Proving something to one person is "easyish". The problem is not that one plane manufacturer's internal culture allowed it to fall behind, but that this rot and decay bypassed the controlling institutions, because these were hollowed out for empire reasons. You cannot defeat this problem unless you solve the root node, which is hidden deals, instead of proper procedure, replacing the physics of capitalism.

how is it that Boeing still has not admitted fault? i guess they are betting on getting out through a loophole in the investigation result; an investigation that they are part of, i assume? something like "it was 1% pilot error" is enough to build a pr campaign on.

I'm surprised no one has mentioned Therac 25 or Normal Accidents yet.

For reference, the Therac 25 was a computer-controlled radiation therapy machine involved in several over-exposures due to replacement of physical controls with computer based ones without complete understanding of the interactions of the controls.

The Max feels very much like that. No one can really keep a whole aircraft in their head, much less a whole aircraft development project. We use computers for that, as well as mental heuristics. But if those computers and brains are not fed all the proper data and connections, they will not find all the problems.

Additionally, there seems to be a lot of the tail wagging the dog. If this system is expected to perform according to X specifications, then by golly it will, and we will show that it does.


Edit: Please don't take the above as absolution of Boeing. Someone (a lot of someones) really should have known better.

I don't see these as equivalent, at least not based on what I've learned about the cases (feel free to correct). As I understand it, Therac-25 was due to a software bug and a genuine design process inadequacy that allowed it to cause a problem, which could happen with people acting entirely in good faith, simply because they didn't know better. That's why they created standards to address the design process. With 737MAX... pretty much literally everyone could tell you the decisions were bad, and so many seem to have been made in bad faith, specifically to e.g. avoid recertification and increase revenues in a pretty reckless manner.

It's not the equivalent, but it is the consequence of a chain of incremental changes, each of which is not sufficient to subvert safety margins, but together they change the paradigm.

As to bad faith, yes, I'm sure there was some of that, but generally decisions like these don't look like bad faith to the people making them. It's easy to get swamped by technical details.

> consequence of a chain of incremental changes

> As to bad faith, yes, I'm sure there was some of that

You're downplaying this. This is not downplayable, and this is not similar to Therac. "A consequence of incremental changes" is a rather gross way to paint this, as if it's hard for a single guy to see how e.g. having non-redundant sensors is an extremely bad idea on its own, let alone everything else. There have been multiple huge missteps, not made by accident, each of which is individually a huge red flag obvious to people in different areas. That 1 single mistake wasn't enough to bring the plane down doesn't mean the mistakes must've been small or somehow downplayable. And no, this isn't some kind of gray area with people getting genuinely swamped by technical details. It's abundantly clear there's been a ton of bad faith here, that there is still ongoing bad faith even after the fact, and that they're still unwilling to address the problem properly.

I'm sorry that it seemed like I was downplaying it.

What I was actually thinking is that this kind of thing is likely to crop up in complex systems, and if we work on complex systems we should be wary of it.
