What is the Boeing 737 Max Manoeuvring Characteristics Augmentation System? (theaircurrent.com)



I'm struck by the connection between this and the Air France Airbus 332 which crashed en route from South America to France about a decade ago.

It seems that the 737MAX (at least, the one last year -- we don't know about the latest one yet) crashed because a faulty sensor caused the computer to automatically point the nose down in order to avoid a perceived stall risk, and the pilots weren't able (or didn't know how) to disable that.

In contrast, the 332 crashed because a faulty sensor misled the flight crew into pointing the nose up and getting into a stall.

In both cases, crashes resulted from reliance on a faulty sensor -- but in one case it was flight crew and in the other case it was software. Solving this isn't going to be as simple as saying "the flight software should defer to the pilot" or vice versa.


I think you’re slightly mischaracterizing the cause in the AF447 flight in that there was a lot more going on than just the airspeed indicator:

AF447 crashed because of a cyclical series of erroneous inputs based on unreliable prompts from aircraft systems; a cycle that fed on itself to the extent that control was never regained. The BEA’s final report attributes multiple factors to the AF447 mishap: temporary and repeated inconsistencies and loss of airspeed indication; inappropriate control inputs; failure to identify unreliable indicated speed; the approach to stall; the full stall state; and the inability to apply an appropriate response in these flight regimes.

https://sma.nasa.gov/docs/default-source/safety-messages/saf...

Also, in the AF447 case the computer had to defer to the human because it didn't know what the airspeed was. In contrast, one thing that's surprising to me in the Lion Air case is that the MCAS seemed to be willing to trust a faulty AoA sensor:

The DFDR recorded a difference between left and right Angle of Attack (AoA) of about 20° and continued until the end of recording.

http://avherald.com/h?article=4bf90724/0009&opt=0

So it’s got a pair of AoA sensors that aren’t in agreement (split brain, as it were) and it’s going to command nose down anyway?

(I’m sure there’s a reason there aren’t triple redundant sensors where two have to agree or some such. I think the flight computer does require the various different kinds of sensors to agree, but I don’t think there’s triple redundancy of each sensor.)

(Edit: sorry folks, I edited this a bunch of times so the replies below may not make perfect sense. I should really start typing my replies offline.)


They can't agree though - angle of attack and airspeed are two different measurements. Together they can be used to detect a stall condition, but independently they are insufficient. Furthermore, if either is incorrect or inoperable, then a stall may be misreported or not reported when it should be.

AoA is really the weak point in my opinion: the probe can get stuck or iced up; they always seemed extremely fragile to me for something so critical. Not to mention, redundancy is limited by the number of probes, of which there are only two on all the aircraft I've worked on, so if they ice up, that's it. IIRC that's what started the chain of events which brought down the Air France flight.


Given the expense and size of airliners, it might be worth determining AoA via tracking minute dust particles with LIDAR. That should be highly reliable, with very direct speed measurement in airflow that isn't in any way disturbed by the aircraft. You also get airspeed.

Maybe there is an alternative to AoA. The whole point is to determine how air will be affecting surfaces such as the wing, so one could instead just put pressure transducers all over the aircraft. From that you can determine how close you are to stalling. You can even tell where on the aircraft a stall is imminent.


I edited my comment and removed the airspeed reference, but what I meant is that MCAS is only supposed to activate at airspeeds approaching stall and high AoA (and some other things). In the Lion Air case the plane had a pair of AoA sensors in disagreement, and the airspeed shouldn't be near stall in level flight, should it? Yet the MCAS commands nose down anyway.


AoA sensors may well be in disagreement even without malfunction: local turbulence, flying in somebody's wake, etc.


AoA sensors might disagree to a point, but 20 degrees difference is ... unusual.


Define "airspeeds approaching stall". You can stall a wing at any airspeed, if the angle of attack is great enough. And being able to tell if your angle of attack is great enough is the point of the AoA sensors.


Air France 447 was an Airbus 330. I think 333 is a typo.

It was a typo, but for "332" (which is the jargon for "330-200").



A specific aspect of the AF447 incident that has not been mentioned here is the fact that the incident aircraft's computers declared data from the Angle of Attack (AOA) sensors INVALID if the measured airspeed was LESS than 60 knots.

By itself, this is a sensible thing to do, since the AOA sensors are composed of a little vane that is free to rotate and align itself with the direction of airflow, connected to an angle resolver that measures the angle of the vane. If the airflow is not fast enough, there will not be enough force exerted to reliably and continually align the vane, thus the instantaneous angle of the vane may not conform to the angle of the airflow. To quote the BEA report:

> If the CAS [calibrated air speed] measurements for the three ADR [air data reference] are lower than 60 kt, the angle of attack values of the three ADR are invalid and the stall warning is then inoperative. This results from a logic stating that the airflow must be sufficient to ensure a valid measurement by the angle of attack sensors, especially to prevent spurious warnings.

However, the pilot flying had gotten the airspeed under that 60kt threshold, which inhibited the stall warning. When he reacted by pushing the control stick forwards (the correct reaction to a stall), the airspeed increased (a good thing, you need airspeed for lift) and the AOA decreased (also a good thing). However, the airspeed quickly increased over the 60kt threshold -- well before the AOA had a chance to decrease past its threshold. Therefore, the stall warning sounded again.
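To make the trap concrete, here's a toy sketch of the gating just described (Python; the 60 kt figure is from the BEA report, while the trigger angle and names are my inventions):

    MIN_VALID_CAS_KT = 60.0     # below this, AoA data is declared invalid (BEA report)
    STALL_WARN_AOA_DEG = 10.0   # invented trigger angle, for illustration only

    def stall_warning_sounds(cas_kt, aoa_deg):
        if cas_kt < MIN_VALID_CAS_KT:
            return False        # AoA invalid -> warning inhibited, even mid-stall
        return aoa_deg > STALL_WARN_AOA_DEG

    print(stall_warning_sounds(cas_kt=55, aoa_deg=40))  # False: deeply stalled, silent
    print(stall_warning_sounds(cas_kt=65, aoa_deg=38))  # True: push the nose down,
                                                        # speed rises, warning *starts*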

This does not fit into any mental model of how aircraft operate. A pilot knows, from their training and experience, that when you pull back on the stick, you end up stalling your plane (which causes the stall warning to sound); and to recover from a stall, you must push forwards on the stick, which unstalls the plane. After you're out of the stall, after you've recovered by pushing the stick, you expect the stall warning to be silent.

The crew were likely not aware that the stall warning cannot sound under 60kt airspeed, and were not given the opportunity or information to realize that "stall warning NOT sounding" can signify that the plane is, in fact, very seriously stalled.

The crew reacted with control inputs to shut that stall warning up, to avoid a stall, because "silent stall warning" is supposed to mean "not stalled". Acting on that stall warning, unaware of its treacherous reversed semantics -- sounding with the stick pushed forwards, and silenced when pulling back -- sent that crew to their doom:

> Until the end of the flight, the angle of attack values changed successively from valid to invalid. Each time that at least one value became valid again, the stall warning re-triggered and each time the angle of attack values were invalid, the warning stopped. Several nose-down inputs caused a drop in the pitch attitude and the angle of attack, whose values then became valid, such that a clear nose-down input resulted in the triggering of the stall warning. It appears that the PF reacted, on at least two occasions, with a nose-up input, whose consequences were an increase in angle of attack, a drop in measured speed and consequently stopping the stall warning. Until the end of the flight, no valid angle of attack value was less than 35°.

There was no AOA instrument on the flight deck either! The computers saw the AOA increase when the pilots pulled the stick back and saw the AOA decrease when they pushed forwards, but the pilots themselves never saw that, because they were not shown a single AOA number/readout. The information they were given that was derived from AOA (the stall warning's status) was completely and fatally misleading; to the point that the pilots never diagnosed a stall:

> The crew never formally identified the stall situation. Information on angle of attack is not directly accessible to pilots. The angle of attack in cruise is close to the stall warning trigger angle of attack in a law other than normal law. Under these conditions, manual handling can bring the aeroplane to high angles of attack such as those encountered during the event. It is essential in order to ensure flight safety to reduce the angle of attack when a stall is imminent. Only a direct readout of the angle of attack could enable crews to rapidly identify the aerodynamic situation of the aeroplane and take the actions that may be required.

> Consequently, the BEA recommends that EASA and the FAA evaluate the relevance of requiring the presence of an angle of attack indicator directly accessible to pilots on board aeroplanes.


Excellent point.

The aviation industry loves to blame pilots when things go wrong. In most cases they're right, but as systems become more and more incomprehensible, we're expecting far too much of the wetware that pilots, being human, run.

AF447 was an engineering crash. The collection of systems on that flight deck misrepresented the state of the aircraft to the people flying it. Now you can start equivocating about various subsystems, and in particular each subsystem may have acted in a logical way for good reasons. But the overall system, the collection of subsystems, in my opinion killed those people.

With the information the computers had that they knew was valid -- GPS being the primary one -- the plane could only have been in one of two configurations. If there had simply been some way of showing both configurations to the pilots at the same time, they would have gone level immediately. Instead, because the history of avionics is that each subsystem is either go or no-go, nobody seemed to be responsible for the collection of systems as a whole and how they interacted with people.

The idea was just to train the hell out of people and make them responsible. That works extremely well with five mechanical systems with known and transferable failure modes. It's not going to work with 20 electronic systems each of which has had the kiss of technological product development on it.


> The collection of systems on that flight deck misrepresented the state of the aircraft to the people flying it.

This is definitely the case. Forcing humans, in an extremely stressful situation, to reverse engineer what different sensors are reading (or if they're broken), what their own control inputs are doing and what the computers are thinking/doing based solely on perhaps-misleading computerized annunciations and observed aircraft behavior doesn't end well.

Critical measurements -- like angle of attack -- must be shown explicitly (a recommendation of the BEA) to the humans and not solely reserved for computer-consumption. Furthermore, there is likely room for better presenting the current state of the sensors that affect the flight control computers, the status of all the automation, and the current flight control laws to reduce the "what is it doing now?" factor that is unfortunately so dominant in these incidents that combine aircraft upsets with system malfunctions, maybe as a synoptic diagram on its own ECAM page -- not crammed onto the PFDs for a pilot actively flying the aircraft.


It makes me angry. There's a good reason I led my book off with it as an opening example. https://leanpub.com/info-ops

The plane changed modes and nobody realized it. That amazes me. To have different modes where the controls and indicators do different things, and not have that information readily apparent? Ok, this is where you could use better training. But gee. Like you said, another page on the ECAM would have been fine. You could have put a couple of wooden mechanical airplane models on sticks, manipulated by the computer to show the top two best guesses for the situation, and ended up with something better than what they had.

I think the worst part of that disaster was that aside from engineering/program/integration management, I don't see any particular group of people that were directly at fault. Yet it all came together in a disaster. I find that situation unacceptable.


>The plane changed modes and nobody realized it

This claim is directly contradicted by the cockpit transcript in the official accident report:

https://www.bea.aero/docspa/2009/f-cp090601.en/pdf/annexe.01...

At 2 h 10 min 22,1, the co-pilot notes that the plane is in alternate law.


GPS is not really usable as an indicator of airspeed. It's certainly not precise enough to determine whether or not an airplane is stalled.

The pilots had reliable altitude and attitude information, which is all they needed to determine the configuration of the plane.


In general pilots are very smart. In the case of AF447, they were not so good. Airbus and Boeing are building smarter aircraft to help pilots. This seems like a good idea. The problem is the illusion that a smarter aircraft requires less training for the pilots. IMHO, it is the opposite: when the aircraft is complex, pilots need more training to assimilate all the subtleties of its behaviour.

Recently, I discussed the STCA (Short Term Conflict Alert) with an air traffic controller. In approach, when aircraft are turning, there are often alerts that are to be ignored. They could easily be filtered out by software, but controllers prefer a simpler and more predictable system even if it causes false alarms. Maybe aircraft are becoming too smart and pilots not trained enough.


The key moment to me in the AF447 incident was when the pilot woke up.

I understand you can have the junior guy flying. I can even understand the intermittent conditions that could lead to mis-identification of aircraft status. But any system, no matter what's in there, should let the senior person on board walk in straight from being woken up and understand what the heck is going on. That is a requirement of the overall system.

Could we continue to train our way out of more and more complex avionics? Sure. Maybe. How would we ever know that we've reached a point where that's no longer an option? I'm happy with my "sleepy senior pilot" criterion. If that's not it, what is?

It's not unusual for pilots to drift in and out of quasi-consciousness when doing hard aerobatic maneuvers. That usually works fine because there are only so many things to look at. Contrast that to your average GA instrument pilot. There's more stuff. They're taught extensively on partial panel -- but the thing that kills far too often is that pilots suck at figuring out when a system is going bad. It's not flying the airplane. It's trying to figure out what the airplane is telling you.

Anybody that's been doing software for any length of time at all knows that computers can tell you a lot more complicated and nuanced stuff than you can absorb. The pilots and the airplane should never be fighting one another.


It's actually interesting that we're finally reaching a point where type systems in user interfaces are becoming incredibly appropriate.

If we model the stall warning as a boolean variable, then we are not able to signal to the user that the warning system is in fact "inoperable".

On the other hand, if the warning system had a state which would say "inoperable", or at best returned an error -- "speed too low" -- then this warning would never have a state which could be incorrectly interpreted.

This is exactly the same kind of mistake as conflating a NULL pointer with the 0 address while programming.
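A sketch of the difference (Python; the enum names and thresholds are invented for illustration):

    from enum import Enum

    class StallWarning(Enum):
        NOT_STALLED = "no warning"
        STALLED = "STALL STALL"
        INOPERATIVE = "warning unavailable: airspeed too low for valid AoA"

    def stall_state(cas_kt, aoa_deg):
        if cas_kt < 60.0:
            # "Can't measure" is its own state, never silently
            # collapsed into "not stalled".
            return StallWarning.INOPERATIVE
        return StallWarning.STALLED if aoa_deg > 10.0 else StallWarning.NOT_STALLED

With a plain boolean, INOPERATIVE is forced to masquerade as NOT_STALLED -- which is exactly the trap the AF447 crew fell into.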


How did they get into that situation in the first place - with airspeed at less than 60 kts while at cruise altitude?


By pulling on the stick to gain altitude, to a point where the plane stalls, then falls like a brick, nose up.


Right, but why did they do that? The failure mode described above was their inability to get out of the stall due to the misleading alarm pattern. How do you go from cruising to stalling in the first place?


> the pilot’s action of pulling up the nose was an irrational thing to do

Actually the inexperienced COPILOT did this. The pilot realized that was a problem eventually, but too late. And unfortunately the Airbus takes the average of both pilot inputs.


Conflicting inputs are handled by averaging and an audible "DUAL INPUT" cockpit warning. In the case of AF447, the co-pilot used his "Priority Takeover" button which meant only his control inputs were in effect.
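Roughly, as a sketch (Python; invented names -- and note the real system algebraically sums and limits the two inputs, which behaves like averaging when they oppose each other):

    def combined_pitch_command(left, right, priority=None):
        """Combine sidestick inputs (each in -1.0 .. +1.0) per the scheme above."""
        if priority == "left":
            return left, ["PRIORITY LEFT"]
        if priority == "right":
            return right, ["PRIORITY RIGHT"]
        warnings = ["DUAL INPUT"] if (left and right) else []
        return max(-1.0, min(1.0, left + right)), warnings

    # One pilot full nose-down, the other full nose-up: the commands cancel.
    print(combined_pitch_command(-1.0, +1.0))           # (0.0, ['DUAL INPUT'])
    # Priority takeover pressed on the right side:
    print(combined_pitch_command(-1.0, +1.0, "right"))  # (1.0, ['PRIORITY RIGHT'])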


The pilot doesn't get a notice in that case? Or was he just like "k what ever"


There is both a visual and aural indicator. That being said, nobody (except Bonin) knew that the PF had his stick completely back until it was too late.

You can see more here: https://safetyfirst.airbus.com/app/themes/mh_newsdesk/docume...


That stuff is really interesting, I've been digging through random reports, documents and manuals for two hours now.


Yeah, it really is. If you haven't already, check out the full BEA report - it's incredibly comprehensive and leads to lots of "jumping off" points for further research.

https://www.bea.aero/docspa/2009/f-cp090601.en/pdf/f-cp09060...


The plane announces "priority left/right" and the pilot whose input is being ignored gets an indicator on their display.


>And unfortunately the Airbus takes the average of both pilot inputs

Really? That's the best way to handle that?


From what I read, it's a case of "can't do it the right way originally because of patents/legal/whatever; can't change it to the right way ever since because then every airbus pilot would need to requalify, every book rewritten, etc ...". Sounds like a terrible case of rotten legacy support.


A dual input alarm sounds when that happens.

The pilots should never both be trying to apply control inputs at once. There isn't really a good way of handling the situation where the pilots are not communicating and are trying to make conflicting control inputs.


Since I read that Popular Mechanics article, I think the exact same thing every time I board an Airbus plane.


Airbus has since updated its flight control feedback systems


Source for this?


I may be mistaken; I had thought I had read that Airbus had acted, but on checking up, I think not in ways which would make my brief implication useful.


IIRC the problem was an iced pitot tube that gave unreliable airspeed, coupled with operating in a different reality (flight control mode, per Airbus), coupled with differing inputs from both pilots, coupled with the fact that the stall warning was disengaged below a certain airspeed. All too often these crashes are not one single cause but a series of unfortunate events.


After I read about that I was nervous flying on Airbus for a long time. My understanding is that on Boeing the two yokes are linked so it would be impossible for one pilot to nose up without the other having his yoke move too.


So it becomes a power struggle and assuming roughly equal strength of both pilots, we are still averaging ...


If that were the only factor, yes. The mechanical linkage though can help each pilot understand and discuss the actual inputs the plane is receiving, as well as being consistent with all the aircraft that they flew in primary civilian training.


If there's a 40 lb (IIRC) difference in force inputs, the rod linking the two columns breaks, at which point you should figure out what you're doing.


Yes, but at least that gives a signal to each pilot what the other is doing.

It wasn't until a few minutes before the Air France flight crashed that the pilot realized what the copilot was doing.


>It wasn't until a few minutes before the Air France flight crashed that the pilot realized what the copilot was doing.

This is not true. See e.g.

http://www.airliners.net/forum/viewtopic.php?t=772033&start=...


About 2 minutes passed between the captain returning to the cockpit and him realizing the problem. [1]

02:11:43 (Captain) What the hell are you doing? [captain returns to cockpit]

02:13:40 (Bonin) But I've had the stick back the whole time! [At last, Bonin tells the others the crucial fact whose import he has so grievously failed to understand himself.]

02:14:29 - recording stops

[1] https://www.tailstrike.com/010609.html


These are all descendants of the original popular mechanics article, which I don't think is a reliable source. The actual accident report is available to look at now (https://www.bea.aero/docspa/2009/f-cp090601.en/pdf/f-cp09060...). The transcript is in an appendix: https://www.bea.aero/docspa/2009/f-cp090601.en/pdf/annexe.01...

Note that at the point of the transcript you're referencing, the captain was in the cockpit but not seated at the controls. It therefore would make no difference to his awareness of the situation whether or not the control sticks were linked, as he wasn't holding either of the sticks.

Unlike the captain, the PNF and PF (who were seated at the controls) both seem to have thought that they needed to climb. See 2 h 13 min 39,7 and 2 h 13 min 40,6.


Of course the official accident report is a better source, but I’m not arguing linked controls would have helped. Just that it took the captain a few minutes to grasp the situation.


The senior copilot was in the other seat, and the problem was that the unlinked controls meant that the senior copilot did not know, during the previous 10 minutes, what the junior copilot Bonin, who had the controls, was doing, and that his own input to put the nose down was doing nothing.


As I said:

Unlike the captain, the PNF and PF (who were seated at the controls) both seem to have thought that they needed to climb. See 2 h 13 min 39,7 and 2 h 13 min 40,6.

There's also evidence from the rest of the transcript that the pilots at the controls were perfectly capable of observing (or inferring) which inputs were being made.

I think there's an underlying misconception here that an Airbus sidestick has a "position" akin to the position of the yoke of a plane with fully manual controls. An Airbus is flown with the sidestick remaining in its central position 99% of the time. To climb, for example, the pilot makes a short backward movement and then lets the stick return to its neutral position. The flight control software will maintain the specified climb until another stick input is made. For this reason, stick inputs tell you very little. If, for example, you see the PF push the stick forward, that doesn't necessarily mean that the plane's being put into a dive. The PF might just be reducing the rate of climb. You'd have to keep track of the complete history of stick inputs to know what was going on.

On top of that, as only one pilot has their hands on the stick at any given time, for the PNF to visually observe brief movements of their own sidestick would hardly be any easier for them than just looking at the sidestick of the PF, who's right next to them.
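A loose caricature of that point (Python; the real control law commands load factor and roll rate, but this shows why stick position tells you so little):

    # The stick commands a *change*; the flight path holds the accumulated
    # command after the stick re-centers.
    commanded_climb = 0.0
    for stick in (1.0, 0.0, 0.0, -0.3, 0.0):   # brief pull, neutral, small push, neutral
        commanded_climb += stick
        print(f"stick={stick:+.1f}  commanded climb={commanded_climb:+.1f}")
    # The stick ends at neutral while the command sits at +0.7: glancing at
    # the other pilot's stick now tells you almost nothing.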


At 2h 10m the PNF (copilot in the left seat) warns the guy on the right about speed and says he needs to go back down. I believe that this is where the issue is, where he doesn't realise that the guy on the right is still trying to climb. I think this is something I also read in the article, and it's good to see it in the logs (thanks for posting them).


Yes, but AF447 was caused by the pilot flying forgetting the basic rule of flying the correct "pitch and power" when you lose situational awareness. If he had left the controls alone, nothing would have happened and the errors would have cleared when the sensors defrosted. The problem was further compounded because he took back control from the co-pilot, who was correcting the situation, without asking, and reapplied the incorrect inputs. By the time the plane was going at 60 kts IAS at cruise altitude, it was so far out of the normal flight envelope that the computers started sounding erroneous stall warnings as the plane sped up again, and it wasn't until the captain came onto the flight deck that the situation was fully understood, and by then it was too late.

Whereas in this case an undocumented system causes the plane to seemingly automatically and repeatedly attempt to fly into the ground if there is a sensor malfunction. The pilot has a short window of time (30-90 seconds?) to figure out what is going on, switch off some specific obscure switches to disable the malfunctioning system, and then wind the trim back manually before the plane becomes uncontrollable.


They were different sensors though. In Air France 333, the pitot static tube, which measures airspeed, was jammed by ice. In Lion Air 610, the alpha vane, which measures angle of attack, was faulty.

But I agree that the solution is not as simple as pilot vs. software. At the end of the day, these sensors and computers need extra redundancies.


If redundant sensors contradict each other, what should the pilot do next? That's not easy to figure out. Even if you had say 10 redundant sensors, and 1 doesn't agree with the other 9, is there cause for concern or no? And I'm not sure they'd install 10 sensors either, eventually you'll also get information overload for the pilot.


The aircraft will have configurations that are known to be good. For example a particular thrust level and pitch at cruise that should avoid overspeed. Or using a normal ascent profile at lower altitude, or full thrust in a go around. It should be perfectly possible to divert and land without functional air speed or AOA sensors. And there are other good sources of independent data. For example GPS/radar ground speed, the engine stats, vertical speed, wind data etc.

Although I agree that some pilots in the past have not performed well with these kinds of complex failures. And critically, the solution could be an argument for more automation or less automation. And that plays into more political discussions about the future.

I wonder if a third option would be to report data in real time to the manufacturer. Have a mission-control-style setup that is trained to rapidly analyse data and give immediate advice to the pilot. With better communication systems this could be possible. Although it does erode the concept of "pilot in command", which is another political issue. Although pilots take advice all the time from ATC and that works well. Why not advice on the aircraft systems?


> If redundant sensors contradict each other, what should the pilot do next? That's not easy to figure out.

That's not easy, but good engineering means making plans for every sensor failure. Some failures are catastrophic - then make redundant sensors. Some are not - just ignore that sensor and try to infer that information from other sensors, or defer to the pilot. In my car there are many sensors, and the engine can run with many of them defective or disconnected, though typically in a "safety mode". I even had a failure of the gas pedal (CAN-connected, so my car is drive-by-wire) and the car managed to drive, but it used an on-off mode and accelerated and decelerated very slowly. Similarly, even with a faulty sensor you can try to extract some data from other similar sensors and create safeguards.


That's generally not how redundant systems work. They'll usually implement some form of Kalman filter (or whatever the cool kids use these days) to filter out bad readings, while alerting the pilot that one of the sensors seems to be faulty.

Usually things that can kill people are triple redundant: when one goes bad, you can trust the other two readings.

I'm surprised there is a single point of failure like this in the design. But after I've read a bit about how Boeing released this product, maybe I'm not surprised.
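A toy version of that filter-and-alert idea -- an innovation gate rather than a full Kalman filter, with invented thresholds:

    def make_gated_estimator(alpha=0.25, max_jump=5.0):
        """Smooth plausible readings; reject and flag implausible ones."""
        state = {"estimate": None}

        def update(reading):
            est = state["estimate"]
            if est is None:
                state["estimate"] = reading
                return reading, False      # (estimate, fault_flag)
            if abs(reading - est) > max_jump:
                return est, True           # hold the estimate, alert the pilot
            state["estimate"] = est + alpha * (reading - est)
            return state["estimate"], False

        return update

    update = make_gated_estimator()
    for aoa in (2.0, 2.2, 2.1, 22.5, 2.3):   # one wild AoA sample
        print(update(aoa))                   # the 22.5 is rejected and flagged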


That doesn't account for a systematic failure. It doesn't matter if you have 1000 sensors if they all fail at the same time.


Then you could also use more than one type of sensor, for example accelerometer & GPS


Personally, I believe the problem is too few sensors. Don't make it 3 or 10, make it 10K sensors. If 10% are off, you can simply ignore them, and not even alert the pilot.

Living beings have millions of sensors for most input and are able to work with conflicting sensory information because of this high redundancy.


> In Air France 333, the pitot static tube, which measures airspeed, was jammed by ice.

In such a case, all of your 10k sensors would have been iced.


But they would not go down all at once like one or two sensors. As they would be spread across the plane, the system would "feel" it is losing sensors and be "aware" of it.

It would know when to invalidate sensor input and tell the pilot about it. With a mix of Pitot and other sensors, it would be even more precisely able to know when to stop trusting the sensors. This is something that is currently lacking.


Do you propose to grow airplanes, too?


Minor correction: I believe that was Air France flight 447, not 333, which involved an Airbus A330.


The "Airbus 332" he refers to is shorthand for the Airbus A330-200.


My mistake! Thank you for the correction!


Different sensors, yes. But similar issues -- in one case the pilot stalled the plane, and in the other case software designed to prevent a stall caused an unrecoverable dive.


AF447, I think, was just a bunch of UI errors that led to the deaths of hundreds.

As I understand it, after one airspeed sensor failed, the plane switched from "normal law" to "alternate law". In "normal law", there is no control input that can take the plane out of its safe flight envelope (i.e., you can pull all the way back at the lowest possible airspeed and not stall the plane). In "alternate law" there is no such protection. When AF447 crashed, the first officer spent most of the time pulling all the way back on the joystick, which will (and did) stall the plane. This, to me, indicates that he didn't have any idea that the plane had switched into alternate law, which seems like a UX error. I bet if all the screens switched from white-on-black to black-on-white when going into alternate law (or something more night-vision friendly) there would have been no doubt and there would have been no incident.

The other problem was that the captain and first-officer were inputting conflicting control inputs and didn't know it. The plane just averages them, and provides no indication that it's doing so. The captain sees that he's pushing all the way forward on the joystick and has to wonder why the plane is still stalling. Meanwhile, on other planes the control columns are mechanically linked, so you have to fight against the force of the other pilot which will signal you to say something like "dude let go" and regain control over the plane.

I feel like we should have fixed this in the 1980s when this happened:

https://www.youtube.com/watch?v=-kHa3WNerjU

But here we are in 2019 with basically the same problem. I fear that Boeing hasn't learned Airbus's lessons and the 737 MAX is confusing people in the same way that people were confused by the first A320s. The solution has always been "blame the pilot" but I blame the software.


>I bet if all the screens switched from white-on-black to black-on-white when going into alternate law (or something more night-vision friendly) there would have been no doubt and there would have been no incident.

You're talking about two pilots who couldn't figure out that the plane was stalled even though it was descending rapidly with a nose up attitude. There's not much that UX can do to compensate for that level of incompetence. Holding the stick back in that situation would never be appropriate, regardless of which control law applied.

>The other problem was that the captain and first-officer were inputting conflicting control inputs and didn't know it.

This is a persistent myth about this accident. See e.g. the following post:

http://www.airliners.net/forum/viewtopic.php?t=772033&start=...


"Solving this isn't going to be as simple as saying "the flight software should defer to the pilot" or vice versa."

Actually it is.

2 sensors: Defer to the pilot with alarm that sensor data is unreliable if inconsistent.

3 sensors: trust majority of signals.

But "deciding" by software for one signal when having only 2 sensors with inconsistent results is negligence. https://en.wikipedia.org/wiki/Air_France_Flight_447#Airspeed...


The root cause is that they were trying to work around a hardware problem with a software patch. In the old days, they would have balanced the incidental high-AoA lift from the moved and bigger nacelles with a larger elevator surface, instead of adding a jack-in-the-box surprise feature to already borderline-incomprehensible autotrim logic.


As a non-pilot: AF447 crashed because the guy flying it kept pitching up during the entire duration of the stall, 30k feet. The captain finally realised what was going on just a few thousand feet off the ground and started yelling at the pilot flying to pitch down, and he had to repeat the command several times because the guy just wasn't listening. Honestly it's just sad. 200 people killed because of an idiot. The same captain said "angle of attack 10 degrees" just a second before impact. The same pilot flying suggested deploying the airbrakes because he felt like the plane was going too fast -- during a stall. I mean it hurts to even think about this.


As a non-pilot as well, it is a crazy situation.

My understanding is that when you can't trust your instruments, you set a certain level of power and pitch and you know you'll be ok. The plane will keep flying.

This guy is pulling back hard on the stick the entire time. I don’t think many pilots would ever think “this is the appropriate thing to do in this situation.”


That's what panic and confusion does: you stop thinking and cling to reflexes.


Really, flying by instrumentation is what you're saying is the problem and that's the truth. There's no way around that though. I did read that the sensor for the MAX is on both sides but is only activated on the side of whoever is flying, the pilot or the co-pilot. If the sensor fails, the plane fails. They could at least have both sensors active as a failsafe in case one is acting up.


If the sensor fails, the plane fails

I don't believe that's true - if the sensor gives an erroneous reading, it's up to the pilot to recognize uncommanded trim and disable it by shutting off power to the system that moves the stabilizer.

Really, flying by instrumentation is what you're saying is the problem and that's the truth

Flying by faulty instruments is clearly bad, but flying by (working) instruments is much better than the alternative, every pilot going through instrument training learns very quickly that he needs to trust the instruments when flying in poor visibility, his senses don't work.

They could at least have both sensors active as a failsafe in case one is acting up

If you have two sensors and one is acting up, how do you know which one is true and which one is false?


> If you have two sensors and one is acting up, how do you know which one is true and which one is false?

Typically it is possible to determine this by looking at reading stability. If one sensor is constantly stuck, or changes values too fast (0-100% several times per second) or too slow, you know it's faulty. Those sensors should be engineered so that you can know when one is faulty. If you can't detect a faulty sensor, it's just bad engineering.
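For illustration, those two checks might look like this (Python; thresholds invented -- and note a genuinely steady reading in calm cruise would also trip this naive "stuck" test, so a real check needs context about expected variation):

    def classify_sensor(samples, dt=0.1, stuck_eps=1e-3, max_rate=50.0):
        """samples: recent readings taken dt seconds apart."""
        deltas = [abs(b - a) for a, b in zip(samples, samples[1:])]
        if all(d < stuck_eps for d in deltas):
            return "FAULT: stuck"
        if any(d / dt > max_rate for d in deltas):
            return "FAULT: implausible rate of change"
        return "plausible"

    print(classify_sensor([5.0, 5.0, 5.0, 5.0]))     # frozen vane
    print(classify_sensor([5.0, 80.0, 3.0, 90.0]))   # thrashing signal
    print(classify_sensor([5.0, 5.3, 5.1, 5.4]))     # looks healthy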


If you have 2 sensors, how do you reliably detect that one of them is bent or installed incorrectly and gives an inaccurate, but stable, reading?

If the sensor is completely malfunctioning it's easier to detect, but what about when it's just wrong?


> how do you reliably detect that one of them is bent or installed incorrectly

During standard pre-flight or manufacturing tests. It should be verified by applying some change and checking that this change is measured correctly.

Generally for critical sensors, you can add some circuitry or even mechanical means of checking (e.g. a heater on a temperature sensor). If the mechanical side fails, the whole sensor module is marked as failed. Critical sensors should be designed in a way that you know when they fail.


For many sensors, especially critical ones, there are multiple independent sources (like indicated airspeed). If they deviate, there are standard procedures to determine which one(s) are faulty. For IAS you basically set a determined thrust at a determined pitch. Then you have a table which gives you IAS for the altitude you are at. This will identify the faulty pitot tube. I think for other sensors there are similar procedures.
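A sketch of that cross-check (Python; the table values are invented placeholders, not real performance data):

    # Hypothetical pitch/power table: (altitude_ft, pitch_deg, N1_percent) -> expected IAS
    PITCH_POWER_TABLE = {(35000, 2.5, 85): 270}   # knots; made-up number

    def faulty_ias_sources(readings, altitude_ft, pitch_deg, n1_percent, tol_kt=15):
        """Hold a known pitch and thrust, then flag the IAS sources that
        deviate from the table value."""
        expected = PITCH_POWER_TABLE[(altitude_ft, pitch_deg, n1_percent)]
        return [name for name, ias in readings.items()
                if abs(ias - expected) > tol_kt]

    print(faulty_ias_sources({"capt": 272, "fo": 268, "standby": 180},
                             35000, 2.5, 85))     # -> ['standby']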


You can also have 3 sensors and use a simple 2-of-3 check. The likelihood of 2 failing at the same time would probably be pretty low.


> it's up to the pilot to recognize uncommanded trim and disable it by shutting off power to the system that moves the stabilizer.

You have other things to do at that stage of flight (right after flaps have been set to 0).


Then you ought to react pretty quickly to disable the malfunctioning system that's going to force you into a crash, so you can continue flying the plane.


Yeah, you have no idea what you're talking about. An aircraft is not like your PC, where you just click "ok" on some spurious popup and everything is fine.

Cognitive load is a factor in many accidents. You can't have the plane doing something stupid during a critical stage (like take-off) with the excuse that you can "just disable it".

If a plane has an issue in keeping a stable flight but instead wants to point up or down beyond flight limits it is defective.

Edit: https://twitter.com/SarahTaber_bww/status/110525655715412787...


Isn't handling unexpected conditions the entire reason for having a pilot onboard?

If the pilot is not expected to detect and respond to something like a faulty sensor that causes an unwanted control surface movement, then why have the pilot onboard at all? Why not just let the plane fly itself?


The 37 max was designed to change stabilizer pitch in a key moment of unstable flight and that trim is not able to be overridden by elevator input. You have to disengage a thing a t a critical altitude when flaps 0 happens at only 1000’ or so agl on a system you didn’t know exsisted and only activates under a few uncommon parameters to counteract the large rotation moment that the forward engines give. You really don’t quite know what you are talking about. 1000’ per min vertical speed is not uncommon for a controlled descent.


While you seem to know what you're talking about, you're not very good at expressing it at all (lack of proper grammar doesn't help).


it was on mobile, sorry.

perhaps this is more clear:

The 37-max was designed to change the stabilizer to be more nose down in attitude in a key moment of unstable flight and that nose down stabilizer trim is not able to be overridden by elevator input on the yoke because of the design. To fix it, you have to disengage the runaway trimming of the stabilizer at a critical altitude when you disengage flaps at about 1000’ or so agl(above ground level) on a system you didn’t know existed. To further complicate this, it only activates under a few uncommon parameters(max thrust, high angle of attack, steep bank angle, flaps retracted, autopilot disengaged) to counteract the large rotation force that the forward engines give. 1000’ per min vertical speed is not uncommon for a controlled descent so you don't have much time to fix it. You really don’t quite know what you are talking about.


"Unexpected conditions" does not mean "the aircraft software design is stupid and will try to kill you".


Oh please. Go read about some of the rudder hard-over failures the 737 suffered over the years. A pilot should be able to fly out of these things. I think unfortunately the children of the magenta line are the ones truly trying to kill us.


Surely autopilot failure is a predictable fault regardless of the underlying cause?


There isn't one predictable manifestation of autopilot failure you can train over.


"Unexpected conditions" are different from "a plane that actively tries to commit suicide"

And if you think all pilots do is detect and respond to faults, I suggest you learn more about their duties in the cockpit.


If the pilots don't have the cognitive capacity to take over manual control at any time, let alone react to the question of sensors that are returning impossibly divergent readings, surely they are already overloaded.

For that and other reasons I don't know if it's as difficult as you say; this sort of thing is where very intelligent UX design can play a major role.

For impossibly divergent sensors as there were in this particular case, if the difference between divergent sensors is between a normal flying pattern and a crazy disastrous scenario, two things should happen— (1) both sensors should be considered untrustworthy and their use in automation should be suspended, and (2) the pilots should be asked the straightforward question "are you in a normal flying pattern" and if so, re-enable the sensor which fits this pattern.

Additionally/alternatively, they could fit a few off-the-shelf commodity accelerometers/gyros directly to the circuit board of the flight computers and use their readings exclusively to predict which "real" sensor is most likely correct. For example, if one "real" sensor suddenly swings and the other one doesn't, has there been any corresponding shift in the accelerometers? This would be a perfect candidate for machine learning, with a few thousand hours of flight simulator data with simulated sensor failures based on known sensor failure modes.
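Even without the machine learning, the consistency test alone is simple to sketch (Python; tolerance invented):

    def suspect_sensors(sensor_deltas, imu_delta, tolerance=2.0):
        """sensor_deltas: per-sensor change over the last interval.
        imu_delta: the change implied by the independent accelerometers/gyros.
        Flags sensors whose swing the IMU cannot corroborate."""
        return [name for name, delta in sensor_deltas.items()
                if abs(delta - imu_delta) > tolerance]

    # Left AoA vane swings 20 degrees; the right vane and the IMU see ~nothing:
    print(suspect_sensors({"aoa_left": 20.0, "aoa_right": 0.3}, imu_delta=0.1))
    # -> ['aoa_left']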


> and (2) the pilots should be asked the straightforward question "are you in a normal flying pattern"

How do you do this when the pilots can't see outside?


I'm not saying that answering the question is mandatory. But if they're cruising and everything seems fine, or if they're mid-takeoff and everything seems fine... then the sensor indicating a nosedive is probably the faulty one.

If they respond with “no” then they should be given maximal control of the aircraft with the computer not restricting control or making any large scale adjustments itself.


I don't think it's quite that simple. In the case of the Lion Air flight, one of the AoA sensors had been recently replaced due to problems on an earlier flight; in the case of the Air France flight, there were icing conditions which were known to affect the airspeed sensors.

It's possible that both of those flights could have been saved given better awareness of how much those sensors could be trusted at the time -- for example, if Lion Air's engineers, upon fiddling with AoA sensors, had been able to tell the avionics systems "the AoA sensors might be unreliable; don't do anything automatically based on those sensors which might crash the plane".


It is difficult to understand the design decisions made in these complex software systems without any real domain knowledge, but one can't help but wonder whether this whole MCAS system was pushed into the field prematurely.

For example, we know that the flight control computer has access to altitude information. Why would it let the MCAS push the nose down, when it knows that you have very little altitude margin? (I am talking about the Lion Air flight).


Preventing a stall at low altitudes is arguably more important than preventing one at high altitudes.

The MCAS is a compensation for extra powerful engines. Those engines are most powerful at low altitudes as well.


That is correct, but remember that the aircraft wasn't stalling; the AoA sensor was bad.


I think the problem is that this whole automatic system exists to prevent crashes because the engines change how the plane behaves compared to a normal 737 -- which is the retraining Boeing was trying to avoid in the first place.

So for them to turn off the automatic system, pilots really need to be trained on how to fly this variant of the 737.


Usually, with things like airspeed indicators, there are two sensors, one on each side, which get activated depending on who is flying, like you said. However, with airspeed the system is supposed to monitor the difference in readings, and if one sensor goes faulty it should switch over. Why that doesn't happen with these sensors I have no clue.


I don't understand why they don't have multiple redundant sensors or multiple technologies that detect the same thing.


In the 737MAX case, there is no human interaction needed for the crash.


It sounds like no one system of keeping the plane level is ever going to be fool- and error-proof.

OK, this might be a stupid idea, but what about a simple old spirit level, a fancy oil-based one that will never freeze, mounted to the side of the cockpit to give the pilots an emergency true-level backup when all else fails?


A spirit level does not work in an accelerating frame of reference. From the point of view of a spirit level, there is no difference between straight and level flight (1g due to earth's gravity) or a spiral dive (>1g due to a spiraling turn downwards).

Gyroscopes work, which is why artificial horizons are a thing. But an artificial horizon doesn't show anything except where your nose is pointing. You need all sorts of other sensors to tell you which direction you are actually going.
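To put numbers on the coordinated-turn case (Python; plain textbook physics, nothing aircraft-specific):

    import math

    # In a coordinated turn, lift stays perpendicular to the wings, so the
    # *apparent* gravity (gravity + centripetal acceleration) points straight
    # through the cabin floor.
    bank_deg = 60.0
    load_factor = 1.0 / math.cos(math.radians(bank_deg))
    print(f"{bank_deg:.0f} deg coordinated bank -> {load_factor:.1f} g, felt straight 'down'")

    # A spirit level (like your inner ear) measures apparent gravity only, so
    # its bubble stays centered through that 60 deg bank -- even in a
    # descending spiral. It cannot reveal your attitude.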


OK, thanks for the succinct explanation. It's nice to know that people smarter than me are trying to solve this.


Not smarter, just an aviation layman geek :)


A spirit level does not tell you anything. They have existed in planes almost since planes have existed.

You cannot differentiate a force coming from the plane's acceleration or deceleration from gravity (which is itself indistinguishable from an apparent acceleration). People in the space station are subject to almost the same gravity as you or me.

In fact, a spirit level or your ears provide you with wrong information all the time when there is no visibility.

With planes, the first thing you have to learn is not to trust your senses.


It's ironic that it's a Boeing with bad software (or sensors) interfering between yoke and plane. After all, historically, Boeing has had the philosophy of trusting the pilot with direct control of the plane. This is in comparison with Airbus, where controls are by stick instead of yoke, and pilots are taught to trust the flight computer and move the controls less. (Of course, these days both manufacturers build fly-by-wire planes.) [1]

The addition of a 'maneuvering characteristics augmentation system' seems very unlike Boeing. I guess it became necessary after adding too-big engines without otherwise adjusting the airframe... [2]

This "we'll fix it with software later" attitude works well with CPU errata and maybe Teslas, but it doesn't look like we're there yet for airplanes, unfortunately.

[1] https://www.pprune.org/tech-log/10321-stick-vs-yoke-airbus-v....

[2] https://theaircurrent.com/aviation-safety/what-is-the-boeing....


This "we'll fix it with software later" attitude works well with CPU errata and maybe Teslas, but it doesn't look like we're there yet for airplanes, unfortunately.

I'm sure I'm preaching to the choir here, but I'm absolutely against the "fix it with software" attitude. I'm working with embedded systems now, and while we can often make it work by tweaking the software, it always results in a less than optimal solution. Fix the hardware; get it right, preferably before manufacturing a few thousand units.


That doesn't work at the current speed of development any more. Time to market is more critical than maturity, sadly. That's why every system can be fixed in software now, and that's also why every system can be pwned now.


Lots of great technical discussion about MCAS and potential improvements in here.

I think the bigger issue though that needs to be addressed is the certification and training process. What factors, other than the obvious engineering factors, were driving the design process? Why were those factors allowed to also influence the training administered to pilots? After the Lion Air crash the response from US pilots unions seemed to speak volumes. Pilots were not informed of this system on their new aircraft and as such were treating the aircraft the same as all other aircraft falling under the 737NG type rating.

Airlines and Boeing worked together to minimize training costs to help the aircraft sell more easily, and as a result critical systems knowledge was left out. Knowledge that likely could have saved Lion Air.


Here's a pilot describing the MCAS system on the 737 Max:

https://youtu.be/zfQW0upkVus?t=220


Super interesting; he's really great at explaining things in layman's terms. Thanks for sharing.


I follow Mentour's channel pretty regularly. For anyone with flying anxiety, it's great to see how seriously they take their jobs, how regimented they are, and how safe you really are when flying.


The big issue here is how much of the existence of this system, the desire NOT to make OTHER changes to the plane that would inhibit the pitch-up tendencies, and the lack of necessity for another type rating were all driven simply by an attempt to skirt existing regulations and save Boeing and Southwest a ton of money.

The reality is that this was the whole deal with the Max. All of these decisions were made in a context where avoiding the requirement for a new certification or type rating for pilots was a must. We can't avoid the reality that had the necessity to skirt regs not been there, Boeing would have definitely come up with a different solution from an engineering perspective.


One of the articles this one links to [1] makes what might be a significant point, at least with regard to the Lion Air crash:

"[The MCAS-commanded trim change] can be stopped by the pilot counter-trimming on the yoke or by him hitting the CUTOUT switches on the center pedestal. It’s not stopped by the pilot pulling the yoke, which for normal trim from the autopilot or runaway manual trim triggers trim hold sensors. This would negate why MCAS was implemented, the pilot pulling so hard on the yoke that the aircraft is flying close to stall.

"It’s probably this counterintuitive characteristic, which goes against what has been trained many times in the simulator for unwanted autopilot trim or manual trim runaway, which has confused the pilots of JT610. They learned that holding against the trim stopped the nose down, and then they could take action, like counter-trimming or outright CUTOUT the trim servo. But it didn’t. After a 10 second trim to a 2.5° nose down stabilizer position, the trimming started again despite the pilots pulling against it. The faulty high AOA signal was still present.

"How should they know that pulling on the yoke didn’t stop the trim? It was described nowhere; neither in the aircraft’s manual, the AFM, nor in the pilot’s manual, the FCOM."

From other sources, it seems that Boeing's decision not to reveal this feature was justified, in part, on the grounds that the documented procedure for dealing with a trim runaway was the same as for preceding versions of the aircraft. This does not seem to have taken into account that the pilots' experience would differ from what they had been led, through their training, to expect, and was likely to cause confusion over what was going on.
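Putting the quoted behaviour side by side may make the trap clearer. A simplification in Python pseudocode of the Leeham description -- my sketch, not Boeing's logic:

    def conventional_runaway_trim_stops(pilot_pulls_yoke, yoke_trim_switch, cutout):
        # What crews trained for: column-mounted trim-hold sensors stop the
        # trim when the pilot pulls against it.
        return pilot_pulls_yoke or yoke_trim_switch or cutout

    def mcas_trim_stops(pilot_pulls_yoke, yoke_trim_switch, cutout):
        # MCAS deliberately ignores the pull -- its purpose is to trim
        # nose-down precisely while the pilot is pulling hard near a stall.
        return yoke_trim_switch or cutout

And per the quote, with the faulty high-AOA signal still present, counter-trimming on the yoke only pauses it: another roughly 10-second nose-down cycle follows, so the CUTOUT switches are the only durable way out.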

Of course, now that this has come to light, one would hope that every 737-MAX pilot would have by now been well-prepared to handle the situation - unless there is some additional issue which has not been identified yet.

[1] https://leehamnews.com/2018/11/14/boeings-automatic-trim-for...


To add to your hypothesis - making a change to the fly-by-wire system was probably much cheaper than making airframe changes to make the aircraft more neutral in flight (airframe changes which would likely also negatively affect fuel economy).

So Boeing took the cheap route and decided to have the computer handle the pitch-up tendencies of the aircraft...

This kind of fix is quite common on fighter aircraft which are purposely designed to not be neutral/stable in flight for better handling characteristics (among many other reasons).

I have no idea how common this kind of fix is on a passenger aircraft.


It is a violation of the brand expectations pilots have for Boeing.

Boeing introduced fly-by-wire after Airbus, and it did it in such a way as to preserve the feel of flying a plane; in particular, if you have a fight with the system you can pull harder and win.

Some pilots have strong feelings about this and are loath to switch from Boeing to Airbus or vice versa.


As a pilot (sadly not a Boeing, or anything close to that side), I think this is a non-trivial point. Every second professional airline pilot you will talk to will talk about that philosophy difference between Airbus and Boeing, and 99.999% of them will take Boeing's side.

Sadly, pilots really don't have much of a say in those discussions. They'll fly what's put in front of them, but I can assure you that all feel safer being given a plane not trying to actively fight them.


Are you saying we now have pilots trained for commercial aviation who are flying planes designed more like fighter jets?

Eeek.


> We can't avoid the reality that had the necessity to skirt regs not been there, Boeing would have definitely come up with a different solution from an engineering perspective.

"Necessity?"


Bad choice of words. I meant to imply that someone in upper management made this a requirement, and then it was up to the engineers to figure out how to accomplish that, contrasting it to a view that this was a pure engineering failure. If this were completely an engineering consideration, I'm sure they would have come up with a different solution. The (artificial) political constraint is what led to this particular design.


This also suggests a failure by the FAA. Not to deflect any blame from Boeing, but regulators have to assume that companies will bend the rules as much as they can, if it’s to their benefit. Perhaps the rules on what requires a new type rating are too lenient.


On the subject of why Boeing didn't think to tell pilots about the MCAS:

> "Since it operates in situations where the aircraft is under relatively high g load and near stall, a pilot should never see the operation of MCAS"

This seems to me to indicate a much deeper issue with the aviation industry than the 737 MAX issue. Aren't pilots, as part of their simulator time, testing the limits of the flight envelope of the aircraft they're flying? If so, surely they'd be expected to encounter the MCAS system.


I think Boeing claimed that no retraining of existing 737 pilots was necessary for the new model.

This attitude has led to crashes in the past as well.


If airlines would need to invest in a lot of training for existing 737-non-MAX pilots, that would probably have made the plane a lot less attractive. Being "a regular 737" is part of the appeal of the 737 MAX.


Going back a few generations, the earlier 737s were plagued by crashes as well, relating to rudder issues:

https://en.wikipedia.org/wiki/Boeing_737_rudder_issues


From that article: "leading the manufacturer to suspect and insist that the pilot had responded incorrectly"

After a few years, of course, it was concluded that it wasn't the pilots' fault.

I find it appalling that Boeing is always and every time trying its damnedest to blame the crew, even before the investigation has concluded. This is a pattern, and IMHO should be penalized by law. It's disgusting to blame people who are dead and cannot defend themselves against the allegations when there's no conclusive evidence. I wonder if they apologize to the crews' families when they do that and are later found to be in the wrong. I suspect they couldn't care less though.


Yes, their reaction to crashes is terrible. Just look at what happened to Lauda Air Flight 004, which came up in a thread about this yesterday: https://en.wikipedia.org/wiki/Lauda_Air_Flight_004#Investiga...

"Lauda attempted the flight in the simulator 15 times, and in every instance he was unable to recover. He asked Boeing to issue a statement, but the legal department said it could not be issued because it would take three months to adjust the wording. Lauda asked for a press conference the following day, and told Boeing that if it was possible to recover, he would be willing to fly on a 767 with two pilots and have the thrust reverser deploy in air. Boeing told Lauda that it was not possible, so he asked Boeing to issue a statement saying that it would not be survivable, and Boeing issued it. Lauda then added, "this was the first time in eight months that it had been made clear that the manufacturer [Boeing] was at fault and not the operator of the aeroplane [or Pratt and Whitney]."[18]"


Wow... looks like these investigations take forever. Hope Boeing patches soon.


Just to give a slightly worrying anecdote: I was about to land in Grenoble, the turbulence was quite rocky and it was incredibly windy; you could tell we were coming in too fast, and the pilot decided to abort the landing and take off again with the wheels nearly on the runway.

Suffice it to say, the angle of attack was extreme and the power produced by the engines quite impressive/terrifying. I would say anything that decided to override the express wishes of the pilot in that situation would have been hugely dangerous.

Can someone explain why ever overriding the pilot, rather than just warning them, would be a good thing?


I can't find a source right now, but I remember reading that there had been a few cases where there had been an emergency situation and pilots had ignored stall warnings, resulting in the plane stalling and crashing. This is apparently why Boeing elected to make the system on the 737 MAX 8 override the pilot's inputs.

I think having an emergency system that overrides the pilot is fine, but it is crucial that the pilot can easily and quickly disable such a system when they know the plane is doing the wrong thing.
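A sketch of the kind of cutout I have in mind (purely hypothetical -- the class, names, and threshold are all made up, and this is not how any real trim system works): the automation disengages itself whenever the pilot has been fighting it for more than a moment.

    # Hypothetical cutout logic, not any real avionics: disengage an
    # envelope-protection system when the pilot persistently opposes it.
    OPPOSE_LIMIT_S = 2.0  # made-up threshold: seconds of sustained opposition

    class EnvelopeProtection:
        def __init__(self):
            self.engaged = True
            self.opposed_for = 0.0

        def update(self, dt, system_cmd, pilot_cmd):
            # Signed pitch commands, + = nose up; opposite signs = a fight.
            if self.engaged and system_cmd * pilot_cmd < 0:
                self.opposed_for += dt
                if self.opposed_for > OPPOSE_LIMIT_S:
                    self.engaged = False  # hand control back to the pilot
            else:
                self.opposed_for = 0.0
            return system_cmd if self.engaged else 0.0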


> a few cases where there had been an emergency situation and pilots had ignored stall warnings

I remember that being about pilots becoming over-attuned to visual warnings (red lights) at the expense of audio ones (the "stall warning" callout) during a crisis, often ignoring the audio inadvertently (i.e. not actually hearing it).


You say that «the angle of attack was extreme», but are you sure about that? Bear in mind that AoA is not equal to pitch. As the engines were producing a lot of thrust, the AoA might well have been relatively low once a positive climb was established.
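Back-of-the-envelope, with made-up numbers (and assuming steady flight with no vertical wind, where AoA = pitch attitude minus flight path angle):

    import math

    # Rough relation for steady flight, no vertical wind: alpha = theta - gamma
    pitch_deg = 15.0     # nose-up pitch attitude
    climb_fpm = 3000.0   # rate of climb, ft/min
    tas_kt = 160.0       # true airspeed, knots (1 kt ~ 101.3 ft/min)

    gamma_deg = math.degrees(math.asin(climb_fpm / (tas_kt * 101.3)))
    print(f"flight path angle ~{gamma_deg:.1f} deg, AoA ~{pitch_deg - gamma_deg:.1f} deg")

i.e. a dramatic 15-degree nose-up attitude in a brisk climb can correspond to only ~4 degrees of AoA.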


From the diagrams I've seen online [1], it's difficult to know either way, because the wind direction plays a part in the AoA. It could have been more or less than it felt like, but the pitch was certainly much higher than in normal flight.

[1] https://en.wikipedia.org/wiki/Angle_of_attack


On the MAX, this case ("take off/go around power") seems like exactly the kind of case MCAS is meant to protect against. The risk of the high, forward-mounted engines generating lift and inducing a stall seems like it would be high here.

I've read conflicting information about whether MCAS would activate during TO/GA. But regardless, if you're saying you don't want MCAS to activate, you're also saying you want the pilot in sole control at a moment when the plane is so liable to stall that they invented a system to safeguard against exactly that.

(Not a pilot.)


My local airport (Halifax YHZ) has notoriously bad landing conditions, and around a year ago I was in a 767 that aborted its landing and went back around. It turns out (at least as far as I've been able to tell from Wikipedia etc.) that the entire aborted-landing sequence is activated in one step by the pilot, and the plane does all the complicated things it needs to do (change thrust, trim, flaps, etc.) mostly by itself. How that would work with bad sensors is a separate question -- although presumably, by this point in a flight, you'd know what's up with your sensors one way or another.
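Something like the following, as I understand it -- purely illustrative, with invented names, not any real autoflight API:

    # Purely illustrative go-around sequence; the object and method names
    # are made up, not a real avionics interface.
    def go_around(aircraft):
        aircraft.set_thrust("TO/GA")         # spool up to go-around power
        aircraft.set_pitch_target_deg(15)    # command an initial climb attitude
        aircraft.retract_flaps_one_notch()   # reduce drag step by step
        if aircraft.positive_rate_of_climb():
            aircraft.gear_up()               # clean up once safely climbing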


I agree with what you're saying in principle. However, the MCAS system is apparently designed not to kick in when the flaps are extended. In your situation, since you were descending to land, the flaps would have been extended and MCAS wouldn't have activated. The pilot would have kept the flaps that way until the aircraft reached a certain altitude.


For what it's worth, I'm not sure that's true. I think there is a conjunction involved -- MCAS is activated when [a], or when [flaps are up and b]. I'm not sure whether [a] covers the take-off/go-around case, but it seems like it might, since that case produces high AoA due to the extra lift generated by the engine placement at high load.
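In code form, the structure I mean would be something like this (speculative -- a and b are placeholders, since the exact activation criteria aren't public):

    # Speculative sketch of the activation structure discussed above;
    # a and b are placeholder conditions, not known MCAS criteria.
    def mcas_active(a: bool, flaps_up: bool, b: bool) -> bool:
        return a or (flaps_up and b)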


Two planes full of passengers destroyed because the pilots couldn't get the nose to point upward, and we find a secret computer system that overrides the manual controls and automatically points the nose downward?

Yeah, sounds like a real possibility to me.


Interesting that this website prevents you from copying text.


You can use the option in your browser to view the page source and grab it from there.


Another few good technical articles on MCAS, and other preexisting automatic stabilizer trim adjustments on the 737:

http://www.b737.org.uk/mcas.htm

http://www.b737.org.uk/flightcontrols.htm#Stab_Trim


To me, the biggest contradiction (as the design of the system appears) is that it relies on very fickle measurements to command big movements of the aircraft. So, in the case of Lion Air, you have two contradictory inputs and you're basing an automated decision on them?

Also, it seems aviation hasn't caught up with 20th-century fluid-measurement techniques. The speed of fluid in a pipe can be measured non-invasively. With the big caveat that doing things inside a pipe on the ground is easier, it seems we're still using 19th-century techniques to measure fluid speed and direction. That didn't matter so much when these noisy/unreliable inputs were simply ignored or compensated for by pilots and mechanical devices.

No real-world scenario (at least not a survivable one) can produce the sensor readings that occurred in AF447 (a sudden drop in airspeed) or in the Lion Air case. But the readings are taken more or less at face value (there is some filtering) and acted upon, with no cross-check against contradictory input from other systems -- accelerometers, for example -- and yes, it's not so obvious how to put everything together.
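Even a crude cross-check between the two vanes would have flagged the Lion Air readings -- a sketch, with a made-up threshold (real air-data voting logic is of course far more involved):

    # Crude plausibility check on redundant AoA vanes; threshold made up.
    MAX_DISAGREEMENT_DEG = 5.0

    def aoa_for_automation(left_deg, right_deg):
        """Return an AoA to act on, or None to inhibit AoA-driven automation."""
        if abs(left_deg - right_deg) > MAX_DISAGREEMENT_DEG:
            return None  # sensors disagree (Lion Air logged ~20 deg): don't act
        return (left_deg + right_deg) / 2.0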


The AoA sensor doesn't measure fluid pressure in a pipe; it measures airflow direction with a small mechanical vane that rotates to align itself with the airflow [0]. Thus it has to move, unlike a pitot tube, and if it can't rotate for some reason, it gives faulty data.

[0] https://aviation.stackexchange.com/questions/2317/how-does-a...


Yes, I'm aware of the differences, but what I mean is that you could measure the airspeed along two axes and then calculate the AoA from there.
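i.e. measure the airflow components along two body axes, and the rest is trigonometry:

    import math

    # AoA (and airspeed) from two perpendicular airflow components
    # measured in the aircraft's body axes.
    def aoa_deg(v_forward, v_vertical):
        return math.degrees(math.atan2(v_vertical, v_forward))

    def airspeed(v_forward, v_vertical):
        return math.hypot(v_forward, v_vertical)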


It's already done. Here you go, for only a few hundred dollars:

https://www.dynonavionics.com/aoa-pitot-probes.php


Nice! Finds AoA through vector addition. Now I'm curious why Boeing uses a moving sensor. Seems like having two different kinds would be ideal.



