One of my pet peeves: the first 7 "causes and contributing factors" effectively blame the engine technicians for taking shortcuts. Nowhere does it say anything about the scheduling pressure from management that made the technicians consider shortcuts in the first place. If you have hiring and firing power and you tell someone they need to get the plane back in the air a few days later, they will do things to get the plane back in the air a few days later.
This is considered simply a "risk" by the report, and not a direct cause. As a consequence, the actual actions taken basically all belong to the category of "fix the engineers", "fix the flight crew", or "fix the plane". Not a single "fix the stressful maintenance environment".
It really underscores the importance of using a framework like HFACS [0] in a situation like this, because you end up exhausting all possible sources of problems and ensuring that you reach the correct root cause. It seems abundantly clear that the forcing function was getting the aircraft back into service as fast as possible, not as safely or as carefully as possible. A well-designed organization would have treated any unexpected contingency during the maintenance process as a trigger for a full re-evaluation of the situation. The moment the engine didn't match their understanding, they should have scheduled a replacement plane or cancelled the upcoming flight, and reset expectations about how fast the plane could be back in service.
> There are way too many jobs where management doesn't give the front-line worker an appropriate amount of time and/or resources, and when things inevitably go wrong, the front-line worker is blamed, and the problem is studied as if the only factor was the front-line worker.
In this case, there is no evidence of any untoward pressure on the maintenance crew, and there is not even any evidence that anyone in the chain of command outside of those involved in the task had any inkling that this might develop into a schedule disruption.
On the contrary, the evidence strongly suggests that the maintenance crew did not feel they were under any special pressure: in their minds, they had found the correct solution, one which would allow them to complete the job on time, so any risk of flights being delayed or canceled had been solved.
You do not even have to work in the airline industry to see that unscheduled maintenance risks creating delays, and that delays are less than optimal for everyone involved; no one has to be told it. If that knowledge creates too much stress and conflict of interest to be safe, commercial aviation would simply be untenable.
This reminds me of times in the first couple of years of my career, when some other junior people and I were terrified that we weren't going to get a project done on time. We would be racking our brains for out-of-the-box ways to get it done, whispering to each other, having trouble sleeping, and then we would be amazed that the lead developer would simply go to our manager (while we were still desperately brainstorming) and tell him we wouldn't finish on time. And nothing awful would happen. The lead developer was fine. We were fine. The project finished late. Nobody got fired.
Now as a senior developer it's hard for me to see why I was so terrified. Some managers yell and bluster. (Or, they used to, back in the day.) Some managers shame and insinuate. Some managers obliquely threaten. But they all, no matter how they express their authority, have to face up to the unpredictability of project planning. The question is not whether your manager is nice or mean. The question is: do they really think they're better off firing you and replacing you, whether because they think they can get an upgrade or simply to pin the blame on you for something? After you've been around the block once or twice, you understand that a manager who is chill to your face is ultimately subject to the same motivations as a manager who uses stress as a weapon.
I wonder what it's like for airline mechanics? They are represented by unions. I would think that safety infractions would be the real potential career killers, but that's just me guessing.
"It's not management's fault because engineers should expect to and put up with managers treating them harshly (yelling, shaming) when they are doing the right thing"?
No, the point is that employment security depends on your manager's incentives, your employment protections, and your ability to get a new job if you lose one. This has very little to do with how your manager treats people.
Somebody who has been in the industry for a while and seen people fired and laid off has learned that someone's job is no more or less safe with a friendly, empathetic manager. (Your personal stress levels are another question, and an important one, but complying with a toxic manager doesn't decrease your stress levels, so it isn't leverage to make you do or not do anything besides change jobs.)
That's why I mentioned aviation mechanics being unionized, and I wondered if a worker's history of safety infractions follows them. Those would seem to be very important factors. A safety-critical worker should be able to say, "Look boss, you're telling me if I don't break this rule I could lose this job, but 1) the union has my back, and 2) if I do break this rule I might not get my next job."
That's important, because threatening in a loud and obviously abusive way is not the only way to make someone afraid for their job. You could even say that it's the amateur hour way to manipulate an employee. If management wants to scare people into skimping on safety, they can do it without doing anything you could point at specifically as "abusive" behavior, so you need protections that don't rely on identifying obvious misbehavior by management.
EDIT: To clarify, I didn't say anything about the behavior of the manager in my story, and I don't remember the manager being angry or threatening. Actually I can barely remember his face, and I doubt it had anything to do with him. I think it's something that junior people bring with them, just like they're scared of disappointing teachers, scared of their high school principal, etc., in a way that's hard to understand when you're older.
Also lack of access to documentation: the technician attempted to access the Rolls-Royce EIPC and the SBs through the computer network, but was denied access.
This should be a major point for a maintenance shop. Documentation should always be available, both online and in offline versions.
The article says the service bulletin was actually available on site through a different publication, but the mechanic wasn't aware of this (though he did do the right thing of contacting dedicated support, who didn't advise him correctly):
" The Rolls Royce SBs were also listed in the Trent Illustrated Parts Catalogue, accessible from any computer at the facility, but he was apparently unaware of this, so he instead switched to plan B and called the Air Transat Maintenance Control Center for help."
So many problems involving tech pubs process here. I could almost classify this whole incident as a tech pubs problem, since the aircrew faults also involved the operational checklists and their planning/guidance.
TechPubs teams often have no control over the compliance (Optional, Mandatory, etc) of the SBs - that, at least in my experience, always came from way up high in the chain, sometimes from a VP who wasn't even tangentially involved in MRO. The required procedures - MX or ACRW - also are often coming from a costs concern, rather than criticality. The whole process of figuring out what's critical and what's not - that's a problem still being smacked into, every day.
It's not hard to make a Sanity Check for pubs vs ERP/PDM/CMMIS, but often people involved REALLY DON'T WANT TO SEE THAT. Because it's horrifying.
The commenter who mentioned HFACS is right on the money. Unless you implement something equivalent, you're gonna be chasing your own tail.
Which it is; tech pubs are a huge part of the aftermarket activities of every aerospace manufacturer. That those are not up to date, or not available when or where needed, is a major finding during audits or investigations.
What shortcuts? Spec said no touchy between tubes. Tubes no touchy when aircraft left the hangar? Seems good to me.
These are the kinds of failures you get when you try to proceduralize everything with foolproof procedures, checklists, run-books, processes, etc. The maintenance people didn't even know they were making a judgement call.
Had the technicians sleeved the offending tubes with appropriate chafing gear and secured them to reduce movement, as would be commonplace in more earthly situations, instead of trying to get it "technically in spec" without thinking about the implications of what they were actually doing, this problem may very well have been averted. But maintenance people in aviation aren't really ever exercising those creative problem-solving skills, since it's such a book-driven workflow they're always dealing with, so of course they missed the forest for the trees (not that they would have been allowed to implement a one-off fix like that).
> But maintenance people in aviation aren't really ever exercising those creative problem solving skills
This is the opposite of what happened. The engineers were confronted with a novel and unexpected problem, and instead of following procedure (which they lacked access to) they applied "creative problem solving skills". From TFA:
> Although it was possible to install the post-SB fuel lines and hydraulic pump with a pre-SB hydraulic tube, the tube would rest against one of the fuel lines at a point where it rounded a 90-degree bend close to the pump. Aware that the plane could not be dispatched unless there was clearance between the tubes, the technicians torqued a nut on the end of the hydraulic tube until it rose approximately 0.635 mm off the face of the fuel line.
Torquing a nut so that the hydraulic tube isn't technically rubbing against the fuel line is pretty creative and seems to solve the problem. But it didn't take into account pressure and vibration while in flight, and almost caused the death of 306 people.
I'm amazed they can measure the distance between two real non-fixed components down to a few micrometers (they wrote 0.635mm, implying somewhere between 630um and 640um)
If the gap really was 0.635mm (an odd number itself which happens to be exactly 1/40th of an inch, or "25 thou" in American sizes) and they let it go, that sounds really odd. That's literally the size of a grain of sand. If the concern is rubbing, sure, that implies it will move during flight.
Shim feeler gauges would be readily available in most shops checking wear on moving parts. 0.025 inches (0.635mm) was included in the first high-quality (GearWrench brand) set that I looked up online.
0.635 mm is indeed an odd number. The thickness of a piece of 80gsm paper is 0.065 mm, so maybe what is meant is "about the thickness of ten sheets of paper"...?
Who else thinks that if a tolerance the thickness of a sheet of paper, in a mechanic/maintenance setting, can cause the death of over 300 people, there's a striking lack of fault tolerance in the design?
I wonder if this nut, if not every nut on an ETOPS airliner (or in general), has a specific torque value. If adjusting the torque can open up a gap, presumably doing so could close one as well.
> These are the kinds of failures you get when you try to proceduralize everything with foolproof procedures
> Had the technicians
The report makes it clear that the problem was their lack of access to the relevant details, not that the process was defective. If they had seen the need to replace the hydraulic hose, they would have presumably changed it or flagged the problem to someone higher up.
> The maintenance people didn't even know they were making a judgement call.
It seems like they did. If you replace a part and something is touching, you are not trusted in aviation to try and make some space with a bit of fiddling for exactly this reason. They did make a judgment call. If they had relied on the process, again, they could have queried why the new pump was touching when the old one wasn't.
"Follow the process" is because this is like the tenth example of engineers and maintenance personnel trying to ignore the instruction booklet and nearly getting hundreds of people killed.
"Read and follow the directions exactly" is a rule that is written in blood.
There was a shortcut: "But if they had looked at the text of the SB, they would have realized that they missed a step: they were also supposed to replace the hydraulic tube which attached to the pump." With the old tube, the clearance was insufficient to avoid contact while in use.
> These are the kinds of failures you get when you try to proceduralize everything with foolproof procedures, checklists, run-books, processes, etc.
Aviation safety is utterly dependent on procedures, checklists, run-books and processes. The fact that things sometimes fall through the cracks is no argument against them.
Management cannot be at fault because they are not responsible for safely flying the aircraft or safely maintaining the aircraft. That is the job of the employees, not management.
Now if management had for example knowingly procured inferior replacement parts or hired non-certified personnel to work on the aircraft, they would be at fault.
Interesting that they call Lajes a “military airfield”. On the Azorean island of Terceira, Lajes (TER) is a mixed use airfield that is the commercial landing strip of the island since the 1930s, as well as having a multi-national military base on the other side since the 1940s. There is a picture in the story.
The NTSB report called it a military airbase, and other reports say “Military air traffic controllers guided the aircraft”. Civilian aircraft can use the military side with a permit.
Lajes’s other claim to fame is that Bush and Blair met there just days before the Iraq invasion to hammer out whatever deal they thought needed to be struck.
The tale of the stricken aircraft was fascinating. I’ve always wondered, over the decades as I’ve flown slight doglegs over Earth’s oceans, whether the strategy of routing near remote airfields in case of emergency has ever had a “cost” estimated. How many hours and gallons of fuel have been spent for this safety blanket? The infrastructural insurance spent to prevent lives lost in air travel appears to be much higher than what is spent on road travel. Could it be that the insurance is for the aircraft, not the lives? Hmm.
The island, and the others of the Azores, are a wonderful place to visit by the way.
One of the coolest emergency airfields I can think of is, literally, Uruguay Route 9, along the coast of the country, where a 2-lane highway becomes an enormous runway that you continue driving along. I don't know if there were any famous landings there; I heard it's only used rarely by the Presidential jet.
See where the grass divider ends, and gets replaced by a jersey barrier for 2 km? That part is the runway - it would require some hours of preparation before it would be usable, it's not quite "declare mayday and dump the plane there."
ah, that's very cool. Makes me think of all kinds of NATO-level shit that's strewn around the world waiting to be uncovered ;)
The thing with the road in Uruguay is that as you're driving, there's basically a yellow sign with a picture of an airplane and an arrow pointing up, and you're like "what the fuck did that mean?" And then suddenly you understand exactly what it meant because you're driving on a fucking runway...
The Czech one is more of a Warsaw-Pact-level shit. :-) Here is an (originally secret) documentary video about a MiG-21 exercise on the highway (in Czech, obviously): https://www.youtube.com/watch?v=FDoZGoaG9_w
Sweden had/has roads in the middle of nowhere/forest that have more free space on the sides, and at each end of particularly straight and flat stretches there are oversized parking spots. Supposedly so fighter planes could refuel and rearm with mobile crews.
The ones I've been on weren't signposted at all. You're just on a road, and suddenly the road is completely straight and very wide, and 3km later it's back to normal, and there's nothing visible nearby.
And I'm absolutely sure I wasn't: two different spots, both near military airbases, this one is somewhat longer than the Vyškov one. Personally drove on both of them, many times. Can't find an English source though: https://cs.wikipedia.org/wiki/Leti%C5%A1t%C4%9B_M%C4%9B%C5%9...
Hah! I was driving down this route a few years back when I suddenly found myself on an airstrip along with all the other traffic. It was particularly bizarre given that I’d just driven down from Tacuarembó to Punta del Diablo on largely unpaved roads.
That said, it’s not the first road doubling as a landing strip I’ve found myself on - there’s a stretch of the main road between Donetsk and Mariupol that was, in Soviet times, used as an emergency field for strategic bombers. It’s this relatively quiet provincial highway that suddenly turns ridiculously wide, and goes dead straight and level for what feels like an eternity - 11km looking at Google.
Oh, and RAF Manston used to have a level crossing across a taxiway.
The Panamerican Road in El Salvador has an airfield like this too. It's next to a military base. I've never heard of it being used; maybe it was during the 80s civil war.
There's a YouTube video[1] of the German Armed Forces showing an emergency airfield on a German Autobahn from 1988. It's subtitled, but only in German. Autotranslation is available though.
To cross the Atlantic between NA and Europe you need to be able to reach a diversion airport within 138 minutes, although for almost the entire area 120 minutes suffices.
This is accomplished by having airstrips which can take massive planes like A350s at remote places which normally can’t justify planes of that size, like Newfoundland, Greenland and the Azores.
While these are mainly there for military and temporary emergency use, in 2001 they were used far more - with dozens of planes landing at tiny towns like Gander and St. John’s in Newfoundland, places which really only expect a single large plane to land and refuel, rather than 30 with passengers to accommodate for days.
The maximum time you can be away from a suitable landing spot depends on the ETOPS certification (or 60 minutes if you don't have one). The certification is specific to the type of plane and operator.
A modern 777 at most airlines for example has an ETOPS rating of 180 minutes.
That's 180 minutes of flying with a failed engine. So you have to take into account whether you can maintain altitude on one engine. Most of the time you can't; you'll be flying lower, which means slower speed, so the calculation is 180 minutes at that speed.
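As a rough sketch of how the radius works out (the single-engine speed here is an assumed placeholder, not a figure for any particular aircraft):

```python
# Rough sketch: the certified time limit is flown at the one-engine-inoperative
# speed, which is lower than normal cruise. The 400 kt figure is an assumed
# placeholder, not a real aircraft value.

ETOPS_MINUTES = 180
ONE_ENGINE_TAS_KT = 400   # assumed single-engine true airspeed

diversion_radius_nm = ONE_ENGINE_TAS_KT * ETOPS_MINUTES / 60
print(diversion_radius_nm)  # 1200.0 nm from a suitable diversion airport
```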
Gander's role in 2001 was made into a musical, and it's quite a touching and healing piece of entertainment if you were around back then: https://www.youtube.com/watch?v=WcI1EKE01S0 -- certainly best on stage if you can find it, but also available on Apple TV.
This was a fun rabbit hole to go down, with the ETOPS [1] hint.
I enjoyed the history of ETOPS-138 [0]. ETOPS-120 leaves some small triangles of the Atlantic inaccessible, particularly if one emergency airfield is unavailable. A 15% increase to 138 minutes covers it.
Anecdotally, though, my experience as a commercial passenger feels like airlines fly almost directly over pacific islands rather than within X minutes of diversion. My memory may be anchored in bygone years.
Even then, it didn't take the actual great circle route because Russia.
The Japan to Europe flights are even worse. The great circle route cuts west, straight across Russia [0]. These days, they fly east to the Bering Strait then more or less directly over the north pole [1] (though probably easier to see on great-circle: [2])
The direct route from FRA (Frankfurt Airport) to KUTAL (a navigational beacon on the Bering Sea in American airspace) passes within just 51 miles of the North Pole
Back in the Soviet era when USSR airspace was closed to most Western flights, planes couldn't do that route non stop from Europe, so had to stop over in Anchorage for refueling.
> The infrastructural insurance spent to prevent lives lost in air travel appears to be much higher than lives on road travel. Could it be that the insurance is for the aircraft, not the lives?
I would be surprised if the cost of equipment, pilot training and whatever else was not factored in.
But the major driver might be that you have to pay for more expensive insurance to get regular people to strap themselves into something going many miles per hour many feet above the ground. Not because it's inherently less safe (though I would argue it is) but because the prospect is scarier!
Sounds similar to Gibraltar airport, which is both a civilian airstrip, as well as a military one (the military base is opposite the passenger terminal).
If you haven't looked into Gibraltar airport before, it's a fascinating thing - the land it's on is reclaimed using rock dug out from the siege tunnels, and there's a level crossing on it, as the only route into the town crosses the runway; if a plane is coming in to land, or taking off, you have to wait at the barriers. It's also the airport closest to the city centre it serves, at 500m.
True, it's kinda special aviation history. Like the Kai Tak approach. But I think that for security and traffic reasons (eg foreign debris) it would have had to change eventually.
Lake Shore Drive in Chicago has also been used as an emergency landing runway, but by smaller general aviation aircraft. At least one of the planes ended up flying under a pedestrian bridge over Lake Shore Drive during the landing.
I really love the straightforward but gripping style of writing here. I read every word! Which is rare for anything submitted to HN (or elsewhere for that matter).
I've been reading Admiral Cloudberg's write-ups for a few years since discovering them over on Reddit's r/CatastrophicFailure,[0] they're all truly excellent. I really enjoy his clear, no nonsense writing style.
What us failure analysis nerds need, what we deserve, is a 90s-era Discovery Channel show written by Cloudberg. Either that or a United States Chemical Safety Board[1] style series.
Not GP, but pretty sure they’re referring to Cautionary Tales with Tim Harford (the “Undercover Economist”), which is gripping, informative, and entertaining.
I just discovered Cloudberg a few months ago, on HN. I agree the writeups are amazing.
What I enjoy most is that frequently, the crash appears to have a clear “culprit” early on, then Cloudberg points out 5-7 other factors, any one of which might have prevented the accident. If the series had a theme, I think it would be that airplane crashes are less about human failings than systemic ones.
That kind of thinking is useful when examining other system failures; AND it can be important to consider "let's assume this other factor had prevented the disaster, how could we have still identified the underlying 'cause'".
Because if your ass is being repeatedly saved by the equivalent of a $0.50 zip tie, you probably want to know that.
Thanks, you made me read 2 more. And kept me up late! There is one where some safety feature highly recommended for all planes back in the 90s (I think), to identify and alert which engine failed, won't be added to 737s because, you guessed it: type certification. It is fascinating and a bit scary.
I find a similar pace in the Air Crash Investigations series. Despite the low-budget filming and the actors who are clearly not allowed to speak, it goes into enough detail to be interesting without labouring the point.
I think if you asked the engineers who designed the Airbus in question whether they could create a "leak" warning using just the existing sensor suite, they'd say "sure".
You know the fuel level in each tank, and have a good estimate or actual flow data on engine fuel usage. Simple math says "leak" or "no leak".
If that would work, it is surprising it isn't in the software.
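Something like the minimal sketch below, assuming you trust the tank gauges and the engine flow meters; all names, numbers and the threshold are invented for illustration, not taken from any real avionics system:

```python
# Toy sketch of the cross-check described above: compare the fuel that has
# actually disappeared from the tanks against the fuel the engines report
# having burned. All names, numbers and the 300 kg threshold are invented.

def leak_suspected(initial_fuel_kg, current_fuel_kg, metered_burn_kg,
                   tolerance_kg=300.0):
    """Flag a possible leak if the tanks have lost noticeably more fuel
    than the engine flow meters account for."""
    missing_kg = initial_fuel_kg - current_fuel_kg
    unexplained_kg = missing_kg - metered_burn_kg
    return unexplained_kg > tolerance_kg

# Example: tanks are down 13,200 kg but the engines only account for 11,900 kg.
print(leak_suspected(46_000, 32_800, 11_900))  # True -> warn the crew
```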
This is why a graph of a quantity over time is much superior to just seeing the realtime number. Had they had a graph of fuel remaining, they would have been able to see that the fuel level had started dropping faster.
In monitoring IT systems seeing metrics like this over time is super valuable.
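The same idea as a toy sketch in monitoring terms: flag when the recent rate of change jumps well above the longer-term average. The window size and the 1.5x factor are arbitrary choices, not anything from a real system.

```python
# Toy trend check: flag when the recent burn rate jumps well above the
# longer-term average. Window size and the 1.5x factor are arbitrary.

def burn_rate_anomaly(fuel_samples_kg, recent=5, factor=1.5):
    """fuel_samples_kg: fuel-remaining readings taken at fixed intervals."""
    drops = [a - b for a, b in zip(fuel_samples_kg, fuel_samples_kg[1:])]
    if len(drops) <= recent:
        return False
    baseline = sum(drops[:-recent]) / len(drops[:-recent])
    latest = sum(drops[-recent:]) / recent
    return latest > factor * baseline

samples = [46_000 - 90 * i for i in range(30)]            # steady burn
samples += [samples[-1] - 250 * i for i in range(1, 10)]  # leak develops
print(burn_rate_anomaly(samples))  # True
```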
It's not very surprising. The computers are relatively basic by desktop standards and the time taken to develop a system securely means that many "obvious" things as seen from the armchair don't exist irl.
Also, like most engineers, you don't necessarily add systems for every conceivable thing that could go wrong. If you think about how many times this has happened against the millions of plane-miles travelled, you can see why it wouldn't necessarily have been high on the list.
Make it broader - before take-off, pilots enter the expected destination (or expected distance). During cruising, computers keep track of how much fuel is being used (lost) per unit of time and project forward to see whether they can still make their expected destination or not.
If the projection does not land within expected margins, then show an error.
This protects against leaks, but also miscalculation of fuel needed by the pilots, misfueling, or efficiency loss somehow.
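A minimal sketch of that projection, with placeholder numbers rather than real flight-planning values:

```python
# Hedged sketch of the "can we still make it?" projection. Burn rate, speed
# and reserve are placeholder numbers, not real flight-planning values.

def destination_reachable(fuel_kg, burn_kg_per_hr, ground_speed_kt,
                          distance_remaining_nm, reserve_kg=2_500):
    hours_needed = distance_remaining_nm / ground_speed_kt
    fuel_needed_kg = hours_needed * burn_kg_per_hr + reserve_kg
    return fuel_kg >= fuel_needed_kg

# Recomputed every few minutes; a sudden flip to False after hours of True
# is exactly the kind of anomaly worth a loud alert.
print(destination_reachable(30_000, 5_600, 480, 1_900))  # True
```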
Garmin G1000 units (popular in small planes) show this, as a circle around the current position, including fuel reserve. However, it depends on an accurate fuel level, which is mostly dead-reckoned from integrating fuel consumption. Measuring the actual fuel level in wing tanks is imprecise and the sensors sometimes get stuck and give bogus readings.
Perhaps the big innovation needed is accurate fuel quantity measurement.
I think a plane can estimate weight loss from its own airspeed and angle of attack. It won't give you an exact weight measurement, because the weight at takeoff is an approximation, but it might give an estimate of weight loss. Depends on the precision of the AoA sensor, though.
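Roughly, since lift approximately equals weight in steady level flight, you could back the weight out of dynamic pressure, wing area and a lift coefficient derived from angle of attack. A toy sketch; the density, wing area, lift-curve slope and zero-lift angle below are invented placeholders, not values for any real aircraft:

```python
import math

# Back-of-the-envelope version of the idea above: in steady level flight,
# lift roughly equals weight, so weight can be estimated from dynamic
# pressure, wing area and a lift coefficient derived from angle of attack.
# All constants below are invented placeholders.

RHO = 0.38                     # air density at cruise altitude, kg/m^3
WING_AREA = 360.0              # wing reference area, m^2
CL_ALPHA = 5.5                 # lift-curve slope, per radian
ALPHA_ZERO_LIFT = math.radians(-2.0)

def estimated_weight_kg(true_airspeed_ms, alpha_rad):
    cl = CL_ALPHA * (alpha_rad - ALPHA_ZERO_LIFT)
    lift_n = 0.5 * RHO * true_airspeed_ms ** 2 * WING_AREA * cl
    return lift_n / 9.81

# Track this estimate over time; an unexplained drop hints at fuel loss.
print(round(estimated_weight_kg(250.0, math.radians(2.5))))  # ~188,000 kg
```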
Indeed, but that system will report nothing unusual when there's a leak directly from the tank, or in the pipe between tank and flow sensor (which I think is usually near the engine, downstream of a fuel pump) or if the tank wasn't filled to spec.
Any sensor has a failure rate. If the probability that a sensor has failed isn't dramatically lower than the probability of a leak, then the pilots will do just what they did in the incident and assume a bad fuel sensor reading rather than a leak.
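To put a number on it (a toy sketch with completely made-up priors, just to illustrate the reasoning):

```python
# Toy Bayes-style illustration with invented priors: if misleading sensor
# readings are far more common than real leaks, a genuine anomalous
# indication still looks, on the odds, like "just the sensor".

p_leak = 0.00005         # prior probability of a real leak (made up)
p_bad_reading = 0.001    # prior probability of a misleading reading (made up)

# Assume either event would produce the same anomalous indication.
p_leak_given_indication = p_leak / (p_leak + p_bad_reading)
print(round(p_leak_given_indication, 3))  # 0.048 -> "probably the sensor"
```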
This already exists and in a much more advanced way. The flight management system calculates the expected fuel on landing based on the route, wind data, different fuel usage at altitudes etc.
It's not as simple as "fuel per time" because fuel usage and speed change massively with altitude.
What was missing here was a big warning to the pilots (that has since been added). But it's also standard procedure at all airlines to monitor fuel and required-fuel calculations, which could have helped this crew had they done it earlier.
> Meanwhile, Airbus and the French Directorate General of Civil Aviation worked together to produce a recommended service bulletin modifying the Flight Warning Computers on A330 and A340 aircraft, allowing them to warn of possible fuel leaks by continuously comparing the planned fuel with the actual fuel on board.
If I recall, one of the duties of the pilots (usually the first officer's) is to keep track of fuel levels and consumption throughout the flight so any anomalies in the engine or fuel system can hopefully be found sooner.
If a human can do it, I would imagine a computer could automate the process. Likely wouldn't remove the need for the pilots to do it anyway, though; redundancy and reliability and all that.
They did it according to the article, but 200kg was seen as insignificant. I bet it is hard: imagine driving 100 miles and then trying to gauge from the fuel used whether there is a leak that lost you 200ml.
It's a common misconception in the general population that the Captain always flies the plane and the FO supports them.
There's always a pilot flying, and pilot monitoring. The roles are usually decided before the flight by the Captain and have nothing to do with who is Captain and who is First Officer.
In my time as an avionics tech on C-130s, it was drilled incessantly into us to never perform maintenance from memory. Always consult each step in the relevant publication and have it with you. We weren't immune; we also had ASORs (Aircraft Safety Occurrence Reports, though nowhere near this level of seriousness) where mistakes were made because the techo did not consult the publication. We had Toughbooks loaded with all publications and STIs (Special Technical Instructions, similar to SBs) that should have always been carried to the aircraft and consulted during maintenance. If the SB in this case was unavailable, maintenance should have ceased until it became available. When you sign off on maintenance, you are signing off on it being completed in accordance with the publications, referencing the specific section of the publication (at least, that's what we did in the Air Force).
On the driver side, my immediate thought was: How did a fuel leak in one engine cause a total loss of fuel? I would hope that there are now checklists that prioritise shutting off the cross-feed valve if a leak is suspected. Weight and balance are subordinate to the risk of running out of fuel and starving both engines.
On a side note, we would have regular "Safety Days" where incidents of ours, and major incidents like these that have happened to other aircraft were presented to the squadron (both aircrew and maintenance) in detail and discussed. Having the potential consequences in your mind at all times really helped fortify the necessity of following the correct procedures during the regular conduct of your work.
This sort of thing is why I am a huge fan of 'Risks digest' and spend a lot of my free time reading there. How things can go wrong is a wealth of important knowledge. Unfortunately a lot of it is very hard-won and written in blood, and seeing how little it can sometimes take to cause a major accident, possibly with loss of life, is a great reminder of your responsibility when designing technology.
I think it’s unreasonable to pin the mistake on the pilots for not running all the checklists. This is where a flight engineer would come into play in my lay opinion. The pilots were tactically focused; they were trying to keep the plane in the air. You need someone on the flight deck who is looking at the bigger picture, doing the fuel calculations (even if the automated systems aren’t warning about certain fuel-related variables), running the checklists, etc. Perhaps a flight engineer role (perhaps filled by a spare pilot) should be required for all ETOPS flights?
It is an objective fact that if they had run the checklists they would have arrived with fuel remaining, and also that they had plenty of time to do so. If we can't say this, how can we learn anything from the investigation?
They did contact the engineering department. Maybe it would have sufficed to have a telemetry uplink back to the engineering department. Should be doable with Starlink etc. soon enough.
For some reason I'm a little obsessed with airline incidents and crashes. Admiral Cloudberg produces outstanding write-ups, and of course the videos of Mentour Pilot on YT are also excellent.
While this is in no way scientific, it seems that what most catastrophes have in common is bad CRM (crew resource management) and lower-than-average airmanship. In most cases, for a catastrophe to occur there must be an initial problem, sometimes very small, that is made worse and worse by bad communication and a series of misunderstandings. And ultimately, in some cases, pilots fail to fly the plane.
For poor airmanship, AF 447 comes to mind (very minor initial incident, plane working perfectly fine, disaster caused by incompetence), as well as a comparable problem on AirAsia 8501 (the captain decided to reboot the computers in flight (!) and the first officer was unable to fly the plane properly without autopilot). Disasters caused by poor CRM include the Tenerife Airport disaster (deadliest accident in aviation history).
But I find incidents that don't result in a catastrophe often more interesting. Incidents and problems are impossible to avoid when using machines as complex as modern airplanes; what's fascinating is how humans, working together and using all of their skill, are able to save the day.
This is the case here. The problem was caused by the maintenance team who didn't use the proper hydraulic tube when replacing an engine, because they didn't have immediate access to the text of the service bulletin that described the procedure, and they thought they could do without. (The wrong hydraulic tube didn't have the proper clearance; when under pressure, it wore away at the fuel line beneath it, until it cracked and leaked.) The actions of the pilots in flight compounded the problem but this was completely excusable given the information and training they had, and the checklists as they were written. Ultimately, disaster was averted by excellent airmanship.
There were other events like that. The most famous one is of course US Airways 1549 ("miracle on the Hudson"), but a less well-known, very similar, and possibly more impressive one, happened in 1988. On TACA Flight 110, the pilot landed the plane on a patch of grass with no engines, with no injuries to the passengers or damage to the plane.
That pilot, Carlos Dardano, had only one eye, having been shot in the face a few years prior while landing in El Salvador, caught in a crossfire. He had started flying when he was a toddler, on his father's knees. At the time of the incident he was 29, but he had already amassed 13,410 flight hours, with almost 11,000 of these as pilot in command.
Another amazing story is the Air Astana flight 1388, in 2018. A heavy maintenance operation had resulted in the inversion of the aileron cables, but not the spoilers, making the plane utterly impossible to control. This problem escaped detection until the plane was in flight. The pilots, after calmly discussing how to ditch the plane in the sea to minimize the number of victims on the ground (it was a test flight with only 6 people on board, all pilots or engineers), managed to regain a semblance of control and landed safely, after 90 minutes of the wildest roller coaster imaginable. It looks like Kazakhs can fly.
It seems the modern world doesn't hold skills in high esteem (what used to be called tradecraft before the word became associated with espionage); the MBA culture essentially regards workers as interchangeable; and of course we have all these machines!
But the opposite is true. What will save your life is not the machine, it's the experience, professionalism and competence of the people using the machine.
Some of the most upsetting disasters (in the sense of how avoidable they should have been) have been caused by failures of the human-machine interface, particularly with automation.
For example https://en.wikipedia.org/wiki/Emirates_Flight_521 - a go around was commanded by the pilots to abort a landing, but they weren't aware that the automated go around system - specifically the auto-throttles - would not engage if the wheels had touched the ground (if only briefly). Thus engine power remained at idle/landing speed, and the plane fell out of the sky.
The pilots should have checked the throttles of course, but they would not necessarily have been aware that the wheels had touched down, may not have known/remembered that the automatic go around system would not function in that case, and they were concentrating on flying the plane in presumably unusual circumstances that led to the go around in the first place - they assumed the automation would function as it usually does.
A better human-machine interface (e.g. with a warning that go around automation was disabled) would have prevented the crash. I think anyone designing human-machine interfaces, especially safety critical systems, should read up on these cases.
Another very famous incident is the Gimli Glider which was also a result -- as almost all of these are -- of quite a string of issues, including, astonishingly, the mixup of imperial and metric units. The report on the accident found fault with Air Canada procedures, training, and manuals. Everything else was, I guess, peachy :P
This story is more ambiguous. The plane should never have left the ground and the captain committed a serious error when he decided to depart with no fuel gauge working. According to airline procedures at the time, no plane could be dispatched without working fuel gauges. That wasn't up for debate, it was a hard rule and it was disregarded.
Then, contrary to legend, nobody really mixed up imperial and metric units, but they did the math wrong (twice!) when converting from one to the other. They weren't surprised when they found that 8,000 liters of fuel would weigh 14,000 kg. One liter of water weighs 1 kg; fuel is less dense than water; without knowing anything else, one should question a calculation that results in the weight of 8,000 liters of fuel being more than 8 metric tons.
So yes, ultimately, the plane landed safely thanks to outstanding airmanship; but it was improper following of guidelines, and math incompetence, that caused the problem in the first place.
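The sanity check spelled out, assuming a jet fuel density of roughly 0.8 kg per litre (a bit less than water's 1 kg/L):

```python
# The sanity check from the comment above, with an approximate fuel density.

litres = 8_000
fuel_density_kg_per_l = 0.8   # roughly; jet fuel is lighter than water

print(litres * fuel_density_kg_per_l)  # ~6,400 kg
print(litres * 1.0)                    # 8,000 kg even if it were water
# So a figure of ~14,000 kg should immediately look wrong.
```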
In both cases, I would pass a similar subjective judgment. Both pilots made a series of mistakes, but such mistakes are made every day, most often without consequence. However, both ultimately saved their plane when it mattered, thanks to exceptional flying skills. I have nothing but respect for them.
Okay, and agreed, but in the case of the Gimli Glider the captain disregarded procedures. That's not just an error, it's a fault (he was suspended for a while because of that, and rightly IMHO).
> most catastrophes have...lower-than-average airmanship
I wonder how many (significantly) lower-than-average airmanship days happen every day without incident. I suspect it’s a lot, and the overall system catches/corrects many and is resilient to the effects of many others.
It's a very good question. But the argument isn't against machines; it's in favor of excellent airmanship. The slippery slope is when good machines appear to make good craftsmanship redundant. They don't.
I consider the lack of respect for experience and skill one of the biggest failures of western culture. It also applies to things like mathematics where nowadays some people are even proud about their lack of skills. But the end result is that most people are amateurs in their own job.
I remember for a while they were trying to make Robert Piché into a scapegoat. What I see here is:
1) Design error. OK, I am a total nobody in this area, but still, fuel leak detection not being included seems like a major design fault. Something that measures the difference between what the fuel injectors deliver into the engines and what leaves the tanks seems quite possible, I think.
2) How many times have we read about management pushing for an event to happen / be on time no matter what, and the disaster that followed? Challenger, Columbia, etc. etc.
3) Service procedures, manuals etc. I guess those are combinations of (1) and (2)
4) Errors by service and flight crews. I can imagine that the person who bent the line had enough reason to raise concern. The rest can't really be blamed.
Against my own inclination to never watch YouTube, I've become a bit obsessed with this one fellow who does excellent reconstructions and analyses of aviation incidents. He covers this one here:
What's good about Mentour Pilot is he reads and interprets the NTSB reports and the reports from other relevant agencies, and also adds what his own experience as a 737 pilot tells him.
I've often found that this means his accounts conflict in various ways, regarding either what happened or what caused it, with what is found in so many other reconstructions, many of which seem to go only about as deep as a Wikipedia article and not a whole lot further.
Seems a bit weird to say "never watch youtube". It's like never reading a book anymore because someone gave you a bad one. It's up to you to filter out content you dislike, just like with everything else.
"Books" don't get a cut of all book sales, whereas YouTube gets a cut of all video profits. "Books" don't have editorial control over what goes in them, YouTube does. It's clearly closer to a publisher than a medium. Maybe you meant platform instead of medium?
Not the OP but personally I'm the same. It's Google I don't like and also the video format, I have no patience for watching videos. Written content is much easier for me to absorb quickly and efficiently. Videos are always so tailored to the slowest viewer (especially training videos, I hate those!!)
There are only a few YouTubers I have time for, like ones that don't hide complexity, such as Dave Jones from EEVBlog.
Reading material quickly remains more comfortable than watching a sped-up video. There are many cases in which written material is a better way to transfer information than video.
A case in point would be this article, as it is clearly written, balanced and very digestible.
Nothing weird about it. I don’t watch YouTube except in rare cases, and when I have to, I download the videos. I can't stand getting interrupted by ads while watching, so I opted out of YouTube completely.
When I'm signed in, it's the same, except that it also sometimes suggests, somewhere in the list next to the video that's playing, a single video that I watched a while ago.
For me, YouTube hasn't been able to suggest properly for years; ages ago, suggestions were all clearly related to the current video in some way. The old behaviour led me to spend lots of time on YouTube.
When "Shorts" started appearing on the main page, it somehow seemed like I couldn't get rid of it. I somehow don't want to see it ever, on any computer so it being default is as evil to me as blinking ads.
The addition of forced-autoplay-unless-you-sign-in[1] is definitely a case of Making It Suck[2].
So, I increasingly use other clients, or exclude YouTube results from searches. This pretty well gets me away from clickbait rubbish like "$profession doesn't want you to know this!!", grossly exaggerated faces/postures, and Shorts.
[1] Well… actually I guess it's "default to autoplay whether signed in or not, and keep resetting it if you're not signed in".
[2] Meaning, moving people towards another behaviour by making something else more difficult.
Maybe you aren't signed in enough for it to have learned your preferences? You can also click the "..." button on a video's thumbnail and select "not interested" or "don't recommend this channel" on content you don't like, which in my experience is very effective at improving future recommendations.
I also get annoyed by the Shorts section (they're almost pure clickbait/teasers, addictive but no real content, waste of my time!). But I can just click the "X" and it disappears for 30 days. If they didn't have the "close for 30 days" option, I agree, this would really suck.
I'm left wondering if the feed data from planes is broadcast to pilots on the ground to figure out possible problems. Like the actual instrumentation data; as the article says, there was an oversight on their part.
Airbus advertises a product called Skywise [0] now, which appears to offer realtime data streaming back home. In a documentary on Qantas Flight 32, they mention that Qantas had realtime data access to ECAM at least. I believe this is more standard now, but obviously in 2001 it would've been a different ballgame.
> Investigators felt that quality control experts, had they been present, might have been more skeptical, but none were on site, because Air Transat’s quality control personnel only worked Monday to Friday.
Great read. Interesting parallels to software engineering failures. Time, pressure, quality control, operator error, documentation, training... a human challenge.
Me too, despite knowing the rule “aviate, navigate, communicate”, I was hoping that the captain was explaining what was going on to the passengers but then I remembered: no power, no PA