A picture is worth a thousand words:
The institutional response to managing the risk of complex systems is often to introduce layers of approval and process that appear to be dealing with a problem ("this system can't fail, therefore please check with everyone involved in the system before making a change") but really have little real-world value except driving everyone crazy and incurring enormous cost. They also shield individuals from real responsibility (how carefully do you review something you are the only approver on? What if there are 5 approvers? What about 15?).
A better answer is to find ways to conduct real-world tests with subsets of your system where you can roll back bad consequences. As far as I can tell, this is the approach of Google/Facebook and other newer tech companies, which push small changes to subsets of customers for testing.
Legacy enterprise companies are woefully ill-equipped to do this for the most part, both technically and culturally.
Some industries have regulations that make this approach difficult (financial services) or human consequences of failure that make the cost of experimentation too high (medicine, airlines, etc.).
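To make the "small changes to subsets of customers, with rollback" idea above concrete, here is a minimal sketch in Python. It is not how Google or Facebook actually do it; the names (`in_rollout`, `StagedRollout`, the `new-checkout` feature) and the thresholds are all made up for illustration. The assumption is that each user is hashed into a stable bucket, the new code path is served only to a chosen percentage of buckets, and the rollout switches itself off if the observed error rate gets too high.

```python
import hashlib


def in_rollout(user_id: str, feature: str, percent: float) -> bool:
    """Deterministically decide whether a user falls inside the rollout.

    Hashing feature + user id gives each user a stable bucket in 0..9999,
    so the same user always sees the same variant for a given percentage.
    """
    digest = hashlib.sha256(f"{feature}:{user_id}".encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) % 10_000
    return bucket < percent * 100  # percent=5.0 -> buckets 0..499


class StagedRollout:
    """Hypothetical helper: serve a new code path to a subset of users and
    roll back automatically when it fails too often."""

    def __init__(self, feature, percent, max_error_rate=0.05, min_requests=100):
        self.feature = feature
        self.percent = percent              # share of users on the new path
        self.max_error_rate = max_error_rate
        self.min_requests = min_requests    # don't judge on too little data
        self.requests = 0
        self.errors = 0

    def _record(self, ok: bool) -> None:
        self.requests += 1
        if not ok:
            self.errors += 1
        too_many_errors = self.errors / self.requests > self.max_error_rate
        if self.requests >= self.min_requests and too_many_errors:
            self.percent = 0.0  # roll back: everyone returns to the old path

    def handle(self, user_id, new_path, old_path):
        if in_rollout(user_id, self.feature, self.percent):
            try:
                result = new_path()
            except Exception:
                self._record(ok=False)
                return old_path()  # the user still gets a working response
            self._record(ok=True)
            return result
        return old_path()


if __name__ == "__main__":
    rollout = StagedRollout("new-checkout", percent=5.0)
    print(rollout.handle("user-42", lambda: "new", lambda: "old"))
```

The point of the sketch is that the blast radius of a bad change is bounded by `percent`, and the rollback is a cheap state change rather than a meeting with 15 approvers.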
Having worked in heavily bureaucratic organisations, I was reading the PDF as an allegory of the blame game and the safety-first, heavily bureaucratic, process-driven culture that big orgs have, which does nothing to actually reduce risk - rather, it stifles people from being able to, and wanting to, change anything.
Basically I came away with the opposite to what you said.
So much this. I coach and train my clients (Fortune 500) in extreme programming (unit testing, TDD, CI, CD, etc). The duality these developers live with and management's obliviousness to their own detriment create a toxic and anxiety-inducing work environment. It's very upsetting to me how these developers live with the manager/stakeholder constantly breathing down their necks to cut corners and "get it done" while holding them accountable for any mistakes.
It's not that other areas of business aren't like this, it's that management creates an environment where mistakes are more likely (and in some cases almost certain) to happen, and then penalizes people when they happen.
An example: Bosses refuse to provide funding for materials and time to automate the test framework (embedded systems). So testing is done mostly manually, which consumes a great deal of time, or tests don't get conducted due to the lack of time or capability (I can't flip a switch 10 times in a second, or at a particular and precise time). So either we don't have enough time to take the test feedback and correct the system, or we never get the test feedback (because some tests aren't done) in order to correct the system. Errors are virtually guaranteed to slip into production if you're operating on either short schedules or complex systems under these circumstances.
Management expects perfect results, but ties the engineers' hands so much that they aren't able to execute effectively, and then blames (and often dismisses) the engineers as a result.
It's not just cutting corners in the actual implementation, but also the metrics and management infrastructure for post-release operations.
I'd say airlines are a sweet spot because we are probably near a technological local optimum. We are not constantly evolving new paradigms for the industry, rather just incrementally improving the old one.
If post-mortem culture in the industry results in "fighting the last war", then that probably does little harm; at worst it adds incremental cost. In the meantime there is still incremental technological improvement giving us more headroom in the cost vs. safety trade-off.
The ancient ones, the humans who lived before my birth - those who launched rockets to the moon and detonated city-destroying bombs with atomic power, feats no country has matched in these declining years of civilisation - they also had faster-than-sound passenger aircraft.
Although it might be the tendency to exaggerate stories with time, rose-tinted glasses and the myth of the noble savage.
It's the difference between a prototype and production. The Saturn V rocket and the Concorde were both prototype-grade vehicles. The marvel of them is that they worked at all. Now that we've proved the point, we won't put either "into production" unless it's reliable enough to be boring.
The air safety human factors culture has moved away from what they criticised 30 years ago. They are now fully on Safety II.
I recommend Steven Shorrock's recent work at https://humanisticsystems.com
Another note: I wondered what the root cause of the financial meltdown was for a number of years, but looking at it from this point of view, it's obvious that a number of things have to go wrong simultaneously. What is not obvious beforehand is which failed elements, broken processes, and bypassed limits will lead to catastrophe.
For your own business/life, think about things that you live with that you know are not in a good place. Add one more problem and who knows what gives.
This is not intended to scare or depress, but maybe to encourage some compassion when you hear about someone else's failure.