Normalization of Deviance (2015) (danluu.com)
180 points by pabs3 on Jan 25, 2020 | 43 comments

It's not just growth-oriented new companies. The old ACME BigCorps have exactly the same problems, not because everyone is fixated on growth and disregards best practices, but because inertia has crept in everywhere. Because everything is executed top-down and there's no back-channel. Because people care about their yearly bonuses and would rather not bring up uncomfortable truths to their superiors.

And most of all, because there are zero techies among managers. There's a complete mental disconnect between the tech-savvy among the grunts and even their direct managers, let alone more senior management.

Those companies then pay big money for external consultants to tell them how to fix issues; the consultants tell them the obvious solutions from their books (which may or may not overlap with what the grunts would have told them), and then those solutions are pushed down in the next restructuring.

Where I work, senior management has just discovered DevOps & Agile, and now everyone has been told to do DevOps, but without breaking up the top-down communication style. That's going to work just wonderfully.

> senior management has just discovered DevOps & Agile, and now everyone has been told to do DevOps, but without breaking up the top-down communication style

I'd argue that they haven't discovered Agile, although they might think they have. The whole point is to disrupt these traditional ways of working.

The theme across agile implementations is control over what's taken on and accountability for what gets done at very high granularity. The whole thing about breaking up complex stories into subtasks amounts to "do not abstract away any details from me, I want to know everything."

It's hard to imagine this being a less top down communication style. What are you comparing it to?

I've worked in companies where breaking things down into subtasks was genuinely used as a tool to prevent the dev team from being overworked.

Same here. The belief is that just using CI/CD tools will make us efficient. Nothing against CI/CD, but the organization has much larger problems.

The WTF moments come when you don’t yet have credibility. Would you want to hear a stranger with an empty commit history use the wrong keywords to tell you how to do your job?

Start your campaign to change a place after demonstrating ability to work within it.

That's better than nothing! They might not have any previous history and might get a term or two wrong, but their perspective can be useful anyway. You should not dismiss ideas based on what that person has previously said (or if they haven't said anything before) but rather the content of the message itself. This of course becomes hard to scale once you have a large team trying to collaborate, and the feeling of juniors wasting people's time with their easy questions comes up.

Another thing is that when someone does not have a history with a particular subject, they enter it with a fresh mind, something you cannot have if you've worked on something for a long time. In that case it's also useful to take what they say into consideration, as they can see stuff you cannot. Vice versa is also true.

We dismiss the ideas because they contradict our priors. Knowing that someone else holds them adds no new information. Strong objective evidence is rarely available. But seeing that an opinion is correlated with superior skill will tend to make people take it seriously. (Sometimes too seriously! Why does everybody cargo cult Google?)

For example, several Amazonians have joined my team and been horrified at the lack of central tracking and control over sprint tasks and productivity KPIs. I like my autonomy. I like not taking orders from JIRA. There is a reason I don’t work at Amazon. My team already ships, and well. So I am not excited about their campaigns.

I am not a fan of test coverage as a KPI. I’ve seen lots of high-coverage but meaningless and brittle test suites, full of copy-paste and implementation-coupled mock expectations, do more harm than good. At the same time, I find that covering increasingly obscure “if err != nil { return nil, err }” blocks is a waste of time, so my stats aren’t always high. But I have seen SREs I deeply respect, people who have stopped major train wrecks and held insightful technical conversations, decide to use it in their recommendations. Because of that, I’m grudgingly coming around. That’s so much more convincing than the Sr. Staff and above saying “let’s set a minimum coverage threshold for the org [because KPIs and standards are inherently good].” That just reads as careerist sucking up to management. It's people who haven't touched code in 10 years and don't even know the language we use, claiming their judgement is so good that they don't even need to look at a situation to decide how it should be handled. They are the wrong messengers, even for a right idea.

The top commenter's analogy aptly captures the essence of the issue.

Yeah, but why the difference between chefs and SW engineers? Do we need a software Typhoid Mary for things to change?

I'd argue that without strict regulation and unannounced government audits we will get sick of most market output.

They're entirely different fields. You can't draw an analogy, and any attempt will be awful.

Yeah, this analogy is not that useful after all.

This is one of the best articles on company culture I've read.

> I lost the fight to get people to run a build, let alone run tests, before checking in, so the build is broken multiple times per day. When I mentioned that I thought this was a problem for our productivity, I was told that it's fine because it affects everyone equally.

I laughed so hard. One of my friends works in such a company and I've lost count of how many times I've told him that his accounts of their exploits read like sitcom scenes.

I think an often-overlooked factor in company behavior is that either way it's just money. Companies exist to make profit, and if the screw-ups cost less money than the shortcuts generate, they'll happily keep doing them. In the majority of cases it's not even the manager's money at risk, it's the shareholders'. Do you think any members of Boeing's board of directors went to prison after their shortcuts murdered 346 people? Do you think any of them even took a pay cut?

Fining companies for misbehavior is a fundamentally inadequate method of regulation.

Companies exist to serve their shareholders; their shareholders are the ones choosing profit to be the primary focus.

Many shareholders are only holding these shares for a few years, and could not care less about the long-term prospects of the company.

Most of them don't actively choose much either.

> There's the company whose culture is so odd that, when I sat down to write a post about it, I found that I'd not only written more than for any other single post, but more than all other posts combined

Following the link[0], it sounds like Microsoft?

[0] https://danluu.com/startup-tradeoffs/#fn:N

Fair enough, but "real pros duplicate code indiscriminately" got voted to the top of this website two weeks ago.

Very much doubt it did.

Takes a breach to get things done. I can present tons of evidence, but a breach will always move a vulnerability forward.

But a breach can be a disaster. Conclusion: it may be worth having painless breaches, like pentests.

The thing is, this isn't unique to tech companies. In every industry, people ignore best practices or even regulations, prefer politics over information, familiar over innovative, have fiefdoms and feuds, and basically everything that happens in society at large where there are power vacuums or opportunities.

It's a double-edged sword that is both a source of innovation and stagnation, and it is called being human. We do our best to keep the detrimental effects to a minimum, and have done a pretty decent job of it over the past few hundred years. Not perfect, but better.

"Best practices" is too often (in computer science, at least) a way to mean "Drunken distortions of the manuscripts of Dijkstra."

Basically all of the modern languages in common use are (either; rarely both) clones of his work on ALGOL (like Go, for example) or heavily inspired from his manuscripts.

Dijkstra is easily the most-misinterpreted and cargo-culted figure in computer science, and most people don't even know who he is! "GO TO Statement Considered Harmful" is a great paper, for example, but based on how it's talked about, it seems unlikely that 90% of people who are against the use of GOTO have actually read the paper, or have any logical explanation for why it shouldn't be used.

Dijkstra would strongly consider dropping everything and fleeing to a ranch in Idaho if he saw C++ in common use, for example, but it seems like C++ programmers (sorry for stereotyping, but programming communities do trend in one direction or another) tend to be the first to espouse the values and 'truths' of his writing and opinions.

I may be slightly biased, though, because Dijkstra has been proven wrong about the best way to write programs (by typing your code into a computer; something Dijkstra was strongly against), and because he hated APL so much that he went as far as to diss a Turing Award winner (Alan Perlis) after his death because Perlis was an advocate of the language:

1968 was also the year of the IBM advertisement in Datamation, of a beaming Susie Meyer who had just solved all her programming problems by switching to PL/I. Those were the days we were led to believe that the problems of programming were the problems of the deficiencies of the programming language you were working with.

How did I characterize it? "APL is a mistake, carried through to perfection. It is the language of the future for the programming techniques of the past: it creates a new generation of coding bums." I thought that programmers should not be puzzle-minded, which was one of the criteria on which IBM selected programmers. We would be much better served by clean, systematic minds, with a sense of elegance. And APL, with its one-liners, went in the other direction. I have been exposed to more APL than I'd like because Alan Perlis had an APL period. I think he outgrew it before his death, but for many years APL was "it."

Not to take away from Dijkstra's brilliance but the problem with the narrative above is that Dijkstra wasn't part of the team that developed Algol 60 and that can easily be seen by the fact that his name is not on the Algol 60 Report or on the Revised Algol 60 Report. He did write an early book on Algol 60, he did write the first Algol 60 compiler and he did plenty of work on block structure language design but he didn't design nor was he a major participant in the design of Algol 60 itself.

Don't forget that misguided few paragraphs he wrote on "zero-based indexing" which have haunted us ever since.

>> The rules are stupid and inefficient

My go-to example for this is the radiological accident at the irradiation facility in Nesvizh, Belarus, in 1991. The International Atomic Energy Agency (IAEA) has a report here: [1]. It's fascinating reading.

To summarise, the facility in Nesvizh was a common type of irradiation facility where product, such as food or medical equipment, is sterilised by circulating it around a radiation source by a system of conveyors.

In Nesvizh, the conveyors would often jam and the operators of the facility would have to shut down the irradiation process by storing the source in its safe position inside a dry pit, then enter the irradiation chamber to un-jam the conveyors [2].

The day of the accident, the operator circumvented all the safety procedures and entered the irradiation chamber when the source was in the unsafe position (he thought it was secured). What is really striking is that he would have had to work really hard to circumvent the physical safety mechanisms in place to stop this from happening. The irradiation chamber was built as a spiral maze with thick concrete walls. The entrance to the maze was protected by a retracting section of floor that exposed a "chasm" considered too deep and long for a person to jump over. The way to the center of the chamber, where the source was, was lined with pressure plates and motion sensors that, if tripped, would trigger an alarm that would cause the source to be automatically dropped into its safe position. It was basically like a room with traps in a D&D dungeon.

And yet the operator of the facility managed to circumvent all those safety mechanisms and accidentally kill himself as a result. He died before he could explain how he did it, so for example no one has any idea how he managed to cross the "chasm" in the entrance. But what seems reasonable to assume is that he had done this many times before.

The reason he did it of course was that sticking to the safety procedures meant that the operation of the facility would have to be stopped temporarily - which slowed down production at the facility.


[1] https://www.iaea.org/publications/4712/the-radiological-acci...

[2] It seems this is a common occurrence in this kind of facility: there are two other IAEA reports on radiological accidents that started with a conveyor jam inside an irradiation chamber that caused operators to override safety procedures. Look for the IAEA reports on the radiological accidents in San Salvador, El Salvador in 1989 and in Soreq, Israel, in 1990.

Thank you for sharing this, the report is really fascinating.

Isn't this why product testing, security auditing, and the quality control function that processes customer feedback and complaints have to be in a separate organization that reports directly to the top?

Just like you need a comptroller and auditors for finance.

Fantastic read & well written.

I feel like most engineers fall into this trap: thinking systems are broken because they're miserable in them.

The system is not broken. It is doing what it is designed to do, which is undoubtedly accrue power and wealth to someone who is not you.

"But the same thing that makes it hard for us to reason about when it's safe to deploy makes the rules seem stupid and inefficient!"

This is a humbling illustration that will haunt me the next time I find myself in the same situation.

As far as I can tell, the central thesis is that companies (maybe just tech companies) are screwed up. The sardonic tone, though, makes it almost entirely indigestible to me.

There's an emotional air of "the world is F'd and nobody sees it," that is independent of the central thesis, that makes it hard for me to take the anecdotes at face value. Which is too bad, because I'm sure there are a lot of grains of truth behind the author's experience.

Additionally, I have to observe that the author's experience doesn't align very well with mine. Certainly I've seen bad decisions, bad tech (but never a failure to use source control as he describes), some politics, intense personalities, but by and large I've found the vast majority of people to be acting in good faith, albeit sometimes creatures of habit and hard to win over before you build a track record.

I think one of the main points is that things are f...ed up but there is no way to tell anybody who has the power to change that. I see that in my company a lot. A lot of processes are inefficient but there is nobody you can talk to about improvements. So people just get used to doing silly things.

>> A lot of processes are inefficient but there is nobody you can talk to about improvements.

Well, I don't think there's nobody you can talk to. Let's just take his first 3 examples, as a random sample:

1. Paragraph 2: A company institutes radical freedom, so half the people leave within the first year. -- I don't see exactly what his point is, I'm not convinced this is a problem, and I don't see him offering a solution, and who can't he talk to about it?

2. Paragraph 3: A company that has a crazy paranoia about leakers. -- Sounds messed up, but I'm like 90% sure he could say "I notice some coworkers are unclear about forwarding emails on things like health plans, which as I understand, aren't company secrets. Could we shoot out a general message clarifying which materials ARE and AREN'T protected to alleviate any confusion?"

3. Paragraph 4: There's a company where two managers hate each other too much to be in the same room. -- I'm not sure what his point is, should one of them be fired? If he earns the ear of their manager, he can certainly insinuate that such a personality conflict leads to problems.

Care to explain the downvotes?

Did you make it down to the "solutions" part? One can quibble with style, but it wasn't just a parade of horribles.

He's more gesturing towards building high-performing engineering cultures, or at least avoiding building variations on the illustrative dysfunctional themes.

The post's title references an article on aircraft safety procedures, or rather on it becoming normal over time for a certain group to ignore those procedures.

I don't think the author is (necessarily) saying this is how everything is. He is just describing how a culture of X being broken can propagate.

Just putting this excerpt from the cited paper on "Normalization of Deviance" for people who skim the comments only (I would recommend reading the whole article, this one is for once worth it IMHO):

A catastrophic negligence case that the author participated in as an expert witness involved an anesthesiologist's turning off a ventilator at the request of a surgeon who wanted to take an x-ray of the patient's abdomen (Banja, 2005, pp. 87-101). The ventilator was to be off for only a few seconds, but the anesthesiologist forgot to turn it back on, or thought he turned it back on but had not. The patient was without oxygen for a long enough time to cause her to experience global anoxia, which plunged her into a vegetative state. She never recovered, was disconnected from artificial ventilation 9 days later, and then died 2 days after that. It was later discovered that the anesthesia alarms and monitoring equipment in the operating room had been deliberately programmed to a “suspend indefinite” mode such that the anesthesiologist was not alerted to the ventilator problem. Tragically, the very instrumentality that was in place to prevent such a horror was disabled, possibly because the operating room staff found the constant beeping irritating and annoying.


What these disasters typically reveal is that the factors accounting for them usually had “long incubation periods, typified by rule violations, discrepant events that accumulated unnoticed, and cultural beliefs about hazards that together prevented interventions that might have staved off harmful outcomes”. Furthermore, it is especially striking how multiple rule violations and lapses can coalesce so as to enable a disaster's occurrence.

I’ve always taken the tone to be “the world is as f’d as we choose to make it”. I’m an optimist?

“An optimist is someone who believes we live in the best possible world. A pessimist is someone who fears this is true.”

I forgot the source of that quote.

The Silver Stallion -- by James Branch Cabell

"The optimist proclaims that we live in the best of all possible worlds; and the pessimist fears this is true. So I elect for neither label."
