Why are we so bad at software engineering? (bitlog.com)
379 points by signa11 16 days ago | 292 comments

Those aren't the only reasons. Some more:

1. Software is easy to fix. Compared to other construction disciplines, software is the cheapest to fix.

2. The consequences of failure are typically minimal. Most software is not mission critical. Failure means minor inconvenience for a lot of people. Moreover, since your competitors are no more reliable than you are, you don't even stand to lose customers by having an occasional failure. (This is the one the article discusses.)

3. Software has more degrees of freedom. Unlike physical building disciplines where the types of materials are limited and generally well known, and where the number of physical dimensions is at most three, software takes place in an extremely complex and multidimensional operation space. Moreover, we have no evolved intuition about how things behave in that space.

4. Comparatively minor errors result in comparatively major failures. If you forget one rivet on a building, the engineering tolerances will make it such that the building will not fail as a result. But it is not really possible to make a server resilient to null pointer exceptions. It either works or it doesn't. Software fails partially much less often than physical things.

One of my favorite quotes from my old data structures professor in college was this: "software is non-linear."

If you're laying down the bricks for a house, there really aren't any cheap tricks or shortcuts. Building two houses takes twice as many bricks as building a single house, and so on. We could say that brick usage scales linearly with house construction. This is a bit of a bummer for bricklaying enthusiasts, but on the other hand, a single brick out of place is probably not going to bring your roof crashing down.

Software is different. Often, the distinction between processing 100 files and 1,000 files is minimal at best. This ability to scale effortlessly is often the best thing about software, but the same non-linearity is also the price we pay for power. A single subtle programming error can cause millions of dollars' worth of damage and bring massive systems to a grinding halt.
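That scaling property can be sketched in a few lines (a toy `total_line_count` helper invented for illustration): the code is identical whether it handles 100 inputs or 1,000, which is exactly the non-linearity being described.

```python
# Toy illustration of software's non-linear scaling: the same function
# handles any number of inputs with zero extra effort -- unlike bricks,
# where twice the house costs twice the bricks.

def total_line_count(files):
    """Sum the line counts of a collection of (name, line_count) pairs."""
    return sum(lines for _name, lines in files)

small_batch = [(f"file{i}.txt", 10) for i in range(100)]
large_batch = [(f"file{i}.txt", 10) for i in range(1000)]

print(total_line_count(small_batch))  # 1000
print(total_line_count(large_batch))  # 10000
```

Of course, the flip side also holds: one subtle bug in that single function silently corrupts the result at every scale.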

Overall, I think this is what makes software interesting and exciting to work with, but I can understand how it can pose a steep challenge once things like reliability and security become major concerns. The way I see it, much of research and development into better programming languages and tools has been to see how we can systematically mitigate these kinds of propagating errors and make our systems more robust to mistakes.

Yes, but aren't we past this "software development is a production process" misunderstanding by now? We have solved the production problem: I can churn out thousands of identical binaries per day through my build system. The production side of things is solved. Other disciplines look at us enviously for it.

That's not where the troubles come in. The problems we face are generally those of design.

Our production is done by absolute, complete idiots - computer programs. Compilers can't think and can't fix design mistakes for us.

Contrast that with high-complexity engineering projects, e.g. fighter planes or Apollo project, where it usually turns out the official design specs tell at best half of the story, and most of the knowledge was contained in the heads of the builders, encoded in the shape and adjustments to tooling they used to build those vehicles. Which of course creates a problem today, now that both the tooling and the builders are long gone - but it demonstrates clearly how a build process made of people can fix a lot of errors in the design.

Insightful perspective! Thanks.

Are you sure? Pretty much every book and methodology over the past two decades has focused on how software engineering needs to "grow up" and be more like other production industries. A cornerstone in the latest fad, "new agile" with books like the Phoenix Project, is how the software delivery process should learn from lean production industries.

Yes, we've seen this before. It was called the "software factory" back in the 1980's and 1990's. It was snake oil then and it's snake oil now.

This is very popular with managers and funders who desperately want the predictability. Those pressures aren't going to go away. Doesn't mean it can be made to fit in that box, though.

> The production side of things is solved. Other disciplines look at us enviously for it.

Actually, software's production process is a mess that's designed to mitigate the colossal mess that is the typical engineering culture observed in software development. I mean, the only reason a procedure had to be devised to deliver any fix automatically and immediately to a software system is that there was a need to repeatedly and systematically fix the product being delivered to production. That shit doesn't fly in engineering fields.

> If you're laying down the bricks for a house, there really aren't any cheap tricks or shortcuts.

Oh, there are plenty of them. Substandard bricks, substandard mortar, inadequate foundation prep, walls not being straight and plumb, etc.

It's just that brick construction is easy to inspect, and it's well known how to do it right.

Right, sure. If I had to guess, though, bricklaying is still O(n) no matter how many corners you cut.

Not sure; more and more buildings are built using prefabricated walls, mainly in China. https://www.youtube.com/watch?v=3Sh7hghljuQ

It's more that bricks have to follow the laws of physics. A building made with bricks that don't meet the required quality will collapse, and the builders will be found guilty.

Software doesn't fail so dramatically; small annoyances, for example, are easily dismissed, even if there are many of them.


The collapse may be after some years. I can't find any evidence that the builders have even had to pay for remediation, let alone compensation.

The Grenfell fire has an even more egregious problem in that the builders are being granted immunity from prosecution before testifying.

Small annoyances such as water leaking in and causing mold, or windows losing too much heat to the outside?

Or forgetting to put in the weep holes so water accumulates in the walls :-/

Come on man, how can you be so stupid as to forget something like that?

This stuff is easy, just look over the code- I mean blueprints before commi- I mean building and you'll be fine. Everyone knows you're supposed to put the weep holes in, so make sure to do that.

The most serious and expensive software quality problems tend to result from requirements analysis failures. Better programming languages and tools can't do much to mitigate those mistakes.

Better programming languages can make it easy to write down requirements in a formal way without making mistakes.
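One hedged sketch of what a "formal, checkable requirement" can look like in practice is an executable property (the `satisfies_sort_requirement` name is invented for this example):

```python
from collections import Counter

# A requirement written as an executable property: "sorting must preserve
# the multiset of elements and produce a non-decreasing sequence."
# Caveat: if the stated requirement is itself wrong, the machine will
# faithfully enforce the wrong thing.

def satisfies_sort_requirement(sort_fn, xs):
    out = sort_fn(list(xs))
    same_elements = Counter(out) == Counter(xs)
    non_decreasing = all(a <= b for a, b in zip(out, out[1:]))
    return same_elements and non_decreasing

print(satisfies_sort_requirement(sorted, [3, 1, 2]))     # True
print(satisfies_sort_requirement(lambda xs: xs, [3, 1]))  # False: identity fails the spec
```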

My point is that it doesn't really help if the requirements are wrong to begin with.

> Comparatively minor errors result in comparatively major failures. If you forget one rivet on a building, the engineering tolerances will make it such that the building will not fail as a result.

... but a systemic design error can cause it to fail, and this happens a lot when new techniques are developed and deployed. Change implies the increased possibility of failure.

I don't know why people keep holding up the construction industry as the paradigm here. Large civil engineering projects are notorious for cost overruns. The industry used to routinely get people killed during construction; only really post WW2 in the West has this been solved. It's still possible to have big post-construction disasters. One of the big political issues in the UK at the moment is the deployment of flammable cladding on multistorey buildings - this turns a fire in a single unit into potentially the loss of the whole building and many of its occupants.

You've all seen the Tacoma Narrows bridge video, right? But all sorts of innovations have their problems, mostly invisibly to the public. The last time someone attempted to disrupt tunneling before the "boring company", the "New Austrian Tunneling Method", it turned out to have serious problems: https://www.hse.gov.uk/pubns/natm.htm

For a recent, major construction failure in the U.S., see: https://www.nola.com/news/article_ca2f263a-ed77-11e9-8cfa-57...

> Software has more degrees of freedom. Unlike physical building disciplines where the types of materials are limited and generally well known,

The fact that materials used in construction are not that plentiful, and thus generally well known, is not a law of nature. In fact, it's the direct result of a radically different engineering culture. In construction, you only use building materials and techniques that are well known and demonstrably robust and performant. Even new building elements, materials, and techniques must be subjected to a whole batch of certification procedures and approved by a bunch of regulatory bodies before they can be adopted.

Hence, a brick is a brick is a brick, because bricks are standardized, and all the bricks you come into contact with are certified and must comply with established standards.


The fact that software development is a mess is precisely due to all the shortcuts and refusal to follow common practices established in engineering.

> you only use building materials and techniques that are well known and demonstrably robust

For better and worse. Most older building materials do not pass current building codes¹, even though many of them are perfectly safe and robust, while being way more ecological. Usually more labor intensive yet often still cheaper.

¹ depending on region

> For better and worse. Most older building materials do not pass current building codes¹,

That's primarily because they had to comply with the requirements of their own time, rather than with requirements created in the future.

Nevertheless, rigorous compliance is not the point. The point is that the whole industry is focused on standardization, complying with specs, and use only tried and true technologies and techniques. That's not what the software development industry is about.

> use only tried and true technologies and techniques.

I was merely pointing out it has significant and serious downsides as well (which may be overwhelmed by the upsides).

If we were to apply similar principles to software development, it would have benefits and also downsides, both of which should be considered.

> But it is not really possible to make a server resilient to null pointer exceptions

This problem was solved 30 years ago. Maybe we should pay more attention to improvements in our ecosystems...
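The comment presumably alludes to option types, which make absence part of the type signature so the caller has to handle it before dereferencing. A rough Python sketch using `typing.Optional` (the `find_user`/`greet` functions are invented for illustration; languages like ML, Haskell, and Rust make this check compile-time mandatory, while Python only gets there with a static checker such as mypy):

```python
from typing import Optional

# Option-style null handling: the signature admits absence, and the
# caller deals with it explicitly instead of crashing on a null.

def find_user(user_id: int, db: dict) -> Optional[str]:
    return db.get(user_id)  # None when the user is missing

def greet(user_id: int, db: dict) -> str:
    name = find_user(user_id, db)
    if name is None:  # the absent case is handled, not thrown
        return "hello, stranger"
    return f"hello, {name}"

db = {1: "ada"}
print(greet(1, db))  # hello, ada
print(greet(2, db))  # hello, stranger
```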

> 1. Software is easy to fix. Compared to other construction disciplines, software is the cheapest to fix.

You could just as easily say software is extremely costly to fix. Software is labor intensive, so fixing it can cost a lot of labor from extremely highly paid people.

But really, I think the way to put it is "software seems much easier to fix than a physical object." It seems that way to the software engineer and to the manager talking to the software engineer.

Not only are there no physical reminders of how difficult changing a given piece of software is, but the challenge is primarily in the unexpected and the illogical. Fixing routine X would be easy if it weren't unexpectedly and/or illogically tied into a variety of other functions. And this stream of unexpecteds is extremely easy to forget or discount.

Software fixes can be costly, but they are easier than most other fixes at scale.

Doing a recall on a popular hardware product is orders of magnitude harder than issuing a software update for that same product.

There have been a few scenarios at my job where it was easier to just solve a problem in hardware, but once a bunch of hardware gets shipped, only software can save us.

It's easier to update the software than change the hardware. But you have the problem that the software is never fully fixed through multiple iterations of updates, because of the complexity of the software and because the company pushes new features, with new bugs, along with the bug fixes.

But my comment above wasn't really saying software was more or less costly than hardware. It was mostly saying that software is more unpredictable than hardware.

> You could just as easily say software is extremely costly to fix. Software is labor intensive, so fixing it can cost a lot of labor from extremely highly paid people.

Developing a fix for a software issue can be expensive, deploying the fix is relatively cheap in today's world with automated software updates.

Compare to e.g. fixing a design flaw in a car, where the development of a fix can be relatively cheap but deploying it (recalling all affected cars to the dealer to have new parts installed) is pretty much always expensive.

> Failure means minor inconvenience for a lot of people.

That alone makes software more critical than we give it credit for, especially if there are many users.

Let's say you've got some mildly popular piece of software, with 10,000 users. Let's say a particular inconvenience means they each waste like 10 seconds per day. Total waste is 100,000 seconds per day, or 28 hours. Per day.

This adds up pretty quickly.
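The arithmetic above, spelled out with the figures from the example:

```python
# 10,000 users each wasting 10 seconds per day on one small annoyance.
users = 10_000
seconds_wasted_per_user_per_day = 10

total_seconds_per_day = users * seconds_wasted_per_user_per_day
total_hours_per_day = total_seconds_per_day / 3600

print(total_seconds_per_day)       # 100000
print(round(total_hours_per_day))  # 28
```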

It still adds up slower than in many other engineering areas. A recent technical problem on my subway line (Paris Métro 6) delayed everyone travelling in a roughly hour-long window by 30 minutes. Based on average ridership figures from Wikipedia, this adds up to ~250 days lost. An article from 2016 [1] puts the downtime at ~2h16m per month for that line. That's ~1000 days lost per month, for just one subway line.

[1]: http://www.leparisien.fr/info-paris-ile-de-france-oise/metro...

I'm certain that even the current Windows 10 search bug doesn't come close to that kind of impact.

The subway is not the norm, it's an outlier. Most engineering projects are not nearly as important or impactful. An apt comparison would be a fairly big internet provider, a popular game (1M users, not just 10 thousand), a popular productivity suite…

I have chosen the small end of the scale, but let's think about Microsoft Word for a second. Or how much time Adobe Photoshop takes to start up.

What if you're a coffee machine manufacturer? A broken device or a slow brew can waste way more time. And you can even hot-patch all devices at once!

An online coffee machine? If that thing's got a vulnerability, I can already envision some corporate spy recording discussions, sending them home, and combing them for corporate secrets…

They've been putting listening bugs into appliances long before the IoT was a concept. The IoT "connect all the things" approach just makes it easy on a grand scale.

People always forget that software engineering is a relatively new thing and still rapidly evolving. If you look at buildings from a couple of hundred years ago (not the ones still standing, but the average), or at the golden age of flying around 1950, it was mostly the same: better than not having those things, but nowhere near perfect.

> Software fails partially much less often than physical things.

If only this were true. The worst kind of errors are the silent kind: just correct enough not to crash or trigger any alert, but wrong enough to corrupt results.

You can't discuss software in this context without discussing IBM global services, oracle, computer associates, accenture etc etc.

Shut them all down, watch the average quality of software improve dramatically, immediately.

Next, have someone qualified on every single board of directors and have each board create an appropriate sub-committee, the same way you have a qualified accountant and an audit sub-committee. Have someone with proper CS & IT credentials and an IT oversight committee. (Sure, mostly the initial massive win is warding off the vampires mentioned in the previous paragraph, but there is huge, huge value beyond that.)

Having ignorant people make the resource allocation decisions is idiotic in the extreme and leads to overpriced, rubbish-quality outcomes. Like deciding to use some garbage vote-counting app that doesn't and can't work - who did that? Can we just say they are utterly ignorant of the field they made important decisions concerning, or do we need to get their names and demonstrate it?

Why are we so bad at software engineering? We are not! We just aren't. We do amazing things. We can do it reliably. We can do it economically. Software is f*&king awesome.

Why is there so much corruption in the decision making process leading to garbage quality overpriced, risky and idiotic software? Now that is a better question to ask.

Why is the idea of actually regulating foolhardy risk-taking startups (self-driving, privacy-invading, turn-key fascist-state surveillance, etc.) so controversial? Because we can't even make good decisions about CRUD development at a policy level in a Fortune 500 company - forget a policy decision at a government level; you know it's going to be awful and redolent with regulatory capture.

We just need to grow up and stop blaming the geeks for the utter manure shoveled at us by ignorant jocks on golf courses determined to exclude anyone with actual understanding, insight and knowledge. And the manure shoveled by the actual geeks with a massive risk appetite and zero care for externalities beyond their startup making cash. Really. That's it. That's all.

Software is FINE. Decision making about software is SO bad, so awful, so hideous we try not to think about it lest it rots our minds with despair.

I sense a fair amount of the No True Scotsman fallacy going on in your comment--you're arguing that there's no fundamental issue in software engineering, only evil management preventing engineers from unleashing perfect software. But I suspect that we are so bad at our jobs that we build broken stuff all the time and merely blame our users for not knowing that everything is broken. (Try putting a space in your home directory and see how much stuff breaks.)

Heartbleed should be the equivalent of the Kansas City Hyatt disaster for our profession. It is a failure mode that is so elementary, so obvious, so easily avoidable that its occurrence should be a sign of deep failure in several processes meant to avoid it, and it should be grounds to open an investigation into criminal negligence. And yet... OpenSSL had no process for catching this stuff. Very few software projects do--I suspect yours do not either.

> Decision making about software is SO bad, so awful, so hideous we try not to think about it lest it rots our minds with despair.

And to add the icing to your cake, more often than not there are no consequences for those making these bad decisions.

It's funny how many engineers, especially in Silicon Valley, presume themselves apolitical and above such petty social concerns, when the process of making software is itself intensely political.

Hell, just being put in tech lead position and having to juggle engineers’ opinions tells me just how much of a minefield we’re in.

> when the process of making software is itself intensely political

How is making software more political than doing maths?

So by political, I'm conflating two meanings. A lot of Silicon Valley techies are apolitical in a sense of against government intervention, believing that science and their own rationality would be able to do better than the social institutions and powers-that-be. That sort of ideology is fine to have, but I suspect the same mindset can lead to neglect of the politics informed by the second meaning, which simply refers to the practice of negotiating, group-work, wheeling-and-dealing; in that sense you can also call it "business" decisions.

Creating software, whether in a corporation or in an open source project, is more often than not a process that involves many stakeholders, processes, and yes, irrational traditions or even "religious" ideology (spaces vs. tabs, etc.). So any time you have more than one person writing a program, things will get political. The decision-making behind software is a political process.

Mathematics, in contrast, is probably not political except in the higher levels of academia.

> So any time you have more than one person writing a program, things will get political.

I agree with you that the software industry seems to struggle more with these kinds of political issues.

However, there are many human endeavours that also require input from large numbers of stakeholders, and they seem to cope (e.g. going to the moon, building a bridge, building a skyscraper, etc.).

That suggests the problem is not so much the large number of stakeholders, but rather the software industry struggling to cope with situations requiring large numbers of stakeholders.

Quality on the initial revision of Apollo Program hardware and software wasn't great. They did as well as they could given the constraints at the time. But there were numerous failures along the way, and several missions survived mostly through blind luck.

>...apolitical in a sense of against government intervention, believing that science and their own rationality would be able to do better than the social institutions and powers-that-be.

That's an explicitly political position.

You’re not wrong, but some don’t realize that. Choosing not to play is a move.

You're talking about writing code.

The gp is talking about: requirements; allocating budget between research, salaries, testing, development, support, marketing, exec's pockets, shareholder's pockets, etc.; decisions about which standards to support or not; agreements with other firms; etc. etc. etc.

Software is the automation of processes. Processes are intensely political, because they affect the world.

Well after just having a 5 hour meeting discussing what exactly constitutes the Presentation Layer vs the Application Layer and debating about whether Authentication counts as business logic in the presentation tier, I can tell you for certain it is Political at times.

Math may not be political. Often it’s the input to the math that’s political.

Besides, the software engineering process is full of political decisions. What does an “unbiased” search engine mean? When you build a recommendation service, do you intentionally reinforce bias of the user or expose the user to other points of view? How do you define and handle abuse of your service? I could go on and on.

The Manhattan project and the Apollo project both contained a lot of maths and were also highly political.

> And to add the icing to your cake, more often than not there are no consequences for those making these bad decisions.

Not for the folks making the bad decisions, anyway.

Promotions aren't consequences?

Your suggestions are terrible and would make the situation worse. There are no "proper CS & IT credentials" which can qualify someone to do oversight. In particular most Computer Science degree programs don't cover any relevant material.

And what even counts as "software engineering"? If I want to write a VBA script in my spreadsheet am I supposed to ask some anointed expert for approval? It's just ridiculous and totally unworkable.

There are no proper accounting credentials. Anyone can call themselves an accountant. An accounting degree or CPA is not in itself qualification to do oversight. Audit committees don't solve everything either. Still seem to be worthwhile.

If you are running a Fortune 500 company and your VBA spreadsheet is a material expenditure in your financial statements, then YES, get it approved by the IT committee. (If you get a vampire-squid consultancy to build it for you, it probably will be!) Non-material expenditure? No - who cares if it's not material? (Material has a GAAP definition and is deemed to be something like 5% of the balance of the asset or liability, or an impact of more than 5% of revenue or expenses.) Someone who has studied more recently than me can probably tell us what GAAP says is material for audit purposes.

> There are no proper accounting credentials. Anyone can call themselves an accountant.

This is not true everywhere. In my country, accounting is regulated [0] and can only be performed by credentialled professionals, registered on the Federal Council of Accountants.

[0] http://www.portaldecontabilidade.com.br/legislacao/resolucao...

A CPA specifically is qualification to do oversight. CPAs do occasionally fail in their oversight role but that's the exception, not the norm.

> Next have someone qualified on every single board of directors and have each board create an appropriate sub committee. The same way you have a qualified accountant and audit sub-comittee.

I think it makes sense to note that corporations don't adhere to GAAP voluntarily, or because it makes good business sense. They do it because they will be de-listed from stock exchanges and shut down by the government if they do not.

The stock exchanges won't do this for "tech malpractice." From a financial standpoint, tech malpractice is just another calculated risk. Versus financial malpractice, which creates an unlevel playing field on the stock exchange itself. And history has shown that the fallout from "tech malpractice" ends up costing comparatively low dollar amounts anyway.

This leaves me at: this won't improve without government regulation.

> This leaves me at: this won't improve without government regulation.

On the other hand, lack of government regulation has tended towards software costing $0.00 and very rapid evolution and innovation.

$0 software? Have you ever heard of Washington? The word "Enterprise"?

But the point is taken. So what proportion of $0 software, rapid evolution and innovation has come out of vampire-squid consultancies and their billion dollar revenue streams?

The upper limit of my estimate is 0%.

Having an IT committee on the board of directors is a pretty light regulation. Start with the banks. Your money is just bits on their disk.

Just having some people who get fired when the project costs a billion and fails is useful.

> And history has shown that the fallout from "tech malpractice" ends up costing comparatively low dollar amounts anyway.

Disagree. The cost is instead pushed to the consumer.

How many man hours were involved in fixing HeartBleed?

How much time/money have people lost due to security/tech malpractice at Equifax?

I've never been in a position where _lives_ were on the line, but I have been on projects where bugs would cost the company real money. The engineering mindset changes quickly.

Just because the company found a clever way to shift the cost off their balance sheet doesn't mean it's gone. It's just been hidden.

> You can't discuss software in this context without discussing IBM global services, oracle, computer associates, accenture etc etc.

> Shut them all down, watch the average quality of software improve dramatically, immediately.

This is naive. All of those organizations hire extremely smart people and pay them well.

The issue isn't them existing; instead, it's companies wanting "bodies" of developers/operators at the lowest price, hundreds of millions of folks in more impoverished areas willing to answer the call, regardless of the stipulations, and this being extremely profitable when done at scale. Blame executives continuing to think that tech is a cost to be optimized, not an investment to care for.

This pendulum is starting to swing in the other direction (paying for quality not quantity), but it's a slow adjustment.

Accidenture has a vested interest in their products' failure, lest we forget. If they were to hire smart people, that would just make them even more dangerous.

Well, in my experience this is equally (or even more) dangerous. I've seen engineers take over, and it turns out they are not necessarily better (at least not uniformly good) at "simple" things such as estimating within 10-20% bounds - even in organizations that are supposed to have some of the best of them. Rather, organizations that believe in this kind of thinking (if we could just get rid of management and do it ourselves, we'd be so much better off - it reminds me of NIH syndrome) are typically worse off. It's like people who don't believe they are susceptible to ads - they are generally the ones most susceptible.

Yes but .. huh? How many developers claim to be good at estimation if not for their managers?

My experience has been that devs would much rather not make estimates because they know they can't do it, and when their management is itself made of engineers, they aren't asked to. Engineering-led businesses find other ways to avoid the need for estimates.

The problem with software is that "copying is free." So old shit does not die; because it is cheap, it gets used in new programs too.

It would not surprise me if firing 75% of working programmers would have a net result of more, higher quality software being created.

... unless, of course, it's management deciding to fire the engineers in the top 75% salary band.

Which they would be inclined to do, given this scenario.

I'm not even sure that wouldn't be a net improvement. There are a hell of a lot of people not doing much besides increasing communication overhead and friction and bikeshedding.

Possibly, the other half is convincing management that some developers are really worth 2-3x as much.

Just having a quarter as many people to communicate with would be a net positive, so I'd be happy with that even if we didn't get a raise :P

Firing most of the middle managers is required too.

>Next have someone qualified on every single board of directors and have each board create an appropriate sub committee. [...] Have someone with proper CS & IT credentials and an IT oversight comittee.

And let me guess, the sub-committee should have its own sub-committee, to hold meetings to prepare for the meetings to prepare for the meetings?

It reminds me of the company (which one? Philips?) that got so fed up with these people that it put all of them in a new department. Then they got to hold their meetings, the rest got to do their work, and everyone got along.

Interesting that you think IBM Global Services, Oracle, Computer Associates, and Accenture are making low-quality software.

Which company is making good Quality software in your view?

If you've ever worked with any of these companies, you know they make terrible software. They just don't have the incentive structure to be otherwise.

Only one company I ever worked at was making decent quality software, and that was a small consultancy.

If you raise an issue and a few days later they update with a fix, then it's probably good.

You're never going to see that from IBM. You will get an email from a marketing team telling about all the other services you may enjoy.

Well, at the risk of being flippant: if you don't know what good-quality software looks like, you probably work for, with, or on behalf of these vampire-squid consultancies - or you have no business having an opinion about it or trying to form one. Really. It's as pointless as forming opinions about who is a good contemporary composer if you've been deaf since birth.

Look at all the successful startups disrupting industry by writing software. Note the total lack of vampire-squid consultancy in their codebase.

Look at the public service orders of magnitude cost blowouts and non-functionality that is the norm. Look at the domination of vampire-squid consultancies right there.

It's really not difficult to see unless you're determined to keep your eyes shut. The vampire-squid consultancies should not exist and are a symptom and proof of the prevalence of misallocation of resource in decisions made by the wholly ignorant.

> Look at all the successful startups disrupting industry by writing software.

You seem to be equating 'disrupting an industry' with (high) quality software. Could you please elaborate how this holds?

I would also appreciate examples of high quality software, I don't care about the impact of said software.

99% with you, but the vampire-squid consultancies came into being on the backs of market participants trading bank notes for software, and continue to exist on a steady supply of bucks for bugs.

So what did they do right, and what are they continuing to do right other than infecting large organizations with their blood sucking tentacles in a manner which closely resembles a Hokusai woodblock-print and removing some of the favored organs of the management end of the org chart that to be honest, they probably weren’t using anyway?

They are just horrible to deal with.

I get drafted in as a vendor for part of a package that several of them offer to Fortune 500s. They provide negative value - it would be massively easier if I could just hook up directly with the end-customer's IT and do what needed to be done with competent people who know their job and what they want. But I have to play telephone through two or three layers of project managers in Delhi and Bangalore and an ever-shifting array of other people of indeterminate status and role.

They are a symptom of resource allocation being performed by those who are wholly ignorant. They exploit this, viciously and mercilessly. They should not exist. They would not exist if the people making the resource allocation decisions, i.e. senior management and boards of directors, were competent to make decisions with tens, hundreds, even thousands of millions of dollars at stake. The con is on. It's much easier to con the ignorant, and they do. Moreover, they mercilessly attack, undermine, and destabilise anyone working in the public service or a Fortune 500 company who might actually have enough of a clue to get in their way.

It has nothing to do with Hokusai.

Excusing vampire-squid consultancies on the basis that the conditions are ideal for their horror is simply unconscionable.

They are a disgrace to our industry. They desperately, desperately need everyone to believe "we are bad at software engineering" to continue their con. But we are NOT. We can do software engineering and have done it well, so many times, so publicly, with such spectacular success, disrupting powerful incumbents from the garage with nothing more than software. We will continue to do so. Who is next? You? Me? Absolutely not anyone hiring a vampire squid. Guaranteed.

Something about this vision struck me as dystopian.

Fashion. We're not bad at software engineering. We're bad at correctly labeling what is "software engineering" and what is not.

"Software engineer" today is just a job title. It makes us sound cool, but it has about as much to do with engineering as "Scrum master" has to do with rugby. In lots of places, you don't need to have an engineering degree or be a certified engineer to claim the title.

Teams that actually practice Software Engineering, like the famous Lockheed shuttle group, are as good at it as we are at any other branch of engineering.

It's wildly misleading to use the same word for an app written by a few programmers in a couple of months (like the Iowa Caucus app) and the Space Shuttle control software (maintained by hundreds of people, with an average of 1 bug per release), and everything in between. It'd be like letting anyone call themselves Doctor.

> It's wildly misleading to use the same word for an app written by a few programmers in a couple of months (like the Iowa Caucus app) and the Space Shuttle control software (maintained by hundreds of people, with an average of 1 bug per release)

You are drawing a distinction here between two vastly different project types—not between the roles of people on the projects.

Of course something that has a million times the budget has the potential for significantly higher reliability.

Additionally, the particular requirements of the space shuttle software project make reliability extremely important.

If a computer programmer were to allocate similar priority to reliability in an app for chatting with friends, I'd argue this would make them less qualified for the title of engineer.

My understanding of what characterizes an engineer is that they make effective decisions about technical tradeoffs, and come up with plans that can be realistically executed while optimizing a set of project requirements (technical and non).

In my experience, software engineers are (often) appropriately labeled as such, insofar as the work they do matches that definition (if the definition is inadequate, that's another matter—happy to hear an alternative).

My hypothesis is that people start getting fussy about the 'software engineer' label for two reasons:

1) There's a lot of variety in software work being done. Not everyone writing code is an engineer. Not sure about the situation at large companies, but at least in the startup world, in my experience, everyone I've worked with who had the title deserved it in the sense that they were doing engineering work according to the above definition.

2) The requirements for software projects—especially in terms of the importance of reliability—are very different from those in more traditional engineering domains, which can create a superficial appearance of poor/non-existent engineering. But this is exactly what you'd get if the engineer did an extremely good job, but with project requirements that don't place high value on e.g. reliability.

It would be great if every app could have formally verified code etc.—but the reason they don't has to do with businesses funding these projects, not with the engineers working on them.

> It would be great if every app could have formally verified code etc.—but the reason they don't has to do with businesses funding these projects, not with the engineers working on them.

You can look at it from the company's point of view: Net profit would go down if they formally validated everything. Put that way, it sounds horribly selfish of the business. But you could also look at it from society's view: Net value to society would go down if everything was formally validated. Why? Because much less software would be created. And, for something like a chat app, we (society) don't actually need the level of reliability that we'd get from formal verification. Don't waste the effort doing it, because it's a net loss. Go spend the time building something else.

The result is that we get chat apps that crash. But we also get much more total stuff. I think that's a net win, even though it's annoying when the chat app crashes.

> ...much less software would be created.

I'm not convinced this is necessarily a bad thing.

> If a computer programmer were to allocate similar priority to reliability in an app for chatting with friends, I'd argue this would make them less qualified for the title of engineer.

While it's entirely correct for an engineer to use different materials in different circumstances, I don't think I've seen professional engineers in other fields choose methods which they know will fail under expected stresses simply because it's not considered an important project. There's still a safety factor.

> Of course something that has a million times the budget has the potential for significantly higher reliability.

I've seen software projects with funding everywhere between $0 and $outrageous, and the latter never use their extra funding for reliability. It's always for scope. They build software in exactly the same flimsy way -- just a lot more of it. You never see a press release that says "We took $5M in funding so we can finally upgrade from SCRUM to SEI 3, and fix all those dang bugs."

> I don't think I've seen professional engineers in other fields choose methods which they know will fail under expected stresses simply because it's not considered an important project. There's still a safety factor.

It's not that the project is unimportant so the engineer doesn't care about it, it's that the project's requirements have a certain tolerance for failures. The project dictates that tolerance, not the engineer; the engineer develops and implements a plan that works within the tolerance.

And no, typically there is not a safety factor, which is part of why the tolerance is higher for these projects.

> They build software in exactly the same flimsy way -- just a lot more of it

Again, this is not something determined by the engineers—it's just part of the project requirements the engineers are employed to work from.

I think your gripe is more with something like software product managers than it is with software engineers.

Nobody writes code that fails reliably under expected conditions. Reliability is a question of robustness in the face of unexpected or rare conditions.

BTW, chat apps are a bad example. Not sure why this thread is using them. WhatsApp has higher uptime than the telephone system itself, if I recall correctly. Making highly scalable and available message switches is a solved software/computer engineering problem.

A better example is most bespoke business software. If it collapses in a messy heap because a microservice ran out of disk space, people shrug and put in alerts to make sure disk space doesn't run low again without an operator being notified. Job done. Not exactly a rigorous approach to failure but it'd make no sense to spend weeks or months on making the entire system fail gracefully in that situation when it's easily avoided.
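The "shrug and put in alerts" response described above can be sketched in a few lines. This is a hypothetical free-space check that an operator alert could hang off; the path and threshold are made-up values, and a real deployment would page through a monitoring system rather than print:

```python
import shutil

# Hypothetical threshold -- alert when less than 10% of the disk is free.
ALERT_THRESHOLD = 0.10

def disk_space_ok(path="/", threshold=ALERT_THRESHOLD):
    """Return True if the free fraction at `path` is above `threshold`."""
    usage = shutil.disk_usage(path)  # named tuple: (total, used, free)
    return usage.free / usage.total > threshold

if __name__ == "__main__":
    if not disk_space_ok("/"):
        # In a real system this would notify an operator
        # (e.g. via PagerDuty, email, or a monitoring dashboard).
        print("ALERT: disk space low")
```

A cron job running this every few minutes is the whole "rigorous approach to failure" for a lot of bespoke systems, and as the comment notes, that is often a perfectly defensible tradeoff.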

Whether we're bad at software engineering or not doesn't seem to be the point to me. I think the point is economics and other factors causes us to forego engineering almost every time. The space shuttle control software is a massive outlier in a giant body of software.

edit: although looking back at the article and paying closer attention to the title, I can see why the point I replied to was made.

> Teams that actually practice Software Engineering, like the famous Lockheed shuttle group, are as good at it as we are at any other branch of engineering.

I’m not sure that’s true, or perhaps more accurately, I’m not sure that bar is high enough.

No other branch of engineering, at least none that come to mind, have to deal with the number of variables that any medium-sized software application does. Software, being infinitely malleable and running on a seemingly infinite variety of hardware in an infinite number of environments, is vastly harder than any other engineering.

>No other branch of engineering, at least none that come to mind, have to deal with the number of variables that any medium-sized software application does. Software, being infinitely malleable and running on a seemingly infinite variety of hardware in an infinite number of environments, is vastly harder than any other engineering.

I'm not going to argue if this is right or wrong as I don't have a Software Engineer's perspective. My background is in chemical and materials engineering in heavy industry (mining and manufacturing sectors).

I've worked in Operations and Process Engineering positions (https://en.wikipedia.org/wiki/Process_engineering) and in more recent years I've been employed as a "Technology engineer" which is essentially integrating new technology and process improvements and optimizations into existing industrial applications.

The scope and complexity of what engineers deal with daily might surprise you; software is complex, but so too is something like a smelter or refinery.

Pretty sure I can spend 15 min on Wikipedia and find dozens of engineering fields vastly harder than software engineering.

Also, your definition of software is really far from reality. 99% of software doesn't have to deal with hardware directly and is compatible with a number of environments I can count on one hand.

There are engineers who have to worry about quantum effects that physicists barely understand. I think, by definition, that's harder than Turing-complete software.

A vast number of hardware configurations and an infinite number of environments is not the contemporary issue with software. The contemporary issue is that software does not run properly even when it was meant to run on one specific piece of hardware, is often not tested at all, contracts are made with impossible deadlines, and even software practices known for years are still not followed (however much management claims to customers that they are).

There exist cheap, dangerous DIY aircraft and expensive ultra-performant bug-free programs, but mostly we see expensive, safe airplanes and buggy but cheap(ish) software.

Also SWEs do a lot more "on-the-fly" modifications than physical engineers, simply because it's so much easier in software.

Agree. Also, nowadays everybody is “senior software engineer” after 2 years of experience, which is ridiculous.

I wonder how different the situation was before the internet, and to a point, before personal computers.

Computer engineering was probably slower and more deeply thought out. Or am I imagining an ideal past?

It used to be this: making something that works is step 1, then you rewrite it again and again until performance is what you need from the application. What seems like a bad idea sometimes works, and what seems like a good approach may not. A lot of silly trial and error. It was good fun seeing hacks that far exceeded what was thought possible. Oh? It's 1000 times faster than my best approach?? What the hell??

Ease of development was a utopian dream.

100% agree. The software engineer/architect gives plans to the software developers. Software developers actually build the thing to the specifications of the engineer, just like a construction crew. But all of that is only true if there is an actual license to be a Software Engineer that is held in the same esteem as other engineering fields. Until then, the conflation of software engineer and programmer/developer will persist, because it makes us sound cool.

This is the dumbest comment in the whole thread. You can't design software if you aren't building it. There are just so many unforeseen problems that come up when you're actually in the thick of it. It's not like a house that you build 1000 times from the same plan. Every single piece of software is unique in some major way.

There are so many CRUD apps out there, they could be spec'd with such specificity that they'd only need a "coder" to complete the tickets. But the tech industry has combined the architect/engineer and construction-worker roles into a single developer role.

There is no such thing as a CRUD app, and every app is a CRUD app.


Someone who can make a boxy UI, hook those boxes up to {{text templates}}, and make GET/POST/PUT/DELETE calls (to RESTful API just thinly wrapping a SQL DB) can recreate a shocking number of popular web and mobile apps. Those are CRUD apps.

Non-CRUD apps have an unusual (non-boxy) UI, non-trivial backend processing/integrations, or both.
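The "RESTful API thinly wrapping a SQL DB" pattern described above fits in a few lines. This is a sketch, not any particular framework: a hypothetical `notes` table via sqlite3, with each function standing in for the REST verb named in its comment:

```python
import sqlite3

# Hypothetical single-table schema; a real CRUD app is mostly N copies of this.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT)")

def create(body):                       # POST /notes
    cur = db.execute("INSERT INTO notes (body) VALUES (?)", (body,))
    return cur.lastrowid

def read(note_id):                      # GET /notes/<id>
    row = db.execute("SELECT body FROM notes WHERE id = ?",
                     (note_id,)).fetchone()
    return row[0] if row else None

def update(note_id, body):              # PUT /notes/<id>
    db.execute("UPDATE notes SET body = ? WHERE id = ?", (body, note_id))

def delete(note_id):                    # DELETE /notes/<id>
    db.execute("DELETE FROM notes WHERE id = ?", (note_id,))
```

Swap the in-memory DB for a real one, put an HTTP layer in front, and repeat per table - that's the shocking number of popular apps the parent is talking about.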

I wish I were the software developer you're describing. You wake up in the morning without stress about 20 of yesterday's decisions, take your day's task list, and just do what you've done every day with your well-known tools and knowledge for fifteen years in the field. That's how most of us want it, no sarcasm or anything.

Funny thing, the PE software engineering exam was so unpopular they discontinued it... https://ncees.org/ncees-discontinuing-pe-software-engineerin...

Agreed. An insult to real engineers. As misleading as calling programmers Software Doctors.

You can become a chartered engineer based on software engineering experience and a CS degree. Ergo, you can be a software 'engineer'.

Ironically, the title "Doctor" originally meant "teacher".


What does an SWE actually do? I mean, I have some idea how e.g. building or bridge engineers work and how they plan their projects: standards, blueprints, legal safety requirements, physics, calculations, approval, etc. But what is software engineering, in short?

Upd: is there AutoCAD for software?

Most software and for that matter, most software developers don't need that level of detail. It's the difference between building a nice doghouse and a skyscraper. The problem is too few know the difference, fewer still know how to hire properly, and many seem to think that one developer is interchangeable with any other.

Of course, tethering to education requirements doesn't mean much. I think a formal guild could work, where you build your reputation through the endorsement and status of others, and if you fail, not only are you brought down, but those who endorsed you lose cred as well. Setting up such a system would be difficult and take time, and I'm unsure if it would ever reach critical mass.

Also, unlike doctors and other fields, I don't think you should have to be formally licensed to work in software. Most initial developers are, were, and continue to be cross trained from other fields.

>It's the difference between building a nice doghouse and a skyscraper. The problem is too few know the difference

That’s what my question is about!

A building engineer knows or reads N54.s-1e/624 "on public safety in a first-floor lobby" to design it. It may specify various requirements, like no sharp edges, glass walls needing stripes on them, non-slip floors, etc. Then, when they need to add an electronic lock to all main entries, they contract with a seller whose locks are licensed under LS9-41.3/FLL. Then a fire inspection validates these entries for public safety again, involving standardized legal rules.

I don’t even try to describe the actual building from the ground to the concrete structure process, because that will be too naive. Lots of standards, requirements, ready-to-use plans, or a project approval process that requires half a year for a shallow review of all documents in many controlling departments.

What I'm interested in is whether there is such a level of engineering in software at all. I don't know of anyone who did something like what a regular building engineer does at (or before) the construction site, but in software. Is there such a thing as Software Engineering at all? No popular software seems to show signs of being beyond dog-house level when you read its code.

I'm not sure it answers your question but in software, the requirements are almost never codified up front because extracting the rules is part of the project.

In terms of the dog-house, there's that rule "Build one to throw away". Since no two software projects are similar, the only way to truly know what you're building is to build it, see what works, throw it away, then build it properly.

Part of what makes software so good is that you _can_ upgrade the skyscraper while people are in it, metaphorically hardening floors or adding new levels.

I think the software metaphor for your building codes is basically "Software coding standards and guidelines" and most large platforms will have them. For example https://www.oracle.com/technetwork/java/seccodeguide-139067....

And yes, good software engineers are expected to know those practices.

Aerospace, higher end weapons systems and most medical hardware+software... It's usually when in conjunction with life/death physicality that these things come into play.

Unfortunately, the efforts on some of these systems are much harder to work through in practice than something created more organically.

This is also where non-software companies get into trouble trying to do software projects. They think they're building a house so they hire a carpenter. They don't realize they're also going to need a general contractor and maybe a carpet guy.

> I think a formal guild where you build your representative value by the endorsement and status of others, and if you fail, not only are you brought down, but those that endorsed you will lose cred as well.

You mean like, a tech firm? Or a software consultancy business? :)

No, I mean a guild... Closest modern representations would be somewhere between a professional board and a union. Bob was trained by the great Tim, who created Foo and Baz. Tim endorsed Bob as a Journeyman from an apprentice. Steve and Phil (both master level) later endorsed Bob as a Master craftsman.

Bob screws up big time... complaint filed with guild board for review... board revokes Bob's master status (back to journeyman) and notes on Steve and Phil's record.

If Steve and/or Phil get too many hits, they lose their status as well.

My point is that what you're describing is a firm, just with different terms.

Bob was trained by the great manager Tim. Tim endorsed Bob from internship to junior engineer to senior architect. Steve and Phil both later recommended Bob for promotion to CTO.

Then Bob screws up big time. The board of directors gets involved. Bob is fired and eventually gets a new job as a middle-manager or lower in some other firm, or if the screwup is large enough, takes early retirement.

The "guild" is doing the same things as a company does. But judging skill, success and failure is quite subjective. That's why we need so many competing firms to get results. A single guild would be like everyone working for a single company. It'd yield very poor results.

One big difference is that, at least in many jurisdictions, new (non-software) engineers need to survive what amounts to an apprenticeship. A second one is passing standardized exams and demonstrating experience before achieving the title.

Now, I don't think apprenticeships are unmitigated positives - another function they serve is to limit field entry. But they are, I think, far better than what "we" do, and sorry, pair programming is not a substitute.

I don't know what AutoCAD has to do with anything. I use it for modeling parts that I machine as a hobby; that doesn't make me a structural engineer.

By AutoCAD I mean some app that helps you to design architecture and components and then validate these against existing rules. Not a literal mechanics/blueprints AutoCAD, but ?-CAD for software development.

>But they are, I think, far better than what "we" do

They should be far better, but an engineer is not an abstract position. If all it takes to be an SWE is knowing the textbook algorithms and some patterns (just my uneducated guess), then it is not engineering. See my other comment for details. Yes, a CS degree adds to one's competence, but that is like saying a physics degree allows you to build an elevator. It doesn't; it will just be a nice science-grade, dog-house-level elevator, because learning physics doesn't teach you all the standards that elevators require, legally and practically.

Maybe something like a UML or data-flow diagram could be similar? There are more formal specifications you can do, as part of 'formal methods', which I think might be closer to some engineering disciplines [0] - here's one from where I work.

I do think software engineering as a discipline exists. Looking at the definition written by the American Engineers' Council [1], there have definitely been teams I've worked on where we met this criteria. There are teams like the Lockheed shuttle group who average 1 bug per release that clearly embody this criteria too.

Also, my CS Masters is accredited as meeting the educational requirements for Chartered Engineer status (which I'm hoping to achieve once I finish) so it's clearly accepted by some in the engineering community!

[0] https://hydra.iohk.io/build/789825/download/1/ledger-spec.pd...

[1] https://en.wikipedia.org/wiki/Engineering

> One big difference is that, at least in many jurisdictions, new (non-software) engineers need to survive what amounts to an apprenticeship. A second one is passing standardized exams and demonstrating experience before achieving the title.

"[M]any"? Outside of a handful of really anal jurisdictions like Canada I am not aware of any where this is a requirement. I chose not to go this route because I didn't see the value in it for the type of electrical engineering I was interested in.

Delve into the courses listed here, it might help: https://www.msse.umn.edu/MSSE-Master-Science-Software-Engine...

> is there AutoCAD for software?

JetBrains IDEs?

Totally agree. SWE is the only field where people can call themselves "engineers" without having an actual engineering degree.

Contrarian opinion:

We are great at software engineering, and our field is the best at engineering generally.

Engineering isn't about how reliable or high quality the system is.

Engineering is about balancing the costs and the benefits.

It can be great engineering to make a running shoe that wears out after 2 races, if it wins marathons.

People who think software engineering is bad are actually bad engineers, who think engineering is all about quality or reliability, rather than about optimizing tradeoffs. Software engineering is doing an amazing job at producing products that balance cost and benefit, which is why software is eating the world.

Engineers from traditional disciplines would struggle here, because it's a lot easier to follow super expensive tried and true process, than to situationally make tradeoffs like we do.

Yet software engineers sometimes fail at actually implementing that balance.

> It can be great engineering to make a running shoe that wears out after 2 races, if it wins marathons.

Sure, but it is bad engineering if it fails halfway through the first marathon and injures your runner. See Iowa. Nobody is complaining about the fact that tradeoffs are made (strawman), what people are complaining about it is that software engineers make tradeoffs unknowingly/don't understand what they are doing.

This is the difference between "we built a bridge that will last two years, because that is all we need" and "To our surprise the bridge already failed after two years, whoops".

Facebook did ~$70B in revenue with ~$24B in profit, which implies that it spent roughly $46B to keep itself running for a year.

The Big Dig cost ~$20B over the course of a decade. The Apollo Program clocks in at ~$25B. The Large Hadron Collider at a svelte $10B.

Software isn't cheap; it's just that no one pays attention to the costs.

Facebook is an outlier monopoly, having successfully privatized a massive commons, so I'm not sure their ability to spend that much a year proves much of anything. (If those figures are even the right lens.)

As a counterpoint, Wikipedia's operating costs are ~40M a year, 1000 times less, and I know which one I'd rather keep!

> having successfully privatized a massive commons

They created the commons.

It didn't exist before and several companies tried to do what they did. They all failed. MySpace had a huge head start. Didn't matter. Everyone shit on the Instagram acquisition for $1B when it was first reported and years afterwards (in fact Zuck made the decision without even consulting the board and right before a risky IPO where the public was sure to take it as a negative signal). He saw the value and invested in it. Today Instagram is a $100B entity that does more revenue than YouTube.

Credit where credit is due.

That’s not to say there aren’t issues. There obviously are and they’re well documented.

How many people use those projects? What a weird comparison

Probably because the profits are in line with the cost.

I agree. Engineering is all about tradeoffs. Software lets us make different tradeoffs because it's so easy to update and fix your mistakes. It would be a poor engineer who didn't take advantage of that.

I will give you one thing: we iterate faster. We can try a dozen approaches in a week, and often we get immediate feedback on what works and what doesn't. I doubt any other kind of engineer has tried as many things as any of us have. We build, and build, and build. We don't spend a long time building one thing; or if we do build one thing, it is made of many things that we build, rebuild, refactor, and replace over and over.

I don't think that lack of incentive is the primary reason that software is so unreliable. It might be part of the reason, maybe even a significant part, but not more than that.

I suspect that software is just ridiculously hard. The complexity ceiling for programming is insanely high, and there is very little tolerance for oversights. If I'm designing a drill for instance, I don't have to consider every possible use the drill will be put to. I can just overbuild the crap out of it, and cover 95% of my bases.

Can you think of an analogous man made thing that is as inherently complex and fragile as a major distributed system?

And importantly it never really stops. As soon as a reliable solution to a problem has been discovered we immediately use it to build a billion complex new systems on top of it.

And testing is hard.

And programmers rarely get to truly specialize. We spend a great deal of time learning on the go because it would be nearly impossible to truly be an expert in a field where the tools and goals change so rapidly.

I'm sure I could come up with more if I kept thinking about it.

Author here!

This is a good point. Earlier drafts of my post did include "the unbelievable depth and complexity of the modern technical stack" as a factor. I deferred it until a future post because I think that we accept that we don't fully understand the systems and stack because of our incentives. Imagine if the stakes were much higher: if our software was hooked up to pacemakers that stopped pumping blood when the software was down. This isn't so crazy: people who design airliners go on the test flights. We would band together as an industry to produce guidelines and recommendations around safe ways to run sites. We'd quit rather than work on a doomed project for a paycheck. Any work would require something equivalent to expert certification and design validation at every layer of the stack. We'd do everything in our power to avoid creating these emergent systems. But the situation is reversed - in many circumstances, the cost of failure couldn't be lower. We tolerate the fact that we don't understand the emergent systems that we create because it doesn't matter enough for us to care.

But I don't want to make it sound like I'm dismissing your point. You're right, and it's super important. I think there's a better angle that I can use to explore this. It might be "Why do so many websites get hacked and lose our personal data?", but I feel like that doesn't fully capture it. I have to think about it. But this seems less relevant to the election case, which is what I wanted to address - that these projects fail because "launch and iterate" has become so ingrained in our process that it's difficult to recognize when it doesn't apply.

I think your focus on incentives is correct.

If I'm building a plane, I'm going to understand every last little piece going into it.

If I'm building a web app, I'm mostly ok if I do not understand every line of Rails, and the C code in the Ruby interpreter.

There are embedded systems where people have that kind of understanding. And they do fewer things than systems where people happily throw libraries in and whip out new features regularly.

If I built a web app with the same kind of line by line control as that embedded system, it would get outcompeted by someone playing more 'fast and loose'. Maybe mine would have a 99.999% uptime (or maybe not, because AWS or someone might have problems too) and theirs has 99.95, but way more features, so the market will gravitate to them.
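For a sense of scale, the uptime figures in that comparison translate into very different amounts of annual downtime. A quick back-of-the-envelope calculation:

```python
# Annual downtime implied by an uptime percentage.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600

def downtime_minutes(uptime_pct):
    """Minutes of downtime per year for a given uptime percentage."""
    return (1 - uptime_pct / 100) * MINUTES_PER_YEAR

# 99.999% ("five nines") -> about 5.3 minutes of downtime per year.
# 99.95%                 -> about 263 minutes (~4.4 hours) per year.
```

Roughly 4.4 hours versus 5 minutes a year: a gap most users of a feature-rich web app will never notice, which is exactly why the market gravitates the way the comment describes.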

This is why I have said before and will continue to say, the future of software engineering to me will largely lie in reduction of complexity, including reduction of stack complexity, reduction of LoC (a bad single metric but could be useful as part of a bigger plan) and increasing readability. I say this as an ops type who has seen stack complexity as one of the main barriers to reliability. The reasons for this being the standard are numerous and understandable though.

It's essentially the main weakness of the open source ecosystems too. The many eyes theory falls apart when the LoC reach many millions. There will never be enough eyes finding bugs or poor practices. Maybe in the future machine learning could help reduce this gap.

> We spend a great deal of time learning on the go because it would be nearly impossible to truly be an expert in a field where the tools and goals change so rapidly.

Could we make the tools change less rapidly?

I have a larger theory of the tech world right now that there's way too much change, in every corner of the industry. Computers don't have fundamentally different capabilities than they did a decade ago, so outside of security patches, why do so many software projects release major feature updates more-or-less annually?

Make no mistake, most of these updates contain legitimate improvements—it's just that change itself has a high cost, for both developers and users. Most of the time, I don't think the improvements are worth the change.

There are numerous cultural and business reasons for the rapid pace of change in the software world, but I don't think any of them are inherent to how software is created.

We also hand roll and rebuild things over and over again. We try to make abstractions and building blocks, but we aren't very good at it compared to other fields. It'd be like an electrician wiring up a house, and building the electrical panel from scratch every single time.

Yeah, but we either need to standardize on the (metaphorical) wiring and dramatically impede progress in that area, or accept that we will indeed have to build the panel from scratch.

I'm not sure even a drill doesn't have (a lot) more thought put into it.

From mechanical and electrical engineering to material fatigue, moulding, durability... costs.

I'm no tool engineer; I'd be curious to see how it really goes at Makita or DeWalt, and how much preemptive failure-mode analysis and systems thinking is done even for a drill.

We're bad for several reasons:

-- Outside FAANG-type companies, most businesses are not willing to actually put in the investment it takes to properly engineer software. They want a "solution" as cheap and fast as they can get it. I'm not even passing judgment here, as in some cases that approach is entirely justified.

-- Software is a world of eternal September, where the industry memory is only about five years long. Lots of us move on to other fields or roles and the memory is forgotten, and here comes another generation to reinvent the wheel all over again.

-- Education: Computer science is not software engineering. Code schools are teaching programming more than software engineering. Thus, software engineering is something learned through doing rather than something studied as a discipline.

The competency of FAANG is completely overblown. Mostly they enjoy massive profits by being monopolies. Because they're monopolies, they can afford to waste a ton of money on software. Facebook is not even known for high software quality; they're actually known for speed and getting the work done as fast as possible. Same for Amazon.

Not entirely true. Facebook, Google, Microsoft, Netflix, etc... They all wrote (or greatly contributed to) a bunch of high quality open source projects that you probably use on a daily basis.

Yeah, they did. But it's a small fraction of engineers that wrote those libraries. Not only that, most of the engineers doing foundational work at those companies are not getting hired through standard interviews. When Google wants to hire people for a team to write a new distributed database, they aren't asking a bunch of leetcode questions of engineers. They're hiring out of PhD programs and from accomplished teams across industry. People do get pulled onto those teams from other teams in the company, but only after those people have proved themselves. As a whole, only 1% of the engineering at those companies is what you are describing.

Look at the white papers for the foundational systems at Google and FB; most of them are authored by the same few teams. Jeff Dean had a hand in designing a ton of large-scale Google systems, but just because he came up with some brilliant solutions does not mean the Joe Programmer at Google who can implement A* in 30 minutes is in the same league.

Sure, they don't have to be efficient so they can afford to spend more on tool development. Not everything works but the better approaches get more adoption.

Also, I think you're underestimating the effects of having a lot of code. This pushes limits that smaller companies don't have to worry about.

Elite startups and hedge funds have the best software quality. There are funds in NYC where every programmer makes more than 500k a year, some in the millions. They have people on those teams who have been programming since they were in middle school. FAANG just hire in bulk; they just want you to memorize solutions to algorithms problems, and if you can do that, you're in. Some huge percentage of programmers at Google have less than three years' experience. They can absorb all the bad engineers they hire; a few end up being good, and those get promoted to the teams actually writing the software critical to the business. The rest just write CRUD APIs and shuffle data around, using some framework written by the top 1% in the company.

One of those funds famously blew themselves up in 45 minutes, with causes including reusing an old feature flag rather than defining a new one and a complete lack of code review or incident-response procedures. From all my (admittedly anecdotal) knowledge, it may be true that the top funds have the top software engineering talent, but their skills are put to use writing software that's absolutely terrible.

> Elite startups and hedge funds have the best software quality.

Most startups that I hear about have rather shitty software quality because they're focused on velocity. VCs don't care about sustainability since they don't even know whether the business is worth sustaining.

Sure, but a truly world-class programmer writes better code in a rush than a mediocre one does in a slow enterprise environment. Programming skill doesn't work like that; some people are just better.

Surely there are some teams in the heavily-regulated aerospace and medical technology spaces as well. NASA would apply.

If you haven't been writing code since middle school, how would you recommend getting into those funds?

It’s a weird world. I worked at D. E. Shaw and their initial phone screen includes SAT score. I was 26 at the time and had to ask twice to be sure I’d heard the recruiter correctly. For whatever reason, that was part of the process. They were also the origin of the “every new hire must have an articulated dimension in which they are better than the median for their role and level” that Bezos later took to Amazon.

That was the most consistent top caliber colleagues I’ve ever worked with. (I’ve worked with other excellent colleagues, but Desco was uniformly very good. Our office manager had a PhD. Our recruiter had a masters degree. It was crazy.)

Probably the easiest path is to be referred in from a current employee. Next would be outstanding achievement in some academic or technical field (if applicable). You don’t have to be an autodidact or a savant. You need to be competent, intelligent, and willing to work hard and then to find a way to get a warm intro to recruiting.

I'm curious, what evidence do we have that the FAANG type companies are significantly better at software engineering than anyone else?

When we look at the entirety of companies around the world, it's reasonable to think that there are non-FAANG companies that could be just as skilled. That stated, my own casual observations:

1) The FAANGs that I've seen have significantly more robust practices than the non-FAANGs. Everything from office conditions to build / test / release systems. The reason for this is:

2) Incentives. When your business is completely focused around software technology, you are incentivized to build out systems and processes that other places would not, because it is core to your business.

Anecdata from many years in and out of various kinds of companies, as both an employee and/or road-warrior consultant.

I still see bugs and rollbacks in FAANG releases. I conclude that the incentive isn't enough. The practices are inadequate.

A process with occasional bugs and rollbacks is rational if it produces more net utility than a process which only ships "perfection".

What's your error budget? Not spending your error budget is velocity left on the table.
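For anyone unfamiliar with the SRE jargon, the error-budget arithmetic is simple; a rough sketch (the function name is my own):

```python
# Error-budget arithmetic: a 99.9% monthly SLO leaves 0.1% of the
# month as "budget" to spend on failures, risky launches, etc.
def error_budget_minutes(slo, days=30):
    total_minutes = days * 24 * 60  # 43,200 minutes in a 30-day month
    return total_minutes * (1 - slo)

print(round(error_budget_minutes(0.999), 1))    # 43.2 minutes/month
print(round(error_budget_minutes(0.99999), 2))  # 0.43 minutes/month
```

The gap between those two numbers is why "five nines" is expensive: the tighter the SLO, the less budget is left for shipping fast.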

That's... like the whole point of the article. That mentality causes less reliable software.

Well at least they are trying to be. Some of them employ PhD's in theorem proving to apply computer assisted proof to improve their software (and hardware).

Who's doing this? Amazon is using model checking on simplified models of real-world software systems, that's not "theorem proving". Microsoft is doing some related stuff in low-level code (Sing#, etc.) but they're not in FAANG. Galois is not even close, they're a relatively small shop.

The other A is doing it.

To accompany the neighboring reply from the grandparent poster, the practice of site/service reliability as an engineering discipline is not just popular amongst the FAANGs, all of whom have SREs, but embedded into the company culture and structure. Google, for example, has a head honcho of SRE (a VP of SRE?) who gives all SREs extra authority to act somewhat independently.

A large part of the perceived competency of the FAANGs is their ability to casually and repeatedly achieve 5+ nines of availability. When HN is unavailable, that's no big deal; when google.com is unavailable, users act as if an apocalypse has begun. That difference in reliability changes a service from something that people consider as an option into something that people rely upon.

This is my point exactly: they aren't. They just have monopolies.

To your first point, not even that alone.

People aren't paying (only) for stable and correct software. They actively seek out fancy UIs, for example. I noticed that at a former company. We had the best algorithm and the most correct numbers (for what we were selling), but the customers didn't like them because they were lower (and more realistic) than some competitors' numbers. The competitors also had a better, shinier UI, and they won the market.

There is plenty of very buggy and very bad software from FAANG companies. I'm not sure that first point holds too well.

Are you suggesting that Facebook's or Google's or MS's products are the best engineered? I was just using Skype...

Skype is a bad example because it was something that Microsoft took over.

It was good when Microsoft took it over; they made it worse.


BTW, the list is much longer: Twitter, Facebook, AdSense... visible bugs in all.

We're gonna have to exclude some huge percentage of Google's product portfolio if that's the standard, then. Though much of that is pretty bad so maybe you're on to something.

One of the more profound insights from Stephen Wolfram’s “A New Kind of Science” is that small rules can lead to extraordinary complexity.

Wolfram famously gives Rule 30 as an example, where you cannot know what each line of squares will look like before you do the computation, but he goes on to cover many other phenomena, like prime numbers and tag systems.[1]

One conclusion is that complexity stemming from simple rules is relatively common in our Universe and that historically, humans have done a good job of solving the narrow pool of problems where you can know the end solution with a simple calculation (say predicting the location of a ball as you toss it up in the air).

The implication is that there’s a whole “universe” of problems where we cannot know the end solution until we’ve done the work necessary to compute it.

And in that sense, it wouldn’t be surprising if software engineering was a bit like Rule 30: an endless array of patterns we can’t predict until we’ve actually created the code or computed what it can do.
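For readers who haven't seen it, Rule 30 fits in one line of logic: each cell's next state is left XOR (center OR right). A minimal sketch (the helper name and wrap-around boundary are my own choices):

```python
# Rule 30, the elementary cellular automaton Wolfram cites: each new
# cell is left XOR (center OR right). One line of logic, yet the
# pattern growing from a single live cell resists prediction.
def rule30_step(cells):
    n = len(cells)
    # Wrap around at the edges so the row length stays fixed.
    return [cells[(i - 1) % n] ^ (cells[i] | cells[(i + 1) % n])
            for i in range(n)]

row = [0] * 15 + [1] + [0] * 15  # single live cell in the middle
for _ in range(12):
    print("".join("#" if c else "." for c in row))
    row = rule30_step(row)
```

Running it prints the famous chaotic triangle; the only way to know row n is to compute rows 1 through n - 1 first.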

It’s not so much that we’re bad at software engineering. It’s just that emergent complexity is part of our reality (assuming Stephen Wolfram’s insights apply to software engineering. Maybe they don’t. But it sure feels like it).

I think the ancients knew this well. The Tower of Babel story is pretty old. It details the problems humans run into when they try to scale and work with many different elements: something nobody could predict and, possibly, catastrophe.


Wolfram likes to take credit for other people's ideas, but I don't think even he would claim that the idea that “small rules can lead to extraordinary complexity” was original to him. For example, what he calls a “tag system” is often called a “Post tag system” because of Emil Post’s pioneering work investigating it in the 1920s through the 1940s.


As Cosma Shalizi wrote in his review of this book:

> As the saying goes, there is much here that is new and true, but what is true is not new, and what is new is not true; and some of it is even old and false, or at least utterly unsupported.


Would Conway's Game of Life be a good example of small rules that lead to complexity?

Both Game of Life and Rule 30 are cellular automata so yes!
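Yes; Life's rules fit in a couple of lines, which makes the point concrete. A minimal sketch (the function name and set-of-coordinates representation are my own choices):

```python
# Conway's Game of Life in a handful of lines: a cell is born with
# exactly 3 live neighbors and survives with 2 or 3. Live cells are
# stored as a set of (x, y) coordinates.
from collections import Counter

def life_step(live):
    # Count live neighbors for every cell adjacent to a live cell.
    counts = Counter((x + dx, y + dy)
                     for (x, y) in live
                     for dx in (-1, 0, 1) for dy in (-1, 0, 1)
                     if (dx, dy) != (0, 0))
    return {cell for cell, n in counts.items()
            if n == 3 or (n == 2 and cell in live)}

# A "blinker" oscillates between a horizontal and a vertical bar.
blinker = {(0, 1), (1, 1), (2, 1)}
print(sorted(life_step(blinker)))  # [(1, 0), (1, 1), (1, 2)]
```

Two short rules, yet Life supports gliders, oscillators, and even universal computation.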

This article doesn't make any sense. It already fails at the premise. Airplanes and elevators have nothing in common with code written by normal software companies.

They optimize for a compromise between short-term gain and long-term gain. In essence, a startup can mathematically prove its software and go bankrupt before they even reach a prototype stage. Or they can ship utter crap quickly, get some early feedback, funding, and beta testers, and slowly figure out what to improve first for the biggest bang.

Same goes for giants, like Google. They technically have the resources to prove SOME projects and they DO. For instance, Amazon proves their hardware virtualization software and other core pieces, where bugs would undermine the safety of the entire cloud.

I am getting a bit sick of "Why is software engineering so bad compared to XYZ". All the people asking are either not in this field or are junior engineers who just don't yet understand why things are the way they are and why writing bug-free, mathematically proven software is infeasible, unnecessary, impractical, undesired, etc.

Consider another analogy. When building a house, how expensive is it to decide that you actually want a 5-meter-high ceiling on the second floor, because the buyer of that floor wants higher ceilings? Simple: tear down the building and start over...

In software? Usually it comes down to adding an IF-statement or some such. Software follows completely different economic principles and serves completely different needs.

There is at least one dimension though where this problem currently shines through: Self-Driving Cars.

I think auto-pilots in self-driving cars SHOULD follow the same rigor as airplane auto-pilots, and they are most certainly not doing that, which already has resulted, and will continue to result, in lost lives.

"They optimize for a compromise between short-term gain and long-term gain"

Hey! Author here. This is the exact point that I make in my essay. I don't compare it to any other fields, and certainly don't mention airplanes or elevators. I explain how this compromise is deliberately reached for people who aren't familiar with the dynamics of software projects. I recommend that you read the post! It sounds like we agree on a lot.

Software Engineering is new. The Romans were working on solving civil and mechanical engineering problems thousands of years ago, and we've had a lot of time to study their failures.

Unfortunately, our industry is dominated by an obsession with the "use the latest, cuz it's the greatest" mindset, where if you're not using the hottest new language or sexy framework, you're already behind somehow.

I can imagine maybe it was this way when the Romans were trying to figure out how to build roads. "Oh hey, did you hear how Steve built his road? Yeah, they compacted gravel an inch at a time underneath the bed. Something about water drainage. Oh, and they're waaaaaay more productive. Yeah, productive. Let's use that framework for building our roads!"

Not just new, but always new:

> There need be no real danger of it ever becoming a drudge, for any processes that are quite mechanical may be turned over to the machine itself. https://en.m.wikiquote.org/wiki/Alan_Turing

> Software Engineering is new.

I believe one of the examples of extreme reliability in the XKCD cartoon included in the article is air travel, which is about as old as computers.

I was told once that the reason air travel is so reliable is because they have mechanical backup systems for when the electronics inevitably fail. Is that true?

How reliable are electronics as a whole?

No, it's not true. Air travel is as reliable as it is because it's treated as a system: airplanes (their design, manufacture, and maintenance), airports, communication systems, procedures for entering, transiting, and exiting an airspace, fault reporting, accident investigation, and much more. All of those things are part of the "system."

A mistake a lot of people make when comparing other fields with software is to look at the act of coding in isolation from everything that surrounds it. That leads to a very skewed perspective.

> How reliable are electronics as a whole?

Given sufficient time, budget, and management will, a very reliable system can be made. The probes have software too, so there's that...


If an aircraft's engine dies mid-flight, the pure physics of falling generates speed, and the plane at speed generates lift, so you slowly descend in a (semi-)controlled manner. In a flight sim you just plummet to the ground unless someone thought far enough ahead about your virtual engine dying and implemented ambient aerodynamics so that a gliding mechanic would work.
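The gliding mechanic described above can be sketched in a few lines; every constant here is made up purely for illustration, not taken from any real flight model:

```python
# Toy glide model (all constants hypothetical): with lift growing as
# airspeed squared, a powerless plane settles into a shallow descent;
# drop the lift term and it simply free-falls.
G = 9.81             # gravity, m/s^2
LIFT_COEFF = 0.0027  # made up: roughly balances gravity near 60 m/s

def vertical_speed_after(seconds=10.0, dt=0.1, lift=True):
    vx, vy = 60.0, 0.0  # cruising speed, engine just died
    for _ in range(int(seconds / dt)):
        lift_acc = LIFT_COEFF * vx * vx if lift else 0.0
        vy += (lift_acc - G) * dt
        vx *= 0.999  # a touch of drag bleeds off airspeed
    return vy

print(round(vertical_speed_after(lift=True)))   # shallow sink, ~-10
print(round(vertical_speed_after(lift=False)))  # free fall, ~-98
```

The difference between the two runs is exactly the difference between a sim that modeled ambient aerodynamics and one that didn't.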


Airplanes reuse and refine, reuse and refine the same system, more so with elevators. Software is often greenfield.

Realistically, software projects get budgeted for half or a third of the time they actually need to be anywhere near complete. Then, following the "delays" that should have originally been budgeted in, you might be over or under that more realistic estimation, but the software is worse for it. Why? Because the engineers were stressed/crunched/whatever into performing worse under the false pretense that "we're almost there," when that is not the case.

If companies put a greater emphasis on correctness and budgeted time/money/testers/etc. for it, things would be better. This low-grade output is what the industry wants, and that is what it is getting. People in charge of scheduling may say they care about correctness, but they never really do; they care about fast delivery and the appearance of some facsimile of function ("it looks like it works").

Imagine if you went to your higher up and said that the 6 month greenfield voting app, would actually take 1.5 years and require more resources just to ensure correctness. You would just get taken off the project. "It's just a voting app, one screen and server." But that misses the point, because it's a voting app, it needs to be hardened, just like an airplane.

So you learn to shut up and either produce worse output, or work overtime to make something halfway decent. Either way it basically goes unnoticed until something blows up.

It doesn't help that developers pretend this crap isn't going on either. Few verbalize and acknowledge it; everyone's so worried that unrealistic requests and budgets are somehow actually realistic and that they're just underperforming compared to their peers. So many are actually suffering from impostor syndrome, worried that they're on the wrong side of the Dunning-Kruger effect.

Every group I've worked with seems to start in a defensive state projecting confidence but I casually admit to my struggles with everyone on the team and folks start to open up. By the end, you have a team of developers who can help set more realistic expectations. The less developers communicate amongst themselves honestly, the more they can be exploited by unrealistic expectations for development.

It's a much nicer environment when everyone knows and admits they're slinging garbage code because of resource constraints than everyone pretending they're not.

I think the greenfield aspect is deceptively responsible for so much damage. You want a good robotics engineer? Get someone that has made many robots. Want a good aeronautics engineer? Similarly, lots of aeronautics.

Software engineer? We assume lots of software. But, this would be akin to assuming all of the other engineers were simply "mechanical engineers." To an extent, true. And there will undoubtedly be polymaths good at many.

However, if you want someone to ship a Todo App on time, get someone that has done one before. A financial app? Same. ML Pipeline? Same.

Instead, we get someone that succeeded at getting an app out in some field, and assume they know how to do the same in another field. And then we are shocked when they are unable to estimate something they have never done before, or when they spend their time shaving the yaks they already know, without providing new value to their users.

You want someone to do good at delivering what you are building? Make sure this is not their first time building it.

1. The economic incentives are aligned to ensure most software isn't very good because it doesn't "need" to be.

2. The tools that help make software really, really good are really, really baroque and remain that way because the people who invent them are subtly incentivized to be the high-priest of an esoteric technology church rather than making them for the masses. Which ensures that making really, really good software doesn't get cheaper with time and also ensures that you'll only reach for these tools if _you absolutely have to_ because of #1.

3. Software Programmers/Engineers see their jobs as, and appear to derive most of their joy from, doing the actual coding. Not using rigorous design tools ahead of time which are unmoored from their implementation, nor doing the testing & verification of the implementation after the fact.

3a. Software Designers/Architects have made a bad name for themselves by spending more time pontificating at whiteboards than writing down verifiable specs & models.

3b. Software Testers/QA folks are treated as lower-class than Programmers/Engineers, and don't get paid nearly well enough, so they're set up to follow rote scripts and procedures rather than empowered to deploy sophisticated testing methods like property-based testing or mutation testing.

4. All those factors appear to have created a feedback loop which has become self-reinforcing.
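As a concrete note on 3b: property-based testing doesn't require anything exotic; even a hand-rolled version conveys the idea. A toy sketch (not a substitute for a real tool like Hypothesis; the function name is mine):

```python
# A hand-rolled property-based test: generate many random inputs and
# check invariants, as a property-testing tool would do for you.
# Here the system under test is Python's built-in sorted().
import random
from collections import Counter

def check_sort_properties(trials=200):
    rng = random.Random(42)  # fixed seed so the run is reproducible
    for _ in range(trials):
        xs = [rng.randint(-1000, 1000)
              for _ in range(rng.randint(0, 50))]
        ys = sorted(xs)
        assert all(a <= b for a, b in zip(ys, ys[1:]))  # ordered
        assert Counter(ys) == Counter(xs)               # same multiset
        assert sorted(ys) == ys                         # idempotent
    return True

print(check_sort_properties())  # True
```

The point is that you state invariants instead of hand-picking examples, and the machine does the drudge work of probing them.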

Missing here and in the comments so far: we categorically refuse to learn lessons from the past and to improve on what is already there; instead we keep on re-inventing the wheel in interesting and novel ways, which then come with their own sets of problems that usually only show up after a couple of years. By then, of course, it is too late.

This has been going on as long as I've been active with computers. I think a part of the driving force behind this was the speed with which computers became faster. There simply wasn't time enough to absorb the lessons before the 'old' stuff looked small and obsolete in view of our new powers.

But if you look at old and mature software systems (Common LISP, Erlang, D, and others of that vintage), you'll find that the degree of thought that went into them and into making them bullet-proof is still very much absent from a large portion of the software industry today. We're probably destined to go through this sort of cycle at least a couple more times, if not forever.

The Iowa caucus app didn't fail because "we" are bad at software engineering. It failed because someone did a shitty job of understanding the problem space, deciding on an appropriate solution, and finding proper resources to make sure that solution was well-implemented.

I think properly implementing a solution appropriate to the problem space is the literal definition of engineering. So "we" aren't bad at software engineering but someone definitely is.

The someones in question were inexperienced devs fresh from a bootcamp. This is a recipe for failure in every industry, from making coffee to making spacecraft.

Perhaps software is somewhat unique in that we don't appreciate experience.

I'm saying the fundamental problem was not the software implementation but the non-engineering decision-making surrounding it.

What the poster is saying is that the project management aspects are a part of engineering a product. It's an opinion I happen to agree with.

He has a legitimate point, though. Even as a software developer for nearly 30 years now, I’m surprised that there aren’t standardized solutions to these kinds of problems; I can sympathize with non-developers who figure, “this has been done in a similar form thousands of times, it should be almost a rubber stamp”.

It's a lot more nuanced than that.

Upfront I'll say I know the people at Shadow and worked right next to them at HFA. They're super competent (they worked on some of the more critical things) and some of the best people I know. Just so you know where I'm coming from. I haven't talked to them about any of this.

Political software is very wild west. There are dynamics that are just impossible to anticipate, like every Saturday this vendor's response time drops to nothing because they do weekly backups, or we're coordinating all the votes through this person but they've never used a smartphone before. There are some fundamental truths here:

- these systems and processes are underfunded, ad-hoc, and run by people who are compromised in significant ways (haven't slept more than 3 hours in 2 weeks, no experience with technology, responsible for literally 30 things, downright incompetent but bodies were needed, drug/drinking problems).

- we're now in an age where these systems have to be secure against state-level actors, and we are woefully unprepared for that in every way you can think of.

- very little in our political system can hold up to scrutiny. Caucuses in particular are and always have been junk, but even elections are... maybe impossible to verify. And even if we could perfectly record every vote, externalities (weather, voter suppression, malfunctions) and people's bad memories ensure there will always be a gap between the will of the people and the results of an election. They (and everything else about this process) are fundamentally imperfect.

At HFA we dealt with this by planning for it from the beginning, and when you're the presidential campaign you can do a lot to combat these fundamentals (run drills, hire super competent people for pennies who will work for you out of patriotism, run red teams, strongarm and replace vendors, etc). We still made mistakes that haunt me. It's just how it goes.

Nowhere else in politics is like that. So you do your best: don't put your app on the app store, where it would be subject to anyone downloading it and to review by Apple/Google; use MFA even though it will undoubtedly confound 80% of your volunteers. Hope the internet holds up in rural areas or gyms with satellite trucks and crowded APs/towers. Hope no one learns the numbers of your backup hotline. Hope people know the rules of the caucus and count perfectly so the results aren't tainted and called into question ad nauseam on national news media. Hope no one thinks you're part of a conspiracy to cheat their favorite candidate and doxxes you. Make your app as simple as possible. Accept your client's insistence on secrecy in the name of security even though you know training and testing are essential. Hope there's no quirk in the vendor's API that will reject results on caucus night.

This whole thing is like a Rorschach test. Are you a person who thinks tech has no place in elections? That's what you see here. Are you a Sanders supporter who thinks there's a conspiracy (not for no reason) against him? Do you think Democrats are fundamentally incompetent? Do you lament the state of software robustness? Etc., etc. I would implore people who are falling into this stuff to recognize these mental reflexes. Very few people know anything about how any of this works (I certainly didn't) or the details of what happened in Iowa, and if you don't, you probably shouldn't extrapolate from this situation.

You bring up a lot of valid considerations for designing software at scale. In good engineering, these known unknowns would be considered during the design phase.

If the argument is that those items weren't considered, that is definitionally bad engineering and I think GP is dead-on in their assessment of Shadow's development.

If the argument is that there were unknown unknowns that caused the app to malfunction and the items you listed were just examples, and we don't actually know what went wrong - well, I'll be happy to read the post-mortem.

But that's not what happened here. This wasn't an "unreliable vendor on Sunday" situation, or a "we couldn't budget for offline local storage for rural internet". This was a "our app literally doesn't even let users log in" scenario, and I find that really difficult to excuse under the blanket of "political software is hard."

There are postmortems. They were using Auth0 w/ MFA. Login problems were lots of user error (putting wrong creds in wrong boxes) and bad internet. Not a lot you can do about that without opening up an enumeration attack or erecting cell towers.
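For readers wondering what the enumeration trade-off looks like, here is a sketch of the usual reasoning (toy code with a hypothetical credential store; real systems must also handle password hashing, timing, and rate limiting):

```python
# Toy illustration of login-error enumeration: a specific message
# ("unknown user" vs "wrong password") tells an attacker which
# accounts exist; a single generic message does not.
USERS = {"alice": "hunter2"}  # hypothetical credential store

def login_leaky(user, pw):
    if user not in USERS:
        return "unknown user"            # leaks account existence
    return "ok" if USERS[user] == pw else "wrong password"

def login_safe(user, pw):
    if USERS.get(user) != pw:
        return "invalid credentials"     # same answer either way
    return "ok"

print(login_leaky("mallory", "x"))  # unknown user
print(login_safe("mallory", "x"))   # invalid credentials
```

The cost of the safe version is exactly the user confusion the parent describes: legitimate users can't tell whether they mistyped their username or their password.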

Basically everything else you wrote is outrage at the state of software robustness, so I'll refer you back to my last paragraph.

I appreciate posts like this one for the commentary they generate, but it's still disheartening to see the obvious (to me, anyway) stuff getting missed each time.

The overall post was okay, but the lead-in was too topical to be relevant more than a few months from now, and also wasn't really about software quality at all. What happened in Iowa was a lot of bureaucratic cock-up -- they got the wrong people (because favors or political or personal connections) to do entirely the wrong thing, and didn't bother to consult any of the people who could have helped them make better decisions. It's not substantially different from someone buying the shittiest available used car from a private party and the wheels falling off on the highway, and then trying to use that as an indictment of automotive build quality.


Clearly software suffers from far more issues than other technical disciplines, and that is 100% because there are no common standards and, thus, no certification process or professional licensing.

When software developers and managers compare big software projects to big civil engineering projects, they keep missing the obvious educational and certification process required to become a civil engineer. Galloping Gerties and John Hancock Towers still happen occasionally, but then they become case studies for every prospective new engineer. Contrast this to software, where we all sling out code until we're bored and then self-educate by reading other people's war stories online.

The average man or woman that can build a porch for your house is subject to more licensing requirements than the developer that builds voting software. Let that sink in for a minute.

Programmers are quick to screech at the thought of being subjected to any kind of licensing or certification process, and yet we all have the horror stories of being forced to build and deploy utter garbage by upper management. Professional licensing would really help us here. We'd have a defense that would go, "I can't do that, it would make my license get reviewed."

I don't think this will ever happen though. The majority of programmers don't want it and nobody knows how to do it. They'd rather write and share navel-gazing articles from time to time, wondering about what could possibly be so very different in software from every other engineering discipline. 'Tis a mystery.

The problem is thinking that there could be such a thing as a license for software engineers in general. That would be crazy, almost like having to be licensed to write English. Software is applied to too many different domains and tasks.

What could work would be licensing for software engineers in certain specific domains. For example, we could require election systems software engineers to be certified for that field. Same with automotive software, power plant software, etc. etc. I agree that we should do this. But it's going to apply to only a small fraction of all software engineers, because licensing Web app developers is not practical.

Not practical, nor needed. A web app crashes? OK. Your customers get annoyed; your company loses some money; the world goes on. Losing money gives the company some incentive to get its act together, so it tends to be a self-correcting issue.

The Soviet Union licensed photocopiers.

True, but not a model we should emulate.

Certainly not. I meant to argue that oppression is a real risk, not a desirable option.

Ah, I see. I agree.

> The average man or woman that can build a porch for your house is subject to more licensing requirements than the developer that builds voting software. Let that sink in for a minute.

And the average company that can build a porch for your house has stronger insurance and bonding requirements than the average company that builds voting software - or any software, for that matter.

100 percent. Software that will be used in a mission-critical way should require specialists and certifications. I wonder what your thoughts are on an apprenticeship program that includes coursework and practicals within organizations?

Our society has produced software teams that produce highly reliable software (space shuttle, banking technology, and embedded systems) but don't leave room for a great deal of speed or innovation. Let's call them Type 1 teams.

We've also produced "move fast and break stuff" software organizations, who have realized moving fast and failing is a better financial outcome than safely arriving at a suboptimal destination (Facebook, Twitter, tons of startups). Let's call them Type 2 teams.

Neither Type 1 nor Type 2 teams are Bad or Evil. They are simply suited to different applications. What happened here is, an organization that was unable to distinguish between these types of teams hired a Type 2 team for a Type 1 job.

Generalized example, usual caveats apply.

The majority of software teams fall outside of this classification.

Agreed. It's really more of a gradient. Regardless, the problem is not where a team lies on the gradient, but where a team lies in relation to the fundamental needs of the problem to be solved.

The Boeing Company invested nearly $10 billion in developing the new 777

The voting software cost $60K, I believe, and that includes distribution.

That about sums it up for me: if you want reliable software, it costs a lot of money.

Money, time, and planning. Agile is, for the most part, the absence of planning: "figuring it out quickly and changing it if we need to." But, by design, most "physical" engineering tasks require tremendous planning and include (from what I understand) very little "just figure it out as we go."

The reason for the lack of planning is that tremendous planning increases time to ship, which effectively costs money.

Agile is not the absence of planning. It's about adapting to change. When it comes to business software this makes sense because the end result is typically unknown. Aircraft avionics have a pretty well-defined end goal. The other big difference is really quality control. Most software doesn't invest heavily in quality because a failure isn't that costly. It's the same calculation that goes into producing absolutely everything. The cost of a quality failure in an airplane is humongous. I use apps that crash at least once a day, but I just restart them and keep going.

Yeah I guess I just very much do not agree with your stance on Agile, but that's okay. YMMV.

Yes: planning, lots of code reviews, lots of testing. Building your software so it can be tested. It's all doable, but no one wants to spend the money - and if it's not critical, it's good enough.

I've read that with the space shuttle, every line of code was reviewed in a meeting - it's described here: https://history.nasa.gov/computers/Ch4-5.html

>Agile is, for the most part, the absence of planning

That's just blatantly false. Agile is about adjusting to the changing environment.

Even wiki says: "Agile software development comprises various approaches to software development under which requirements and solutions evolve through the collaborative effort of self-organizing and cross-functional teams and their customer(s)/end user(s)."

I guess on paper Agile's wiki may define it as being "pro-adjustment", but I think in practice it's been pretty well proven to be a "just plan less" approach.

You can take a positive spin on "We'll just figure it out iteratively as we go", but I think ultimately you get orders of magnitude less actual "planning" by kicking the can down the road. YMMV though, as with everything.

Firstly, that $10 billion covers many more cost items than just the software needed to fly the 777.

And going by recent events (i.e. 737 MCAS) it would appear Boeing is not that great at creating reliable software.

Yes, software is seldom broken out - the space shuttle software cost $200 million, against an original estimate of $20 million, according to https://history.nasa.gov/computers/Ch4-5.html. I guess that's in 1970-1980 dollars.

For the 777 it would be even harder to break out a software-only number, because while there were new lines of code written, there were also many more old lines of code reused from earlier models.

We aren't bad at software engineering. The author does a good job of analyzing the situation. It highlights that what we mostly care about is delivering a stream of value, knowing that we can correct things when they go wrong. We care that it is mostly right. We 100% embrace the "soft" of software.

The thing we struggle with more is "must work correctly on first REAL use." So we now have the PITA world of other engineering disciplines, where adapting is often not an option once "deployed" - though even there, those disciplines look for ways to get "soft".

Having spent a lot of my career doing embedded systems/industrial control systems, the approach to getting things correct is very different from developing web/backend/mobile software. Lots of rules are introduced, design approaches are limited to certain ways of doing things, and many techniques are avoided. Functionality is carefully weighed against how difficult it is to build correctly. Testing is more convoluted. Robustness and performance are far more pressing concerns. But again, even in this environment it is often desirable to be more "soft", and various things are done to try to achieve that. It can be handled much like other engineering, and often is managed as part of a multi-disciplinary effort of mechanical/electronic/software engineering and treated as a "system".

Getting some mobile app developers to make something that is supposed to be "robust" on first deployment is mainly a problem of not recognizing what skills and experience need to be employed to get what you want.

> We aren't bad at software engineering.

I have never seen perfect software, have you? They always end up with bugs that make you vulnerable...

I'm not sure I've ever seen anything engineered that is perfect, have you? What even is "perfect"? It seems like an unrealistic standard, and not the goal at all of any engineered system (civil, mechanical, electronic). I don't think that a good definition of "bad" is everything less than "perfect". I have seen software that fulfills the necessary criteria, especially in the world of machine control: engine control, industrial control systems, etc. Many things get built that work within their "engineering" criteria. I have one particular software application on a microcontroller that performs various functions and control; I've only ever deployed one version on thousands and thousands of devices, and it has worked for a decade so far. It went through rigorous review and testing before being released.

> [...] it has worked for a decade so far

If you don't change the inputs or any other variables, it is different from a program deployed in the wild... but yeah, not very many things are perfect (even the sun will die one day).

It is deployed in the wild. But what you are describing is essentially engineering: being in control of your variables is key. Engineering is all about understanding what the variables are, what tolerances you need, and building to those criteria. Pretty much anything engineered is on wobbly ground when it gets inputs/conditions that were unanticipated. Part of being clever about things in uncontrolled environments is finding a way to sandbox them into a controlled environment. Much like in the software world, a simple approach is to put it in a "container"/box/case and limit the "input" to the underlying system.

My take on this problem is that other (engineering) fields operate with much, much longer time frames - not to mention that they also have vastly more experience and best practices to rest on.

Software is young, things move very fast, and new practices can emerge quickly (and older get canned).

Imagine if an aerospace company were to design, test, and build a new airplane in only 6 months. It would probably end up as a disaster. In the realm of software, it is very likely that you need to come up with something new, very fast - you don't always get the luxury to step back and lay the proper groundwork - you're doing it on the go.

Our tooling's pretty shitty, generally, isn't well-standardized or anywhere close to it across the industry, is changing rapidly (largely not to any real purpose, though sure, sometimes improvements slip in) and we're all expected to become semi-competent (nowhere near enough time or brain space for full competence) on an absurdly large set of these tools—not just using them, but using several different ones for similar purposes but with different interfaces and quirks (boy do they have quirks), how to set them up, how to fix them when they break, and so on.

Oh and you can make a name (and pile of money) for yourself if you manage to promote some mediocre beta-quality-at-best tool and trick enough other people into using it, and if people call you out on it they're the assholes.

Every thought experiment that immediately comes to mind, in which other professions of various sorts had to deal with some similar situation on an ongoing and apparently never-ending basis, reads as absurdist comedy. Yet here we are.

[EDIT] ok here's a fun one: imagine if framers had to use a different hammer depending on the brand of wood at the job site. Like, hammer weight is different and you even hold it differently. For some you can use air nailers, others you can't, some of them only work with air nailers that have a second person working the trigger while one holds them, all the different air nailers need different compressors, some of those take different voltage, shit like that. Imagine it's like that for all their tools and that these differences manifest in all sorts of ways based on the combinations of materials & tools.

We wouldn't say "gee why are house-builders so bad at building houses, and LOL they can never even tell us about how long it'll take", we'd say "holy shit how do any houses get built, it's practically a miracle, we should fix it so it's not so hard—for no good reason—to build houses"

> Imagine if aerospace company was to design, test, and build a new airplane in only 6 months.

“[Kelly Johnson] sought and received permission to set up his own experimental department, which he based in a hangar next to Lockheed's wind tunnel. With his team working 10-hour days, six days a week, the prototype was ready just 143 days later. That plane was the P-80 Shooting Star. It arrived too late to have an impact in WWII, but (renamed the F-80) it went on to serve in Korea”

A former professor of mine published a paper on a very related topic. He interviewed 54 software developers at the International Conference on Software Engineering. I found it a very nice read.


Indeed, a fascinating study, thank you for the link. Breaking down the summary:

Conclusion: More software engineering research should concern itself with:

- emergence phenomena

- how engineering trade-offs are made

- the assumptions underlying research works

- creating certain taxonomies.

Such work would also allow software engineering to become more evidence-based.

There are a few issues... First, the people writing the checks aren't concerned about material quality or the discipline of the craft. It's also not so different from manufacturing: you can have a hand-crafted oak cabinet with dovetail joints and meticulous material selection, or you can have a flat-pack cabinet from Ikea. Both have value, and quality is a measure of value that meets or exceeds expectations.

Building an aircraft has many layers of bureaucracy behind it, and the cost that goes with it. In the end, software is practiced as a craft more than an engineering discipline. And that's okay. It's usually when politics and management decisions override the needs of the software/product that things tend to go sideways.

There are also more than a handful of software developers that don't really care about the quality or maintainability of the software, let alone learning and using the methods for interfacing with said software.

Just the past couple days, I was looking for a React + Material UI template to work from. Many looked great, but actually following the code was a byzantine mess. Mirrored trees of files are my biggest pet peeve in front end projects. Let alone not even using the methods available for use and extension. That's just a recent example.

I've seen others where a given set of methods is copy/pasted across dozens of case statements... Worse still is when you see a schema design that makes no sense and/or variables so lost and meaningless that everything around them loses all context - such as passwords having the variable name w23, with the value stored in the "Phone2" table under the "extra1" column.

Managers give lip service to "quality" and "testing", but when push comes to shove you skip testing to ship on time, and then deal with bugs several years into an initially 9-month project timeline.

Slightly off-topic, but here's what I would've done: just build a website.

No real client-side code; lean on the HTML platform for everything it can be used for. HTML is the simplest, most stable, and most battle-tested client-side framework ever created, and it's stateless and runs on all devices without even an installation step. Toss some CSS on there to make it look pretty; CSS is stateless too so it won't introduce any real bugs. It doesn't have to look perfect; you're not marketing it to anyone. You're just collecting input from users who don't have a choice of product.

For the back-end use something dead-simple like PHP or NodeJS. The business logic can't be that complicated; I assume it's really a very simple CRUD app. Just accept form submissions, authenticate, validate, drop it in the database. Done.

Spin up a couple of hand-managed VMs on DigitalOcean or the like for hosting both the databases (probably just one database?) and the web servers. It's not a small user load, but it's much smaller than the typical internet company, and it only needs to run for a few hours. You don't need automated scaling and restarting while you sleep. Sit a task force at their desks during the event to repair any outages as they happen.

Use the right tool for the job. Your stack should be exactly as complicated as the task demands, and no more.
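The "accept form submissions, authenticate, validate, drop it in the database" flow described above is simple enough to sketch in a few lines of Node. This is only a sketch of the idea, not the actual Iowa app: the field names (precinct, candidate, votes) are hypothetical, authentication is omitted, and a Map stands in for the real database.

```javascript
// In-memory stand-in for the single database table the comment mentions.
const store = new Map();

// Validate one form submission and record it.
// Returns an error string on bad input, or null on success.
function handleSubmission(body, db) {
  const { precinct, candidate, votes } = body;
  if (typeof precinct !== "string" || precinct.length === 0) {
    return "missing precinct";
  }
  if (typeof candidate !== "string" || candidate.length === 0) {
    return "missing candidate";
  }
  const n = Number(votes);
  if (!Number.isInteger(n) || n < 0) {
    return "votes must be a non-negative integer";
  }
  db.set(`${precinct}:${candidate}`, n);
  return null;
}
```

In a real deployment this function would sit behind a plain HTML form POST and a database insert, but the point stands: the business logic is a validate-and-store loop, not something that needs a native app and an app-store rollout.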


The biggest predictor of bugs is organizational structure; the main problems sound like they're not in the computer(s).

Are we? I mean, we're constantly driving every software organization to take on more and more complexity in their projects all the way to the point that they reach their cognitive limits.

Yes, a lot of the progress we've made is just an illusion driven by Moore's law and lots of bloatware that it enabled, but a lot of it is genuine and needs to be taken into account.

I'm confused why this is even a puzzle.

There is a bubble. Companies with half a product, no distinguishing technology, and no path to profitability are getting valuations of billions.

Therefore the big guys - the FANGS - become worth trillions, and can afford to hire the best of the best and pay them millions (over the last 5 years, with stock, yeah, that's what sr engineers make in Seattle/Silicon valley)

So, you've got the best software engineers in the world making adtech.

Which means the only ones the Iowa Democrats can find for a 60k budget aren't inept, but they're certainly not in the top tier. See also: The Obamacare online rollout.

What will fix it?

A market correction. One that, as a well-paid sr engineer, I am not looking forward to. But it will make a bunch of actually competent software devs available for smaller-market projects, and suddenly the quality of government software will increase dramatically as lower-paid-but-outstanding-benefits government jobs seem way cushier all of a sudden.

All this shit applies at the FANGS too so I don't think any market correction will magically solve the problem.

Why are we so bad at software engineering? Because most software comes without any liability.

I completely agree. At some point, businesses need to have reason to say "no, what you want isn't feasible, and we won't risk the consequences of failure."

For safety critical applications, technical decision makers need to have reason to say "no, we can't do this, and we won't because my career and possibly freedom are at stake if we do."

Right now, a huge portion of the risk of software failure to software builders is reputational. This can be sufficient, as pointed out in many cases in the article, but as we see in these examples it often is not.

Procrastination is usually desired, for timely completion, and almost always rewarded, for the same.

I wonder if you'd like to pay for developing software that comes with full liability :)

I usually explain it this way: nowadays bits and bytes are everywhere, and most everybody you know has a notion of what a "megabyte" is. But this is a shockingly recent idea.

My grandma was 20 when a guy named Claude Shannon, in 1948, invented/popularized the concept of a "bit" and described what information is, the way Newton described matter and how to model physics with mathematics.

Really, "information" was just a vague concept until Shannon, and this was just 70 years ago!!1

It's only natural that the lack of generations of craftsmanship in this industry makes it pretty low-quality/hard to master. On the other hand, imagination is the limiting reagent, since (arguably) the bottleneck is good ideas, especially with all these decades of Moore's law.

Because it's not actually taught anywhere. There are no widely adopted standards. Web development is literally a patchwork of random libraries and npm modules stitched together.

Computer science programs don't teach students how to build good software, they teach computer science. CS professors wouldn't even be qualified to teach good software engineering practices if they had to.

Code is also difficult. It's one thing to write your own code and understand it, but understanding someone else's code is very difficult, since you have no idea what was going on in their head when they wrote it. Of course there are practices you can follow to make things simpler, but it's never going to be as easy as, say, reading plain English.

> Because it's not actually taught anywhere

Oh, I don't know. I have a bachelor of Software Engineering that is accredited by the Australian Institute of Engineers.

The vast majority of my degree was on the discipline of SE: rigorous testing standards, theory of large-scale system design, hard real-time systems, improving your personal software process.

For our final year project 16 of us worked for a very large company to design and build them a piece of software. It was awesome.

> Because it's not actually taught anywhere.

Hell, even here on HN not having had any formal education of any kind is seen as some sort of badge of honor.

> Many common practices in software engineering come from environments where failures can be retried and new features are lucrative. And failure truly is cheap. If any online service provided by the top 10 public companies by market capitalization were completely offline for two hours, it would be forgotten within a week. This premise is driven home in mantras like “Move fast and break things” and “launch and iterate.”

Good article. We're not really that bad at it, we just optimize differently than building airplanes or elevators, for most things. No one cares if the site I built for the local bookseller goes down at 3 am for an hour or two. Or even for a few minutes during business hours.

Except airplanes and elevators need software, and therefore software engineers, so this should really be about web and app development.
