I have a different idea about essential complexity and accidental complexity. I think the examples in the article are all just accidental complexity.
Essential complexity, as I read Fred Brooks, is domain knowledge: what an accountant does and which features he uses. All the frameworks, servers, scp runs, and queries involved in that are accidental complexity.
An accountant does not need a React or Angular front end, and they don't need a server.
Angular does reduce accidental complexity, since by using it I don't have to build a lot of stuff from scratch, but it does not make tax law any simpler. Tax law is the essential complexity that needs to be coded into accounting software.
The same goes for servers: moving to the cloud reduces my accidental complexity by removing the need to deal with servers, but it does not make tax law any simpler or easier to grasp.
Seen this way, essential and accidental complexity hold up really well as a general idea, in my opinion.
I personally refer to these concepts as inherent and incidental complexity, respectively. Anything that directly supports the description of the problem should be classed as inherent complexity; everything else is incidental complexity (and what we write is mostly incidental complexity).
The big thing I don't hear often is about where complexity lives. Complexity is like the pile of receipts you need to hunt down to do your taxes: they can be scattered all over your house, you can sweep them into a pile, or you can file them. The first is by far the most complex, and I've worked on codebases that looked like that. Even the most trivial change takes a long time.
I think the bifurcation might be more clearly described as 'problem space' and 'solution space' complexity.
The 'complexity of the problem' is inherent, by definition.
Unless accountants want to literally change the requirements.
The complexity of the solution space is something else entirely and it may be hard to derive theoretically ...
But with sufficient tooling, such as AI, maybe it becomes trivial.
If we have 'AI Apis' that can immediately handle complex problems and dissolve them into something really simple, then many apparently complex domains could be reduced to something simple.
As long as we're allowed to use 'other software' and not have it 'count' as part of the complexity we are measuring, then the complexity of the 'app' tends towards 0.
You can already see this a little bit with CRUD apps, which often feel like just gluing some forms together.
Not really, because I think the tasks he picked as "essential complexity" are still fully "accidental complexity" in terms of how I understand Brooks's writing.
I cannot agree with Dan, because I feel that all "computer stuff" is accidental complexity in the context of Brooks's writing.
Some other replies change the context to "from a software dev perspective that is essential complexity", but we are discussing what Fred Brooks wrote in 1987. If we change that context then I can agree with Dan, but if you want to refute an argument you have to operate inside the same context.
What I see Dan arguing is that accidental complexity is not getting easier over the years. He does not touch the claim that essential complexity is, well, "complex" and that we cannot do much about it. So I don't agree that you can refute the whole idea based on those examples.
Where does Dan pick anything as essential complexity? Please cite a quote. As far as I can tell, Dan considers everything in the article accidental complexity, just like you do (and I do).
"Let's see how this essential complexity claim holds for a couple of things I did recently at work:"
"For some queries, it's arguably zero — my work was necessary only because of some arbitrary quirk and there would be no work to do without the quirk. But even in cases where some kind of query seems necessary, I think it's unbelievable that essential complexity could have been more than 1% of the complexity I had to deal with."
It's very clear that the author of the post understands "essential complexity" as related to thinking about the query they need to write (data structures).
The issue is that in today's computing environments application programming is only a fraction of solving the problem of delivering the application.
They too see the essential (or inherent) complexity as a domain- and requirements-related thing, but they draw the conclusion that those are somehow not part of it.
It is possible the whole app and system would be unnecessary if the essential complexity was addressed differently.
And it is also possible that due to the current way of doing things, there are new sources of essential (inherent) complexity.
Brooks couldn't have foreseen productivity gains coming from the internet and collaboration. But if you factor all that in and create a new baseline, faster machines still don't tell you what is the best way to deal with changes in requirements or how to represent a postal address so that it works internationally.
Factoring in the business process (esp a somewhat formalized one) as well as the notion of economic value would help to introduce a pragmatic separation between essential and accidental complexity.
I guess what gets interesting is when the business process is fuzzy itself (e.g. in a start-up setting); that is when the direction of programming and software development at large takes on more of a reinforcement-learning discourse vs. a purely top-down reductionistic one.
Perchance that is also when the categorisation of essential vs accidental complexity becomes more ambiguous in the human process.
From a philosophical standpoint, the starting line for "complex" is in something you take it upon yourself to pursue. The world we inhabit has a complex ecosystem, physics, etc; our study of it is far from complete. But every time we learn something about it a technical advance becomes possible.
Nobody has to keep the gears turning in this vast society. But someone at some point decided it was worth it to create those gears, without answering all the hows and whys: basic science doesn't tell us how to engineer things, only that we have ways of doing so.
So the "accident" of it emerges like a freeway pileup, from different domains of study colliding. And with computing being the workhorse technology for everything now, the crashes are so frequent as to be unremarkable.
In your example, the tax code is simply the abstraction layer one level down that you have no control over. So to translate your example, HTML specs would be essential complexity and all authored HTML would be accidental complexity to a front-end web programmer.
But this makes complexity a matter of context.
A browser is built using a programming language and uses interfaces provided by the OS and other libraries. This makes the browser accidental. And you can take this all the way down. Assembly is accidental with respect to the processor being essential.
But now we see how complexity flows through each layer. Just as the accidental complexity of the tax code becomes essential complexity for the accountant, the accidental complexity of HTML becomes essential complexity for the web programmer.
And a popular pastime is complaining about how much simpler everything could be, if only.
(not sure if this is what the original article had in mind but this model seems accurate)
> So to translate your example, HTML specs would be essential complexity and all authored HTML would be accidental complexity to a front-end web programmer.
The HTML specs are not essential complexity, are you joking?
Accounting software that handles tax codes is the essential piece. HTML is just one approach to building an application.
Even though I whip out web dev quite often, I find it scary that more and more people view it as the only option or the default approach. To say it again, the HTML specification is not essential complexity in the context of an accounting app.
You're missing the point, which is context. If you're the accountant, the tax code is essential and HTML is accidental. If you're the programmer, then HTML is essential and the outcome of your page is accidental. Accidental is what you have control over. Essential is what you're forced to work with.
If you're forced to code in HTML, just as an accountant is forced to work with the tax code, then the domain knowledge provided and exposed as specs, be it the HTML specs or the tax code, cannot be altered. So if the argument is that the tax code is essential complexity for an accountant, then the analogy holds.
To an accountant writing an HTML page for their firm or a one page app implementing some accounting feature for clients, HTML is all accidental. They could use anything else. What is essential to the accountant is accounting.
But since we're talking programming, the situation applied to a programmer: The HTML specs are essential complexity for a web programmer. They must work with it, and cannot work around it. And this is how complexity is passed on through the layers of the system.
And with the case of an accountant needing programming, the line of complexity between them is violated all the time. A web programmer hired by an accountant fails to implement good accounting software by being bound by the essential complexities of HTML. The accountant doesn't care what cannot be done because of some HTML spec. The web programmer must find ways to work around this through engineering, or compromise. This leads to further accidental complexity.
edit/addendum:
> I whip out web dev
The moment you go with web dev, this is the act that now makes HTML specs essential. The only way to escape HTML specs is to use something else, which you cannot do if you're already decided on a web app. Just as the only way an accountant can escape the tax code would be to change professions.
To align the argument with the original article, if HTML specs are half the web app, then the web app cannot be less complex than, say, half. And the only way to get there is to delete everything you did/added which is your page/program. Coding complexity can never break the essential complexity barrier, and accidental complexity in the form of your code is always a positive value.
edit/addendum2:
When adding a framework or library to reduce complexity, there must be a pragmatic replacement/swap of the original essential-complexity abstractions. The new library is a positive, in that you're adding more code, and it carries its own essential complexity. But when the complexity of, say, jQuery can replace a priori JavaScript essential complexity, then we can come out ahead. With the added complexity of replacing complexity.
I found your point hard to agree with, until I mentally replaced "HTML" with "bash" at which point I was enlightened of how important context is.
Though I was hired to work with languages $X and $Y at $dayjob, many, many bugs slip through because bash is used as glue for languages $X and $Y. This was true of my last job too, the only difference is that in my current job we don't have a resident bash expert to call out issues in code review.
I've always hated bash, but it's "essential" complexity when dealing with a modern Linux system. Technically it's within our domain to change, but for various reasons time and time again bash wins out as the de-facto glue holding infrastructure together.
Exactly. There is nothing you can do about bash unless you are the creators of bash. In which case, now you are dealing with whatever essential complexity it is you are forced to deal with to create your program. The program itself is all accidental using this model (the model initially outlined by the OP, which converges with the original article).
Maybe "incidental complexity" would be a better term. But the model is the same.
>You're missing the point, which is context. If you're the accountant, the tax code is essential and HTML is accidental. If you're the programmer, then HTML is essential and the outcome of your page is accidental. Accidental is what you have control over. Essential is what you're forced to work with.
That's not what accidental/essential complexity mean in Computer Science.
Essential Complexity is the business logic, program structure considerations, necessary tradeoffs, and so on.
Accidental Complexity is BS you have to put up with that's not essential to the program, but that you need to handle anyway. Things from manual memory management to setting up Webpack are "accidental complexity".
It's not about the language itself. Even if you're programming in bash, bash is not "essential complexity".
The complexity inherent in the task is the essential complexity (e.g. "I need to copy files from a to b, do some processing on them, handle the case where one file is missing or a column in the file is mangled, etc.").
Bash (or whatever other tool you use) can directly help you with this essential complexity, or impose accidental complexity on top of it.
E.g.
cp /foo/a /bar/.
for copying a from /foo to /bar has pretty much no accidental complexity. It essentially captures just what you need to do.
But as the script becomes bigger and you implement more of the business logic, shit like bash messing up pipelines when a part of a pipeline fails, or dealing with its clumsy error handling, add accidental complexity.
> That's not what accidental/essential complexity mean in Computer Science.
Of course. I was responding to the OP. This thread began with:
> I have different idea about essential complexity and accidental complexity. I think examples in the article are all just accidental complexity.
I was elaborating on that different idea. Everyone seems to be just reverting to the textbook and rejecting the difference not on its merits, but simply for not matching.
But complexity is complexity. If we really want to talk about complexity and where the unavoidable part comes from, then it comes from the layer underneath as well. When speaking of complexity reduction, you cannot ignore the complexity imposed on you.
"I have different idea about essential complexity and accidental complexity"
might mean:
(a) I think we should think of essential complexity and accidental complexity differently (legit, but can be confusing, and overloads the terms).
(b) I think essential complexity and accidental complexity mean something different (in general), and TFA got them wrong (e.g. because the parent doesn't know the traditional definitions of the terms, and thinks they're open to personal interpretation).
>Everyone seems to be just reverting back to the text book and rejecting the difference not on merits, but for simply not matching.
Yes, and I think those people are right. Whether the new idea has merit or not, it should use new terminology, to not obscure things. Then, we can discuss it on its own merit.
Even so, considering it on its merits alone, I don't think it has that much (more on that below), because it essentially amounts to "if you're programming in X, you have to deal with X (e.g. bash/html/etc.) and that has some complexity". Well, duh. That's true, but it's something we already know.
Whereas the accidental/essential complexity distinction in Brooks's sense is an important philosophical/logical one.
>But complexity is complexity. If we really want to talk about complexity and where the unavoidable part is coming from, then it's from the layer underneath also
Well, the original formulation is more useful, though, because having to use bash or html is not "unavoidable". It might be "unavoidable" because of one's employer's insistence, or something like that, but that's not a computer science concern.
Whereas essential complexity in Brooks's sense is completely unavoidable (in the logical sense).
A better and non-confusing term for what the parent describes would be, I think, "imposed complexity" or "circumstantial complexity".
The crux of Brooks's argument is irreducible complexity, which he calls essential and qualifies as what cannot be reduced by technology:
> programming tasks contain a core of essential/conceptual complexity that's fundamentally not amenable to attack by any potential advances in technology
Luu is arguing that most of the complexity programmers deal with is of the accidental kind. I would describe it as the "incidental to implementation" kind:
> I've personally never worked on a non-trivial problem that isn't completely dominated by accidental complexity, making the concept of essential complexity meaningless on any problem I've worked on that's worth discussing.
The OP @ozim that got me thinking reformulated essential as:
> Essential complexity from how I read Fred Brooks is a domain knowledge like what an accountant does
Which I found quite ingenious, because it's true. Complexity starts before you sit down to write your program; it's what you bring to the computer. And any non-trivial problem will be dominated by accidental complexity in the computer's implementation space, unless the program was already written for accounting.
> essential complexity in Brooks's sense is completely unavoidable (in the logical sense).
The moment you interpret any part as "unavoidable" you lose an important nuance. I was trying to illustrate how "unavoidable" is determined precisely by where you are sandwiched within the existing abstraction layers not of your making.
Even the tax code is written by legislators that are capable of controlling the accidental complexity they implement based on the essential complexity they bring to the table; the requirements they must satisfy before they leave the table.
The accidental tax code created by the legislators becomes essential domain knowledge to the accountant.
And how is this different from HTML or bash or OS programming? Or working with PayPal APIs? The authors, with avoidable choices, determine what becomes unavoidable for the consumer.
The accidental complexity of someone else is now essential complexity to you, by the definition that 1) it's unavoidable, 2) it's fundamentally not amenable to attack by any potential advances in technology, and 3) it's the bedrock upon which all of your accidental complexity lies.
So with this model, how do you reduce complexity?
We have layers upon layers of expanding specs. It's not just a computer problem, but a coding problem that also applies to tax law or any other rule making. And it's an entropy problem. Left unchecked, complexity only grows, so how do you fight entropy?
Stacking is part of the solution. Each layer shields all that is beneath it. Someone using TurboTax just sees English, buttons, and fields to fill in. The user is shielded even from HTML.
The front-end coder is immersed in the essential complexities of HTML, and all of the TurboTax pages are accidental, in accordance with satisfying the essential requirements of the page. But nevertheless, he is shielded from everything below his HTML stack.
We can already see that without these layers, the level of sophistication we've attained in our internet experience might never have been reached. The premise being, more or less, that TurboTax has never been better. Which, I hate to have to admit, is sort of true.
Refactoring is another part of the solution that we already do with our code. This may entail remodeling, redefining, and reexamining the problem space.
But we can also refactor our abstraction layers. jQuery was a new layer on top of JavaScript that was later refactored out by many as JavaScript matured (if you could call it that).
In closing, we can refactor code, we can refactor abstraction layers, and we can also abstract more. We need to fight complexity by deleting as much as possible and reorganizing what remains. And part of the solution is finding the minimum number of abstractions that are needed to recreate our solution space.
The minimum "accidental complexity" in a system can only be equal or greater than the "essential complexity" imported from outside the system. If the complexity could be less, then that would be reducing the essential complexity as well.
Complexity can be measured by the number of abstractions (words, terms, and expressions) needed to express the system.
So if accounting is essential complexity, then the complexity of a computer system for accounting starts at essential complexity, and goes up. The best a system can do is provide 1 accidental abstraction implemented for 1 essential abstraction that needs implementation.
All implementation beyond the naming of the functions is accidental.
My main point is that all "computer stuff" is accidental complexity, as I believe people in 1987 would have seen it. That might be hard to go along with when one is a software dev and much of our lives are on the internet. But that is what Fred Brooks was writing about back then.
The essential complexity idea is about the real world being inherently complex, and about the fact that we cannot remove that complexity by making bigger, better computers. Dan is not attacking this at all; he only argues that accidental complexity is not getting easier.
You argue the same thing as Dan. You want to move essential complexity to a different place. But that does not refute "essential and accidental complexity" in the context of the book; it just argues that accidental complexity is not getting easier. Fred Brooks argues that even if you make accidental complexity easier, you will not make the real world easier. He was not saying that accidental complexity will get easier, or that there is some kind of law that it always gets easier.
Right. But I'm not just moving essential complexity to a different place. I am describing how complexity is made essential or accidental based on where you are in the stack. It actually unifies your theory and Brooks's, and I don't find the arguments others have stated against it convincing.
As an accountant staring at a computer, everything about the computer is accidental. Saying simply that the computer is accidental regardless of context is not very useful apart from for refuting Brooks.
I think this assumes a fine distinction between "domain" and "non-domain" knowledge that often doesn't exist.
If you're writing a trading strategy, the technical details of how the code's executed often dictate what kind of trading you're planning on doing. e.g. How do I balance speed vs how sophisticated the algorithm is?
Also, if your "domain" is technical in nature (e.g. building a new database), it's not clear at all what parts are "essential" and what parts are "accidental". Is the file system accidental? Is the underlying data store (SSDs, spinning disks) accidental? Often the purpose of the project is intrinsically tied to the technical capabilities of the platform it's built on top of.
Brooks clearly missed the boat when he said, "Well how many MIPS can one use fruitfully?". I'm not sure where the fault lies on that one. A recent article [1] bemoans the slowness of modern software.
What Dan got wrong:
Both problems discussed were created by computers. A large collection of machines generating logs is possible to analyze, but that possibility comes from massive investment in tools designed to parse, extract, and report on linear and structured data. Brooks covers this in "Buy vs. Build", although his example is closer to Excel than ggplot.
Also, if you are dashing off a python script in a single wall-clock minute, you either write insane amounts of Python for purposes similar to this or are way smarter than the average engineer. In Advent of Code (where Python is popular), the fastest solution to the easiest puzzle (Day 1) this year was 0:07:11. [2]
I just tracked my time for doing that Advent of Code day 1. It took 10 minutes from start to finish. Across the two parts of the question, I spent approximately 2 minutes building the scripts, 1 minute determining an algorithm, and 7 minutes reading the prompt. I suspect similar breakdowns for the 7:11 you quote.
If Dan excludes the prompt reading and solution determination because he's seen this sort of problem before, 1 minute using tools he's comfortable with doesn't seem unreasonable. Considering all it involves is a list of hosts, a loop, a try/except block, and an scp command (see the sketch below), spending much more than a couple of minutes on the writing would surprise me.
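For concreteness, a minimal sketch of that kind of script (the host-list format, remote log path, and destination directory are all invented here; this is not Dan's actual code):

    import os
    import subprocess

    # One hostname per line in a plain text file (assumed format).
    with open("hosts.txt") as f:
        hosts = [line.strip() for line in f if line.strip()]

    failed = []
    for host in hosts:
        os.makedirs(f"logs/{host}", exist_ok=True)  # local landing directory
        try:
            # Pull the remote log directory into a per-host local directory.
            subprocess.run(
                ["scp", "-r", f"{host}:/var/log/myapp", f"logs/{host}/"],
                check=True,
            )
        except subprocess.CalledProcessError:
            failed.append(host)  # remember the failure and keep going

    if failed:
        print("failed hosts:", ", ".join(failed))

Parallelism (e.g. a thread pool around the scp calls) adds a few more lines, but not much more thought.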
There was a 6 minute delay in getting the inputs for day one for many people. The fastest part one solution was 35 seconds. And the fastest part two (after finishing part one) was also about 30 seconds.
If there hadn’t been a server issue I suspect many of the top 100 would’ve finished both parts in under two minutes
I think Dan has braggadocio'd his time estimates, or his task is somewhat different from what he describes. I mean, the guy talks fast, like really fast, so I suppose he's quick, but mere minutes for something like this doesn't seem realistic unless it's extremely routine. Instead, it seems ad-hoc-ish and exploratory to me; it seems like something that needs to be considered and planned out rather than done between two sips of coffee. (I am considering his whole task here, not just the scp'ing of files.)
He's talking about log files from a few hundred THOUSAND servers that results in several terabytes of data that have to be parsed. He doesn't say exactly what he's looking for, but the point is he's trying to answer some questions about performance for more than a few different applications. Are these simple questions, or involved ones which spawn other questions? We don't know, but even if they're easy questions, there's many applications involved and many servers.
Right off the bat, for something THAT BIG, I think it's reasonable to figure out what you're going to do with a sample of logs before downloading "home depot" onto your hard-drive. So this is definitely a multi-pass kind of job: start with a survey, then try a bigger chunk, if everything's OK do the rest.
Next, I think it's advisable to consider factors about the servers themselves: the application versions, whether or not the applications were running (and why not), the hardware, the role of the server, whether or not the server was up (and why not). Is this metadata about each server available (can you trust it?) or is it something that has to be queried each time on each server? Dan says this supposed to be a couple of years of data, has each server been through upgrades? when? Is that relevant? We don't know any of these, but they would have to at least be considered for someone doing this task.
After the data is parsed there's slicing and dicing to do for the purpose of graphs. That takes lots of time; I am assuming he's not just talking about extracting one figure for each application and plotting it.
For someone that is all set-up and on top of things, this seems like something that is a day's work and easily more, not counting follow-up work and validation to further investigate the additional questions that would inevitably (in my opinion) be raised on such a big dataset.
I think you underestimate the value of pipelining here. You could spend time narrowing down the set of logs to download... but in the time it takes to figure that out, you might as well just download them all.
Having "home depot" locally available for analysis is never a bad thing, plus you may be racing against time re log rotation, etc.
> For someone that is all set-up and on top of things, this seems like something that is a day's work and easily more, not counting follow-up work and validation to further investigate the additional questions that would inevitably (in my opinion) be raised on such a big dataset.
In the middle of a SEV, you don't have a day to perform this kind of analysis. 15 minutes till the SLA clock starts ticking and customers are owed refunds.
OK, but it's not a "SEV", he's looking at 2 years of logs on hundreds of thousands of servers over a wide variety of applications. Nothing appears to be "down". This is more like an investigation looking for some high-value efficiency improvements (which is something he's written about).
I am sure he did it "quick", but mere minutes of work and mostly just waiting around for something like that analysis? I doubt it!
Seems pretty normal to me - we used to do this kind of thing on the regular. SEV analysis often requires digging through months of past logs to ascertain all the affected customers. If that takes a day every time you hit a customer-facing issue, your team ends up spending all their time on after action reports, and never gets any engineering done.
> What Dan got wrong: Both problems discussed were created by computers. A large collection of machines generating logs is possible to analyze, but that possibility comes from massive investment in tools designed to parse, extract and report on linear and structure data.
I don't understand your argument. In what way does that disprove Dan's point?
Analysis of data coming from computers (logs coming from servers in this case) is guaranteed to have a structure that makes it easier to process, compared to data coming from more organic sources (reports collected by people, measurements from instruments, etc...) which are closer to the domain Brooks would have dealt with.
When dealing with these problems today, the greatest challenge isn't writing a script; it's deciding whether data points that don't fit the model invalidate the data or the model.
Moreover, we've spent five decades systemizing the former. Doing the latter is more challenging than ever and fraught with controversy.
Reading the article had me thinking: “computers are good at automating… computers.”
The quantity of logs he’s describing could only be created by having a machine take readings and dump them somewhere. The cause of the problem had to be a computer in the first place. For human data sources, even back in the 60s we had machines fast enough to tally the census, balance checkbooks, or book flights.
So yeah, Dan had a problem of unimaginable scope in the 80s but also the problem is kind of fake? “Help, my computers are spitting out too many numbers and I can’t graph them all!”
I feel like Dan was on to something here but also something in the article didn’t quite line up.
> So yeah, Dan had a problem of unimaginable scope in the 80s but also the problem is kind of fake?
I get what you're saying, but I don't think the problem is "fake" or artificial, maybe not even avoidable. Software is a tool, and you almost always need other tools to build and maintain tools.
Compare large construction projects, which often require separate construction projects, like building temporary roads to move all the material. Or drilling a long tunnel through a mountain: Frequently, entire factories will be built close to the tunnel entrance, which process the excavated material into concrete to be used for encasing the tunnel walls.
Another example: To produce huge numbers of screws in a cost-effective way, you need large, highly specialized machines, which serve no other purpose. And you need people who maintain those machines. Is that a fake problem, or are those fake jobs? Hardly.
Construction is an interesting case. If someone had written after the Empire State Building that skyscrapers would not grow another order of magnitude in the coming century, they would have been laughed at. There were lots of popular articles about "mile high" skyscrapers then. And skyscrapers today are much better! We have counterweights at the top, glass facades, fancy designs... But they're mostly the same height. Even the show-off Burj Khalifa is only twice the occupied height. And we don't even have a kilometer-high skyscraper yet.
For leaderboarders, and especially the easier challenges, total AoC solve time is largely prompt comprehension. If you already know what you want to do, and that thing is straightforward, it is not unimaginable to bang out a short python script in a minute.
I found the example given very contrived and the whole article slightly disingenuous because of it.
Brooks was talking about building long-living applications serving complex business requirements. The article talks about a one off script with no ongoing life. It's not even really "software" at all in the terms Brooks was describing.
It's like someone describing absolute limits on fuel efficiency for vehicles and then I get on a scooter and coast down a hill and jump off at the end so that it crashes into a wall and explodes, and using this example to claim they are wrong.
The difference between a one off script and a production-grade system with documentation, instrumentation, and monitoring is pretty big.
Here's less than a minute's worth of questions I would want answered for a professional version of the log fetching example:
* Is that list of a thousand servers static?
* How is it updated?
* What format is it in?
* What do you do when the resource isn't available?
* What other systems are impacted when the resource can't be contacted?
* Is it acceptable to use a cached server list until the resource can be updated?
* How do you handle servers that don't respond when polled?
* If we retry connections, what scheme do we use? Linear or exponential back-off? (a minimal sketch follows this list)
* After a service interruption, how do we resume data collection? Do we keep rolling forward or do we request older data? How quickly do we retrieve backed up data? Is service resumption plan different for the case when a single system is down vs when the log fetching system has an outage?
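On the retry question alone, even the "simple" answer is a policy decision. A minimal sketch of exponential back-off (the attempt count and base delay are arbitrary placeholders):

    import time

    def retry_with_backoff(fn, max_attempts=5, base_delay=1.0):
        # Call fn(), retrying on any exception with exponentially growing delays.
        for attempt in range(max_attempts):
            try:
                return fn()
            except Exception:
                if attempt == max_attempts - 1:
                    raise  # give up after the final attempt
                time.sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s, ...

And that still says nothing about jitter, logging, metrics, or cancellation, all of it accidental complexity in the article's terms.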
Agree. Nearly all the work of programming I have done is related to the task of trying to figure out the business problem in enough extreme detail to provide a solution.
The main lever to improve productivity for me as a developer are open source tools, commercial APIs, and similar things. This isn't reducing the amount of programming, it's just moving it around.
Magically 1,000x faster computers would maybe help me be 20% more productive.
It is easy to see that total complexity has been severely reduced in recent years when you look at how many small companies are doing things that required an army of people a few decades ago.
For example, running the technology for a real-time banking system nowadays can be done by perhaps even a single developer (with enough experience) thanks to open source, cloud computing and advances in software and hardware. Now we even have distributed relational databases that are relatively straightforward to operate, like CockroachDB.
First, Brooks wasn't talking about that. He was talking about productivity boosts due to programming language design and programming methodology. More specifically, he did not exclude improvements to both kinds of complexity overall, but only those that drastically change their relative proportion. Second, in the early 90s I had a summer job at a company whose old ERP, which had been written in a language called Business Basic and ran on a minicomputer, was maintained and operated by a single person.
>, Brooks wasn't talking about that. He was talking about productivity boosts due to programming language design and programming methodology.
Brooks was also talking about non-programming-language advancements as possible "silver bullets". See the pdf and look for the following sections that are not about programming syntax:
Hopes for the Silver
Artificial Intelligence
Environments and Tools
Promising Attacks on the Conceptual Essence
Buy versus Build
He underestimated AI machine learning. With hindsight, we see that the deep-layer neural net combined with GPU beat human rule-based programming for automatic language translations, DeepMind AlphaZero beats expert programmers hand-tweaking IBM DeepBlue, etc.
EDIT reply to: "...nor have they reduced the overall cost of programming by 10x even after more than three decades."
I'm not convinced of that. The issue with your conclusion is that it omits the increasing expectations of more complex systems. I'd argue we did get 10x improvement -- if -- we hold the size & complexity of the system constant. E.g. write 1970s style text-based code to summarize sales revenue by region and output it to green bar dot matrix printers. This was tedious work in old COBOL but much easier today with Python Pandas or even no code tools like MS Excel.
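To make that comparison concrete, the whole "sales revenue by region" report is a few lines of pandas today (the column names and figures are invented for the example):

    import pandas as pd

    # Toy sales data; the 1970s version would read fixed-width records instead.
    sales = pd.DataFrame({
        "region":  ["East", "West", "East", "North"],
        "revenue": [1200.0, 950.0, 430.0, 780.0],
    })

    # The entire "report program": total revenue per region.
    print(sales.groupby("region")["revenue"].sum())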
The invisible factor we often overlook is that our demand for software to do more complex things will always outpace the smaller productivity increases of the tools. This makes it look like our newer programming tools never gave us 10x improvement when it actually did.
How did he underestimate statistical machine learning? Whatever achievements were made, they did not take place within a decade, nor have they reduced the overall cost of programming by 10x even after more than three decades.
And, indeed, the one thing that Brooks presented as the most promising direction was buy vs. build. He said that, unlike changes to programming languages and methodology, which wouldn't give a huge boost, buy vs. build might (although he wasn't sure about that). So he was exactly right about that.
Also, 1986 wasn't 1966. In 1986 people didn't write simple software in COBOL. They used things like Magic and Business Basic and even Smalltalk, and, shortly after, Visual Basic and Access. Excel was released in 1987, and we had spreadsheets in the early 80s, too (and Brooks explicitly mentions them in No Silver Bullet). RAD tools and "no code" was very much in vogue in the late 80s and early 90s. That was Brooks's present, not future. He even says that this is the right direction:
I believe the single most powerful software-productivity strategy for many organizations today is to equip the computer-naive intellectual workers who are on the firing line with personal computers and good generalized writing, drawing, file, and spreadsheet programs and then to turn them loose. The same strategy, carried out with generalized mathematical and statistical packages and some simple programming capabilities, will also work for hundreds of laboratory scientists.
When generalised, Brooks's prediction amounts to expecting diminishing returns due to reduction of accidental complexity, and we're seeing exactly that.
> I'd argue we did get 10x improvement -- if -- we hold the size & complexity of the system constant.
Only if we do the one thing Brooks says would work: Buy vs. Build. When we write from scratch -- no way. Even then, I think that while we may see a 10x reduction for specific simple task, we won't see it for large, complex software, which is where most of the effort in software is invested.
>, Visual Basic and Access. Excel was released in 1987, and we had spreadsheets in the early '80s, too. When generalised, Brooks's prediction amounts to diminishing returns due to reduction of accidental complexity, and we're seeing exactly that.
The "diminishing returns" of _what_ exactly?
That's what I'm trying to make clear. Let me try and restate another way:
(1) 10x improvement in programming tasks
vs
(2) 10x improvement in completing business projects
I'm emphasizing that (1) has been achieved many times in multiple areas but it's overshadowed by not seeing (2) happen.
Visual Basic is another good example. When I first used VB Winforms in the 1990s, it was more than 10x faster than hand-coding the raw C "Wndproc()" message loop. But that legitimate productivity gain is dwarfed by the business wanting new complexity (the app needs to connect to the internet, it needs to be a new-fangled web app for browsers instead of a desktop exe, it needs to work on mobile phones, etc., etc.). Our desires for new business functionality multiply faster than the time savings from progress in tools.
And "accidental complexity" isn't fixed either. And new deployment environments, new features also add a new set of accidental complexity. E.g. if next generation of apps need to interface to virtual reality (headsets, etc), the programming code will have logic that doesn't have direct business value. So we'll then get new 10x programming tool/library to manage that accidental complexity in the VR environment but then.... we're on to the neural implants SDK and we have no silver bullets for that new thing which means we revisit this topic again.
>while we may see a 10x reduction for specific simple task, we won't see it for large, complex software, which is where most of the effort in software is invested.
I agree. But again to be clear, today's expectation of "large, complex software" -- has also changed.
EDIT reply to: "I'm saying that (1) has not been achieved even within a period of time that's 3x Brooks's prediction, "
Raw Windows SDK C-language WndProc() was the late 1980s, and by 1993 I was using Visual Basic 3, dragging and dropping buttons onto Winforms. Just that one example was a 10x improvement within a decade. For line-of-business apps, VB was 10x+ more productive because of the paradigm shift (in addition to things like not worrying about the mental bookkeeping of malloc()/free(), etc.).
>But most tasks cannot be achieved today 10x faster than in 1986
For discussion purposes, I don't know why we have to constantly refer to 1986 even though the paper has that date. Its repeated submission for discussion makes it seem like people consider it an evergreen topic that transcends Brooks's days of the IBM 360 mainframe.
As another example, the productivity improvement is in writing and deploying complex apps using Ruby on Rails or JavaScript frameworks deployed on AWS. That's more than 10x faster than the 1990s CGI days of C code writing HTML to stdout. Those early web apps were simpler and yet they were utterly tedious and slow to code.
I'm saying that (1) has not been achieved even within a period of time that's 3x Brooks's prediction, and that (2) has been achieved as Brooks's claimed it would. Again, don't confuse 1986 with 1966. We had Smalltalk in 1980. In 1987 we had Perl. Everybody was using Visual Basic starting in 1991, and Python came out around the same time. The most popular "fast and easy" programming language we have today is 30 years old. We've been using virtually the same IDEs for over 25 years. Clojure would be immediately familiar to anyone who's learned Scheme with SICP at school in 1980, and when I was in uni in the mid 90s, you know what the most hyped language was, the one that was thought to take over programming in a few short years? That's right -- Haskell. Things have really changed very little except when it comes to the easy availability of free libraries and knowledge on the internet.
> But again to be clear, today's expectation of "large, complex software" -- has also changed
But most tasks cannot be achieved today 10x faster than in 1986 except by the one way Brooks said it might be.
In other words, Brooks's prediction was incredibly prescient, and it is those who disagreed with him (and many, many did) who turned out to have been wrong.
> I don't know why we have to constantly refer to 1986
Because Brooks's prediction is predicated on the thesis that improvements due to changes in programming languages and methodology mostly impact the ratio of accidental/essential complexity, which means that we expect to see diminishing returns, i.e. a smaller improvement between 1990 and 2020 than we had between 1960 and 1990, which is exactly what we see.
> Just that one example was 10x improvement within a decade
Brooks didn't say there can't be 10x improvements due to languages in any decade, only that there won't be in the following decade(s), because once accidental complexity is reduced, there's less of it left to reduce further. To grossly over-simplify his claim: yes, we did see a big difference between 1980 and 1990, but we won't see as big a difference between 1990 and 2000. Or, in other words, he claimed that while you certainly can be 10x more productive than Assembly, you can't be 10x more productive than Python.
> As another example, the productivity improvement is in writing and deploying complex apps using Ruby on Rails or JavaScript frameworks deployed on AWS. That's more than 10x faster than the 1990s CGI days of C code writing HTML to stdout. Those early web apps were simpler and yet they were utterly tedious and slow to code.
But that's because the web took us backward at first. It was just as easy to develop and deploy a VB app connected to an Access DB in, say, 1993, as it is to develop and deploy a RoR one today. A lot of effort was spent reimplementing stuff on the web.
>It was just as easy to develop and deploy a VB app connected to an Access DB in,
And that VB desktop app was not accessible to web browsers.
The business expectations/requirements changed.
E.g. when I wrote the VB desktop app for internal sales rep, I didn't need to code a "login screen" because the rep was already authenticated by virtue of being on the corporate network.
But if business says that checking product prices should be "self-service" by outside customers using a web browser, now I have to code a login screen. ... which means I also have to create database tables for customer login credentials... and code the web pages to work for different screen resolutions, etc, etc.
Yes, the VB paradigm shift was a 10x productivity improvement over raw C (that's a programming-syntax and environment change, not just a library)... but it's overshadowed by having to write more code for previously unavailable business capabilities. New business expectations will always make it seem like we're running to stand still. It's not just the ratio of accidental to essential complexity.
After writing out a bunch of thoughts on this... I propose another way to read Brooks's paper that is still consistent with his message: new business requirements (new essential complexity) will always outpace programming technology improvements (e.g. the accidental-complexity reduction of using GC to manage memory instead of manual malloc()/free()).
This is why accidental complexity is always a smaller component of total complexity. Thus, the real 10x programming improvements don't actually make us 10x faster at finishing business software.
EDIT reply to: "...business requirements change has little to do with his point":
I interpret "essential tasks" and "essential difficulties" as modeling the business requirements. I'm saying his first paragraph can be interpreted that way. If you disagree, what would be some example programming code that shows "essential tasks" that's not related to business requirements and also not "accidental complexity"?
(He's saying the essential task is the "complex conceptual structure".)
>For most given requirements, it is not 10x easier to do something from scratch today than it was 30 years ago.
If the given requirements are the same as the 1980s but I also I get to use newer tools that didn't exist in 1980s (SQLite instead of writing raw b-trees, dynamic collections instead of raw linked-lists, GUI toolkits instead of drawing raw rectangles to the screen memory buffer by hand, etc), then yes, coding from scratch will be much faster.
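For instance, here is the "SQLite instead of raw b-trees" point in practice, a sketch using Python's bundled sqlite3 module (the table and columns are made up for illustration):

    import sqlite3

    # Persistent, indexed storage in a few calls; a few decades ago this
    # meant writing your own b-tree and record format.
    conn = sqlite3.connect("inventory.db")
    conn.execute("CREATE TABLE IF NOT EXISTS parts (sku TEXT PRIMARY KEY, qty INTEGER)")
    conn.execute("INSERT OR REPLACE INTO parts VALUES (?, ?)", ("A-100", 42))
    conn.commit()

    for sku, qty in conn.execute("SELECT sku, qty FROM parts ORDER BY sku"):
        print(sku, qty)
    conn.close()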
>Of course applications 30 years ago had login screens and were accessed remotely, they just didn't do that using HTTP and HTML, which set us back in terms of some capabilities for a while.
This misses the point of my example. I was trying to emphasize the new business requirement of customers -- not employees -- accessing corporate database systems.
Therefore, a new GC language to alleviate mental burden of malloc() doesn't really help with that new complexity. It wasn't about mainframe green screens to http/html. It was about the new business functionality for customers access that makes it seem like programming productivity didn't improve at all.
The mainframe greenscreen wasn't relevant because customers at homes don't have X.25 T1/T3/ISDN connections to connect to the company's mainframe.
This is not an example of "web being backwards" or "catching up to old greenscreens". The end customers didn't previously have access to the mainframe at all. Therefore, it's new business functionality to empower customers that must be coded.
>we haven't developed any new programming paradigm or technique that has increased our ability by much.
Even if we exclude libraries, I still think a garbage-collected language (deployed on a regular commodity PC instead of an expensive Smalltalk workstation) is a 10x productivity improvement over C/C++ malloc/free/new/delete for line-of-business apps. Chasing random pointer bugs slows productivity way down. And languages like PHP, where HTML templating was a first-class concept alongside the code, were a 10x improvement over HTML generated on the CGI stdout of C and Perl scripts. New programming paradigms do help the coding aspect of productivity a lot. They just don't increase total business-project productivity.
Of course applications 30 years ago had login screens and were accessed remotely, they just didn't do that using HTTP and HTML, which set us back in terms of some capabilities for a while.
What you're saying about requirements outpacing ability might be true, but it is not Brooks's point, which is also true, and that business requirements change has little to do with his point: For most given requirements, it is not 10x easier to do something from scratch today than it was 30 years ago.
We've certainly accumulated many software components over the past 30 years and made them freely available, and that has helped productivity a lot -- as Brooks wrote it might. But, as he predicted, we haven't developed any new programming paradigm or technique that has increased our ability by much.
I was so productive writing VB apps. I miss those days. Everything in one application, no browser -> server communication delay. You just assumed a minimum screen size of 640x480 and made sure it fit into that. No mobile responsive bullshit, no CSS malarky to deal with. What you drew in Visual Studio is exactly what the user got.
But there was also a lot of boilerplate involved. No handy-dandy open-source libraries sitting around on the internet that could just be pulled in to deal with a task. You could buy a COM object or a commercial visual control, but it was rare and expensive. If you were doing tricky things you had to work it out yourself the hard way, and make mistakes doing it.
Now... most programmers I know are plumbers wiring up different ready-made product APIs using a script language.
Yeah, I think jasode is right - we're 10x as productive now, and I think that's because of the popularity of Open Source and Stack Overflow rather than IDEs (if anything, modern IDEs are less productive than 1990s-era Visual Studio). However, the business tasks have got 10x more complex. And I don't think that's unconnected - things that were previously too complex/difficult/expensive are now routine, and we're now expected to do more.
yeah, basically - there's an amount of complexity which is acceptable to business. Less than this and the business will add features until it's met. More than this and the features are not commercially viable.
The amount of complexity that the programmer needs to cope with is about the same, regardless.
> And that VB desktop app was not accessible to web browsers.
That wouldn't be a problem. I work in mainframes; today's web applications are a reimplementation of CICS and other green-screen apps. Yes, there are more bells and whistles and it's nicer, but the technology for "distributed" applications has been there for a long time.
Fred Brooks suggests in 1986 that future productivity improvements focus more on:
>Exploiting the mass market to avoid constructing what can be bought
In 2020 danluu says that:
>In 1986, there would have been no comparable language, but more importantly, I wouldn't have been able to trivially find, download, and compile the appropriate libraries and would've had to write all of the parsing code by hand, turning a task that took a few minutes into a task that I'd be lucky to get done in an hour.
IME this demonstrates remarkable prescience by Brooks. The only real difference between Brooks' hypothetical future where software is assembled from bought components and the one danluu lives in is that the components are, in this case, free.
Package managers might seem obvious and necessary today, but I don't think they were at all obvious in 1986 (or even in 1996). Fuck, golang didn't even think they were necessary in 2009.
Also:
>Looking at the specifics with the benefit of hindsight, we can see that Brooks' 1986 claim that we've basically captured all the productivity gains
I don't believe that this was his claim. He claimed that productivity gains from improved programming language design would be incremental rather than game changing (hence: no silver bullet).
Some people will probably argue that there is a 10x difference in productivity due to language, but I don't think that is the case. In fact, I think there's a class of people who keep getting proved wrong, again and again, about the relative importance of package-management ecosystems versus "game changing" language features like non-nullable types, which are incrementally better but not an order of magnitude better.
I see that you are unaware of the large collection of programming language improvements accumulated in recent decades, which I will guess is because your favored language mostly does not implement them.
In Advent of Code, I solved day 23 (the crab and the cups) with brute force, and even with today's hardware it would have taken many hours, if not days, to complete. A tweak of the algorithm got it down to 0.248 seconds. That tweak required a 4 megabyte array in RAM.
I back-ported the solution to Turbo Pascal 7; it used a "file of longint" in place of the array, which you can't have in MS-DOS. It finished in about 10 minutes, because I have an SSD. Otherwise it would have been about 50,000,000 IO operations at 10/second --> 50+ days.
We can and do use the heck out of the hardware, but it's not infinite, and there will always be orders of magnitude performance to be gained by seeking and using better algorithms.
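For anyone curious, the usual form of that tweak is to store the circle of cups as an array of "next cup" links, which makes every move O(1). A sketch in Python (whatever language the commenter actually used would be much faster than CPython; the data structure, not the speed, is the point, and with a million 4-byte ints the link array is the ~4 MB mentioned above):

    def play(cups, moves):
        # nxt[label] is the label of the cup immediately clockwise of `label`
        # (index 0 is unused).
        n = len(cups)
        nxt = [0] * (n + 1)
        for i, c in enumerate(cups):
            nxt[c] = cups[(i + 1) % n]
        cur = cups[0]
        for _ in range(moves):
            a = nxt[cur]; b = nxt[a]; c = nxt[b]   # the three picked-up cups
            nxt[cur] = nxt[c]                      # close the gap behind them
            dest = cur - 1 or n                    # wrap from 1 back to n
            while dest in (a, b, c):
                dest = dest - 1 or n
            nxt[c] = nxt[dest]                     # splice the three cups back
            nxt[dest] = a                          # in right after `dest`
            cur = nxt[cur]
        return nxt

    # Part-two-sized input: the example labels, then cups up to one million.
    cups = [3, 8, 9, 1, 2, 5, 4, 6, 7] + list(range(10, 1_000_001))
    nxt = play(cups, 10_000_000)
    print(nxt[1] * nxt[nxt[1]])

Brute force shuffles a million-element list on every move; the link array reduces each move to a handful of pointer updates.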
Well, using "himem.sys" you could make use of RAM above 1MB... I used it to write a sort utility that only uses temp files when RAM was exhausted.
On my 386 with 4MB total RAM I could use about 3.7 or so, the rest was a shadowed BIOS copy. (Never tried to deactivate that, perhaps it would have been relocated and then made available?)
A missing colour in the discussions around complexity today is the ever growing ambition for the problems we're all trying to solve. My grand unified profoundly unserious theory is that we're always going to complain about accidental complexity because we will keep discovering new reams of it by extending what we're trying to do out into the unknown parts.
If we continued to play in just the known parts of the problem space, we'd see it melt away further and further into Brooks' model of the world. Instead, we built distributed systems, we work in absolutely giant teams, we deploy to the strange heterogeneous runtime we call the web, etc etc, and all of that generates new complexities to abstract away before we get back to just the essential stuff.
> In 1986, perhaps I would have used telnet or ftp instead of scp. Modern scripting languages didn't exist yet (perl was created in 1987 and perl5, the first version that some argue is modern, was released in 1994), so writing code that would do this with parallelism and "good enough" error handling would have taken more than an order of magnitude more time than it takes today. In fact, I think just getting semi-decent error handling while managing a connection pool could have easily taken an order of magnitude longer than this entire task took me (not including time spent downloading logs in the background).
> Today, we have our choice of high-performance languages where it's easy to write, fast, safe code
> In 1986, there would have been no comparable language
Shell existed in 1986, and IIRC ftp could actually be scripted, a bit, so I think that it could have been used to run the download portion. Maybe I misremember?
Lisp existed in 1986; the first edition of Common Lisp the Language was written in 1984. It certainly could have been used, and even in the 80s I think it would have good enough performance for log parsing. Certainly it would now, but that is after four decades of optimisation.
His point about libraries, though, is excellent. The rise of open source libraries radically changed the software world for the better.
I should note, though, that Plan 9 — which was developed in the late 80s — removed a ton of the accidental complexity in dealing with e.g. FTP servers. We could have much less accidental complexity, but we choose not to.
The reality is that Common Lisp would not have fit onto a Unix workstation from 1986 in a lightweight way, like firing off another /bin/sh or /bin/awk job.
Machines had memories measured in single-digit megabytes, often closer to zero than to 10.
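(For contrast with the 1986 options being weighed in this subthread, here is roughly what the "parallelism plus good-enough error handling" step from the quoted passage looks like today. A sketch only -- the host names, paths, and retry policy are invented, and this is not Dan's actual script:)

```python
import os
import subprocess
from concurrent.futures import ThreadPoolExecutor

HOSTS = [f"app{i:02d}.example.com" for i in range(1, 41)]   # hypothetical hosts

def fetch(host, remote="/var/log/app.log", retries=3):
    # "Good enough" error handling: a couple of retries, then report failure.
    for _ in range(retries):
        result = subprocess.run(["scp", f"{host}:{remote}", f"logs/{host}.log"],
                                capture_output=True)
        if result.returncode == 0:
            return host, True
    return host, False

os.makedirs("logs", exist_ok=True)
with ThreadPoolExecutor(max_workers=10) as pool:            # the parallelism part
    failed = [host for host, ok in pool.map(fetch, HOSTS) if not ok]
print("failed:", failed or "none")
```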
> In 1986, perhaps I would have used telnet or ftp instead of scp.
In 1986 you would probably have used rcp (the thing that inspired scp). A lot less secure, but with one benefit: when Dan did this mass scp, I bet his local CPU was absolutely pegged well before network saturation?
Ssh imposes a hefty CPU overhead because of encryption, something I didn't appreciate until around 2005, when I was updating an rcp-centric script to use scp.
> Modern scripting languages didn't exist yet (perl was created in 1987 and perl5, the first version that some argue is modern, was released in 1994), so writing code that would do this with parallelism and "good enough" error handling would have taken more than an order of magnitude more time than it takes today
You could have used ksh (a popular scriptable shell that heavily inspired Perl).
rcp errors are denoted by the return code, and that's also the natural error-handling approach in ksh.
As for the grep/sed/awk part of the problem, it's possible the 1986 local machine was a uniprocessor - it might have been faster to do the processing on each remote node (via rsh) and then only transfer the result set back over the network.
EDIT - job control (the thing that makes parallelism trivial in ksh) was only added in 1989. To easily get parallelism you would have had to write the script in csh - life's too short for scripting in csh (or its descendants), so I conclude that everything I said before is wrong, and in 1986 I'd have said sod this and gone and brewed a cuppa tea instead :-) (not strictly true, since I was coming up on 2 years old at the time...)
Yeah, my thought on the problem was that there’s probably an AT&T promo video from 1986 on YouTube somewhere that shows Brian Kernighan logging into 10 machines and downloading phone records and then processing them with AWK. He just wouldn’t be able to download quite as many and do as much with it.
The complexity in just what I am doing right now is impossible to comprehend. Imagine a mobile app with dozens of dependencies taking on a "secret" project involving a dozen of those dependencies that are drifting far away from the rest, with a dependency manager unable to function in this circumstance, requiring manual guessing of what goes with what. Now imagine trying to develop with this Frankenstein set of dependencies and have any hope of getting work done. Of course the problem is not technical, it's institutional; yet it's an unreasonable well of complexity that belies the demands placed on its workers. Technological complexity is not the only kind to worry about; it's the difficulty of managing complexities that are not obvious and cannot be explained to non-technology-savvy executives that leads to painful results.
Now add a wealth of server-side complexity: webs of micro and not-so-micro services, interdependent on each other, built by teams further and further removed from each other, leading to more instability and unpredictable behavior.
Managing systemic complexity (or failing to manage it well) seems endemic to programming today, much worse than when Brooks wrote his essay (which I actually predate).
I normally agree with Dan's essays, but this time I very much disagree. As someone who first started programming around the time Brooks's article was written, my conclusion is that not only was Brooks right, but, if anything, that he was too optimistic.
In the late 80s and early 90s, a lot of software development was done with various RAD languages (Magic, Business Basic -- my first programming job used that -- and later VB). We fully expected the then-nascent Smalltalk, or something like it, to become dominant soon; does anyone claim that JS or Python is a significant improvement over it? Our current state is a big disappointment compared to where most programmers in the 80s believed we would be by now, and very much in line with Brooks's pouring of cold water (perhaps with the single exception of the internet).
My perception is that the total productivity boost over the past three decades is less than one order of magnitude (Brooks was overly careful, predicting only that no single improvement in language design or programming methodology would yield a 10x boost within one decade), and almost all of it comes from improvements in hardware and the online availability of free libraries (Brooks's "Buy vs Build", which he considered promising) and information -- not from changes in programming methodology or language design (although garbage collection and automated unit tests have certainly helped, too). The article also mixes in hardware improvements and their relationship to languages, but we knew that was going to happen back then, and I think it was factored well into Brooks's prediction. Moreover, my perception is that we're in a period of diminishing returns from languages, and that the productivity improvements Fortran and C had over assembly are significantly greater than the gains since.
The best way to judge Brooks's prediction, I think, is in comparison to opposite predictions made at the time -- like those that claimed Brooks's predictions were pessimistic -- and those were even more wrong in retrospect.
I would also add that if you want to compare the ratio of essential and accidental complexity in line with Brooks's prescient analysis, you should compare the difficulty of designing a system in an accidental-complexity-free specification language like TLA+ to the difficulty of implementing it in a programming language from scratch. I find the claim that this ratio has improved by even one order of magnitude, let alone several, to be dubious.
> Brooks states a bound on how much programmer productivity can improve. But, in practice, to state this bound correctly, one would have to be able to conceive of problems that no one would reasonably attempt to solve due to the amount of friction involved in solving the problem with current technologies.
I don't think so. Although he stated it in practical terms, Brooks was careful to make a rather theoretical claim -- one that's supported by computational complexity results obtained in the 80s, 90s and 00s, on the hardness of program analysis -- about the ability to express what it is that a program is supposed to do.
It is my observation that if you want to write a GUI, the tools peaked around Delphi and Visual Basic 6. Git is way nicer than making manual PKZIP backups, and there are nice refactoring tools in Lazarus... but it's not THAT much better.
What was really surprising to me is the lack of GUI-building IDEs for Python, etc. Now I know why they are scripting languages, and not application languages.
I went through wxBuilder, but the problem is the same... it's a one-way trip, and once you start hooking Python up to the generated code, you lose the ability to tweak the UI without losing work.
I got around this with wxBuilder by building my own interface layer to decouple everything, but then I needed to change a list to a combo box, and everything broke.
Things that take HOURS this way are a few seconds in Lazarus.
With Tkinter, Python itself was the GUI-building IDE. (I have no idea why Tkinter isn't more popular. The Tcl/Tk widgets that it wraps are boss. For example, the text editor widget comes with word wrap modes, undo/redo, tags, marks, customizable selection patterns, rich text, embedded widgets, ... it's really cool.)
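(A tiny illustration of what the parent means: the Text widget gives you word wrap, a built-in undo stack, and rich-text tags out of the box. The tag name and font below are just placeholders:)

```python
import tkinter as tk

root = tk.Tk()
text = tk.Text(root, wrap="word", undo=True)   # word wrap + built-in undo/redo stack
text.pack(fill="both", expand=True)

text.insert("1.0", "Bold start, plain rest. Undo/redo are bound to the usual keys.\n")
text.tag_add("bold", "1.0", "1.10")            # tag a character range...
text.tag_configure("bold", font=("TkDefaultFont", 12, "bold"))  # ...and style it

root.mainloop()
```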
How maintainable were creations built with those old RAD tools? Because in my limited experience, problems quickly outgrow them or become nigh-incomprehensible monstrosities.
It depends what you compare it to, I guess; I think Qt fares pretty well once you get over the initial learning curve, for instance.
But these days it seems that the standard is web-based interfaces, and honestly, whatever scaling problems these "old" RAD tools have, the web has them times 20.
I was late to the webdev party; I only reluctantly started to write JS a couple of years ago, and to this day I'm still baffled by how barebones it is compared to the GUI toolkits I was used to. You have to import a trillion dependencies because the browser, despite being mostly a glorified layout engine, doesn't really support much but bare primitives.
Things like date pickers or color pickers are a recent development and are not supported everywhere.
Making range inputs is not supported by most browsers and requires a heavy dose of JavaScript and CSS to achieve, and you end up with something that won't look or feel like a native control in all browsers.
Ditto for treeviews.
Styling and customizing combo boxes is so limited and browser-dependent that you have dozens of libraries reinventing the wheel by creating completely custom controls, each with its own quirks and feature set.
There's no built-in support for translations and localization (unless you count the Accept-Language HTTP header, I suppose). On something called "the world wide web", that's pretty embarrassing and short-sighted IMO. But you do have a Bluetooth stack now, so that's nice.
Can't speak about Visual Basic, other than it apparently worked just fine for small-to-medium complexity UIs.
On the Delphi side, there was a significant difference between code produced by someone just starting out, which tended to be an unholy mess of generated code mixing UI and non-UI operations, and code from a more experienced developer who put some architectural thinking into the project (for example, separating business logic from the UI code that called into it); the latter was quite workable for bigger projects.
I suspect this is the key. The RAD tools made application developers more productive, but they came with a downside: if the tool became popular, vendor lock-in kicked in and it became expensive, or the company maintaining it stopped doing a proper job of supporting it.
Therefore, in the 90s, people became tired of this and looked for ways out of the vendor lock-in. That's why OSS started to get traction, and also Java. It turned out that it is cheaper to develop your business app in Java from scratch rather than in a commercial RAD tool and then pay through the nose to a disinterested company for maintaining the base it stands on.
So I think OSS is actually less productive and less polished than it could be (many of the RAD tools are actually really cool, but insanely expensive), but it is still a case of worse is better.
I think it's possible that some business DSLs (which are at the core of the RAD tools) will win mindshare again, but it is going to be quite difficult.
(I work in mainframes, and there are quite a few RAD tools that were pretty good in the 80s, when the mainframe was the go-to choice for large business apps.)
Agreed. And Dan is doing nothing with ANY essential complexity here. Seriously: he's copying logs and generating plots from them. That's no different from copying logs and generating plots in 1986. Only the underlying software and hardware infrastructure has advanced. Conceptually it's an identical problem. And touchingly he thinks that having a grasp of how to use sophisticated tools to achieve what is a very simple task is somehow essential complexity. It isn't. It's still just copying logs from one place to another and generating a plot from them.
And you and Brooks have been absolutely right: there was no one improvement in language design that gave a 10x boost in productivity.
I think you misunderstood. Dan agrees there is no essential complexity. On logs: "this task ... is still nearly entirely accidental complexity". On query and plot: "I think it's unbelievable that essential complexity could have been more than 1% of the complexity I had to deal with".
Brooks claimed in No Silver Bullet that a 2x programming productivity improvement is unlikely because essential complexity now dominates over accidental complexity. Dan is trying to demonstrate that this is completely untrue.
But you can't demonstrate it to be untrue by picking specifically crafted examples (which, BTW, are simple not because of any advancements Brooks was talking about in PL design and programming methodology, but due to hardware improvements and availability of OTS components). You could only demonstrate that by showing that most of the effort in software at large is of this kind, and that the improvement is due to language design and/or programming methodology.
When distilled, Brooks's claim is that the relative size of accidental complexity in programming overall is (or was) "low", i.e. less than 90%, while Dan wants to claim it is (or was) "high", i.e. more than 90%. His experiment does not establish that.
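(For anyone wondering where these thresholds come from, it's the Amdahl-style arithmetic behind Brooks's "9/10 of all effort" remark -- my framing, not a quote from Brooks or Dan: if a fraction f of total effort is accidental, then eliminating all of it gives a speedup of at most 1 / (1 - f), so a 10x gain needs f >= 0.9, and even a 2x gain needs f >= 0.5, i.e. accidental complexity at least matching essential.)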
Exactly. Developing an operating system, for example, still has the same ratio of essential to accidental complexity as it ever did, and guess what: OSs are still written in C and assembler. Has there been an order-of-magnitude improvement in productivity because of programming language design in either of those two languages since 1986? Nope.
> Our current state is a big disappointment compared to where most programmers in the 80s believed we would be by now, and very much in line with Brooks's pouring of cold water
People who weren't exposed to that era have a hard time believing it. My litmus test is this: how easy is it for any person to make a button in a given computing system? On a Mac in the late 90s this was so easy that non-programmers were doing it (HyperCard). Today, where would one even begin?
The gap in this discussion -- and what I was hoping to learn about -- is the reduction in complexity of evolving systems. This post and the original cited work are very much about one-shot waterfall development. That is not what is done today, so they have little bearing on it.
How much of the effort in fitting a new requirement into an existing system is due to choices in the way it was decomposed and structured? I'm convinced there are ways of keeping this effort low -- much better than 2x better -- by choosing seams carefully and putting in the work when some are found to be in messy places.
Another way to say this might be: any system could (for the sake of argument) be built with 2x essential complexity, but an evolved system typically has a much higher factor due to its lineage. Finding ways of keeping this near 2x should be the focus.
> In this example, let's say that we somehow had enough storage to keep the data we want to query in 1986. The next part would be to marshall on the order of 1 CPU-year worth of resources and have the query complete in minutes. As with the storage problem, this would have also been absurd in [1986], so we've run into a second piece of non-essential complexity so large that it would stop a person from 1986 from thinking of this problem at all.
If you are given a bucket to drain a lake, the enormity of the task at hand is not "accidental complexity".
OP is asserting that doing x one time or a billion times is still x, with an inherent complexity of using a bucket to (dip, draw, transfer, drain). This is true only if we willfully insist on ignoring that our 'conceptual model' (a reservoir, a bucket, and a worker) is intimately bound up with available technology and is a leaky abstraction. The actual conceptual model (move liquid from one reservoir to another) says very little about "inherent complexity". (Can you point it out?)
It seems more accurate to note that the complexity of materializing abstractions tends to increase with the scale of operations.
Our goal then, as a discipline, should be to ensure this relationship approaches linear [or better] to the extent possible. Continuing the example, the actual task (drain a lake) could materialize as an army of workers, each with a bucket, all converging on a sink to drain their buckets. Is managing the workers really an "accidental" complexity?
What the above implies is that the notion of "essential complexity" is a faulty paradigm if considered a static measure independent of the technology du jour. The essential complexity of a task is held to be inherently related to the magnitude of the task, and is thus inherently addressable by, and related to, advancements in technology.
And finally, since nearly all modern software systems are merely scaled-up ("web scale") versions of earlier, "less complex", systems, it seems reasonable to question the industry's singular obsession with languages as the remedy to what is inherently a problem of scaling well-understood tasks.
I don't think that Brooks was arguing that any given fixed programming task could not be made easier over time; clearly, even in 1986, there were programming tasks that had become trivial by use of existing code but were originally quite challenging to code on their own. I think Brooks was talking about the work of the individual programmer: given two identical programmers, with the same available libraries of existing code, is there a language or IDE that will make one 10x more productive than the other? And I think the answer is still, no.
Your added premise is an implicit point of Luu's article. Or perhaps the point is that Brooks' framing doesn't really apply when you can avoid programming altogether.
The silver bullet is open source. The ability to utilize the millions of hours of other people's work to accomplish your programming task is the game changer, and is what makes formerly complicated tasks trivial.
The essential complexity of the software I deal with is about two orders of magnitude more complicated than stuff I dealt with 20 years ago. I'm not sure that 20 years from now it will be another two orders of magnitude, but perhaps with utility computing and higher level orchestration concepts, it very well might be.
Brooks's framing does apply when you can avoid programming altogether. He specifically mentions that as his preferred solution to the problem. Rather than coding systems from scratch, in which case you have to deal with the essential complexity inherent in the problem, you can either buy software or share components between developers and avoid having to deal with the problem altogether.
I think the lines between "available library of existing code", language, and IDE are much murkier than they might appear. If one programmer is given raw assembly and the other Python, you're probably going to see at _least_ a 10x productivity bump for the Python user _even if they use no additional libraries_. Language and IDE choices impact the vocabulary available to express solutions in the same way.
You can certainly contrive a counter-example---even in 1986, you could say much the same thing about assembly and something like C or BASIC. But once that leap of abstraction was made, we seem to have hit a plateau of expressive power. Is the jump from C to Rust as empowering as the jump from assembly to C? I don't think it is.
> The essence of a software entity is a construct of interlocking concepts: data sets, relationships among data items, algorithms, and invocations of functions. This essence is abstract, in that the conceptual construct is the same under many different representations. It is nonetheless highly precise and richly detailed. I believe the hard part of building software to be the specification, design, and testing of this conceptual construct, not the labor of representing it and testing the fidelity of the representation. We still make syntax errors, to be sure; but they are fuzz compared to the conceptual errors in most systems.
I think Brooks, by focusing on how the reality of the world is reflected in the topology of systems, focuses on what is essential about the complexity. Yes, we can abstract some complexity away, but we also know these abstractions are leaky - heap too many of them on top of each other and it topples.
Even more importantly, there are limits to how much most of us can juggle in our minds - exceed that threshold and we need a team, with all the overhead that accompanies that. Once inter-human communication is part of the process, any optimization at the tool level can only improve the situation so much.
> This does not represent productivity in solving new problems, it just tells us it's worth reusing pieces for already solved problems.
That's neither here nor there.
First, because most (if not all) new problems can be solved by reusing existing parts (from languages and libs to widgets and CLI tools).
Second, because we tend to re-work the same problems over and over for the most part anyway, so improvements on those are still greatly important (even more so than novel problems, actually, which are about potential, uncertain future markets, whereas solved problems are what drive existing global markets, i.e. dozens of trillions of dollars).
Let's say you're building a home. The foundation is poured and it's time to frame it. If you hire some random guys with power tools standing outside a Home Depot, it'll take weeks to months for them to finish, and it might well end up crappy. But hire two 55-year-old men with hammers, nails, and a single Skilsaw, and they can have it up in 3-4 days. Or grab your Amish neighbors and raise a barn in a day.
Productivity is not bounded by your access to technology, it's bounded by the problem you need to solve and the means you use to solve it.
> Brooks explicitly dismisses increased computational power as something that will not improve productivity ("Well, how many MIPS can one use fruitfully?", more on this later), but both storage and CPU power (not to mention network speed and RAM) were sources of accidental complexity so large that they bounded the space of problems Brooks was able to conceive of.
Yet here in the future, our excess of storage, CPU power, and network bandwidth has bounded our ability to be fruitful with it. If you add up the MIPS we use today and compare that to the solutions we're implementing with them, we are vastly less productive today. Brooks was right, but in a weird way. There is a bound on programmer productivity, and it is the programmer.
In the past, people wouldn't have set themselves up to need to collect a billion billion logs to analyze some tiny metric among them. Anyone would've seen that as poor planning and a horrible waste of resources. Why would you bury needles in billions of haystacks, and then need to hastily build a robot that finds needles? Why not just keep the needles in a pile of needles? Do you really even need all those needles?
> Thinking that the tools you personally use are as good as it gets is an easy trap to fall into.
This actually seems like a hard trap for most to fall into. Nearly every software engineer I've met either thinks they could make a better tool, or just wants to try for the hell of it. On the other hand, most engineers think that building a better mouse trap will solve all their mouse problems.
Maybe off topic, but this site might have convinced me to move to Firefox. Are all of you reading this page on mobile or a small screen?
I have a large monitor, and this page expands to the full width of the screen, so the lines are extremely hard to follow. It turns out Firefox has a "Reader View" that reflows sites like this into a nice readable column.
I find it interesting that this post reached the top of HN with such an unreadable layout, so I was curious whether everyone has tricks for reading content like this, or is Firefox's Reader View the best way?
My gut is that the claim translates better to percentage of processing power per user. With such an excess of power today, you can scrape a ton of use out of it.
That said, the use of scripting languages kind of proves the point. There is little attempt made to maximize compute. Only gets more true when you consider the absurd amount of memory being used.
Hypothesis: constraints on time are the cause of accidental complexity. Given infinite time I would think longer, yak-shave more, rewrite everything simpler, and recur until perfect. In real life, what constrains time? The business, its competition, and available resources, traded off against the speed of human thought and communication.
The only way Brooks's argument makes sense is that we naturally tackle problems as big as the resources available to us permit. So, our daily load of programming hardly changes, because as we get better equipped, more is asked for.
But in that interpretation, none of his supporting detail applies.
Dan, I'm curious how all those people listed at the bottom read the post early. I sometimes want early feedback on my writing and you seem to be getting it from a lot of well known people. How do you do that? Do you just email them all and ask?
This essay badly misinterprets Brooks's original essay. Brooks was talking specifically about the problem of building new systems from scratch. He was speaking about programming productivity. This essay is talking about operational tasks. Of course operational tasks can be sped up with improvements to software and hardware; that's the whole point of writing new software and developing new hardware.
Furthermore, Brooks doesn't actually argue that there is no way to tackle essential complexity when writing new software. What he says is that there is no single technique that will speed up the process of writing code across the board, that is, there is no silver bullet. However, he does make allowance for techniques that allow us to break up complex systems into smaller systems which can then be solved independently. He specifically mentions object-oriented programming as a tool which makes it easier for programmers to write software as components, though CPAN would have been a better example. He just doesn't allow that OOP by itself would yield an order-of-magnitude improvement (something which I think experience has borne out).
The author of this essay points out that a couple of operational tasks which involve programming were easier for them than they would have been 34 years ago, because faster hardware now exists and because software that already takes care of a subset of the problem now exists when it didn't before. Partially, this is confirmation of Brooks's idea that breaking down problems is the way to go long-term, but more critically, it's a failure to realize on the part of the author that they're speaking of productivity in their role as the user of a computer system, whereas Brooks is speaking about productivity in designing new software products.
edit:
I think that the fundamental root of the essay's problem is that the author seems to have missed what Brooks was referring to when he discussed accidental vs essential complexity. Brooks specifically pointed out that he was referencing Aristotle when he used those terms, and in this case essential doesn't mean unavoidable, but rather part of the essential character of the problem itself. A more modern terminology would be intrinsic vs extrinsic. Complexity is extrinsic to programming if it comes from something other than the nature of the problem itself. If my computer is too slow to allow me to write a program, then that's probably not due to the nature of the program itself (except indirectly) and so represents a form of accidental complexity. Feature requirements, however, represent essential complexity. You can manage essential complexity, according to Brooks, by dividing the problem up or avoiding it altogether, but it's not something that will go away simply with better tooling or a better methodology (unless that better tooling replaces software that you otherwise would have written).
What he's saying there is that today we have free implementations of libraries which solve incidental complexity and those libraries work (nearly) everywhere (nearly) the same.
I worked in finance on K, and the name makes it impossible to use Google/Stack Overflow to find anything about the language. Which meant I had paid-for help and tech support from a company that's been largely unchanged since the 80s.
It was fucking phenomenal.
Once I gave up the idea that I knew best and started working the way the language expected me to, my productivity was an order of magnitude higher than it would have been with a cobbled-together Python pipeline like the one he uses there. And because someone was responsible for the language and tooling, I could use scripts from before I was in primary school, and they worked as expected.
I dare you to try and do the same with any language that isn't C.