Big Ball of Mud (1999) (laputan.org)
122 points by brudgers on Feb 20, 2020 | 48 comments



The essay is a classic.

> Shantytowns are squalid, sprawling slums. What is it that they are doing right?

> Shantytowns are usually built from common, inexpensive materials and simple tools. Shantytowns can be built using relatively unskilled labor.

This is a powerful metaphor, and I see it often in code.

If you've ever seen an API gateway written using the same web framework as the other services, where in order to route "/external/order" to "/internal/order", the first steps are e.g. "File | new | Controller", call it "ExternalController", make a GET method called "order", and in there code up an HTTP call to "internal/order" ... you have a shantytown API gateway.

It's terrible: it has to be manually extended for every new endpoint. But it's not a specialised skill. Anyone on the dev team can extend it by just adding more of the same.
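
For concreteness, a rough sketch of what that tends to look like, in Python/Flask purely for illustration (the framework, names and URLs here are made up, not from any particular codebase):

  from flask import Flask, Response
  import requests

  app = Flask(__name__)
  INTERNAL_BASE = "http://internal-orders.local"  # hypothetical internal service

  @app.route("/external/order")
  def external_order():
      # Hand-written proxying: GET /external/order -> GET /internal/order
      resp = requests.get(f"{INTERNAL_BASE}/internal/order", timeout=5)
      return Response(resp.content, status=resp.status_code,
                      content_type=resp.headers.get("Content-Type", "text/plain"))

  # ...and every new external endpoint gets another copy-pasted handler like it.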


The advantage comes when a new person starts. "How does this work?" Oh, here's a link to a well-explained document on this thing called "Rails". Oh, you know "Rails" already? Of course you do, that's why we hired you.

Whereas with a customized "elegant" meta-programming DRY NIH solution unique to ThisCorp: "How does this work?" "Try to figure it out yourself. Jack has left and Ryan is too busy fixing production issues to help you."

Obviously if you have a team of Haskell compiler writing geniuses YMMV.


> Whereas with a customized "elegant" meta-programming DRY NIH solution unique to ThisCorp.

In this specific case, I wouldn't recommend that either: there are plenty of products in the API gateway space that do this well, e.g. NGINX, Kong, Istio, Apigee, etc.

They're not "unique to ThisCorp" but neither are they well-known to many Ruby / Node / Java / C# coders. It's a separate, specialised knowledge.

I think that's the "relatively unskilled labor" part: i.e. "yes, there are specialised tools that are excellent at solving this problem .... but we don't know them"


Well, maybe the good thing here is that at some point someone will notice that the specific gateway code can be extracted and then refactored? Idk, there's probably a reason why they decided to do this; the most common answer I've gotten is "product requires the feature out tomorrow". How does one reconcile this problem of competing priorities that aren't actually sized right (i.e. short-term feature development versus long-term code health)?


You have to get buy-in from management and customers, and that's the hard part. They have to understand the idea of the capability trap [0][1]. A lack of maintenance invites disaster; this is true across domains. Software is cheap, though, and usually doesn't have terrible consequences (see notable exceptions like Therac-25 or the 737 MAX). Who cares if your IM client is a POS? It's not going to cost lives. And even if it is junk, we can buy a new one (MS) or build a 17th one (Google).

The way around this, and again you need buy-in from management and customers, is to bring maintenance activities into every iteration. Create systems that can detect issues and errors early (more comprehensive testing; newer testing styles like fuzzing and property-based testing; moving critical portions to statically typed languages or adding type annotations; etc.). Make addressing those issues a priority: some portion of your time in each iteration should be spent addressing concerns or increasing the ability to detect issues. Over time, when I've seen this done, it's made changes much faster to make, even large ones. But without this, even "trivial" changes can end up taking months instead of days or weeks.

[0] https://web.mit.edu/nelsonr/www/Repenning=Sterman_CMR_su01_....

[1] https://www.systemdynamics.org/assets/conferences/2017/proce...
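
To make "property-based testing" concrete, a minimal sketch using the hypothesis library; round_trip is a hypothetical stand-in for whatever serialization or parsing code you actually care about:

  import json
  from hypothesis import given, strategies as st

  def round_trip(payload):
      # Stand-in for real production code (serialize, then deserialize).
      return json.loads(json.dumps(payload))

  @given(st.dictionaries(st.text(), st.integers()))
  def test_round_trip_preserves_payload(payload):
      # hypothesis generates many random dicts and shrinks any failing
      # input down to a minimal counterexample.
      assert round_trip(payload) == payload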


Somewhat of a random question: how did you find [0] and [1]?


I don’t recall now. I’d heard of Sterman from various other readings on business and system dynamics and a recommendation from here. I can’t recall now if I found these papers from googling his name or a recommendation from my sister.

I think she’s the one that gave me the phrase “capability trap” and Google found these and other articles.


Awesome, thank you. I was just trying to figure out what the current state of your mental model on the subject is :)


> the specific gateway code can be extracted and then refactored?

In this case it's generally an attractive nuisance for business logic, aggregation, data transformation, etc., which means that after a while:

1) extracting the proxying is hard and

2) you can't understand a service without reading the code in multiple layers, since some of its business logic is in the proxy layer. Big balls of mud can be big, distributed balls of mud.


> If you've ever seen an API gateway written using the same web framework as the other services, where in order to route "/external/order" to "/internal/order", the first step steps are e.g. "File | new | Controller", call it "ExternalController", make a GET method called "order", and in there code up a http call to "internal/order" ... you have a shantytown API gateway.

That's not a shantytown. That's an architecture astronaut utopia, created according to the latest methodologies.


This is required reading for all software developers. Any successful software system will end up as a Big Ball of Mud or some other anti-architecture eventually. It's just the nature of the beast. And while we should fight the entropy, it will occur despite our efforts (there might be a few exceptions in the world, but for the most part it is inevitable).

In order for software to be successful it needs to solve a problem. In order for that software to be successful in the long term it will have to change and adapt to continue solving the problem. Architecture usually means abstraction. And abstraction is almost always a tradeoff: you sacrifice flexibility to make the common case easier. But there is the rub. A successfully abstracted architecture that solves the problem today will not be able to solve the problem tomorrow without big changes!


There's a very short chapter in one of the books I like, Working Effectively With Legacy Code, entitled "We Feel Overwhelmed. It Isn't Going To Get Any Better." The chapter offers little more than hope based on seeing turnarounds happen -- once you get some control over your code, you start developing little oases that are pleasant to work in, even if the rest is crap.

I see something similar with large software systems. A 'system' is something that can stand alone, but very large software is logically composed of many subsystems that could in theory stand alone, even if presently they never stand apart from the full system, or if their components are so intertwined that the logical subsystem is only a potential abstraction. While the full system might be a ball of mud, and you have little to no control over what Team X does with their subsystems and their relations/interconnects with the overall system, it is possible to rework your own subsystems so that they themselves aren't balls of mud, and stay that way indefinitely. (I suspect this is part of the optimism behind microservices -- followed by pessimism when people realize the full system that interconnects everything can still easily become (or start as) a ball of mud, one even less pleasant to work in than a monolith design.)


> All too many of our software systems are, architecturally, little more than shantytowns. Investment in tools and infrastructure is too often inadequate. Tools are usually primitive, and infrastructure such as libraries and frameworks, is undercapitalized. Individual portions of the system grow unchecked, and the lack of infrastructure and architecture allows problems in one part of the system to erode and pollute adjacent portions. Deadlines loom like monsoons, and architectural elegance seems unattainable.

I would argue they are more akin to failed housing projects: competent architects went to great lengths to justify their existence as a cost-effective solution to an existing problem, but this approach didn't work and they offer no better conditions to their residents than shantytowns.


This is one of my (two) favorite essays regarding software production - it deals with the evolution of software in an uncontrolled environment. I just posted my other favorite "classic" essay which deals with the evolution of a technical team in an uncontrolled environment. Perhaps others will find it equally enlightening: https://news.ycombinator.com/item?id=22376468.


Unshortened URL: https://news.ycombinator.com/item?id=22376468

"The Wetware Crisis: The Dead Sea Effect"


Don't use URL shorteners.


Thank you for sharing this article, it was a good read.


> Here’s an example of one of the scripts that generates the attendance report:

  echo "<H2>Registrations: <B>" `ls | wc -l` "</B></H2>"
  echo "<CODE>"
  echo "Authors: <B>" `grep 'Author = Yes' * | wc -l` "</B>"
  echo "<BR>"
  echo "Non-Authors: <B>" `grep 'Author = No' * | wc -l` "</B>"
  echo "<BR><BR>"
> This script is slow and inefficient, particularly as the number of registrations increases, but not least among its virtues is the fact that it works. Were the number of attendees to exceed more than around one hundred, this script would start to perform so badly as to be unusable.

I'm not entirely buying it; I remember what performance was like in 1995, and grepping a hundred or so small files wasn't any sort of big deal, even on consumer-grade equipment like an 80486 running GNU/Linux, even with multiple users hammering on that CGI server to obtain the above output.

In any case, the logic could easily be rearranged so that the output of exactly the same code as above is cached as static HTML, which is regenerated only when the set of registration files is altered.

So if registrations exploded and the machine got bogged down with multiple instances of these slow scripts, there would be no need to change the actual code, just to plug it into the overall solution in a different way.
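
For illustration, a rough sketch of that rearrangement (shown in Python rather than the original shell; the directory and script names are hypothetical):

  import os
  import subprocess

  REG_DIR = "registrations"   # hypothetical directory of registration files
  CACHE = "report.html"       # cached static output

  def stale():
      # Regenerate only if any registration file is newer than the cache.
      if not os.path.exists(CACHE):
          return True
      cache_mtime = os.path.getmtime(CACHE)
      return any(os.path.getmtime(os.path.join(REG_DIR, f)) > cache_mtime
                 for f in os.listdir(REG_DIR))

  def report():
      if stale():
          # Run the original (slow) script unchanged and cache its output.
          html = subprocess.run(["sh", "report.sh"], capture_output=True,
                                text=True, check=True).stdout
          with open(CACHE, "w") as fh:
              fh.write(html)
      with open(CACHE) as fh:
          return fh.read()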



Some of these essays (probably including a few of PG's) should be automatically put on the front page every two years or so! It looks like this (excellent) article averages about that.


IMHO containerization exists because of this "pattern." It's a way to just take big balls of crap, roll them up along with the entire host OS image, and deploy them as a unit. Problem "solved."

Of course now you have microservice architectures where you're basically creating programs made of containerized services. Any bets on how long it'll take to have containers-of-containers be a thing?

Maybe this could be the "big box of crap" pattern.


Yeah, software gets bigger as more requirements are thrown at it, no matter how it's done. Eventually you just end up with these balls of crap.

Frankly, I think this pattern describes most programming constructs as well. After all, what's a good approach for managing largeness (or complexity)? Dividing it into smaller, simpler, understandable chunks that can be changed in isolation, which is what containers attempt to do, and what many programming constructs attempt to do.

But you're right, it'll just grow more, and things will likely just get divided again into more chunks. I don't see a way out of this process.

In fact, I feel this describes a general law, as it describes everything large in general. Take companies, for example. Large companies are less innovative because they're more complex and harder to change, just like large software. Additionally, large companies tend to containerize things into divisions, departments, teams, and so on, in an attempt to manage complexity.

Bigness is the antithesis of change/innovation...


> Of course now you have microservice architectures where you're basically creating programs made of containerized services. Any bets on how long it'll take to have containers-of-containers be a thing?

As soon as Kubernetes gets mature enough to have to make backwards-incompatible changes and keep old code running, we'll have to have a way of deploying clusters that were defined in the old versions. So probably a few years?


Putting the application in a container doesn't make it easier to make safe, controlled changes to the internal workings of the application. It doesn't help developers understand the theory (if any) underlying the application code, which might help them understand how to code with the grain when making changes.

Stuffing it into a container to deploy it is orthogonal. It doesn't solve many problems.


You're both right.

I have worked professionally in a company that is a giant, sprawling software slum at best. Containers are huge there because they are perceived as fixing this issue even if they just make it worse in practice.


I see containers, at least when used this way, as a consolidation loan for technical debt. They let you consolidate your existing debt load embodied in all your messy uninstallable ball of crap software and then max out your credit again.

One of the issues here is the lost art of writing installable and well-organized software. It's quite normal these days to have services and apps that consist of multiple, often redundant services and sub-applications wired together haphazardly, with crap strewn all over the system.


It may sound contrarian, but I was very, very recently looking at a project that uses three or four different base containers because the people involved can't agree to use the same flavor of linux.

It's more like they are doing the technical equivalent of credit card churning but never closing out the accounts. I have never seen anything like this, and I have, on and off in my career, worked in roles best described as "software superfund decon expert."


I'm currently refactoring a legacy codebase which has many of the attributes of big ball of mud. It seems to have started with a misguided amateur attempt at an architecture, which turned into "the most complicated thing which could possibly work", then imploded. I've been in the "trench warfare" stage ever since I inherited it after the original developers had all gone.


As a counterpoint: I've been there, and thought the same, but it was my lack of experience that made me think that way. The codebase was written the way it was for a reason, and I was just not experienced enough to understand its reasoning.


I think the earlier (1970s) and non-pejorative use of this term "big ball of mud" (e.g., as applied to LISP compared to APL) is also insightful.

To wit: if what you have is a big ball of mud, anything you add to it becomes mud too. At least without heroic effort, this seems to be true. It's not necessarily a bad thing, but it is worth recognizing in systems thinking.


This always feels so unfortunate. It's not that there isn't validity in the thesis; there is. It's just that I've worked with people who reason, "It's inevitable, so why fight it? Embrace the ball and hack away", as an excuse not to do good work.


I have the greatest sympathy for discovering you have a Big Ball of Mud. It may have even been polished by years of grazing touches by many hands into a dorodango, achieving a kind of beauty. Someone, perhaps many someones, spent a lot of time on it and this is your inheritance. My condolences.

Still, even the dorodango is fragile, prone to cracking, sensitive to many conditions. Inheriting one is a situation that can happen to any programmer. What the organization decides to do once they have a Big Ball of Mud on their hands is the real issue.

I consider the reaction to this eventuality to be a real test of organizational mettle.


Maintaining a system at low entropy always means injecting energy from the outside, because the entropy of a closed (adiabatic) system always increases.

Sometimes the energy cost (effort) can be higher than the benefit, or the opportunity cost of having a mud/noise system can be negligible. As with other human constructions, sometimes "best" is not better; it depends on the metric.


Also: The Rise of "Worse is Better" http://dreamsongs.com/RiseOfWorseIsBetter.html

featuring the MIT vs New Jersey approaches.


I hate this essay so much.

Wherein Dick Gabriel rolls up his sleeves to get into the language wars, calls languages more popular than his favourite "viruses", and says they are "worse" by definition, without proper justification. It's catnip for those around here who want to ride a technology to career success through early adoption and then put on their language-warrior garb to hunt the heretics who dare question it. Old greybeards have been doing this since the time of the dinosaurs. And it sucked then too.

Sensible argument and discussion are so much more productive, interesting, enjoyable and educational than language-war flame-fests, of which this is one.


What the hell is with that image in the intro, with the smiling waiter?


Waiter delivers a shovel of spaghetti [code], client happy.


Yeah, I get that much (sorry, I should have said so); I'm trying to understand the real-life context.


So the image is actually a link, but the page 404s. Putting it into the Wayback machine results in 63 snapshots, only the oldest of which (From 1998! It was gone in 2000!) has content: http://web.archive.org/web/19981206071543/http://oak.cats.oh...

No images though. However, I noticed down at the bottom this text:

> All Beatles Images (c) Apple Corps Ltd. Page Design (c) 1997 jwinterprises

Well... off to google?

"Beatles shovel spaghetti" brings up multiple copies of this image, including an explanation: https://biteswiththebeatles.wordpress.com/2017/09/27/aunt-je...

> If you have ever owned a vinyl copy (or certain CD copies) of Magical Mystery Tour, you’ll know that it comes with a 28-page booklet containing song lyrics, drawings by cartoonist Bob Gibson, and scenes from the Magical Mystery Tour movie.

> [..]

> Yes, you looked at that correctly. That is John Lennon, dressed as an Italian waiter, literally shoveling buckets of spaghetti on this woman’s plate. You’re probably gonna want some context…

(Edit: and if I'd only scrolled further, I'd have seen that someone else already posted that link)


I suppose the same context as serving a drink in a jar.


The important lesson being to do it with a big smile!


Funnily enough, I recognize this image from The Beatles' "Magical Mystery Tour" soundtrack liner notes (the image itself is from the movie). Why it's included in this article is lost on me.



The waiter is John Lennon. TIL.


Ok thanks, that's more what I was after :)


It's from The Meaning of Life by Monty Python.


You're, perhaps understandably, thinking of Mr Creosote - https://en.wikipedia.org/wiki/Mr_Creosote


The founder of the first company I worked for used to say "You are only qualified to design software that you have already written" which seems to apply to this as well.




