Beware of the Turing tar-pit (2004) (raganwald.com)
67 points by sph 4 months ago | 27 comments

In my experience one of the most difficult things to get a team of software engineers to realize is that you are just trying to move some bits from one place to another. Folks come up with these elaborate constructions that create all kinds of problems in order to solve problems that are fundamentally not that complicated. "I want to make some graphs. I need a generalized data transformation pipeline." No you don't. You need to ingest the data you have, and you need to draw a picture. "I want to automate hardware tests. I need a platform that provides pluggable instrument interfaces, a data stream abstraction, and analysis tools." No, you need a Python script that creates a CSV file. "I want to serve a web page. I need..." ...well, you get the idea.

Solve the problem. Don't make new ones.
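For the hardware-test case, a minimal sketch of "a Python script that creates a CSV file" might look like the following. The `read_voltage` function, the channel numbers, the limits, and the column names are all hypothetical stand-ins for whatever instrument is actually under test.

```python
import csv
import random


def read_voltage(channel: int) -> float:
    """Hypothetical stand-in for a real instrument read (returns fake data)."""
    return round(random.uniform(4.9, 5.1), 3)


def run_test(out, channels=(0, 1, 2), limits=(4.75, 5.25)):
    """Measure each channel once and write one CSV row per measurement."""
    writer = csv.writer(out)
    writer.writerow(["channel", "volts", "passed"])
    for ch in channels:
        volts = read_voltage(ch)
        writer.writerow([ch, volts, limits[0] <= volts <= limits[1]])


# Typical use: write the results straight to a file.
#   with open("results.csv", "w", newline="") as f:
#       run_test(f)
```

Passing the output stream in, rather than hard-coding a path, is the only concession to flexibility here, and it mostly exists so the function can be checked without touching the disk.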

> You need to ingest the data you have, and you need to draw a picture.

Works well for one chart from one dataset. But you want more than one chart, and pretty soon you find yourself repeating code, so you create functions, and you start thinking about data bindings and before you know it you’ve invented a generalized pipeline (or a buggy mess, depending on how much forethought you put into it).

Good. Now that you know the real requirements - the ones derived from the real use cases, not the ones you were imagining - go find a library that implements them best.

If the pluggable stream abstraction platform already exists, it's probably easier and more bulletproof to use that. It's probably evolved over time for similar use cases and you can find discussion on how well it works.

If you're thinking of writing it yourself, then you will probably need six months and a whole maintenance team, and who knows how well it's actually going to work or whether it is really needed or useful.

I may be an outlier here, but I'm always pushing back against generalization. When someone reviews a PR and asks me to extract an interface (in case we need to change this step?) I find it really annoying. I know some people think that way first, and it's not really the point of the article, but still.

I agree that some people seem drawn towards abstraction, while others are drawn to specialization. (Also in learning: in PL class, I needed to see code samples and infer the rules from there, while others needed to see the rules and could produce code samples from that.)

I think that it is good to view generalization as a tool. Like everything in software engineering, it is not an exact science and is influenced by many factors. But for me a guiding rule is: "When it is really clear that something will change in the future, then it makes sense to generalize. It is clear enough when you would be willing to bet one week of your salary that the need for generalization arises within X time." (where X is something up to a year).

Mainly because this communicates intent to future code readers: "Beware, this thing will likely not be alone soon".

It's a trap (especially early in a project where a lot of things are simplified placeholders) to prematurely combine previously decoupled parts of a system due to some superficial similarities. "Hey, so our 'customer' object and our 'stock_item' object both have ID, name, comment, and location code, we should have a 'handled_entity' table to store both of them." No, bad!

The response for an “in case” is premature abstraction.

Unused abstraction makes reasoning about the codebase harder.

I have found that early/premature abstraction generally has the opposite of the intended effect. Rather than making the system open to extension and flexible, it pours code concrete around the concrete classes and the interfaces. Invariably, the interface ends up with every public method of the only implementing class. And methods taking "SuperInterface" which only use a subinterface of that interface can't be untangled without a ton of effort.

Meanwhile, if you can figure out small reasonable interfaces with potentially multiple implementers, then all of a sudden things get a lot easier to deal with.
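As a rough sketch of that difference in Python, using `typing.Protocol`: a function that only reads asks for a one-method interface, so any class with a matching `read()` fits, instead of a "SuperInterface" that mirrors every public method of one class. All names here (`Readable`, `ReportStore`, `InMemoryText`, `count_lines`) are invented for illustration.

```python
from typing import Protocol


class Readable(Protocol):
    """Narrow interface: the only capability count_lines needs."""

    def read(self) -> str: ...


class ReportStore:
    """Hypothetical concrete class with a wider public surface."""

    def __init__(self, text: str) -> None:
        self._text = text

    def read(self) -> str:
        return self._text

    def write(self, text: str) -> None:
        self._text = text

    def archive(self) -> None:
        pass  # readers never call this, so it stays out of Readable


class InMemoryText:
    """A second, unrelated implementer; no inheritance required."""

    def __init__(self, text: str) -> None:
        self._text = text

    def read(self) -> str:
        return self._text


def count_lines(source: Readable) -> int:
    """Depends on read() only, so either class above can be passed in."""
    return len(source.read().splitlines())
```

Because `count_lines` never mentions `write` or `archive`, changing or removing those methods cannot ripple into it.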

And they're expensive. You pay thrice:

1. Engineering the generalized solution, then

2. Carrying the cognitive burden of this new indirection layer, and finally

3. Ripping out this premature -- and so likely wrong -- abstraction

Premature abstractions are like alcohol: fun borrowed from the future, and paid back later with interest. If unavoidable, choose the abstraction most narrowly scoped to your actual code base, not a future one. Humans suck at prediction. Ask me how I know.

I like the comparison with alcohol, and on a tangent from that... I wonder if the 'Ballmer Peak' is partly explicable by being just drunk enough to avoid this kind of abstraction procrastination.

My personal rule is to rewrite specific solutions many times (all similar to each other) until I have "learned enough" to start thinking about an appropriate generalization (versus generalizing from the beginning).

“No premature generalization.”

You have the right attitude and I really hope you're not an outlier.

As pushback, ask them to raise an issue requesting the interface. In the meantime, fix the immediate problem only.

There are countless overused abstractions and sources of accidental complexity. Interfaces though are one of my favorite tools, in most languages.

The primary function of an interface is of course to swap out implementations. The first and most common use-case is generally for tests. You can do it other ways too, but they are usually more invasive.

That said, it’s easy to add interfaces later (which is one reason it’s a good abstraction). I would still defer it until needed.
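The test use case could be sketched like this in Python, assuming a hypothetical `Clock` interface with one real and one fake implementation. None of these names come from a particular library; they only illustrate swapping an implementation without patching anything.

```python
import time
from typing import Protocol


class Clock(Protocol):
    """Hypothetical interface: anything that can report the current time."""

    def now(self) -> float: ...


class SystemClock:
    """Production implementation backed by the real wall clock."""

    def now(self) -> float:
        return time.time()


class FixedClock:
    """Test implementation that always reports the same instant."""

    def __init__(self, instant: float) -> None:
        self._instant = instant

    def now(self) -> float:
        return self._instant


def is_expired(deadline: float, clock: Clock) -> bool:
    """True once the clock has reached or passed the deadline."""
    return clock.now() >= deadline
```

In a test, `is_expired(10.0, FixedClock(11.0))` gives a deterministic answer, while production code passes `SystemClock()`.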

My rule of thumb: If an interface only has a single implementation, it is likely superfluous.

Yeah, unnecessary abstractions are just bloat. Generalizations should be introduced when necessary not when possible.

I don’t think your opinion is an outlier among people who have experience in maintaining software. But a lot of teaching makes it seem as if abstraction is a goal in itself.

A driver for unnecessary abstraction is the idea that “it will be more difficult to abstract later”, without realizing that over-designing is a cause of this problem in the first place.

> Keep Alan Kay’s words in mind: “We aim to make simple things simple and complex things possible.” When solving the general problem makes complex things possible, that’s good. But it shouldn’t be at the expense of making simple things simple.

Great quote.

I think this is really good advice. I've always tried to teach this to my junior engineers from a perspective of "we will need to maintain this for years...".

> I've always tried to teach this to my junior engineers from a perspective of "we will need to maintain this for years..."

IMO that kind of long-term phrasing can sometimes backfire, because then the enthusiastic junior engineer will go: "Ah! I'll impress everyone by making the most flexible configurable modular thingy!"

In other words, what TFA cautions against:

> The danger of the tar-pit is that instead of developing a solution to a problem, you develop a tool for solving problems. [...] situated right in the centre of some of the most attractive real-estate in your imagination. Tools that solve whole classes of problems in generic ways offer the potential for vast improvements in productivity.

Or as I tell people, if you die, I'll have to maintain this, so I want to make sure I know what it's doing and why.

Bring my Yahoo Pipes back!

Simple is robust.

"With XML you can do anything!"

I'm still undecided on whether XML is brilliant or a tarpit.

Xsltproc segfaults on large files which is a bit dubious, and webassembly is refusing to run parts of libxml2 on the grounds that memory accesses go out of bounds. Browsers seem to have abandoned it.

I do want an extensible tree notation though and it does have a lot of existing tooling. Some of it quite good. Am I stuck in the mires of sunk cost, or am I wisely not implementing a DIY tree transform? Hard to see from here.

For everything else, there's Perl.

