
The Wrong Abstraction (2016) - LopRabbit
https://www.sandimetz.com/blog/2016/1/20/the-wrong-abstraction
======
gadrev
I found the following comment very insightful in a past discussion:

[https://news.ycombinator.com/item?id=11042400](https://news.ycombinator.com/item?id=11042400)

I reproduce the relevant part:

 _Dependencies (coupling) is an important concern to address, but it's only 1
of 4 criteria that I consider and it's not the most important one. I try to
optimize my code around reducing state, coupling, complexity and code, in that
order. I'm willing to add increased coupling if it makes my code more
stateless. I'm willing to make it more complex if it reduces coupling. And I'm
willing to duplicate code if it makes the code less complex. Only if it
doesn't increase state, coupling or complexity do I dedup code._

State > Coupling > Complexity > Duplication. I find that to be a very sensible
ordering of concerns to keep in mind when addressing any of those.

~~~
amelius
There is a quote by Linus Torvalds that is relevant here:

"Bad programmers worry about the code. Good programmers worry about data
structures and their relationships."

~~~
bch
"Show me your flowchart and conceal your tables, and I shall continue to be
mystified. Show me your tables, and I won't usually need your flowcharts;
they’ll be obvious." \-- Fred Brooks, The Mythical Man Month (1975)

~~~
andai
I looked up the full text of the book, but couldn't figure out what tables
mean in this context.

~~~
wazoox
The quote usually comes with "data" instead of "tables" and "algorithms"
instead of "flowcharts".

------
shoo
Yup.

On the one hand, often there can be shared lines of code without a shared
idea, this shouldn't be a candidate for being factored out into some new
abstraction.

On the other hand, you _may_ want to introduce an abstraction and factor out
code into a common library / dependency / framework when there's a shared
well-defined concern/responsibility.

That said, on the gripping hand, I say _may_ because even if there's the
opportunity to introduce a clean future-proof abstraction, introducing it may
be at the cost of coupling two things that were not previously coupled. If
you've got very good automated test coverage and CI for the new common
dependency and its impact upon the places it is consumed, then perhaps this is
fine. If the new common dependency is consumed by different projects with
different pressures for change / different rates of change / different levels
of software quality then factoring out a common idea that then introduces
coupling may cause more harm than good.

~~~
kolpa
People often forget "copy-on-write". Coupling doesn't have to be permanent. If
you refactor to create a shared component, and then you want to modify that
shared component to help one client, you can fork it -- it's no worse than
simply not having created the shared component in the first place.

~~~
Buttons840
In my experience people will most likely just hack the shared component by
adding awful arbitrary if-statements or other such hacks, rather than fork the
shared component. This is the path of least resistance. Once this happens a
few times that shared component begins to be seen as a central component and
is quite a complicated mess.
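
A hypothetical illustration of that drift, sketched in Python (every name here
is invented for the example, nothing comes from a real code base): a shared
helper that has picked up one flag per client, next to the small forks it could
have been instead.

```python
# Hypothetical shared helper after a few "just add a flag" changes.
def format_name(first, last, *, legal=False, shout=False, initials_only=False):
    if initials_only:
        return f"{first[0]}.{last[0]}."
    name = f"{last}, {first}" if legal else f"{first} {last}"
    return name.upper() if shout else name


# The same behaviours as separate forks, one per client, with no flags to combine.
def display_name(first, last):
    return f"{first} {last}"


def legal_name(first, last):
    return f"{last}, {first}"


def badge_initials(first, last):
    return f"{first[0]}.{last[0]}."
```

The flagged version has eight flag combinations to reason about, and some of
them (initials plus shouting, say) were probably never intended by anyone.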

~~~
marcosdumay
Well, after enough settings are added, take a look at your components, and
define a clearer 2.0 version of them.

When systems need to use newer functionality, port them to the new components.

I've had a mixed experience with this, but at some point you get the API
right, and then it works.

------
zellyn
The Go community has always embraced this. On the Go Proverbs page[1], it's
expressed as “A little copying is better than a little dependency” — true,
it's not precisely the same idea, but close.

[1] [https://go-proverbs.github.io/](https://go-proverbs.github.io/)

~~~
brianpan
Hmmm...I feel like these ideas can be refactored into one idea. BRB

------
femto113
I experienced a crystal clear example of this about 15 years ago. I developed
a prototype of a Perl script that read data from one folder, filtered and
enhanced it, and wrote it out to another folder, where it was picked up by
another process. The script was configured by another perl file that contained
the paths to read and write from. The initial (pre-production) deployment had
the two folders adjacent, so the config script looked something like

    
    
        $input_folder = "/some/annoyingly/long/path/my-cool-project/input/";
        $output_folder = "/some/annoyingly/long/path/my-cool-project/output/";
    

At some point the script was handed off to some other dev who looked at those
paths and apparently thought "that's not DRY!", and changed the code so that
the config file just had

    
    
        $project_folder = "/some/annoyingly/long/path/my-cool-project/";
    

and actually appended the "input" and "output" in code when needed (fairly
elegantly leveraging some existing keys that already defined those two
strings).

The problem was that when I developed the script the actual consumer hadn't
been finalized, so that output folder path was just a placeholder. When it
comes time to deploy we get the actual path which is now some NFS thing like

    
    
        /mnt/other-services-host/other-service/input/
    

At this point I naively go to the config to update my $output_folder variable
and discover the code changes made by the other dev, which have made it
impossible to separate the two folders, and because of the "elegant reuse of
existing keys" made it a huge pain to even change the code back, since the
assumption that the last segment of the path had to match the intended use was
deeply baked in. At this point I think I just started swearing for a week.

~~~
DerSaidin
Too much abstraction?... Or not enough abstraction?

    
    
      $project_folder = "/whatever";
      $input_folder = $project_folder . "/input";
      $output_folder = $project_folder . "/output";

~~~
aserafini
This does not solve the problem. The input and output folders had different
roots in the production app.

~~~
dwaltrip
However, that abstraction is very localized and thus easy to remove (once the
new understanding has been gained), so I'd say it is better.
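
One way to read this sub-thread, sketched in Python rather than the original
Perl: the shared root only supplies defaults, and each folder stays
independently overridable. The `Config` class and its field names are
assumptions made for this sketch; the paths are the placeholders quoted above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Config:
    project_folder: str
    # Either folder can be set explicitly; the shared root is only a default.
    input_folder: Optional[str] = None
    output_folder: Optional[str] = None

    def __post_init__(self):
        if self.input_folder is None:
            self.input_folder = self.project_folder + "/input"
        if self.output_folder is None:
            self.output_folder = self.project_folder + "/output"


# Pre-production: both folders sit under the project root.
dev = Config("/whatever")

# Production: the output folder lives on a different mount entirely.
prod = Config(
    "/whatever",
    output_folder="/mnt/other-services-host/other-service/input/",
)
```

The derivation lives in one place, so dropping it later means deleting two
`if` blocks rather than untangling every call site.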

------
ereyes01
Dijkstra also beautifully sums up this concept in his 1972 ACM Turing Lecture
[1]:

> The purpose of abstraction is not to be vague, but to create a new semantic
> level in which one can be absolutely precise

[1]
[https://www.cs.utexas.edu/~EWD/transcriptions/EWD03xx/EWD340...](https://www.cs.utexas.edu/~EWD/transcriptions/EWD03xx/EWD340.html)

------
jrvarela56
Just copy-paste code and add a comment pointing to the source (starting with
TODO helps you find those later).

It will become obvious which duplicated code to abstract when you find
yourself changing all/many at the same time or fearing you’ll forget/break
something if you don’t change all instances. Writing tests is also a good
motivator as it means even more code per duplication (and reduces the fear of
breaking something).

It really takes a lot of duplication for this to get out of hand. Wait till it
happens; you’re a software engineer, you’ll figure out how to get rid of
duplicated code just fine. Coming up with a great abstraction is extremely
difficult before seeing at least a few examples.

------
cpeterso
I recall when ESR tried to school Linus Torvalds about the "curse of the
gifted student" and that Linux would collapse under the complexity of device
driver code duplication. Linus didn't care because he was more concerned about
introducing code dependencies that would block developers and make them less
productive than some duplicated code in drivers maintained by different
people.

    
    
      Date:	Tue, 22 Aug 2000 16:00:52 -0400
      From:	"Eric S. Raymond" <esr@thyrsus.com>
      To:	Linus Torvalds <torvalds@transmeta.com>
    
      Linus Torvalds <torvalds@transmeta.com>:
      > 
      > On Tue, 22 Aug 2000, Eric S. Raymond wrote:
      > >
      > > Linus Torvalds <torvalds@transmeta.com>:
      > > > But the "common code helps" thing is WRONG. Face it. It can hurt. A lot.
      > > > And people shouldn't think it is the God of CS.
      > > 
      > > I think you're mistaken about this. 
      > 
      > I'll give you a rule of thumb, and I can back it up with historical fact.
      > You can back up yours with nada.
    
      Yes, if twenty-seven years of engineering experience with complex
      software in over fourteen languages and across a dozen operating
      systems at every level from kernel out to applications is nada :-).
      Now you listen to grandpa for a few minutes.  He may be an old fart,
      but he was programming when you were in diapers and he's learned a few
      tricks...
    

More here:

[http://lwn.net/2000/0824/a/esr-sharing.php3](http://lwn.net/2000/0824/a/esr-sharing.php3)

[https://news.ycombinator.com/item?id=11077799](https://news.ycombinator.com/item?id=11077799)
(2016 discussion)

------
cjdell
I still regularly fall for this trap. My instinct to "DRY" is so ingrained
into me that I always find myself deduping similar looking blocks of code. I
think this culture is a knee jerk reaction to dealing with code bases on the
opposite end of the spectrum where entire classes are "copy and pasted" with
only a single change. I've had the misfortune of dealing with these kinds of
projects.

I now try to find the middle ground by remembering to "do the simple thing"
even if it appears less elegant. This makes it easier to refactor in the
future (if required) at which point more information will be available to
design a more appropriate abstraction than would have been possible before.

~~~
PDoyle
However, let's not go too far: I think DRY is a good default for new
programmers. They can learn to break that rule as they gain experience.

------
gregwebs
Removing duplication != abstraction.

Many copy & paste scenarios can be avoided without creating any meaningful
abstraction. Generally this is best done with a stateless (pure) function that
has few if any conditionals and does not involve design patterns such as
inheritance, overriding, or even creating new compositions. It should feel
easy and boring when you are doing this.
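
A minimal sketch of that "easy and boring" kind of deduplication, in Python.
The `clamp` helper and the two call sites are invented for illustration: a
stateless function with no inheritance and nothing to configure.

```python
def clamp(value, low, high):
    """Return value limited to the inclusive range [low, high]."""
    return max(low, min(value, high))


# Two call sites that previously each spelled out the same min/max expression.
volume = clamp(130, 0, 100)
brightness = clamp(-5, 0, 255)
```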

Abstraction is a new representation that calls for deep thought and I agree is
easy to mess up.

The need for DRY also depends on type-safety. Type-safe boilerplate generally
adds verbosity, but few bugs. This article is coming from the Ruby world where
a bug lurks behind every untested line of code: this can create a lot of
pressure to make boilerplate free code which can end up turning into
abstractions. But every code change (without code coverage) in dynamic
languages is also an enormous risk. In type-safe languages the compiler can
help ensure that the process of removing duplication is correct.

~~~
alexandercrohde
I agree. I suspect the author is coming from a purely-oop background, where
making a subclass/trait is a very costly and clumsy way to reuse logic.

In a functional paradigm, even the most trivial of abstraction pays off
handsomely. This is exactly what "map" is for example.

~~~
spion
Trivial doesn't mean it wasn't well thought out. On the contrary, the more
trivial the abstraction, the more likely it's more general, since it has fewer
constraints.

------
anonytrary
> When dealing with the wrong abstraction, the fastest way forward is back.
> ... Re-introduce duplication by inlining the abstracted code back into every
> caller.

This can be tricky if the person who wrote the abstraction takes it the wrong
way. At my previous company, I've been yelled at for doing this. Some
developers get emotional about their code, in which case undoing their
abstraction causes offense. How do you get around this?

~~~
hobs
I find that the best approach is to talk to them first, explain your use case,
ask them how to solve the problem.

Most of the time you will either be welcomed for your proper obsequence, or
find out that there's a jealous guardian of that code and no matter what you
do it won't alleviate that tension anyway.

~~~
PDoyle
Yep, always talk to the person who made a WTF-inducing change. It's even
possible they've thought of something you didn't.

------
shados
What I go to war against lately is abstractions that are only there to save
people some typing. I strongly believe abstractions are there to abstract
-complexity-, not to save you some typing (if the bottleneck doing your job is
your typing speed, ask your boss to give you harder problems, hehe).

Being able to evolve 2 pieces of code separately is powerful, and
realistically is a more common case than wanting a change to pop up in 2
different places.

------
shaan7
There's another angle to this which I think is quite important. 4 years back,
I joined a team where the tech lead would always warn developers about the
abstraction problem described here.

Very regularly I'd hear "Code duplication is fine, do not use an abstraction
here". What he meant was "In this case the abstraction might be incorrect. Use
abstractions only when they actually relate to business logic, not just
because two pieces of code happen to look the same". Unfortunately, while that
was very obvious to him, to new developers to the team (like me) it sounded
like "Do NOT use abstractions, they are evil". Over the years I developed a
habit of never thinking about abstractions because they are evil. I duplicated
code that should have been abstracted and today we pay the maintenance cost
for that.

tl;dr: Experienced folks, be careful when you caution your peers against
abstractions. Be very explicit and assertive that they _CAN_ be used correctly
and one shouldn't avoid them.

~~~
fold_left
This is good advice. I've come to the conclusion that, when suggesting almost
anything to do with Development, I really need to be at pains to prevent it
being received that the answer is _always_ either Black or White at _all_
times.

It's hard to get across that the answer is usually one of the Greys and, even
then, the shade will probably vary a little from time to time.

Rules are the children of Principles, they're important handrails as you're
learning but to progress from there you have to understand the Principles
behind them and how they confirm or contradict each other.

------
ChrisCinelli
When in doubt, I use this rule of thumb:

\- It is ok to have the code duplicate twice. Add a comment to track the
conscious decision.

\- But when I find myself doing it a 3rd time, it is time to think if I can
factor it out.

Three use cases tell me more than two about how things can be abstracted and
if it makes sense.

~~~
abakus
But the whole point of this discussion is that you should probably duplicate
as many times as needed if it helps to avoid wrong abstraction.

~~~
PDoyle
No. The point of the article is that it's ok to undo an abstraction that has
done more harm than good. Abstraction and DRY is still a good default mode of
thinking.

------
Animats
The question is whether updates can be made against both copies of the
duplicate, requiring synchronization. It's the difference between a replicated
slave copy of a database and a real distributed one. The first is easy. The
second is very hard.

~~~
gregmac
I'd argue if you find yourself needing to do this, it actually might be a hint
that abstracting is appropriate, as it's proof there is more than superficial
commonality\*. Like so many things, I don't think there's a way to make any
hard and fast rules, and figuring this out is more art/experience/taste than
science.

\* Not including cross-cutting concerns that modify all usages (eg, changing
your logging or dependency injection library).

------
strait
The "wrong" (less than optimal) abstraction might be better as long as it's
rigorously documented, which is rare, of course. In the right hands, it can be
an important stepping stone to a better abstraction. I'm dreaming already.

Duplication has its own set of dangers leading down the road to a verbose
mess of convoluted crap code in most cases.

Most established code bases make me want to puke before long.

~~~
hinkley
The older I get the more I appreciate XP. The Rule of Three in particular
becomes a bigger presence in my life as time goes on.

With two copies of the code you can’t be sure if the similarities are factual
or coincidental. At three the situation begins to crystallize quite rapidly.

~~~
maxekman
Couldn’t agree more!

[https://en.m.wikipedia.org/wiki/Rule_of_three_(computer_prog...](https://en.m.wikipedia.org/wiki/Rule_of_three_\(computer_programming\))

------
jimmaswell
After all the hemming and hawing about this, I'd really like to see a real
case study on it. It's never been an issue for me. I've had functions where I
added parameters later on for new use cases to fit the function to them, yes,
but I never felt like I was suffering for it. Maybe DRY isn't so bad after
all.

------
gdulli
I agree with this so much. The way I've put it before is, the difference
between under-engineering and over-engineering is that you can fix under-
engineering.

~~~
contravariant
Personally I don't think that's entirely true. Both under- and over-
engineering can be hard to fix.

What definitely does make things easier is simply having less code to fix.
Although measuring the 'amount' of code is at least somewhat subjective.

~~~
targafarian
Yes! I think what I've come to is: Spend more time thinking about how to do
something more concisely, rather than spending time thinking of how to
abstract the code. Sometimes the former leads to the latter. But less code--or
rather more concise code (not to be misrepresented as "fewer characters" or
"everything in one line"!)--is almost always better for maintaining the code
into the future.

------
BjoernKW
I prefer disposable code over reusable code [0], i.e. code that's easy to
delete instead of easy to extend [1].

The nature of software development is change. Extensive use of abstractions
can make your code base rigid and averse to change.

[0] [https://bjoernkw.com/2016/11/06/writing-disposable-code-not-...](https://bjoernkw.com/2016/11/06/writing-disposable-code-not-reusable-code/)

[1] [https://programmingisterrible.com/post/139222674273/write-co...](https://programmingisterrible.com/post/139222674273/write-code-that-is-easy-to-delete-not-easy-to)

~~~
swah
I love that post from @tef_ebooks. Actually I come to HN/Reddit every day to
find other posts like that one, but it only happens once or twice a year.

------
ninjakeyboard
Eric Evans maybe said it best: code should be rewritten again and again as new
insights come from the domain, to get closer and closer to the real thing.
Every time we write the code we gain new insight, so the original should
essentially be rewritten closer to the real thing with these new insights.
I've been taking this approach coupled with DDD, where the aggregate root's
pure domain logic is relatively easy to actually discard and recreate.

I can't find the snippet but it's somewhere in the "DDD is not for
perfectionists" vein

------
makecheck
While unexpected features can definitely complicate something that was merged
into an abstraction, bug “fixes” also matter and they can be worse:

\- Programmers may look at a bug in a simple shared function and conclude that
it “obviously” should be fixed, and do so quickly without really understanding
what else could go wrong. (As a completely contrived example: You “fix”
something that previously couldn’t return a negative value, and move on; turns
out this “fix” allows a bug somewhere else to crop up, a catastrophic improper
cast from signed to unsigned, blowing up your -1 into an iteration over
billions of expensive operations.)

\- Bug priority levels vary between features, even if code is shared. Your
abstraction may make it effectively impossible to fix just _one_ high-priority
feature, if your deployment is (say) set up to run hours or days of regression
tests on all affected parts. Generally, the more segregated things _actually_
are, the easier it is to set priorities well.
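
The contrived signed-to-unsigned example in the first bullet can be worked
through numerically. This Python sketch imitates a 32-bit unsigned cast with a
bit mask; the function names and the 32-bit width are assumptions made for
illustration only.

```python
def remaining_slots(capacity, used):
    # After the hypothetical "fix" this can go negative (it used to stop at 0).
    return capacity - used


def as_uint32(value):
    # Mimics what a cast from a signed to an unsigned 32-bit integer does in C.
    return value & 0xFFFFFFFF


slots = remaining_slots(3, 4)   # -1 after the "fix"
iterations = as_uint32(slots)   # 4294967295 once reinterpreted as unsigned
```

A caller that loops `iterations` times now runs roughly 4.3 billion times
instead of zero, even though the changed function looks correct in isolation.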

Just because something is duplicated doesn’t mean that it’s that way forever,
either. At a good branch point, such as a new project, you can _aggressively_
prune out things that won’t apply to that branch even if they helped keep
things stable on the previous project.

~~~
hinkley
My own read in this situation is that this problem is almost always less of an
issue in code that is very explicit about what it’s doing. A function that
uses nouns and verbs with very precise meanings survives these changes
better than wishy-washy code.

“Generic” functions make it difficult to find all uses or even understand what
scenarios they belong to. With bland say-nothing nouns and meaningless verbs
like execute() or process() that appear everywhere in the code, you’re just
crossing your fingers and hoping for the best.

------
zimablue
I've done exactly this where I'm just extending my code in an ugly way. Two
alternative patterns to extending a function with a parameter switch are to
pass in a function to the function or split the function into multiple
functions around the bit that's different.
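
A sketch of the three shapes side by side, in Python. The row-rendering task
and all function names are made up for this example; the point is only how the
varying step is expressed.

```python
# Shape 0: extend the function with a parameter switch.
def render_switch(rows, csv=False):
    sep = "," if csv else "\t"
    return "\n".join(sep.join(map(str, row)) for row in rows)


# Alternative 1: pass the varying step in as a function.
def render_with(rows, format_row):
    return "\n".join(format_row(row) for row in rows)


def csv_row(row):
    return ",".join(map(str, row))


def tsv_row(row):
    return "\t".join(map(str, row))


# Alternative 2: split into separate functions around the bit that differs,
# sharing only the part that is genuinely common.
def join_lines(lines):
    return "\n".join(lines)


def render_csv(rows):
    return join_lines(csv_row(row) for row in rows)


def render_tsv(rows):
    return join_lines(tsv_row(row) for row in rows)
```

With either alternative, a third format is added by writing a new function
rather than by widening the switch inside an existing one.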

Definitely a lot harder to fix when you've done this though.

I think "the wrong abstraction" is too lofty a title; this is just about
oversize functions.

------
lifeisstillgood
The "wrong" abstraction is still a nexus of control and understanding - a
point you know the code will return to under certain circumstances

this is not to say that if you rollback the code and _then commit_ to finding
and implementing new abstractions you won't win - but the second part is
necessary or you are just digging a deeper hole.

think of it as a wrapped transduction - you have to do the second part as
well.

NSTAAFL

------
mshenfield
I think this sometimes gets used as an excuse for people who don't want to
deal with another layer of indirection added by an abstraction. The gist of
this touchstone blog post is that we need to be willing to abandon our
abstractions as soon as we start special casing them. It's not wrong to
abstract, it's wrong to cling to an abstraction that is broken.

------
beaconstudios
I can't agree with this beyond the high-level sentiment. Perhaps the example
is just not good. In the listed example the error clearly lies with programmer
B who decided to extend an abstraction with a branch. That's not extending an
abstraction - that's squashing 2 different abstractions into one function!

~~~
gwbas1c
Which is the point of the article!

~~~
beaconstudios
but the article argues against creating the abstraction when the functionality
was the same in both places, which I disagree with. To be honest, the whole
topic disregards whether the two pieces of functionality are actually doing
the same thing in a domain-logic sense or simply a code sense, which is the
heart of the issue. If the similarity is coincidental (e.g. validation on two
model types temporarily happens to be the same) then the abstraction should
not occur. If it is not (e.g. two distinct operations on a model type happen
to have an identical subroutine) then the abstraction should occur.

~~~
Chyzwar
It is easier to introduce the right abstraction later. The article argues that
if there were no existing abstraction, programmer B would create the right
abstraction instead of hacking on top of the existing one. This becomes more
obvious when the application is in maintenance and everyone is risk averse.

------
welliman
I really like Sandi Metz as a speaker, and am pretty sure she has thought
about these problems much deeper than I have. However, I often find myself
wondering if most of the problems she is trying to solve would go away if
programmers spent more time thinking about their code.

The case filled code that she is describing seems to be a result of the
programmers not fully grasping the purpose of the code, and being unable to
tell if the current abstraction is fitting. I understand that deadlines and
the sunk cost fallacy play a factor, but, at least for me, finding the right
abstractions / architecture is the most satisfying part of coding! Shouldn’t that
be what these programmers are focusing on in the first place?

------
shafte
I think people tend to focus on specific symptoms of bad code (duplication, in
this case) without thinking about what makes good code.

Ideally, we'd like our code to be:

\- Mutable (i.e. easy to modify)

\- Understandable

\- Good at doing what it's supposed to do.

\- Other stuff that I'm forgetting.

The general recommendation against duplicate code is intended to promote
mutability (by avoiding multiple implementations that need to be changed). If
you apply it blindly without keeping mutability in mind, you can get
situations like the one the author describes.

I see some of the same myopia when people talk about testing. Testing is there
to ensure that your code is correct, and that it's easy to make changes
without affecting correctness. As soon as you find yourself writing tests that
aren't for those two reasons, consider whether it's worth the effort.

------
willemmerson
What if the programming language could choose the right level of abstraction
for you automatically? For instance a language like Rust forces you to
structure your code so as to avoid race conditions etc., it's extra effort but
once you've done it it should work. What if a language forced you to make the
right abstractions or else it won't compile? We have self-balancing trees,
couldn't we have a self-balancing programming language? I keep thinking about
some way of programming which is more visual, perhaps involving graphs, where
the problems of over/under abstraction would be more obvious and the graph
could somehow balance/normalize itself.

Just off to prototype this now, should have it done by the end of the day...
:)

~~~
Erlich_Bachman
You would need a working AGI (Artificial General Intelligence) for that to
work.

And at that point, you don't really need to worry about compilers, you just
have AGI looking at the code.

What you are proposing requires a technological miracle to implement. That's
why it doesn't make sense. When we can do miracles, we will obviously use them
in the mentioned and in many other areas. The problem is to create AGI.

------
HumanDrivenDev
I don't get this post and I don't get all the comments in support of copy
pasting code, and against DRY. I am going to need a real life example of when
copy pasting is a good idea, because I've never seen it. Giving some 'shared
code' a name really doesn't seem like it's a dangerous path.

 _Programmer B feels honor-bound to retain the existing abstraction, but since
isn't exactly the same for every case, they alter the code to take a
parameter, and then add logic to conditionally do the right thing based on the
value of that parameter._

Programmer B's poor decision doesn't mean you should reach for ctrl-v, in my
humble opinion. But I'm willing to change my mind, if there's a compelling
case.

------
SilasX
I had this in mind recently when doing the Cryptopals challenge, which
requires you to implement the Counter mode of block encryption[1]. In that
case, encryption is the same mathematical operation as decryption. I still
figured I should have two different functions for encryption vs decryption to
make it more obvious which variables are intended to hold a plaintext vs
ciphertext.

[1]
[https://en.wikipedia.org/wiki/Block_cipher_mode_of_operation...](https://en.wikipedia.org/wiki/Block_cipher_mode_of_operation#Counter_\(CTR\))

~~~
tedunangst
Depends on language, but if the type system supports encrypted and plaintext
types/traits, this is the way to go.
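
A rough Python sketch of that idea, using `typing.NewType` so a static checker
such as mypy can tell the two kinds of bytes apart. The keystream here is a toy
built from SHA-256 purely to keep the example self-contained; it is an
assumption of this sketch, not AES-CTR, and it is not suitable for real
encryption.

```python
import hashlib
from typing import NewType

Plaintext = NewType("Plaintext", bytes)
Ciphertext = NewType("Ciphertext", bytes)


def _keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    # Toy stand-in for a block cipher in counter mode:
    # keystream block i = SHA-256(key + nonce + i), XORed with the data.
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        block = hashlib.sha256(key + nonce + counter).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ k for b, k in zip(chunk, block))
    return bytes(out)


# Same operation underneath, but the signatures record the direction.
def encrypt(key: bytes, nonce: bytes, message: Plaintext) -> Ciphertext:
    return Ciphertext(_keystream_xor(key, nonce, message))


def decrypt(key: bytes, nonce: bytes, message: Ciphertext) -> Plaintext:
    return Plaintext(_keystream_xor(key, nonce, message))
```

`NewType` has no runtime effect, so the duplication costs two thin wrappers;
the benefit is that passing a `Ciphertext` to `encrypt` is reported by the type
checker instead of silently "decrypting" it.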

~~~
SilasX
Good point, I didn't think of that! That would be a better way to do it! (At
least in some respects.)

------
rsyring
Good idea in general. Not sure I agree with the idea of inlining a bunch of
code in order to refactor an abstraction.

I've always found the easiest way to refactor is to get really good code
coverage on the outermost layer of code that uses the abstraction,
remove/ignore unit tests if there are any on the abstraction itself, then
refactor the abstraction with vigor until it feels appropriate and hopefully
elegant or is removed if necessary. As long as all your tests still pass, you
should be good to go.

------
nickjj
I usually end up with quite a bit of duplication before I even think about
creating any type of abstractions.

There's no point trying to think about abstractions before you know what the
problem is.

~~~
mlthoughts2018
There’s also a solid YAGNI argument against abstraction, especially when
circumstances or requirements can change.

You don’t yet know what abstraction you need or what extensibility or
generalizability you need, and prematurely extending in these directions can
either paint you into a corner where you have to do terrible things to avoid
throwing things away and rework, or else you have to bite the bullet and do a
bunch of rework.

There can be a lot of benefits to just duplicating things like config,
occasional pieces of function bodies, classes with modified functionality, or
even using whole projects as templates, like quickly getting a collection of
inter-related web services going with copy/paste code and factoring out common
code later.

------
styts
Watched the talk a few years ago. Several times. It really hit home. Before, I
would try to introduce abstractions as soon as I had one instance of
duplication. After, my code has more duplications, less nesting, fewer
abstractions. It's easier for a newbie to understand (or myself down the
line), it's easier to delete and modify. I've shared the mantra "duplication
is better than the wrong abstraction" with colleagues on many occasions.

------
patientplatypus
Imagine in 6 months you have survived a traumatic brain injury and you're
required to maintain the awful crap that younger/smarter you thought, at the
time, was clever.

Abstraction is bad and is the price you pay for being able to move lots of
things around at once. WET (Write Everything Twice) > DRY (Don't Repeat
Yourself) because you might be able to grok what the heck you meant at the
time. It kills me that my colleague wants to be clever. NO! Clever is bad!

------
wonton2
This just circles the problem, I think. It is a symptom of programmers not
taking responsibility for their own code. To cover your ass in the best
possible way you have to write as little code as possible, and that is what
happens when a person sees a function that sort of does what they want. For
some reason we think this is deduplicating code and that it is good. But at
least I have never been taught this in school or by anyone in my work. We
should treat it for what it is: an antipattern used by programmers to avoid
taking responsibility. Another side to this coin is sometimes the «enablers»
who create unnecessary abstractions. These are people who play at being library
designers when they are supposed to be developing applications as fast and
safely as possible. The people who create complex tools and unnecessary
integrations inside the application code they write. Now, we are all risk
averse and enablers to different degrees and at different times. To counteract
these antipatterns I think it is important that we teach a few things: that
functions/methods should have a single purpose; if you have a boolean argument
and an if, you should refactor. Rabbit theory of code. And at a higher level:
always keep the focus on the business requirement and do cost-benefit analysis
(lightweight, in our head). And take time to learn the domain and tools you
are working with. This also requires experience. But if we focus more on
teaching people to learn about these things I think it may improve things.

------
sevensor
In a similar vein, see "Premature Flexibilization is the Root of Whatever Evil
is Left":

[https://product.hubspot.com/blog/bid/7271/premature-
flexibil...](https://product.hubspot.com/blog/bid/7271/premature-
flexibilization-is-the-root-of-whatever-evil-is-left)

------
mehrdada
Shameless plug: Building White-Box Abstractions by Program Refinement
([https://mehrdad.afshari.me/publications/building-white-
box-a...](https://mehrdad.afshari.me/publications/building-white-box-
abstractions-by-program-refinement.pdf))

------
gwbas1c
I'm currently working with a driver developer. He doesn't have access to
simple collections. (Everything is a linked list.)

Every single lookup was a copy & paste while loop with business logic inside
the loop, and then a break statement.

This is a textbook example of when not to copy and paste.
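
A minimal sketch of the alternative, assuming a hypothetical singly linked
`struct node` with an `id` field (the real driver's types are not shown in this
thread): the traversal lives in one helper, and each call site passes only its
own match condition. `list_find`, `node_pred` and `has_id` are made-up names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical node type; a real driver would have its own. */
struct node {
    int id;
    struct node *next;
};

/* Predicate supplied by the caller: returns true for the node it wants. */
typedef bool (*node_pred)(const struct node *n, const void *ctx);

/* The single copy of the traversal loop. Returns the first match or NULL. */
struct node *list_find(struct node *head, node_pred matches, const void *ctx)
{
    assert(matches != NULL);
    for (struct node *n = head; n != NULL; n = n->next) {
        if (matches(n, ctx)) {
            return n;
        }
    }
    return NULL;
}

/* Example predicate: match on id. ctx points at the wanted id. */
bool has_id(const struct node *n, const void *ctx)
{
    return n->id == *(const int *)ctx;
}
```

A call site then shrinks to something like `list_find(head, has_id, &wanted)`,
and whatever business logic used to sit inside the loop runs on the returned
node.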

------
veebat
I use the "dueling sins" model of "copying versus coupling." Early lifecycle
code benefits from the flexibility of copying. Mature code benefits from the
coupling forces of abstraction. Both have the capacity to do harm.

~~~
hinkley
Do you follow Jim Highsmith at all? His philosophy is that there are no
answers. That we are minmaxing a bunch of competing criteria and trying to do
the best we can.

In his words: we are a solving problems, we are resolving paradoxes.

~~~
hinkley
Too late to fix typo: we are _not_ solving problems, we are resolving
paradoxes

------
raz32dust
Personally, I'd still err on the side of creating or thinking about an
abstraction rather than duplicating code. While creating the wrong abstraction
is a problem, it should be a deliberate choice, and duplication should be
picked only in the rarest of cases, or where the logic is trivial. If a future
requirement renders the current abstraction wrong, it should be refactored to
fix the abstraction. If folks are adding edge cases to the current abstraction
instead, that is a culture issue that must be fixed. And you should always
have good test coverage anyway. This article makes an assumption that future
developers are lazy or incompetent and will not fix the abstraction, and I
think that we should strive for a culture where such laziness is not
tolerated, instead of living with duplicate logic everywhere.

------
0xBA5ED
This is analogous to confirmation bias. Existing code is the current
narrative. New requirements are like new evidence. With each piece of new
evidence, you must reconsider the narrative to fit all the evidence.

------
jmull
When you find yourself at her step 8 you don’t necessarily have to go back and
fix all the sins of the past. There’s a cost to this which may or may not make
sense to pay. You could simply not use the bad abstraction.

~~~
strken
It should also be pointed out that when you find yourself at steps 6 and 7 you
don't have to sin, and when you find yourself at step 1, you can obey more
complicated heuristics than DRY, like "I will apply the rule of three if the
duplicated code is pretty short and not inherently self-contained" or "if this
big method looks like it will change, instead of fully abstracting it, I'll
just break out the bits that look like neat little functions".

You only find yourself at step 8 after a suite of bad decisions, and possibly
even bad decisions that you signed off on during code review.

------
trhway
Premature abstraction. Especially endemic to large enterprise projects.

------
gwbas1c
One of the easiest ways to fix this is with lambda functions or callbacks.
Each caller passes a lambda function or callback that handles the specific
case that's unique to that caller.
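
As a rough sketch of that idea in C, where a function pointer stands in for
the lambda: the shared loop is written once and the caller-specific step is
passed in. The pricing example and the names `total_cents`, `adjust_fn`,
`no_discount` and `ten_percent_off` are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Caller-specific step: maps one price (in cents) to its adjusted value. */
typedef long (*adjust_fn)(long cents);

/* Shared part: sum the adjusted prices. It knows nothing about the callers. */
long total_cents(const long *prices, size_t count, adjust_fn adjust)
{
    assert(adjust != NULL);
    long total = 0;
    for (size_t i = 0; i < count; i++) {
        total += adjust(prices[i]);
    }
    return total;
}

/* Two hypothetical callers' variations. */
long no_discount(long cents)
{
    return cents;
}

long ten_percent_off(long cents)
{
    return cents - cents / 10;
}
```

Each caller picks its own variation, e.g. `total_cents(prices, n,
ten_percent_off)`, and the shared function never grows a flag per caller.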

~~~
contravariant
In the case they're describing this should almost surely not be a lambda
function. Those should be reserved for when an arbitrary function makes sense,
but it's unlikely that someone would have abstracted a function that could
have done _anything_.

It's much more likely that the code in question is responsible for one
particular thing, and switches between several different ways to achieve some
sub-goal. Those parts should be lifted into some kind of interface, where the
different variations are lifted into different implementations of that
interface. A lambda function is the most general interface possible, so it's
probably not the best choice: you'd eventually end up with callbacks calling
each other without it being entirely clear which callback does what.
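
One way to read this suggestion, sketched in C with a struct of function
pointers playing the role of the interface. The shipping example and every
name in it (`shipping_method`, `FLAT_RATE`, `BY_WEIGHT`, `order_total_cents`)
are assumptions made up for illustration, not anything from the article.

```c
#include <assert.h>
#include <stddef.h>

/* The interface: one named sub-goal, "work out the shipping cost". */
struct shipping_method {
    const char *name;
    long (*cost_cents)(long weight_grams);
};

long flat_cost(long weight_grams)
{
    (void)weight_grams;               /* flat rate ignores the weight */
    return 500;
}

long weight_cost(long weight_grams)
{
    return 200 + weight_grams / 10;   /* base fee plus 1 cent per 10 g */
}

/* The variations become named implementations of the interface. */
const struct shipping_method FLAT_RATE = { "flat rate", flat_cost };
const struct shipping_method BY_WEIGHT = { "by weight", weight_cost };

/* The shared code does one thing and delegates only the varying sub-goal. */
long order_total_cents(long items_cents, long weight_grams,
                       const struct shipping_method *method)
{
    assert(method != NULL && method->cost_cents != NULL);
    return items_cents + method->cost_cents(weight_grams);
}
```

Because the slot is named and typed for one job, a reader can tell what each
implementation is for, which is the clarity a bare "call this function"
parameter tends to lose.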

~~~
dragonwriter
> A lambda function is the most general interface possible

A typed lambda function (i.e., with a defined arguments + return signature) is
exactly as specific as any other typed interface. An arbitrary lambda function
isn't, sure, but there are few languages where a static interface and an
arbitrary lambda function are both available tools.

------
JonasJSchreiber
I am seriously going to start doing this. Great suggestion and the authority
and insight with which it was presented gives me a lot of confidence it'll
work out. Thanks

------
api
Nature obviously agrees. Look at the structure of the genome with all its gene
duplication with minor variation. Copy-paste-hack is one of the primary
mechanisms of evolution.

~~~
kirubakaran
Nature takes billions of years though, as it has to operate in blind
watchmaker mode. We don't have that luxury. Not often, anyway.

------
jonnycomputer
One problem with duplication is that it's tempting to copy the code you need
and replace the appropriate parameters or values. It's pretty easy to mess
that up.

------
kazinator
Sometimes it's good to let some duplication proliferate before trying to
condense it. You don't always know which way it is headed.

------
nialv7
The real problem is that very few people know how to tell which abstractions
are "wrong".

------
Walkman
I also like what Rob Pike used to say: "A little copying is better than a
little dependency."

------
dibujante
When optimizing for complexity, this is a reasonable argument. What about
optimizing for performance? It might be cheaper to make a network call in one
place and have multiple consumers each use the result, even if each of them
needs to use the result in slightly different ways.

------
dugreader
Premature abstraction...

------
ada1981
About half way through I realized this was true of mythology.

------
erikpukinskis
TL;DR: don’t put switches in functions.

~~~
logicallee
No, just don't switch on a new kludgy ad hoc argument:

//bad:

int countlines(file afile, bool hasfuckeduplineendings, bool
needtobusywaituntilreadswillsucceed)

My preferred solution, and I don't claim that this is correct, is just to put
a global variable

bool nexthasfuckeduplineendings = false; // set to true before counting lines
// in a file that needs preprocessing for fucked up line endings

bool needtobusywaituntilreadswillsucceed = false; // this is a hack. Certain
// specific files will just fail to read for an unspecified period of time,
// they will fail and fail and fail and then succeed. For cases that we know
// this will happen, set this to true.

See how awful and fucked up this is?

It is "obvious" that this hack is just so wrong.

But is it really? It's clear, gets shit done, and is super transparent about
how wrong it is.

Should every reader hang in a busy loop?

Should every reader preprocess line endings?

Maybe "no" and "no".

What do you all think?

~~~
erikpukinskis
I think you should copy/paste the whole function, change one of the copies to
suit your needs, and then factor out any subroutines the functions have in
common.

This is what OP recommends too.
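
A sketch of what that could look like for the line-counting example above,
assuming in-memory buffers instead of the thread's `file` type, and assuming
the broken endings are bare `\r` characters. All function names are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Shared subroutine factored out of both copies: count one byte value. */
size_t count_byte(const char *buf, size_t len, char wanted)
{
    size_t n = 0;
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == wanted) {
            n++;
        }
    }
    return n;
}

/* Copy 1: the ordinary case, every line ends in '\n'. */
size_t count_lines(const char *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    return count_byte(buf, len, '\n');
}

/* Copy 2: also accepts bare '\r' as a line ending.
 * A "\r\n" pair still counts once, via its '\n'. */
size_t count_lines_mixed_endings(const char *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    size_t lines = count_byte(buf, len, '\n');
    for (size_t i = 0; i < len; i++) {
        /* a '\r' not followed by '\n' ends a line on its own */
        if (buf[i] == '\r' && (i + 1 == len || buf[i + 1] != '\n')) {
            lines++;
        }
    }
    return lines;
}
```

Neither copy takes a flag, and the only shared code is a helper small enough
that it is unlikely to need caller-specific branches later.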

~~~
logicallee
That works okay for the first bool, but for the second duplication wouldn't
there be 4 versions already?

~~~
erikpukinskis
You can have 1000 versions. As long as the function is mostly free of side
effects, except whatever side effects are documented in a public interface,
then you can scale the repository linearly without any real increase in
complexity.

This is because while the namespace is wide, in practice you work within a
“working set” of your daily use packages.

------
dustingetz
What happens when Rails and JavaScript are the wrong abstraction?

~~~
twothamendment
404

