
Why do so many developers get DRY wrong? - jerodsanto
https://changelog.com/posts/why-do-so-many-developers-get-dry-wrong
======
3pt14159
Eh. Something I find pro devs do is just code the damn thing out quickly and
wait for the right abstraction to emerge before stuffing it blindly into a
function. If that means a bit of repetition, fine. If you push everything into
tiny little methods or functions or abstract them into their own objects the
first time you come across a couple of repeated lines of code then the clearer
and better solution may not emerge as the requirements start to change. On the
other hand, self documenting code is most easily done via method naming.

This type of topic is hard to talk about. It's so nuanced that making a
statement about how to do it sounds like a gutless generality. It also
depends on the programming language and lifetime of the project. I've banged
out some real ugly code when servers were on fire, but it was all stuff that
was destined for an early death.

~~~
networkimprov
Documenting code via single-caller functions is usually a mistake, because to
everyone else who looks, the set of functions is an API.

Both internal and external APIs must be kept coherent.

Also when readers are trying to understand exactly how some function changes
the system state, having to refer to numerous other functions it calls is
tedious.

~~~
hinkley
Spreading state mutation out across the system is almost always a bad idea. I
would categorize that person as doing DRY badly.

Coalescing state transitions should trump decomposition. But it’s often the
case that you can do both at once.

~~~
Marazan
That second paragraph should be on the front page of the Redux website.

~~~
acemarke
Noted!

[https://github.com/reduxjs/redux/issues/3592#issuecomment-58...](https://github.com/reduxjs/redux/issues/3592#issuecomment-586620584)

:)

------
beaker52
I find DRY fascinating.

It's the source of a large portion of the accidental complexity I find in
code. "If I just create this abstraction, all this duplicated code goes away" -
we've all heard it and many of us have told it, but few of us realise that
it's the prequel to the most popular story of all: "all this code is such a
mess, there are all these extra layers that don't really make sense and
unpicking it is such a pain, I can't believe someone wrote this".

The story in between is about a young, inexperienced developer who has 3 days
to deliver the one-feature-to-rule-them-all, to appease the almighty project
manager, necessitating an adventure into the labyrinth carefully crafted by
the developer in the first story.

~~~
Buttons840
DRY increases "connections" in the code (often called "coupling"). Pay
attention to what you are connecting.

The dominant example of this in my mind comes from some traumatic (and
dramatic) work experiences involving web scrapers.

You scrape pages A and B. Both require logins. You notice that A and B use
similar code to login, so you factor it out and now you have "connected" A and
B. Or, more accurately, the common "login" method is connected to A and B.

The problem is that A and B are separate ships, and they are going to
different places, and you tied a rope between them. That rope is going to stretch
and fray and break. There's no reason logging in to A and B should be similar,
it just happened to be that way at one point in time. One day, probably, those
two sites change their login workflows to be completely different.

So "just repeat yourself" and let each scraper be self contained. Let each
ship sail their own way, don't connect them.
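
A rough sketch of what "self contained" might look like here. The site names, URLs, and the `FakeSession` stand-in (used instead of a real HTTP client) are all made up for illustration:

```python
class FakeSession:
    """Stand-in for an HTTP session; records requests instead of sending them."""

    def __init__(self):
        self.requests = []

    def post(self, url, data):
        self.requests.append((url, dict(data)))


class ScraperA:
    """Owns its login flow. Site A can change without touching ScraperB."""

    LOGIN_URL = "https://site-a.example/login"  # hypothetical URL

    def login(self, session, user, password):
        session.post(self.LOGIN_URL, {"username": user, "password": password})


class ScraperB:
    """Looks almost identical today, but only by coincidence."""

    LOGIN_URL = "https://site-b.example/login"  # hypothetical URL

    def login(self, session, user, password):
        session.post(self.LOGIN_URL, {"username": user, "password": password})


session = FakeSession()
ScraperA().login(session, "alice", "secret")
ScraperB().login(session, "alice", "secret")
print(len(session.requests))
```

If site B later adds an extra step to its login, only `ScraperB.login` has to change, and nothing about A can break because of it.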

One case of this isn't bad, but if you're not careful you end up creating dozens
of connections between things going all different directions. Break your
project into "things" and graph their dependencies. If you can break X by
changing Y, then X depends on Y. If you can break Y by changing X, then Y
depends on X. Your dependency graph should look more like a tree than a total
graph.
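
One way to sanity-check this, sketched under the assumption that you have written the "X depends on Y" edges down by hand as a dict (the module names below are invented). Looking for cycles is only a rough proxy for "more like a tree", not a full measure of it:

```python
def find_cycle(depends_on):
    """Return one dependency cycle as a list of names, or None if there is none."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    state = {name: WHITE for name in depends_on}
    path = []

    def visit(node):
        state[node] = GREY
        path.append(node)
        for dep in depends_on.get(node, ()):
            if state.get(dep, WHITE) == GREY:
                # dep is already on the current path: we walked in a circle.
                return path[path.index(dep):] + [dep]
            if state.get(dep, WHITE) == WHITE:
                cycle = visit(dep)
                if cycle:
                    return cycle
        path.pop()
        state[node] = BLACK
        return None

    for name in list(depends_on):
        if state[name] == WHITE:
            cycle = visit(name)
            if cycle:
                return cycle
    return None


# Hypothetical projects: keys depend on the names in their lists.
tidy = {"scraper_a": ["http_client"], "scraper_b": ["http_client"], "http_client": []}
tangled = {"scraper_a": ["login"], "login": ["scraper_b"], "scraper_b": ["scraper_a"]}

print(find_cycle(tidy))
print(find_cycle(tangled))
```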

~~~
temac
If things need to evolve in separate directions, you can duplicate them.

Unreasonable X is unreasonable, that's true no matter X but that's not very
insightful. Unreasonable factorizing is unreasonable. So are unreasonable copy
pastes.

Now a good question can be: would I "prefer" one or the other gone too far?
That's merely a personal preference question, and on my side I've got a clear
answer: I prefer something a little bit too abstract, to something a little
bit too copy pasted. Because while abstract I still can understand _more_ of
the system _more_ quickly, whereas duplicated I basically have to start by
reading a lot more and factorizing (possibly in my head) before getting to a
balanced situation...

Your own preference may vary, but I've yet to see a system maintained
correctly by people copy pasting too much and not caring about anything but
the _one example_ they have to handle (because e.g. of a workflow described in
a ticket).

Things can be accidentally similar, but _WAY_ more often in copy pasted
codebases they are accidentally diverging, and that multiplies the time I spend on
that kind of mess by a factor that is probably near 10.

~~~
darkerside
Problem with abstractions is, once they exist, people assume they are
meaningful.

~~~
gmueckl
That is because proving that it isn't is a major task that involves dissecting
the codebase, its current specs and planned future features (a seemingly
pointless abstraction now can be the groundwork for a new module that will
appear soon[0]). Thus the asymmetry between adding and removing abstractions.

[0] where soon either means really soon because product management needs it
yesterday or never because there is too much work to do in other places.

------
sethammons
A friend at work (years ago) introduced me to WET: Write Everything Twice, a
cheeky response to DRY enthusiasts. It falls just short of the Rule of Threes
(as soon as you write it a third time, refactor it out).

I think this article does have something going for it. DRY should be about
knowledge. Don't repeat yourself by handling tax rates all over, get that into
a central place. This is not about turning similar looking blocks of code into
a clean single block that handles everything as these tend to actually hurt
maintainability (ever see the littering of conditionals in "DRY" code because
multiple call sites use similar code slightly differently? Yeah, you did it
wrong).
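
A minimal sketch of the "knowledge in one place" idea, assuming an invented rate table and two invented callers. The callers still format things their own way; only the tax knowledge is shared:

```python
from decimal import Decimal

# The one place that knows the tax rates (regions and values are made up).
TAX_RATES = {"NORTH": Decimal("0.0725"), "SOUTH": Decimal("0.04")}


def tax_for(amount, region):
    """Every caller asks here instead of hard-coding a rate."""
    return (amount * TAX_RATES[region]).quantize(Decimal("0.01"))


def invoice_total(subtotal, region):
    return subtotal + tax_for(subtotal, region)


def cart_preview(subtotal, region):
    # Formats differently from an invoice, but shares the tax knowledge.
    return f"{subtotal} + {tax_for(subtotal, region)} tax"


print(invoice_total(Decimal("100.00"), "NORTH"))
print(cart_preview(Decimal("100.00"), "NORTH"))
```

Note that `invoice_total` and `cart_preview` are not merged into one function with flags, even though they look related. That would be the conditional-littered version.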

Another comment wrote it well by mentioning SPOT: Single Point of Truth. DRY
and WET SPOTs. I feel like an analogy is forming that someone more quippy than
myself can ferret out.

~~~
hinkley
DRY tests are. the. worst. Way to lock us into last year's requirements, bud.

This is where DAMP comes in. Descriptive And Meaningful Prose.

I should be able to read the test failure message and know what I broke.
Barring that, I should be able to read _the test_ (not the whole fucking test
file) and tell what I screwed up.

Anything else kills momentum, and I just want to get away from your code as
fast as possible. Which means more tech debt.
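
Roughly what that could look like, using a made-up `apply_discount` function as the thing under test. This is one sketch of the idea, not a prescription:

```python
import unittest


def apply_discount(price_cents, percent):
    """Hypothetical function under test."""
    return price_cents - (price_cents * percent) // 100


class DiscountTests(unittest.TestCase):
    def test_ten_percent_off_a_20_dollar_item_costs_18_dollars(self):
        # Inputs and expected value are spelled out right here, not hidden
        # in shared fixtures or helper loops elsewhere in the file.
        price_cents = 2000
        percent = 10

        result = apply_discount(price_cents, percent)

        self.assertEqual(
            result,
            1800,
            "10% off $20.00 should be $18.00; check apply_discount rounding",
        )
```

The test name, the literal values, and the failure message each say what broke without anyone having to read the rest of the file.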

------
layer8
What they meant by DRY is otherwise known as SPOT — Single Point Of Truth —
which is harder to misinterpret. The same “truth” — which can be data, values,
behavior, policy, etc. — should not be defined multiple times in separate
places, because a future change would have to be applied to all the places, or
else cause different parts of a program or datastore to have inconsistent
views on what the “truth” is.

If you google for it, you will find the synonymous “Single Source Of Truth”,
which however makes for a worse acronym.

~~~
maest
Incidentally, this may explain why caching is hard - cache invalidation in
particular. By definition, it must violate this SPOT/SSOT principle.

~~~
layer8
While there is a thematic connection, SPOT is usually more of a design-time
(or coding-time) principle. Caches still represent the same source, just time-
delayed.

------
keeganjw
I feel like this article ended before it should have. I'm still not exactly
sure what the author means by Don't Repeat Knowledge. Should we not be
refactoring or... just don't go overboard?

~~~
mntmoss
When you refactor you are taking shots at moving around where coupling occurs.
If your code is maximally decoupled it is primitive copy-paste code that never
calls functions and introduces unique variables for each section - and if it's
maximally coupled it will look like swiss cheese, trying to reuse the same
functionality for everything with clever parameterization, recursion,
indirection and globals. Intentionally coupled code is most common in
memory-starved environments since implicit dependency helps reduce data overheads.

And so "DRY", to the extent that it's useful, encourages you to find slack
areas in the code where there's low potential for introducing coupling, and to
factor those out so that you have code that is mostly-decoupled without also
being redundant and hard to modify - the factoring reflects "knowledge" about
the problem. And yet it's not always obvious when you have the knowledge or
not. Sometimes redundant-looking code is a form of hardcoded data and a
factoring would only push it towards being fully data-driven (which exacts a
price in debugging). The Rule of Three is just a common way of making this
decision about knowledge.

~~~
temac
If your textual copy-paste includes some requirement to conform to a
contract, or simply an actual copy-paste of "knowledge" (which can very well
_also_ take the form of a big textual pattern), the mere form of
"primitive copy-paste code that never calls functions and introduces unique
variables for each section" does not actually reduce coupling (except MAYBE
if it is expanded down to the metal, including expansion of syscalls and/or low
level libraries, but you won't do that).

Also, what do globals have to do with that, and why do you put them in
the supposedly factorized code? You are mistaking it for a bad mess you once
saw, maybe? On the other hand, code bases obtained through copy-paste based
programming can _not_ be considered anything other than a bad mess.

But yes, factorizing can be done badly, even to the point of being
counterproductive. Like anything.

------
jfengel
I hadn't heard of the Rule of Three, but it parallels my own heuristic. The
first time, I write the code to do the thing I need. The second time I
encounter a similar thing, if I can't find the right abstraction to unify
them, I go ahead and repeat myself, writing a second, similar round of code
that does what it needs.

If I encounter it a third time, then I've got enough data points to make a
good guess about what the right abstraction will be. If I've done a good job
so far, it shouldn't be too difficult to refactor it. (Strong, static typing
helps.)

This is, of course, just a heuristic, and it's not all-or-nothing. I'll take
my best guess about what the right abstraction is going to be, and I'll try to
get it right the first time. The second round also presents opportunities to
take two points and extrapolate a line.

It all comes down to experience: not just with the system, but with the domain
that the system is about, and with the way systems change and grow. No one
rule of thumb ever encapsulates all that.

~~~
AnimalMuppet
I use the same approach to automate processes. The first time, I do it
manually. The second time, I still do it manually, but I think "Hey, I did
this once before. This is looking like something I maybe ought to automate."

The third time I automate it. By then, I understand it well enough to have
good odds on being able to do the automation successfully.

~~~
amelius
How often have you automated something so far?

If it's more than three times, you ought to automate the automation!

~~~
AnimalMuppet
You're kidding (I think), but I missed a piece.

Part of the point of doing it the second time is to make sure that I really
understand what I'm doing and how I'm doing it. Without that, I can't write
the automation on the third time.

Well, if the task is "automating things" (for very general values of
"things"), I don't understand how I'm doing well enough to automate _that_.

------
rojobuffalo
Having a couple lines that are similar or copied in several places shouldn't
be considered such a bad thing. Repetition reveals similarity, and having
clear signals of similarity is really important. It's often more expressive /
easier to understand than a single method name.

Premature abstractions are way worse than repetition. A poor or insufficient
abstraction leads to obfuscation which leads to misunderstanding which leads
to novel constructs for the same responsibility. Because a poor abstraction
can be really really difficult to back track, you end up with hacky work-
arounds to get something done.

I think encountering novelty in a codebase is the biggest thing that damages
comprehension; and repetition actually enhances comprehensibility.

------
reggieband
I have so much to say on this topic that I feel like I can't say anything.
I'll just leave it at: beginners tend to do too little, intermediates tend to
do too much and experts try to do no more than enough.

------
namuol
> [...] the original “Don’t repeat yourself” was nothing to do with code, it
> was to do with knowledge

> The trouble with DRY is it has no reference to the knowledge bit, _which is
> arguably the most important part_!

Okay. Now what does this mean? Is this article effectively a tantalizing
recommendation to read The Pragmatic Programmer?

------
AdriaanvRossum
Like DRY is wrong? I don't really get the point of this article.

~~~
pkaye
Sometimes "A little copying is better than a little dependency."

~~~
wrmsr
I've seen so many times people going all in on DRY not understanding that just
as dangerous as duplication is _coupling_ - the inevitable result being some
ungodly $COMPANY_NAME_common lib with a thousand dependencies, and usually
only depped in a codebase for a config parser and a string helper. See also
node_modules and left-pad.io.

------
drewcoo
Does anyone else see a certain irony in the raw number of DRY articles in
existence? Or the fact that people keep writing them?

I think the deeper problem in the software industry is that we have no
collective memory and need to DRY up as a whole.

------
goto11
"It feels good" \- I think that is a really important point! Overzealous
DRY'ing is like a game. Every few tokens you save by clever reuse is a small
victory. So it is easy to lose sense of the big picture. Programmers often
like logical games and challenges, but it is dangerous to treat development
like that.

We should be wary of micro-optimizations for "elegance" which actually hurt
the larger-scale maintainability of the system.

------
crispinb
The comments here suggest two things: (1) most people misunderstand DRY (i.e.
they think it's about code rather than knowledge duplication), and (2) the
article didn't do a great job of clearing the issue up.

Though an alternative to (1) is that the meaning of DRY in common dev parlance
has changed & has come to mean something different from Thomas & Hunt's
intention.

------
lr4444lr
I feel like this article is being critical about something without justly
staking a clear claim about what the right approach is. In my experience, the
benefit of DRY code is bug reduction and overall increased new development
velocity. There is a whole class of bugs around similar behaviors that devs
and product managers _expect_ to move in sync which _don't_, because features
develop over time and it was just easier to code separate small bits than
refactor into a common code path. Yes, it can make readability harder to unify
into abstractions and create the right configs or import steps. But the time
hunting down and fixing the bugs, plus the drag on overall feature development
due to having to write updates in multiple places and test for them is far
worse to deal with for _not_ taking that preventative measure.

------
seanwilson
The comments should make it painfully obvious there isn't a general rule that
applies to all projects in all situations.

DRY, WET, DAMP, SPOT, KISS, YAGNI...

Software engineering is too nuanced to be summed up in an acronym and surely
the inventor of each acronym only intended it to be a basic rule of thumb.

------
wruza
The opposite of the rule of three and pattern detection is instant
detection of stupid code (no funny acronym for that). An example of stupid code:

    
    
      let setX = (e) => {this.x = +e.target.value}
      ... other setup ...
      input({onchange:setX, value:this.x})
    

This code is not a subject for un-DRY or the rule of three, as its ratio of meaning
to character count is too low.

And yet some frameworks make a decision to abstract it out at a wrong point:

    
    
      let [x, setX] = ...
      ... other setup ...
      input({
        onchange:(e) => setX(+e.target.value),
        value:x})
    

instead of clean and readable

    
    
      input_num(this, 'x', {})
      // or
      this.input_num('x', {})

------
makecheck
I don’t mind repeating something as long as I write it in a style that _could_
become a function in a straightforward way. Seems like the first time you have
to change something, you realize it applies to one case and not the other. The
more important thing is that the repeated/common parts are _obvious_.

For example, if you reuse the same logic in a couple places, where the only
difference is some _specific_ variable, it should be written as a block with
alias variables at the top. That way, two different cases look _literally the
same_ (except for a couple assignments at the top).
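
Something like this, as a sketch (the `clamp_window` function and its parameters are invented for the example):

```python
def clamp_window(width, height, max_width, max_height):
    # Case 1: horizontal axis. Only the aliases differ from the block below.
    value, limit = width, max_width
    if value < 0:
        value = 0
    elif value > limit:
        value = limit
    width = value

    # Case 2: vertical axis. Same body, different aliases.
    value, limit = height, max_height
    if value < 0:
        value = 0
    elif value > limit:
        value = limit
    height = value

    return width, height


print(clamp_window(-5, 900, 800, 600))
```

If the two blocks stay identical, lifting the body into a `clamp(value, limit)` helper is mechanical. If one of them has to change, the difference shows up immediately in a side-by-side read.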

------
rileymat2
I don’t disagree with this but it is more complicated because often the order
statements are executed in is knowledge. Much duplicated code is duplicated
knowledge.

The question is whether it is a coincidence or the same concept.

------
dllthomas
It's memetics.

"DRY" is catchy, easy to talk about, easily verbed as a recommendation, seems
like a good idea, and seems to be recommended by people who know what they're
doing.

The problem is that it _seems_ self explanatory, so no one discusses the
definition. At the same time, the more obvious definition isn't the right one.

As a counter-meme, I have been proposing we refer to overaggressive syntactic
deduplication as "Huffman coding".

------
hyperpallium
You can have specious repetition, code that is identical only by coincidence,
not in a deep semantic sense.

To determine what it is, you need to understand the domain. Part of that is
knowing not just what it is, but how it generalizes and predicting how it is
likely to change.
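
A contrived illustration, with invented business rules and numbers. The two functions are textually identical today, but each encodes a different piece of domain knowledge and will change for different reasons:

```python
def loyalty_discount(order_total_cents):
    # Marketing policy (made up): 5% off orders of 100.00 or more.
    if order_total_cents >= 10000:
        return order_total_cents * 5 // 100
    return 0


def card_processing_fee(order_total_cents):
    # Payment-provider contract (made up): 5% fee on charges of 100.00 or more.
    if order_total_cents >= 10000:
        return order_total_cents * 5 // 100
    return 0


print(loyalty_discount(20000), card_processing_fee(20000))
```

Merging these into a single `five_percent_over_100` helper would remove the textual repetition, but it would also tie a marketing decision to a vendor contract.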

Of course, this is approaching perfection. One could also cut and paste
monkey-like, e.g. instead of looping (without a performance need to unroll
loops).

------
drbojingle
Imo, Don't Repeat Yourself makes most sense when you forget about code and take
it to mean that you shouldn't repeat your work.

IE, if you wrote some code in one place that you needed elsewhere, copy
pasting the code is fine under my interpretation because it allows you to
spend the least amount of time working on a problem you've already solved.

------
fsajkdnjk
nowadays, as golang dev, i don't even know what DRY looks like :D

what I've learnt throughout many years of coding is that purest mantra one
should follow is YAGNI. devs think like devs and strive for perfect code. but
that goes directly against the business. the majority of the entire world is
being run on really bad code. but that bad code works. and that is what is
important.

in a way, you should treat things like blackboxes with strict interfaces. no
one should care how the box works, as long as its interface works like it is
supposed to.

PS: the dangers of DRY are introduction of deep dependencies that might, and
probably will, bite you in the ass along the way. DRY should be used only for
libraries, not for business logic - ever.

------
luord
Unlike Abramov's original post, I actually agree with this one, being more
nuanced. The author acknowledges that DRY (and ultimately, clean code et al)
itself is not evil; it's just misunderstood.

------
aazaa
This article would benefit from some code examples.

As it is, it left me with the same thought as those who claim to never need
debuggers or object-oriented features: Fine let's say you're right - _how_ do
I implement your system?

------
epicgiga
DRYing the code is just an entry level form of _simplifying_ the code. It's a
good starting point, but its value quickly taps out once junior devs start
writing incrementByOneFromZeroToNLoop(n) functions.

Code is closer to a craft like carpentry than a pure knowledge job. There
aren't any rules, only heuristics, and it takes time to hone your skills (not
"learn" them).

The biggest limiter in this is the excessive tendency for flat hierarchies in
dev. It flies in the face of the apprentice - journeyman - master system that
has always naturally structured the delivery and learning of craftsmanship.

------
skywhopper
I wish the author here or Dave Thomas would explain what they mean by DRY
meaning “knowledge”. They both say that and leave it as obvious...

~~~
dllthomas
Happy to do as asked, though I am a different Dave Thomas.

Per my understanding, it's anything you might say about your system,
especially as it relates to your domain.

"There is a button here", "this is what we store about users", "broken widgets
are red", "this is how we calculate interest", ...

------
WalterBright
Any programming guideline regarded as _Dicta Boelcke_ is going to lead to
garbage.

They always must be leavened with _Good Judgement_.

------
_bxg1
The article is weirdly void of a counter-proposal, or a definition of what it
means to not repeat yourself in terms of "knowledge"

------
anonymouswacker
Why do developers not see the forest for the trees?

------
saber6
DRY like anything can be properly used or misused. For example, you can
normalize a database so much that any basic query comes with a massive
overhead (recursion). There is a middle ground between "religion" (pure DRY)
and "chaos" (no DRY).

~~~
goto11
How does normalization lead to recursion? Normalization is one point where DRY
really is critical, because redundant data in a database can lead to data
corruption when it gets out of sync. And if the data is corrupt, your whole
business is screwed.

~~~
saber6
I may be mis-communicating what I am trying to say.

I once had the (mis)pleasure of dealing with some novice DBAs. There were
instances in the data model where things were so recursively linked that one
query would fan out to 200+ to actually return the set of meaningful data. As
you might have imagined, this caused scaling issues when you started dealing
with a significant amount of data (hundreds of GB of data in the DB).

Eventually we brought in some pros and one of the first things they did was
eliminate a number of overly recursive queries and duplicated some (not all)
fields of data to speed up performance. We went from 1->200+ fanout in a
typical query to 1->25-ish. The performance gains were insane.

Of course it violates "don't repeat yourself" and the practice of abstracting linked
(repetitive) data in a relational way. But sometimes this best practice can
really be counter-productive in edge performance scenarios. But in general
yeah, don't be repetitive, abstract your stuff, and keep it clean and easy to
change globally if/when needed.

------
proc0
So what's the problem with what people do? The article doesn't go into detail.
Keeping things 100% DRY is never a bad thing, with the only exception that
code becomes too obscure and hard for new contributors to start on, which,
granted, is pretty important; however, it's not like people are getting DRY
wrong.

