DRY is not about reducing code -- in fact, it's right in the definition the author provided:
> Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.
It says "knowledge" not "code". This is an important distinction. Removing duplicate code might not be removing duplicate knowledge and having duplicate/similar code doesn't necessarily mean you're duplicating knowledge.
A toy example: if you happen to have 2 products that are the same price you still wouldn't want to combine them into one constant value.
Some code might be similar by happenstance and combining that will cause problems when that happenstance stops being the case.
Perfect! It implies that there is something you have to think about before you think about DRY. This "something" is the representation of facts and data. The universal idea of pulling data out of code seems to be as old as programming itself and is proposed by Unix/C programmers and Lispers alike.
After pulling out the data, the code becomes simpler, more general and more composable. The data structure then often reveals repetition that is not redundancy: if you look at a configuration file, a SQL table or a REST response that has repetition, it becomes clear that it represents different facts (to use your term) that happen to have the same values (Alice and Bob have the same age). This form of repetition is fine, correct and often interesting!
So by the practice of pulling out data from code we get DRY (by the definition of the authors) code.
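To make the Alice-and-Bob point concrete, here is a minimal sketch (TypeScript, all names invented): two records that happen to share a value still encode two separate facts, so "deduplicating" the value would be wrong.

    // Two distinct facts that happen to share a value. The repeated 30
    // is coincidence, not redundancy.
    interface Person {
      name: string;
      age: number;
    }

    const people: Person[] = [
      { name: "Alice", age: 30 },
      { name: "Bob", age: 30 }, // same value, different fact
    ];

    // If Bob has a birthday, only Bob's fact changes:
    const updated = people.map((p) =>
      p.name === "Bob" ? { ...p, age: p.age + 1 } : p,
    );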
However there is also another form of repetition that is a bit more subtle: boilerplate. There, the representation of an algorithm (as code) is drowned in noise that does not convey the intent and the flow of what is happening.
To increase the signal to noise ratio in code that is already(!) DRY but clouded by boilerplate, one needs a way to reduce syntax itself, which can be achieved by meta-programming and DSLs.
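As a lightweight illustration of that direction (a TypeScript sketch with invented names, using a higher-order function rather than a full macro system or DSL): the repeated ceremony is folded into one combinator so that only the intent remains at each call site.

    // Before: every handler repeats the same log/try/rethrow ceremony.
    // After: a small combinator absorbs the boilerplate.
    type Handler<I, O> = (input: I) => Promise<O>;

    function withLogging<I, O>(name: string, fn: Handler<I, O>): Handler<I, O> {
      return async (input: I) => {
        console.log(`${name}: start`);
        try {
          const out = await fn(input);
          console.log(`${name}: ok`);
          return out;
        } catch (err) {
          console.error(`${name}: failed`, err);
          throw err;
        }
      };
    }

    // What remains states only the flow of what is happening:
    const getUser = withLogging("getUser", async (id: number) => ({ id }));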
I see this all the time in well-intentioned "agile" projects. They fail to fully understand the domain and then micro optimisations like this lead to features breaking in interesting and unexpected ways.
On a related note, I've seen people taking this too far in functional/procedural code. Not every concept needs its explicit name, not every function needs to be 5 lines of code or less. All that jumping around between function definitions is a serious impediment to readability.
(EDIT: this is mentioned in the article as "localization complexity")
I've been thinking about it lately, especially since I've taken another look at the refactoring chapters of Clean Code; instead of being enlightened I felt disgusted. Like, for-loops are ok, if-statements are ok, those are the basic constructs of code, you can see the shapes visually and it's easy to follow the code path. This is what you find in low-level C/C++ code: lots of switches, loops and conditionals. Those programmers just get used to the verbosity and look at shapes instead of reading line by line. Maybe it's easier for subtle mistakes to pass unnoticed, but refactoring everything out into layers of abstraction brings jumping around and context switching between the different pieces while you try to build a mental picture of how it all fits together.
There are ways to flatten those trees and leave the code a bit more readable. Bertrand Meyer hit on it 30 years ago:
    if (weShouldDoThis()) {
        return doIt();
    }
Structurally, this seems very similar to having twenty methods with one conditional block and a delegation to some other method. But the stack trace is half as high, and the testing surface area is greatly improved, because the code in the 'if' clause is pure, or close to it, while the 'doIt()' method might alter shared state.
There's a big difference from a robustness standpoint between having half of your functions pure, and half of each function pure.
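A sketch of that difference (TypeScript, invented names): the decision is a pure function you can test exhaustively, while the side effects are quarantined in one small place.

    // Half of your functions pure: the decision is trivially testable...
    function weShouldDoThis(queueDepth: number, maxDepth: number): boolean {
      return queueDepth > maxDepth;
    }

    // ...while the action is the only code that touches shared state.
    let dropped = 0; // stands in for whatever doIt() actually mutates
    function doIt(): void {
      dropped += 1;
    }

    function process(queueDepth: number, maxDepth: number): void {
      if (weShouldDoThis(queueDepth, maxDepth)) {
        doIt();
      }
    }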
That stuff drives me insane. Can't cope with it at all.
I'm working in audio DSP as a rule, so I already have to maintain state of 'what the audio is doing related to this set of sample data', and often some things about how controls must move and feed information into the program to relate to a desired change in the audio which itself can be somewhat indirect. Maybe I am just stupid, but when I have to track DRY code I just blow a gasket immediately and can't parse it at all.
I've wondered whether some of that stuff is about taking trivial problems and making them hard to follow for the uninitiated, to protect employment. WHY not repeat yourself, when the problem is repeated tasks? Or more relevantly, why not write a complicated and long sentence, rather than mangle the sentence into a series of more abbreviated footnotes? DRY reads to me like that chain of footnotes. You lose the plot.
The Sahara-loving sort of people can justify both definitions of DRY for the same code change.
Often they've gone as far as removing idiomatic code, as if idiomatic code isn't a presumed piece of knowledge the entire team shares. Seeing idiomatic code indicates "everything is normal", while seeing a function indicates "something odd may be happening, spend cycles looking at me".
The other definition is about requirements, rather than knowledge or duplication. If two pieces of code are driven by separate requirements (e.g., the prices of two items are typically set by very different business situations, unless the two items are flavors of the same beverage, perhaps), then the fact that they share a price is an accident of the current state of the business, not dictated by a common requirement that they both be $1.50.
If both have to calculate sales tax because of the same statute, then don't implement the calculation twice. If a new statute changes that situation, for instance due to a sugar tax, split the code when the new requirement comes in.
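A sketch of that requirements-driven split (TypeScript, rates invented): one statute, one implementation; a new statute gets its own code path rather than a flag bolted onto the old one.

    // One statute, one implementation: both products share this rule today.
    function salesTax(price: number): number {
      return price * 0.0875; // hypothetical rate
    }

    // When a sugar tax arrives, the requirements diverge, so the code splits:
    function sugaryDrinkTax(price: number): number {
      return salesTax(price) + price * 0.02; // hypothetical surcharge
    }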
When that happens it's called coincidental coupling: two things get coupled together for the wrong reason. It often happens when people are trying to be too "DRY" and to remove as much duplication as possible. But duplication is just an indicator that something might have bad modularity. Good design principles, like ensuring modular parts have good cohesion, should drive modularity decisions, not the mere presence of duplication.
"A toy example: if you happen to have 2 products that are the same price you still wouldn't want to combine them into one constant value."
A more common example: take an online clothes store. They only sell shirts and trousers (pants - :]). It's a really simple app. Obviously, I'm skimming a great deal here, but each product category is described by the same small set of properties.
What the DRY principle generally addresses is this basic idea.
My instinct would be to create an abstract class called Clothing:
- Clothing
-- Price
-- Size
-- Colour
Trousers (Pants - :]), for me, would definitely be an extension of Clothing.
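That instinct, as a minimal TypeScript sketch (the property types are assumptions):

    abstract class Clothing {
      constructor(
        public price: number,
        public size: string,
        public colour: string,
      ) {}
    }

    class Trousers extends Clothing {} // pants - :]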
But until my suppliers can furnish me with Dresses, T-shirts, Scarves, etc, I would not be too concerned about even creating a separate Shirt class, but probably would.
What I've since found through experience however, is that not everyone thinks this way. For some people, each product category would absolutely have to have its own wholly encapsulated class in this situation. I can see the problem such an approach attempts to head off, but until the problem actually exists, I believe this pattern makes the development process painfully inefficient, and harder to maintain.
It's interesting to me, because while it seems like 'one of those' arguments, it's somewhat more impactful than 'tabs vs spaces'.
Classification systems ("ontologies") often represent a particular path through a graph, rather than an absolute tree.
In your example, the store adds "Shoes". Then later, it adds "Sale Items". Now you want some shoes, trousers, and shirts to be on sale.
So each item now has two edges, its "primary_category", and "sale_items". A little bit later, the store turns into a Sale Items at Massive Discounts! Store, and the primary ontology becomes price tiers.
Even here, I feel we're overcomplicating things. For example, possibly ignoring what 'Shoes' are comprised of, were Clothing now to contain:
- Clothing
-- Price
-- Size
-- Colour
-- Discount
Then does your scenario really become as big a problem as you suggest? For one, some ontological relationships needn't be described in the same layer, or can simply be implied.
A Red Fox, a Red Ant and a Red Snapper are definitely related in some way, but I wouldn't use the same class to program a representation of all three.
A Red Shirt and a Red Pair of Trousers/Pants, though, probably share enough for a rough outline for an online shopping app.
Actually I would not create any Clothing class or abstract thing, but instead probably create a structure named something like "product" and then let it have a flexible attributes container. For example it could have: id, categories, price, product-attributes, where product-attributes is a key-value thing. categories would be either a hierarchy or tags, but probably tags, as they avoid issues with inventing artificial hierarchies later.
This way I would never need to make a class "shirt" or "pant" or whatever. I would simply add categories for them. Way simpler than encoding this stuff in some class hierarchy. There is that word again: hierarchy. The problem with hierarchies is always that sometimes you will have an item which fits into 2 sibling categories, or into 2 parts of the tree which are not on the same path. With classes you will need multiple inheritance, or you will need to come up with a new parent of both classes.
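Roughly what that structure might look like (a TypeScript sketch, field names invented): tags instead of a hierarchy, plus a key-value bag for per-product attributes.

    interface Product {
      id: string;
      categories: string[]; // tags, not a hierarchy
      price: number;
      attributes: Record<string, string>; // flexible key-value container
    }

    const shirt: Product = {
      id: "sku-001",
      categories: ["clothing", "shirt", "sale"], // fits several "branches" at once
      price: 19.99,
      attributes: { size: "M", colour: "red" },
    };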
The problem with this approach (and I've done it) is that product-attributes is now this black hole that's difficult to query, difficult to modify, a huge potential for errors that could even be caused by a typo. It's also incredibly difficult to version.
I regret going with this approach even though it results in far fewer tables and classes. It's short-term gain for long-term pain. Everything is a trade-off -- it might well be more than worth it for something not expected to live long or to change a lot.
Although I agree with your point that not everything needs to be a hierarchy. You can be explicit with your data without having everything inherit from everything else.
Fair point. I guess you would still have to have conventions (and actually follow them) about what you put in the attributes for each kind of product. I am thinking, though, that if you need to implement any new functionality for a specific product, you would discuss it with other developers (or yourself, if you are alone), find a good convention of what to store, and then make use of it in another part of the program. If you wanted to encode the convention into your code, you could create product-type-specific constructors, which you have to use (another convention!) when creating such a product. Still lighter weight than creating a class for everything, and it avoids coupling behavior with state.
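Encoding that convention as a constructor function might look like this (continuing the earlier sketch; everything here is hypothetical):

    interface Product {
      id: string;
      categories: string[];
      price: number;
      attributes: Record<string, string>;
    }

    // The convention lives in one code path: a shirt always gets these keys.
    function makeShirt(id: string, price: number, size: string, colour: string): Product {
      return {
        id,
        categories: ["clothing", "shirt"],
        price,
        attributes: { size, colour },
      };
    }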
I guess the approach stands or falls with developer discipline. If one is not disciplined enough about the conventions and about choosing them carefully, one could end up with a mass of unknown keys in products, which no one knows why or how they got there.
What I'm saying is that trees represent a single path through a graph. As requirements or nodes change, the path can change, or multiple trees can coexist.
Think tags on items represented on a tree, where clicking the tag reorients the category around what you clicked on.
While I largely agree with you about what the right kind of takeaway you should get from DRY, it seems like a no-true-Scotsman argument to me.
DRY stands for “don’t repeat yourself”. That’s its de facto definition, unfortunately... whether it’s useful or not.
Just like REST and OOP and a host of other things that “everyone gets wrong”, you have to consider that if everyone gets it wrong, maybe the messaging isn’t very good in the first place.
> Just like REST and OOP and a host of other things that “everyone gets wrong”, you have to consider that if everyone gets it wrong, maybe the messaging isn’t very good in the first place.
In my experience, it just means that an army of ignorant middlemen grabbed hold of a concept and ran too far with it.
OOP / UML / XML / etc. etc. They all actually solve problems very well. But the people who taught those concepts were mostly selling snake-oil under the guise of XML or whatever.
If any particular paradigm becomes useful: the parasite class of marketers / con-men come out and sell fake versions of the strategy. Eventually, the con-men completely corrupt the concept and we just have to move on to the new concepts that haven't been corrupted yet.
I'm working in an older codebase (not THAT old, but it feels old because it's so bad), lots of database interactions. There's three data models (or sets thereof) that are mostly the same - MOSTLY.
I could make the tables flexible (add columns to fit all data models). I could make the queries dynamic (make the table part flexible). But I'm duplicating code instead.
As the proverb goes, a little duplication is better than the wrong abstraction. There's bits of code in here where the author tried to be smart, but queries like "select id from $table where x = $y" really throw my editor (and brain) off.
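To illustrate the trade (TypeScript, table and column names invented): the duplicated queries are explicit and greppable, while the dynamic one hides the actual SQL from both the editor and the reader.

    // The "clever" version: grepping for the table name finds nothing,
    // and interpolating identifiers is also an injection hazard.
    function selectIdDynamic(table: string, y: string): string {
      return `select id from ${table} where x = '${y}'`;
    }

    // The duplicated version: two nearly identical, but explicit, queries.
    const selectOrderId = "select id from orders where x = $1";
    const selectInvoiceId = "select id from invoices where x = $1";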
On the other hand you could argue, that "how a computer does things" is also some kind of knowledge.
For example: how to render some part of a website, or a calculation that uses the same formula as another part of the code.
So with that interpretation of knowledge, you would have to avoid duplicating code. And it makes sense: if one day you need to change the way the computer does something, you want to avoid changing it in multiple places, because that makes it easy to miss one. This way, avoiding duplicated code increases maintainability.
> It says "knowledge" not "code". This is an important distinction.
Yep. Though the same pathologies exist even with this framing. By that I mean it's easy to misdiagnose what the "knowledge" actually is. Even totally identical types and behaviors can be distinct in the knowledge sense if they exist in different contexts.
Yepp. It took me a long time to embrace the mindset that repeating things is sometimes way more elegant than unreadable abstractions. Dan Abramov's article [1] brilliantly put into words how I feel about DRY. It's a great heuristic, but still a heuristic.
The problem is that DRY almost always improves local readability, at least with well-named functions, while reducing comprehensibility: so much of the logic is abstracted away that the bigger picture can't be seen from inside or outside the abstraction.
I felt that Abramov's article showed...nothing useful at all.
Yes, he found an abstraction that turned out to be not useful. But that doesn't mean there wasn't an abstraction that could have been useful. Or even that the judgment call that his exact abstraction wasn't useful was correct.
DRY is important. Maybe he didn't find the right abstraction, but for the most part it's possible to refactor repetitious code with readable abstractions.
Dan's point in that article was that he spent time doing something basically for the sake of keeping up appearances. Kind of like going through the motions without achieving anything useful.
It wasn't solving any problems.
It's better to focus on solving problems as they rise up, as opposed to doing things to keep in line with some arbitrary standards that may not be relevant.
Key is learn how to discern between trivial issues and real issues that affect the project.
No, it runs deeper than that. Dan came up with a really twisted, weird set of abstractions to link things that really weren't as similar as he thought they were. After he was done, you had to keep track of whether things had sides or corners and how many of those there were, something that didn't enter into how the original problems were framed. Instead of presenting 'okay, here's our resize for these respective things', his (disavowed) code began demanding how many corners your thing had. What about orientation? If you have a star, with five corners, and you can rotate it 180 degrees, what do you then call 'top left' to force that into your DRY not-repeating resize routine? Is it the bounding rect? If you've rotated the thing, is top left now bottom right or are you only concerned with the bounding rect? What if you have a non-contiguous selection so you've got two top lefts? Are they one big rect, or are you trying to resize each thing in its individual location?
The point of his article was that he'd CREATED possibly significant problems through imposing an abstraction where it didn't fit. I can think of a number of new 'things' for 'resizing' that would be valid new 'things' with intuitively sensible behaviors that would violate any enclosing abstraction he might have produced, especially the one he offered as a 'bad example' where sides are conflated with corners and so on. He's right to have realized he was on the wrong path.
I've heard of WET before, but I don't think that's the best "anti-DRY" phrase.
In the unit-testing world, they say "DAMP, not DRY". Descriptive and Meaningful Phrases, which happens to be repeating things a lot more than twice!
Some things are just clearer when repeated. Under such conditions, repeat yourself all the time (as long as it is clearer).
The question of DAMP vs DRY is eternal. Repetition is sometimes more clear, but sometimes is overly verbose and unclear. Its hard to know without experience.
When I have encountered DRY in the wild, it has always been DRY zealotry. Repeated code was absolutely never allowed, and everything had to be modularized such that if anyone could ever imagine a future where something might get used again, it needed to be structured like that from the first merge.
What it led to was an inscrutable codebase where unless you had been working with it for years, finding out where a thing was actually done was an exercise not unlike an archaeological dig.
Like most things, it’s a matter of something which is a good idea, but the reasons it is a good idea are lost in its most common application, which is at the extreme.
I think one key problem is when people attempt to DRY the code before actually understanding what they're building. I like to let the code "tell me what to do": I'm building something, and I repeat the pattern a few times, so I keep an eye on it. Eventually, I'll have a few repeated patterns I can look at. A lot of times, they're similar, but not the same. Waiting gives me the opportunity to review several similar patterns and find a more elegant solution that can accommodate all of the use cases, rather than forcing a sub-optimal abstraction for the sake of being DRY alone.
There are definitely times where the DRY vs DAMP question is a tough one, but I’ve also had lots of discussions where the DRY argument basically adds up to “not knowing ‘Find/Replace in Project’ exists”. Man, if I had a nickel for each time this has happened:
I’m discussing a refactor with a DRY engineer, and they say something like “So what do you suggest? Just replacing that name in - every - single - place - it’s used? That would be error prone and time consuming.” Then I run a find/replace, the unit tests run, the green light flashes, heads explode.
Again: DRY vs DAMP is often a very good thing to be debating, but in my personal experience the DRY advocates have gotten way too dogmatic.
It's a fairly common thing in our industry to take ideas to extremes where the initial benefit becomes a liability. Then comes the next thing and the same zealotry kills the next thing as well. The next thing is very welcome because it allows people to start as a blank slate and escape the corners they painted themselves in.
Nice. Love this because principles should always be judged in relation to other principles. "DRY is good/bad" gets you going in circles, but "DRY but prioritize KISS" is a lot more practical.
Repetition also makes it harder to replace one thing in all instances, but I agree that too much DRY can make a codebase a headache to wrap your head around: lots of different constructor overloads and lots of branching inside the common code make it hard and unpleasant to work with, or to ramp up on as a new hire.
It's super fast & easy to take a business service like CustomerLoginService and copypasta it into a new version called CustomerLoginService2 which addresses 99% of the same stuff but has small tweaks throughout to help with some special business edge case or migration to a new way of doing things. Adding the new service to DI and being able to access it (in a well-designed application) is typically 1 line of code on both counts.
I don't worry too much about this kind of stuff anymore. If we end up with 5+ copies of this thing and it actually causes problems or otherwise slows us down, then I would probably consider taking the time to actually clean it up using something like parameters, inheritance, or generics to establish desired context of usage and normalize code.
This sort of scheme might look tacky and trigger developers a little bit, but seeing CustomerLoginService2 in a stack trace and being able to instantly zoom to a scoped implementation unique for that context of usage is a pretty big advantage. Contrast with if you were to normalize down to a single CustomerLoginService. Debugging the added conditionals & state will certainly be more cumbersome.
It depends how clear it is to future maintainers that there are 5 places to change the code to implement a given functional change. The major risk is often that one of those 5 files gets forgotten about, which leads to a surprise later on. You can use parallelism and naming conventions to help communicate this.
Sure, but my experience is that this is more in line with the client's or manager's mental model and expectations. Eg, a designer will lay out some sort of user story which typically defines "flows" from page to page. And they don't expect that the work on this new "flow" will break the existing ones. In this case sharing code can become a liability. And this sh* happened a bunch of times where other devs trying to be clever, instead of just duplicating a UI component or page and making a few changes, they put in a bunch of params to support the new use cases (a better way of course is to extract common elements). The result as predicted was a spaghetti mess to untangle. Refactoring it is like dismantling a bomb.
Interfaces do well for that: they let you work as much as you need while breaking early if one of the classes diverges too much, and they keep each implementation one "find implementations" away from the others.
DRY is great guideline. KISS is another one. I find that it is not uncommon for the two of them to butt heads a little, and it can be tricky to find the correct balance. In general, I am finding that I favour repetition over abstraction more and more.
> Every time we extract a chunk of code from a larger function into a smaller more encapsulated one, the code becomes just a little bit harder to follow.
I, in most cases, disagree with this. If this were true, I suspect the best and easiest-to-understand functions would be thousands of lines long.
Often you write functions that only get called from one place, just so you can give a name to a block of code and make it easier to understand.
Just today, for example, I wrote a big case/switch statement to figure out the URL for a user on a system where the URL is different depending on where the user exists in a hierarchy (I didn’t write the server side of this - gotta work with what I have). I extracted this all out into a “getUserUrl()” function. In the caller, it’s much easier to understand what is going on when you see “getUserUrl()” than it would have been if there was the giant switch statement there.
Cognitive burden would generally not be improved by inlining every function I call.
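A hypothetical shape of that refactor (TypeScript; the hierarchy and routes are invented): the hierarchy-specific switch lives behind one name, and the call site reads as intent.

    type Role = "admin" | "manager" | "member";
    interface User { id: string; role: Role; }

    // The big switch gets a name; callers never see it.
    function getUserUrl(user: User): string {
      switch (user.role) {
        case "admin":
          return `/org/admins/${user.id}`;
        case "manager":
          return `/org/teams/managers/${user.id}`;
        case "member":
          return `/users/${user.id}`;
      }
    }

    // At the call site, the intent is all that's visible:
    // window.location.href = getUserUrl(currentUser);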
> Are there special cases or arguments only used by a fraction of callers
My thought is special case arguments are fine for internal calls but are lousy for external ones.
If you have two public functions with the same meaning but slightly different circumstances, having a protected method that handles both via an extra argument makes sense.
Consequently, the only way to end up with five conditional arguments on that function is to have an enormous number of public functions all doing more or less the same thing. Somewhere around 2, alarm bells should have been going off in your head, asking why we have done this and whether now is a good time to stop.
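A sketch of that boundary (TypeScript, names invented): the flag argument stays behind the public surface, so external callers never see it.

    class Notifier {
      // Two public entry points with clear, flag-free signatures...
      sendNow(msg: string): void {
        this.send(msg, false);
      }
      sendQuietly(msg: string): void {
        this.send(msg, true);
      }

      // ...sharing one protected implementation that takes the mode flag.
      protected send(msg: string, silent: boolean): void {
        console.log(silent ? `(log only) ${msg}` : `ALERT: ${msg}`);
      }
    }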
The article is about DRY, but it opens talking about clean code. IMO, "clean" is a poor word to use when evaluating code quality, not because there is no clean code like the author says, but because clean is an undefined subjective judgement call. What do we _mean_ when we say clean? Easy to debug? Follow? Maintain? Extend? Predict? These are more objective qualities and can be quantified. I think using more precise language instead of "clean code" results in ... uh.. cleaner code? lol
I personally follow a rule called “simplest model that explains foreseeable reality”.
So at first you really have to dig into what “reality” is .e.g if you are building an auth system, how many ways do you login? Login page for customer, login via sso, login via google auth federated etc.
Now foreseeable reality is what is most likely going to happen. YAGNI proponents would say don't do this, but I've found that thinking through future scenarios that competitors are doing or customers are asking for (even though you don't have time to do it) should be taken into account. You realize it would be nice to support auth via personal tokens restricted to certain API calls.
Having a solid understanding of the foreseeable future is paramount: Task #1.
Then it’s understanding what pieces need to be stitched together to make it happen. What changes together, what changes independently. What are the separation of concerns between the pieces?
You need to represent different auth types in db. A central place that takes a request and it’s headers to authenticate. Also authentication and authorization are different things and shouldn’t be bolted in the same place.
Having an in-depth understanding of separation of concerns and boundary contracts is Task #2.
Nailing 1 and 2 down usually leads to clean implementations.
Also huge believer of continuous refactoring. Sometimes reality changes. Focus on a small thing you understand for sure. Incrementally add things and re-think the separation of concerns and how reality is represented in datastructures. If that needs a change, make it happen. Building on wrong level of abstraction and working around it has caused a great level of pain.
There are a lot of things that go into the decision to future-proof (and how) vs doing the absolute simplest thing and moving on.
One of the things that goes into it is how clear the direction the business is headed in and how experienced you are with both the tech stack and solving the specific problem.
I think the whole problem with DRY is that it's too often about adopting layers of abstraction that aren't useful. They're not worth the overhead they demand.
Therefore, the opposite of the DRY issue could (should!) be Worth A Preemptive Unhooking of State. And most of that stuff will not be at all worth unhooking whatever state you've got in mind, in order to go into whatever layer of abstraction is required to DRY up something absolutely trivial and meaningless. Semantics matter. We need to be focussing on what's important in the problem set, not getting derailed into computational linguistics.
Code that is WAPUS-Y might be deeply unfamiliar territory to our more rigorous brethren, because the very idea of asking whether something is Worth A Preemptive Unhooking of State implies there are matters outside what we learn in comp sci, a subject to our coding outside the coding itself.
But that is so often true.
And maybe this framing of it can be, erm, sticky and illustrative in its own right :D
The end of the 1940s gave us the subroutine. A means through which one could encapsulate some functionality into a named construct that could be called (reused) throughout a program.
It’s a construct available for a programmer to use. DRY encourages one to do so when appropriate. Dogmas interpreted that one must always do so are simply unhelpful.
DRY is about creating a common abstraction for code that is "semantically" equivalent, not "incidentally" equivalent.
Even if the code is the same today, is there a case to be made for a new feature that would require them to evolve independently? If so, the "logical" meaning of your function is independent between the two use cases. The code looks the same, incidentally (by accident). Speaking moreso to "domain"/business logic.
And of course, there's the more obvious DRY case of truly context agnostic operations, like math functions, string manipulations etc. Ideally you're using some standard library for these things... agree with the author that it's better to avoid sharing tiny custom libraries if amount of logic is too low to warrant the extra abstraction. The cost of abstraction is not fixed, of course, but dependent on how common and familiar that abstraction is to the broader audience.
> A great example would be a math function like log2. That function should work for every case that you need to calculate a logarithm – each call is fundamentally the same.
Even in the case of log2, some log2s might need to be fast approximations, others accurate down to the LSB, some operating on complex numbers, and others on array-like data...
It's still correct in the sense that you shouldn't repeat the code for calculating the logarithm in every single scenario where you need one. Of course if you need a function for a different purpose then it makes sense to make another one.
Of course mathematics is always an endless game of expanding and contracting definitions, so you might need some additional versions anyway (most commonly something like f(x) = log(1+x)), though if you're clever about it you can rephrase most of what you want with existing functions.
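In JavaScript/TypeScript terms, using only standard Math functions: some variants really are rephrasings of existing functions, while accuracy needs justify a genuinely separate one.

    // log2 rephrased via an existing function
    // (modern runtimes also ship Math.log2 directly):
    const log2 = (x: number): number => Math.log(x) / Math.LN2;

    // But near x = 0, the naive log(1 + x) loses all precision, which is
    // why the standard library ships a separate log1p:
    const naive = Math.log(1 + 1e-16);  // 0, because 1 + 1e-16 rounds to 1
    const accurate = Math.log1p(1e-16); // ~1e-16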
For well-known scenarios I use DRY or not-DRY right away. But when I don't know how the code will evolve I use the three-times rule. Three use cases are often enough to form an understanding of code usage, and to apply DRY if needed.
So I just leave the mess for a while and see what will grow.
It is a perfectly good solution if we accept that refactoring is also a thing. And it should be. Code can not be perfect from day one.
These rules can be applied widely. For example, sometimes it is good to repeat code in a React component and leave it that way.
Especially if it is unknown how the design will evolve. It is sometimes better to leave some mess that is easy to manage than to slice everything up in a way that makes the code harder to manage further on.
Often chasing DRY and other rules too early leads to trouble. I call it early optimization. Just let the code breathe! Give it some time to make it perfect!
Nice. I like the distinction the author identified - coincidentally similar vs. fundamentally similar code. I think the passage of time is sometimes the only way to tell the difference between the two. More specifically, by seeing if the code changes in all places for the same reasons.
DRY is something to strive for, but as pointed out there are often tradeoffs. One that I ran into not too long ago when writing https://github.com/ohler55/ojg was the performance penalty that DRY code can incur. Sadly there is a lot of repetition in that code, but by avoiding a DRY approach the code is also several times faster than if more functions were used to avoid repetition.
>Obviously, the cognitive burden becomes greater and greater the more definitions you need to jump through.
Who says the cognitive burden is lowered just because of adjacency? If you have to keep N variables in mind, it's about the same whether they're at the current depth or 1 level deep.
To me the most evil form of DRY is when you get functions calling functions with arguments that are just passthrough. If a function only uses an argument to call a subfunction, it's a sign that those functions should be decoupled and better orchestrated.
A single level of pass-through may indicate a boundary between two modules. Two layers of pass-through means you have two abstractions talking to each other, which is not good. I'm removing some of these from our code right now.
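A sketch of the smell and one way out (TypeScript, everything invented): instead of ferrying a detail through middle layers that never read it, bind it once and pass a capability.

    // Smell: `retries` only exists to be handed down; the middle layers
    // are coupled to a detail they never use.
    function httpGet(url: string, retries: number): string {
      return `GET ${url} (retries=${retries})`;
    }
    function loadData(id: string, retries: number): string {
      return httpGet(`/reports/${id}`, retries); // pass-through
    }
    function fetchReport(id: string, retries: number): string {
      return loadData(id, retries); // pass-through again
    }

    // One way out: decide the detail in one place, pass a capability.
    type Get = (url: string) => string;
    function loadData2(id: string, get: Get): string {
      return get(`/reports/${id}`);
    }
    const get: Get = (url) => httpGet(url, 3); // retries decided once
    const report = loadData2("q4", get);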
The hard thing to remember about code is, nobody gives a shit about your code when everything is working. Nobody is poring over it basking in your magnificence.
If they're looking at your code, it's because something is wrong. So they're looking for wrong things, and they don't know where the wrong thing is. So they're looking at every code smell they see along the way doing a probability calculation. Your ornate clever code causes them to dump important state, such as "why am I looking at this code in the first place?".
In the case of argument passing, we're all familiar with argument transposition. I have to check you haven't done it. If you rename variables, I have to watch for that too. If you have a stack of empty methods just ferrying arguments around, you've multiplied the surface area I have to think about.
I've found the problem of "too many abstractions" fairly common in functional programming contexts... though not saying it's inherent to the paradigm.
Usually it manifests in having tens or hundreds of tiny, stateless functions, with very little logic in each one. There's a common line of thinking about this being "good" because you can "mathematically" prove the correctness of a function, and thus can ignore its implementation. But abstraction is only a big win when the abstraction is easily and widely understood. When you create a one off, ad hoc abstraction its purpose is not going to be obvious to the reader off the bat.
The important thing is that the total "api surface area" within your context is low, and each function/abstraction has sufficient meaning. There's a sweet spot where you can fit most of the local functions in your head, without having to constantly re-read to remember the purpose of a given function. Naming is quite important to this end.
I find DRY to be useful in some cases even when there's only a second copy of the code. I hate seeing any duplicate code past a certain level of complexity. Other times I'll copy a few lines of code all over the place and not worry about it.
The core conclusion I've come to is to take ALL such "rules" with a grain of salt. Programming is a skilled craft, and there are no rules that you can apply universally.
Don't worry about DRY until you hit three copies. You probably don't know enough to make the right tradeoffs until then. To me DRY has always been about: I need to change something, and now I need to magically know everywhere else it was copied so I can change it there as well. So that's kind of my mental check for when it starts being worth it.
I find the concept of pontification useful. If you find yourself pontificating in your software, that is, writing abstractions for the sake of abstracting things, you may be going too far. Abstractions should be clear, concise and purposeful.