It takes experience to unlearn this bad habit and realize that “duplication is cheaper than the wrong abstraction”.
While this post may not provide a perfect example, I think it gestures in the general direction of this very important principle.
On the other hand, DRY as a principle shines when it allows logical changes to a program to require physical changes to the code in only one place. E.g., in <horrible but self-contained algorithm> it's plausible that bugs might exist, and you'd really like bug fixes to apply to all implementations. The easiest way to manage that is to have only a single implementation. Likewise, to the extent that magic strings are sometimes necessary, they should be given names so that your compiler can catch minor typos (supposing the edit distance between the various names in your program is largish).
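A small Python sketch of the magic-string point (the names here are invented):

```python
# A typo inside a string literal fails silently: it just never matches.
# A typo in a named constant fails loudly, at the point of the mistake.
STATUS_ACTIVE = "active"
STATUS_ARCHIVED = "archived"

def is_active(record: dict) -> bool:
    # Misspelling STATUS_ACTIVE here (say, STATUS_ACTVE) would raise a
    # NameError immediately, instead of silently returning False forever.
    return record["status"] == STATUS_ACTIVE
```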
It’s really silly to argue about this stuff. This claim has no evidence. For example, I feel the exact opposite, that abstracting early always makes it easier to refactor, and does not prevent ending up at better abstractions later on in any way.
I _feel_ like that’s true. And you can’t prove or disprove either side without agreeing on a cost model.
In terms of your algorithm, you would want to decouple your ingredient prep from your cooking algorithm. Otherwise, if prepping ingredients takes longer because you buy a new prep tool, your food winds up over- or under-cooked. Secondly, you want to decouple your cooking algorithm from your equipment model. Otherwise, every time you upgrade your oven you need to rewrite every recipe. But this is all a digression.
In the future, if you want to make a point about software, I would recommend using either English or real code, not a strained analogy to a novel domain. But in general, it seems you are still learning the craft. It's really great that you are thinking about the evolution of a codebase over time, as this is a key area that people earlier in their career miss, and IMHO one of the greatest learning experiences for a programmer is maintaining a non-trivial system over an extended period as the environment and requirements change.
Oh, and check out https://web.archive.org/web/20021105191447/http://anthus.com... (1985).
To prove anything about the OP's solution would be extremely complex. To extract new knowledge and abstract the solution in the future would be nearly impossible without a complete rewrite.
For example: what if we need logging? Timing of the steps taken? A list of dish washing tasks generated? Parallelism in the tasks, given an extra cook? Exception control? Unit testing of the dough?
Adding an extra recipe is not the only possible new requirement you can have. Anticipating and preparing the right abstractions, that’s what good software engineering is about.
I find this a rather surprising statement. It sounds almost like something from some weird parallel universe. In my universe I generally take for granted that anticipating abstractions is not something that actually works.
Let me give an example. Some years ago I was working on some code that wrote and read data from a database. A colleague said that we needed validation, so every field that could be written to the database needed the ability to have a validation: optionally, a validate function could be attached to each and every database field. I was against it at the time because, as I stated, I do not believe in anticipated abstractions. But the colleague was convinced that it was necessary and wrote it. A few years later validations had indeed been added, but literally all of them were about properties that a set of fields together should have, and literally none of them were about single fields. At some point I just deleted the single-field validator. It had been there for years, but it was never anything besides completely useless.
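A hypothetical Python reconstruction of that situation, the per-field hook that sat unused versus the cross-field check that was actually needed (all names invented):

```python
from typing import Callable, Optional

# The anticipated abstraction: an optional per-field validator slot.
# In the story above, this slot stayed empty for years.
class Field:
    def __init__(self, name: str,
                 validate: Optional[Callable[[object], bool]] = None):
        self.name = name
        self.validate = validate

# What was actually needed: validation of a property that several fields
# share, e.g. an end date must not precede a start date. That is a
# property of the record, not of any single field.
def validate_record(record: dict) -> bool:
    return record["start"] <= record["end"]
```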
To me, anticipating abstractions is a recipe for all kinds of over-engineering. The need for abstractions arises as more requirements need to be fulfilled, and when writing those one should probably think about what is likely to be desired in the future; but anticipating them before they are needed is something I gave up on a long time ago.
If you leave a hook in for some feature you "know" is coming you'll only screw it up - either that feature won't be needed or it'll look different than you expect. Your field validator is a great specific example, I think I'll use that one in future.
(A corollary of this though is that you have to be happy to ruthlessly refactor existing code when a new requirement actually does enter the scene, because of course that requirement was deliberately not prepared for in the existing code.)
An abstraction would be to say: "These programming statements are actually objects." Or, the time and ordering constraints implied in this program should be explicit.
In addition, I was talking about anticipation and preparation of abstraction. Not the actual abstraction.
By "hook" I didn't mean anything specific, like a React Hook. What I meant was adding some extra abstraction that has no purpose except for some future feature. For example, taking a class that works perfectly well on its own (let's say Rectangle in a drawing program) and separating it into a base and derived class (let's say Shape and Rectangle) in anticipation of a feature in a future release (next time we're going to add a Pentagon class too and that'll need to derive from Shape). If you don't consider that change to be adding an abstraction, then I suppose we just have different ideas of what an abstraction is.
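A minimal sketch of the two versions for concreteness (invented names):

```python
# The class that works perfectly well on its own:
class Rectangle:
    def __init__(self, w: float, h: float):
        self.w, self.h = w, h

    def area(self) -> float:
        return self.w * self.h

# The speculative version: a base class introduced only because a Pentagon
# "might" arrive in a future release. Until it does, the extra layer buys
# nothing and constrains the design.
class Shape:
    def area(self) -> float:
        raise NotImplementedError

class SpeculativeRectangle(Shape):
    def __init__(self, w: float, h: float):
        self.w, self.h = w, h

    def area(self) -> float:
        return self.w * self.h
```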
> In addition, I was talking about anticipation and preparation of abstraction.
I wonder if you've just used a word that doesn't reflect what you really mean. Maybe you just meant designing and creating an appropriate abstraction for the current requirements? "Anticipate" would be the wrong word for that.
"Anticipate" literally means making an educated guess about some future event before information about it becomes available. Most of us have experienced other devs anticipating future requirements, and creating abstractions in response to that anticipation, and the inevitable negative fallout of that. So seeing "anticipation ... of abstraction" is bound to generate a negative emotional response.
This in turn influences the choice of programming language, unit testing, naming, modularization, infrastructure...
Just as a simple counter example: I write my code so I minimize the amount of state and provide good type information. The advantage is that future (re-)composition in other abstractions of the code will continue to work.
In your example, you were actually implementing an abstraction and anticipating a future use. That is something completely different.
What I am trying to bring across is: there are multiple 'simplest' solutions. It helps to know how, in the future, you are expecting to abstract away from that solution to newer 'simplest' solutions.
For example, one might want to use a FP-ish approach, because simplest solutions within that space tend to abstract better than an OOP approach. Or in the recipe example, we could have modeled the steps as objects with dependencies. Given the right programming language, that solution might be as simple as the OP, but provide many more extension points.
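A hedged sketch of the "steps as objects with dependencies" idea, with invented step names; the ordering is data rather than something baked into sequential code, which is where the extra extension points come from:

```python
from dataclasses import dataclass

# Each step declares what it depends on, instead of the order being
# implicit in a sequence of statements.
@dataclass(frozen=True)
class Step:
    name: str
    depends_on: tuple = ()

mix = Step("mix dough")
rest = Step("rest dough", depends_on=("mix dough",))
bake = Step("bake", depends_on=("rest dough",))

def ready(step: Step, done: set) -> bool:
    """A step is runnable once all its dependencies are done."""
    return all(d in done for d in step.depends_on)
```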
At least, I've really struggled to convince people that it's true until reality hits them in the face.
If we already know that a database might be required later, then using something like SQLite right away is a smart move: it lets you write code that will instantly work when you actually need something fully fledged, instead of rewriting it.
Knowing whether you need a database or not is simply one of the first things you know, and writing code before settling that is either a waste of time or a learning exercise.
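For illustration, a minimal sketch of starting with SQLite's in-memory mode (the table and data are made up): the schema and queries are real SQL from day one, so a later move to a fully fledged database changes the connection line, not the data layer.

```python
import sqlite3

# In-memory SQLite: zero setup, but real SQL throughout.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE players (name TEXT, goals INTEGER)")
conn.execute("INSERT INTO players VALUES (?, ?)", ("Ada", 3))
conn.commit()

# The query below would run unchanged against a server-backed database.
top = conn.execute("SELECT name FROM players ORDER BY goals DESC").fetchone()
```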
It is, unfortunately, very easy to write difficult to read code, especially with good intentions and principles.
In practice, I've found that the most important principle is Locality, that is, avoiding nested indirections and unnecessary abstractions.
I completely agree with the author of the article here: the simple and dumb recipe with constant local values is both easy to read and easy to maintain.
It might seem like duplication, but the complexity has to live somewhere, and it is more manageable when it is not scattered.
The issue I often see is that when abstractions are created, the thought process or design of those abstractions aren't well explained. When you create abstractions, more verbose documentation and design is needed to share the ideas. Anyone using this in the future needs to understand your abstractions and there's a cognitive cost of dealing with it.
You reduce this cost when you explain everything well, give examples, show use cases, etc. When you don't provide this sort of verbose documentation, you might as well have made it a large sequential program because it likely would have been easier for the next person to understand.
When unsure, it is still better to write less abstract code and somewhat messy code than falling into the over-engineering trap.
It will cost less to fix and it will also cost less to write in the first place.
The author chose a simplified example to demonstrate a point.
Also, without a proven need, building for these considerations would result in an over-engineered solution.
Shrug. I too can make any point if I get to choose my own contrived examples. And when a bad example is chosen, like here, any reactions will devolve into bikeshedding about the appropriateness of that example (as is clearly seen in this thread).
Writing maintainable software is about solving the problem you actually have as simply as possible. Introduce abstractions only when you need to.
You see, in the 'problem solution lattice' there are multiple bottoms (simplest solutions). Knowing which 'simplest solution' is the best depends on your knowledge of the problem, your anticipation of how the problem might be extended in the future, your array of possible solutions, etc. etc.
For example, the original solution to the recipe problem in OP uses a very sequential and object-oriented approach. That solution is one of an infinite number of solutions that fix the problem. The extension of the problem (an extra recipe) plots a path upwards through the lattice towards another solution that fixes both the original problem and the new one. The number of steps required (the distance) depends on the choice of the original solution. And since there are multiple bottoms, we could have a solution that cannot be simplified further, even though there is a parallel bottom that has a much shorter distance to the second solution.
This is a completely unnecessary personal attack on OP.
The code is the data here. Imagine instead of a program to make cookies it is two different scheduling algorithms for an operating system.
I like to think of this in terms of the Charizard Pokemon card. For context: in this example I have this card and I'm sensitive about damage to it. So in this OO example I put the card in a box and allow you to interact with it in a very limited way; you cannot use anything you're used to to interact with it, like your own gloves or hands etc.
Just my "methods": I might give you a tiny hole to look at it. You could still damage it through the hole, so I have lots of logic to ensure you cannot poke it incorrectly. Hopefully the verbosity on both your side and mine is/was worth it, bug-free, and not missing cases, and hopefully my hole was in the right place for your uses.
Obviously I can't give you too many holes in the box, otherwise what's the point of the box? I need the box to maintain my sanity.
The other alternative is that I just give you the card and take the risk that you might damage it, which is a disaster for my well-being. OR I duplicate the card perfectly and give you the duplicate, in which case I don't care what happens to the duplicate. MUCH easier in my opinion. So please:
Stop creating hellish boxes with holes for other developers to peek through; just choose a language with efficient immutability as the default, or use pass-by-value semantics with mostly pure functions.
Reserve your classes for things that are truly data structures in the general sense, not bs domain stuff like "bowl". Bowl is not a fundamental type of computer science like integer; bowl is just data and should be treated as such https://www.youtube.com/watch?v=-6BsiVyC1kM so it can have a schema and such, but don't put it in some kind of anal worry box, otherwise your program may end up more about managing boxes and peek holes than about pokemon cards.
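To make the "duplicate the card" option concrete, here is a small sketch using an immutable value type (names invented): nobody can damage your copy, and changes produce new values instead of mutating the original.

```python
from dataclasses import dataclass, replace

# An immutable "card": any attempt to mutate it raises an error, so it can
# be handed out freely without boxes or peek holes.
@dataclass(frozen=True)
class Card:
    name: str
    condition: str

original = Card("Charizard", "mint")

# The "borrower" gets a modified duplicate; the original is untouched.
played_with = replace(original, condition="scuffed")
```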
A major benefit of OO is that you can actually enforce this. Encapsulation is useful for data objects where some configurations of bits are valid and some are invalid. Careful interfaces let you ensure that the object is always in a valid state and does not permit you to do a thing when it is not valid. The fact that you'd be unsure of these questions is an indication that your interface is done poorly.
Granted, this is really hard to get right. Doing it badly leads to the nightmarish combination of easily mutable state that isn't easily visible.
"Copy everything" can be a really compelling option for many programs and there are persistent data types that help do this in a mostly scalable fashion. But there are plenty of cases where it just won't work. In my job our system primarily works on a data object that is too large to meaningfully copy everywhere. The solution is extremely judicious use of "const" and clear rules for automatically invalidating certain dependent program state when the underlying state we are working with changes. Lots of work, but in the end you get a ton of very strong invariants that make it really easy to work with the data.
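A hedged sketch of that pattern with invented names: one mutating method that both enforces validity and automatically invalidates cached dependent state, so the cache can never be stale.

```python
class Dataset:
    """Encapsulated state: mutation only via append(), which guards the
    invariant and invalidates derived state."""

    def __init__(self, values):
        self._values = list(values)
        self._total = None  # cached derived state

    def append(self, v):
        if v < 0:
            raise ValueError("values must be non-negative")  # keep state valid
        self._values.append(v)
        self._total = None  # dependent state invalidated automatically

    def total(self):
        # Recompute lazily; correct by construction because every mutation
        # path clears the cache.
        if self._total is None:
            self._total = sum(self._values)
        return self._total
```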
> not bs domain stuff like "bowl", bowl is not a fundamental type of computer science like integer, bowl is just data and it should be treated as such
To an extreme: if your abstraction isn't formally verified, kill it?
Assuming as truth the idea that abstractions follow organizational structure, then only divide an organization when you have a formal abstraction for each division?
I wish there was a way to reason about this stuff that isn't so artful. I intuitively understand things like DRY, SOLID, etc, but being absolutely confident that they are true or whether they have been applied correctly is art, and I would prefer it to be math.
My usual experience of recipes is poor - I always see them being something like this:
1. blophicate the chicken for 5 minutes or until soft (I made up that word but you should be a good enough cook to have some idea what it means)
2. coat with a paste made from the garlic, herbs and butter (you did know you should've made that earlier, right?)
3. now add them to the fat you've been heating up for the past ten minutes (come on, surely you had that ready?)
4. serve on a bed of hand-soaked cous cous, which you prepared yesterday using this mini recipe:
I was going to say I'm just going to try this out on one of the recipes I've recently used, but someone did that already:
How is that not strictly superior to traditional "word-problem" recipes?
Also, one thing that I hate about recipes, as a person who cooks only occasionally, is the "to taste" direction. I know what to do when I've made a given dish 10 times. But the first time around? Why do no recipes ever provide any kind of bounds? "Add to taste; between 0.5 and 5 tsp, 2 tsp is typical".
(Truly, cooking is what happens to process chemistry when you care so little about the quality of the outcome that you can wing every part of the process.)
"To taste" is usually reserved for salt or some sort of textural element (like thinning a soup with water). Sometimes it involves pepper. Rarely will it involve other spices, though I don't think that's a terrible idea, since most everybody has old ground spices, so 2 tsp will taste very different between kitchens.
You can't make a recipe amazingly precise for home cooks because equipment is wildly different and ingredient quality is wildly different. This is just a property of home recipes. Serious kitchens have things like salt done in precise weight ratios.
Oh, but you actually do! You write your programs in form of trees (abstract syntax ones), and those trees generally represent directed and usually acyclic graphs of dependencies. For example:
(defun foo (x y) (+ x y))
(defun bar (x) (foo x x))
(foo 42 (bar 12))
And the reason we don't lay out programs as visual, interactive graphs is because text is faster to work with using digital tools. It's faster to type the structure than click it into being, it's easier to grep through it, to diff it, etc. But that has nothing to do with linear flow of text - linear flow is a limitation that we do our best to work around.
 - Except we sometimes do, see: LabVIEW, UnrealEngine's Blueprints, Luna Lang, shader editors in just about every modern 3D application, ...
 - Though structural editors exist; see e.g. Paredit mode for Emacs, which lets you do edit operations on the tree your code represents, moving and splicing branches around while ensuring the tree structure is never damaged.
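For what it's worth, the tree behind the text is directly observable in languages that expose their parser; a small Python example:

```python
import ast

# The linear text "foo(x + 1)" already denotes a tree: a Call node whose
# single argument is a BinOp node with Name and Constant leaves.
tree = ast.parse("foo(x + 1)", mode="eval")
call = tree.body
node_types = [type(n).__name__ for n in ast.walk(call)]
```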
Implicitly. This isn't actually represented graphically as a tree. You still have code in linear segments. You just know how to parse the text into its tree structure. The fact that code is actually parsed into an AST doesn't mean anything here. Pointing this out is just showing off. I'm happy to talk about programs as graphs (my PhD is in PL) but for the purposes of understanding how people write programs we write them as linear text.
> And the reason we don't lay out programs as visual, interactive graphs is because text is faster to work with using digital tools.
Yes. And similarly, if you are publishing a recipe book the laying out text is easier than gantt charts.
>  - Though structural editors exist; see e.g. Paredit mode for Emacs, which lets you do edit operations on the tree your code represents, moving and splicing branches around while ensuring the tree structure is never damaged.
Yes we all know. There's been decades of research in this space. And basically nobody in real life uses them.
When I was learning to cook my process was to read the recipe thoroughly, including any instruction on techniques. A good book will tell you how to chop an onion and sauté it. I would then distil this into a dependency graph which I would write down in a book. I still have this book. A recipe in there looks something like this (excuse the bad ASCII art):
2 eggs ----------|
200g sugar ------|--- beat together ---|
200g butter -----|                     |
                                       |--- combine --- bake
240g flour ------|                     |
1tsp baking pdr -|--- combine ---------|
1tsp vanilla ----|
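The same distillation can be done in code; a sketch using Python's standard-library topological sorter on the graph above (step names are mine):

```python
from graphlib import TopologicalSorter

# Each step maps to the steps/ingredients it needs.
deps = {
    "beat together": {"2 eggs", "200g sugar", "200g butter"},
    "combine dry":   {"240g flour", "1tsp baking pdr", "1tsp vanilla"},
    "combine":       {"beat together", "combine dry"},
    "bake":          {"combine"},
}

# static_order() yields a valid execution order: every dependency before
# the step that needs it.
order = list(TopologicalSorter(deps).static_order())
```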
Nowadays I don't have to actually write this down because I stick to a few cuisines that I know well (English, French, Italian and Indian generally). I know 90% of the techniques I'll need for any recipe so I can simply read the recipe, assimilate it, then execute it in the kitchen.
The two biggest mistakes I see new cooks making are not reading a recipe through first and not building a library of common techniques. Instead I see people taking the original recipe right into the kitchen, often on their phone these days, and executing it as they read it through for the first time. This usually leads to an incredible amount of wasted time due to poor scheduling. Always aim to be free of the recipe. Like a musician, you should eventually be able to play the piece without the music.
I tend toward configuration-driven design, the more I get into operations (not necessarily development in the purest sense).
If I'm writing things that I want people to use, I want them to describe what they want - I don't want them writing code unless they need to extend what I've already done.
Configuration as code can definitely work and make some things more clear (at least, until the point an edge-case has to be added to the core routines to account for a new/custom type of configuration process).
Using a lisp tends to treat code as data, which solves all the problems in one fell swoop.
So I think specifically the post's analogy of "it's hard to know how much of each ingredient to use in each step" doesn't really map very well.
Adding indirection can make things more difficult to read. If the details you need to know are placed in multiple places, this is complex and adds cognitive load to understanding the code. -- Ruby is nice to write, but a PITA to refactor, because the 'type' of a method's argument is implicit. Whereas with languages with ADTs and records, a piece of code can be made 'smaller' and more explicit, and easier to refactor.
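A small sketch of the contrast (invented names): the implicit-dict version versus an explicit record.

```python
from dataclasses import dataclass

# Implicit version: the "type" of cfg lives only in the caller's head; a
# renamed or missing key is discovered at runtime, deep inside the call.
def bake_time_implicit(cfg):
    return cfg["base_minutes"] + cfg["extra_minutes"]

# Explicit record: the shape is written down once, so tools (and readers)
# can check every use site when the type changes.
@dataclass
class BakeConfig:
    base_minutes: int
    extra_minutes: int

def bake_time(cfg: BakeConfig) -> int:
    return cfg.base_minutes + cfg.extra_minutes
```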
Maybe the post's argument can be adjusted where with some baking items, an additional step may-or-may-not be taken.. where an indirect style makes it harder to get an understanding of what's going on. -- But it's also important to note that sometimes the system being modeled is complicated and benefits from the added indirection.
I also hate when people make oversimplified analogies about programming.
Or maybe it's just me and my brain being wired abnormally for cooking. I'd love to learn it, but I keep bouncing off it, hard.
Pick any random ingredient you enjoy and cook it using a few different techniques and a few different time/temperature combinations. Try to understand what is happening with that ingredient, why they taste different and which you prefer. You'll soon get a feel for how different combinations of time and temperature affect different ingredients and will often be able to guess how new ingredients will behave based on your experience with similar ingredients.
I personally consider Alton Brown's old TV show Good Eats a great introduction to cooking following this approach. Most episodes are dedicated to one ingredient or one technique and really break down the science behind everything and how different factors affect the outcome. Once you understand the basic techniques and ingredients, putting together recipes becomes a lot easier.
This, incidentally, is also the best way to improve your programming skills.
On a piece of paper, draw 5 nodes and make a fully connected graph. What shape does it have? Well, whatever way you drew it, the shape is quite recognizable and distinct.
Now on a piece of paper, draw 1 million nodes and make a fully connected graph. You can use a computer if you want. What shape does it have? You can't tell, because all the lines are in the way? Alright, well what if we make 5 clusters and represent them as nodes, and since it's a fully connected graph, you can just connect those 5 nodes fully. It has a recognizable shape again! Nice :) There is the tiny little caveat that you now have a cluster of nodes represented as one node, but what could go wrong?
More layers of indirection, that could go wrong. The more concrete stuff you have, the more you need to abstract away in order to maintain a high level overview, but the tradeoff is that while it is high level, it is less concrete (more abstract).
> Step 5: Bake for 10 minutes, cool for 5, enjoy!
Yet even the initial implementation gets things wrong.
"I'd note that the recipe used in the post doesn't include a list of ingredients."
Yup. Also missing are preconditions, assumptions, defensive programming. Maybe forgivable omissions from a blog entry. But those "ingredient" steps are what allow the "recipe" to be simple.
"decouple your ingredient prep from your cooking algorithm."
This is The Correct Answer[tm].
But I don't see anyone explaining why: It makes the code testable, directly.
Stated another way:
Decouple all the async, blocking, I/O stuff from the business logic. And do not interleave those tasks.
How you know you're doing it wrong:
Any and all use of mocking, dependency injection, inversion of control is wrong. Therefore, the presence of Spring and Mockito (and their knockoffs) is strong evidence you're doing things wrong.
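One way to read "decouple the I/O from the business logic" in code (a sketch with invented names): the pure core takes plain values and returns plain values, so it is testable directly, with no mocks.

```python
# Pure business logic: plain data in, plain data out. Directly testable.
def price_with_tax(price_cents: int, tax_rate: float) -> int:
    return round(price_cents * (1 + tax_rate))

# Thin imperative shell: all the async/blocking/I/O concerns live here,
# and the shell stays too trivial to need testing machinery.
def handle_request(read_price, write_total, tax_rate=0.2):
    write_total(price_with_tax(read_price(), tax_rate))
```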
My professional cookery textbook doesn't write recipes like that. It starts each recipe with 'mise en place'. It also requires that you are familiar with the previously described general process for recipes of that type.
It's a shame that most people are only familiar with the trash that comprises the bulk of cookery books.
Sorry, rant over.
 Professional Cookery: The Process Approach, Daniel R. Stevenson, https://www.amazon.com/dp/0091583314
Representing a simple recipe as a process may work, but try modeling an increasingly complex system (say your simple local football players tracking system, to something more extreme like a payroll system or an aircraft traffic control system) in such a way.
It has been tried before with limited success during the decade of structured analysis and dataflow diagrams.
Isn’t such an approach suited to simple transformational problems only?
It reminded me of how FP is applied to a program: the instructions and intent of the developer are encoded to configure the system (the recipe), and then the execute function is called. E.g. the onion concept.
Far from being an expert on the topic but was happy about recognizing the pattern.
To the article I would like to add that services can be designed as cooks, so that each and every one has a purpose and a separation of concerns.
My recipes are vague inspirations at best. Cooking is done by listening, smelling, tasting and looking, not by reading a set of instructions. Not sure how that translates to coding.
When making dinner for your family, sure. When needing to turn out 1000s of identical cookies day after day and you don't even know who will actually be making the cookies next week, not so much.
But ya, none of those solutions seem great
Two of the best programming books I read were UML Distilled (which introduced the idea of some patterns besides teaching UML) and Design Patterns Explained (Shalloway, Trott).
The moment I started reading this post I thought “that’s a strategy pattern use case”. And that’s the conclusion.
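For readers who haven't met it, a minimal sketch of the strategy pattern in this recipe setting (names invented): the varying part, how a given dough is prepared, is passed in as a strategy, while the surrounding process stays fixed.

```python
from typing import Callable

# Two interchangeable strategies:
def make_cookie_dough() -> str:
    return "cookie dough"

def make_brownie_batter() -> str:
    return "brownie batter"

# The fixed process, parameterized by the strategy:
def bake(prepare: Callable[[], str]) -> str:
    dough = prepare()          # the interchangeable step
    return f"baked {dough}"    # the unchanging part
```

Adding a third recipe then means writing one new preparation function, not touching `bake` at all.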
A lot of people fret about reinventing the wheel (use a library instead!), but then they very often do exactly that with software design, where a lot of problems are not only well researched but also peer reviewed and properly described, along with the consequences that come with them.
If you enjoyed this article, I would recommend picking up a book on design patterns (any popular one will do), as there are many more prefabricated solutions to choose from.