Hacker News
Write code that is easy to delete, not easy to extend (programmingisterrible.com)
559 points by AndrewDucker on Feb 14, 2016 | 133 comments

I think this highlights a common issue when development is (or feels) rushed. You either end up with developers having only done the first part of each of these pairs (repeating themselves, ball of mudding, etc), without time to clean up as part of each iteration, or, you find developers immediately shooting for the latter half of each pair (DRY, modular, etc) without having done the former, and so you get abstractions that make no sense, overly complex interactions in a shared function as they attempt to be DRY, etc.

This latter is also, I feel, what informs a lot of the monolithic frameworks used for 'enterprise' development, Spring and the like, where a predetermined architecture and structure of the app is imposed by the framework, and which leads to, once you get down in the weeds of dealing with odd edge cases and things, hackery on the part of the developer, or framework bloat if the framework attempts to address the most frequent of those cases.

Couldn't agree more. Sandi Metz has an excellent talk touching on this topic. She describes developers' exaggerated willingness to keep things DRY and elaborates a bit on why: it's one of the easiest things for a not-so-experienced developer to identify and one of the easiest things to teach.

Edit: One of the best quotes from that talk is (paraphrased): "The wrong abstraction is a lot more expensive than duplicated code". https://youtu.be/OMPfEXIlTVE

Funny you mention monolithic code. The post expressed a very important thought:

> we’re trying keep the parts that change frequently, away from the parts that are relatively static

Modularization and good code contracts help "sequester" code parts with different rates of change.

And this is my guiding hunch when I am splitting up code and deciding on module surfaces: how confident are we in the subject matter that given code describes? This does not replace actual proper concept modeling, but it is an intuitive feel for helping future iterations go smoothly.

Intuition is great but we often forget that we can use churn stats too.

    find . -name "*.rb" | xargs -I file sh -c 'echo `git log --oneline file | wc -l`: file' | sort -nr

Applying a time period window is good too, as some parts of the code churn for a while and then reach stability.
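
A rough Ruby sketch of the time-window idea, in case it helps. It assumes the log text comes from something like `git log --since="3 months ago" --name-only --pretty=format:` (one path per line, blank lines between commits). The method name `churn_counts` is made up for illustration.

```ruby
# Hypothetical helper: counts how often each path appears in the output of
#   git log --since="3 months ago" --name-only --pretty=format:
# and returns [path, count] pairs, most frequently changed first.
def churn_counts(log_output)
  counts = Hash.new(0)
  log_output.each_line do |line|
    path = line.strip
    counts[path] += 1 unless path.empty?
  end
  # Sort by descending count, then by path so ties are stable.
  counts.sort_by { |path, count| [-count, path] }
end

# Sample text standing in for real git output.
sample = "app/user.rb\nlib/util.rb\n\napp/user.rb\n"
churn_counts(sample).each { |path, count| puts "#{count}: #{path}" }
```

In a real repository you would pass in the captured output of the git command instead of the sample string, and change the `--since` value to move the window.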

What does that command do?

By my read, it prints a list of all Ruby files in your project sorted by number of commits against them. It may take a long time to complete depending on number of files and number of commits.

YMMV, I am not a BASH shell.

We put a variation of the script to music also. Seriously. It's on iTunes and Bandcamp:



http://explainshell.com/explain?cmd=find+.+-name+%22*.rb%22+... doesn't quite get it either, but I think you're right. I'm not near a machine with Bash to check.

Actually, it finds all Ruby .rb files, then prints out that file every time a change was made to the file. It then does a count of each of the times the file is output, and then sorts the list with those files that were modified the most frequently being at the top, and those modified least frequently being at the bottom.

In other words, it shows (as the poster says) file churn in a Ruby codebase.

It finds all of the files with the .rb extension in a directory, then feeds that in to git log --oneline. wc -l then counts how many lines resulted from git log, which basically tells you how many times each file has been modified. Lastly these numbers are sorted from largest number to lowest.

The gist of it is that you figure out which files get modified the most. This lets you know where your most frequently changed code is.

Your first paragraph just nailed it. You were able to mix the post content with real world mechanics.

One thing I find really hard is where to draw the line to an uncleanable ball of mud.

For myself, I find the most important thing is to have clear interfaces (contracts). That is, I can write the hackiest code inside of a module, but I will spend time upfront to make sure that what that module exposes is the cleanest I can make it.

Then, I can isolate and fix a function by itself. It may have been written to be 200 lines long, filled with hackery and half measures, but my complexity is contained within it, and the functions it calls out to (so nothing calling it should need to be substantively changed). Those called functions, in turn, may also need improvements, but my focus every step of the way was keeping the interfaces clean, so I can always get down to some base set of functionality that has no dependencies, fix that up, make the necessary fixes to keep things working in the functions directly calling them, test, and then move up to those calling functions and repeat, until I get back up to the big ugly 200 line monstrosity. Every step of the way I can make sure things are still working, and I don't have to substantively touch anything above the 200 line monstrosity until I've cleaned it up.

So by keeping my interfaces clean, I can figure out how, inductively, I can make progress.

The lack of clear interfaces is the real problem; if you don't put in the effort when writing your code to keep those clean, you end up with circular dependencies, implicit workflows and state between functions ("to call X you first have to call Y to get a foo, pass that into X, then call Z to reset the foo value"), and other nightmares, that move you more to needing a full rewrite.

Also, plain old data interfaces are very desirable. They decrease coupling. Get the data off the interface and restructure it (renaming, preprocessing...) in a suitable way before implementing the actual functionality. This decrease in coupling makes representation changes in the interface more practicable.

Of course sometimes stateful (procedural) interfaces are a must, but it's surprising how many painful OOP classes can be replaced with a "const struct foo" interface with a clear meaning to the data in it.
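
A minimal Ruby sketch of what a "const struct foo" style interface might look like, assuming a frozen `Struct` is an acceptable stand-in for a const struct. `ResizeRequest` and `target_size` are invented names, not anything from the article.

```ruby
# Hypothetical plain-data interface: callers hand over a frozen record
# instead of an object with behaviour and hidden state.
ResizeRequest = Struct.new(:width, :height, :keep_aspect, keyword_init: true)

# Reads the data off the interface first, then does the actual work.
def target_size(request, original_width, original_height)
  return [request.width, request.height] unless request.keep_aspect

  # Pick the smaller scale factor so the result fits inside the requested box.
  scale = [request.width.to_f / original_width,
           request.height.to_f / original_height].min
  [(original_width * scale).round, (original_height * scale).round]
end

request = ResizeRequest.new(width: 100, height: 100, keep_aspect: true).freeze
p target_size(request, 400, 200)
```

Because the request is just data, a test can build one by hand without mocks, and the function can rename or preprocess the fields internally without the caller noticing.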

Higher order functions are one tool to help transform procedural interfaces into data interfaces---after all that's the whole point of treating functions as data.

What exactly is "the whole point of treating functions as data"?

Of course you can declare everything is data. You don't need higher order functions for that. But real data is simple and introspectable. Computation is not.

Yes, even pure functions are not introspectable. All you can really do with a function is call it on a value and get a return value. (For the sake of simplicity, I am ignoring side effects here.)

Let me try to rephrase with an example. Eg you might have an API for a file that allows you to open a file, manipulate it, and then close it. That's a very procedural interface.

As an alternative, think of an interface like the following:

    withOpenFile(filename, manipulator)
that opens the file, calls the manipulator function on the contents, and automatically closes it.

Or compare map, filter and reduce vs manually iterating over a collection of items.
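
For what it's worth, here is roughly how the `withOpenFile` idea could look in Ruby, where the manipulator is passed as a block. The name `with_open_file` is my own rendering of the pseudocode above, not an existing API.

```ruby
require "tempfile"

# Hypothetical Ruby rendering of withOpenFile(filename, manipulator):
# the caller supplies only the manipulation, never the open/close steps.
def with_open_file(filename)
  File.open(filename, "r") do |file|
    # File.open with a block closes the file when the block exits,
    # even if the block raises.
    yield file.read
  end
end

# Demo on a temporary file so the example is self-contained.
Tempfile.create("demo") do |tmp|
  tmp.write("one\ntwo\nthree\n")
  tmp.flush
  line_count = with_open_file(tmp.path) { |contents| contents.lines.size }
  puts line_count # prints 3
end
```

The caller can no longer forget to close the file or call things in the wrong order, because there is only one call to make.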

I did really like your grandfather comment (https://news.ycombinator.com/item?id=11099224). I hope I've cleared up my point?

nitpick: in Javascript you can actually call functionName.toString() and get the implementation. AFAIK angular did this trick to implement their dependency injection mechanism.

Yeah, but that's evil. And even with the source code, the only decidable properties of Turing complete systems are trivial. (Eg you can't even tell in general whether a given function will eventually return or loop forever from the source.)

That doesn't show the actual values in the closure. Which are subject to change anyway.

Thanks :-)

My experience has been that shitty code tends to be shitty because it accesses "magic global state," so it's hard to corral it into a single module almost by definition. This happens usually because of rushed deadlines; it's often hard to plumb through an extra parameter if there are many touchpoints or a lot of tests to fix, and easier to just stuff something into a global variable and ship the code.

That's why a language that forbids magic global state (or at least makes you clearly tag every tiny instance of it) can be so useful.

Then someone invented the singleton.....

In eg Haskell a singleton needs to be tagged with IO. And that makes sense, because singletons are clearly evil global state.

+1. Clean, data-driven interfaces also promote meaningful tests as opposed to cringe-worthy "mockup object"-driven tests.

Same here. Write your shitty code in an isolated private part. Expose decent parts and let that be used elsewhere.

Yeah.. Spring batch comes to mind.

The trick really boils down to: be messy, but clean up. If you don't do the second half of each thing (copy and paste / don't copy and paste), then you end up with that unmaintainable mess. If you do the second part too early, you end up with the wrong abstraction.

Sandi Metz also talks about this: http://www.sandimetz.com/blog/2016/1/20/the-wrong-abstractio...

Yes yes yes. When I'm coaching junior developers (or developers that like over engineering) this is what I try hammering home more than anything.

If you looked at my code at 10-70% solution complete you'd find so many egregious DRY errors, magic values (likely with a comment above for my own memory), functions that are too long, and a whole ball of mess. If you looked at it 90-100% complete it would be nearly unrecognizable in writing style to what you saw earlier.

Figure out how to solve the problem first, spend all your efforts on that, then figure out how to clean up your code and make things nice/performant/secure/etc.

I've found that as I've gotten better my sloppy is better than a lot of people's clean, which means I'm spending less time on cleanliness, and getting done faster.

In the real world, at least mine, once it works you get pressure to move on. Both my own mind saying, it works so it's good enough, and external pressure from a stakeholder who says it looks good, let's move on. I don't feel like I get to that latter 20% - 30%. As such, I try to be thoughtful on my design, but I also realize the need to get stuff done. It's a balancing act.

The pressure you put on yourself is your responsibility. Don't pretend it's something you have no power over.

The stake holder should be your customer, not your boss. Even if that dynamic is broken, you don't need to tell them it's shippable until you're happy with the code.

True, we are the ones responsible for our integrity and we often live with our mistakes, paying for poor design choices during later maintenance cycles.

Just be aware that it can be easy to get lost in a yak-shaving exercise doing your cleanup refactoring. I recommend timeboxing this phase to no more than an hour, or prioritizing one or two specific tasks so that you don't go off the deep end.

Don't forget that others are depending on you to get their work done. There may be a 2-hour cleanup task somewhere else in the code that will help your teammates far more than the itch you want to scratch here and now.

> I recommend timeboxing this phase to no more than an hour, or prioritizing one or two specific tasks so that you don't go off the deep end.

When I read comments like this I think to myself "How can they possibly be so productive that refactoring can be done inside of an hour?".

Then I remember that some developers split up projects/problems into really small components and can complete a feature or two in a single day. Well then an hour is actually a fair bit of time to spend refactoring.

Yeah, I don't see a lot of talk of how hard it is to clean up code. A lot of programmers just aren't good at it, so maybe it's just as well that they don't do much of it...

"Convert your working code to good design" is not a trivial task, and like you say, a big part of it is to pick your battles.

The whole of the Software Engineering discipline has not yet implemented a standard reply for "where are you at with this?" "is it done yet?"

Sometimes cleaning up code has positive business ramifications - for example, I took the time to design a component carefully in the past few weeks for work, and when it came to start a new user story for a nearly identical feature for another view, the careful cleaned up design carried over seamlessly & got finished in about a day of work. This will be assuredly reused again for another feature with nearly identical logic/models, and thus saved the company almost 2x the time of development.

Sometimes you just have to argue for the effort being expended, and my managers all had the foresight to push for doing it right when it had large ramifications.

Yeah, sometimes even though it works from a functional stand point, you can say it 'isn't working' and still be honest to yourself, since the messy code won't work in terms of long-term maintainability.

The trick is to think of the last little bit that makes it complete in terms of the 'cleaned up' approach, which means you bake in the cleaning up/abstracting as necessary for completion.

Exactly, it's not 'done' until it's cleaned up. Easier said than done, though.

When I use the word done I mean done to the point that I'm marking the task/project/module done and telling someone it's done. It's not like a PM is checking the CI server and noticing the build is green and the unit tests pass and telling me to get on to the next task...

There's done as in the spirit of the assignment is completed to a satisfactory level of cleanliness and then there's perfectly done, which means you'll never ship it. I think that's hard for a lot of people to come to terms with, people of all skill levels. The code needs to be easy to navigate and the patterns should be as easy as possible to learn and emulate for others entering, but you don't have to start there, you just need to wind up there before you call it done.

Do you get your code reviewed? That's a good incentive for high quality code.

In Erlang and OTP in Action, Joe Armstrong is quoted as saying (and I'm paraphrasing here 'cause I don't have the book in front of me) - "First, make it correct. Then, make it beautiful. Then, if you need to, make it performant. Because 9 times out of 10, making it beautiful also makes it performant enough."

Funny I have a saying which is very similar:

"Make it work, Make it work well, Make it work fast"

I would disagree with beauty == performance, most performance problems I have found to be in the last place I would think them to be :-) On one (moderately large) system it was a hashcode() calculation ;-)

'Place I think it to be' is completely orthogonal to beautiful code. :P

From my own experience, many, many times, definitely a majority of them, when I have had performance issues, and finally found the bottleneck, and fixed it, my code was more beautiful afterwards. Algorithms were simplified, duplicated effort was removed, etc. One major one in my mind was where code was attempted to be reused, but in doing so it caused an O(n^2) behavior, where an O(n) could have been done instead. Upon finding it, and refactoring it, the code had fewer interdependencies, the logic was far more straightforward, both things I find to make for 'beautiful' code, and it ran far, far faster (from 'this particular operation takes an hour' to 'this particular operation takes < 1 second'). Had I invested time up front after making it work to clean up the code (rather than stop at 'it works, good enough') I would have stumbled upon the same fix, even before seeing the performance issue.
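
To make the O(n^2) versus O(n) point concrete, here is a small made-up Ruby illustration. It is not the code from the story above, just a sketch of the same kind of accident: reusing a general-purpose lookup inside a loop. With two lists of similar size n, the first version does on the order of n^2 comparisons and the second on the order of n.

```ruby
require "set"

# Quadratic: Array#include? scans `right` again for every element of `left`.
def common_ids_quadratic(left, right)
  left.select { |id| right.include?(id) }
end

# Linear on average: build a Set once, then each membership check is a hash lookup.
def common_ids_linear(left, right)
  seen = right.to_set
  left.select { |id| seen.include?(id) }
end

left  = [1, 2, 3, 4, 5]
right = [4, 5, 6]
p common_ids_quadratic(left, right)
p common_ids_linear(left, right)
```

Both return the same result; the second also states its intent (a membership test) more directly, which is the "more beautiful and faster" effect described above.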

I believe the common idiom is

  Make it work

  Make it right

  Make it fast

It is; Joe was restating it differently ('right' is ambiguous. 'Beautiful' is subjective, but at least a more appropriate adjective for code quality), and also offering the insight that prioritizing code quality also often gives you performance.

That's probably where I got it from, somewhere in the mists of time, thanks rakoo!

I suppose I'm similar to the way you write code. However I'd like to note that this 10%-70% in my case (and may be true for you) is applied only in cases when I'm implementing something very unfamiliar to me.

If I'm working on some familiar problem or in a familiar area I tend to write nice, clean code from the beginning. As I have more and more experience in various areas it reduces the number of times when I have to do this process of dirty code->refactor.

That's definitely true for me. If I'm writing a traditional web app using a language or framework I'm very familiar with then the difference between 10% and 100% in cleanliness is nearly negligible. But those aren't the interesting projects. When I look at my code and it looks like shit and I've got notes and commented out blocks I feel happy because I'm working on something that my brain can't just spit out a complete answer for. It also feels even nicer to be near the end and see how nice it all looks once my thoughts have formed into a solution.

When I teach teenagers I tell them, there's never time to do it right OR to do it again. Time is money; Engineers are paid to solve the problem and not make a beautiful thing. Get used to just-in-time solutions and just-good-enough implementations.

With that in mind, it's important to have some discipline, to make your 'just good enough' not too crappy. That's the difference between an apprentice and an expert.

I do not like this approach, and I think it is not the right way for medium to large software projects. Maintaining lots of interacting 'just-good-enough' code is a nightmare - and it can easily cost you way more than you saved writing it.

People get lazy and sloppy early enough. I would try to encourage teenagers to go for the best and most beautiful solution that they can possibly find.

I don't believe this is actually measurable. Any claims about time saved or lost are merely guesses, especially in aggregate for a large project.

Sorry, but I feel I need to be harsh to express how much I disagree; it sounds like you are a terrible teacher.

Beside feeding them with the "just write mediocre code" you are also taking away the beauty of software development: "you have to write something just good enough and move on"...

No; I teach about responsible Engineering. To make beautiful code is a great thing. On your own time. That's how you become an expert of course.

So no I take nothing away from the art, in the right place and time. And thanks for the personality evaluation!

It wasn't a personality evaluation, it was a critique of your teaching methodology.

Responsible engineering isn't targeting the minimum thing that works. That's short sighted thinking that involves taking on way too much technical debt up front to pay for a faster initial release. This method might work if you're writing one-off tiny projects, but it will slam you if the project needs to adapt to new requirements in the future.

That's the difference between writing software and building a bridge. Three weeks before a bridge is completed, nobody can say, "oh BTW, this needs to also carry two trains." With software major requirements can change last minute and you better have written your code to deal with that possibility.

For many, maybe most, engineers it is targeting the minimum thing that works. For every platform or structural problem, there are 1000 apps. And they get old faster than cheese in my refrigerator. Shipping sooner for less cost is a primary metric!

I disagree on the phrasing. When I'm teaching beginners or juniors in a school or professional environment, as I said, I hammer that it doesn't have to be perfect. It never will be. Don't over engineer things. But everything NEEDS to be right. Where right means the problem is solved, the code is understandable, the patterns (and there must be a pattern) are easy to discern and emulate. Along with proper code formatting- I am amazed at how many people in the workforce don't even understand how important formatting is! FFS most IDEs will auto format with a keystroke.

'Right' is definitely subjective, so to you, right may be what I call perfect and we're talking about the same thing. But I feel like if someone is hearing me say it doesn't have to be right, I'm excusing bad behavior and shit code. Whereas when I say it doesn't have to be perfect I'm saying a small amount of code smell or some tricky areas are acceptable if they can't be helped given the money and time constraints.

At least one problem with this advice is you do not define "just good enough". When I was a new developer "good enough" simply meant it passed all obvious tests I could think of.

I'm not sure I object to just-good-enough solutions, but, there's still an awfully large benefit from paying down technical debt, and I hope your students appreciate that.

My approach is to pay down the technical debt before starting the next big change. I consider it akin to prepping my work site. As an example: consider a project where you're modifying some large continuously running system in production, creating an alternate codepath to ultimately replace some legacy codepath. You'll likely add the new codepath first and then migrate some subset of users to it.

There might be some diminishing returns from migrating a long tail of users to the new codepath (perhaps they were better served by the legacy codepath for reasons that don't quite justify the maintenance of both codepaths). And even when they're migrated, you quite likely don't need to rip out the legacy codepath. So at some point, perhaps before all the users are on the new path and perhaps before you rip out the legacy codepath, you'll consider the project a success and move on.

To me, that's quite reasonable. But consider then that any project on the same codebase in the future has to account for both of these potential codepaths. That adds an expense to every future change you make, as you have to maintain and support twice as many things. In my experience paying down that debt will pay for itself over the course of an astonishingly few (but greater than zero) number of future changes.

I mention greater than zero because quite a lot of systems get decommissioned, for one reason or another. If someone is going to bulk delete your code and everything near it, it doesn't really matter how bad the code smell was right before: the code that smells the least is the code that doesn't exist.

This is why I make various forms of cleaning up code - in-depth refactoring, deleting all unused code and tests, etc. - not the last thing I do when I finish a project, but rather the first thing I do when I start the next project. It's also much easier to explain why you're doing it to outside observers at that point - you're doing it not just because it "improves code smell"; you're doing it because it makes what you're trying to do now substantially easier.

The only time I think this can start to fail is if you aren't the person who would be doing the next project / you could do the code cleanup far more efficiently than the person who is doing the next project. In which case this all becomes much harder, and you should just remember to treat others as you'd wish to be treated. But I'm not sure it totally doesn't work -- for the person inheriting your work, getting to do some amount of refactoring and cleanup may well end up being a decent way of them getting to know the codebase better (if they knew it as well as you they could clean it up just the same).

"Figure out how to solve the problem first, spend all your efforts on that, then figure out how to clean up your code and make things nice/performant/secure/etc."

I think this explains it neatly.

Someone should invent "immutable programming". A paradigm where you can't delete code.

It's a bit the opposite of Vigil, a programming language that punishes functions throwing errors.


We've been doing that at our company for years. Unfortunately we're hitting the Android DEX limit... time to merge some methods!

Ethereum has immutable code.

Some embedded software gets pretty close to that.

Yes. It is an iterative process, like writing a story.

This is extremely validating to read. How many times have I battled with DRYists over which solution is "better."

I've happened upon the pattern of code growth described here after years of encountering and resolving pain points in code, often as the maintainer. DRYness for DRYness sake might feel satisfying when writing the code, but when changes need to be made, it often scatters constraints and requirements throughout the codebase, making any change an unestimatable mess of sneaky traps. Try writing your boilerplate initialization code straight some time... no metaprogramming, no helper functions, just get your configuration, create your objects and wire them together. It is oddly liberating to have your bootstrap code flat and unmagical.

> DRYness for DRYness sake might feel satisfying when writing the code, but when changes need to be made, it often scatters constraints and requirements throughout the codebase, making any change an unestimatable mess of sneaky traps.

This also applies to "one-liners" when writing code. For example in ruby:

    Model.where(...).foo { ... }.flatten.bar { ... }.join(...)

Looks nice and all, but it can be confusing when you need to fix a bug that you aren't sure of the source (foo or bar? maybe we shouldn't flatten yet? maybe we should join sooner?). It's much better, IMHO, to write more lines of code here so that you can easily test each step manually when a bug occurs.
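
A small Ruby sketch of the "more lines" version, under the assumption that plain arrays of hashes can stand in for the `Model.where(...)` result. `ORDERS` and both method names are invented for the example.

```ruby
# Stand-in data; in the comment above this would come from Model.where(...).
ORDERS = [
  { customer: "ann", items: ["pen", "ink"] },
  { customer: "bob", items: [] },
  { customer: "cy",  items: ["pad"] }
].freeze

# One-liner version: compact, but a failure only points at this one line.
def summary_one_liner(orders)
  orders.select { |o| o[:items].any? }.map { |o| o[:items] }.flatten.map(&:upcase).join(", ")
end

# Same pipeline, one named step per line, so each intermediate value
# can be printed or asserted on when hunting a bug.
def summary_stepwise(orders)
  with_items = orders.select { |order| order[:items].any? }
  item_lists = with_items.map { |order| order[:items] }
  all_items  = item_lists.flatten
  shouted    = all_items.map(&:upcase)
  shouted.join(", ")
end

puts summary_stepwise(ORDERS)
```

Whether the extra local variables are worth it is exactly the trade-off being argued here; the sketch only shows that the two forms compute the same thing.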

Your point seems to be more about using chains of set operators rather than the fact that it is on one line.

I am a C# guy and I find Linq which is the equivalent to this insanely useful and I often write code as above. The tax on this is the rest of the team need to be up to speed in reading such code. If they are then what you gain is use of a bunch of predictable operators rather than loops with counters, state, etc... that could have subtle bugs.

Using code like this is no different to writing SQL that does joins and aggregates in a stored procedure and calling it from your code, something I also think isn't a bad thing.

Debugging can be done by splitting it into lines, or using a decent enough debugger. Or making it pure and writing unit tests or property-based tests to cover it.

> Your point seems to be more about using chains of set operators rather than the fact that it is on one line.

It isn't, though. My point is that sticking chains on one line makes it difficult to read as well as debug the various bits and pieces that all rest on a single line.

Consider the task of debugging any arbitrary error where you have little more than a line number to go by. You now need to refactor the line just to find out which method invocation is causing the error.

There are also readability issues with lines like these.

Ok so you are happy with this?

    Model.where(...)
        .foo { ... }
        .bar { ... }
In Visual studio it doesn't really make a difference from a debugging perspective. However in other debugging tools perhaps it does, and in that case I would tend to agree. But then this is a minor point, something that can be covered with a coding standard and linting the source.

Actually, yes. That's exactly what I'm advocating. Whether they are separate assignments or just chains broken up into multiple lines, the end is the same as far as what I'm talking about.

> It isn't, though. My point is that sticking chains on one line makes it difficult to read as well as debug the various bits and pieces that all rest on a single line.

The thing is, if you want to split the line into multiple lines you have to introduce local variables all over. I actually find this harder and more cumbersome to read because if you are reading the rest of the function, you want to know if these are used anywhere (assuming imperative language). If you have everything on a single line, then you know that the intermediate values are not used anywhere else.

I agree that the line number thing is helpful though. However, if you realize you have a bug you'll likely have to reproduce it anyway, and at that time you can introduce local variables just to see where you went wrong.

Perhaps in the end this is just subjective? This style is very common in functional programming (which I prefer) so maybe that's why I like it and hate having more local variables to keep track of.

Whether you need to introduce variables is dependent on the language. Some allow for chains to continue onto a newline, while others won't.

> Consider the task of debugging any arbitrary error where you have little more than a line number to go by. You now need to refactor the line just to find out which method invocation is causing the error.

That's a tooling issue. The right thing is to enhance the stack trace to give more than a line number, and the debugger to allow a more specific breakpoint.

> There are also readability issues with lines like these.

Maybe, but the biggest readability issue by far is when a class or function spills beyond a single screen. Avoiding that is worth a lot of cramped lines.

At this point this is boiling down to a subjective argument on readability and tradeoffs in tooling, so no point in continuing. Have a great day!

Couldn't agree more. Writing a single line of code that can fail multiple ways (especially the same way on multiple parts) is one of my constant sources of annoyance. Not from a DRY perspective, but from a "your code has a bug, but I can't give you enough detail to immediately fix it because it could be one of these X reasons" perspective. Especially on production systems, these become the sorts of issues where the developer makes some temporary fixes to the code, then has to wait to observe the bug again to work out which line is broken.

I'm not addressing your overall point, but one thing I learned about recently that could help in debugging those Ruby one-liners is the `tap` method. You can stick it in the chain and set a breakpoint.
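
A quick sketch of what that looks like, assuming a plain array in place of a model query. `Object#tap` yields the receiver to the block and then returns the receiver unchanged, so the chain carries on as if the tap were not there. The method name and the `trace` parameter are made up for the example.

```ruby
# Sums the doubled even numbers, recording intermediate values in `trace`.
def doubled_evens_total(numbers, trace = [])
  numbers
    .select(&:even?)
    .tap { |evens| trace << evens }     # inspect here, or set a breakpoint
    .map { |n| n * 2 }
    .tap { |doubled| trace << doubled } # second inspection point
    .sum
end

trace = []
puts doubled_evens_total([1, 2, 3, 4], trace)
p trace
```

Once the bug is found, the `tap` lines can be deleted without touching the rest of the chain.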


Further off-topic, but I almost exclusively see tap used as some kind of clever way around a return line, e.g. instead of:

    def build_hash_or_whatever
      hash = Hash.new
      hash[:foo] = some_generating_method
      hash[:bar] = "some value" if whatever?
      # More hash construction nonsense
      hash
    end
With tap it is:

    def build_hash_or_whatever
      Hash.new.tap do |hash|
        hash[:foo] = some_generating_method
        hash[:bar] = "some value" if whatever?
        # More hash construction nonsense
      end
    end
I think the tap-form is significantly worse and I don't understand the aversion to a single line that clearly shows what you are returning.

Changing code to enable debugging is barbaric. Tooling should be able to do that. Or maybe I'm spoiled, lol.

Funny you mention it, I'm also a bit anti-chaining for the same reasons.

Do we need a cute name to describe the practice of optimizing code for maintainability? "Craftsmanship" speaks to the building process, what should we call it when we make code that is understandable and straightforward to change?

One of the earliest experiences I had as a professional software developer was taking over an app that had originally been developed by someone with 30 years of experience. He had looked at the problem domain, said "these things are all similar", and so had constructed his code so that every one of those things shared a huge amount of their code paths. It worked for the two or three things he actually coded, but as soon as I and another dev started to add more (around 12 total I think), it got hackier and hackier. The conditionals needed to distinguish the two or three he had originally handled were easy to ignore, but the conditionals needed to distinguish the 12 were littered all over the place, with no real predictability to them.

We attempted to isolate it as best we could, pulling out into new methods and such, but it was bug whack-a-mole; we reliably were finding a new bug within a week of each fix we made, nevermind any we introduced with new features, because trying to isolate and fix one of the things invariably broke one of the others.

So, finally, I took two days and rewrote the shared logic of the application (only around ~10k lines). I borrowed liberally from the existing code, but basically isolated each thing entirely from the others. They still had the same structure, same interface, but rather than all inheriting the same implementation, they all had their own copy of the implementation, which in turn meant each one was far simpler, and could be tweaked in isolation.

That took about two weeks of use to flush out the bulk of the bugs. And after that, our mean time to failure kept improving; we went with maybe one bug the third week, then one bug a month for the next two months, then it essentially stabilized; it was only in introducing new features that we'd occasionally find/introduce new bugs.

That whole experience taught me not to worry about DRY until you actually have done the work. DRY is to ease maintenance and future development; it should not be a consideration for that initial implementation of a given feature. Things that look similar enough to share functionality often aren't once you become more familiar with the domain. If they turn out to be related enough to share functionality, it was trivial to repeat yourself (copy/paste), and it's trivial to refactor. If they turn out not to share functionality, the effort to try and make them share code was higher than if you'd done them separately, and the amount of effort to pull them apart is huge.

It seems to me what you're describing was a deep and wide inheritance tree. I hate those too. In my experience they have always been very brittle, and lead to confusing code. I'm a huge fan of the recent movement to prefer composition over inheritance.

I don't remember the particulars. But they're not that important to the nature of the issue (though recognizing them as a bad smell is) - any time you have a reused bit of code (be it via inheritance, composition, or even just a function), you run the risk of that kind of complexity. And this can be hard to spot; a function that takes an argument that it then uses as a conditional in a few places can hide a lot of complexity.

That is, if I have a (pseudocode) function

  foo(bar) {
      //D
      //C
      //D
      //C
      //D
  }

where //D indicates a line that does not change its behavior based on the passed in bar, and //C indicates a line that does (and in complex ways; if bar is of type bar1 or bar3 do X, else if it's of type bar5 do Y, else do Z), the code has a lot of interrelated complexities, even though it's DRY. It's better to have

  foo(bar) {
      case(...) {
          bar1: foo1()
          bar2: foo2()
          ...
      }
  }

so that the implementations are completely separate. Even though there will be copied code between foo1, foo2, etc., it's clear what is happening, and you can change any bit of foo1, foo2, etc. without fear of breaking something else (whereas in the original, changing any of the lines marked //C could break unrelated functionality). The issues with the first form, and the improved clarity and reduced coupling of the second, hold true regardless of how the code is shared. With inheritance, the first function would translate to a foo implemented in the superclass, with the //C lines being calls to abstract methods implemented in the subclasses; in the second, foo would itself be an abstract method, with every subclass implementing its own version that is partially identical to the others'. The same applies with composition.
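To make the two shapes concrete, here is a rough Ruby sketch. The invoice/refund domain and every method name in it are hypothetical, invented only for illustration; the trailing `# C` and `# D` comments mirror the //C and //D markers used above.

```ruby
# Shape 1: one shared function. Lines marked C change behavior based on `kind`.
def render_shared(kind, amount)
  label = kind == :refund ? "Refund" : "Invoice"        # C
  total = kind == :refund ? -amount : amount            # C
  line  = format("%s: %.2f", label, total)              # D
  kind == :invoice ? "#{line} (due in 30 days)" : line  # C
end

# Shape 2: dispatch once, then fully separate implementations.
# Some formatting is duplicated, but each method can change on its own.
def render_invoice(amount)
  format("Invoice: %.2f (due in 30 days)", amount)
end

def render_refund(amount)
  format("Refund: %.2f", -amount)
end

def render(kind, amount)
  case kind
  when :invoice then render_invoice(amount)
  when :refund  then render_refund(amount)
  else raise ArgumentError, "unknown kind: #{kind}"
  end
end
```

With two kinds the shared version still looks manageable; the argument above is about what happens to the C lines once there are a dozen.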

Makes sense - you can't know the right abstraction upfront.

As Hejlsberg said:

"If you ask beginning programmers to write a calendar control, they often think to themselves, "Oh, I'm going to write the world's best calendar control! It's going to be polymorphic with respect to the kind of calendar. It will have displayers, and mungers, and this, that, and the other." They need to ship a calendar application in two months. They put all this infrastructure into place in the control, and then spend two days writing a crappy calendar application on top of it. They'll think, "In the next version of the application, I'm going to do so much more."

Once they start thinking about how they're actually going to implement all of these other concretizations of their abstract design, however, it turns out that their design is completely wrong. And now they've painted themself into a corner, and they have to throw the whole thing out. I have seen that over and over. I'm a strong believer in being minimalistic. Unless you actually are going to solve the general problem, don't try and put in place a framework for solving a specific one, because you don't know what that framework should look like."


So of course "write it dirty, clean it up afterwards" seems like a better idea, because it lets you feel what the right abstraction is. However, it takes discipline to always refactor something that already works fine.

I feel like this is a little like learning to cook. At the very beginning, I would just create a huge mess as I was going, use a different pan for every item, a different mixing bowl for each thing, and I didn't know how to prep effectively in advance.

This left a train wreck in the kitchen after each meal that I was forced to clean up before I could cook again.

I learned how to do a little up front prep, and that saved time and mess, and made the whole process smoother.

As I learned more, I started trying to clean up as I went and conserve the number of pots and pans. I made a lot of mistakes, like washing everything right after I used it, which slowed the whole process down when I didn't even need to reuse some of it.

Now I'm in a place where I have enough experience that I know in advance what I can reuse for this meal and whether it needs to really be washed (maybe you just deglaze a pan and wipe it down with a paper towel instead of hauling it over to the sink and scrubbing it), when I'm actually done with an item, what the approximate cooking times are, etc.

Now I can cook complex multi-course meals to a good restaurant quality and have just a couple of things left to clean up by the time dinner is ready to serve.

It doesn't take a genius to figure these things out, just the acknowledgement that these things are important, and the more people you are working with, the more important they are.

Sure, I could have spent the rest of my life cleaning the huge mess after every meal at home, and it wouldn't really matter.

But if you are going to be a line cook at a large restaurant, you must get a hold of these concepts.

And this requires introspection on your process, whether the domain is programming, cooking or really any human endeavor. Unfortunately not many people can do this effectively (and this is related to the 10k hours to mastery dictum: it requires effortful practice, which includes exactly this sort of introspection).

To add to his point about copy and pasting...

Notice the contradiction implicit in these bits of accepted dogma:

1. Copy and pasting code is evil (DRY).

2. Tight coupling is bad, loose coupling is good.

A system written with no abstracted functions (ie, where copy/pasting was the rule) has no coupling. Of course, I'm not advocating such a style. But it's worth keeping in mind that practicing DRY through abstraction necessarily increases coupling, which means that it's always a tradeoff and that you have to get your abstraction right so that the coupling, on average, makes change easier rather than harder.

That's not really true, is it? Coupling is independent from explicit code dependencies. Take, for instance, two programs. One writes to "/tmp/file". The other reads from "/tmp/file" but neither shares the code to determine the path ("/tmp/file", in this case). They're code independent, but very tightly coupled.

In fact, I'd hypothesise that copy-paste codebases are more tightly coupled than otherwise.
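A small sketch of that scenario, assuming Ruby and a scratch file under the system temp directory as a stand-in for "/tmp/file". The `Writer` and `Reader` modules and the file name are hypothetical; the point is only that the path is duplicated rather than shared.

```ruby
require "tmpdir"

# Two "programs" that share no code. Each hardcodes the same path on its own,
# so neither depends on the other at the source level.
module Writer
  PATH = File.join(Dir.tmpdir, "coupling_demo_file") # hypothetical stand-in for /tmp/file

  def self.run(value)
    File.write(PATH, value.to_s)
  end
end

module Reader
  PATH = File.join(Dir.tmpdir, "coupling_demo_file") # duplicated on purpose, not shared

  def self.run
    File.exist?(PATH) ? File.read(PATH) : nil
  end
end
```

Changing `PATH` in only one of the two modules breaks the pair without any error at load time, which is the kind of coupling being described.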

It's certainly true. Coupling measures the degree of interdependence between modules/functions/classes. If everything is copy/pasted, no module depends on any other -- it simply has its own copy of the code. This will result in a terrible system, but it will be very loosely coupled. When you change something, you won't have to worry about its impact on anything else.

In your example, if the content of the file is changing the behavior of the program reading it, then the programs are sharing code via the file. It just happens to be going on at runtime rather than at compile time.

If you need to delete code in a codebase full of copy pastes, it's likely that you will need to delete it in many many places and miss some of them. So although the code isn't coupled, it is hard to delete.

In this case, the code is coupled; it's just that it's coupled through the programmer. :/

I think that was always implicit and intentional. There really is a tension here. Artful programming rides that tension and understands when (1) is worse than (2) or vice versa.

A standard rule of thumb for this tension is the old "rule of three" but it's just a rule of thumb.

Totally agree. One thing that I still find really hard to maintain/delete is CSS code. More and more I feel like it should be included with components rather than in plain .scss or .css files. It feels good to be able to delete a component knowing that there isn't CSS crap left behind.

To solve this issue I keep component-specific .scss files alongside each component's JS file, in a component-specific folder. A root index.scss file imports all of the components' .scss files.

Coupled with BEM, this also helps prevent component style interdependencies.

This article could be summarised as "abstract only when you see a clear need to", something that seems to be the exact opposite of what a lot of programming courses teach; especially those dealing with object-oriented design. I think abstraction should be viewed not as a technique to be applied generously and whenever possible, but a necessary evil, resorted to only when nothing else can simplify the code.

A bonus of this style is that it often also makes the resulting code more efficient for the machine to execute, reducing the need for optimisations later.

The importance of DRY is proportional to what could go wrong if the repeated code has an issue. For instance, if the code is performing some vital calculation or is part of your security architecture, it had better not be repeated anywhere because somebody will need to patch it later and will need to guarantee that the update has been applied consistently.

Sometimes, it's just clearer to rewrite something yourself. For instance, just because you can express just about anything in terms of algorithms in the C++ standard library doesn't mean you should; write the simpler stuff by hand instead. (A great example is this silly idea of “copying to an ostream iterator” just to print out some data; I don’t care if that combination of standard functions happens to produce the desired result, because the code is painful to look at!)

It's also helpful to use aliasing (e.g. C++ reference variables) to make similar code look as similar as possible; i.e. rather than have two similar blocks using entirely different variable names throughout, declare references at the top to give them the same names so that the similarities are obvious. This also makes it easier to later pull common parts into functions if desired.

Easy to replace not simply to delete.

Anything that is easy to delete (and just leave it deleted) is superfluous. Don't write superfluous cruft, obviously.

Easy to replace is important in all engineering. A product or structure with easily replaceable parts is better than one without easily replaceable parts, all else being equal.

We'd never say "design a brake caliper with brake pads that are easy to delete". :)

From his about: "I am not a very good programmer. I forget to write tests, my documentation is sparse, and i’m pretty apologetic about it at any code review. "

Guy who doesn't write unit tests suggests a workaround for dealing with problems in code that has no unit tests...

The point of a lot of tef's writing and speaking isn't that he's a bad programmer, but that he is tearing a large hole in people who claim to be great programmers. tef is actually a pretty good programmer; good enough that Heroku employ him.

It's dry British humour: "I'm a terrible programmer, but that's okay, because programming is terrible, the whole industry is terrible, and pretty much everyone is terrible at programming, especially if they are lecturing you about how to do it because they are probably full of quite a large amount of shit; now let's try and do better".

If you are used to chest-thumping evangelical "rah rah, everything is awesome" sermons, the British cynical attitude of "no, this is all shit, and I'm awful too, but maybe, if we make a nice big cup of tea and think about it a bit more, we might be able to be a bit less rubbish" sounds grating. The opposite certainly is true.

Corollary: Write code such that it's easy to use grep to find where things are used.

One of my pet peeves is when OO programmers take classes' separate namespaces as a license to use super-generic method names. Method names like "add", "update", etc. Makes it near impossible to figure out where those methods are used!
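A toy illustration of the difference, with made-up file names and call sites held in a hash instead of on disk. The `grep` helper here is a hypothetical stand-in for a plain substring search, not the real tool.

```ruby
# Hypothetical call sites from three unrelated classes.
SOURCES = {
  "cart.rb"    => "cart.add(item)\ncart.add_line_item(item)\n",
  "mailer.rb"  => "recipients.add(user)\n",
  "metrics.rb" => "counter.add(1)\n",
}

# Substring search over every line, reporting "file: line" for each hit.
def grep(sources, pattern)
  sources.flat_map do |file, text|
    text.each_line
        .select { |line| line.include?(pattern) }
        .map { |line| "#{file}: #{line.strip}" }
  end
end

generic  = grep(SOURCES, ".add(")          # hits in every file, most of them irrelevant
specific = grep(SOURCES, "add_line_item")  # hits only the cart call site
```

Searching for the generic name returns a hit from each of the three files, while the specific name returns exactly one.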

That's why you should use an IDE with proper "find all references" instead of coding in notepad.

Bit trollish, but: IDEs which let you ignore the complexity of the code let you write complicated code. Writing in less-fully-featured environments forces you to consider simplicity as a desirable feature (so you can, y'know, work out what's going on when using an 80x24 terminal).

C preprocessor macros can also be a great tool for making things ungreppable -- building families of functions using preprocessor token pasting to put together the function names certainly avoids repetition in all those function bodies but it also makes it really hard to find the definition later...
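This is not the C macro technique itself, but a loosely analogous Ruby sketch using `define_method`, which has the same effect on greppability. The `Report` class and its `export_*` methods are hypothetical.

```ruby
class Report
  # Builds export_csv, export_json and export_xml at load time.
  # None of those names appears literally anywhere in the source.
  %w[csv json xml].each do |fmt|
    define_method("export_#{fmt}") { "exported as #{fmt}" }
  end
end

report = Report.new
output = report.export_json # works, but searching for "def export_json" finds nothing
```

The three method bodies are not repeated, and the price is that a text search for any one method name comes up empty.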

Interesting. Thanks.

This looks like it's from a very functional (FP) view of the world. I like a lot that's here, but I fear it will make my OOP friends' heads explode.

Also, as a nit, the essay feels a bit on the "thrashing" side. I know when I'm trying to express complex concepts many times I will hit the same topic a few different times from multiple angles until I get something that's tight. This essay feels like one of those attempts -- nothing here to throw rocks at; it just doesn't feel "done" yet.

Hopefully, he has--cough--written paragraphs that are easy to delete, not easy to modify. Ha. Ha. Ha.

But seriously, this general idea strikes me as being applicable to writing words as well. Write in such a decoupled way that it is relatively easy to excise paragraphs and sections without destroying the rest of the piece.

That may introduce a small amount of redundancy, but it also makes it easy to read in smaller chunks, reorder, or remove outright rather than edit portions that no longer fit.

Do remember to actually delete paragraphs though. I've seen long-form articles that were really just several short-form articles that are linked together by a listicle format. If I'm lucky, there may even be a few transition phrases or a narrative that would justify the listicle format.

All these articles may have been good if the writer wasn't so wedded to the long-form format and started deleting paragraphs. But since he was wedded to the format, all the reader sees is a rather meandering and dull piece. It's an example of the sum of pieces being greater than the whole.

If you're not doing even OO in a functional style, wherever possible, you may be doing it wrong... but don't take my word for it


I think there's a general trend towards things being more functional these days, where possible, even in OOP languages. Since lambdas were added to Java, I know my code style has dramatically changed.

It's not even that OOP developers won't get it. The problem is that this recommendation is harmful for code organized in the Java/.Net/C++ OOP style.

I dunno, with years of experience in OO (ruby for the past 7 or so, Java before that), this still definitely hits home.

I don't agree with the commenter that suggests these suggestions are harmful on code written in OO style -- although they specifically say code written in "Java/.Net/C++" style; maybe? But not ruby/smalltalk/ObjC style.

Some of the good advice in this blog post is taken too far, like the advice about intentionally writing shitty code in order to learn from your mistakes. Coding is like any other skill. You learn from your mistakes only if you're trying your best not to make them. Otherwise there's no differentiating between your real mistakes and your carelessness.

> A lot of programming is exploratory, and it’s quicker to get it wrong a few times and iterate than think to get it right first time.

It's really hard to get it wrong a few times if you have 10 other developers building on top of your mistakes. Then you're pretty much stuck with the code you thought you were going to get rid of. In accordance with Murphy's law, it seems like the code you push out knowing it sucks always ends up being the foundation for something really important. So yes, build the simplest thing possible. Yes, don't abstract prematurely. No, don't write shitty code on purpose.

Similarly with the advice to "copy-paste 10 times". While I agree that you should copy-paste 2-3 times, 10 times is way too many. By the time you've copy-pasted something 10 times, it's too expensive to refactor. Or if you do refactor it, 5 of those 10 instances have changed beyond recognition and will be missed. They continue to evolve, and now for the rest of the life of your software you're fixing 6 times the number of bugs.

Step 8: Listen to what other programmers say

Step 9: Don't listen to other programmers

This all goes to how hard code reuse is: is it worth having two very similar functions that do the same thing instead of having one function that's DRY but more complex than both?

Maybe you should have a shared function that has the similar bits of both but then when you remove pieces you have this added complexity.

Everyone can give you arguments either way...

I remember reading this beautiful explanation about an idea for a functional language where you simply install collections of methods that do specific things and each method you install should be as simple as possible and do just one thing etc. - maybe it was somewhere on the Elixir mailing lists but I can't find it!

This is a super frustrating article, so I'm going to post the quote I think that sums up the art:

"To write code that’s easy to delete: repeat yourself to avoid creating dependencies, but don’t repeat yourself to manage them."

This is such a hard thing to explain to someone, so I'm impressed the writer described it so accurately and succinctly.

Rule Of Three covers steps 2 and 3. This they used to teach in our 1st year computer science program: https://en.wikipedia.org/wiki/Rule_of_three_(computer_progra...

I love this article. I have been at war with some of the tech teams at my company that force reusability. Personally I find it a nuisance, and in most cases can write a new function in less time. You have articulated this ideology very well, thanks for the insight.

Another behaviour which makes these steps hard to follow is: someone wrote code and shipped it, and thinks that because it works and is being used, it is great and doesn't need to be changed. These feelings get even more in the way when the code took a while to write!

I keep telling my team to write bad code: write something that's correct (but not great) quickly, with an eye to replacing it with something better next sprint. Then, once you have something which works end-to-end, work out which bit is most terrible, and replace it. Repeat & launch when the quality is acceptable (and keep repeating after you launch).

I might not have understood the article, but avoiding writing code is dangerous, and will lead to code that after a while will be impossible to understand because there will be hacks upon hacks. Your boss loves this though.

And about cleaning up the code after you got it working, ha, ha, like that would ever happen. It doesn't have to look nice, but make sure you get it right from the start, and that it covers all those twenty edge cases. It will cost more time ... So most software projects fail anyway!? Maybe it's because of your shitty code? (grin)

Over the past year I can't stop thinking about this article: http://250bpm.com/blog:51

A neat response to the same problem posted yesterday:


Basically, layers with numbers and preprocessor directives to inject (tangle) code in the proper places without worrying too much about language abstractions.

I still think it would be better to guide programming using principles, not practices. To avoid fragile code, remember the open closed principle. To avoid wrong abstraction, think single responsibility and substitution principle. The author is right about interface segregation. :)

This is awesome. Easier said than done, but, yes. Coding is a craft, and always will be.

I read this and thought it sounded a lot like my reality. Then, I expected to come to these comments and hear arguments against these thoughts. Glad to see a lot of us agree with much of it.

really appreciated this b/c I've lately felt bad a/b not "getting it right the first time," and one point here seems to be that that never happens, exactly; that development always involves some amount of evolving a body of code, at one level or another

Jean-Paul Sartre’s Programming in ANSI C

I am pretty sure that is at least a misattribution.

I believe all (or at least most) of the "quotes" have been re-purposed/altered by the [blog's] author. e.g. "Every line of code is written without reason, maintained out of weakness, and deleted by chance" is based on Jean-Paul Sartre's actual quote[1] "Every existing thing is born without reason, prolongs itself out of weakness and dies by chance."

[1] https://en.wikiquote.org/wiki/Jean-Paul_Sartre

It's kinda sad that no one else seems to be picking up on this. To lead off with that intentionally altered quote and looking at code in terms of how existentialists frame reality is the most significant insight.

edit: Radical Freedom in my programs intensifies

Wasn't it Sartre who famously wrote Three o'clock is both too early and too late to start refactoring?

“Every line of code is written without reason, maintained out of weakness, and deleted by chance” Jean-Paul Sartre’s Programming in ANSI C.

I just started the article and I already have problems with it, which is not a good sign. The misattribution may be obvious once you research the timelines of JPS and C (he died a few years before ANSI C89 was established), not to mention the miles of metaphorical distance between computer programming and JPS's work. I guess the author was trying to be cute, but that fabrication should be marked as such. He's undermining his own credibility as an author, however much of it the reader decides to grant. Serious problem in my book.

I hope you are being serious cake42, because if you're not you're only mildly funny, but if you are, you're hilarious.

I'm sure you're joking, but for those who are curious, Sartre's actual quote is "Every existing thing is born without reason, prolongs itself out of weakness, and dies by chance."

I thought it was pretty funny. Lighten things up from the start.

Agreed, I usually warm up the engineering crowd with a few quotes from Schopenhauer or Malthus, but Sartre is always good for a few yucks.
