Technical Debt: Why it'll ruin your software (labcodes.com.br)
204 points by yannikyeo 9 months ago | 157 comments

I'm being a bit blunt here. It's easy to say Technical Debt will ruin your software when you pick a contrived thing and label it as Technical Debt.

> John knew his code had a few bugs. But nearing the deadline, another project appeared and it took a lot of John’s time, so he couldn’t go back and fix the problems.

This is a problem, but it is not inherently technical debt. That's just delivering a bad/incomplete feature (not necessarily John's fault if he's been asked to make changes). Code with technical debt works, or mostly works, but relies on non-maintainable or less-than-ideal patterns.

Simplified example. Adding a "Cancel" Button to a form. This button involves adding a new "secondary" class style to the button.

* Technical Debt - Button is in place. Works correctly. Instead of reusing the <Button> component and tweaking it to handle the new "secondary" class, John simply copies and pastes the <Button> code directly into the implementation.

* Bug/Bad Feature - John references the button correctly, but fails to handle certain state resets properly when the form is cancelled.


The former is technical debt because it works correctly, but doesn't follow a good pattern or does something unexpected (refactoring that button will be much, much harder). The latter is likely just bad code.
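A minimal Python sketch of that distinction, assuming a string-rendering stand-in for the <Button> component. The names `render_button`, `cancel_button_copy`, and `Form` are hypothetical and only illustrate the two cases.

```python
def render_button(label: str, variant: str = "primary") -> str:
    """Shared component: one place to change the markup of every button."""
    return f'<button class="btn btn-{variant}">{label}</button>'


def cancel_button_copy() -> str:
    """Technical debt: works correctly, but duplicates render_button's markup."""
    return '<button class="btn btn-secondary">Cancel</button>'


class Form:
    """Hypothetical form with one field and a dirty flag."""

    def __init__(self) -> None:
        self.fields = {"name": ""}
        self.dirty = False

    def edit(self, key: str, value: str) -> None:
        self.fields[key] = value
        self.dirty = True

    def cancel_buggy(self) -> None:
        """Bug: clears the fields but forgets to reset the dirty flag."""
        self.fields = {"name": ""}

    def cancel(self) -> None:
        """Correct: resets all of the state the form owns."""
        self.fields = {"name": ""}
        self.dirty = False
```

The copied button produces the right output today. The cost only shows up later, when the shared markup changes and the copy does not. The buggy cancel is wrong right now.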

The case where the latter is not bad code is if the company has made an intentional decision to not handle cancel edge cases. Even here, I would argue that is product debt - not technical debt.


Every software product in the world has debt. Just like every software product in the world has bugs. You need to account for your debt as you grow. Otherwise it will eat you.

There's a grey area though, where you properly define a new style (yay) at the time the first Cancel button is created, but then down the road you find yourself with a whole bunch of new styles for things, and it turns out over time that the styles are maybe repetitious or subtly inconsistent or whatever. The debt has become the reality that the initial abstraction ("all our buttons are the same") hasn't kept pace with the needs of the product as it has iterated forward. Paying down the debt is creating new abstractions and updating all the buttons to be using them.

The worst case scenario is probably partial paydowns, where the abstractions get changed but some of the non-aligned stuff remains, so that you end up with a jumble. Especially combined with inadequate documentation (or a culture of inadequate documentation, where people don't even look because they don't expect it to be there), the old practices can end up being cargo culted forward even when someone attempted to create a better way.

On a slight tangent, my team had very good results when we finally brought a style guide online. Our org was big enough that people would be working on pages without context for the bigger design themes (especially related to sales/marketing). We required any UI variations to be added to the style guide, and we required a minor justification to add styles. This added just enough friction that a lot of the unintended variation went away, and a lot of the variation just to be different became explicit so discussions about the design could be more deliberate.

How did you enforce this style guide, e.g. to make sure that sales cannot simply write a proposal in Tahoma instead of your company font or marketing publishes a newsletter in Arial using underlined words (gasp)?

A couple of years ago I had to spend a considerable amount of time PoC'ing platforms and tools claiming to solve this issue, and I remember there were virtually none that did this on-prem using established Office programs without a four to five figure investment. The organisation in question ended up using password-protected Word files with forms (a royal pain). Before that, they had been using Excel files (apparently it's easier there to lock parts of the sheet you don't want users to mess with), but this was an even bigger pain, as Excel still has a decade-old bug, even in its latest version, which distorts graphics when row height is changed.

Long story short, I'd be curious how you solved this and what software you used.

Mostly we enforced it ad hoc. It helped that the whole design team was pushing for print standards at the same time.

It ended up being a bullet point that needed to be talked about during sprint planning. It helped that designers were part of some of these meetings. If our company had been bigger it would have certainly been more challenging.

Mostly we tried to make it part of the culture that we stick to what’s in the style guide. And anything that deviates had a task of adding it to the style guide.

I have to say that I wasn’t completely on board for a long time, since I was sure company culture would make it hard to succeed. We had a champion who got us about 90% of the way there before leaving. But there was enough momentum that another dev managed to bring together the right personalities and solidify the mechanics of dealing with the style guide, so that it was a no-brainer to make it part of the workflow.

We used git to store the HTML and CSS examples, which made them trivial to copy. And since devs controlled the repo, devs controlled the pull requests as well, and required review just like any other PR. It helped that the designers we had/hired were willing to deal with git, even if they had low tech knowledge.

I’m not at that job anymore, so I don’t know how long before entropy wins. It’s the type of thing you have to keep tending for it to continue to be useful. But enough process was in place that it will likely continue to be useful for them for several years at least.

I would argue copy/pasted code is usually not technical debt because it's very isolated and easy to change, and it's cheap to consolidate it later. Crappy abstractions are a way bigger source of technical debt than repeating yourself IME. It's a lot easier to create a new abstraction than modify an existing one, so you generally want to wait as long as possible.

I think you are being downvoted for the first part of your first sentence, with which I also somewhat disagree: copy/paste usually creates technical debt and even though each pair is easy to fix, with time they almost always diverge and are forgotten about. Which is bad.

But I strongly agree with your second part: bad abstractions are a much bigger source of technical debt. Abstractions that do not match the problem lead to convoluted solutions, reduced functionality and are very hard to break out of with incremental fixes.

I think that's why I'm being downvoted too, but frankly if the codebase is so bad that something is being copy/pasted hundreds of times to the point of unmaintainability, the abstractions that same team would build would probably also be disastrous, and much harder to deal with. Is that preferable?

But tbh, I'm skeptical of the premise to begin with. It's still easier to gradually consolidate 100 slightly different uses of a pattern than blindly create an abstraction that adequately covers those 100 uses. If your team is useless, it's better that their code be fragmented and atomic anyway, because big mistakes are inevitable. You don't want to make it hard to touch those mistakes. You want that code to be uncoupled & disposable.

But at that point the argument is getting pretty abstract. Depends on the specific codebase.

A single copy and paste isn't likely an issue. I've worked in code bases where that pattern ends up happening in dozens or hundreds of places - all with slightly different style/layouts. It becomes a nightmare to change anything consistently across the code base because it's done differently in every place.

Copy/pasted code becomes technical debt at scale, especially if there's an unknown error in the original. Now you have 10s or 100s of variants out there, each individually tweaked and so not readily identifiable as true copy/paste repetition, all containing some variant of this subtle bug. You find out about it in interface A, and fix interface A. Maybe you remember that B-C were copy/pasted versions of A and check them out and correct it there. But D-Z were all copies made by others, following your example, and you miss them.
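A small Python sketch of how this plays out, under made-up assumptions: `total_a` stands for the original that has since been fixed, and `total_d` for a tweaked copy made by someone else. Both names and the discount bug are hypothetical.

```python
def total_a(prices: list, discount: float) -> float:
    # Fixed copy: the discount is now clamped to the range 0..1.
    discount = min(max(discount, 0.0), 1.0)
    return sum(prices) * (1 - discount)


def total_d(prices: list, discount_rate: float, shipping: float = 0.0) -> float:
    # Tweaked copy: renamed argument and an extra shipping term, so a text
    # search for the original code no longer finds it. It still carries the
    # original bug: a discount above 1 produces a negative total.
    return sum(prices) * (1 - discount_rate) + shipping
```

Calling both with a discount of 1.5 shows the divergence: the fixed copy returns a total of zero, while the forgotten copy returns a negative amount.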

It's not scalable, and produces severe consequences when used without care (which is most of the time, it seems).

Copy-paste is not tech debt, just bad practice. Bad practices can increase tech debt.

If code problems exist because someone made a strategic decision to prioritize a different raft of work, then it's tech debt.

If it was a tactical failure by John, then we just call that Tuesday.

It seems like your definition of technical debt is whenever anything is not completed (to your satisfaction), which seems a bit broad. I look at technical debt as the results of deliberate (architecture, design, or prioritization) decisions which have adverse impacts on maintainability.

I don't think there is so much of a difference between the deliberate decision and the accidental implementation of an incomplete or broken feature.

I have written software with tests and without. The latter starts out like a prototype and becomes quite useful over time. At first, everything looks good and there are just a few minor bugs which can be fixed easily. But over time, more edge cases appear which cannot be tracked down. As there are no tests, there are no specific definitions of what each part of the software is supposed to do. Yes, there is a big picture of how everything should work together, but obviously it doesn't catch every edge case. The problem is that, as a developer, I can't act as effectively on my code anymore, because I took some shortcuts in the past.

On the other hand, there are other projects where I have test suites and where I can be totally confident that everything has a well defined spec of what it should do and where I can't break something accidentally without noticing it.

So in the end it doesn't matter if your shortcuts have been made deliberately (most of the time it is a mix anyway). What matters is that due to the shortcuts you can't act as effectively on your code anymore.

> where I have test suites and where I can be totally confident that everything has a well defined spec of what it should do and where I can't break something accidentally without noticing it

I'm often in the middle. Where I have tests, they're not always (or rarely) the result of a "well defined spec". But... the tests do reference whatever the understanding of XYZ was at that time. That's often as well-defined as I can get. Clients will come back months later and say "this is broken". Well, no it's not. The tests work just fine. "No, that's wrong, it's not supposed to work that way". Well.. there's a test with notes indicating "do ABC then XYZ" and the test ensures that's working based on what was known/agreed on at that time. That's often not the same as "well defined" by a long shot.
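A sketch of what such a test might look like in Python. The late-fee rule, `apply_late_fee`, and the wording of the note are invented for illustration. The point is only that the test records the understanding at the time it was written.

```python
def apply_late_fee(balance: float, days_late: int) -> float:
    """Hypothetical business rule, as understood when the test was written."""
    return balance + 5.0 if days_late > 30 else balance


def test_late_fee_applies_after_30_days() -> None:
    # Note from the time of writing: agreed with the client that a flat fee
    # of 5.00 applies only after day 30; day 30 itself is not yet late.
    assert apply_late_fee(100.0, 30) == 100.0
    assert apply_late_fee(100.0, 31) == 105.0
```

If the client later says day 30 should already incur the fee, the test still passes. It documents the old agreement rather than a complete spec.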

Bugs can also be technical debt, but then it has to be that people make a conscious decision that the bugs are not that important - like John says this will have a problem if people do X and the decision is made that X is an edge case that won't happen and John should work on something important.

The point of the post gets to the need to account for the debt and handle it. I think you're in agreement. And to watch out for reckless debt- it gets you into more trouble than it's worth, by definition.

The subject has been written about many times. This is a pretty good summarization of it.

Not called out:

* Reckless debt is misfeasance on the part of developers.

* "Prudent" debt is often avoidable: arbitrary deadlines are arbitrary. Management may believe that enterprise data models are as malleable as CSS. So "reckless" needs to be applicable to management as well.

I'm swimming in many years' accumulation of both reckless debt and reckless mgt. The deliberately considered debt I can handle.

I know your example is contrived, but it's amazing how complex we make the web:

<input type="reset" value="Cancel">

I think the technical debt in your example is called CSS.

Notice the "deadline" there... so there must be some magic to hit the deadline with no technical debt?

In my experience, I have seen many more problems because of over engineering by well meaning people than because of technical debt.

Simple but under engineered systems are much easier to rewrite than to simplify over engineered ones.

One thing I've learned is that what often looks like overengineering is actually underengineering -- it's much "easier" to throw together a complicated system than really think things through and develop something that accomplishes the same objectives in a simpler (but possibly more complex) manner.

A blogger I read when people still did such things asserted that this sort of overengineering is fear. It's really dreadfully obvious once you know to look for it.

People are so afraid of making a bad decision that they refuse to make any decision at all, and then make a giant mess in the process. What you should do is spend your energy on finding reversible decisions, and then not spend a lot of effort on actually making them. We know where the paint store is, we know they can make up paint in 15 minutes, fuck it, paint it blue, we can always paint over it later. Hard no to black, though, since you can't paint over that shit.

People are so used to avoiding decisions that on a few occasions I've entirely flustered someone who wanted to tear into me (sometimes with an audience) by cutting them off and saying, "Yeah that was a mistake, and here's how we're going to fix it." None of them had any idea how to recover from someone saying "I was wrong" and going on to try to fix the problem. I still have a little video in my head of one guy's eyes bugging out when he realized what I just said.

    “I apologize for such a long letter - I didn't have time to write a short one.”
― Mark Twain

I've seen that attributed first to Pascal:


There may be better information.

I think poorly engineered is a better term than under engineered in this case, but I agree with you.

Logging is my go-to example for this sort of behaviour. People write stupid amounts of log statements in their code, so much that it’s hard to even read the code to understand what it does, in the hopes that it’ll make debugging easier. Use a damned debugger! Take a traffic dump! Use strace!

What’s more, it’s extremely rare that libraries provide logs themselves. So the actual complex parts of your application, like say the HTTP library or (God forbid) the TCP stack, can’t be debugged this way.

If you find yourself writing a bunch of statements like “DEBUG: updating balance from 1 to 2”, stop and write some tests instead.
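A minimal sketch of that trade in Python. The `Account` class and its rules are hypothetical. The tests pin down the behaviour that the debug line would only have narrated.

```python
class Account:
    """Hypothetical account with an integer balance."""

    def __init__(self, balance: int = 0) -> None:
        self.balance = balance

    def deposit(self, amount: int) -> None:
        if amount <= 0:
            raise ValueError("deposit must be positive")
        # No "DEBUG: updating balance from 1 to 2" line here;
        # the tests below state what the update is supposed to do.
        self.balance += amount


def test_deposit_updates_balance() -> None:
    account = Account(balance=1)
    account.deposit(1)
    assert account.balance == 2


def test_deposit_rejects_non_positive_amounts() -> None:
    account = Account(balance=1)
    try:
        account.deposit(0)
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError")
    assert account.balance == 1
```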

> Use a damned debugger! Take a traffic dump! Use strace!

I have encountered a fair amount of cases where that does not work.

- debugging a super large program ? loading gdb may take two minutes while recompiling to add a printf only 3-4 seconds.

- not an admin on the machine you are on and the person with the admin account is not around ? sorry, you can't debug on macOS (and likely in some Linux distros)

- likewise, no traffic dump (and I'd assume no strace) if you don't have root access

And how about highly congested multi threaded programs?

I've got a program with log statements like that all over the place, since stepping through it with a debugger would not even be possible. My IDE takes care of hiding the debug statements, since they're all encapsulated in different regions and if-def statements to log different types of things.
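A rough runtime analogue of that setup in Python, using the standard `logging` module instead of compile-time if-defs. This is not the commenter's actual code. The category names `app.net` and `app.locks` and the helper `enable_debug` are assumptions for illustration.

```python
import logging

# Hypothetical categories: one named logger per type of thing being logged.
net_log = logging.getLogger("app.net")
lock_log = logging.getLogger("app.locks")


def enable_debug(*categories: str) -> None:
    """Turn on DEBUG output only for the named categories."""
    for name in categories:
        logging.getLogger(name).setLevel(logging.DEBUG)


def worker(job_id: int) -> None:
    # Each statement belongs to a category and stays silent unless
    # that category has been enabled.
    lock_log.debug("job %d waiting for lock", job_id)
    net_log.debug("job %d sending request", job_id)
```

To see the output, one could call `logging.basicConfig(format="%(threadName)s %(name)s %(message)s")` and then `enable_debug("app.locks")`, so that only the lock-related lines appear, tagged with the thread that wrote them.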

Isn’t that what tracepoints are for though?

Flipping your argument, you’re basically saying that you can’t debug multithreaded code if the program doesn’t log.

I certainly wouldn't want to debug this ball of mud if there were no log statements, no :)

But, I think we're depending on different programs in our workflow. I much prefer a logfile with statements that focus on what I want to inspect at the time.

I'm quite proficient with debuggers on both Windows and Linux, but I tend to use them less when dealing with my own multi-threaded code.

That maybe works with multi-threaded code, but not with green threads that are all over the place. The call stacks are a pain in the ass.

> loading gdb may take two minutes while recompiling to add a printf only 3-4 seconds

To be clear I’m not talking about adding some specific print statements to the code while debugging, I’m talking about keeping those statements in production code.

Of course, there’s cases where logging might be your only option. In that case: Go for it!

> - debugging a super large program ? loading gdb may take two minutes while recompiling to add a printf only 3-4 seconds.

Relinking a super large program may take minutes, whereas loading gdb can get me a backtrace in 3-4 seconds! VS natvis files and other visualizers can hot-reload while I'm looking at a crash dump from production that took hours to trigger/gather/repro!

There are cases for logs - where sufficiently efficient conditional breakpoints are particularly painful to set up, for example - but the only time I've waited minutes for a debugger to respond has been with VS after major system updates as it refreshes a universe of symbols from the symbol server.

Do you have some really slow python scripts auto-loading in GDB or something?

> Relinking a super large program may take minutes, whereas loading gdb can get me a backtrace in 3-4 seconds!

well, we have definitely opposite experiences :) the main software I work on creates a ~1 GB binary in debug mode and lld links that in a few seconds. But gdb and lldb, even with gdb-index and all the optimizations I could find, both make startup slow enough that I can go for a coffee and it's not always finished loading when I'm back

On the contrary, I've spent months recently working on a system loaded with tech debt that has fallen victim to a half done rewrite. I've used a debugger on it plenty of times.

It was only when I added proper logging to the critical junctures in the code that I could finally see very clearly how it behaves and see what input results in what code paths and output. I learned more in an afternoon than I did in a month.

Sometimes it's hard to tell the difference.

for instance one smart and experienced guy I knew said if there was an error, your program should just fail instead of giving a ton of messages and recovering.

and if you're just throwing code together, that might be what you do.

on the other end of the scale, a well written "second pass through everything with cleanup" might have tastefully written code, a few relevant comments and consistency.

and then I've seen a lot of code that appears to look like that, but is copy/paste garbage. (sort of the coding equivalent of a wordpress theme)

I've come round to the "fail hard and early" view, provided you can be sure in your design that it doesn't leave junk lying around. Because otherwise contaminated data and invalid states start propagating everywhere..

This can be fun if it's a particular client request that causes a hard failure, and the client retries several times before giving up. Before you know it, all your instances are restarting and no traffic is being handled...

So fail early, but not too hard.

Interestingly I prefer failing hard specifically for data consistency reasons. I've had to deal with systems that swallowed far too many errors and let themselves get into an inconsistent state which then caused other operations to not fail but instead persist invalid data. It's very hard to clean messes like that up (restoring a nightly backup is easy, but that's a last resort).
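A minimal Python sketch of the trade-off discussed in this subthread, assuming a dict as a stand-in for the datastore. The function names and the status codes returned by `handle_request` are hypothetical.

```python
def save_order_swallowing(store: dict, order_id: str, raw_qty: str) -> None:
    """Swallows the error and persists an invalid record anyway."""
    try:
        qty = int(raw_qty)
    except ValueError:
        qty = 0  # error hidden; the bad state now propagates
    store[order_id] = qty


def save_order_failing(store: dict, order_id: str, raw_qty: str) -> None:
    """Fails before anything is written, so the store stays consistent."""
    qty = int(raw_qty)  # raises ValueError on bad input
    if qty <= 0:
        raise ValueError(f"quantity must be positive, got {qty}")
    store[order_id] = qty


def handle_request(store: dict, order_id: str, raw_qty: str) -> int:
    """Fail early, but not too hard: reject this one request, keep serving."""
    try:
        save_order_failing(store, order_id, raw_qty)
    except ValueError:
        return 400
    return 200
```

The boundary in `handle_request` is the design choice: the failure is loud and nothing invalid is stored, but a client retrying a bad request does not take the whole instance down.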

> your program should just fail instead of giving a ton of messages and recovering

I would bet that he was right in that situation, but that he was also making a nuanced statement. Not all code should fail loudly, but when it should, "recovering" can sometimes be a distraction from a very real problem that needs to be addressed by the right person.

Totally agree. That complicated system is called a big ball of mud. Pretty easy to make.

Yeah, to me there's a natural progression in understanding of system design.

When you start out, you just write code and don't think about how things interact with each other too much.

Then you learn that you're rewriting a lot of code, you overcompensate and start to over-engineer to avoid duplication and often to handle more scenarios than you need. You abstract out possibly too much.

Finally, you get to a phase where you realize a lot of abstractions that you've created aren't actually used and you're writing more "meta code" than you need. In this phase you learn to engineer just enough for what you need at the time and design things to be easily changed in the future. You trust that even if your system design can't handle everything right now, you've designed options for yourself to expand as needed.

Nowadays every CS student reads HN and blog posts like this one, so by the time you end up having to work with them they're so indoctrinated into the "must write Enterprise(tm) ready code" mindset that trying to teach them to be more relaxed is a real pain.

Yup, adapt to the business you’re in. As in, the company. Make suggestions if you think it’s worthwhile but in the end accept that there’s a “that is the way” in any company and you’re getting paid to learn and work with that “way”, and build it out :)

this last phase in your coder evolution reminds me of a very rough translation of the wabi sabi view on manufacturing. "Nothing lasts, nothing is perfect, nothing is finished."

TDD helps keep this "abstraction abstraction" thing in check.

I prefer TDD. More entertaining.

I think a lot of today's massively tested, microservice and dependency injected systems will end up as huge piles of technical debt in the future. If you think modifying a COBOL codebase today is hard, wait another 20 years and see how much "fun" it will be to rework today's systems once their platforms are viewed as outdated.

I'm sure you would love working with currency conversion in a Kubernetes operator :)

I would totally count over-engineering as technical debt. Any design choices and implementation choices that limit a code base's future ability to change and evolve, and which cause it to require more time and effort to be understood, I consider technical debt.

The debt is that you take time and effort from your future to save time and effort in your present.

I would call it a bad investment, not technical debt. To stay with the finance metaphor: as an investor you can invest in a lot of businesses that fail.

Agreed. I once worked on a project which was a microservices based system for managing various types of infrastructure. Progress was painfully slow.

There was a separate testing client that was used for a lot of the demos. The demos on the 'real system' would fail so often that somebody came up with the idea to just take a lot of the code from the microservices and drop it straight alongside the testing client, as a monolith.

It was so much more stable that after becoming the default way to demo, it became the default way to use the tool.

5 years on and I believe it's still running that way. In this case microservices were massive over engineering.

> In this case microservices were massive over engineering.

They often are.

Well-meaning engineers and architects seem to reach for them while forgetting that microservices are about scaling your organization, not your software.

There are very few technical deficiencies in a monolith that are solved by microservices. Indeed, they typically bring technical hurdles: distributed systems are hard to reason through and IPC over the network introduces latency. And you need a very strong ops team to implement them without major headaches.

But where they really start to shine is when regression testing takes ~days and your development cycles are screeching to a halt because you've got too many in flight features and not enough runway to land them.

What takes real skill is recognizing which of your monoliths should be grown and which ones need to be sliced down the middle.

Under and over engineering are relative to your understanding of future requirements, which by definition are both unknown and subject to educated guesses. Making such guesses correctly is what distinguishes a good engineer from a bad one. Nobody ever spells out the current requirements to any level of detail even; so forget about knowing about future ones. You just have to know, recognize, and adapt to that constantly.

A lot of over engineering is due to getting this completely wrong: you prepare for something that never happens, at great cost and while underestimating the difficulty level. So you end up wasting endless amounts of time on stuff that has no business value.

On the other hand, a lot of under-engineering is due to not having a clue about what is obviously coming next and getting caught by surprise by completely obvious requirements.

> understanding of future requirements, which by definition are both unknown and subject to educated guesses. Making such guesses correctly is what distinguishes a good engineer from a bad one

I would rephrase this as make guesses based on probabilities and urgency estimates from past experience. You can't know the future but you have stats of the past. So a good engineer is one that has relevant experience and applies it appropriately.

Basically a perfect description of the problem.

Seems to me the default is obviously getting things wrong whether that be over or under engineering.

How would one go about reliably making choices that strike exactly the right balance consistently?

One thing I'm starting to really internalize is that to do so requires a deep understanding of software engineering as a domain, the domain and existing system, and the technical vectors in the business.

It's kind of amazing that "over engineering" in software means the exact opposite of what the term should mean.

To me, engineering is taking a problem and the corresponding constraints and building the solution that satisfies it with the least amount of resources.

"over engineering" in software is about trying to make something adaptable? Or maybe just satisfying someones sense of beauty.. Or maybe it's just about programmer convenience at the expense of customers/user experience and hardware resources..

It's also a bit of an abuse of the term. Overengineering in engineering (mechanical, civil and other similar disciplines) means building a component to withstand more than the absolute minimum. So, a bridge you'd want to be overengineered by a factor of 2 or 3 at least, whereas airplanes are hardly overengineered (lower factor of safety).

But isn’t that how the term is used? The system is over engineered when it is built to scale to a million concurrent users while we currently only have a hundred. Or when it is built to handle use cases that no one has required.

In this context I read it more as being excessively complicated to some unspecified end rather than engineered to withstand a load way larger than necessary.

Often it is excessively complicated because it’s divided into dozens of services to be able to scale, or having extra abstraction layers to deal with imagined future requirements.

Sure, but the fact that those requirements are imaginary instead of real is why it’s not actual overengineering. Overengineering is done with a specific dimension of scalability in mind for something that probably will happen, and there’s a budget for it too, etc.

Ah, I was not aware of the general term and its meaning. As you say, quite the abuse of the expected meaning.

I use the term to describe the introduction of unnecessary abstractions in the name of hypothetical future use cases. For example, using an enumerated collection of values where a boolean would suffice.
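A tiny Python sketch of the enum-versus-boolean case, with made-up names (`Visibility`, `render_over_engineered`, `render_simple`). It assumes there really are only two states and no third one is planned.

```python
from enum import Enum


class Visibility(Enum):
    """Added 'in case more states come later'; only two exist today."""
    VISIBLE = "visible"
    HIDDEN = "hidden"


def render_over_engineered(text: str, visibility: Visibility) -> str:
    return text if visibility is Visibility.VISIBLE else ""


def render_simple(text: str, visible: bool = True) -> str:
    # Same behaviour with a boolean; an enum can still be introduced
    # on the day a third state actually shows up.
    return text if visible else ""
```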

It could also refer to building resilience to load before it is necessary. Example, distributed databases with failover and recovery capabilities for a side project.

Poke holes all you want, those are bad examples, but they should illustrate the point. Over engineering is a real phenomenon.

The definition of "technical debt" is so pliable that we can easily regard all the parts of the system that are overengineered to be a form of technical debt.

Agree with this 100%. I've seen both in older codebases.

It's even more frustrating when you work with engineers who refuse to believe that any code they write could become technical debt in the future. These tend to be people who overcomplicate systems to anticipate future requirements

> These tend to be people who overcomplicate systems to anticipate future requirements

I don’t think I’ve ever anticipated a future requirement that did not ultimately turn out to be necessary.

Conversely, I’ve had a lot of people tell me something was not necessary only to find that, surprise, surprise, it was necessary after all.

“There are two ways of constructing a software design: One way is to make it so simple that there are obviously no deficiencies, and the other way is to make it so complicated that there are no obvious deficiencies. The first method is far more difficult.” ― C. A. R. Hoare

This is how I approach software, but there are so many people out there who won't approve simple code, because it doesn't have enough configs, or classes, or whatever. It seems that the person who wants the most over-engineered code tends to get their way. Psychologically the absence of 'things' is always inferior and harder to argue for than having more 'things'.

The more code you have, the more code you have to maintain.

C.A.R. Hoare: 'There are two ways of constructing a software design: One way is to make it so simple that there are obviously no deficiencies, and the other way is to make it so complicated that there are no obvious deficiencies.'

I'd call React and Redux both massively over engineered. There's a ridiculous number of lines of code being executed to do the simplest of things. You do something like `a.b = c` and really `a` is a proxy, so setting `b` gets tracked by going through 100+ lines of code, all of which has tracking structures that get set up and torn down between components. They claim this will make your life easier, but when the facade breaks you end up having to design your entire app around the fact that `a.b = c` is not simple, it's complex, and so you have to avoid it as much as possible by piling on more complexity.
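For readers unfamiliar with the mechanism being criticized, here is a minimal Python sketch of a tracked assignment. It is a generic illustration of the idea, not how React or Redux are implemented. `Tracked` and `subscribe` are hypothetical names.

```python
class Tracked:
    """Object whose attribute assignments notify subscribed listeners."""

    def __init__(self, **initial):
        # Bypass our own __setattr__ while setting up internal state.
        object.__setattr__(self, "_listeners", [])
        for name, value in initial.items():
            object.__setattr__(self, name, value)

    def subscribe(self, listener) -> None:
        """Register a callable taking (name, old_value, new_value)."""
        self._listeners.append(listener)

    def __setattr__(self, name, value) -> None:
        # What looks like a plain `a.b = c` runs this method instead.
        old = getattr(self, name, None)
        object.__setattr__(self, name, value)
        for listener in self._listeners:
            listener(name, old, value)
```

Usage: after `a = Tracked(b=1)` and `a.subscribe(print)`, the statement `a.b = 2` stores the value and also calls every listener, which is the hidden work the comment objects to.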

A simple react/redux app is definitely much harder to understand than a simple vanilla js app. But a large, complex react/redux app is far easier to understand than 99% of large, complex vanilla js apps.

All GUI frameworks hide a lot of what’s happening on the inside. There is a scale to it, but personally I think that reasoning about a “reactive” program is quite a bit easier than untangling a ball of callbacks written in, say, GTK.

Reactive frameworks indeed hide a lot of complexity but that does have positive impact on the user code. (IMO)

Neither React nor Redux uses proxies or setters at all. Are you thinking of a different framework?

Some lib that's recommended for removing all the Redux boilerplate from React uses proxies. It's mostly irrelevant though; the point is there's too much code running to do what should be simple things.

I think I write code the way people describe Jazz. You have to listen to the notes he isn't playing.

I leave breadcrumbs and openings in my code all the time so that the next time I'm in there, I'm set up, invited even, to ask the question I left unanswered before, and maybe do something about it. I leave the option of adding a feature, instead of building a config framework to support it and then defining one implementation. Which will end up not being the correct API when (if) I write the third one.

Sometimes people beat me to it, and get excited because they had an idea they're now invested in. I usually just let them have it, don't point out that I put the idea in their head. It's so infrequent people get that invested in functionality, you have to encourage it. There's too many things I'd like to do and never enough time anyway. They've just nominated themselves as a candidate for maintaining that module when I get tired of looking at it, freeing up my time for things nobody else cares about until it's done.

It's like if you were designing a house with plans to expand it later. You'd be very careful about certain decisions, like whether you should put a bedroom in the logical spot for an addition, because now that bedroom also has to be a thoroughfare. For 20% extra effort, you've extended the calcification threshold for that house/code by 80%.

Totally agree. As long as the system is easy to change, I don't care if it's "technical debt". I'll even go further, I don't care if it doesn't give us "flexibility" and "modularity". I don't care if it's not "generic". There is nothing that pisses me off more than people designing systems for future requirements that are not even being discussed by product leads.

Also, under-engineering can sometimes mean writing more code, or taking longer to implement.

Which may be a great thing if it saves you time refactoring or modifying code down the line. The hardest thing about code is not writing it but reading it, and clever succinct code can be very hard to read.

I've become skeptical of people talking about technical debt and clean code and all the standard Fowler/Uncle Bob/PragProg-isms. Not because they're wrong. If anything they're all right, but in a rather vacuous, well-yeah way. Yes technical debt is bad. Yes we should start with a design. Yes we should do refactoring. But using the analogy of the author, that's like telling someone who's overweight and eating a lot of junk food that they should "eat healthier and exercise". Well yeah. They probably knew that already.

And so many people try to make it a dichotomy. Like that diagram of a project with a design versus without. That's not a fair comparison. What does a design even mean? Wireframes? A written document of features? How do we change this design? What's the way in which we get feedback? And what sort of projects are we talking here? A programming language is very different from a website. And what even counts as no design?

Some other notes:

  - I'm not sure the author correctly used the Pareto principle. It's not just a generic "split things into 80/20" but a specific observation on how 20% of the causes result in 80% of the effects.

  - The post could have used a quick proofread. There's quite a few spelling mistakes and poorly phrased sentences. I totally get if the author is not a native speaker, but they could easily enlist one to help (if the author wishes, I'd be happy to proofread—contact me).

  - I get the whole "make this post fun with cats!" but honestly I'd prefer a straightforward example with no funny images. Maybe that's just my grumpiness.

> Yes technical debt is bad.

Technical debt is like any debt. It's bad if you come due on it and don't have the resources to pay it off. Unlike cash debts, technical debt rarely comes due in a tangible fashion.

If getting to my next round vs. failing means leveraging technical debt, I'm going to do that by all means. It's not necessarily easier to fix tech debt at scale, but having more resources means it's less impactful on bandwidth.

Financial debt is okay if it gives you an important advantage now, and you can pay it off later, and you remember to actually do it.

If you get into debt so that you can have a nice new TV quickly, and you forget to pay it off, so it keeps increasing, and you only pay attention to it when the debt collectors start calling you... that is a problem.

In my experience, most software debt resembles the latter case. The goal is to meet a deadline sooner, and there is absolutely no intention of spending resources to fix it later, unless the customers start making too many tickets.

When a manager says "later" or "low priority", it usually means "never, unless we absolutely have to". Technical debt is always a "low priority" that will be addressed "later".

Yes, there are situations when meeting a specific deadline is critical, and it is a rational decision to cut some corners and fix things later. The problem is that after the deadline is met, the "fix things later" part is usually forgotten. A new deadline is set, and new corners need to be cut, etc.

This is my preferred way of thinking about it too. I wouldn't say debt in general is universally bad, and so I wouldn't say the same thing about technical debt either.

Technical debt can save a company by pushing something forward and launching a product sooner than a competitor. It can save a big customer by fixing a bug for them quickly that if not fixed immediately would have caused them to cancel a multimillion dollar contract. It can also eat away at a company if you let it accrue for decades and never pay it down.

The latter being much more likely to happen to the average company than the former, in my experience.

I'm skeptical of any Uncle Bob-isms (and to some degree of Fowler's stuff) as well, but if you want a good description of technical debt and why it's actually a problem, Ousterhout does it nicely in his "A Philosophy of Software Design".

That book is a little gem, short and well-written

Yeah. The problem rarely is the debt itself, but the processes and circumstances that lead to the debt being accumulated in the first place.

Looking back and going "this is bad!" at your 'technical debt' isn't all that interesting. It's much more useful to understand what led you (the team) to take those shortcuts in the first place.

There is bad technical debt and there is acceptable technical debt. The mark of a good engineer is knowing where to cut the corners - which you will always have to do.

Hardcoding something in a file rather than pulling it from a config or a database has a pretty good ROI in terms of time saved, and it's not that hard to change later, but tying yourself to, say, a kind of storage that you know is not going to work long-term, that will cost you later.
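A rough sketch of why the hardcoded value is the cheap kind of debt: paying it off later is a few lines. The `MAX_UPLOAD_MB` name and the environment-variable override are assumptions made for this example, not anything from the article.

```python
import os

# Cheap shortcut: a hardcoded value. Easy to find and change later.
MAX_UPLOAD_MB = 25


def max_upload_mb(env=os.environ):
    """Paying the debt down later: same default, optional override.

    MAX_UPLOAD_MB as an environment variable name is an assumption
    made for this sketch.
    """
    return int(env.get("MAX_UPLOAD_MB", MAX_UPLOAD_MB))


print(max_upload_mb({}))                         # 25
print(max_upload_mb({"MAX_UPLOAD_MB": "100"}))   # 100
```

Swapping out a storage engine has no equivalent five-line escape hatch, which is the contrast being drawn.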

Also, sometimes doing MORE results in more technical debt. If you are going to spend your time building a Kubernetes cluster because "your home page has to be webscale", you are an idiot.

> Payments couldn’t be processed in different currencies
> If the delivery system is offline, the code wouldn’t work
> Users with deactivated accounts could still access the system
> No automated testing
> And this, my friends, is what we know as Technical Debt.

I would not describe that as technical debt. Those are mostly missing features. The code could be fantastic; it is just that supporting different currencies, disabling accounts, or similar things requires new code.

The lack of automated testing is more in line with what I think technical debt is.

Details aside it is an interesting read.

I wouldn't even say lack of testing is technical debt - it's just another feature that isn't implemented. Once you implement it, problem's solved.

It doesn't make the system worse/more expensive as it grows, and it doesn't become more costly to "fix" the bigger the system gets. IMO those are the operative features of tech debt.

If you always have a perfect vision of your software and the team is very small then I would agree. (I.e.: I don’t write tests for my hobby projects)

But testing is mainly important so a developer not familiar with the program will not break it by modifying something seemingly unrelated.

Lack of tests will absolutely bite you.

I agree, I just wouldn't call it tech debt. It doesn't become more expensive to fix over time, except for the bump when you forget how the original code works. There's no metaphorical "interest" that you draw down on.

> I wouldn't even say lack of testing is technical debt

You are right; depending on the kind of application and the circumstances, testing can be done in different ways. There are occasions where, instead of automated testing, there are other ways of working on quality. But in this case automated testing is on the list of things that need to be done. So I think that wanting automated tests and not having them is technical debt.

I said this below but I like to distinguish technical debt as having metaphorical "interest" - specifically, it becomes more expensive to pay back the longer you leave it, the ball of debt grows over time. It's stuff that gets baked into a product, and then is difficult (or impossible) to un-bake, and gets more difficult the more the product grows around it.

Aside from the initial increase in cost when you forget how the original code worked, tests are no more expensive to write in a year than they are today. Not significantly, at least. So I'd just class them as incomplete product - the effects are negative and can be expensive, but it doesn't dig you into a hole.

Technical debt is like credit card debt: it's high interest; it's not a huge issue if you use it wisely; and you have to track its usage and level against your total code/assets.

Like credit card debt, too many people turn around and they've got $20k of credit card debt or 20k LOC of tech debt.

I don't have too much of a problem with tech debt as long as: the benefit is clear; the debt is well understood; the payoff date/cost is planned. Just last night I was talking with a developer who works for me: "This way will require 5 queries instead of 1. Can you help me figure out this whacky ORM API to do it in 1?"; "Just do it as 5 right now and, as we learn more about the ORM API, we'll fix it later..." Tech debt: we get the product done faster; we've got a 1-5 line change to make (when we know how to make it); and we'll do so in a few months.
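To make the "5 queries now, 1 query later" trade concrete, here is a small sketch using Python's built-in `sqlite3` instead of an ORM. The `orders` table, its columns, and both function names are invented for illustration; the point is only that the two versions return the same result, so the later fix is a contained change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(1, 10.0), (2, 20.0), (3, 30.0), (4, 40.0), (5, 50.0), (1, 5.0)],
)

customer_ids = [1, 2, 3, 4, 5]


def totals_one_query_each(ids):
    """The 'ship it now' version: one round trip per customer."""
    result = {}
    for cid in ids:
        row = conn.execute(
            "SELECT COALESCE(SUM(total), 0) FROM orders WHERE customer_id = ?",
            (cid,),
        ).fetchone()
        result[cid] = row[0]
    return result


def totals_single_query(ids):
    """The later fix: one grouped query for all customers."""
    placeholders = ",".join("?" for _ in ids)
    rows = conn.execute(
        f"SELECT customer_id, SUM(total) FROM orders "
        f"WHERE customer_id IN ({placeholders}) GROUP BY customer_id",
        ids,
    ).fetchall()
    result = {cid: 0 for cid in ids}  # customers with no orders stay at 0
    result.update(dict(rows))
    return result


# Same answer either way; only the number of round trips differs.
assert totals_one_query_each(customer_ids) == totals_single_query(customer_ids)
```

Keeping the slow version behind one function is what keeps the eventual change down to a handful of lines.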

>and we'll do so in a few months.

Ah, famous last words that I myself have also spoken from time to time.

Yup. Mentally I pipe everything any devs, including myself, say through a sed command like this:

s/for now/forever/g

Yes, this is totally on the money. This is exactly what we want to see when assessing tech debt on an acquisition assessment (I do tech diligence for acquisitions). If you do it for good reasons, and you track it, and you proactively allocate resources on an ongoing basis or make tech debt remediation a first class road map item, it's not a problem, it might even be a good thing. If you racked it up playing fast and loose and being all "go fast and break things", it could cost you the deal, or millions of dollars off the deal price.

I love this analogy. Dealing with a minor set of updates ... that now need to be done in 60+ all slightly different codebases for this weekend. I'm using it in our next chat with the business side.

Seems like if you want to make that statement (it'll ruin your software) you need more than a hypothetical example. There are some good ones in memory.

1. Netscape Navigator - was such "spaghetti" that they had trouble re-architecting it to allow it to keep up with evolving web standards, and then put everything into Navigator 6.0 which ended up running like a snail. (Second system effect). I'll note that according to jwz, it was written by people working lots of all-nighters.

2. Microsoft Word for Windows. Apparently had a Ponzi-like accumulation of technical debt as they kept adding features to the release and writing code that looked technically correct but was known to be flawed, relying on testers to find the bugs they knew they had written. Took version 2 for the bugs to get ironed out to a usable state.

I've tried coding with some open source programs that are mostly being used for a company's internal process that are basically piles of technical debt, but it'd be rude to mention.

I'm sure there are other great ones, but now that so much software is SaaS, we're probably not going to hear about it, because the release dates aren't as obvious. To me, eBay appears to be one, as they are trying to remake their web application to be more modern but it has tons of warts all over it.

The problem with "technical debt" as a meme is twofold: It's a lousy analogy, and people tend to label absolute garbage as "tech debt". The garbage is not a tool to enable quick growth. It's not a shortcut. It's just garbage.

"Technical debt" conversations usually go something like this:

Me: "Could we spend a few hours getting a handle on what the data model should look like? Maybe rough out a crude ER diagram?"

PHB: "Oh slumdev, you're such a Boy Scout. Just blam it into a document store. If something changes, we'll just be Agile about it. Technical debt is a tool!"

Cue 300% increase in development time.

Overtime and on-call duty come to save the day.

Ah, yes, if we just work 160 hours per week we can just about get it done at the originally estimated time.

I guess that means you can take an hour long nap every day to recover before your next shift.

The title the author chose does a bit of injustice to the contents of the post.

The post uses a lot of words to describe an understanding that experienced engineers and managers should have. Technical debt is a trade off. It exists in nearly all software of decent size as it's being developed. It is vital both for management and engineers alike to keep it at a reasonable amount, or you'll wake up one day with something that can't release new features, function, scale or be maintained.

It is also worth stating that tech debt can have little to no significance to the success of a startup.

Technical debt starts to matter when the initial project changes focus or patches start mounting.

These days rewrites are popular. They make sense in some cases (when the project is smaller or a new framework is being introduced). But in others, the original developers and business people have left, and no one knows exactly how things work beyond the fact that they need to keep doing the same thing. In those cases fixing technical debt is extremely important.

A startup is more likely to fall into the above category if they ignore their technical debt because rewrites are hard and messy.

Seems to me that technical debt needs to be dealt with as a function of the revenue the company/product is bringing in.

The wisdom comes from timing the investment so it’s not destroying the “mature” product due to entropy or strangling the new product due to over-engineering.

Startups do seem to have bigger issues than technical debt to deal with. Because relatively speaking, there are bigger mountains to climb early in the startup’s development.

It is moreover not entirely unworthy to state that tech debt can have little to no significance to the success of the founder of a startup. /s

You know what ruins software? Refactoring old stuff you don't like so the code base is "nice", which incidentally introduces 100 bugs that didn't need to exist, pushes deadlines back because someone has to fix those bugs, and loses customer confidence to the point that investors and customers don't care anymore and then you have no one to build for (but at least your code base is nice and scalable, too bad 10 x 0 customers is still 0).

I've seen more issues because of nonsense like that, or people insisting that code should be DRY and making ridiculous mental contortions in the name of a three letter acronym, or personal taste (the amount of fashion in software development is insane) but I've yet to see a project truly fail because someone cobbled something together quickly, and introduced tech debt. So far anyways.

> I've yet to see a project truly fail because someone cobbled something together quickly, and introduced tech debt. So far anyways.

Projects like that don’t fail in the usual sense of the word. They just die quietly or get replaced.

How do you figure?

Part of a codebase I work on suffers from a quality hotspot where half of it got rewritten so it was "nice" and the other half was so hard to reason about it stopped them dead in their tracks and it has persisted in this Frankenstein state for a reasonably long time.

On a project with an external deadline this year I had to make some modifications for compliance reasons. It was very painful. The parts that are rewritten are conceptually a lot simpler. They fit in my head. The parts that didn't get rewritten have incredibly high cyclomatic complexity and really don't fit in my head.

I want to refactor it so it's nice.

I also don't want to introduce lots of bugs nor push out deadlines.

A decision was made upfront that we would try to extract this component out of the main system, so it could be independently modified and deployed, as the capability to do so would actually provide decent business value. Sadly this particular decision didn't turn out well, and really, really pushed the deadline. Ultimately we spent about 70% of the engineering effort on the project trying to extract some functionality. I'd say about 10% of it is actually extracted. So now it's even more of a Frankenstein, split across 2 codebases.

The project is coming to a close, and we will have a clean up session early next year. Our next project is related to this particular component also, so there is scope for us to be making changes to it.

The goal is to make this component more stable for customers, and safer to deploy.

I think a big bang rewrite is 80% likely to fail. I strongly suspect it's actually worth succeeding at though. I'm trying to chart a path through to a successful reworking of this critical subsystem. The first failure was quite humbling, but not unexpected.

Probably the best move I made on this project was getting good logging in place so I can tell exactly what it does, which made it a lot easier to reason about whether our attempted extraction does the same thing. Also I've managed to collapse some of the cyclomatic complexity, so it's easier to mentally keep track of as you're trying to read the code. It's still very, very hard to modify the functionality of the old stuff.

My current line of thinking on the best way forward is to abandon the attempted extraction, and repatriate that functionality to the main codebase. Then use characterization/approval tests and finish the original rewrite. I'd be happy to do it piecemeal too, even one ticket a sprint spread out over the course of the year, so long as we actually complete it. In addition I'm planning to try a tiny, reversible experiment at a different way of extracting it that's much more likely to succeed and deliver the benefits we're looking for while simultaneously charting a way forward on another major piece of tech debt that causes a lot of stability problems. If that experiment succeeds I want to extract all of it that way, and would only attempt to do so once the entire thing was in a reasonable condition first. Getting it all the way to that state would represent a major win in my mind.
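For anyone unfamiliar with characterization (a.k.a. approval) tests, here is a minimal sketch of the idea: snapshot what the old code does today, then require the rewrite to match it case by case. Both shipping-cost functions and the chosen inputs are hypothetical stand-ins, not the actual subsystem discussed here.

```python
def legacy_shipping_cost(weight_kg, express):
    """Stand-in for the hard-to-read old code (hypothetical)."""
    cost = 5.0
    if weight_kg > 10:
        cost = cost + (weight_kg - 10) * 0.5
    if express:
        if weight_kg > 10:
            cost = cost * 2
        else:
            cost = cost + 5.0
    return cost


def rewritten_shipping_cost(weight_kg, express):
    """Candidate rewrite that must match the legacy behavior exactly."""
    base = 5.0 + max(weight_kg - 10, 0) * 0.5
    if not express:
        return base
    return base * 2 if weight_kg > 10 else base + 5.0


def characterize(func, cases):
    """Record what the code does today, whether or not it is 'right'."""
    return {case: func(*case) for case in cases}


# Cover the boundaries (10 vs 11 kg) as well as typical values.
cases = [(w, e) for w in (0, 5, 10, 11, 25) for e in (False, True)]
golden = characterize(legacy_shipping_cost, cases)        # snapshot of old behavior
candidate = characterize(rewritten_shipping_cost, cases)
mismatches = {c: (golden[c], candidate[c]) for c in cases if golden[c] != candidate[c]}
print(mismatches)  # {} when the rewrite preserves behavior
```

In a real codebase the golden results would usually be stored in files and the inputs pulled from production logs, which is where good logging pays off a second time.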

It's hard when we can't afford to fail at this, and we're not likely to succeed. Really trying my best to find a viable plan.

At the same time, if it ain't (that) broke, don't fix it right? But that leaves no road to new capabilities for us as a business.

Software is just a hard problem.

Your approach sounds fine. Boy scout rule all the way, when possible. Try to leave it better than you found it. If you tackle it all at once you're dooming yourself to misery.

I would love to see a sample of this code.

There is a balancing point between under-analysis and over-analysis. It's hard to know the true requirements until the system is actually in production. You can't realistically pre-learn many domain lessons: you just have to dive in and see what floats by setting sail.

Finding the best balancing point is hard to turn into a sure-fire formula. I use experience (age) to make a best guess based on past successes and failures and project types.

One tip is to collect and keep a list of questions and suggestions, and make sure they are sufficiently explored. You can't answer questions that have not been asked, but AT LEAST answer those that are asked. Make sure staff is not afraid to ask questions or make suggestions.

You have to release the software, with hardcoded inputs, limitations and bugs. That is fine. Having clean code then helps you quickly polish and evolve it based on customer needs. I prefer that to an over-designed system that is 80% clean and 20% crap, because that is hard to change as time goes by, and that 80% will stay the same, but will become 50% and then 10% of the total codebase.

Nobody agrees on what "clean code" is. I agree there are rules of thumb most will accept, such as try to be modular (group related things together), manage constants in a table or central class instead of hard-wire them, factor heavily repeated patterns into subroutines/methods/classes, comment at least the "big picture" of sections, and the oddities.

I don't buy this concept. It's like saying "Financial Debt: Why it will ruin your company"

Yeah it can if you abuse it. It can also be incredibly helpful.

Absolutely. There are payday loans (poorly developed systems) that are clearly always a bad idea, and then there are mortgages (well-architected w/ focus on minimum viability) that get you much farther in terms of realized value than you could by saving up for a house (over-engineering).

Can we stop using the term Technical Debt?

First objection: Managers and Business persons understand Debt, like it and use it. I use credit cards and have a car loan. If we, as engineers, present the problem as "You can have what you want now, but you will incur Technical Debt" then any sane person will say "Sure! Good deal!" Ask yourself, how many times a manager or business owner has asked "What is the interest rate?" Zero times.

Second Objection: It's not debt.

Debt would be if someone said "I need to build a bridge across here that can handle an Army, and I've got $1m", we engineers reply "It will take $2m", and they respond "Ok, I will borrow $1m so you can build the bridge I need".

Instead what happens is they say "Well, build what you can for $1m", and you say "Ok, we can make 'a bridge' for that", and then either (a) your infantry can cross, but the tanks have to get diverted 20 miles out of the way, or (b) the tanks end up in the river along with the bridge. Since (b) is bad, you then have to spend a lot of time planning the routes for the tanks, and making sure the tanks have the right air cover, etc etc, i.e. doing more work.

It's not debt. It's just (at best) an incomplete solution or (at worst) a bad solution that fails at the worst possible moment - e.g. database collapses during registration for the largest event of the year.

Ah, but surely, if you build the lightweight solution for $1m, and acknowledge the increased costs of managing the problems that it doesn't solve, then that's fine? Sure, but that's not technical debt either! That is scoping: we (engineers + business) identify a workable solution that provides some business value. And then we do that well.

Cunningham makes a convincing case for the term:

https://youtube.com/watch?v=pqeJFYwnkjE (4m44s)

He made a convincing case for why it's like "debt", but he fails to make a convincing case for its use, or any case at all that using the term has a positive impact on a team's success.

He even calls out that some teams think they don't have to pay back the debt. I think that is the norm, and that's the problem of debt. When financial people use debt, that debt comes with structure and consequence: it might be monthly payments or it might be due sometime in the future. Offer such people a "debt" that has nobody enforcing it and nobody quantifying it, and of course it won't be paid back.

Agile engineers making scary "booga booga" noises about "we need to refactor" just don't carry the same weight as not being able to make payroll because your note is due. And yet that seems surprising to engineers.

> he fails to make a convincing case for its use

Incorrect. He notes that debt (financial or technical) is useful in getting things done more quickly, but must be repaid.

Technical debt enables getting a minimal product or prototype out to gain experience and further refine design. It is specifically contrasted with bad software. Deficiencies are to be addressed later. And yes you're correct in that this last step is often omitted.

My point is that the "debt" metaphor is what doesn't work, not the strategy of doing less work upfront. Specifically, having a scoping conversation, or a YAGNI conversation, or a "debt" conversation results in less work being done upfront.

However, if the conversation is one of "debt", then, in my experience, the necessary work is not done later, because there is no organization demanding repayment.

I keep thinking the same, did you find a good solution for it? My thinking is to argue to the business to schedule some dedicated time for "investments" to make the iterations faster.

I'm sorry to tell you that the only solution that has ever worked for me is to leave and join a company where engineers make those decisions.

On the other hand a lot of efforts to avoid technical debt end up producing more technical debt because in the end you don't know where things will be going. I think refactoring just needs to be an ongoing effort and should almost never be an explicit task.

IMO, a more pernicious form of tech debt starts with one system. A second person comes along and builds their own system/software b/c of some real or perceived issue with existing software (for example someone thinks something is poorly designed or not the right(tm) technology, etc). This second system then forces integration, APIs, TCP connections between apps.

No big deal, right? But the next person comes along and says "I don't like those systems - I want to put tech3 on my resume" (a little resume-oriented architecture, anyone?) and they build a 3rd system.

Now it's really hard to develop features across those apps. One has to be an expert in all those techs. Integration becomes 80% of the work. Now ppl have to create new or buy software just to get anything done. And so it goes.

There's a name for this, used a lot on this site.

Technical debt is not a thing.

It’s made up by people who want things done in a different way than they currently are.

All currently good design recommendations will become wrong ones in the future. Today’s best practice is tomorrow’s technical debt.

There is just code that works and makes the company money, and code that doesn’t work.

Code that is readable and code that is not readable.

Starting out with “the right design” will become the wrong design when the market changes and the specs change under your feet.

Moving fast and making software that meets market needs requires going back and finding parts that need to be consolidated.

It’s just how building software in the non-trillion-dollar-company-or-in-academia world works.

Interesting. I would say that John didn't build a solution that met all of the use cases that the business needed.

* To ensure accuracy of transactions, he should have built reporting to meet the needs of accountants.

* To support change, he should have kept his components separate and provided modest testing examples.

* To support developers and operations, he should have found and documented the external dependencies along with steps to verify that they are in place.

Whether product managers didn't uncover these requirements or the business didn't prioritize them, these aspects of the definition of "done" have more to do with the environment the engineer is working in and less to do with his or her work.

I see an issue in the web industry: there is sometimes an economic incentive to maintain a solution with technical debt. Just ponder the idea that you hire a bunch of contractors to build you a house, and then you keep them on payroll to come back and fix every broken pipe that they installed to begin with. Even if they take pride in their work, they'll soon drift into ugly, unsustainable work due to the nature of their incentives. We might need a proper system to hold engineers responsible instead of letting them hide behind a company, one that in the long run also provides _space_ for engineers, plausibly.

If, for example, there is no requirement to support multiple currencies then there is no need to spend time on designing for multiple currencies "just in case".

This is not choosing an easy solution; this is doing just what is needed instead of wasting time and effort on something that may never actually be needed.

Now, the design should be sound and follow good practices. This is how to limit the rework needed when new features are added because it is impossible to design to cover all possible future features: Do the minimum but do it cleanly.

Technical Debt only really counts if you have to pay interest on it. Many times I see new engineers wanting to re-write old working code in their new favorite language.

My litmus test for technical debt is things we should have done a while ago, but since we didn’t it’s hurting us. Continuing on the same path will hurt even more. Better to fix the pain and then move on.

Working code that does what it needs and doesn’t require much maintenance is great code. Old != technical debt.

Discussion of technical debt always seems to devolve into an argument about what technical debt actually is.

Are there specific patterns of code behavior, metrics which can be derived from analyzing code, quality scores that are non-controversial, which can be used to define this problem? Things that can be measured and managed in a canonical way? How can some knowledge of experienced CTOs be distilled so that we can start to automate some of this wisdom?

I would say: technical debt is anything that makes the system more expensive to maintain (and grow) over time, and becomes more expensive to fix the longer you leave it. If it's just a bad or incomplete aspect of the system, that's not sufficient. It has to become more expensive over time.

So e.g. lack of functionality isn't tech debt, but a bad abstraction is usually tech debt.
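A toy model of that definition, in case it helps: treat the "interest" as a per-quarter growth rate on the cost of the fix. The formula and every number below are illustrative assumptions, not measurements of anything.

```python
def cost_to_fix(initial_cost, growth_per_quarter, quarters):
    """Cost of fixing an issue after waiting `quarters` quarters.

    growth_per_quarter is the made-up 'interest rate': 0.0 means the
    fix never gets more expensive, which by the definition above is
    not debt at all.
    """
    return initial_cost * (1 + growth_per_quarter) ** quarters


# Illustrative numbers only, in engineer-days.
missing_feature = [cost_to_fix(10, 0.0, q) for q in range(5)]
bad_abstraction = [cost_to_fix(10, 0.5, q) for q in range(5)]

print(missing_feature)   # [10.0, 10.0, 10.0, 10.0, 10.0]
print(bad_abstraction)   # [10.0, 15.0, 22.5, 33.75, 50.625]
```

The missing feature costs the same whenever you build it; the bad abstraction is the one where waiting a year makes the fix roughly five times more expensive.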

Bad tech debt will ruin your software, but good tech debt is a smart investment in your business. It takes a good team working together to balance their "tech debt budget" against the other revenue needs of the company.

If I've run on Version X of some DB software for a while, and tomorrow Version X + 1 is released, yes, I should recognize that staying on Version X is tech debt. But that may absolutely be the right thing to do for now.

I started to use technical liability and technical asset in lieu of technical debt. Easier to explain and provides more insight into good engineering choices.

I like the metaphor technical tax, as it's the cruft that usually permanently sucks up x% of your development capacity. If it ratchets up permanently, after some time you're at a complete standstill.

This article lacks nuance around the tradeoffs inherent in shipping complex software at maximum velocity. Much more useful is evaluating the long-term impact of specific instances of technical debt, particularly “contagion”: https://technology.riotgames.com/news/taxonomy-tech-debt

I've worked on two separate old code bases with PMs who had read The Lean Startup and believed that technical debt does not exist, which seriously aggravated the situation. They did not seem to understand that the main thesis of that book assumes you have a one-in-a-million chance of becoming a unicorn or of selling the company to another party and cashing out.

> Let’s dive deeper into the problems of John’s code:

  - Payments couldn’t be processed in different currencies
  - If the delivery system is offline, the code wouldn’t work
  - Users with deactivated accounts could still access the system
  - No automated testing
> And this, my friends, is what we know as Technical Debt. Why?

Ah no, this is what is known as a broken software development process.

There are an uncountably large number of pathways one's efforts toward a goal can take. Given that the future is unknowable, any decision one makes can turn out to be an inconvenient (not necessarily wrong) decision. That is "technical debt". It is unavoidable.

If you are ignorant of the possible consequences of a decision, or choose to ignore or hide them, that's something else.

Does anyone have any examples of companies that died because of too much technical debt? For me it's kind of one of those things I acknowledge exists but I try to avoid any projects where their goal is purely "decrease technical debt" or even when that goal is on the periphery. And I see that kind of thing way more frequently than you might think!

Not a company, but a project. It has not died yet, it is instead being rewritten while the old one is on life support.

The technical debt accrued due to:

1. Poor code practices (C++, poor use or understanding of pointers and threads created most of the actual errors in the program that make it unstable)

2. Excessive copy/paste (so bad decisions propagated quickly)

3. Poor architecture (didn't scale, performance is technically fine but development costs exploded as new capabilities were added)

4. Poor testing (almost entirely manual, as the system grew this grew with it to the point of creating major delays in releases)

They're rewriting it now keeping these lessons in mind. They're surviving because they're part of a larger org that can afford to keep them around for the rewrite, but essentially this now 10-12 year old system is dead. The other reason they're surviving is they happen to have only one competitor, whose product costs 10x as much per seat. So their customers want them to survive if they can hit the quality improvements in the rewrite, failing that they'll be disbanded.

Just like any debt, technical debt is not "always bad".

The point of incurring debt is to make a "large payment" (deliver software) quickly and immediately, in exchange for paying interest over time.

If you can deliver a working prototype, scale, get funded, and grow your business quickly in exchange of technical debt, that's not really a bad deal.

One thing I'd like to point out: The tower in Pisa was leaning, and gonna collapse. Then an architect devised a way to fix it. They of course said "hell no" because literally the only thing going on in Pisa is the tower. To fix/destroy it would destroy the entire economy in Pisa.

So instead they chose to lift/stabilize it.

Technical Debt is not a thing when you are coding a new feature. It’s like a stock future. It might be debt, but it might be something no one uses or gets dropped (or rolled into another feature). There’s a cost to speed that has to be considered. Maybe it’s debt, maybe you save the time and spend it on other more valuable tasks.

There are many dimensions to technical debt, but the biggest one is the sociopolitical one: developers hate fixing each other's bad code, but the people who wrote it are often in a position where they can transfer the ownership of this code to someone else, either by being their manager or by quitting their job.

I find it fascinating that a buggy product is considered delivered. That is the root of the problem.

I don't think debt is really a great analogy. Lots of people are comfortable with debt, if nothing else.

I think it's more like a rushed major remodel of a house. It's kinda bad, but there's really just no practical way to fix it without a teardown.

Yup. It's actually technical tax.

I like to think of Technical Debt like an accountant or CFO would think about Financial Debt.

Incurring debt at early stages provides leverage. Incur too much and it can really come back to bite, but a startup that incurs no debt will struggle to grow.

One question I ask myself to bring clarity is:

If I am the one paying bills, would I really pay another developer for this refactoring? Yes: Technical debt it is. No: just busy work.

I've never seen a project or company killed by tech debt. I've seen projects and companies killed by obsessive-compulsively trying to eliminate it. YMMV

In the early days of computers the first working version was always so slow rewriting it a few times was an obvious step. You would end up improving a lot of things not necessarily related to speed. I still often crank out a really shitty version that works just to get to the points that require some deep thought. You can get a lot of work done if nothing really matters. The first rewrite is helped a lot by having done things before. The second full rewrite sadly never happens anymore. Just incremental changes.

Much of the discussion and worry about technical debt stems from a few fallacies programmers seem to fall into:

1. Some objectively "right" or "best" way to write software exists. We refer to "patterns" and "best practices" as if those had the force of science behind them, even though at the same time we can't, as a profession, agree on what we mean.

2. Code that we (or someone else down the road) have to maintain wasn't written correctly or perfectly in the first place. Programmers tend to hate maintenance work.

3. We imagine that we could write the perfect program if only our managers and customers didn't impose time and budget constraints, or interfere with their stupid product and marketing directives.

4. Failing to write perfect code that lasts forever indicates a failure on the part of the programmer or the team.

5. Perfectionism (a symptom of obsessive-compulsive behavior) and the increasing worry among programmers about how others perceive and critique their code.

Writing perfect code that requires minimal maintenance in the future would require knowing that future, and all of the changes to requirements and constraints that will happen during the lifetime of the code. We can only work with the requirements we know, and those we can reasonably anticipate. Trying to code for requirements we don't have is usually called "overengineering," which means something else in other engineering contexts (as a few other commenters pointed out).

I have worked in software development for 40 years so I know that almost no one likes doing maintenance work, especially on someone else's code, and especially with languages and tools no longer in fashion. Maintenance programming is often given to new hires and junior programmers, while the senior developers get to write new code and play with the latest toys. This class division among programmers, often expressed in terms of what makes a programmer junior or senior, gets exacerbated by the personal quality of programming. Programming is a craft, not a science, not even engineering, but we forget that and try to express aesthetic preferences in terms of objective quality, even when we don't have agreed-upon ways to measure objective quality of code.

Most software has a short lifespan, which means the "technical debt" will disappear when the code gets rewritten or replaced by a SaaS product or the requirements change. Bridges are expected to stand for decades or even centuries. Jumbo jets have lifespans measured in decades. Very few software projects will stay in use that long. I routinely work on web sites that will stay online for less than a year, because they support short-term marketing goals. American companies have an average lifespan of 18 years, and startups are more likely to go out of business (or get acquired) than to stay in business for even a few years, so focusing on building the equivalent of the Great Pyramid in code at a startup or small company in order to avoid "technical debt" is almost always wasted effort.

Businesses factor maintenance into the lifetime cost of software just like they factor maintenance (and depreciation) into the cost of buying a fleet of trucks or a truckload of copying machines. It's the programmers who impose unrealistic goals of no/low maintenance and frame that inevitable maintenance work as a failure of the original design or implementation, i.e. technical debt.

We should try to do our best work, to make code that works (i.e. meets requirements and doesn't suffer from preventable bugs), and we should try to anticipate likely changes and make those easy for the next programmer. We should stop imagining a perfectly right way to write code, and stop thinking that maintenance work is beneath us, or a sign that the last team didn't know what they were doing.

I think programmers would enhance their value and learn a more realistic (and maybe humble) approach to their craft if they didn't think of programming as an isolated activity disconnected from the business and customers/users their code is meant to serve.

James Joyce wrote “What makes most people’s lives unhappy is some disappointed romanticism, some unrealizable or misconceived ideal. In fact you may say that idealism is the ruin of man, and if we lived down to fact, as primitive man had to do, we would be better off....” Joyce wasn't talking about programming, but I try to keep that in mind when balancing technical decisions against the larger set of business and customer concerns.

I think this assumes you want to write perfect software, when sometimes it just needs to be good enough.

This would be more interesting if it told us more about what actually happened with the Tower of Pisa.

Assessing tech debt is basically my job these days, as I'm an architecture and infrastructure consultant for technical due diligence on acquisitions, mostly by PE firms. Some fun things you might not know from the field of technical due diligence:

- Your startup is much more likely to succeed by being purchased by a private equity fund than anything else these days. Most tech acquisitions are now PE, like over 75% of big ones. (Even up to north of a billion)

- The PE fund will send a diligence team. Depending on who they hire, your diligence team could be clueless consultants with checklists, or they could be real hackers who will look at your code, ask all the hard questions, and know a lot about tech debt. They will talk to you for many hours about tech debt. We typically interview for 12 hours and ask for a lot of metrics and docs. We do this for a living, so we're pretty good at sniffing out BS and knowing how to get devs to talk about their software realistically. Most devs are pretty happy to talk shop once they know we are actually developers who have been through the same process ourselves. Sometimes I feel like it's more like Developer Therapy. :-)

- If you have bad tech debt, it won't necessarily sink a deal. Deals usually go through or not for other reasons, like business model revenue numbers - stuff that can't be fixed with the application of cash. However, we make a table for them of estimated cost to mitigate debt, and that can be serious cash. (i.e. X extra senior devs and QA for Y months/years)

- This cash will come off the price. Or as Amazon apparently puts it... "a haircut". A haircut can easily run over a million on even a small company. So the money saved on devs for the years of accruing debt just comes off the acquisition price, but with a terrible interest rate.
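For a feel of the arithmetic behind the mitigation table and the haircut, here's a minimal sketch. The `Remediation` class, the line items, the monthly rates, and the offer price are all invented for illustration; this is not an actual diligence worksheet.

```python
from dataclasses import dataclass


@dataclass
class Remediation:
    """One row of a hypothetical mitigation table: X people for Y months."""
    item: str
    people: int
    months: int
    monthly_cost_per_person: int  # fully loaded cost, assumed for illustration

    def cost(self) -> int:
        return self.people * self.months * self.monthly_cost_per_person


def haircut(items) -> int:
    """Total estimated cost to mitigate the debt."""
    return sum(r.cost() for r in items)


def adjusted_price(offer: int, items) -> int:
    """Offer price after the mitigation estimate comes off."""
    return offer - haircut(items)


# Made-up numbers: two extra senior devs and one QA engineer for a year.
table = [
    Remediation("untangle payments module", people=2, months=12, monthly_cost_per_person=15_000),
    Remediation("build automated test suite", people=1, months=12, monthly_cost_per_person=10_000),
]
total = haircut(table)                      # 360_000 + 120_000
price = adjusted_price(10_000_000, table)
```

Even this small, invented table takes close to half a million off the price, so it's easy to see how a real one runs past a million.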

Also of interest: we don't actually say it's always bad. We differentiate between intentional and unintentional debt. If you accrued debt to get to market quickly and are then doubling back to attend to it reasonably, that can be a very smart decision and may actually reflect well on the assessment. On the other hand, debt accrued because the founders only hired interns and new grads and let them run the asylum for years does not.

If I were to encapsulate in one sentence what I've learnt about tech debt in the last two years and 50-odd companies examined it would be this: nobody saves money by trying to hire mostly junior developers. Nope nope nope nope nope. If you are running a startup, you need at least a couple of six-figure-salary, 5+ years experience devs, and a CTO or lead architect with like 10+ years or who really, really knows their stuff. Do not cheap out on this! At a guess, I'd say it's the most obvious differentiator between the hits and the really big haircuts. That said, I only look at companies who have gotten pretty far, so the main differentiator is likely different at early VC stage. We're basically doing the first exit, and if we're talking to you, you have a successful business that some professional big time investors want in on.

Hope that's of interest to some folks! :-)
