Ask HN: As a CTO, what is your most frustrating problem with technical debt?
212 points by waterlink on July 24, 2018 | 221 comments
_or as a person in a CTO-like role_

Ah, the technical debt.

While the concept is sweet from the business perspective, the result of leveraging technical debt, most often, is a big mess.

The technical debt is accruing every week, after each iteration, sprint, what have you. Your developers are whining about it, and they want to do a significant rewrite or solve it with microservices altogether.

And you know this rewrite, like others in the past, will be a significant sink of time, and likely a catastrophe.

This is how I see it from my humble developer/team lead perspective. I’m really curious to know how people in CTO roles perceive it.

I agree that this is a useful quality lever that can get the whole company out of a bind, or could let business leverage a market opportunity before the competitors do.

What I’ve found is that people in various project-manager-like roles tend to overuse this lever. I believe this lever should be used only when there is a significant gain for the business, and the debt should be paid shortly after that.

What I’ve personally seen, and what I’ve been told by my fellow developers, is that it is common to use the technical debt lever just to:

- reach the “sprint commitment,”

- meet the artificial deadline,

- increase “developer’s productivity,”

- show manager’s power.

These don’t sound like critical high-value goals to reach, given the price of the Technical Debt.

That is my perspective. Now, what do you have to say?

As a CTO (or in a CTO-like role), what was your most consistent pain, frustration or problem with Technical Debt in your company in the last 2 years?

Have you ever had any?

Thank you for your time reading this and giving a thoughtful response. :)

I think a lot of the time when a developer shouts “technical debt” what they are really shouting is “code someone else wrote that I’d rather rewrite than understand”. (The rest of the time is the same but they’ve understood it enough to think it’s a disaster area.)

I have found it’s best to not take tech debt complaints very seriously and instead look at actual success metrics. For example if every change to a bit of code introduces new bugs then that might be a reason to tidy it up.
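One rough way to make "every change introduces new bugs" measurable is sketched below. It assumes you can label each change to a file with whether it was later linked to a bug fix; the change log, the file names, and the 0.5 threshold are all made up for illustration, not taken from any real tracker.

```python
from collections import Counter

# Hypothetical change log: (file path, whether the change was later linked to a bug fix).
# In practice this would come from your VCS and issue tracker; the data here is invented.
changes = [
    ("billing/invoice.py", True),
    ("billing/invoice.py", True),
    ("billing/invoice.py", False),
    ("users/profile.py", False),
    ("users/profile.py", False),
    ("users/profile.py", True),
    ("users/profile.py", False),
]


def bug_rate_by_file(changes):
    """Return {path: fraction of changes to that path that introduced a bug}."""
    total = Counter()
    buggy = Counter()
    for path, introduced_bug in changes:
        total[path] += 1
        if introduced_bug:
            buggy[path] += 1
    return {path: buggy[path] / total[path] for path in total}


def cleanup_candidates(changes, threshold=0.5):
    """Files whose bug rate is at or above the (arbitrary) threshold, worst first."""
    rates = bug_rate_by_file(changes)
    return sorted(
        (path for path, rate in rates.items() if rate >= threshold),
        key=lambda path: rates[path],
        reverse=True,
    )
```

A number like this only ranks where to look first; it says nothing about why a file is fragile.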

I’ve never seen technical debt referred to as such at any workplace - all the time I’ve seen the term used to refer to poor or outdated design decisions that compromise the ability to deliver on business objectives in serious ways.

Complaining about technical debt complaints is an easy way to potentially bias oneself against recognizing real issues before it is too late and everything is on fire all the time. Be wary of any such blanket assumptions in engineering.

Lucky you :) I think there's a reason that post is currently voted to the top.

I'm lead engineer on my startup's project, not the CTO but reporting directly to him. Team of about 30.

There are of course many legitimate things that can be described as tech debt, files that got too large and convoluted over many patches by many people, that sort of thing. But I find I have to watch out for the concept of "tech debt" because a few engineers will abuse it if given the chance.

For instance they will insist that code that was just written fresh by a colleague is "tech debt" because they would personally have written it differently, when there are no meaningful differences between the two approaches.

They can claim that a design they don't understand the reasons for immediately (regardless of whether it's documented) is tech debt.

They can claim any feature they don't personally want to do or which is perceived as hard work will create tech debt. And so on.

I wrote the initial code and I'm close enough to the codebase still to be able to fairly thoroughly examine claims of tech debt. I'd say most such claims are actually not a big deal and do not justify any non-trivial investments of time to resolve.

Moreover the sort of engineers who pay down more than their fair share of legitimate tech debt are invariably the ones who complain about it the least.

Sure, but are you differentiating between run of the mill bitching, and actual serious concerns? Because if I whine about x internal library in stand up, that is probably not important. But if I come to you in person and tell you that x internal library is introducing more bugs than saving time, I probably really mean/am concerned by that.

Freshly written code can easily be technical debt. An example of this could be "put a junior in a complex piece of code and let him add a feature". That feature will work and probably is technical debt, because of bad design choices.

I've been on the other side of this. I was developing the frontend for a series of HTTP endpoints and my colleague was working on the endpoints themselves. He was creating some terrible code but I had no power at the time. When he was done, the code was functional but had various bugs, and a new feature was needed. I spent a week studying the code and managed to add the feature, which however was buggy due to how the code was written originally. I fixed all the bugs, and ended up totalling 4 weeks of work. This colleague was let go after this due to a series of circumstances, so I became the owner of the code. Then a new feature needed to be added, but to do that I'd need to change various things in that code. Two weeks in, I still wasn't done. At some point I asked to be able to rewrite the code. I did; it took 2 weeks and I was then able to add the feature easily. I'm no genius, but code that cripples you from the get-go is tech debt. Tech debt is failure in software design, which ends up in unmaintainable code. I'd listen to anyone pointing at tech debt in my code: it means my code sucks, I made a big professional mistake and I need to correct it.

There is also voluntarily added tech debt, that's the one added by taking shortcuts. As long as you don't build new features on top of it, you are good.

  > For instance they will insist that code that was just written fresh by a colleague is "tech debt" because they would personally have written it differently, when there are no meaningful differences between the two approaches.

  > They can claim that a design they don't understand the reasons for immediately (regardless of whether it's documented) is tech debt.

  > They can claim any feature they don't personally want to do or which is perceived as hard work will create tech debt. And so on.

Yeah, those things are not technical debt. But that doesn't mean that technical debt isn't a real issue. Technical debt isn't about whether you like or understand a particular approach, it's about whether the codebase as a whole is still maintainable, whether the bugs make sense and are fixable, whether new features can be added without stabilising the whole code base.

It's never the new feature that's the problem. If the new feature can't be added in a good way, it's the old codebase that has unhandled technical debt.

> Technical debt isn't about whether you like or understand a particular approach, it's about whether the codebase as a whole is still maintainable, whether the bugs make sense and are fixable, whether new features can be added without stabilising the whole code base.

It's not about that either ... it's about things that were put off, generally for valid reasons at the time.

> It's never the new feature that's the problem.

What do you mean by "problem"? Many new features are inherently problematic.

> If the new feature can't be added in a good way, it's the old codebase that has unhandled technical debt.

That would only apply to waterfall development with infinite foresight. In the real world, wise people employ the YAGNI principle.

Ugh. Instead of redefining common usage you could just use a different word or words.

Edit: Is saying "bad design" or "ineffective design" going to lead to some fearful confrontation?

What common usage was redefined?

Where would you substitute "bad design" or "ineffective design" into the comment?

Perhaps this will help: http://wiki.c2.com/?TechnicalDebt

Common usage is "I'd rather rewrite than understand the problem", as well as "I disagree with decisions that were made."

> Perhaps this will help: http://wiki.c2.com/?TechnicalDebt

Seriously, it won't. That's the point. People overload these words. It's a cute ideal definition though.

You are so full of dead beef. You wrote "Instead of redefining common usage you could just use a different word or words", but he used the word as correctly defined. It's a pity that you aren't helped by knowing what the word means, but that's your problem, and the consequence is that sensible people write you off as a silly troll.

Over and out.

P.S. Yeah, go ahead and downvote me, troll.

Hey it's nothing personal. I'd apologize for trollishness, but I really believe what I'm saying.

Edit: I apologize for beating a dead horse.

OP's usage and explanation of the term are spot on, and you miss the whole point if you conflate technical debt with bad design. One of the causes of technical debt is a change in requirements, and changing the initial requirements doesn't mean that the design was bad.

Common usage misses the whole point as well.

Just because you personally misunderstand a concept it doesn't mean everyone else is wrong.

> I think a lot of the time when a developer shouts “technical debt” what they are really shouting is “code someone else wrote that I’d rather rewrite than understand”.

Is it obvious to you that it's not just me?

> Moreover the sort of engineers who pay down more than their fair share of legitimate tech debt are invariably the ones who complain about it the least.

Oh really? I recently contracted with an industrial firm where most of the programming staff copy and paste at the drop of a hat and have no idea what DRY is, and think that technical debt and refactoring are some fancy university CS theory that "practical" programmers like them have no use for. I explained to the manager at some length the development and maintenance costs of the massive technical debt of their 30-year-old code base, while at the same time I tried to reduce it a bit with every submit.

Good for you :)

I think your situation sounds a bit different, where the manager either isn't very technical or isn't technical enough. Otherwise they'd be able to see the problem for themselves.

What I'm talking about are employees who have managers who understand the value of refactoring. What I find is that the people who get stuck in and most unambiguously improve hairy bits of the code (with full support of management) are usually not the ones who complain loudly and publicly about tech debt.

Speaking as a manager for a sec here, what I'd watch out for in your case is to not be seen too much as a complainer. Rather than say "this codebase is shit why does nobody care" look at it as, "hey guys! why not come to my interesting 10 minute lightning talk on my favourite refactoring technique" and maybe don't phrase it as about tech debt if they aren't responsive to that kind of analogy.

Managers in particular probably know their codebase is shit even if they aren't very technical, but labelling things as "tech debt" is always tricky because it's entirely possible that the debt was created by colleagues or those very same managers themselves, and whilst financial debt is well defined and measurable, tech debt is just an analogy. They may not even agree that any given bit of code is a problem.

> I think your situation sounds a bit different

Even if so, that's not relevant; you said that people like me "invariably" don't exist.

> where the manager either isn't very technical or isn't technical enough. Otherwise they'd be able to see the problem for themselves.

No, that's entirely wrong. As I clearly stated, it's the programming staff who can't see the problem. They are the people who never complain about technical debt while also never reducing it, contrary to your statement. That's the point, and your "good for you" is point-missingly dismissive.

> Speaking as a manager for a sec here

Just don't; I've been developing software since 1965 and I don't need your lectures (which are based on a complete lack of knowledge and erroneous assumptions about the software manager at that firm and my relationship and interactions with him), I simply pointed out that your sweeping generalization about who reduces technical debt and who points it out is erroneous. That's all. Period. The end. I won't respond further.

It might be upvoted because people find it catchy. But if you have ever written code in your life, you know that the person who wrote it has no idea whatsoever about software development.

I'll upvote it as well, as it is possibly the thing with the least amount of intelligence that I've read so far in my life.

> Complaining about technical debt complaints is an easy way to potentially bias oneself against recognizing real issues

Would you agree that we should think about specific issues that are slowing us down, and fix them one by one, starting from the worst ones?

(worst in sense of biggest slowdown/pain levels)

Sure, but my point was one should consider complaints as signal that the situation should be monitored/investigated so that the general outlines of an actionable plan can be crafted if there are potential areas of risk, or in the most serious situation, an actual plan.

Blanket painting of a class of complaints doesn't help them or you accomplish business objectives better long term.

We agree more than you think.

Most complaints about tech debt are not worth actioning, but that doesn't mean you shouldn't listen to all of them and try to filter the signal from the noise.

This doesn't match my observations at all. In fact, way more importantly, often what is being shouted is, "we are creating more technical debt than we can possibly handle by doing X in Y way to meet Z deadline, this is going to bite you in the ass after we all switch teams/companies because nobody wants to deal with this."

Edit: If developers are warning the CTO that there is an issue with technical debt that should be a warning flag. Particularly if these are senior/experienced developers who understand the insane costs of re-writing production code.

> we are creating more technical debt than we can possibly handle by doing X in Y way to meet Z deadline, this is going to bite you in the ass after we all switch teams/companies because nobody wants to deal with this

This one doesn't require you to read between the lines, people usually just tell me that directly :)

Your mileage may vary - in my experience this is another one of those things where it's not actually true most of the time. Note how it's a really easy claim to make, and basically impossible to disprove ("at some unknown time in the future something bad will happen unless you let me have my cake now"). To me that's a warning sign that it might be wrong.

As a leader (or CTO or whatever) your job is to make the call. And if it turns out you were wrong and there is literally nobody who wants to touch the mess, well the buck stops with you and you have to clean it up.

I'm happy to take that bet!

Or you know, the CTO could quit. I’ve seen this happen multiple times. Teams are in max feature mode, introducing a ton of code duplication, inconsistent abstractions and hidden assumptions. Team members get fat bonuses. For the next round of features, they can’t be as fast since they have to fix the mess they created to add new things without breaking old ones. So one by one they start switching to the team working on the shiny thing. New developers are hired, they realize the mess and nop out.

The harsh truth, which is evident from the thread, is that very few companies/managers actually reward fixing code debt.

> a lot of the time when a developer shouts “technical debt” what they are really shouting is “code someone else wrote that I’d rather rewrite than understand”

There's a paradox in this viewpoint. On the one hand you believe that your developers are so immature that they automatically shout "tech debt" for any code they didn't personally write ("only MY code is good"), then on the other hand you're confident that the code produced by these immature developers is actually just fine and doesn't have any technical debt.

I know you're not speaking in absolutes, so you may be right. You especially may be right about your own specific situation. However, I think a lot of people who have this viewpoint are simply kidding themselves into believing that "everything is awesome, surely there couldn't be any problems because I am the team lead".

Just remember the general form of the argument is "my team is great and talented, but I can't trust what they're telling me". It might be the case, but be careful of your own biases.

This is not a paradox (or a contradiction). Many developers prefer writing code to reading code. They may be perfectly good at both, and still err on the side of claiming that they need to write code rather than read it.

You mean contradiction, not paradox. (A paradox is a contradiction that can't be resolved because both branches seem to be undeniable.) The contradiction stems from hyperbole (a type of lying) and other forms of nefarious (insincere, intellectually dishonest, aimed at winning rather than truth-seeking) rhetoric.

> “code someone else wrote that I’d rather rewrite than understand”

I'm not disagreeing, but there is something to be said for writing code that any idiot can understand so that future developers don't even want to rewrite it, and can maintain it.

I've landed in code bases on both ends of the spectrum, and I never asked management to rewrite the well-written ones.

In a wicked way, people who write that kind of code never get credit for it, and often lose leverage of their job.

I am not saying that writing convoluted code is good, no, but I can tell, even if based on personal experience, that barely anyone talks about "wow, this is such straightforward code" or gets promoted for "everyone can understand your code!".

It is a weird dynamic that gets worse with non-technical managers, and it is the reason why I believe team leaders should be hands-on in software development and involved with code review.

Let me get this straight:

Person A, a programmer, who writes easy-to-understand code, but takes a bit more time to finish their tasks

Person B, a programmer, who writes dirty code, but takes much less time to finish their tasks

The current non-technical manager will see only one part of the picture and promote Person B. Moreover, Person A might get in trouble, and pressure to “compromise on quality.”

Within a short-term, it is understandable.

Within a long-term, this is a disaster scenario!

So should the “code quality” be part of a performance portfolio of the developer when it gets to promotion time? And how do we get there?

I've also seen

Programmer A writes straightforward code that's easy to change and understand.

Programmer B writes convoluted "clever" code that completely falls apart when requirements change, or even just in production because their clever solution didn't account for all of the edge cases.

Now Programmer B has to save the day consistently, and therefore is seen as the person saving the company, when the problems were all of Programmer B's construction to begin with.

If you are a CTO and you haven’t had the “no more heroes” conversation with your employees, then it is my long reasoned opinion that you are being negligent. I can’t trust “leadership” like that.

And if you don’t take “I can’t trust you” as the highest insult a developer can give, then you are in the wrong line of work.

I would expand upon this and say that Programmer B doesn't ship the code at all, it wouldn't pass code review muster.

Unless we're talking about a seriously silo'ed org, or Programmer B is actually "Team B"

A lot of teams are a cult of personality around one developer. Particularly if there are a bunch of juniors on the rest of the team, Programmer B can usually get anything through code review with the ol' "of course it's complicated, that's why _I_ am the one working on it. It's not for mortals to understand".

Treat every question about your code as feedback. If multiple people ask the same question you have a problem that needs to be fixed. Documenting the gotcha should be your last resort. Implying incompetence isn’t even on the board.

Yeah, that is probably a few first months of a start-up (or even first year).

^ This is pretty much standard at any startup in my experience.

This is not just bad. It is also unfair.

You have multiple engineers working with each other on the same code base, reviewing each other's code before it can be merged into the main branch.

Then you start to find which engineers the other ones don't like to work with.

The real value of that sort of person is in the leadership they can provide and not the mountain of dramatic code that everyone mistakes for brilliance.

I would and do advise anyone who possesses that sort of clarity to push for more responsibilities. They should be involved in all of the major problem solving discussions.

Yes, good code is a lot harder to spot than bad code.

At the risk of sounding too glib or reductio ad absurdum:

The best code is impossible to spot, because it's not even there.

Put another way, a truism I've often heard from the most senior programmers (especially in the context of critiques of LOC as a measure of productivity), is that their best work is often writing less code, or outright deleting existing code. It stands to reason that this kind of work is much more difficult to spot if all one looks at is the codebase in its current state.

See "The Most Beautiful Code I Never Wrote" by Jon Bentley.

It seems like:

“rewrite” is in sense of “I don’t even want to understand it, going to just delete it, and implement anew”,


“refactor” is in sense of “I had to spend quite a while trying to understand this code, and I think it can be improved, so that the next developer doesn’t have to spend this time.”

See also "Chesterton's fence"

That is a fantastic tidbit, TIL. This is pretty much what I had in mind to refute the assumption the earlier parent made about rewrites being desired so code doesn't have to be grokked. IME, the tech debt we complained about was in situations where we knew what we had to do to write a new feature, and why it wouldn't work because of the decisions made previously, and the monumental amount of work involved in putting just one more bandaid on that system, instead of just fixing the dang data model.

That makes me wonder: if I fix something, if I set up Chesterton's fence, how can I make it a bit more obvious why I'm putting that fence there for the benefit of future us?

At the risk of sounding snarky, I feel like a block of code comments explaining the naive approach that was rejected, as well as the particular edge cases that this is handling which the simpler generalized case doesn't handle, would help a lot for that.

I realize that as developers we often glaze over the comments (my editor even shows them as low contrast grey), but docstrings and code comments can be tremendously valuable when trying to understand why something needs to do things a particular way.
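To make that concrete, here is a minimal sketch of a docstring that records the rejected naive approach and the edge case behind the more complex one. The function `split_cents` and its scenario are hypothetical, chosen only to illustrate the commenting style.

```python
def split_cents(total_cents: int, parts: int) -> list[int]:
    """Split an amount in cents into `parts` shares that sum exactly to the total.

    Why this is not simply `[total_cents // parts] * parts`:
    that naive version was rejected because it silently drops the remainder
    (for example, 100 cents split 3 ways would give 33 + 33 + 33 = 99).

    Edge case handled here: the leftover cents are handed out one at a time
    to the first shares, so the shares always add up to the total.
    """
    if parts <= 0:
        raise ValueError("parts must be positive")
    base, remainder = divmod(total_cents, parts)
    return [base + 1 if i < remainder else base for i in range(parts)]
```

The next reader who is tempted to "simplify" this back to integer division can see which fence they would be tearing down, and why it was put up.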

The usual code-comments/self-documenting code/git commit descriptions, no? If a long-form explanation is justified, wikis are cheap.

I see no excuse for leaving the reader to guess at what on Earth you were thinking. Implementing a wacky solution is sometimes a necessary evil. Incomprehensibility isn't.

I am downright militant about people not relocating code and breaking the commit history.

When I see WTF code, I want to know what story it was associated with so I know why the fences were erected. Sometimes they are right and I need a complex solution. Sometimes they were accomplishing something the hard way and there was an easier spot to achieve the goal.

Basically I am fixing two bugs. The old one again, and the new one.

I just love doing that part. I wish I could be paid to just refactor shitty code bases all day.

There is a level of luck involved in producing well written code. Sometimes your assumptions turn out to be correct and sometimes a big requirements change comes and turns your code from simple to super complex.

Sure, a giant-ball-of-mud codebase might be the result of changing requirements and assumptions, but it could be the result of laziness/unskilled development.

There's a skill to writing readable, ideally self-documenting code. If that skill is wanting, even the most stable requirements won't save you from the mud.

> could be the result of laziness/unskilled development.

Or not being allowed/kept too busy to fix it.

Correct. But don't forget that it can also go the other way. The best code writing skills can be defeated by ever changing requirements. Besides code writing there is also a skill in being disciplined about requirements changes.

From this point of view, I guess you should make an exception and treat tech debt complaints more seriously when it is coming from the devs who wrote the original code.

Sometimes in order to meet business needs and get things out the door you have to cut corners that you need to go back and take care of later, or you are crippling the future growth and stability of the project. I like your final point "if every change to a bit of code introduces new bugs then that might be a reason to tidy it up".

I think sometimes proper testing is the first thing to go in a time crunch, and good tests improve velocity. If you know that the entire app is being tested automatically, you can develop a lot faster and more fearlessly. Plus writing good tests inherently makes the original code better. You find and fix hidden bugs, and you refactor the code to be more testable which makes it better in other ways.

> From this point of view, I guess you should make an exception and treat tech debt complaints more seriously when it is coming from the devs who wrote the original code.

Oh, that is quite an interesting idea. If the author themselves acknowledges there is a problem—there should be a real problem.

the other way works too. any developer should be able to discuss at length the compromises made and the weak aspects of any project. every piece of software has a long list of potential improvements in clarity, verification, efficiency, generality, etc.

when I first heard the term 'technical debt', I thought it was fantastic that we had a shared name. but as someone else pointed out in this thread, the normal compromises one makes because of schedule and lack of importance are really a different thing entirely than a codebase that is failing structurally.

normal compromises can be ignored, and often aren't even compositional. that's the kind of linear, easily repairable stuff I think should be called 'debt', and it isn't scary at all. you overcome it at some later date when and if it makes sense. maybe debt isn't the right word, because you aren't borrowing failure from some abstract ideal, you're just choosing to trade off time for quality, which every effort in every field has to do.

structural issues that compound exponentially, and make it increasingly difficult to make any changes at all are a different thing (call it 'crippling debt', idk). it's a metastatic cancer and can easily be fatal.

Yep. I’ve seen this in a number of startups. Didn’t see it as much at big firms.

Reading code is harder than writing code. So, IMO, lazy or slow engineers would rather complain that code should be rewritten than refactored. Or that different frameworks / languages should be used.

I've entertained them a few times before realizing that it was a waste of time and often things came out worse.

I don't think I would agree with the usage of the word "lazy" to describe coders that want to rewrite entire chunks of code. They may have their invalid reasons for wanting to do so, but I wouldn't call that being lazy.

Yeah, I feel that this is by and large the scenario I've witnessed. Not necessarily pushed by slow/lazy engineers but by those who are bored with tried and true solutions that have some 'difficult corners to chamfer'. I think the problem with developers is that they want to solve a problem in a very recognisable pattern, playing into the Asperger's-like nature held by most perfectionists, but it unfortunately ends up much like you have outlined (but now they have some hot new technology on their resume, so moving on to different problems is usually more desirable than re-fixing the new problems they've introduced).

> I think a lot of the time when a developer shouts “technical debt” what they are really shouting is “code someone else wrote that I’d rather rewrite than understand”

IME it can be one of many things, including but not limited to: was written by someone who is no longer with the company, doesn't have (relatively comprehensive) tests, doesn't have any documentation, commit messages don't explain the code's reason for being, is a blocker to getting more lucrative features out, nobody understands it ergo nobody dares touch it, written in a language that is hard to recruit new devs for, is working fine now but will be a problem in x months/years time, contains potential WFIO points but is business critical, is difficult to scale, and possibly more.

What I'm saying in the above is that it's possible to write something in an easy-to-recruit-for language, that scales well, is comprehensively tested, has no nasty bugs, but could still be considered technical debt because the original developer(s) wrote useless commit messages, no documentation, and then left the company.

> "code someone else wrote that I’d rather rewrite than understand"

I've never seen this happening but it seems to be common in hipster environments.

Do not hire people that want to rewrite something just because it's not written in the popular language/framework of the year.

Totally agree. And often if you make the effort to understand the code you can refactor it with reasonable effort to address the real problems.

A lot of rewrites I have seen just traded one set of problems with a different set of problems.

I like the boy scout rule. Clean up and leave it nicer than you found it. Focused mostly on things related to your change. Big rewrites cause big issues.

> Big rewrites cause big issues.

Those issues are very much constrained by a very good test suite.

I'm just going to ask which company you've been at that a) needed a big rewrite and b) had a "very good test suite". In my experience there's not much overlap here, especially because one of the reasons why a) is necessary is often that there's so little modularization that b) is not possible to exist.

> a) needed a big rewrite and b) had a "very good test suite"

Desk.com has a gigantic Rails codebase and a test suite that runs for... well when I left them it was 40 minutes long but it's probably longer now. But it did save us many many times.

I'm working at another company right now (can't disclose) and they also have a rather large Rails codebase and a fairly good test suite.

I think the Rails community historically has been pretty great at communicating good test/design decisions, so the modularization issue you mentioned is largely moot.

Like I said, you wouldn't want a huge refactor of the code there, because it's already tested and probably very testable the way it is. Otherwise it wouldn't have this good test suite.

Modularization is the issue, not size. If you have a big project and it's nicely modularized you probably already have good tests for it, but don't need to refactor. If it's big and needs a huge refactor it's probably hard to test and doesn't have appropriate tests.

Unless your test suite needs a rewrite, as well…

(Better refactor, of course)

Your test suite should essentially be your requirements. If it's out of date, it's the first thing you should update. Adding features or changing your requirements without updating your tests is the essence of tech debt: you're delivering features quickly at the expense of future work and brittleness.

I agree it's pretty common though. I've also seen huge test suites of only unit tests, no integration tests. As a result, you have no assurance that your refactors work because changing the inner workings fundamentally requires you to change the unit tests too, and nothing checks the top-level interface.
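To make the point about top-level checks concrete, here is a minimal sketch in Python. The names `slugify` and `_collapse_dashes` are hypothetical and not taken from any codebase mentioned in this thread. The test only touches the public function, so the helper can be renamed, inlined, or replaced during a refactor without rewriting the test.

```python
import re


def _collapse_dashes(text: str) -> str:
    # Internal helper (hypothetical): free to change or disappear in a refactor.
    return re.sub(r"-+", "-", text)


def slugify(title: str) -> str:
    # Public interface: the behaviour that callers actually depend on.
    lowered = title.strip().lower()
    dashed = re.sub(r"[^a-z0-9]", "-", lowered)
    return _collapse_dashes(dashed).strip("-")


def test_slugify_contract() -> None:
    # Exercises only the top-level behaviour, never _collapse_dashes directly,
    # so it keeps passing when the inner workings are rewritten.
    assert slugify("  Hello, World!  ") == "hello-world"
    assert slugify("Tech   Debt -- 2018") == "tech-debt-2018"


test_slugify_contract()
```

A unit test written against `_collapse_dashes` itself would have to be changed or deleted the moment the helper goes away, which is the situation described above.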

You either rewrite the tests, or the code.

Not both at the same time, in my opinion.

You mean a test suite that will take longer to write than the rewrite?

He means one that slowly goes out of date because nobody has the time to change it when the requirements change.

Whilst I find that this is definitely a good portion of technical debt, the biggest section I see comes from the intersection of reusing code and moving requirements.

This often happens when code reuse gets nested (re-using something that is re-using something) and you then have to tweak it a little at the end to get to the right state.

Basically a bunch of development is done towards a local maximum. But looking at it as tech debt can bring it up to a global maximum.

>I’d rather rewrite than understand

The people with this attitude seem to be the best sources of hard to understand legacy code.

That is true.

While I have to agree that it takes the pain and struggle to understand some code out there, it is still worth doing, and a good indicator of the quality of the code, and the fact that it needs improvement.

Also, I believe leaving some “bad code” without understanding it, and just rewriting it, _might be_ unprofessional…

100% agreed.

Most code complaints are "this code sucks". But honestly most code looks like shit, and battle-tested code that's been heavily patched to work well and cover lots of special cases especially looks like shit.

But when there is a documented problem with a piece of code that is important and worth refactoring because it can't be mediated otherwise then of course you need to fix it.

For example, recently a previous dev split the data model across persisted HTML and rows in a database. This meant it was very easy for the domain-critical computations (driven by the database) to display one set of inputs to the user but calculate with another.

This isn’t a comparative literature class. I don’t have the luxury of spending all day savoring the clever hack you put in your code. I’m trying to fix a bug or add a feature. All the frippery just pulls me away from the problem at hand.

There was an agile coach I heard an interview with who insisted that Tech Debt was the wrong analogy. He preferred Wear and Tear. I think the problems of kludgy or overclever code are more apparent if you think of it that way.

At the very least, the main thoroughfares of the codebase need to be clean and free of dangerous obstacles.

I more see technical debt as "// TODO: Refactor this into its own class rather than do it here" written by a dev 4 devs ago

Interesting that this reply is voted highest, as it doesn't actually answer the question, but rather suggests ignoring the entire issue.

If you think it's mostly unfamiliarity, then that's a great opportunity to suggest some time spent to document that code (and/or operations thru that code). It helps w/ the understanding, and should be a precursor to figuring out if a redesign could help that "tech debt".

I'm sorry, but if you are going to dismiss a co-worker's opinion like that, it seems like you don't value their work at all. Do you trust them enough to write decent code, but not enough to value their concerns? That doesn't seem fair to them.

This is _so_ true. I have a unique career, no CS education, built a startup to scratch my itch, had no teammates on my codebase, worked on it for 5 years, sold it, now I've been at (insert FANG) for 2 years.

On my own, I would look at code I wrote 6 months ago, wonder what idiot wrote that, then figure out if it was actually bad, or I just didn't understand it.

With staff turnover and a lack of incentive for engineers to prioritize shipping (that's the manager and PM's problem), there's a heavy bias to "It's bad, rewrite it" rather than "Oh, it gets the job done, I just didn't understand it, let me research and comment the code". _Tons_ of time is wasted.

I think technical debt is a bit more complex than merely "somebody else's code". I've also written code that increased technical debt. I hated it, but when there's a hard deadline and not enough time, it's unavoidable. Better planning and coordination can prevent a lot of technical debt.

But mostly, it's decisions that seemed to make sense at the time, but now turn out to be wrong. Or simply the accumulation of fix upon fix, feature upon feature, without doing a redesign based on changing requirements or lessons learned. If you don't work to stop it, entropy will add up and eventually turn the code into an unmaintainable mess.

Nothing like the smell of software design truth bombs in the morning.

I'd add "this code I wrote is technical debt because I'd rather re-write it than create a proper test harness for all its bugs and edge cases".

If that's the case, it should be documented better. Not documented more though. Too often, people add too much documentation to the point nobody reads it.

Instead of rewriting the code, there should be some effort to figure out how it links and how it works together.

What success metrics do you use?

Hard to give a one size fits all answer, it depends on your project. But some examples:

- speed of delivery (not so much hitting a public deadline as shipping it as fast as you thought you could internally)

- quality of final product (were there delighters that didn't get built because they were too hard)

- number of follow up bug fixes shipped shortly after the main project ships (did we ship a bunch of bugs and not realise until users found them)

- number of new users acquired from shipping the feature

How are you measuring these things?

1. It's one of my dreams to work at a place using prediction markets for internal estimates. Anything else feels worthless and engineers are incentivized into the "quadruple what you're currently thinking" estimate so they more often meet deadlines -- or try to "under-promise, over-deliver" except management rightly catch on to that one immediately. (As an engineer I still want to be a La Forge and work with other La Forges, even if Scotty[0] sometimes has a point; modeling managers as demanding children has some gaping model gaps.)

2. What I've seen is that many delighters aren't that hard, but they're often treated as "polish" that goes into the p3/p4 priority backlogs and maybe might resurface if a customer ever brings up a desire for that to the product management. What delighters have you seen that weren't shipped because they were too hard? Were they actually too hard, or was it just too much to get them done by some release date alongside all the other stuff so they get shoved to the backlog just the same?

3. Are you measuring bug count based on what customers report or internal bug reports? If internal, even if people aren't gaming this directly, it's still easy to get into a bad cycle where reported bugs get ignored and go unfixed which leads to fewer reported bugs. That might be mistaken for higher quality.

4. Is this one measured from surveys? Obviously not all features are directly customer visible so let's take something concrete like shipping an Observer mode for an RTS video game that shipped its big release a couple months ago. How would you measure the impact of getting new users from that one feature?

[0] https://www.youtube.com/watch?v=8xRqXYsksFg

As a Lean practitioner, I love the last one:

> number of new users acquired from shipping the feature

You sir. Have no idea about software development.

Extensibility is just as important as Bugfree software for some businesses. Actually, for most of them.

Technical debt hurts much more the extensibility of your code rather than raises the number of bugs.

Do new developers in a project really use that term?

In my experience it is used by senior developers having dealt with the same systems for years.

But I don't work in a fancy startup or similar. YMMV.

I've been the CTO at two tech startups. Here's how I think about it.

"Debt" is a good metaphor because by accumulating it, you get advantages right now in speed and ability to focus on other things, in exchange for having to pay it off later. You can either pay it off by stopping what you're doing (uncommon) and fixing stuff, or you can do lifetime installment payments of more complexity, bugs, etc.

Tech debt is just one aspect a CTO has to manage. Time to market is pretty damn important. Features are important. Many other things too. This is not to say that tech debt should be ignored, but elegance of the system is not the paramount concern. Just as startups borrow money and take financing to move quickly, so too a responsible amount of tech debt can be a good thing.

In engineering we tend to over-emphasize how important the technical qualities of the system are because that's what we see every day. There are a great many things that are also important, and various business scenarios where they become MORE important than tech debt. But if you're in the stage of trying to find product-market fit, you probably have bigger fish to fry.

The biggest tech debt I've struggled with always comes from the same patterned source. You think a system is intended to have a certain feature set for a certain user base, and so you make a set of architectural decisions to match. The world moves on quickly, you adapt, pivot, whatever you want to call it, and suddenly your carefully made architectural decisions are no longer correct or wise for your new scenario, but there is no time to go back and have a "do-over".

So in essence, failure to be able to tell the future, paired with a constant need to move forward, is the source of the problem. Both of those things are going to keep happening to startups forever.

In other words, it’s not a problem to be avoided, but rather a problem to be embraced.

If this is indeed true, how does one fully embrace it? Optimize for easy refactoring? What does this look like in practice?

No, I wouldn't put it that way. It's not a problem to be embraced, it's just another variable to manage.

Just as there is the trilemma of "better, faster, cheaper: pick 2" there are others as well. Speed and tech debt are balanced against one another. It's not that I think we should embrace tech debt. I actually rather hate it. But sometimes I'm willing to trade to get to a business objective that is not related to technical elegance.

In terms of how to pay it off, it may sound like a cop-out but it just depends. The core problem is that things are always changing and we can't predict the future. So staying supple and flexible I think is the way to go. Pay off as much of it as you can when you can, but I think the main point for me is to keep the eyes on some business objective (where tech debt appears as only one variable in the equation) -- the main point is not to keep your eyes on the tech debt all the time.

I’ll give my perspective (CTO at a small start-up), and I’ll start by saying that I fully agree with the GP.

Technical debt for us manifests as new cards in the backlog (and/or TODO comments in the code), because when we cut corners we usually have the presence of mind to flag it. I find that as long as these cards don’t rise to the top of the backlog, then it’s safe to ignore them. It means that you can afford to pay the interest of that technical debt.

If however you find yourself repeatedly thinking “To do this feature I really need to fix this issue first”, well then it might be time to repay some of the principal, so to speak.

In practice, the first time we build a new type of feature, we’ll do it quickly and dirtily and accept that it may be throwaway code. If we end up doing the same thing a second or third time, then we’ll refactor. But then we’ll know there’s a need to do it cleanly, and we’ll have two or three examples from which to try to generalize what the architecture should be. On the other hand, we really try to avoid over-engineering and premature optimization. If it’s good enough for Donald Knuth...

It is a tool to leverage when appropriate.

You can write tests when prototyping or you can borrow a little tech debt and skip the tests during the prototyping phase. At this stage it is like a line of credit.

Then, as the feature/product starts to take shape, you can pay down the debt by writing the tests and refactoring the code to be simpler. This would be paying off the line of credit.

Or, you can roll the line of credit into a term loan and move on. This will increase your long term debt.

I'm not in a startup, but I always try to budget ($/time) for 20% overhead to undo technical debt. When/where that isn't possible, it just gets harder and harder to fix.

These issues are "perfect is the enemy of the good" scenarios. If you don't have "technical debt" or something you don't like or got 80% right, that is a far bigger problem. How to manage really depends on your scenario.

This is where most of the 'debt' I manage comes from. We have a very large ecosystem of interconnected systems many of which have roots 15 years in the making. While the design decisions made back then were valid the expectations of our clients and industry have of course evolved. We're a small team and it's impossible to stop and rebuild the system so I try to manage it now by building services we can plug into the various systems. It helps reduce the scope of future debt and often allows updates without the stop and rewrite nightmare. Doesn't work for every case but it has helped and hopefully whomever I pass this on to will find it easier to manage.

This sounds very familiar: the software development company I work for uses at least 10 different platforms internally across Linux, Windows & Unix...SQL server, mysql & oracle dbs. 10yr old PHP customer support systems, 8 yr old jboss & tomcat servers with deserialization vulnerabilities at every turn, .NET V3 testing tools, mixed up with a sprinkling of modern platforms all talking to each other through rest, direct DB connections, microservices & a hearty dose of black magic. Management ignores vulnerability reports due to mostly non-technical backgrounds. Most documentation is at least 3yrs old, from when 80% of the dev team was sacked & the work outsourced to India. Mostly keeping the lights on now during an acquisition...can't wait to debrief the new owners. Have acquired some great experience in investigating technical debt in the process which will hopefully be useful in the future.

Technical debt is less analogous to financial debt, and more like an anchor. You want to move fast to keep up with the market, but your debt slows you down.

That sounds exactly like financial debt to me. Too much (financial or technical debt) is an anchor, healthy amounts of debt result in not being able to do everything / having some limitations but overall you're in a good place, and no debt probably means it's time to actively pursue opportunities. No?

With technical debt, each new feature now costs $nx where n is the multiplier for the accrued technical debt. A feature that you need in 2 months to support a market shift now needs 4 months to complete due to technical debt.

The issue with this oversimplified formula is that you can't accurately determine which debt affects which features. For some features it could be zero, and others it could be 100.

However, I do agree that all teams should be carrying an amount of technical debt to be healthy. It shows a certain quality of decision making to balance it well.

All too often though it becomes an excuse to procrastinate and that might as well be gambling.

I see this as interest on debt. When you have $1K of credit card debt, a $500 payment goes a long way. When you have $20,000, that same $500 is mostly covering interest, so you pay down the principal slower.

The additional complexity and work that comes with tech debt is the interest you pay on it.
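A quick sketch of that arithmetic in Python. The 24% APR is my own assumption for illustration (the comment above gives no rate), and `split_payment` is a made-up helper name.

```python
def split_payment(balance: float, payment: float, apr: float = 0.24) -> tuple[float, float]:
    """Split one monthly payment into (interest, principal) portions.

    The default APR of 24% is an assumed, illustrative rate.
    """
    monthly_interest = balance * apr / 12
    interest_paid = min(payment, monthly_interest)
    return interest_paid, payment - interest_paid


# Same $500 payment against a small and a large balance.
for balance in (1_000, 20_000):
    interest, principal = split_payment(balance, 500)
    print(f"${balance:,}: ${interest:.0f} interest, ${principal:.0f} principal")
```

Under that assumed rate, about $20 of the payment on the $1K balance goes to interest, versus about $400 on the $20,000 balance, which leaves only about $100 to reduce the principal.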

Not really, I have seen whole features dropped. Like payment processing: the business had an idea to use Stripe, and 6 months later all the code was deleted because our business model was not for people who would pay with Stripe.

We also pivoted another application where only the database stayed roughly the same. Loads of legacy stuff was still there but it is going to be phased out soon.

So I have seen situations where tech debt was never paid back. I have also seen one guy that is not paying me back my money, but I still remember he owes me. Some tech debt will go into oblivion in next year or two...

That anchor effect of being slow is what I meant by "installment payments". Financially, if you have enough debt you can make $300k a year and still live in a crappy apartment, because all of your income (in this weird metaphor, that's your available dev hours) is being spent on debt (fixing things that wouldn't have occurred had you not had the debt).

I will concede some points in your favor for that. I will counter with the fact that almost no CTO will change their deadlines accordingly, which leads to team burnout/turnover.

If you believe technical debt is developers wanting the code to be "perfect", you completely misunderstand what technical debt is.

Technical debt is a collection of choices you and your team make that can have real world impacts on your company's ability to deliver features in the future, can open your company up to hacking, and could make your software less competitive than the rest of the market.

Technical debt is not a loan, it is an anchor. Instead of deferring the debt payment until later, you pay for it with every new feature and eventually your development team will slow to a crawl or the product you're building will no longer be cost effective due to rising maintenance costs.

Technical debt is an unknown risk. Look up any "X company was hacked" story and they have all come down to bad choices made by the product team. It could be a developer on your team left their machine unsecured or put in some bad code, or it could be a deferred decision to fix that technical debt until later.

Technical debt confines the space in which you can operate. If the market pivots, but your software is too rigid to keep up with the changing shape of the market you will lose market share or the ability to capitalize on new things.

All companies will have to make hard choices when it comes to technical debt, and no product can be debt-free. But understanding that technical debt is more than your developers whining about code is paramount to being an effective leader in your organization.

> Technical debt is not a loan, it is an anchor.

This is a great analogy. You can still sail a ship with one anchor, maybe two, but eventually you are going to have so many anchors that ship isn't going to ever get to its destination.

I try to use an economic model for all software. All code is a loan taken against entropy (a liability).

Best solution is no code or a simplification of the problem to existing solutions. This results in zero liability, while the revenue the solution generates is pure assets.

When a code liability does need to be incurred, consider both the amount of code written (the principal of the loan) and the architecture of the solution (the rate of compound interest). A good programmer will work to reduce the principal, an excellent one will work to reduce the rate of compound interest with a simpler architecture.

The architecture that does more by adding complexity and conditionals / code contains a horribly high rate of compound interest, even if it seems easier to do at first.

Architectures which do more / everything by virtue of being simple may be more difficult to implement and understand (especially without some background knowledge) but can have a vanishingly small rate of interest. Double entry bookkeeping, graph theory, pipelines, normalisation/denormalization, map/reduce, unidirectional data flow etc.

For any serious business investment (not an experiment or proof of concept), architecture as rate of interest always trumps quantity of code as principal.

Order of priority is 1. Don’t take loans 2. If you must, design until you have a low and manageable rate of interest. Revisit often. 3. Reduce amount of code.
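A toy version of this model in Python, to show why the rate can matter more than the principal. Everything here is an illustrative assumption: the `liability` function, measuring principal in lines of code, and the 10% versus 2% per-period rates are made up for the sketch, not measurements.

```python
def liability(principal: float, rate: float, periods: int) -> float:
    """Compound-interest view of code: principal grows by `rate` each period."""
    return principal * (1 + rate) ** periods


# Assumed numbers: a small but tangled solution versus a larger, simpler design.
quick_hack = liability(principal=1_000, rate=0.10, periods=12)
simple_design = liability(principal=2_000, rate=0.02, periods=12)

print(f"quick hack:    {quick_hack:,.0f}")
print(f"simple design: {simple_design:,.0f}")
```

With these assumed numbers the solution that starts with twice the principal ends up carrying the smaller liability after 12 periods, which is the sense in which the rate of interest trumps the quantity of code.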

> Best solution is no code or a simplification of the problem to existing solutions. This results in zero liability, while the revenue the solution generates is pure assets.

Not exactly zero liability, but the source of the liability is pushed outside the bounds of your own code, such that the burden is shared among far more people than you can employ on your own. Point being, you may still find yourself contributing code to these shared solutions, but that's a good thing, because such external work will have a wider-reaching impact than anything you build internally.

I run a bunch of websites on top of popular, open-source CMS software like Wordpress or Drupal.

An under-appreciated source of technical debt in those sites is plugin/module upgrades--the ones that are not security patches. Most developers and agencies seem to take the approach "if it's working for us, and not a security issue, leave it alone."

That approach works fine until there is a security patch. Then you have to apply it fast, and guess what--it comes with all the previous code changes you chose to avoid. Now you're getting all those changes all at once, and under time pressure.

Keep your plugins and modules up to date! It might mean some level of ongoing development to deal with changes in functionality. That's a good thing. That's a muscle you need to flex in order to keep it in shape. If you only update modules when there's a scary security issue, some of that technical debt is going to come due when you least want it to.

Another underappreciated source of technical debt is the differing preferences among developers and agencies on how to use the CMS + plugins/modules to build a site.

Ok you're building a Drupal site... do you build your layouts with custom template files? Context module? Panels and Panelizer? Paragraphs module? Do you handle embedded media in fields or WYSIWYG? Do you build your sidebar with blocks or with Views + nodes + Display Suite?

You can build the same website different ways, and it will work fine. But a legitimate choice can often look like awful technical debt to someone else, who would have made different choices. You have to take this into consideration when hiring new developers or agencies to work on an existing site. Otherwise you can waste a lot of resources rebuilding functionality with no discernible improvement to the end user.

That's what backports are for.


Backports are very rarely provided by the community for older plugin/module versions.

Backporting by hand, on your site, makes technical debt worse. Now, not only are you not up to date with the community code, but you've got extra custom code that will have to be integrated or torn out to catch up.

In terms of competencies: if a team is struggling to integrate and deploy community-provided code, how good a job are they likely to do writing, testing, and deploying custom code? Especially in a hurry.

That's what I mean by flexing the muscle. A team that updates their site's code frequently is going to be a team that understands their application better, has more complete test coverage, and deploys with more confidence. And if custom code is required (like a backport), they're going to be better equipped for that too.

Backporting is a legitimate choice, and could be the correct one. But it can also be a form of adding tech debt. You save a lot of time. In mature codebases where you're never going to do a major-version bump of a core lib it makes sense to backport.

As a counterexample, in Javascript land no one wants to support older versions and you'll very quickly fall behind. If you're lucky they'll leave the docs for a few previous versions online, but good luck getting any community support. In these situations you lose most of the advantages of using open-source.

Sure, but do you really expect the one-person maintainer of a random Wordpress plugin to be backporting security fixes? In an ideal world I'd love to see that, but my guess is that that's not often going to be the case.

Then the question becomes: is it more effort to backport this fix into a codebase I don't know? Or is it more effort to bring everything else up to using the new version. Either way, it's going to suck :)

It's true for any kind of library you use. Updating dependencies to the latest version is a form of regular maintenance you have to do.

There are a few side effects of technical debt that drive me nuts. These come from companies where developers are autonomous, self-choosing of time investment:

* continuous complaint without action (i.e. pretending to be a victim of the system you’ve built)

* lack of measurements for a system (technical debt or not, how is it working? Are you improving your measurement system after each failure to further understand the system?)

* unbuilt is better than unmaintained (this builds off the previous two — tell me what we learned from our previous attempts. If you didn’t understand the quality of the previous attempt, then how do we know if we actually improved?)

I’m down with fixing technical debt, just show me some metrics that prove that it was an improvement.

> pretending to be a victim of the system you’ve built

Nicely put! I haven’t thought of it this way.

> Are you improving your measurement system after each failure to further understand the system?

Oh my god. Yes. This is a quote that made my day.

So far, it feels like we need to:

1. Have a concept of area (and maybe sub-area) of the code-base

2. Do frequent checks how painful it is to work with a certain area. Maybe, after every feature (ticket, what have you), make developers record how they felt about the areas that they’ve touched, and why. For example, this was really painful because this class is just too coupled to this other one or something like that. Track this information.

3. Track frequency and size of changes required to different areas in the code right now (+ anticipation in the next period, whatever it is).

4. Regularly have a technical retro, where you look at the areas, and prioritize areas that:

  * give the most pain AND
  * need to be frequently changed now and in the future

5. Then go through the list of problems and choose the most horrible ones to resolve

6. Schedule time to resolve these, just like we do with features/tickets/what have you.

This is just a draft of what comes to my mind.
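A rough Python sketch of what tracking this could look like. All names here (`Area`, `record`, `expected_changes`, `rank`) are hypothetical, and multiplying average pain by change count is just one possible way to combine "most pain" with "changed frequently", not an established formula.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class Area:
    name: str
    pain_scores: list[int] = field(default_factory=list)  # 1 (fine) to 5 (painful)
    changes: int = 0           # tickets that touched this area so far
    expected_changes: int = 0  # anticipated tickets in the next period

    def record(self, pain: int) -> None:
        # Called after each ticket that touched this area.
        self.pain_scores.append(pain)
        self.changes += 1

    @property
    def priority(self) -> float:
        # Assumed scoring: average pain weighted by how often the area changes.
        if not self.pain_scores:
            return 0.0
        return mean(self.pain_scores) * (self.changes + self.expected_changes)


def rank(areas: list[Area]) -> list[Area]:
    # Highest priority first: the candidates for the technical retro.
    return sorted(areas, key=lambda a: a.priority, reverse=True)


billing = Area("billing", expected_changes=4)
reports = Area("reports")
for pain in (5, 4, 5):
    billing.record(pain)
reports.record(5)

print([a.name for a in rank([reports, billing])])
```

In this example `reports` is just as painful per ticket as `billing`, but it ranks lower because it is rarely touched, which matches the "most pain AND frequently changed" rule in step 4.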

This is very similar to what my startup is looking to help with.

For the last 5 years I've worked primarily with startup CTOs and enterprise directors/VPs to build teams that practice XP, lean product development, and user-centered design.

My advice to them, and my philosophy on this topic, is that technical debt is just that- debt. It's not a dirty four-letter word to be avoided, it's something to be managed the way you manage actual financial debt. There are some projects where the software you build will be self-contained (no dependencies), will require minimal changes in the future, or it may have a short shelf life (e.g. it's a stopgap before you cut over to a new system). In these scenarios, accruing technical debt isn't the worst thing in the world if it allows you to reap business value faster.

When you're building software that's going to live long-term, or that drives core parts of your business (and therefore will need to change frequently as your business changes), technical debt needs to be tackled early and often. It's ok to accrue technical debt in the short term ONLY if you make that decision consciously knowing that you'll need to address it later (e.g. I had a client that needed to release their software for a trade show, so we accrued tech debt and created tasks to track every piece of debt we'd want to tackle later, and then prioritized that work immediately after the trade show).

That said, it's also important to be clear about what constitutes tech debt vs what is over-optimization. Outside of libraries and frameworks, I don't really need the software to be "perfect". I need it to be clear enough, tested thoroughly enough, and easy enough to change to maintain a high velocity. If a part of the code has become spaghetti-like, or if we see code duplicated more than 3 times, or if we have disparate code that's too closely coupled to easily make changes, that's technical debt worth tackling.

The vast majority of organizations I've worked with typically think much shorter term, which is why they find themselves repeatedly accruing technical debt to the point where changes to the system occur at a snail's pace. It takes a lot of discipline, and coordination between IT and business, to practice what I'm describing, but that's the ideal that I strive for and advise my clients to strive for.

"It's not a dirty four-letter word to be avoided, it's something to be managed the way you manage actual financial debt."

As others have pointed out, technical debt operates the same way an unhedged option works. It can go catastrophically wrong. Compare that to issuing bonds. If you issue $10 million in bonds, you know exactly how much you will have to repay, and you know the interest, and you know when you will have to repay it. With technical debt, you don't know when you'll have to repay it, and you don't know how much it'll cost you. You just go along knowing that someday something catastrophic might happen. See this previous discussion:


Technical debt is very similar to financial debt.

Don’t get fooled. There is a very small, and very important, difference:

With financial debt, the business takes it on in hopes of delivering more value quicker that will pay for interest and principal, and have a lot of (monetary) value left after.

Financial debt, if anything, is speeding the process of delivery of value up.

On the other hand, technical debt is slowing the delivery down. And this delivery is what business hopes to speed up.

Well, maybe it makes this one iteration/release quicker, but it slows down the next one, and next one after that, and so on.

So, it seems that they are deceptively similar, but you need to adapt the tactics and strategies.

I fail to see the difference. Financial debt is also a brake on future activities, in the sense that you have to pay back with interest, so less funds are available for other activities.

So in your words: it makes this one iteration quicker, but slows down the next one...

I see what you mean, and I agree partially.

Follow my thoughts (and please do poke holes in them!):

Financial debt allows you to deliver more -> sell more -> pay down debt quicker -> deliver more (if you got it right and haven’t failed).

So the money itself becomes a mechanism by which you deliver faster, and, as a result of faster delivery, earn more money to deliver even more and faster.

Now, the technical debt.

Earning more money from more features delivered in the current iteration might result in more revenue NOW. But does it result in more/faster delivery after that revenue happened?

Not really.

You can potentially hire more people. But they will only slow the team down at first. There is a delay between “more people” => “faster delivery.”

So financial debt can enable a positive reinforcing loop. Can technical debt do that?

Technical debt (like financial debt) DOES NOT slow the delivery down AT FIRST. That is why people incur it! It is pure "GSD" (Getting Shit Done) now, make it clean later.

Financial debt "slows profitability down" later. Technical debt "slows productivity down" later. Both affect company profit. LATER. But arguably increase it NOW.

The analogy is in fact apt.

I personally don't like the term debt. It's a simple analogy but it doesn't go very far.

The biggest issue by far around tech debt, which has been covered thoroughly throughout this thread, is that measuring it is tricky.

Measuring financial debt is trivial, you can put a dollar amount on it. Measuring tech debt is very subjective, as it depends on the level of seniority of your team, their ability/authorization to make changes (in banking, you can change nothing, in a startup, you can change everything), attrition rate, etc...

It's such a complicated equation, that using the term debt tricks upper management (non technical) to think it's easy to predict: it really isn't at all.

The most frustrating part is explaining technical debt. From a business perspective you have running software. You can sell it - it has all the features needed. Now you come along and tell me you need to fix something that no client can see, that does not add any features, and that might even lead to bugs. And you need two people and x weeks. I can imagine how "technical debt" is perceived as "snake oil" or "developer fun time" by management.

To avoid frustration it is best to fix technical debt in small steps, mixed in with other open issues. Devoting developer time to a big technical debt rewrite is hard to communicate.

I've used a similar approach for technical debt: 1) Get buy-in from the stakeholders that it's OK for developers to spend 30 minutes a day making their life better; 2) Communicate a vision for what the end goal is to the developers; 3) Get the developers buy-in.

If the developers aren't willing to do it in small chunks, then it probably isn't bothering them that much.

The way that I've explained it with higher degrees of success is: "Feature X is used by a high percentage of users and also contributes to a high amount of bugs. We can invest two people and x weeks to bring those bug numbers down and product quality up."

Totally agree with you. As other people commented here, the key to communicating about technical debt is numbers. You need some metric to at least show that you have a problem and a number to show that you improved quality after reducing technical debt.

Speaking of technical debt, you might want to remember the following 'handy rule' from the book Team Geek[1], chapter "Offensive" Versus "Defensive" Work:

[...] After this bad experience, Ben began to categorize all work as either “offensive” or “defensive.” Offensive work is typically effort toward new user-visible features—shiny things that are easy to show outsiders and get them excited about, or things that noticeably advance the sexiness of a product (e.g., improved UI, speed, or interoperability). Defensive work is effort aimed at the long-term health of a product (e.g., code refactoring, feature rewrites, schema changes, data migration, or improved emergency monitoring). Defensive activities make the product more maintainable, stable, and reliable. And yet, despite the fact that they’re absolutely critical, you get no political credit for doing them. If you spend all your time on them, people perceive your product as holding still. And to make wordplay on an old maxim: “Perception is nine-tenths of the law.”

We now have a handy rule we live by: a team should never spend more than one-third to one-half of its time and energy on defensive work, no matter how much technical debt there is. Any more time spent is a recipe for political suicide.

- - -

Previously discussed here: https://news.ycombinator.com/item?id=16810092 ("A Taxonomy of Technical Debt")

I'm one step below the CTO, but here's my perspective for what it is worth. It really depends on the stage of development your company is in. When you're first starting out, writing "good quality code" is about the worst thing you can invest in. The fact is, the code you're writing is almost certainly the wrong thing. A new startup is a hypothesis on a business model, and the first bit of code is to make the most basic tests of the hypothesis. It's VERY rare you get it right your first try. Writing good quality code is a long term investment, but if there's no one willing to buy it, it's a bad investment that will never have an ROI. On the other hand, if you're rewriting the system because now you know the market, and you know what it wants and you're scaling to meet it, now tech debt is bad. I think it's critical to understand the first system can and probably will be thrown away, so it makes little sense to build it to "scale".

Given the uncertain business direction, you will definitely need to make the code easy to change. Low quality code inhibits that.

Since there are already quite a lot of responses to your original question I'll instead address a specific aspect of your post: when you have an old legacy system that you don't want to build on any more it is almost never the right decision to try a one-shot migration or rewrite.

I have seen so many people try to solve technical debt with a full rewrite and it just never works. The better way of solving this problem is to gradually peel off individual use cases and API calls. It doesn't matter if your legacy system is riddled with bugs and is impossible to build on; it's at least a known quantity. Your new system will likely have fewer bugs in it, but until it's been used in the wild for a while you won't know where those bugs are. If your new system has a systematic problem, you are in some serious trouble, but if you migrated over slowly at least it will be manageable.
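To make "gradually peel off individual use cases" concrete, here is a minimal sketch of one way it could look: a single routing function sends each request to the new system only if that use case has been migrated, and everything else still goes to the legacy code. The handler names and the `use_case` field are hypothetical, made up for illustration.

```python
from typing import Callable, Dict


def legacy_handler(request: dict) -> str:
    # Stand-in for the old system: possibly buggy, but a known quantity.
    return f"legacy:{request['use_case']}"


def new_invoice_handler(request: dict) -> str:
    # Stand-in for one use case that has already been moved over.
    return f"new:{request['use_case']}"


# Only the use cases listed here are served by the new system.
# Migrating another one means adding a single entry; rolling back
# means removing it again.
MIGRATED: Dict[str, Callable[[dict], str]] = {
    "create_invoice": new_invoice_handler,
}


def route(request: dict) -> str:
    """Send migrated use cases to the new code, the rest to legacy."""
    handler = MIGRATED.get(request["use_case"], legacy_handler)
    return handler(request)
```

If the new system turns out to have a systematic problem, only the entries in `MIGRATED` are affected, which is the "manageable" part of the argument.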

People here talk about technical debt as equivalent to financial debt, but I think it is wrong. Differences:

1.) Financial debt does not slow you down, technical debt does slow down work. Depending on how messy the system is, it can kick in as soon as 3 months into development. It can make a project fail or be slow to develop, and it is insidious: it is sometimes hard to see from the CTO position that things could be faster.

2.) If you have money, it is easy to pay financial debt, and it happens instantly. Even if you have time, it may be hard or impossible to fix technical debt. The engineers might just create a bigger mess in their attempt to clean things up; they may cause bugs, miss whole features, etc. But most often, they just don't end up with a much cleaner system, because the mess is the result of a broken process, other systemic reasons, or missing knowledge.

3.) It is easy to know when you have financial debt. It is harder to know (for a CTO) when you have technical debt. Some engineers complain about pretty much everything someone else wrote. Many decisions are actually subjective calls where there are multiple valid options and the issue is not so much "it is bad" but "I don't agree with the decision". They will never all agree. Some technical debt complaints are attempts to make oneself look like a more attentive coder, to excuse the perceived length of development, or part of a power struggle between engineers.

I think the analogy may still be useful/interesting. It's just that with financial debt you have borrowed money, with technical debt you have borrowed time. Financial debt reduces the efficiency of future available money (interest payments), technical debt reduces the efficiency of future dev time.

I agree with your point that technical debt is a lot harder to identify, the way to pay some of it off may not be obvious and taking on that debt is not usually a conscious decision.

Sometimes you'll cut a corner to get a prototype done or hit an important deadline and there it seems more of a clear trade off, time now for pain later. But, as you say, the creeping (and inevitable?) technical debt of aging projects is less obvious. More like maintenance costs for assets and investment in rust proofing?

An individual engineer can know whether this cut corner is a nicety, something with mild impact, or something that will stand in the way. The CTO does not and should not make these low-level decisions. Managing fuzzy information is the key to managing technical debt.

Not all technical debt is the result of time pressure. A lot of it is lack of knowledge/experience (doing something new), organizational mess, a maverick coder, competing visions among coders over architecture, a confident senior with influence who is forcing others to produce bad code, demotivated coders, incentives, etc.

Deciding on this specific cut corner is not so much the CTO's job; the big picture is the CTO's job.

> Financial debt does not slow you down, technical debt does slow down work.

It slows down your spending, or it ought to.

Instead of spending money, technical debt is spending time.


Then, in one month of time, you can potentially have $5000 or $5M in finances available.

In time though, in one month, you can have only one month.

It is true, it is spending your time. But it seems like this time is a much more precious resource than money.

What do you think?

It also causes bugs, up to an infinite cycle of bugs that leaves you unable to finish the thing. It causes higher turnover, with precisely the engineers who care the most being the most likely to leave. It makes it easier for less capable or qualified people to stay, and it creates a culture that makes it impossible to create good code even when you all actually have enough time.

It makes accountability harder and politics among team uglier.

The Effective Engineer[0] has a chapter on technical debt, where he goes into how a lot of crappy code decisions are there for a reason, and how rewrites can be chaotic. If you have the time, I'd recommend giving it a read.

I frankly got a lot more pragmatic about asking for rewrites after reading it, and I feel it helped me to grow (mature?) as a developer.

Either way, I think there's a need to balance business needs against employees' self-actualization needs, which they often push as rewrite requests: "Oh, since we're rewriting this, we should (do it in)/use/etc X instead". I have often realized that a lot of requests to rewrite something are really tinkering desires camouflaged as a business-related request. That is not to say that the code that does exist couldn't have problems; it could, but having a period of debt repayment would improve it as well. So finding a way to allow your employees to tinker without letting their desires torpedo your products would be positive.

Either way, it's a complex subject, and I don't really think there's a single "right" response to it. Best of luck!

0: https://www.amazon.com/Effective-Engineer-Engineering-Dispro...

EDIT: I'm not a CTO, I'm a developer.

I notice at my job there were a lot of "I need feature X" requests, to which we commonly replied "I would like to build X, but the horrible old Y system is stopping me", or we would just make things much worse by trying to meld feature X into the legacy/messy Y system. I was a dev dealing with the legacy systems at my company for years, and then I took over management and said we need to remove or clean up these legacy weak points so we can build features faster and more maintainably. So yes, we started on the first day of this year, cleaning up from the worst/weakest links on. In 6 months we are done, using 2 new hires to complete new work while 2 senior devs did the cleanup. 6 months compared to years of spinning tires was well worth it. Eventually the CEO gets sick of hearing excuses for why we can't build this or that, or why we are wasting 50% of our time fixing bugs. Productivity is soaring now at my company for multiple reasons: testing, documentation, removing old systems, and, even though we have a small team, every senior dev is leading a junior dev. Side note: this is the first time we have hired junior developers at this company, and it's been a big payoff, giving a lot more free time to the senior guys to work on the most important things.

I don't have much frustration with tech debt itself. It's a normal, acknowledged, often understood and predictable side effect of moving fast OR weighing tradeoffs.

I actually get pretty annoyed when it's conflated with product debt. I don't know if that has a real definition, but for me that's where your market/audience has grown to need things you did not envision/never built. This is often labeled as tech debt, even though it has nothing to do with your coders. This is an organizational issue, and one I haven't fully conquered yet.

Tech debt should not be accruing every week, it should be a conscious decision. MVP's are riddled with tech debt, intentionally. You want to get the thing out there and start gathering feedback/information, you acknowledge you don't know the right thing to build, and will do so once you have more info.

I'm also annoyed at the junior dev that doesn't take time to understand a system, and shouts that it needs to be re-written. Most times that's just because the dev is junior, or lazy. It's actual tech debt when the system is too convoluted to understand, or you haven't documented.

I have no idea what "sprint commitment" or "manager's power" are supposed to mean. They both sound kinda sick.

If you know a rewrite is going to be a time sink and catastrophe, why on earth would you do it? Also sounds sick.

Your org should be focused on business results. If you're focused on optics, things are not healthy.

You are not in a race with your competitors unless you're in a market that is itself racing to a commodity. This is a business strategy problem vs an engineering one.

Technical debt seems far less a problem to me if you have a moderately capable software engineering team. It happens, yeah. But as your experience grows you know the proper balance of paying it down vs new work. It comes down to measuring quality of your services from an end user perspective vs how much work on engineering and support teams to maintain the desired quality.

~7-8 years cto'in

> I'm also annoyed at the junior dev that doesn't take time to understand a system, and shouts that it needs to be re-written.

That's something that your senior devs should be correcting through training and mentoring. Junior means he doesn't know any better, after all.

Speaking as a senior dev, one great method to impart this wisdom is to let the junior dev try and do it and see for himself what happens.

One big issue I have (and I'm trying various solutions around it) is code upgrade.

Let's say we are doing something one way (step 0), and we realize "hey, guess what, there's a better way" (let's call it step 1). Now going forward all new code is step 1, and the old code we merely touch stays at step 0. Later you realize there's an even better way; let's call this step 2. Now going forward you have step 2, step 1 and step 0. This process continues, but people who worked on this leave, leading to technical debt. This carries on for a few iterations and now you are on step 50 (49, 48, ...).

I have no idea how to deal with this problem.

I personally feel more micro-services would be better - but I also don't want to jump the gun since we were not too big a company at that point (and being profitable with our runway took all priority)

It's a hard problem to deal with. I try to make sure the "better" way is "better enough" to warrant a change, knowing the technical debt that will ensue.

The biggest problem is that Product and/or Sales are looking for constant growth via constant new features. It is impossible to get them to truly understand that to meet some arbitrary deadline you've had to take shortcuts. To their mind, done is done and now we need the next Y so they get their Q3 bonus.

The next problem is that too often the tech team wants to do a complete rewrite every 12 months because they want to use hindsight to remake last year's system/feature today. The issue is that in another 12 months this rewrite suffers from the same problem and now has several more things to consider. Some technical debt will always exist, and I would argue it is always necessary. Perfect is the enemy of good enough, and software that isn't live isn't adding value to the customers or company.

My perspective for this was to treat some percentage of work each quarter to resolving technical debt but do this in an agile prioritized way. Just like always focusing on the feature that will add the most value to the next release, also pick the biggest manageable issue that will reduce the debt the most to the next release. In this way, you know certain technical debt will never disappear but you have a ranked list to always decide the things that are most urgent and correct them before the house falls down. If your team is large enough, it is also helpful to have a person or standing task to update and document this list with WAG options and costs. Of course, as in all things, ymmv.
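As a rough sketch of the "ranked list" idea above: keep each debt item with a guessed payoff and a guessed cost, and each release pick the item with the biggest payoff that still fits the time set aside. The `DebtItem` fields and the scoring scale are assumptions for illustration, not a standard metric.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DebtItem:
    name: str
    debt_reduction: float  # WAG of how much pain goes away (made-up 1-10 scale)
    cost_weeks: float      # WAG of the effort needed


def pick_next(items: List[DebtItem], budget_weeks: float) -> Optional[DebtItem]:
    """Return the affordable item that reduces debt the most, or None."""
    affordable = [item for item in items if item.cost_weeks <= budget_weeks]
    return max(affordable, key=lambda item: item.debt_reduction, default=None)


backlog = [
    DebtItem("untangle schema", debt_reduction=9, cost_weeks=12),
    DebtItem("fix flaky tests", debt_reduction=6, cost_weeks=2),
    DebtItem("delete dead code", debt_reduction=2, cost_weeks=1),
]
```

With a three-week budget, `pick_next(backlog, 3)` skips the schema work (too big for this release) and returns the flaky-tests item; the schema item stays on the list for a quarter where it fits.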

Technical debt should be quarantined, not structural. That's the one, most important thing to take away. A really messy view with tangled if statements and hacks to get things to work isn't great, but it's leagues better than a really messy controller because it isn't going to introduce ACL errors. A really messy controller is leagues better than a fucked schema / data model because it doesn't spread the poison across every endpoint in your application.

Some easy wins are to default to making something a "many", even if it is only a "one" or a "one or none" because it makes adding a second case much easier. For example, a user should have multiple roles at multiple projects by default, so business users can do fine grain control over permissions.
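A minimal sketch of the "default to many" advice, using an in-memory SQLite database. The table and column names are hypothetical; the point is only that roles live in a join table rather than in a single `role` column on the user.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE projects (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
-- One row per (user, project, role): adding a second role for a
-- user is just another row, not a schema change.
CREATE TABLE user_project_roles (
    user_id INTEGER NOT NULL REFERENCES users(id),
    project_id INTEGER NOT NULL REFERENCES projects(id),
    role TEXT NOT NULL,
    PRIMARY KEY (user_id, project_id, role)
);
INSERT INTO users VALUES (1, 'alice');
INSERT INTO projects VALUES (10, 'billing'), (20, 'search');
INSERT INTO user_project_roles VALUES
    (1, 10, 'admin'), (1, 10, 'auditor'), (1, 20, 'viewer');
""")


def roles_for(user_id: int, project_id: int) -> list:
    """Return all roles a user holds on one project, sorted by name."""
    rows = conn.execute(
        "SELECT role FROM user_project_roles "
        "WHERE user_id = ? AND project_id = ? ORDER BY role",
        (user_id, project_id),
    )
    return [row[0] for row in rows]
```

A user with exactly one role is simply the case where `roles_for` returns a one-element list, so the "one" case costs almost nothing extra.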

Another useful trick is to have default behaviour, but introduce a set of tables for custom settings. This makes it easier to make changes because you don't need to run a migration to add a column to a table, you just write a new record to the "user_overrides" table.
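Here is one way the overrides trick might look, again as a sketch with SQLite. The `user_overrides` table name comes from the comment above; its columns and the `DEFAULTS` dictionary are assumptions for illustration.

```python
import sqlite3

# Default behaviour lives in code; these settings are hypothetical.
DEFAULTS = {"theme": "light", "page_size": "25"}

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE user_overrides (
    user_id INTEGER NOT NULL,
    setting TEXT NOT NULL,
    value TEXT NOT NULL,
    PRIMARY KEY (user_id, setting)
)""")


def set_override(user_id: int, setting: str, value: str) -> None:
    # A new kind of customisation is a new row, not a new column.
    conn.execute(
        "INSERT OR REPLACE INTO user_overrides VALUES (?, ?, ?)",
        (user_id, setting, value),
    )


def get_setting(user_id: int, setting: str) -> str:
    """Return the user's override if one exists, else the default."""
    row = conn.execute(
        "SELECT value FROM user_overrides WHERE user_id = ? AND setting = ?",
        (user_id, setting),
    ).fetchone()
    return row[0] if row else DEFAULTS[setting]
```

The trade-off is that values are untyped text and the database no longer enforces which settings exist, so this suits rarely used customisations better than core fields.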

But at the very end of the day, taking on technical debt of any kind can be warranted. It all depends on the stakes. If you need to get something done in a week to close a company-saving deal, then fucking do it. Fix it later. The implied interest rate might be 1000%, but if it saves the company it is worth it.

One of my friends came up with a tongue-in-cheek name for your "default to many"-advice: https://theharemrule.com

After being given the advice a couple of years ago, I found it to be applicable to quite a number of scenarios my colleagues and I faced. In my experience it's pretty solid!

Yeah, it's one of those tricks that I find a lot of developers in their 20s haven't learned.

There is another trick that I use when I don't control the dev team. For example, say the company has outsourced a pricing app to a third party dev shop, in this case any decision I can make that will make the app more "table-y" I make. So instead of defining control flow to determine price for any given object, what I do is I get them to make a "rules engine" that can operate on the major variables that are typically used to make decisions. That way fine tuning changes can be done on our side, and it doesn't rack up billables. I think this is called "table based design" but Googling for it isn't showing anything.
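A small sketch of what a "table-y" pricing rule set could look like: the rules are plain data, evaluated first match wins, so fine tuning means editing rows rather than control flow. The `Rule` fields, categories and prices are all made up for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Rule:
    category: Optional[str]  # None matches any category
    min_quantity: int        # rule applies at or above this quantity
    unit_price: float


# Rules are data, checked top to bottom; the first match wins.
# In a real setup these rows could live in a database table.
RULES = [
    Rule("widget", 100, 8.0),   # bulk discount for widgets
    Rule("widget", 1, 10.0),    # regular widget price
    Rule(None, 1, 15.0),        # fallback for everything else
]


def unit_price(category: str, quantity: int, rules: List[Rule] = RULES) -> float:
    """Return the unit price from the first rule that matches."""
    for rule in rules:
        if rule.category in (None, category) and quantity >= rule.min_quantity:
            return rule.unit_price
    raise LookupError("no pricing rule matched")
```

Changing the bulk threshold or adding a category is then a data change on your side, which is the part that avoids billable hours from the dev shop.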

I've been thinking how technical debt can be flagged / measured, and then managed. That is, a tool that can do a combination of:

Manual: Perhaps the person doing the code review can negotiate with the developer and agree that the code needs to be cleaned up at some point soon, but needs to be merged as is right now due to business needs, with some sensible score for the magnitude of the debt. When the debt does get paid later (stop laughing), the actual effort can be correlated with the debt score.

Semi-Automatic: The tool can take diffs from a pull request and show it to some other developers in the company and poll them on some subjective attributes like clarity etc. The score will be normalized based on other scores given by the person. Once in a while the app can even show nonsensical code in this poll to see how that is scored.

Automatic: 1. Flag the files that change frequently with issues, other pull requests, or 2. when there are long conversations and further commits to a file before a pull request affecting some files are merged (while accounting for some reviewers approving anything vs some reviewers nitpicking everything)... I have several more ideas like that.
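The first "Automatic" idea (flag files that change frequently) can be sketched in a few lines. This assumes the commit history has already been extracted into a list of file-path lists, one per commit; how you get that from your version control system is left out, and the threshold is arbitrary.

```python
from collections import Counter
from typing import Iterable, List


def flag_hotspots(commits: Iterable[List[str]], threshold: int) -> List[str]:
    """Return files touched by at least `threshold` commits, sorted by path.

    `commits` is an iterable of lists of file paths, one list per commit.
    """
    # set() so a path listed twice in one commit only counts once.
    counts = Counter(path for files in commits for path in set(files))
    return sorted(path for path, n in counts.items() if n >= threshold)


history = [
    ["billing.py", "models.py"],
    ["billing.py"],
    ["billing.py", "views.py"],
    ["models.py"],
]
```

High churn alone does not prove debt (a file may simply be where the active feature work is), so this is better treated as a list of candidates to look at than as a score.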

Would you use a tool that does this? Do you already have something like that? Please feel free to email me your feedback, if you'd rather do that: techdebt@kirubakaran.com

We treat technical debt the same as any other issue. Put an issue on the board, decide on timing, assign it to a release, etc. I think the problem for some is that they don't realize they are creating technical debt.

Technical debt is inevitable. If you ignore it, it becomes worse. I've never been on a project without technical debt.

Of course not every little problem is technical debt. There are plenty of nice to haves, we could do X, etc. style improvements that aren't really that critical. My advice for those: just apply the boy-scout rule and improve things when you work on them. For the bigger stuff, somebody needs to step up and take decisions. Ultimately for big chunks of work related to technical debt, that's a CTO decision that involves taking into account the interests of both tech and business. A CTO must have the power to do this and be wise enough to use that power responsibly.

Stuff to watch out for as a CTO is when seemingly simple stories snowball into a lot of firefighting, bugs, or deployment issues. That's usually a good sign something is wrong. If that happens a lot, you are definitely experiencing technical debt. Firefighting eats away development budget. If your team spends half their sprint diagnosing weird issues instead of getting stuff done, cleaning that stuff up is probably time well spent.

"Tech Debt" and "Code Quality" are myths. I know that is an unpopular perspective, but allow me to explain. To start with, for me everything is just a problem described in terms of its properties.

What are some examples of problems I have encountered that were worth solving?

1. We have a server rendered site and an API. Changes to the server rendered site require matching changes in the API code effectively duplicating our efforts. If we re-implement the site as a single page app consuming the API we can kill two birds with one stone.

2. The lacking test coverage of a critical section of code, matched against the frequency of changes to that code, results in frequent regressions and delays.

3. Unifying two divergent implementations will allow us to realize a shared, multi-tenant deployment model for our partners. The load characteristics will give us the same point of presence for half the hardware.

Why are these worth solving? Because they have objective rationales. The third embodies a strategic direction with real bottom line consequences. The case can be made "If you want this outcome here are the steps" Then the only question becomes "Do we want this outcome?"

Many times "tech debt" and "code quality" are tossed around as easy justifications. They lack the intellectual rigor or organizational context to make a compelling case that it's worth the opportunity cost. The person using those terms generally has strong feelings that it's the right path, but can't really articulate why. In many cases there is no "why" and it's more of a "just because".

In my experience "Tech Debt" projects that are actually undertaken are not called "Tech Debt"; they are called "Let's do this because it's important to the business and here is why" projects.

The primary source of our technical debt is launch impatience.

Specifically, that most stakeholders in a feature or project accept "mostly works, with the exception of rare edge cases" as sufficient cause for launch. This usually takes the form of "get it out the door as-is so we can promote it, and then refactor later".

This is actually somewhat reasonable in terms of balancing business and technical interests, and I can see the case being made for doing it.

The problem, of course, is that "refactor later" never happens because there is always another project or feature demanding to be worked on. These same stakeholders might allow a day or two after launch to refactor, but then promptly decide that starting on Project B takes higher priority than refactoring Project A (which, in their mind, "works").

The above scenario covers at least 95% of our existing technical debt.

The core trade-off is "what does the business need now" vs "what are the long term goals". I only consider the technical debt that will block us in the future.

At any given decision point, you need to understand when technical debt is accrued and track it like any other issue. If you don't actively acknowledge technical debt from the onset, you place your business goals at risk.

In general, I see that the most politically charged projects are also accruing the most technical debt. They have high pressure, short time-frames, and scope creep out the wazoo. Those take the most of my time, and are extremely prone to failure unless there's a human shield to keep the heat off.

>> what is your most frustrating problem with technical debt?

Actually using the product that has a strong scent of technical debt behind it.

Vendor provides updates (could call them minor or major) which are essentially colored bandaid patches on top of most critical issues without any significant improvements.

Likely caused by waves of temporary hit-and-run consultants hired to "save the day", each of whom did just enough to minimize personal effort before collecting the paycheck and leaving.

Eventually the product needs to be completely rewritten or it risks becoming obsolete due to healthy competition.

Poor or incomplete abstractions and code duplication. They usually go hand in hand. What typically happens is that one developer needs to solve a problem. He creates some abstraction (one or more classes) to help him solve the problem. He tries to make his abstraction reusable, but it's hard to do if there is only one use case for it. So the abstraction ends up with some essential functionality missing. Or worse, the abstraction may be poorly designed, so it's not reusable at all.

The second developer comes along with his own problem to solve. He looks around and finds that abstraction that solves a similar problem. He tries to use it, and it doesn't quite work for him. He could fix the abstraction, but that would require changing code where it's used. He doesn't want to touch someone else's code because he just wants to solve his problem. Or perhaps he just didn't realize how to use it correctly (because it wasn't well designed in the first place). So he goes ahead and creates a copy of the abstraction, modifies it slightly to fit his needs and voila - his problem is solved.

A couple more iterations like this and you get poorly designed code being copied all over the place with slight variations. If there is a critical flaw discovered in all these copies, it becomes a major pain to fix. So naturally nobody wants to touch such code with a ten foot pole. In the meantime, the bad code gets duplicated more and more like cancer.

In startups a lot of technical debt is really organisational debt: a lack of role-specific hiring and poor decisions at a higher level early on, when you do have to move quickly. For every "full stack engineer" you want a real systems architect, a technical writer, a testing manager/writer, a db administrator/architect, a UX designer. Plus at least one person seriously taking care of early-stage devops/sysadmin/IT, so that whatever you have isn't fully evolved out of the spare 5 minutes someone gives it. You don't want to give the most agency to the developer who is the most opinionated and outspoken but the least experienced. Especially if they argue for heavy-handed engineering solutions they think are "correct" and this is almost the first real job they have had post university. But everything they say sounds great, and it's the "right" thing you read online, so they must be a good engineer, unlike these other silently working peons. You almost always end up with something brittle that everyone else has to work around, leaving a pile of ad hoc architecture and code that cannot be reused. The original hotshot "architect" you promoted above the others then gets annoyed and loses a sense of ownership because their "vision" has not been met by all of these other insufficient "full stacks". This is the recipe of every toxic BS startup out there. The solution is simple: know what the fuck you are doing when building a team and dealing with software. Have heads of software who have done more than a masters in writing a ToDo app in Java one time.

I have seen some very bad technical debt. Everyone talks about rewriting it, but what ends up happening is the company will just throw more people on it.

In a more perfect scenario, something like TDD would have been used and there would be a large test suite. There would also be some really nice well maintained requirements and technical specifications. Under this scenario it would be much easier to redesign or refactor the system. But given the reality of business, this is never the case.

As a senior sysadmin who has frequently been in the role of CTO because businesses often don't even have a CTO... my number one pain was always when the technical debt requires more personnel in order to manage properly but upper-management refuses to hire more people for the role. I know your question seems more oriented towards the SV-startup/dev cycle, so my answer is more for general business (even ones without programmers).

My pet peeve though has always been structured cabling. I don't know if y'all understand just how bad the cabling situation is in many businesses, but it's bad, and for some reason they really don't like spending money on that particular thing even though it's one of the most important things for keeping a business up with the times (and the data rates that match).

So, in general, it has been a frustration with the communication gap between the technical team and the C-level and the board. Too often you either get a CTO who wants to "program with you" or one who is too much MBA and not enough tech.

As a sysadmin I have realized my number one failure was not working on my MBA-side more and thinking purely technical proposals and solutions would win the day. AKA I should have spent more time golfing and going to lunch so I had more power with the Cs.

There are lots of reasons that TD happens but regardless of what they are, the easiest way to deal with it is to be open with each other. If the COO made a bad call on something, tell the technical team and work out the best way forwards. Same if a Contractor wasn't as good as you thought or a new framework promised lots but wasn't actually good at some of the detail.

When you can be honest about mistakes, tech people are generally more pragmatic than non-techies might assume.

In the other direction, I clearly communicate the "cost" of short deadlines or the need to finish something that is in a bad state so that the decision can be made with eyes open. If a quick fix now leaves us in a poor state to add new features, then the CEO or whoever can accept that, so that when they ask for something new they are prepared to hear "sorry, system is too brittle".

In reality, there are many reasons things get rewritten so I also try and assure my devs that we don't live with these things forever, eventually we remove the problems and create a load of new ones! I've rewritten our basic system 3 times in 5 years and only one of those was mainly down to tech debt (although we took the opportunity to update to latest tech etc.)

As a developer, I see Sprint ends as an artificial deadline that people end up strictly meeting to the bare minimum. I find technical debt accrues most quickly with smaller stories.

Agile coaches will blast you for estimating workload too high and blame you for technical debt and rushing. They will say we need to refactor as we go along. And that stories will need to be even smaller.

Quite frankly, bigger changes are easier to review as a whole and there’s less wasted time updating the test automation after each small change. Personally, I like to take a few days to review a change and I like to review a whole feature at once to see how it all fits together. It can be hard to see that until all the pieces are ready. Polishing it up before merging is also nice.

In the business world we are led by timelines and milestones and competitor products. We have fixed dates we need to get things out by. I've not seen a business ever run on the agile principle of “take your time and release when ready and don’t commit to dates but commit to features blah blah”. And I’ve heard from a few coaches now that the whole organization needs to be agile for it to work because the business can’t be waterfall while engineering is agile.

Small agile rant aside..

I see technical debt as natural; it accrues because we are given limited time to do something that keeps our paycheques coming. It’s normal for engineering to whine and for managers to push back. This conflict keeps both sides in check. Reality would stink if either side dominated the business. So don’t feel bad about telling your engineers “no”, but also give them enough independence to manage themselves as responsible professionals. Find that balance point.

Not a CTO, but in an architect role and spend my life cleaning up tech debt or building guard rails so it's harder to accumulate.

The most important thing is to teach decision makers (be they product or other senior engineers) that tech debt is a downstream result of the decisions they make. That's as simple as keeping them accountable for their actions instead of hiding the mess under the rug.


If a colleague uses the word 'later' in a product discussion, pretend they said 'never' and speak up if it sounds weird.

"We'll get the new feature working with our most popular product areas and circle back for the others later."

"The feature is ready on time, we'll fix those P3 bugs later."

If you don't call out the use of later->never here, they'll get away with it and you'll accrue tech debt for no reason. If people want to defer part of their implementation plan, that decision needs to be on the books somewhere with their name next to it. If management doesn't see it, it isn't happening.

My most frustrating problem with technical debt is how difficult it is to shoehorn this bookkeeping into whatever process management or task tracking system the company uses.
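For what it's worth, the bookkeeping itself can be tiny. Below is a minimal sketch of a "later" ledger, assuming nothing about any particular tracker: the `Deferral` record, the `overdue` helper, the names `alice` and `bob`, and all dates are hypothetical and only illustrate putting a deferred decision on the books with a name next to it.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Deferral:
    """One 'we'll do it later' decision, recorded with an owner."""
    what: str          # the work being deferred
    owner: str         # who made the call (hypothetical names below)
    decided_on: date   # when the deferral was agreed
    revisit_by: date   # after this date, 'later' is treated as 'never'


def overdue(ledger, today):
    """Return deferrals whose revisit date has passed, oldest first."""
    late = [d for d in ledger if d.revisit_by < today]
    return sorted(late, key=lambda d: d.revisit_by)


# Hypothetical entries mirroring the two quotes above.
ledger = [
    Deferral("Support remaining product areas", "alice", date(2018, 5, 1), date(2018, 7, 1)),
    Deferral("Fix P3 bugs from launch", "bob", date(2018, 6, 15), date(2018, 9, 1)),
]

for d in overdue(ledger, today=date(2018, 7, 24)):
    print(f"{d.owner} deferred '{d.what}' on {d.decided_on}; review was due {d.revisit_by}")
```

The hard part remains what the comment says: getting a list like this in front of management inside whatever tool the company already uses.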

> If a colleague uses the word 'later' in a product discussion, pretend they said 'never' and speak up if it sounds weird.

I can't decide if this is really clever, or really misguided. Often "later" does mean "never", and sometimes that's ok, sometimes it's bad.

However, you can't do everything at once, and there are plenty of things that you really will do later, because you know that as soon as you implement one feature, a customer will request the logical followup, and you will do it.

I can't answer as a CTO-like person, but I'm in an Architect-type role and deal with our CTO-like person on a daily basis.

The high-level management cares only about "The Big Rewrite" because it is the only thing that gets visibility at that level of the organization and it comes with a large budget that everyone knows will be replenished for years as it gets spent. Our CTO-like person has fully bought into this idea.

We have too many projects and too few developers, so incurring technical debt is an everyday thing. It will be dealt with in "Phase 2" of the project, which all the developers know is something that never happens. From my vantage point - having one foot on the developer side and one foot on the management side, I see that the management thinks the developers are incompetent and the developers think the management is incompetent.

It's not a good position to be in, yet at the same time we have a pretty laid back culture, work less than 40 hours per week, great benefits, etc. So turnover is fairly low. Most of the developers have either a side gig or are going to school part time, or else they'd probably go insane.

Ironically to me, the most apt application of "debt" to any source code you write is not the hack or shortcut code, but the libraries you bring in. You are literally borrowing against the knowledge and expertise of someone else.

The vast majority of the time, this is the correct thing to do. But if you find yourself scaling to absurd levels, this is the code you will have to worry about. You will find that you almost certainly will start to care about all of the details of the network that you were able to ignore in the past. Connection and thread pools that you were able to ignore now need your direct investment.

The hacks? Just make sure you aren't worried about aesthetic changes. Keep people on point solving problems, but don't berate them for wanting things to look a certain way. Developer happiness is important. Often that rewrite of a module that you thought would take too long can help morale so that folks are moving faster by the end of it. Just make sure to keep it goal oriented and, for the love of all things good, don't let them get away without keeping feature parity for your customers.

Not a CTO and I've never come close to having a conversation with one. As a developer, though, I've been burned by tech debt too many times to feel comfortable with it.

The biggest problem with technical debt is that everyone has a different tolerance for it. Some even have an appetite for it. This is a breeding ground for conflict.

The next biggest problem is that the people who take it on are rarely the same people who end up paying it off. The former tend to get showered with kudos whilst the latter experience stress and poor performance reviews.

IMO the worst kind of technical debt is that which is taken on to route around a business process ("Governance"). So it, sort of, contributes to faster delivery but the conditions which drive the choice tend to persist until a regime change so the compromise sticks around forever.

In an ideal world, you should only agree to take on technical debt if you are also presented with a repayment plan. And that leads to my final problem with technical debt: most people tend to assume that that last part will take care of itself.

> IMO the worst kind of technical debt is that which is taken on to route around a business process ("Governance"). So it, sort of, contributes to faster delivery but the conditions which drive the choice tend to persist until a regime change so the compromise sticks around forever.

Can you give an example?

Client access to database: too much effort to get stakeholder sign off for api changes and regression testing so we'll just grant the client full access to the database schema. Now we can't change the schema without breaking zero, one or many clients.

DBA deployment processes: insist on taking system offline for cold backup. Every. Single. Time. Solution: make application run dynamic code which is stored in a database table. Effect releases by issuing UPDATE statements. It's hard to argue that an Oracle database can't handle one of these reliably so the DBAs don't require cold backups for these. Still, we no longer have compiler support so our tests had better be solid. Bonus con: the DBAs cottoned on and started doing the cold backup thing for these updates so we lost business agility too.

Finally: a legacy web service. No toolchain but we do have WSDL and source. Ported to Weblogic in 30 minutes. Fits in with current infrastructure, more performant and gives company an opportunity to decommission the old box. But PM hasn't got a Jira for this work and we're already into integration testing so throws it out. The architect resigns.

Governance. A.k.a. Insurance-Against-Corporate-Liability. Responsible for at least 3 fucked up systems during my tenure there.

IMHexperience... There's no complete escape from technical debt, there's only reshaping / transforming and moving it elsewhere where it does not hurt (well, defects/bugs do not count as technical debt).

In most cases this fix is by raising complexity - as number of (meta-)levels of the system (mind you, that is always a humans+machines system, stretched in time). So what is being used every day by everybody becomes a piece of cake.. and no one is to touch the magic interpreter that lets it happen. And funny (or sad), that magic thing could also be done perfectly by itself.. so it looks like there's no technical debt... Just the experience-power needed to understand it is like 100x what is needed for the easy stuff it has made possible. So, essentially, technical debt has turned into "experience" debt. A futures contract of a kind... or maybe I'm just too pessimistic..

Tech debt, whether creating it or paying it is a critical function of the business. Where this gets messy is when an engineering organization feels they own or somehow should prioritize the debt. This is especially true of paying it.

In every case that I've been involved in or had explicit oversight of, tech debt was created when customer and revenue obligations needed to be met. It's very rare in startups to have artificial or made up deadlines. I'm sure they exist at larger orgs, but as a startup executive, I can say with confidence that everything is constantly on the line, each day is critical, and so shipping product is paramount.

Of course, you have to pay that tech debt back at some point, but this is a business decision. It may turn out the business needs to kill that product or alter it in such a way that paying back debt is useless.

Technical debt should be paid one commit at a time.

You cannot pay it at once. Just make many small continuous improvements.

Slight tangent, but I personally don't like the term debt.

It's an attractive analogy but it doesn't go very far; worse, it leads people to think of it as easy to reason about.

The biggest issue by far around tech debt, which has been covered thoroughly throughout this thread, is that measuring it is tricky.

Measuring financial debt is trivial, you can put a dollar amount on it. Measuring tech debt is very subjective, as it depends on the level of seniority of your team, their ability/authorization to make changes (in banking, you can change nothing, in a startup, you can change everything), attrition rate, etc...

It's such a complicated equation that using the term debt tricks (non-technical) upper management into thinking it's easy to predict: it really isn't at all.

As the team gets larger AND/OR the product attains its necessary features, "technical debt" becomes more of a problem. I would love to address it at every stage of a product, but it is simply not possible, many times during early development our team is just figuring things out, seeing what works and what doesn't. If we were to analyze future development for each iteration during this stage, it would be a nightmare. I like to trust in my team that they have a good understanding of best practice and choosing between tough work and future work. When the product "feels" at a good stage, say a few bug fixes after release, we spend much more time refactoring.

Personally I find that the term tech debt gets overused. I've often heard it applied to relatively fresh projects for features/projects that just haven't been done yet, or to pitch the architecture of the day.

The times when I have seen a real breakdown where the term tech debt is legitimately used is when a scrum process breaks down deliverables below their smallest atom. E.g. different stories to write the code and unit tests, projects spread across devs such that no individual has ownership of any portion of the product etc.

In such situations we're not talking about debt as much as we're talking about junk.

Yeah I always laugh when there’s a project with no code written yet, but with plans for how we will “pay off the debt” that will inevitably be created.

It doesn’t need to be like that if you run your projects well.

Technical debt is code that makes everybody waste time (e.g. because of its complexity), or that creates problems in production (e.g. performance issues) BUT that you cannot easily replace because it is too tightly knit to the rest of the architecture, or because replacing it would be very long/risky. So you have to live with it and pay its dividends every year.

The most frustrating thing for me is that you cannot throw it away and start from scratch (if you want to keep your users). You have to bite the bullet and find ways to make that ugly/smelly code go away in very small increments. This process can take years.

I think it is important to understand that technical debt is very difficult to define and quantify. Depending on your role, responsibility and skillset you will consider technical debt to be different things, and often attribute a different (negative) impact to it.

Technical debt can cripple a development team, but I have also seen it being used as a continuous argument to invest into system development that really had zero positive business impact at the end. I am not saying to ignore it, but I also know that as a developer you tend to overvalue the things that are just in front of you.

I find that it doesn't get too frustrating if you don't let it get out of hand.

Routinely prioritize keeping dependencies up to date, clearly deprecate "old" / bad code, and focus on coming up with good product designs that have reasonable compromises between time-to-market and high quality engineering.

You just can't let stuff slip. Don't let 6 months go by without at least trying to upgrade to the new version of whatever or without making some progress moving from the "old way" to the "new way" for some component.

I believe people are mis-framing the tech debt debate or at least missing part of it.

Tech debt has to be managed and triaged, this is true. Tech debt slows down product development and degrades stability, also true.

But paying down tech debt is also a talent retention and hiring tool. No one enjoys working on a project that gets in your way for reasons that can be resolved. You need to pay down tech debt when people ask in a reasonable fashion; otherwise, simply put, they'll leave.

As a follow up, what are the most useful methods for managing technical debt without incurring too high of a cost to team/product/project velocity?

Address issues as they are raised. Don't wait too long to get rid of debt, and most certainly don't rely on fixed time cycles of fixing things. A team has good sense of the issues they deal with, and you can capture their view in retrospective meetings. Depending on how bad the debt is, you should allocate time to improve. Velocity will only get worse as debt grows, so there is a clear business case for such stories: the more debt, the slower the velocity and the lower the quality. Spend time now, to speed things up. But just don't do it from a theoretical perspective. Address real issues, set real goals and measure them.

"And you know this rewrite, like others in the past, will be a significant sink of time, and likely a catastrophe."

To avoid this, establish clear goals for the rewrite phase and measure results. Don't just rewrite for the sake of rewriting. Have a session where you identify issues and how to tackle them.

From a CTO-Like perspective, getting rid of technical debt should happen frequently, while keeping business goals in mind.

I think technical debt should be called "management debt" because management is the primary source of such debts. Technical debt implies it is something engineers forgot to do or were just too lazy to do. While that may be the case in some places, mostly it's Agile/project planning that lacks any appropriate place for fixing technical debt and refactoring.

I never liked this term because debt as we commonly relate to it, especially in business, is a quantifiable obligation that can be discharged.

In contrast "tech debt" is not a thing with a quantity, no one can know when or even if it will come due.

Legacy unsupported open source projects with outdated dependencies. (We work on a very large Java application (1.8M LOC) and were acquired by a large financial institution that has a lot of policy around open source libraries and support.)

Not a CTO.

1. Legacy code is not the only thing that eats resources that could be thrown at new development. While legacy code is most troublesome, it still will be at maximum only twice as bad resource-wise, than new and pretty code.

2. As the company grows, it may well be that a piece of software is used by fewer than 5 people at the company but requires its own developer team to keep actively supporting it.

3. Maintenance resources vs resources spent on development is a never ending drama.

4. You should proactively think about how to cull your internal projects, and be able to dictate to the rest of the company to stop using them.

5. Custom software for "business guys" - an easy way to multiply IT support budget. Hiring technically illiterate analysts, finance people, supply chain people costs a lot, a great great lot.

6. Last point. THERE SHOULD BE AN IRON WALL in between people making "money making code" and all other tech at the company. Can't stress this more.

> While legacy code is most troublesome, it still will be at maximum only twice as bad resource-wise, than new and pretty code.

This reflects an assumption -- perhaps one nurtured by engineers -- that paying down technical debt is about writing new shiny code.

But in my experience, new code is where most technical debt accrues. True technical debt reduction usually involves improving the behaviour of old code. It might also involve cleaning up that code; but mostly it's about cleaning up the workings of the running system.

> 2. As the company grows, it may well be that a piece of software is used by fewer than 5 people at the company but requires its own developer team to keep actively supporting it.

I recognize the scenario you’re warning against here — but worth being careful about conflating user count with value produced. Some really valuable tools have small numbers of users, e.g. tools for curating some internal dataset. Sometimes the number of users is small only because the tool is handling most of the drudgery.

Could you elaborate on the last point?

Well, the most extreme case will be for example: in an online ads company, somebody disrupts the team responsible for bidding software to do somebody's pet project, do a UI job, or making them do software for internal IT needs.

A point I often make about technical debt:


My biggest frustration with technical debt is that, at some point, it becomes so great that the developers are busier maintaining the existing code than they are building new, business critical functionality.

Given a software product with a modicum of market fit and/or an established user community, AND a competent engineering organization, there is no reason to have technical debt in 2018.

Nine times out of ten the issues are being exaggerated and can simply be ignored. That's not what the engineers want to hear of course. They are eager to fix things, to stop their daily grind.

It's very rare to actually get stuck due to technical debt. The times I've seen it happen, we grabbed all the (relevant) developers, came up with a solution, and fixed it quickly.

As long as you don't have high turnover, technical debt is fine. You have a team who is aware of all the quirks of the codebase(s) and knows how to deal with it.

> As long as you don't have high turnover, technical debt is fine.

"If you tell them to shut up enough times, eventually they will."

> They are eager to fix things, to stop their daily grind.

Yes, because being ground down daily will eventually wear you down.

> You have a team who is aware of all the quirks of the codebase(s) and knows how to deal with it.

But that's not a good thing. If your technical debt adds an hour to a dev's day, every day, you're running at ~85% capacity. If you have tests that need to be babysat, or code that takes a long time to add new features to (or fix bugs in) because of tech debt, you're losing valuable time.
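To put rough numbers on that, here is a back-of-the-envelope sketch. The 8-hour day, the 7 productive hours, the team of 5, and the 230 working days per year are all assumptions for illustration, not figures from this thread.

```python
def capacity(lost_hours_per_day, workday_hours=8.0):
    """Fraction of the working day left after the daily overhead."""
    if not 0 <= lost_hours_per_day <= workday_hours:
        raise ValueError("lost hours must be between 0 and the workday length")
    return (workday_hours - lost_hours_per_day) / workday_hours


def yearly_cost_in_days(lost_hours_per_day, team_size,
                        workdays_per_year=230, workday_hours=8.0):
    """Working days per year the whole team spends on the overhead."""
    return lost_hours_per_day * team_size * workdays_per_year / workday_hours


print(f"{capacity(1.0):.1%}")       # one hour lost out of an assumed 8-hour day
print(f"{capacity(1.0, 7.0):.1%}")  # one hour lost out of 7 productive hours
print(yearly_cost_in_days(1.0, team_size=5))
```

Under these assumptions one lost hour leaves 87.5% of an 8-hour day, or about 85.7% if only 7 hours were productive to begin with, and a team of five gives up roughly 144 working days a year.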

It's not necessarily the sort of thing that brings a codebase to its knees, or the kind of thing that sinks a business, but it is the equivalent of refusing to tie your shoes and tripping and falling as you keep trying to walk to your destination.

> Yes, because being ground down daily will eventually wear you down.

This is true. If too much technical debt accrues unchecked over time, it can have a real effect on the quality of life and morale of your engineering team - which eventually manifests as high turnover, which in turn leads to more tech debt, in a vicious feedback loop.

I've worked at such a place, and the only relief from that cycle came in the form of lost clients. But then again, despite the death spiral, the company was bought out, and the owners made out pretty well.

I think maybe the old adage about outrunning a cheetah is apt here... to outrun one, you only have to run faster than the guy next to you.

> which eventually manifests as high turnover

Or, arguably worse, a crew of beaten-down 9-5 lifers who don't particularly care about their job.

I question the motive to complain about tech debt with the “solution” already. For example, microservices will fix it.

"And you know this rewrite, like others in the past, will be a significant sink of time, and likely a catastrophe."

This statement can only be true if you're using a very specific definition of "microservices". The first time I wrote about small services, in 2013 [1], I had not heard the phrase "microservices" so I used the phrase "an architecture of small apps". I was thinking of apps that talked to each other via Redis, or HTTP, or didn't talk to each other and pushed everything into a database -- my idea was broad. Then in March of 2014, Martin Fowler wrote his essay on Microservices [2], which brought the phrase into widespread use. Soon after that there arose a number of startups that were promoting microservices, and they were promoting a very specific ideal: a system of apps that spoke to each other using HTTP, finding each other via auto-discovery, and offering data via a RESTful interface, with OAuth for authentication. A large number of people then began to use "microservices" for this specific set of interlocking ideas. That's fine, but it is a bit rigid. Something so comprehensive can demand a full re-write. But there is the more general idea, of slowly breaking up a monolith, and replacing small pieces with independent apps -- that does not demand a full re-write. That can be done slowly, and you have the option to keep the original monolith in a limited role, often to handle the frontend. Martin Fowler said all this in his essay, but some of the more extreme proponents of the idea have ignored the nuances that people like Fowler or myself have written about. I myself used the phrase "microservices" for a few years and now I've given up on it as too many people are associating it with a very specific set of ideas, instead of the general idea of small apps cooperating with each other.
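As a rough illustration of the "slowly break up the monolith" idea, here is a minimal sketch of a routing layer that sends already-migrated paths to a small independent app and leaves everything else on the monolith. The `Router` class, the `monolith` and `billing_service` handlers, and the paths are all hypothetical stand-ins; a real setup would more likely do this in a reverse proxy or gateway.

```python
from typing import Callable, Dict

Handler = Callable[[str], str]


def monolith(path: str) -> str:
    """Stand-in for the existing application (hypothetical)."""
    return f"monolith handled {path}"


def billing_service(path: str) -> str:
    """Stand-in for a newly extracted small app (hypothetical)."""
    return f"billing service handled {path}"


class Router:
    """Send migrated path prefixes to new apps; the rest stays on the monolith."""

    def __init__(self, fallback: Handler) -> None:
        self._fallback = fallback
        self._migrated: Dict[str, Handler] = {}

    def migrate(self, prefix: str, handler: Handler) -> None:
        """Move one slice of functionality to an independent app."""
        self._migrated[prefix] = handler

    def handle(self, path: str) -> str:
        # Longest matching prefix wins, so a nested slice can be moved
        # independently of its parent.
        for prefix in sorted(self._migrated, key=len, reverse=True):
            if path == prefix or path.startswith(prefix + "/"):
                return self._migrated[prefix](path)
        return self._fallback(path)


router = Router(fallback=monolith)
router.migrate("/billing", billing_service)

print(router.handle("/billing/invoices/42"))
print(router.handle("/accounts/7"))
```

Each call to `migrate` moves one piece; nothing forces the remaining pieces to move on any schedule, which is the difference from a full rewrite.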

All the same, the idea can be powerful. Check out "Two months early. 300k under budget" [3]

[1] http://www.smashcompany.com/technology/an-architecture-of-sm...

[2] https://martinfowler.com/articles/microservices.html

[3] http://thoughtworks.github.io/p2/issue09/two-months-early/

As with others, not a CTO, but an architect for many years, and I know the perspective I'd start out with, which is that the biggest problem at the CTO level is the mismatch between the people making decisions about what we work on next, which once you are no longer a 3-5 person start up is not the developers, and the people who understand where the technical debt is and what could be done about it.

Transferring the knowledge across that gap is very difficult, perhaps virtually impossible, because both sides of that transaction have a lot of incentives to mangle the message in the process.

The manager who is deciding what to do, and likely incentivized one way or another based on features, uptake, performance, etc. is going to always be very biased in favor of providing more to the customer, and will want the developers always visibly working on something that provides visibly more to the customers. They don't want to hear about why we need to rewrite this thing for no customer-visible change except maybe "fewer bugs", and probably don't want to hear about why the project is slower than expected ("just fix it! but without any additional resources") or what could speed it up.

The engineers have even more complicated problems. First, the engineer may not be able to correctly determine what technical debt even matters if they are not properly kept in the loop about where the product is going. That is, bad code on a feature we're about to dump or obsolete is way less important than on a core feature. Keeping engineers in the loop about future product directions seems to be a big challenge. (I mean, I've seen it done successfully, but it takes a lot of effort. It's just so easy for the manager to go to some four hour meeting with the CEO and the board about the product and all that ever gets back to the team is "We need feature #64." I mean, OK, great for distilling those 4 hours down to the next action item, but at least your senior engineers really could use a bit more guidance about where the product is going.) Second, the engineer is likely to have a personal connection to the code, which can have any number of effects ranging from being defensive about any suggestion that the code is less than perfect, to believing that everything, everywhere is "technical debt" and really needs to be rewritten.

To properly gauge the technical debt on a project takes an almost perfect merger of both the business side of the project and full engineering knowledge of the project. It's harder than even spec'ing out a new feature by far. I don't know that I've ever seen a great solution for this; the requisite bits of knowledge are structurally very hard to bring together.

(This is written from the assumption that technical debt is in fact a big problem. There's a distinct possibility that it really isn't, but I'm answering the question as written. In general if I was starting a new CTO job at anything but a very established software company, I would be surprised and pleased if "technical debt" really was my biggest problem, and not flawed processes, broken cross-team communication, or goodness forbid but all too likely, internal team communication.)

The idea of Technical Debt, though prevalent, is fallacious, because implicit in the idea is the belief that time-to-delivery is somehow _necessarily_ a function of code quality.

But this is not true. Here's why:

The first thing to realize is that code quality is an objective quality; it is the degree to which your code is decoupled, the narrowness of the interfaces between components and the relative size of each component. (This is objective because you can take two systems that deliver the exact same business function and compare them wrt their degree of coupling, cohesion, and interface width.) The first fallacy is to think that code quality is somehow subjective; it is not, at least not for the sense of code quality that matters, which is how much _necessary_ business cost there is to evolve the code. (Note that _necessary_ is the key word; obviously it will always cost an inexperienced/novice developer a lot of time to contend with a code base regardless of its structural qualities.)
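One way to make "compare two systems by their coupling" concrete is to count dependencies between modules. The sketch below is only an illustration: it uses fan-in, fan-out, and Robert Martin's instability ratio (fan-out divided by fan-in plus fan-out), which is a choice of metric made here and not something the argument above specifies, and the two module graphs are invented.

```python
from collections import defaultdict


def coupling_metrics(deps):
    """deps maps each module to the set of modules it depends on.

    Returns {module: (fan_in, fan_out, instability)} where
    instability = fan_out / (fan_in + fan_out), or 0.0 if both are zero.
    """
    fan_in = defaultdict(int)
    modules = set(deps)
    for targets in deps.values():
        for target in targets:
            fan_in[target] += 1
            modules.add(target)

    metrics = {}
    for module in sorted(modules):
        out = len(deps.get(module, ()))
        inn = fan_in[module]
        total = inn + out
        metrics[module] = (inn, out, out / total if total else 0.0)
    return metrics


def edge_count(deps):
    """Total number of module-to-module dependencies."""
    return sum(len(targets) for targets in deps.values())


# Two hypothetical designs delivering the same business function.
tangled = {
    "ui": {"orders", "billing", "db"},
    "orders": {"billing", "db", "ui"},
    "billing": {"orders", "db"},
    "db": set(),
}
layered = {
    "ui": {"orders"},
    "orders": {"billing"},
    "billing": {"db"},
    "db": set(),
}

print(edge_count(tangled), edge_count(layered))
for name, (fi, fo, inst) in coupling_metrics(layered).items():
    print(name, fi, fo, round(inst, 2))
```

Counts like these say nothing about cohesion or interface width, so they cover only part of the definition given above.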

Another thing that gets conflated when talking about Technical Debt is the "flexibility" or generality of the system that is built. It is a mistake to consider a specific (ie, non-generalized) system as having higher technical debt. That would be like saying Gmail has high technical debt because it doesn't have Kanban features. A high-quality system (low technical debt, so to speak) can be a system that is quite specific, quite hard-coded, etc.

These are two fallacies. Now that they have been called out, the last thing to consider is whether or not there is some _necessary_ cost savings in delivering a system with "higher debt" (ie, a system with more complexity).

As an experienced practitioner who has worked with many other experienced practitioners in large and small companies alike, I can say confidently that one's ability to deliver a simple system is a function of skill set, not time. In other words, a skilled practitioner will take no extra time to deliver a simple system, albeit perhaps a specific one. There is a necessary time trade off if we are talking about making a system with more generalized capabilities--but at this point we are talking about features and priority and managing feature creep and the like.

This is where I think many startups fail when they think that 5 mid-to-junior engineers are better than 1 qualified* senior. The only caveat is that there are many (and, perhaps, most) "seniors" out there who have not at all cultivated the relevant skill set, who have spent their careers leaning on this false notion of Technical Debt and the idea that "I simply didn't have the time to make a good system". In this line of thinking there will be no growth, because implicit in it is the idea that there really is nothing new, no new skill set, to learn; that all there is is a shortage of time.

A tennis player would never think, "I would have played a better game had I had more time." As if somehow there is a world in which tennis could be played in slow motion. No: the tennis player practices and practices until her muscle memory is tuned to play a competitive game in real time.

Likewise, a developer must _specifically_ train herself to play a good game in real time. So that the ability to deliver quality is part of their muscle memory. At that point it is actually harder -- takes longer! -- for this developer to build a complex system. Decoupling becomes instinct. Much like a skilled tennis player would have to go out of their way to play badly, with bad form, etc.

Unfortunately I don't see this training happening in either academia or in the practitioner's world. Everything is too optimized for immediate delivery and so people spend their careers learning how to hack it through and then only _after the fact_ come up with this idea that Technical Debt is somehow a necessary thing. The idea of Technical Debt is alluring because it is much easier to lean on than the idea that you have a serious lacuna in your skill set.

Now one might say I'm not being convincing because you still have to take the time to "learn" the skill set. This is true. But this does not change the above reasoning at all, unless one thinks that learning and delivering can happen together in some optimal fashion. I don't believe that they can. I do not want the person building my home to be learning on the job, that is for sure.

As a dev, "or solve it with microservices altogether" hahaha. Good one!

I've been CTO (well, IT Director) for 3 years and I saw no particular problem with technical debt. I just handled it the same way I would handle financial debt. Repaying during good times and taking on debt in projects with high stakes and fixed deadlines.

It all depends on the industry as well. My CTO experience was in News and Media, where software is rewritten quite often and does not have any business value by itself.

I am now in a consulting position in the banking industry. Things are different there. None is willing to take on the risks associated with technical debt repayment. It accumulates endlessly. Once it reaches a point where no local developer would work on the project, the project is handed over to Asian sweatshops which support it indefinitely.

That's also a problem in medical software. People are so risk averse that they avoid relatively small changes until the whole thing comes crashing down eventually.

I have also seen the trend that good developers start to leave and the only people left are low quality developers or outsourcing companies who don't care about what they work on.

Another aspect is the size of the business and the related dilution of power and responsibility.

Repaying technical debt in production software means introducing new bugs, instability, change. Users will have to adapt, business KPIs will suffer. The upside won't be visible. No sane middle manager would agree to that, because this will negatively impact his career path.

As a solution, I advise organizing work in maintenance teams rather than by project. Having personal responsibility for each feature or process or product also helps.

In the end working on large long term projects that a lot of people depend on just is not much fun :). Working on smaller projects where you can make quick changes is much more satisfying.

> None is willing to take on the risks associated with technical debt repayment.

This risk is vastly mitigated with good test coverage.



> Repaying during good times and taking on debt in projects with high stakes and fixed deadlines.

I agree this is an amazing strategy when you can do it!

> None is willing to take on the risks associated with technical debt repayment.

What do you think they are afraid of? What risks are there? Is it possible to mitigate these risks in some way?

The downsides of the technical debt repayment are visible to everyone: new bugs are introduced, changes are made and users have to adapt. There is trouble for everyone who is not the developer. The upside is visible only to developers in the short term.

Sometimes a technical debt repayment may require a big change in business. I saw a banking system where the authorization framework was based on impersonation of users. Repaying the technical debt would mean going over all contracts and adjusting their conditions while developers implemented a more traditional role-based access control.

Such a change was so unrealistic that middle management did not even talk about it until one day the project got outsourced and most developers were reassigned to other projects.
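
To make the contrast above concrete, here is a minimal sketch of one possible reading of the two models. Everything in it is hypothetical (the `User` class, the role and permission names, both check functions); it is not the banking system's actual design, only an illustration of why moving from one model to the other touches who is allowed to do what.

```python
from dataclasses import dataclass, field

# Hypothetical role-to-permission table, for illustration only.
ROLE_PERMISSIONS = {
    "teller": {"view_account"},
    "manager": {"view_account", "approve_transfer"},
}


@dataclass
class User:
    name: str
    roles: set = field(default_factory=set)            # consulted by the RBAC check
    own_permissions: set = field(default_factory=set)  # consulted when impersonated


def allowed_by_impersonation(actor: User, acting_as: User, permission: str) -> bool:
    """Impersonation model (assumed): the actor borrows another user's identity,
    so the actor's own rights are never consulted."""
    return permission in acting_as.own_permissions


def allowed_by_rbac(actor: User, permission: str) -> bool:
    """Role-based model: access depends only on the roles granted to the actor."""
    return any(permission in ROLE_PERMISSIONS.get(role, set()) for role in actor.roles)


teller = User("teller_1", roles={"teller"})
customer = User("customer_1", own_permissions={"view_account", "approve_transfer"})

print(allowed_by_impersonation(teller, customer, "approve_transfer"))
print(allowed_by_rbac(teller, "approve_transfer"))
```

In this sketch the same person gets a different answer under each model, which hints at why such a migration would force a review of what each party was promised.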

> What do you think they are afraid of? What risks are there? Is it possible to mitigate these risks in some way?

Not OP, but usually there are no (trusted) unit tests.

IMO, unit tests, integration tests, TDD, BDD, etc. do not matter as much as the size of the codebase. If your system has millions of lines of code, you are doomed and no amount of unit tests can help you.

I believe that's why modularization was invented?

Modularization per se does not work. What works is administrative boundaries. Once a product reaches a certain size, the team should be split up.

"Once a product reaches a certain size, the team should be split up."

That is also called modularization.

Refactor everything all the time, duh
