> John knew his code had a few bugs. But nearing the deadline, another project appeared and it took a lot of John’s time, so he couldn’t go back and fix the problems.
This is a problem, but it's not inherently technical debt. That's just delivering a bad/incomplete feature (not necessarily John's fault if he's been asked to make changes). Code with technical debt works, or mostly works, but relies on unmaintainable or less-than-ideal patterns.
Simplified example: adding a "Cancel" button to a form, which involves adding a new "secondary" class style to the button.
* Technical Debt - The button is in place and works correctly. But instead of reusing the <Button> component and tweaking it to handle the new "secondary" class, John simply copies and pastes the <Button> code directly into the implementation.
* Bug/Bad Feature - John references the button correctly, but fails to handle certain state resets properly when the form is cancelled.
The former is technical debt because it works correctly but doesn't follow a good pattern, or does something unexpected (refactoring that button will be much, much harder). The latter is likely just bad code.
The one case where the latter is not bad code is if the company has made an intentional decision not to handle cancel edge cases. Even then, I would argue that is product debt - not technical debt.
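To make the contrast concrete, here's a minimal sketch of the two paths for that Cancel button. All names here (`renderButton`, the `btn-*` classes) are hypothetical stand-ins, not from the thread:

```typescript
// Hypothetical sketch: `renderButton` stands in for the shared <Button> component.
type Variant = "primary" | "secondary";

function renderButton(label: string, variant: Variant = "primary"): string {
  return `<button class="btn btn-${variant}">${label}</button>`;
}

// The clean path: extend the shared component with a variant.
const cancelClean = renderButton("Cancel", "secondary");

// The tech-debt path: copy-paste the markup inline and hand-edit the class.
// Works identically today, but the next change to renderButton won't reach it.
const cancelDebt = `<button class="btn btn-secondary">Cancel</button>`;

console.log(cancelClean === cancelDebt); // true: the debt is invisible in the output
```

That's the whole point of the example: both versions render the same thing today, so nothing looks wrong, but only one of them picks up future changes to the shared component.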
Every software product in the world has debt. Just like every software product in the world has bugs. You need to account for your debt as you grow. Otherwise it will eat you.
The worst case scenario is probably partial paydowns, where the abstractions get changed but some of the non-aligned stuff remains, so you end up with a jumble. Especially combined with inadequate documentation (or a culture of inadequate documentation, where people don't even look because they don't expect it to be there), the old practices can end up being cargo-culted forward even after someone has attempted to create a better way.
A couple of years ago I had to spend a considerable amount of time PoC'ing platforms and tools claiming to solve this issue, and I remember there were virtually none that did this on-prem using established Office programs without a four-to-five-figure investment. The organisation in question ended up using password-protected Word files with forms (a royal pain). Before that, they had been using Excel files (apparently it's easier there to lock parts of the sheet you don't want users to mess with), but this was an even bigger pain, as Excel still has a decade-old bug, even in its latest version, that distorts graphics when row height is changed.
Long story short, I'd be curious how you solved this and what software you used.
It ended up being a bullet point that needed to be talked about during sprint planning. It helped that designers were part of some of these meetings. If our company had been bigger it would have certainly been more challenging.
Mostly we tried to make it part of the culture that we stick to what’s in the style guide. And anything that deviates had a task of adding it to the style guide.
I have to say that I wasn't completely on board for a long time, since I was sure company culture would make it hard to succeed. We had a champion who got us about 90% of the way there before leaving. But there was enough momentum that another dev managed to bring together the right personalities and solidify the mechanics of dealing with the style guide, so it was a no-brainer to make it part of the workflow.
We used git to store the HTML and CSS examples, which made it trivial to copy. And since devs controlled the repo, devs controlled the pull requests as well, and required review just like any other PR. It helped that the designers we had/hired were willing to deal with git, even if they had low tech knowledge.
I’m not at that job anymore, so I don’t know how long before entropy wins. It’s the type of thing you have to keep tending for it to continue to be useful. But enough process was in place that it will likely continue to be useful for them for several years at least.
But I strongly agree with your second part: bad abstractions are a much bigger source of technical debt. Abstractions that do not match the problem lead to convoluted solutions, reduced functionality and are very hard to break out of with incremental fixes.
But tbh, I'm skeptical of the premise to begin with. It's still easier to gradually consolidate 100 slightly different uses of a pattern than blindly create an abstraction that adequately covers those 100 uses. If your team is useless, it's better that their code be fragmented and atomic anyway, because big mistakes are inevitable. You don't want to make it hard to touch those mistakes. You want that code to be uncoupled & disposable.
But at that point the argument is getting pretty abstract. Depends on the specific codebase.
It's not scalable, and produces severe consequences when used without care (which is most of the time, it seems).
If it was a tactical failure by John, then we just call that Tuesday.
I have written software with tests and without. The latter starts like a prototype and becomes quite useful over time. At first, everything looked good and there were just a few minor bugs which could be fixed easily. But over time, more edge cases appeared that couldn't be tracked down. As there are no tests, there are no specific definitions of what each part of the software is supposed to do. Yes, there is a big picture of how everything should work together, but obviously it doesn't catch every edge case. The problem is that as a developer I can't act as effectively on my code anymore, because I took some shortcuts in the past.
On the other hand, there are other projects where I have test suites and where I can be totally confident that everything has a well-defined spec of what it should do, and where I can't break something accidentally without noticing it.
So in the end it doesn't matter if your shortcuts have been made deliberately (most of the time it is a mix anyway). What matters is that due to the shortcuts you can't act as effectively on your code anymore.
I'm often in the middle. Where I have tests, they're not always (or rarely) the result of a "well defined spec". But... the tests do reference whatever the understanding of XYZ was at that time. That's often as well-defined as I can get. Clients will come back months later and say "this is broken". Well, no it's not. The tests work just fine. "No, that's wrong, it's not supposed to work that way". Well.. there's a test with notes indicating "do ABC then XYZ" and the test ensures that's working based on what was known/agreed on at that time. That's often not the same as "well defined" by a long shot.
The subject has been written about many times. This is a pretty good summary of it.
Not called out:
* Reckless debt is misfeasance on the part of developers.
* "Prudent" debt is often avoidable- arbitrary deadlines are arbitrary. Management may believe that enterprise data models are as malleable as CSS. So "reckless" needs to be applicable to management as well.
I'm swimming in many years' accumulation of both reckless debt and reckless mgt. The deliberately considered debt I can handle.
<input type="reset" value="Cancel">
Simple but under engineered systems are much easier to rewrite than to simplify over engineered ones.
People are so afraid of making a bad decision that they refuse to make any decision at all, and then make a giant mess in the process. What you should do is spend your energy on finding reversible decisions, and then not spend a lot of effort on actually making them. We know where the paint store is, we know they can make up paint in 15 minutes, fuck it, paint it blue, we can always paint over it later. Hard no to black, though, since you can't paint over that shit.
People are so used to avoiding decisions that on a few occasions I've entirely flustered someone who wanted to tear into me (sometimes with an audience) by cutting them off and saying, "Yeah, that was a mistake, and here's how we're going to fix it." None of them had any idea how to recover from someone saying "I was wrong" and going on to try to fix the problem. I still have a little video in my head of one guy's eyes bugging out when he realized what I had just said.
“I apologize for such a long letter - I didn't have time to write a short one.”
There may be better information.
Logging is my go-to example for this sort of behaviour. People write stupid amounts of log statements in their code, so much that it’s hard to even read the code to understand what it does, in the hopes that it’ll make debugging easier. Use a damned debugger! Take a traffic dump! Use strace!
What’s more, it’s extremely rare that libraries provide logs themselves. So the actual complex parts of your application, like say the HTTP library or (God forbid) the TCP stack, can’t be debugged this way.
If you find yourself writing a bunch of statements like “DEBUG: updating balance from 1 to 2”, stop and write some tests instead.
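To illustrate that last suggestion with a hypothetical example (the `updateBalance` name and behaviour are made up for this sketch): the thing the DEBUG line was narrating can be pinned down once as a test instead of being re-read from logs on every run.

```typescript
// Hypothetical example: the behaviour the DEBUG statement was narrating...
function updateBalance(balance: number, delta: number): number {
  // instead of: console.debug(`DEBUG: updating balance from ${balance} to ${balance + delta}`)
  return balance + delta;
}

// ...captured once as an assertion.
if (updateBalance(1, 1) !== 2) throw new Error("updateBalance broke");
```

The assertion fails loudly the moment the behaviour regresses, which is the information the log line was trying to surface, minus the noise in the code.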
I have encountered a fair amount of cases where that does not work.
- debugging a super large program ? loading gdb may take two minutes while recompiling to add a printf only 3-4 seconds.
- not an admin on the machine you're on, and the person with the admin account is not around? sorry, you can't debug on macOS (and likely on some Linux distros)
- likewise, no traffic dump (and I'd assume no strace) if you don't have root access
I've got a program with log statements like that all over the place, since stepping through it with a debugger would not even be possible. My IDE takes care of hiding the debug statements, since they're all encapsulated in different regions and if-def statements to log different types of things.
Flipping your argument, you’re basically saying that you can’t debug multithreaded code if the program doesn’t log.
But, I think we're depending on different programs in our workflow. I much prefer a logfile with statements that focus on what I want to inspect at the time.
I'm quite proficient with debuggers on both Windows and Linux, but I tend to use them less when dealing with my own multi-threaded code.
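The scheme described a few comments up (regions of debug statements toggled per category) might look roughly like this in TypeScript. The flag and category names are hypothetical, and the original uses compile-time if-defs rather than runtime guards, so treat this only as the shape of the idea:

```typescript
// Per-category flags stand in for the commenter's if-def regions.
const LOG_IO = false;    // stripped at compile time in the original; a plain guard here
const LOG_STATE = true;

const logLines: string[] = [];

function logIO(msg: string): void {
  if (LOG_IO) logLines.push(`[io] ${msg}`);
}

function logState(msg: string): void {
  if (LOG_STATE) logLines.push(`[state] ${msg}`);
}

logIO("read 42 bytes");       // silenced: its category is off
logState("balance updated");  // recorded: its category is on
```

The point of the per-category split is exactly what the commenter describes: you can focus the logfile on the one subsystem you want to inspect at the time, instead of drowning in everything at once.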
To be clear I’m not talking about adding some specific print statements to the code while debugging, I’m talking about keeping those statements in production code.
Of course, there’s cases where logging might be your only option. In that case: Go for it!
Relinking a super large program may take minutes, whereas loading gdb can get me a backtrace in 3-4 seconds! VS natvis files and other visualizers can hot-reload while I'm looking at a crash dump from production that took hours to trigger/gather/repro!
There are cases for logs - where sufficiently efficient conditional breakpoints are particularly painful to setup, for example - but the only time I've waited minutes for a debugger to respond has been with VS after major system updates as it refreshes a universe of symbols from the symbol server.
Do you have some really slow python scripts auto-loading in GDB or something?
well, we have definitely opposite experiences :) the main software I work on creates a ~1gb binary in debug mode and lld links that in a few seconds. But gdb and lldb, even with gdb-index and all the optimizations I could find, both make startup slow enough that I can go for a coffee and it's not always finished loading when I'm back
It was only when I added proper logging to the critical junctures in the code that I could finally see very clearly how it behaves and see what input results in what code paths and output. I learned more in an afternoon than I did in a month.
for instance, one smart and experienced guy I knew said that if there was an error, your program should just fail instead of giving a ton of messages and recovering.
and if you're just throwing code together, that might be what you do.
on the other end of the scale, a well written "second pass through everything with cleanup" might have tastefully written code, a few relevant comments and consistency.
and then I've seen a lot of code that appears to look like that, but is copy/paste garbage. (sort of the coding equivalent of a wordpress theme)
So fail early, but not too hard.
I would bet that he was right in that situation, but that he was also making a nuanced statement. Not all code should fail loudly, but when it should, "recovering" can sometimes be a distraction from a very real problem that needs to be addressed by the right person.
When you start out, you just write code and don't think about how things interact with each other too much.
Then you learn that you're rewriting a lot of code, you overcompensate and start to over-engineer to avoid duplication and often to handle more scenarios than you need. You abstract out possibly too much.
Finally, you get to a phase where you realize a lot of the abstractions you've created aren't actually used and you're writing more "meta code" than you need. In this phase you learn to engineer just enough for what you need at the time and design things to be easily changed in the future. You trust that even if your system design can't handle everything right now, you've designed options for yourself to expand as needed.
The debt is that you borrow time and effort from your future to save time and effort in your present.
There was a separate testing client that was used for a lot of the demos. The demos on the 'real system' would fail so often that somebody came up with the idea to just take a lot of the code from the microservices and drop it straight alongside the testing client, as a monolith.
It was so much more stable that after becoming the default way to demo, it became the default way to use the tool.
5 years on and I believe it's still running that way. In this case microservices were massive over engineering.
They often are.
Well-meaning engineers and architects seem to reach for them while forgetting that microservices are about scaling your organization, not your software.
There are very few technical deficiencies in a monolith that are solved by microservices. Indeed, they typically bring technical hurdles: distributed systems are hard to reason through and IPC over the network introduces latency. And you need a very strong ops team to implement them without major headaches.
But where they really start to shine is when regression testing takes ~days and your development cycles are screeching to a halt because you've got too many in flight features and not enough runway to land them.
What takes real skill is recognizing which of your monoliths should be grown and which ones need to be sliced down the middle.
A lot of over engineering is due to getting this completely wrong: you prepare for something that never happens, at great cost and while underestimating the difficulty level. So you end up wasting endless amounts of time on stuff that has no business value.
On the other hand, a lot of under-engineering is due to not having a clue about what is obviously coming next and getting caught by surprise by completely obvious requirements.
I would rephrase this as make guesses based on probabilities and urgency estimates from past experience. You can't know the future but you have stats of the past. So a good engineer is one that has relevant experience and applies it appropriately.
Seems to me the default is obviously getting things wrong whether that be over or under engineering.
How would one go about reliably making choices that strike exactly the right balance consistently?
One thing I'm starting to really internalize is that to do so requires a deep understanding of software engineering as a domain, the domain and existing system, and the technical vectors in the business.
To me, engineering is taking a problem and the corresponding constraints and building the solution that satisfies it with the least amount of resources.
"over engineering" in software is about trying to make something adaptable? Or maybe just satisfying someones sense of beauty.. Or maybe it's just about programmer convenience at the expense of customers/user experience and hardware resources..
It could also refer to building resilience to load before it is necessary. Example, distributed databases with failover and recovery capabilities for a side project.
Poke holes all you want, those are bad examples, but they should illustrate the point. Over engineering is a real phenomenon.
It's even more frustrating when you work with engineers who refuse to believe that any code they write could become technical debt in the future. These tend to be people who overcomplicate systems to anticipate future requirements.
I don’t think I’ve ever anticipated a future requirement that did not ultimately turn out to be necessary.
Conversely, I’ve had a lot of people tell me something was not necessary only to find that, surprise, surprise, it was necessary after all.
This is how I approach software, but there are so many people out there who won't approve simple code, because it doesn't have enough configs, or classes, or whatever. It seems that the person who wants the most over-engineered code tends to get their way. Psychologically the absence of 'things' is always inferior and harder to argue for than having more 'things'.
Reactive frameworks indeed hide a lot of complexity but that does have positive impact on the user code. (IMO)
I leave breadcrumbs and openings in my code all the time so that the next time I'm in there, I'm set up, invited even, to ask the question I left unanswered before, and maybe do something about it. I leave the option of adding a feature, instead of building a config framework to support it and then defining one implementation. Which will end up not being the correct API when (if) I write the third one.
Sometimes people beat me to it, and get excited because they had an idea they're now invested in. I usually just let them have it, don't point out that I put the idea in their head. It's so infrequent people get that invested in functionality, you have to encourage it. There's too many things I'd like to do and never enough time anyway. They've just nominated themselves as a candidate for maintaining that module when I get tired of looking at it, freeing up my time for things nobody else cares about until it's done.
It's like if you were designing a house with plans to expand it later. You'd be very careful about certain decisions, like whether you should put a bedroom in the logical spot for an addition, because now that bedroom also has to be a thoroughfare. For 20% extra effort, you've extended the calcification threshold for that house/code by 80%.
And so many people try to make it a dichotomy. Like that diagram of a project with a design versus without. That's not a fair comparison. What does a design even mean? Wireframes? A written document of features? How do we change this design? What's the way in which we get feedback? And what sort of projects are we talking here? A programming language is very different from a website. And what even counts as no design?
Some other notes:
- I'm not sure the author correctly used the Pareto principle. It's not just a generic "split things into 80/20" but a specific observation on how 20% of the causes result in 80% of the effects.
- The post could have used a quick proofread. There's quite a few spelling mistakes and poorly phrased sentences. I totally get if the author is not a native speaker, but they could easily enlist one to help (if the author wishes, I'd be happy to proofread—contact me).
- I get the whole "make this post fun with cats!" but honestly I'd prefer a straightforward example with no funny images. Maybe that's just my grumpiness.
Technical debt is like any debt. It's bad if you come due on it and don't have the resources to pay it off. Unlike cash debts, technical debt rarely comes due in a tangible fashion.
If getting to my next round vs. failing means leveraging technical debt, I'm going to do that by all means. It's not necessarily easier to fix tech debt at scale, but having more resources means it's less impactful on bandwidth.
If you get into debt so that you can have a nice new TV quickly, and you forget to pay it off, so it keeps increasing, and you only pay attention to it when the debt collectors start calling you... that is a problem.
In my experience, most software debt resembles the latter case. The goal is to meet a deadline sooner, and there is absolutely no intention of spending resources to fix it later, unless the customers start making too many tickets.
When a manager says "later" or "low priority", it usually means "never, unless we absolutely have to". Technical debt is always a "low priority" that will be addressed "later".
Yes, there are situations when meeting a specific deadline is critical, and it is a rational decision to cut some corners and fix things later. The problem is that after the deadline is met, the "fix things later" part is usually forgotten. A new deadline is set, and new corners need to be cut, etc.
Technical debt can save a company by pushing something forward and launching a product sooner than a competitor. It can save a big customer by fixing a bug for them quickly that if not fixed immediately would have caused them to cancel a multimillion dollar contract. It can also eat away at a company if you let it accrue for decades and never pay it down.
That book is a little gem, short and well-written
Looking back and going "this is bad!" at your 'technical debt' isn't all that interesting. It's much more useful to understand what led you (the team) to take those shortcuts in the first place.
Hardcoding something in a file rather than pulling it from a config or a database has a pretty good ROI in terms of time saved, and it's not that hard to change later, but tying yourself to, say, a kind of storage that you know is not going to work long-term, that will cost you later.
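A hypothetical sketch of that trade-off (all names invented for illustration): the hardcoded value is a single constant, while the config-driven version needs a shape and a loader, and only earns its keep once the value genuinely varies by deployment.

```typescript
// Debt-ish but cheap: hardcoded, trivially greppable, easy to change later.
const SUPPORT_EMAIL = "support@example.com";

// The config version (hypothetical shape), worthwhile only when it really varies.
interface AppConfig {
  supportEmail: string;
}

function loadConfig(overrides: Partial<AppConfig> = {}): AppConfig {
  // fall back to the hardcoded default, let deployments override
  return { supportEmail: SUPPORT_EMAIL, ...overrides };
}
```

Contrast this with the storage example: swapping a hardcoded string is a one-line diff, while migrating off the wrong storage engine touches every caller. That difference in switching cost is what makes one shortcut cheap and the other expensive.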
Also, sometimes doing MORE results in more technical debt. If you are going to spend your time building a Kubernetes cluster because "your home page has to be webscale", you are an idiot.
I would not describe that as technical debt. Those are mostly missing features. The code could be fantastic; it's just that new code is needed to support different currencies, disable accounts, or similar things.
The lack of automated testing is more in line with what I think of as technical debt.
Details aside it is an interesting read.
It doesn't make the system worse/more expensive as it grows, and it doesn't become more costly to "fix" the bigger the system gets. IMO those are the operative features of tech debt.
But testing is mainly important so a developer not familiar with the program will not break it by modifying something seemingly unrelated.
Lack of tests will absolutely bite you.
You are right; depending on the kind of application and circumstances, testing can be done in different ways. There are occasions where, instead of automated testing, there are other ways of working with quality. But in this case, automated testing is on the list of things that need to be done. So I think that to want automated tests and not have them is technical debt.
Aside from the initial increase in cost when you forget how the original code worked, tests are no more expensive to write in a year as they are today. Not significantly, at least. So I'd just class them as incomplete product - the effects are negative and can be expensive, but it doesn't dig you into a hole.
Like credit card debt, too many people turn around and they've got $20k of credit card debt or 20k LOC of tech debt.
I don't have too much of a problem with tech debt as long as: the benefit is clear; the debt is well understood; the payoff date/cost is planned. Just last night I was talking with a developer who works for me: "This way will require 5 queries instead of 1. Can you help me figure out this whacky ORM API to do it in 1?"; "Just do it as 5 right now and, as we learn more about the ORM API, we'll fix it later..." Tech debt: we get the product done faster; we've got a 1-5 line change to make (when we know how to make it); and we'll do so in a few months.
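The 5-vs-1 query trade-off can be sketched with an in-memory map standing in for the database (all names hypothetical; a real ORM would issue round-trips where the naive version does lookups):

```typescript
interface OrderRow {
  total: number;
}

// Stand-in for a table: id -> row.
const orders = new Map<number, OrderRow>([
  [1, { total: 10 }],
  [2, { total: 25 }],
  [3, { total: 5 }],
]);

// The debt version: one lookup per id. In a real ORM, one round-trip each.
function totalNaive(ids: number[]): number {
  let sum = 0;
  for (const id of ids) {
    const row = orders.get(id);
    if (row) sum += row.total;
  }
  return sum;
}

// The "fix it later" version: one pass, like a single WHERE id IN (...) query.
function totalBatch(ids: number[]): number {
  const wanted = new Set(ids);
  let sum = 0;
  orders.forEach((row, id) => {
    if (wanted.has(id)) sum += row.total;
  });
  return sum;
}
```

Both produce identical results, which is what makes this kind of debt so easy to carry: swapping `totalNaive` for `totalBatch` later is exactly the planned 1-5 line change, as long as someone actually comes back to make it.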
Ah, famous last words that I myself have also spoken from time to time.
1. Netscape Navigator - was such "spaghetti" that they had trouble re-architecting it to allow it to keep up with evolving web standards, and then put everything into Navigator 6.0 which ended up running like a snail. (Second system effect). I'll note that according to jwz, it was written by people working lots of all-nighters.
2. Microsoft Word for Windows. Apparently it had a Ponzi-like accumulation of technical debt as they kept adding features to the release and writing code that looked technically correct but was known to be flawed, relying on testers to find the bugs they knew they had written. It took until version 2 for the bugs to get ironed out to a usable state.
I've tried coding with some open source programs that are mostly being used for a company's internal process that are basically piles of technical debt, but it'd be rude to mention.
I'm sure there are other great ones, but now that so much software is SaaS, we're probably not going to hear about it, because the release dates aren't as obvious. To me, eBay appears to be one, as they are trying to remake their web application to be more modern but it has tons of warts all over it.
"Technical debt" conversations usually go something like this:
Me: "Could we spend a few hours getting a handle on what the data model should look like? Maybe rough out a crude ER diagram?"
PHB: "Oh slumdev, you're such a Boy Scout. Just blam it into a document store. If something changes, we'll just be Agile about it. Technical debt is a tool!"
I guess that means you can take an hour long nap every day to recover before your next shift.
The post uses a lot of words to describe an understanding that experienced engineers and managers should have. Technical debt is a trade off. It exists in nearly all software of decent size as it's being developed. It is vital both for management and engineers alike to keep it at a reasonable amount, or you'll wake up one day with something that can't release new features, function, scale or be maintained.
It is also worth stating that tech debt can have little to no significance to the success of a startup.
These days rewrites are popular. They make sense in some cases (when the project is smaller, or when introducing a new framework). But in other cases, the original developers and business people have left, and no one knows exactly how things work, aside from the fact that they need to keep working the same way. In those cases fixing technical debt is extremely important.
A startup is more likely to fall into the above category if they ignore their technical debt because rewrites are hard and messy.
The wisdom comes from timing the investment so it's not destroying the "mature" product through entropy or strangling the new product through over-engineering.
Startups do seem to have bigger issues than technical debt to deal with. Because relatively speaking, there are bigger mountains to climb early in the startup’s development.
I've seen more issues caused by nonsense like that, or by people insisting that code should be DRY and making ridiculous mental contortions in the name of a three-letter acronym, or by personal taste (the amount of fashion in software development is insane), but I've yet to see a project truly fail because someone cobbled something together quickly and introduced tech debt. So far, anyway.
Projects like that don't fail in the dramatic sense of the word. They just die quietly or get replaced.
On a project with an external deadline this year I had to make some modifications for compliance reasons. It was very painful. The parts that are rewritten are conceptually a lot simpler. They fit in my head. The parts that didn't get rewritten have incredibly high cyclomatic complexity and really don't fit in my head.
I want to refactor it so it's nice.
I also don't want to introduce lots of bugs nor push out deadlines.
A decision was made upfront that we would try to extract this component out of the main system, so it could be independently modified and deployed, as the capability to do so would actually provide decent business value. Sadly this particular decision didn't turn out well, and really, really pushed the deadline. Ultimately we spent about 70% of the engineering effort on the project trying to extract some functionality, and I'd say about 10% of it is actually extracted. So now it's even more of a Frankenstein, split across 2 codebases.
The project is coming to a close, and we will have a clean up session early next year. Our next project is related to this particular component also, so there is scope for us to be making changes to it.
The goal is to make this component more stable for customers, and safer to deploy.
I think a big bang rewrite is 80% likely to fail. I strongly suspect it's actually worth succeeding at though. I'm trying to chart a path through to a successful reworking of this critical subsystem. The first failure was quite humbling, but not unexpected.
Probably the best move I made on this project was getting good logging in place so I can tell exactly what it does, and made it a lot easier to reason about whether our attempted extraction does the same thing. Also I've managed to collapse some of the cyclomatic complexity, so it's easier to mentally keep track of as you're trying to read the code. It's still very, very hard to modify the functionality of the old stuff.
My current line of thinking on the best way forward is to abandon the attempted extraction and repatriate that functionality to the main codebase. Then use characterization/approval tests and finish the original rewrite. I'd be happy to do it piecemeal too, even one ticket a sprint spread out over the course of the year, so long as we actually complete it. In addition I'm planning to try a tiny, reversible experiment with a different way of extracting it that's much more likely to succeed and deliver the benefits we're looking for, while simultaneously charting a way forward on another major piece of tech debt that causes a lot of stability problems. If that experiment succeeds I want to extract all of it that way, and would only attempt to do so once the entire thing was in a reasonable condition first. Getting it all the way to that state would represent a major win in my mind.
It's hard when we can't afford to fail at this, and we're not likely to succeed. Really trying my best to find a viable plan.
At the same time, if it ain't (that) broke, don't fix it right? But that leaves no road to new capabilities for us as a business.
Software is just a hard problem.
I would love to see a sample of this code.
Finding the best balancing point is hard to turn into a sure-fire formula. I use experience (age) to make a best guess based on past successes and failures and project types.
One tip is to collect and keep a list of questions and suggestions, and make sure they are sufficiently explored. You can't answer questions that have not been asked, but AT LEAST answer those that are asked. Make sure staff is not afraid to ask questions or make suggestions.
Yeah it can if you abuse it. It can also be incredibly helpful.
First objection: Managers and Business persons understand Debt, like it and use it. I use credit cards and have a car loan. If we, as engineers, present the problem as "You can have what you want now, but you will incur Technical Debt" then any sane person will say "Sure! Good deal!" Ask yourself, how many times a manager or business owner has asked "What is the interest rate?" Zero times.
Second Objection: It's not debt.
Debt would be if someone said "I need to build a bridge across here that can handle an Army, and I've got $1m", we engineers reply "It will take $2m", and they respond "Ok, I will borrow $1m so you can build the bridge I need".
Instead what happens is they say "Well, build what you can for $1m", and you say "Ok, we can make 'a bridge' for that", and then either (a) your infantry can cross, but the tanks have to get diverted 20 miles out of the way, or (b) the tanks end up in the river along with the bridge. Since (b) is bad, you then have to spend a lot of time planning the routes for the tanks, and making sure the tanks have the right air cover, etc etc, i.e. doing more work.
It's not debt. It's just (at best) an incomplete solution or (at worst) a bad solution that fails at the worst possible moment - e.g. database collapses during registration for the largest event of the year.
Ah, but surely, if you build the lightweight solution for $1m, and acknowledge the increased costs of managing the problems that it doesn't solve, then that's fine? Sure, but that's not technical debt either! That is scoping: we (engineers + business) identify a workable solution that provides some business value. And then we do that well.
He even calls out that some teams think they don't have to pay back the debt. I think that is the norm, and that's the problem of debt. When financial people use debt, that debt comes with structure and consequence: it might be monthly payments or it might be due sometime in the future. Offer such people a "debt" that has nobody enforcing it and nobody quantifying it, and of course it won't be paid back.
Agile engineers making scary "booga booga" noises about "we need to refactor" just don't carry the same weight as not being able to make payroll because your note is due. And yet that seems surprising to engineers.
Incorrect. He notes that debt (financial or technical) is useful in getting things done more quickly, but must be repaid.
Technical debt enables getting a minimal product or prototype out to gain experience and further refine design. It is specifically contrasted with bad software. Deficiencies are to be addressed later. And yes you're correct in that this last step is often omitted.
However, if the conversation is one of "debt", then, in my experience, the necessary work is not done later, because there is no organization demanding repayment.
No big deal, right? But the next person comes along and says "I don't like those systems - I want to put tech3 on my resume (a little resume-oriented architecture, anyone?)", and they build a third system.
Now it's really hard to develop features across those apps. One has to be an expert in all those techs. Integration becomes 80% of the work. Now people have to build or buy software just to get anything done. And so it goes.
There's a name for this, used a lot on this site.
It’s made up by people who want things done in a different way than they currently are.
All currently good design recommendations will become wrong ones in the future. Today’s best practice is tomorrow’s technical debt.
There is just code that works and makes the company money, and code that doesn’t work.
Code that is readable and code that is not readable.
Starting out with “the right design” will become the wrong design when the market changes and the specs change under your feet.
Moving fast and making software that meets market needs requires going back and finding parts that need to be consolidated.
It’s just how building software works outside the trillion-dollar-company-or-academia world.
* To ensure accuracy of transactions, he should have built reporting to meet the needs of accountants.
* To support change, he should have kept his components separate and provided modest testing examples.
* To support developers and operations, he should have found and documented the external dependencies along with steps to verify that they are in place.
Whether product managers didn't uncover these requirements or the business didn't prioritize them, these aspects of the definition of "done" have more to do with the environment the engineer is working in and less to do with his or her work.
This is not choosing an easy solution; this is doing just what's needed instead of wasting time and effort on something that may never actually be needed.
Now, the design should be sound and follow good practices. This is how to limit the rework needed when new features are added because it is impossible to design to cover all possible future features: Do the minimum but do it cleanly.
My litmus test for technical debt is things we should have done a while ago; since we didn't, it's hurting us. Continuing on the same path will hurt even more. Better to fix the pain and then move on.
Working code that does what it needs and doesn’t require much maintenance is great code. Old != technical debt.
Are there specific patterns of code behavior, metrics which can be derived from analyzing code, quality scores that are non-controversial, which can be used to define this problem? Things that can be measured and managed in a canonical way? How can some knowledge of experienced CTOs be distilled so that we can start to automate some of this wisdom?
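One signal that is commonly automated is cyclomatic complexity, which real tools like radon or lizard compute properly. A rough, illustrative sketch using Python's standard `ast` module follows; the branch-node list and any threshold you'd apply to the scores are arbitrary choices for illustration, not a canonical quality metric:

```python
import ast

# A crude cyclomatic-complexity estimate: 1 + the number of branching
# nodes inside a function. Dedicated tools (radon, lizard, SonarQube)
# are far more precise; this just shows the measurement is automatable.
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.Try, ast.BoolOp,
                ast.ExceptHandler, ast.IfExp)

def complexity(source: str) -> dict:
    """Map each top-level-walkable function name to a rough complexity score."""
    tree = ast.parse(source)
    scores = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            branches = sum(isinstance(n, BRANCH_NODES) for n in ast.walk(node))
            scores[node.name] = 1 + branches
    return scores

sample = """
def tangled(x):
    if x > 0:
        for i in range(x):
            if i % 2:
                x += i
    return x

def simple(x):
    return x + 1
"""
scores = complexity(sample)
```

Tracking a handful of such numbers over time (complexity, duplication, test coverage) gets you part of the way to the "managed in a canonical way" goal, though none of them captures bad abstractions by themselves.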
So e.g. lack of functionality isn't tech debt, but a bad abstraction is usually tech debt.
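To make the "bad abstraction" case concrete, here is the copy/paste Button example from up-thread sketched in Python. Both versions work today, which is exactly why it's debt rather than a bug; all names are hypothetical:

```python
# Tech debt: both functions work correctly, but the markup is duplicated.
# Every future style tweak is now a multi-site edit, and the two copies
# will silently drift apart.
def render_primary_button(label):
    return f'<button class="btn btn-primary">{label}</button>'

def render_secondary_button(label):
    return f'<button class="btn btn-secondary">{label}</button>'

# Paying it down: one parameterized component handles both variants.
def render_button(label, variant="primary"):
    return f'<button class="btn btn-{variant}">{label}</button>'

# The refactor preserves behavior exactly.
assert render_button("Cancel", "secondary") == render_secondary_button("Cancel")
assert render_button("Save") == render_primary_button("Save")
```

Note that a missing feature (say, no "Cancel" button at all) would show up nowhere in this code, which is the sense in which lack of functionality isn't tech debt.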
If I've run on Version X of some DB software for a while, and tomorrow Version X + 1 is released, yes, I should recognize that staying on Version X is tech debt. But that may absolutely be the right thing to do for now.
- Payments couldn’t be processed in different currencies
- If the delivery system is offline, the code wouldn’t work
- Users with deactivated accounts could still access the system
- No automated testing
Ah no, this is what is known as a broken software development process.
If you are ignorant of the possible consequences of a decision, or choose to ignore or hide them, that's something else.
The technical debt accrued due to:
1. Poor code practices (C++; poor use or understanding of pointers and threads created most of the actual errors that make the program unstable)
2. Excessive copy/paste (so bad decisions propagated quickly)
3. Poor architecture (didn't scale, performance is technically fine but development costs exploded as new capabilities were added)
4. Poor testing (almost entirely manual, as the system grew this grew with it to the point of creating major delays in releases)
They're rewriting it now, keeping these lessons in mind. They're surviving because they're part of a larger org that can afford to keep them around for the rewrite, but essentially this now 10-12 year old system is dead. The other reason they're surviving is that they happen to have only one competitor, whose product costs 10x as much per seat. So their customers want them to survive if they can hit the quality improvements in the rewrite; failing that, they'll be disbanded.
The point of incurring debt is to make a "large payment" (deliver software) quickly and immediately, in exchange for paying interest over time.
If you can deliver a working prototype, scale, get funded, and grow your business quickly in exchange for technical debt, that's not really a bad deal.
So instead they chose to lift/stabilize it.
I think it's more like a rushed major remodel of a house. It's kinda bad, but there's really just no practical way to fix it without a teardown.
It provides leverage to incur debt at early stages; incur too much and it can really come back to bite, but a startup that incurs no debt will struggle to grow.
If I am the one paying the bills, would I really pay another developer for this refactoring? Yes: technical debt it is. No: just busy work.
1. Some objectively "right" or "best" way to write software exists. We refer to "patterns" and "best practices" as if those had the force of science behind them, even though at the same time we can't, as a profession, agree on what we mean.
2. Code we (or someone else down the road) have to maintain wasn't written correctly or perfectly in the first place. Programmers tend to hate maintenance work.
3. We imagine that we could write the perfect program if only our managers and customers didn't impose time and budget constraints, or interfere with their stupid product and marketing directives.
4. Failing to write perfect code that lasts forever indicates a failure on the part of the programmer or the team.
5. Perfectionism (a symptom of obsessive-compulsive behavior) and the increasing worry among programmers about how others perceive and critique their code.
Writing perfect code that requires minimal maintenance in the future would require knowing that future, and all of the changes to requirements and constraints that will happen during the lifetime of the code. We can only work with the requirements we know, and those we can reasonably anticipate. Trying to code for requirements we don't have is usually called "overengineering," which means something else in other engineering contexts (as a few other commenters pointed out).
I have worked in software development for 40 years so I know that almost no one likes doing maintenance work, especially on someone else's code, and especially with languages and tools no longer in fashion. Maintenance programming is often given to new hires and junior programmers, while the senior developers get to write new code and play with the latest toys. This class division among programmers, often expressed in terms of what makes a programmer junior or senior, gets exacerbated by the personal quality of programming. Programming is a craft, not a science, not even engineering, but we forget that and try to express aesthetic preferences in terms of objective quality, even when we don't have agreed-upon ways to measure objective quality of code.
Most software has a short lifespan, which means the "technical debt" will disappear when the code gets rewritten, replaced by a SaaS product, or the requirements change. Bridges are expected to stand for decades or even centuries. Jumbo jets have lifespans measured in decades. Very few software projects will stay in use that long. I routinely work on web sites that will stay online for less than a year, because they support short-term marketing goals. American companies have an average lifespan of 18 years, and startups are more likely to go out of business (or get acquired) than to stay in business for even a few years, so focusing on building the equivalent of the Great Pyramid in code at a startup or small company in order to avoid "technical debt" is almost always wasted effort.
Businesses factor maintenance into the lifetime cost of software just like they factor maintenance (and depreciation) into the cost of buying a fleet of trucks or a truckload of copying machines. It's the programmers who impose unrealistic goals of no/low maintenance and frame that inevitable maintenance work as a failure of the original design or implementation, i.e. technical debt.
We should try to do our best work, to make code that works (i.e. meets requirements and doesn't suffer from preventable bugs), and we should try to anticipate likely changes and make those easy for the next programmer. We should stop imagining a perfectly right way to write code, and stop thinking that maintenance work is beneath us, or a sign that the last team didn't know what they were doing.
I think programmers would enhance their value and learn a more realistic (and maybe humble) approach to their craft if they didn't think of programming as an isolated activity disconnected from the business and customers/users their code is meant to serve.
James Joyce wrote “What makes most people’s lives unhappy is some disappointed romanticism, some unrealizable or misconceived ideal. In fact you may say that idealism is the ruin of man, and if we lived down to fact, as primitive man had to do, we would be better off....” Joyce wasn't talking about programming, but I try to keep that in mind when balancing technical decisions against the larger set of business and customer concerns.
- Your startup is much more likely to succeed by being purchased by a private equity fund than anything else these days. Most tech acquisitions are now PE: over 75% of the big ones, even up to north of a billion dollars.
- The PE fund will send a diligence team. Depending on who they hire, your diligence team could be clueless consultants with checklists, or they could be real hackers who will look at your code, ask all the hard questions, and know a lot about tech debt. They will talk to you for many hours about tech debt. We typically interview for 12 hours and ask for a lot of metrics and docs. We do this for a living, so we're pretty good at sniffing out BS and knowing how to get devs to talk about their software realistically. Most devs are pretty happy to talk shop once they know we are actually developers who have been through the same process ourselves. Sometimes I feel like it's more like Developer Therapy. :-)
- If you have bad tech debt, it won't necessarily sink a deal. Deals usually go through or not for other reasons, like business model or revenue numbers - stuff that can't be fixed with the application of cash. However, we make a table for them of the estimated cost to mitigate the debt, and that can be serious cash (i.e. X extra senior devs and QA for Y months/years).
- This cash will come off the price. Or as Amazon apparently puts it... "a haircut". A haircut can easily run over a million on even a small company. So the money saved on devs for the years of accruing debt just comes off the acquisition price, but with a terrible interest rate.
Also of interest: we don't actually say it's always bad. We differentiate between intentional and unintentional debt. If you accrued debt to get to market quickly and are then doubling back to attend to it reasonably, that can be a very smart decision and may actually reflect well on the assessment. On the other hand, debt accrued because the founders only hired interns and new grads and let them run the asylum for years does not.
If I were to encapsulate in one sentence what I've learnt about tech debt in the last two years and 50-odd companies examined, it would be this: nobody saves money by trying to hire mostly junior developers. Nope nope nope nope nope. If you are running a startup, you need at least a couple of devs with six-figure salaries and 5+ years' experience, plus a CTO or lead architect with 10+ years or who really, really knows their stuff. Do not cheap out on this! At a guess, I'd say it's the most obvious differentiator between the hits and the really big haircuts. That said, I only look at companies who have gotten pretty far, so the main differentiator is likely different at the early VC stage. We're basically doing the first exit, and if we're talking to you, you have a successful business that some professional big-time investors want in on.
Hope that's of interest to some folks! :-)