
I cut my teeth as a developer on the Windows operating system. Because we were a platform, we had learned that when we shipped an API it became more or less permanent. So technical debt was a huge part of our planning and thought process.

When I joined Netflix I was assigned to add DASH support to the Silverlight web player. I spent weeks working on the new architecture, refactoring the streaming code, etc. One day my manager stopped by my desk and said that I needed to wrap it up; it was taking too long. I was still weeks away from being done. I explained to him that I was cleaning up a ton of technical debt and that's why it was taking so long.

He explained to me that they had different values on his team.

1. Most code will be rewritten every 2 to 3 years.

2. For code that doesn't get rewritten often, preserving battle-tested code trumps paying down technical debt.

Just remember that technical debt doesn't apply to all software.

The only point here that jumps out at me is that the manager came a few weeks too late to explain the development values.

>> I explained to him that I was cleaning up a ton of technical debt and that's why it was taking so long.

> The only point here that jumps out at me is that the manager came a few weeks too late to explain the development values.

Well... (and I'm not having a go at rsweeney21). They were asked to implement a feature, not to spend ages cleaning up technical debt, and being "still weeks away from being done" would be an alarm bell for me as a senior dev/manager. If the feature couldn't possibly be implemented because of technical debt, then GP should have raised this with their manager far earlier rather than just going ahead and fixing it silently. If the feature can be developed and released despite the technical debt, then go ahead and point out to your manager (as soon as possible) that this is going to be horrible for the dev(s) working on the next iteration, but the chances are your manager will insist you get on with the feature. Communication is at fault here.

We all work in projects that have technical debt and we'd all love to go back and fix these things. But you also have that obvious tension which is the fact that you also need to release features which have been identified by the business as another positive cash-income thing because you're competing against other services - and sometimes these features are very time critical.

The GP seems reasonably experienced (working at MS, Netflix) and should probably know this (unless there are other mitigating factors not included in your comment).

Yeah, I feel like this one was on me. I was new to the job and wanted to impress my manager. I could feel the refactoring getting out of hand. I should have had a conversation with my manager to ensure I was going in the right direction before he felt the need to tell me I was taking too long.

Don't be too hard on yourself; it takes two to tango. You are your manager's work, as much as your code is yours. Unless you lied about progress it's a miscommunication with no right or wrong.

Hey mate, hope I didn't come off too critical of you personally. And you know in my younger days I'd likely have been the same.

No offense taken. You're all good! :-)

Sounds similar to some of my learning experience too, as a maker and manager

This is central to Fowler's point, which is that technical debt only needs to be paid off if it gets "activated" a lot - this is where the metaphor breaks down, because typical debt accrues over time.

If you never touch code again, but that code was rushed and is ugly, there's no good reason to pay off the debt.

Similarly, if the code's getting thrown out (and if you design code to get thrown out), there's less reason to prioritize cleanup.

That's often overlooked, yeah: if it ain't broke, don't fix it. This goes against the instinct of a lot of people who read code for the first time, myself included.

A recent example: React introduced hooks, which effectively means that there's a new de facto standard way to write components and you shouldn't be writing components as JS classes anymore.

The kneejerk reaction is then "We need to rewrite all our class-based components to use hooks!", which can take quite a long time (and it's tedious, boring busywork).

Luckily the React documentation on hooks itself [0] basically tells you to not bother unless you're actually going to work in that component anyway, and even then you should make sure everyone else in your team understands them; classes are fine, it'll keep working for a long time, resist your kneejerk reaction.

[0] https://reactjs.org/docs/hooks-faq.html#should-i-use-hooks-c...

Yeah, I try to align tech debt cleanup with 'touch points', since that's generally an indicator that the code needs work done on it anyway.

It's sometimes controversial to say "We should not prioritize tech debt" but I feel pretty strongly that it's true. We aren't paid to write beautiful code, we're paid to write code that gets the job done. If code is getting the job done, even if it's fragile, our job is done. Only if that code is impeding new work should we prioritize it.

> 1. Most code will be rewritten every 2 to 3 years.

a.k.a. defaulting on the technical debt

No, this is still a payment against technical debt. From the top of Martin Fowler's article:

"The extra effort that it takes to add new features is the interest paid on the debt."

If adding a new feature starts with, "just throw out all the code and start from scratch", that's a much larger cost over the long term than building code that's easier to maintain and refresh.

With very few exceptions, code that is "easy to maintain and refresh" over any appreciable time horizon (say, longer than 2-3 years) requires the kind of engineering effort that the modern world of "CS trivia is good interview material" generally derides. I'm not even talking about "real" engineering processes as might be used for space systems or whatever; I mean just basic requirements analysis and technical documentation.

The cost of a rewrite at that point is typically less than the cost of straitjacketing the existing codebase to support all the new requirements; otherwise the rewrite is less likely to happen. So, defaulting it is.

(Mind you I'm not fully sold myself that 'technical debt' is a useful analogy at all)

Not necessarily. More like an asset that has finished its designed lifespan and the org has moved to a new asset.

That raises the question: what distinguishes actual technical debt from short-lived assets?

You can't fully make that distinction until it happens.

Sometimes you can say "I know this will need to be rewritten later" and be pretty sure you're seeing technical debt be accrued.

But other times you have no idea. It might be a tool that was intended to be used for a few weeks, but turned out to be incredibly useful and lasted years. Had you known, you might have written it better in the first place.

Technical debt isn't like financial debt in that there's nobody with an accounts sheet telling you that you owe them anything from the start. You don't necessarily make a deal with anyone. So you can't judge its existence in the same way.

Schrödinger's cat's technical debt.

Or the target's moving fast enough that it's inefficient to build a gold-plated solution when a quick hack will get you out of trouble for now.

Research code is much the same. Your code doesn't have to be great (or even OK, to be honest) as long as it runs well enough to provide the results you need for your paper.

A technology no longer being supported (e.g. Flash, Silverlight, IE) is not technical debt though, it's external influences.

Likewise, rewriting a front-end in the framework of the week doesn't have to be because of technical debt; Angular 1.x was fine and it's still fine, however you'll see a lot of upgrade or rewrite projects happening right now not because of tech debt, but because people simply don't want to work in Angular 1.x anymore, or an employer can't find competent people that want to work with it anymore.

Likewise, COBOL applications aren't bad because of technical debt per se; it's just become very hard to find people who can maintain them.

Never really seen that in practice, code is "touched" but rewritten?

It's downright stupid to say all code will be rewritten every 2-3 years, because in reality this never happens as completely as it's stated.

As the article mentions, it's about figuring out what makes sense to fix now, or wait until later.

But yes technical debt applies to all projects. The way we interact with it just varies.

Nice straw man. Nobody said "all code will be rewritten every 2-3 years". He said "most code" will be rewritten every 2-3 years. Also, like I said, these were values on his team. We were a front-end application and every 2-3 years Netflix would rebuild their player UI which usually involved a complete rewrite.

I'm just saying that if you have it in your head most code is going to be rewritten every 2 years, it's understandable if you don't bother making things easy to work on, and just hack it together.

An argument that "eh, most code is rewritten anyway, who cares" is what I'm reacting to, and have seen countless times myself.

But I don't want to say there is never a time someone will say, "this code is going away, don't spend too much time fixing it now" and it's really, really good advice.

It's funny that you mention this specifically in relation to UI code. I was thinking about it after your first comment, and basically every product that I've worked on that fit this style of development has been UI related. And indeed, most UI projects have had almost complete rewrites every 2-3 years for one reason or another.

The ones that didn't were effectively stalled projects. This makes me think that technical debt really can be thought about in a completely different way for frontend code in many cases.

A few years makes sense for UI for consumer products. The evolution of displays has been tremendous in the last decade and a half. Desktop screens have doubled in size and resolution, while mobile and tablets just appeared overnight and went from nothing to where they are today.

Products have changed, but also UI design tends to be driven (or at least influenced) by marketing. And marketing trends change regularly driving a need for markup and styling to be revisited.

In theory, this is just marketing, but in practice it often turns into much more.

In the case stated though, they were talking about a Silverlight player.

I don’t think they quite got to the point where they had to rewrite it.

Sure they did. They rewrote it in another language and phased out the Silverlight language completely.

Well, that sounds about right from my experience, for some type of code and organizations.

The original developers have left after a few years, newcomers don't understand the system and may not even try to. It will ultimately get rewritten in newer technologies and frameworks, that happen to be nice for their resume, whether it was obsolete or not.

Indeed. 2-3 years is short, and that way of thinking could discourage doing things right in the first place because ‘it’ll be rewritten in 2-3 years so it doesn’t matter’.

Given that you were working on Silverlight, his decision to stop you spending more time on it was remarkably prescient.

It should have been communicated earlier though, so you wouldn't have wasted weeks on soon-to-be dead tech.

> Most code will be rewritten every 2 to 3 years.

That's kinda scary.

Is it really? It's another way of saying that conditions in the tech industry change every 2-3 years, which isn't all that controversial a statement.

Sometimes it's because developers just want to change tech to enhance their cv or keep themselves entertained. That's scary.

Sometimes it's rewritten because it has become so unstable from debt that wasn't paid off, even though the total cost of writing it twice is much more than paying the debt off as you go along. That's scary too.

I thought this was the old Fred Brooks quote, "Plan to throw one away; you will anyway."

Although then you butt up against the "Second System Effect", so you're damned if you do, damned if you don't.

Within consumer tech, it's more like "Plan to throw it all away; you will anyway."

I used to measure code lifetime in half-lives at Google: the half-life of most of the code there was about a year, meaning that after a year, 50% of your code will have been deleted. It's pretty accurate: by the time I left after 5 years, about 97% of the code I'd written had been deleted. Ironically, I'm told that my one contribution (after 10 years) that still exists is an attribute-renaming CL that I wrote over break at a team summit; basically the whole team agreed it was a good idea, we would never have a chance to fix the problem later, so I just went and did it before the framework got too entrenched to change. Meanwhile stuff I slaved over for months, sometimes even stuff that was directly sponsored by a VP (who is no longer there) or got commendations from the CEO (who is also no longer there) was gone within 1-2 years.
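For anyone who wants to check the half-life arithmetic above, here is a minimal sketch. It assumes simple exponential decay (a constant chance of deletion per unit time), which is my reading of "half-life" and not something the comment spells out; the function name `fraction_deleted` is made up for illustration.

```python
def fraction_deleted(years: float, half_life_years: float = 1.0) -> float:
    """Fraction of code deleted after `years`, assuming exponential decay.

    The one-year default half-life is the figure quoted in the comment;
    the decay model itself is an assumption.
    """
    surviving = 0.5 ** (years / half_life_years)
    return 1.0 - surviving


if __name__ == "__main__":
    for years in (1, 2, 3, 5):
        print(f"{years} years: {fraction_deleted(years):.1%} deleted")
```

With a one-year half-life this gives 1 - 0.5^5 = 0.96875 after five years, which lines up with the "about 97%" figure.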

In my experience prototypes / proofs of concept always end up in production. Might as well make a half-decent first attempt.

Often times if something needs to be rewritten, it means that the feature became popular enough that it needs to scale up. It is not a bad problem to have.

Even well written code might not scale or be flexible to changing requirements. The feature could even be removed. The effort it took paying off tech debt prematurely would have been a waste.

I think that's a different statement, actually.

Expecting code to be replaced every 2 or 3 years is nervous-making for a couple of reasons.

First, new code is always less reliable than old code, so -- all other things being equal -- devs should lean toward keeping the old code rather than replacing it.

Second, code should be architected so that any changes required to adapt to changing conditions should be limited in scope. The bulk of code for most projects should only very rarely need changing to adapt to changing conditions.

You're presumably used to operating in an enterprise environment with O(1000) customers and O(10) developers, not a consumer environment with O(100M) customers and O(1000) developers.

Reliability issues at large consumer tech companies will usually be caught by the QA/canary/SRE process. If they're not, you'll hear about them soon enough with 100M users banging on the product, and then just do a rollback and fix it at a more leisurely pace. Consumer expectations in non-tech industries have gotten so low that you can burn down whole towns, poison people, and leak millions of records of personal data with few consequences to the company; not being able to access your favorite video for 15 minutes is comparatively minor.

Also, the type of changes in market conditions consumer companies face are usually not those you can architect around. They include things like "We are no longer shipping DVDs to customers; we are streaming content online", "We no longer write web software, we're a mobile-first company", and "our business model is no longer selling personal data, it's payments". There is no architectural fix for "your product is canceled".

> You're presumably used to operating in an enterprise environment with O(1000) customers and O(10) developers, not a consumer environment with O(100M) customers and O(1000) developers.

I have a great deal of experience with both environments.

I have no idea what you're trying to say. O(1) and O(1000000) are the same thing, the O() notation ignores the constant factor.

> Is it really? It's another way of saying that conditions in the tech industry change every 2-3 years, which isn't all that controversial a statement.

I'd say it's an outright incorrect statement: fads come and go, but the revolutions (like the move to web apps) are rare and take decades to complete. If everything around you changes every 2-3 years then you're hanging around fashionistas, not engineers.

Uh... Really?

The example at hand was Netflix... The bulk of the platform seems pretty stable to me. If they are rewriting their best code every 2-3 years, that'd be a red flag to me.

I worked on Google Search for 5 years. You would probably say Google Search is pretty stable - at least, when was the last time you noticed it go down? The half-life of code I worked on was about 1 year. Over a 2-3 year period, 75-90% of it would be gone.

We did things like incrementally rewrite a million-line binary from C++ into Java while it was running, or completely change the indexing system from batch processing to continuous updating, or grow the number of documents indexed from about 80B to 1T. There's a huge iceberg of development that you don't see, much of which has to do with scalability (you generally have to rewrite every time a key metric grows by a factor of 10) and much of which is experimental, trying out new features to see what resonates with the userbase.

Very curious about what that Java program is.

GWS (Google Web Server). Powers websearch, responsible for rendering everything you see when you run a search. It used to have a Wikipedia page but basically everything on the Wikipedia page was wrong so the admins eventually deleted it (edit: apparently it's back. Basically everything on the wiki page is still wrong, but at least it doesn't claim GWS is an Apache derivative now). It's the oldest continually-pushed binary at Google (originally written 1999 by Craig Silverstein; at least when I left it still had code in it from Craig, Marissa Mayer, and Sergey Brin). It's changed programming languages twice (from C to C++ around 2005, and from C++ to Java etc. around 2010), and when I left was a frankenmonster of C++, Java, and two proprietary DSLs. My understanding is that it still exists, though most of my friends at Google have since moved on to other teams.

Thank you. Can you talk about the reasons that necessitated the rewrite from C++ to Java? I didn’t realize Brin wrote any code that wasn’t Python either.

I personally wasn't a big fan, I liked C++ (and Python - there was an embedded Python interpreter in 09-10 that was a casualty of the rewrite, and IMHO Python was a lot more productive than Java for experimentation).

There were some pragmatic reasons though. It's very difficult to multithread C++ correctly, while Java at least has a proper memory model and thread support (note that this was before C++11; at the time C++ had no standardized memory model at all). Debugging core dumps in production sucked. Most of the newer parts of the company (GMail, Docs, Google+, etc.) were written in Java, and the rewrite let us share code with them. Compile times sucked, and Java let us pluginize the architecture, load code at runtime, and build & push each component team's codebase independently (as well as shut them off independently if they started crashing).

I wonder how much of that is due to using the current fad tech... my longest running major piece of code (untouched 15 years+), I implemented in PL/pgSQL. Rock solid, well understood technology, isolated from the current hip web framework that is probably calling into the views. Somehow I think the technology choice is a big part of what ultimately made the program a great investment for the business.

> Just remember that technical debt doesn't apply to all software.

I'm not following. You still need code that could be rewritten and deleted every few years. Any code that couldn't just be deleted as needed wouldn't fit that plan.

So whatever forced code to outlive its welcome is technical debt in that scenario. Examples could be code that used outdated middleware, maintained arcane ETLs to keep data accurate, consumed said data in unconventional ways, code that wasn't repackaged and deployed regularly, etc.

> I spent weeks working on the new architecture, refactoring the streaming code

Seems weird to do that without talking to your boss about it.

How many bosses have you had that suggested paying off technical debt? For me it was always a never ending "next month, after we have rushed out this new feature".

Some, but usually the rule is to not let them know - if they don't understand the problem, they shouldn't be burdened with deciding when and how the problem should be solved either.
