You Don't Find Bugs in Legacy Code (davidtanzer.net)
39 points by struppi on Jan 23, 2017 | hide | past | web | favorite | 27 comments

As someone that has spent the last two years migrating legacy code, I can assure you, you do find bugs in legacy code.

I've seen bugs that users don't notice.

I've seen bugs that users have just gotten used to.

I've seen bugs whose failures were dismissed, for many years, as the system being a bit temperamental.

I've seen bugs that financially affected thousands of customers, and remained unnoticed over the course of several years.

I've even seen bugs that only remained because, by luck, the preconditions to trigger them had never been met.

I can assure you, you do find bugs in legacy code.

And yet, by fixing any one of those, you'll probably break something else. Something that relies on the broken behaviour, or 'corrects' it.

Any bug in a large codebase that's older than a certain point becomes a feature. ;)

And that's how we got Spacemacs.

> And yet, by fixing any one of those, you'll probably break something else.

That's an empirical claim and I doubt it's true. Sure, sometimes you'll break something else, but 50% of the time plus? I'm gonna need to see some evidence.

Quick website comment: If you're going to ask if I want to subscribe to your agile development course and your newsletter, I'd prefer if those popups came at the same time rather than several seconds apart. It's annoying enough to get one such popup.

If you read the article with NoScript installed in default deny mode, you get zero popups while reading the whole article.

> None of the business analysts could remember what this statistics feature was used for, so we tried to find out who uses it. We looked at the logs of the last 6 months and found out: It was never used! Not a single time!

I've seen a lot of similar stories. General advice: if you want to find out whether anyone really uses something, the relevant time span is a year and a day. It's theoretically possible longer would be needed, but a year and a day will cover almost all cases.
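The "year and a day" check above can be sketched as a simple log scan. A minimal, hypothetical example, assuming each log line starts with an ISO-8601 timestamp followed by a message (the function and log format here are illustrative, not from the article):

```python
from datetime import datetime, timedelta

def feature_used(log_lines, feature_name, now, window_days=366):
    """Return True if any log line mentions the feature within the window.

    Assumes each line starts with an ISO-8601 timestamp, e.g.
    '2017-01-20T14:30:00 invoice_export run'.
    """
    cutoff = now - timedelta(days=window_days)
    for line in log_lines:
        stamp, _, message = line.partition(" ")
        if feature_name in message and datetime.fromisoformat(stamp) >= cutoff:
            return True
    return False

logs = [
    "2016-04-01T09:00:00 stats_report generated",
    "2017-01-20T14:30:00 invoice_export run",
]
now = datetime(2017, 1, 23)
# Used 297 days ago: inside a year-and-a-day window, outside a 6-month one.
print(feature_used(logs, "stats_report", now))                    # True
print(feature_used(logs, "stats_report", now, window_days=180))   # False
```

The 6-month scan in the article would have reported the feature as dead; the 366-day window catches the annual use.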

Very good advice, especially for obscure features. If the feature was only used when filling out some field on the annual tax return, for example, then no one using it for the past 6 months could be perfectly reasonable if you pick the wrong 6-month window.

If you find a feature nobody has used for the last 6 months... and it's to do with some long-term calculations... it's probably safe to say they use it annually. We have end-of-year reports that generally kill the database when people try to run 5 years' worth of data through them every Christmas... they never touch them during the normal working year though.

It's only mentioned in a single line, but there are a lot of real bugs in legacy code that the users actually learned to live with. It's the software's personality now. And when someone has learned to live with a bug, it will be missed if it is suddenly not there any more.

In the past I considered that a bug in human behaviour, but actually it is admirable. You give people a broken tool, but instead of failing, they find a way to succeed using that broken tool. And of course after the effort they are not as willing to change to anything else. Don't change a running system and such.

"Legacy" is a very loaded term -- generally used in a derogatory way.

There is an old engineering truism: "if it ain't broke, don't fix it"

When Lindbergh flew across the Atlantic in the Spirit of St. Louis, Wright Aeronautical, which made the engine for his plane, wanted to do its best to make sure he made it. They carefully built and checked the engine and hand-delivered it with a note: "don't monkey with it" (or words to that effect). Lindbergh was a careful guy and he didn't.

Businesses and developers need to distinguish between warranted or needed changes to working systems (software or otherwise) and ego trips.

Real systems are built in limited time, with limited resources, even if they have a multi-billion-dollar budget. Invariably there are things someone can make an issue of, even in a system with a perfect performance record. Why did you hard-code 3.14159265 instead of having a general variable for PI initialized to 3.14159265? And by the way, why do you have only 8 decimal places -- wouldn't 16 be better? It is an ego trip, and it is the ancient territorial imperative: marking your territory by deprecating the previous person or team -- or a colleague.

Joel Spolsky wrote a famous and widely ignored blog post on this sort of thing:


Take a deep breath and ask yourself: is this something we really need to do or am I (are we) showing off/putting someone down/behaving like a spoiled adolescent?

In the sales community, "legacy" seems to mean "what we didn't sell you."

So users of old software may be aware of various defects and have changed their behaviour to make it work for them. It's important to consider the human element when "fixing" legacy software.

Aah yes, every change breaks someone's workflow: https://xkcd.com/1172/

You can't really safely alter legacy code unless it has decent unit tests. Red/green programming (test-driven development) is absolutely essential in cutting down on technical debt.

Unit tests don't help you determine whether a given behavior is incidental or necessary. For a lot of code, including the statistics code in the article, correctness is a dynamic property: dependent on its context, environment, clients, etc. So working with legacy code is as much project management and communication as it is engineering best practices.

Consider Chrome's recent change to mark sites as insecure if they have a password field without TLS. Will my grandma's bank page be reported as insecure? There's no way to know a priori, or through isolated unit tests. The Chrome team has to go out into the Internet jungle, reaching out to the site owners and pushing until the major sites are ready, and any holdouts can be considered acceptable casualties.

With that understanding, whether code is "legacy" isn't about age or budget or level of support; it's about how much influence you have over your clients. Some code is never used and so never becomes legacy, no matter how old it is. But other code is born legacy: it gets clients immediately and pays the price for being popular.

Another problem with fixing something like that statistics code is that it makes it impossible to compare new results to old results.

I have a piece of old code like that kicking around myself that tends to overestimate a certain value. I also have some new code that uses a better model and gives a more accurate value. The problem is I can't really roll out the new code, since it will make all the new things look worse than all the old things, even if they're actually the same or perhaps even better. So I use the old code, because even if it's wrong, at least it's wrong in the same way as all the other results, and people can make relevant comparisons.

> Unit tests don't help you determine whether a given behavior is incidental or necessary.

That's his point. If you have a test-case for the behaviour, you know that this is how it should actually behave.

Anything not covered by tests, however, is naturally shrouded in uncertainty: for any weirdness, quirk or odd behaviour, should it be there or not? Hard to tell when it's not formalized in a unit test.
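Formalizing an existing quirk is usually called a characterization (or "pinning") test: it records what the code *does*, not what we think it *should* do. A minimal sketch with a hypothetical legacy function (the function and its truncation quirk are invented for illustration):

```python
import unittest

def legacy_rounding(amount):
    """Hypothetical legacy code: truncates the third decimal instead of rounding.

    Users may depend on this quirk, so the tests pin it down as-is."""
    return int(amount * 100) / 100.0

class CharacterizationTest(unittest.TestCase):
    """Records current behaviour, bugs and all, so refactoring can't silently change it."""

    def test_truncates_third_decimal(self):
        # 12.349 'should' round to 12.35, but the legacy code truncates.
        self.assertEqual(legacy_rounding(12.349), 12.34)

    def test_negative_amounts_truncate_toward_zero(self):
        self.assertEqual(legacy_rounding(-12.349), -12.34)

unittest.main(argv=["characterization"], exit=False, verbosity=0)
```

If a later "fix" makes the function round properly, these tests fail, and that failure is the prompt to go ask whether anyone relies on the old behaviour.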

Assuming they were testing behavior and not shooting for some coverage metric. I come across plenty of bugs with tests against them. I have a rather strict rule personally, every test should start with a line that says what it's testing.

I commonly hear this, and yet it is so wrong. I don't understand how this idea that unit tests significantly protect against regressions has spread so far. Classic texts such as "Growing Object-oriented software with Tests" note, rightly, that end-to-end tests are for external reliability.

Unit tests, which don't operate on that abstract high level, give you relatively little safety. Sometimes they can ensure that an algorithm is not broken, but algorithms are usually not where legacy code has issues. Often unit tests are at the level of "assert that X function calls Y stub Z times".

> "assert that X function calls Y stub Z times"

That's a terrible test that should never pass code review. It's worse than no test at all - it's brittle and guarantees nothing.

Writing good tests is _hard_.
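The brittleness complained about above is easy to demonstrate. A hypothetical example (the `PriceService` class and its cache are invented): the first assertion pins an implementation detail, so removing the cache would break it without breaking any behaviour a caller can observe; the second asserts the observable result instead.

```python
from unittest import mock

class PriceService:
    """Caches backend lookups so each SKU is fetched at most once."""

    def __init__(self, backend):
        self._backend = backend
        self._cache = {}

    def price(self, sku):
        if sku not in self._cache:
            self._cache[sku] = self._backend.fetch(sku)
        return self._cache[sku]

backend = mock.Mock()
backend.fetch.return_value = 100
svc = PriceService(backend)
svc.price("A1")
svc.price("A1")

# Brittle: pins the number of backend calls, an implementation detail.
# This fails the moment the caching strategy changes, even if prices stay correct.
assert backend.fetch.call_count == 1

# Better: assert the result the caller actually cares about.
assert svc.price("A1") == 100
```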

I totally agree with you. In the end you need different kinds of tests (end-to-end, unit, ...). And with legacy code, you probably want to start with end-to-end tests or integrated tests.

But in that specific case, no amount of tests would have helped us...

I agree with you, but I would drop the word "unit". When you start to work on legacy code, you often can't write "real unit" tests. The important thing, at this point, is to bring the code under some kind of test harness (Golden Master, executable specs, ...).

Later, when you have some test coverage, you can add unit tests. In some cases, you can even try to refactor without tests first, and then add tests once you have a testable API. That's the whole point of the "BabySteps Timer" kata I have linked in the post...
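A Golden Master harness can be sketched in a few lines: run the legacy code once, record its output verbatim, and compare every later run against that recording. Everything here (the `legacy_report` function, the file name) is hypothetical, just to show the shape of the technique:

```python
import json
import os

def legacy_report(orders):
    """Hypothetical legacy code we don't yet understand well enough to unit-test."""
    total = sum(qty * price for qty, price in orders)
    return {"total": total, "count": len(orders), "avg": total / len(orders)}

GOLDEN = "golden_report.json"

def run_golden_master(orders):
    actual = legacy_report(orders)
    if not os.path.exists(GOLDEN):
        # First run: record current behaviour, bugs and all, as the golden master.
        with open(GOLDEN, "w") as f:
            json.dump(actual, f)
        return "recorded"
    with open(GOLDEN) as f:
        expected = json.load(f)
    assert actual == expected, f"behaviour changed: {actual} != {expected}"
    return "matched"

orders = [(2, 9.5), (1, 30.0)]
print(run_golden_master(orders))  # first run records; later runs compare
```

Note this only protects refactoring; as the parent comments say, it would not have answered the real question in the article (who needs the feature and why).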

Still, in the case I described in the post, a test harness would not have helped us. The only thing that would have helped us would have been to ask one of the few users who needed the features. And we didn't know them...

You can, it just requires more work. Once you manage to put some code under a test harness, you can write your own unit tests. If the code is too difficult to follow, then some amount of fuzzing is likely to be needed between the commits, with the old unaltered code as the reference, and any difference in behaviour can be treated as a failing unit test.
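The fuzzing-against-the-old-code idea is differential testing: keep the unaltered legacy function as an oracle and feed random inputs to both versions. A minimal sketch with invented tax functions (the refactor here is trivially safe; real refactors are where divergences show up):

```python
import random

def old_tax(amount):
    """Unaltered legacy implementation, kept around as the reference oracle."""
    return round(amount * 0.19, 2)

def new_tax(amount):
    """Refactored version under test."""
    return round(0.19 * amount, 2)

def fuzz_compare(trials=1000, seed=42):
    """Feed random inputs to both versions; any divergence fails the run."""
    rng = random.Random(seed)
    for _ in range(trials):
        amount = rng.uniform(0, 10_000)
        assert old_tax(amount) == new_tax(amount), f"diverged at {amount!r}"
    return trials

print(fuzz_compare())
```

A fixed seed keeps failures reproducible; in practice you'd also replay any diverging input as a permanent regression test.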

The problem is that new code becomes legacy code about ten minutes after it is written.

Hence why it's important to unit test legacy code using a mocking tool like Typemock Isolator

