
Ask HN: How do you deal with legacy code? - ainiriand
More specifically, how do you manage to get &#x27;green&#x27; builds when there is code that is not compliant with style or inspections?<p>In php there is phpmd and phpcs that are really hard to pass when having some legacy code.
======
archevel
What I usually aim for is that all new code/commits conforms to style and
other requirements like having tests etc. Usually it is possible to configure
tools to only flag something as a bad build when the new code doesn't conform
to those qualities.

However, this means that as soon as you start mucking about in the legacy
code, that code need to be covered by tests and formatted to pass any style
checks and linters. That is not an insignificant amount of work so plan
accordingly.

Also, I find it useful to remember that even if the code is not to my liking
(for whatever reason) it might be consistent. I.e. maybe it uses some coding
style I am unfamiliar with and I would rather it used another it is often
detrimental to start changing it to be more inline with my preferences.
Consistently poor code is often easier to reason about than code that is a mix
of different preferences and patterns. This also goes for when thinking about
introducing new technology to an existing code base (see the lava flow
antipattern).

~~~
ainiriand
Tank you for the reply. How do you configure those tools to only flag
something as a bad build when the new code doesn't conform to those qualities?
Is it something specific to you build workflow?

~~~
cimmanom
Some will let you specify files or directories to exclude. Others let you
enable/disable one rule at a time.

It took us about 18 months of gradual chipping away to get our codebase to
pass a linter, enabling one rule or class of rules at a time.

We're still working on test coverage. The key is to make sure all new code is
covered (enforce in code review) and then very gradually backfill the rest.
Add reproduction before fixing a bug. Add end to end coverage before a
refactoring.

------
kazinator
Imposing new style on legacy code can be stupid; don't do that.

You're just adding to the "activation potential" required to make a necessary
fix in legacy code. Changes in legacy code should be minimal. Nobody who has
to make a fix in legacy code should be asked to upgrade its style and whatnot,
that is not related to the fix.

If it has its own conventions, just maintain those.

Or else, make a commitment to upgrade the legacy code to new conventions, as a
separate, dedicated task, whose goal is to do it thoroughly.

The last thing you want is a hodgepodge where code that has been recently
touched conforms to one set of conventions, and code that has not been touched
conforms to another.

I think a good way to approach legacy code (at a high level, big picture) is
to answer the question "how much of the value of what we are doing here is
propped up by this legacy code?" If the fact that the legacy code works is
paying your salary, respect that; maybe those people weren't idiots.

------
croo
Maybe you will be interested in the book "Working Effectively with Legacy
Code".

It teaches several ways of dealing with large untested codes in edible chunks.

~~~
ainiriand
Thanks for the hint. I recently landed on a huge and very old codebase (php,
>12 years old) and I am a bit scared.

~~~
croo
Then this book may be written exaclty for you.

One of the most memorable part is that the author defines legacy code as "code
without tests".

And you can handle it by the following:

0\. poke around in the code till you find a part you think you can test
something in isolation. If you need some _trivial_ changes to reach that,
thats okay. 1\. create some kind of test that will catch you if you break
something 2\. refactor and understand

The book is about offering several tactics for dealing with points 0. and 1.
and is very digestible, good read.

------
ThorinJacobs
I've worked on systems that had the CI system set up to only fail style or
inspection when the overall issue count was larger than the previous one. Over
time, your issue count will become asymptotic toward zero. There is a gap in
that if you add new code that has issues at the same time that you refactor
and remove issues, you won't be alerted. So...don't do that!

Tests have also been very helpful. So say I'm in a situation where the legacy
code is not style-compliant, let's forget the style-oriented analysis and
double down on adding tests. Once you have good test coverage, start to clean
up the styling and start to think about static code analysis as part of the
build.

~~~
shoo
> I've worked on systems that had the CI system set up to only fail style or
> inspection when the overall issue count was larger than the previous one.
> Over time, your issue count will become asymptotic toward zero.

Yeah, that's probably not a bad approach. it also means it is acceptable to
introduce new lint violations (this may be unavoidable) if the committer "pays
the tax" / "pays a tithe" of fixing some existing violation that might be
easier to clean up.

another way to implement this (which probably depends upon details of the
version control and lint tools being used) is, for each change, query version
control to identify what files have changed (or ideally subsections of files),
and evaluate before/after lists of issues using the lint tool.

for example, for lint tools that only need to read the content of a source
file (without e.g. executing or importing it or whatever), and support
processing a single file at a time (instead of needing to scan the entire
source tree) you can probably cobble something together reasonably easily if
no existing tool exists.

for example, if you are dealing with a legacy python codebase that is version
controlled using git, you could start using `pyflakes` that is able to analyse
each file in isolation, and merely needs to read the content of the file and
not import or execute it. it isnt that hard to query git to discover which
files were changed in a commit (e.g. `git show --name-status <commit>`) or
view the content of a file at an arbitrary revision (`git show
<commit>:<path/to/file.py>`). with enough glue scripting you could use this to
produce a custom report to show what the delta of issues (either a count of
sets of the actual specific issues) is for each file touched by a commit,
without having to scan the entire codebase.

------
SenHeng
I am currently converting a 4 year old Coffeescript codebase into modern
Javascript. There are certain parts containing heavy use of super, lots of
this and global variables from who knows where.

My solution is to ignore the parts that can't be easily fixed. List them up in
an issue or ticket somewhere to fix later if necessary. For now, modernising
the majority of the codebase is a more pressing issue than nitpicking
preferences.

~~~
pards
It's sad that a codebase that's only 4 years old is considered legacy. I
really feel sorry for the project sponsors in this scenario who presumably
paid good money for a solution that didn't stand up to the test of time.

~~~
shoo
if you take Michael Feather's definition of legacy code as "code without
tests", it's probably the case that someone in your org is producing some new
legacy code this week ;)

on the other hand, sometimes in a startup-ish scenario it is optimal from a
business perspective to bash out code to add features to drive revenue and
worry about tech debt for the far future hypothetical scenario of the startup
turning into a profitable business that can pay to bring people in to rewrite
everything.

that is, optimal in the sense of the alternative being slowly writing very
well thought out and engineered code and running out of money before having
enough customers to justify a longer-term engineering focus

------
slededit
Is the code actually bad or just different? Consider the costs of enforcing a
different coding style from what previously existed - is there a business case
for it?

------
xet7
Info about dealing with legacy code:
[https://www.legacycode.rocks](https://www.legacycode.rocks)

------
was_boring
> how do you manage to get 'green' builds when there is code that is not
> compliant with style or inspections?

See if there is tooling to do it for you. Google seems to favor this approach,
seeing as they put out YAPF for python and gofmt for Go.

------
slipwalker
usually, when i have the time for it ( ask management ) i start by writing
missing tests for the legacy codebase ( which, more often than not, have very
little automated tests ). Then, isolate and encapsulate the hell of it, to the
point that it can be replaced more easily, if/when needed.

OTOH, when there is no time available for this kind of care, i just go rodeo-
style, guns blazing, code crashing and long nights... fixing the most low
hanging bugs, or the most critical.

------
mabynogy
All source code analyzers I tried are broken. They generate too much false
positive. They don't help the programmer.

