
Cleaning bad code - ColinWright
http://www.gamasutra.com/view/news/176218/Indepth_Cleaning_bad_code.php
======
praptak
See also "Working Effectively with Legacy Code" book by Michael Feathers, IMHO
__the__ book on this topic.

[http://my.safaribooksonline.com/book/software-engineering-
an...](http://my.safaribooksonline.com/book/software-engineering-and-
development/0131177052)

~~~
16s
Very nice book. I use it often too. Highly recommend it.

------
cpeterso
My favorite technique for taming legacy code is to define facade interfaces
[1] that carve out new modules. When I'm trying to decipher old code (or a big
new library), the facade captures my understanding in a simpler API using
terminology familiar to me. I can quarantine ugly code and "discover" modules
that may have evolved out of the original code. Facades are also a great place
to insert debug logging and assertions.

With a well-placed facade, I'm like Alexander cutting the Gordian Knot of code
dependencies. :)

[1] <https://en.wikipedia.org/wiki/Facade_pattern>

------
gbog
I'm cleaning code right now and do it 90% in line with these good advices.

However, I differ on the over engineering issue. It is not enough to say
YAGNI. With YAGNI people will put a value in a new column of table, then later
on they will need another one, another column, then in fact we need to know
when these value where set, by who, and you get a table with 600 columns,
while what you needed to do from start was a simple event table, but YAGNI,
you know.

So, no, you need to think carefully for some parts of your code about what the
future might look like, specially for data structures and storage part. You
need to avoid the many traps, and only with some careful consideration and
some experience will you be able to do "the simplest thing that works" the
correct way.

So what is the correct way? This can be easily seen in examples: to me vim,
git, Linux, gnu coreutils, python, seem to be designed the correct way, even
after a long life (but git) they still have many user, the base is solid and
can support many fast changing projects.

~~~
mjfisher
I don't think that's how YAGNI should be applied. I've always found it to be a
great rule of thumb provided you're also constantly refactoring.

So in your example, perhaps the first column addition is justified. When it
comes time to add the second column, _that_ is the time to take a step back
and think of a more general abstraction. This is because you've just proved
for the first time that you actually need one. If you can't absolutely predict
future requirements (and humans generally speaking are bad at this), perhaps
you'd only ever have needed that one column in the first place. That way you
keep a clean, simple codebase until you know a more complicated framework or
pattern is required.

Of course, a rule of thumb is just that; I've seen YAGNI far too religiously
applied too. And I'm in absolute agreement about being careful to think ahead
carefully about the overall design and architecture of what you're working on
- otherwise it's easy to become trapped in local minima of your solution
space. It goes without saying different projects are suited to different
approaches - YMMV.

------
16s
One thing that I often do when dealing with old code that's performing low-
level operations and outputting some binary format is to md5 the output. After
that, I can change the code and I often do this when people ask for less
memory usage, better over-all performance, etc. Once the code has been
changed, I can prove that it uses less memory/CPU/whatever, and verify the
md5sum of my output against the original output so I know the code still
outputs as intended.

~~~
gruseom
It's a good trick, but you can't md5 all possible outputs. How do you choose
the inputs?

------
einhverfr
Most of this is really great advice. However I have run into horror stories
that would shock you. For example:

1) Delete most of the comments 2) Stuff breaks. WTF? The code was processing
its own comments!

Also I would add a third option here (the option we chose with LedgerSMB)
which is:

"Localize the bad code, and replace with good code, a block at a time."

Some code can never be cleaned up effectively. In this case, you separate out,
rewrite, and live with the fact that this will break some stuff while you get
something that, on the whole, is more robust. Now in this approach the thing
that is critical is you make _as few changes to the legacy code base as you
can._ You also look for other layers at which you can implement things like
new, needed security controls, or API's and you do everything you can to avoid
putting new stuff through the legacy sections.

~~~
praptak
> 1) Delete most of the comments 2) Stuff breaks. WTF? The code was processing
> its own comments!

My campfire story: change method name, stuff breaks. Method names had been
parsed in a wrapper object (Python's __getattr__ being abused) so that calling
'setFoo()' triggered a 'Foo_changed' event. In fact, the method names
constituted a mini-language with its own grammar and semantic. Apparently
someone had gotten high on reflection.

~~~
evilbit
Corollary: AOP is the enemy of refactoring.

You unwittingly changed one of the (seemingly unused/unnecessary) pointcuts
and your aspects aren't firing anymore. While not impossible to track down,
these problems are difficult to resolve.

~~~
ldh
This worries me every time I use AOP, but I mitigate that risk by writing
integration tests which assert that the aspects are doing what they are
intended to, and also to raise an alarm if somebody unknowningly breaks them.

------
lifeisstillgood
>> On the other hand, the long-term effects of never cleaning your code can be
devastating. Entropy is the code-killer.

I love that quote.

Can I have "Entropy is the code-killer" on a T-Shirt please.

~~~
davidcuddeback
You can make one on CafePress. Here's one that I just made:
[http://www.cafepress.com/cp/customize/product2.aspx?from=Cus...](http://www.cafepress.com/cp/customize/product2.aspx?from=CustomDesigner&number=687278700)

It's a little pricey for a t-shirt---probably because they only expect one
sale. I don't know if there's a better way to design custom t-shirts.

~~~
lifeisstillgood
I was not expecting that. :-)

------
mbel
I find the 7th point to be really great piece of advice, I simply love the
subtle way, in which author is pointing out advantages of functional
programming.

------
hypnocode
You could pretty easily title this "writing code right the first time" and
some adjustments to tense, I don't think anyone would know the difference.
Great article.

------
jamesaguilar
I like the part about objects. Never thought about it like that before.

