

The Economics of Testing Ugly Code - edw519
http://www.1729.com/blog/EconomicsOfTestingUglyCode.html

======
wynand
The author mentions that changing tested code decreases its value. The
steepness of this curve varies greatly depending on the language. The strong
guarantees of a language like Haskell mitigate the risk and so ensure a less
dramatic drop.

But what I find more alarming is that the author completely misses the human
side from the programmers' perspective: a good programmer who is forbidden
from changing ugly code will become frustrated and will either become less
productive or leave.

If there are developers who don't care about ugliness and are very
productive, then I am in big trouble (since they would be better hires from a
business perspective than me). However, there is quite a strong connection
between the caliber of a developer and his or her aesthetic sense in code.

If one looks at projects like Squeak, it's clear just how much good developers
can achieve with a relatively small amount of well crafted code. A lot of good
programmers are unable to reach this kind of potential because of the coding
tar pits they have to negotiate every day at work.

~~~
briansmith
"The strong guarantees of a language like Haskell mitigate the risk and so
ensure a less dramatic drop."

Haskell's strong typing _may_ help prevent some bugs, but Haskell's evaluation
model means that even a slight change (literally one character) can change the
performance characteristics of your program, in space and time, by orders of
magnitude. Because of that, I've found that Haskell programs require _more_
testing than Python programs, to ensure there are no edge cases that have
space leaks.
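The canonical one-character example is `foldl` versus `foldl'` from
`Data.List`: the lazy fold accumulates a chain of unevaluated thunks, while
the strict variant forces the accumulator at each step. A minimal sketch
(names are my own):

```haskell
import Data.List (foldl')

-- The only source difference is one character (the apostrophe), yet the
-- space behavior differs by orders of magnitude on large inputs.
lazySum, strictSum :: [Int] -> Int
lazySum   = foldl  (+) 0  -- builds a thunk chain: O(n) space, can blow up
strictSum = foldl' (+) 0  -- forces the accumulator each step: O(1) space

main :: IO ()
main = print (strictSum [1 .. 1000000])  -- 500000500000
```

Both folds compute the same answer, so only profiling (or a crash in
production) reveals the difference - which is exactly why space behavior
needs its own testing.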

I am a fan of Haskell for some classes of programs (like compilers and
inference systems) where you can spend a lot of time perfecting a strongly-
typed data model that is fairly unchanging--one where "Big Design Up Front"
makes sense. But, I cannot say that Haskell's strong typing is always a net
plus when the data model needs to be frequently updated. In Haskell you often
have to write extra code, or code that is more abstract, in order to get
programs to typecheck. Then, you still have to write almost the same kinds of
unit tests that you would have to write if you were using any other language.
With Python or Ruby you can skip the type system workarounds, which often
means you have less code to deploy. I am a big believer that less code ==
better code.

Dealing with ugly code is part of being a good developer. I believe you should
leave ugly code around until you need to change it. When you need to change
it, you can rewrite it to make it more beautiful, but only if that
beautification facilitates the improvement you were originally tasked with
making. But, it is usually a bad idea to go around tidying up code for the
sake of tidiness.

Also, think about this: if the original code is really unclear, then what
confidence do you have that you understand it well enough to rewrite it
correctly; if the code is clear enough to be easily rewritten, then why does
it need to be changed?

~~~
wynand
My Haskell experience is too limited for me to have experienced performance
changes resulting from small code changes. OCaml has a fairly predictable
performance model, and a language that combines aspects of OCaml and Haskell
(this is the point where someone shouts "Scala!") would go a long way toward
reducing brittleness in the code without requiring much testing for
unexpected performance issues.

I thought more about the original article yesterday and realized that -
regardless of the language one uses - the author made a mistake by implying
that the quality peak is equally high as one moves to the right of the graph.
If by "beautiful code" we mean well-designed code, then changes will have
less far-reaching effects. The quality peak will be lower (relative to the
graph) and hence the descent induced by changes, less steep. And this happens
because as you rightly state, less code == better code.

If a company needs to be agile (in the general sense of the word), their
software should also be easy enough to adapt to changing circumstances. Given
that good code allows for such changes, the extra effort to produce good code
pays for itself.

On the other hand, for a piece of software that is at the end of its lifetime,
there is little need for sweeping changes. Beautification of end-of-life code
is a bad use of time.

You are right that a programmer should be willing to endure some ugly code.
Anything which encodes real-world relationships (which are complex and messy)
will have some ugliness in it. A programmer who rails against this makes
life difficult for everyone around them. But this can be taken too far, and I
have known people who were too conservative in this regard; this made the
software unnecessarily hard to maintain and the programmers unhappy [1].

The answer to your last statement is that if you have to modify barely
understandable code, then it is worth refactoring it (and, as a last resort,
rewriting it), even if you don't understand every aspect of it. If you don't,
you will likely spend more time trying to paste your code in somewhere in the
hope that it will work, and in so doing the crufty monster will get even
bigger and harder to maintain (and you won't gain an understanding of the
code, which is a very valuable asset to its owners).

[1] No company has a legal obligation to keep employees happy. But such
companies can expect mediocrity if they're lucky, and probably much less.

Edit: I obviously don't know HN's markup. Fixed italicization.

~~~
demallien
Surround phrases to be shown in italics with asterisks...

------
swombat
Interesting. I'm not entirely sure how well the argument holds, but it does
put forward some interesting points.

One of the weaker aspects of this article is that the author doesn't define
"value". I'll take a stab at it: for a business, the value of its software is
how much money it can make or save from it. There are two components to this
value: an immediate, obvious one ("can we sell/use it tomorrow?") and a long-
term, less obvious one ("can we still sell/use it in two years' time?").

The graphs and explanations, I think, apply fairly well to the "immediate
value", but less so to the long-term value. The long-term value of code you
can keep evolving, cheaply, to meet user demands for years, is much greater
than that of ugly code that will become a viscous nightmare within a few
months. Immediate value is very important too, but generally is focused on at
the expense of long-term value, particularly by non-developers, who aren't
fully aware that code can reach a point where it's so nasty that the only
solution is a bullet to the head and a rewrite.

So I think the problem is more one of making the long-term value of beautiful
code clearer, rather than one of flattening spikes. The curve to the right of
the spike rises much faster and much higher than in the graphs, and it's worth
moving there if you have people who are capable of it.

~~~
pbh101
I think the graphs, because 'value' is not defined, are showing a
hybrid of the long-term and immediate value of the code. Take, for example, a
strict refactoring of the code, in the sense that only the implementation is
being changed and no features are being added. The immediate business value of
that action is near zero, but the long-term value is greater. What the graphs point
out is that, due to manual testing and production environments rigorously
stress-testing a particular, perhaps brittle, snapshot of the code, the amount
of refactoring and beautification of the codebase required to provide a
baseline level of long-term value equivalent to the spike is much higher. Yes,
the code will be cleaner, and easier to maintain and update for the future,
and the resulting spikes from testing and production deployment will be that
much higher, too.

Take a look at the very last graph. I would posit that once the code is
beautified and a more flexible version deployed, non-source readers would see
another spike at the end, but from the original spike, it's hard to justify
the valley of little incremental value.

~~~
Tamerlin
You're both missing something important: who gets to decide how to define
"value". In most cases it's management, and it's almost invariably short-term
value that wins out.

The reason?

Most organizations believe that software engineering is easy, and therefore
software engineers are as interchangeable as cogs. Most corporations don't
understand or value experience.

~~~
swombat
I'm not missing that. That's the gist of my comment...

------
13ren
But testing also reduces ugliness: The act of making code unit testable makes
it less ugly, because unit testing requires access to a portion of the code in
isolation, thus making the code more modular. Modular code is less ugly (or so
I claim).
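As a sketch of that claim (hypothetical names, not from the article): to unit
test a report routine that reads a file and prints a summary, you are pushed
to split out a pure core, and the pure core is the modular, testable piece:

```haskell
-- Pure core: a unit test can exercise this in complete isolation,
-- with no filesystem or IO setup.
summarize :: String -> String
summarize input = "lines: " ++ show (length (lines input))

-- Thin IO shell: all the logic worth testing lives in `summarize`.
report :: FilePath -> IO ()
report path = readFile path >>= putStrLn . summarize

main :: IO ()
main = putStrLn (summarize "a\nb\nc")  -- lines: 3
```

Before the split, testing `report` would mean staging files on disk; after
it, the interesting behavior is a plain function from input to output.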

That's an intriguing idea about DSLs enabling customers to see the ugliness.
But why would customers look? (A parallel is non-lawyer customers looking at
legal documents for beauty.) Does anyone have experience with DSLs for non-
developer customers? E.g., SQL is a DSL for business analysts - but they
develop in it, not just look at it.

 _This form of testing is called production._ :)

~~~
gruseom
_Does anyone have experience with DSLs for non-developer customers?_

I did that on one project. It didn't work. The response we got was "But I'm
not a programmer." We replied, "But you don't understand! This is not a
general-purpose programming language, it's a high-level domain-specific
scripting language!" The response we got was, "But I'm not a programmer." I
learned a lot from that experience.

~~~
learninglisp
Yeah... but in my company, there are many cases where job positions have
spontaneously materialized that are essentially programming positions, but
we have to "con" someone into doing them even though it really is programming.
The users are, for all intents and purposes, programmers... but we have to
keep them from finding that out!

~~~
gruseom
Surely you're not going to get very good programmers that way?

~~~
learninglisp
But they're not _technically_ programmers. I'm talking about applications that
require extensive customization to be useful. There are a lot of cases where
people are having to trick those with the business knowledge into doing the
configuration for them because the 'real' programmers can't or won't do the
work. _This_ is where we should be looking at applying DSLs.

------
ars
So in summary: don't write ugly code, because once you do it's permanent.

~~~
silentbicycle
You missed a very important point: What makes it permanent (or at least
provides much of the inertia) is the risk of reintroducing old bugs and not
wanting to manually test all over again. _Automated testing does not have this
problem._

In practice, of course, this depends on how badly the testing framework is
entangled with implementation details, rather than testing via a public
interface. If making changes means having to rewrite 90% of your automated
tests, you haven't gained much.
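A toy illustration of the distinction (hypothetical `Counter` type): tests
that call only the exported operations survive a rewrite of the internals,
while tests that pattern-match on the constructor break with every change.

```haskell
-- The public interface is empty/bump/total; in a real module the Counter
-- constructor would be hidden behind an export list, so the Int inside
-- is an implementation detail, free to change.
newtype Counter = Counter Int

empty :: Counter
empty = Counter 0

bump :: Counter -> Counter
bump (Counter n) = Counter (n + 1)

total :: Counter -> Int
total (Counter n) = n

main :: IO ()
main = print (total (bump (bump empty)))  -- 2
```

A test suite written against these three functions would not need a single
edit if `Counter` were reimplemented with a different representation.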

(The article has a couple other potential threads in it, as well, but that
seems like the main idea to me.)

~~~
wynand
"Automated testing does not have this problem."

Good point. I mentioned in another comment here that some languages (like
Haskell with its strong, static type system) mitigate this problem. And for
languages without static type systems of this nature, automated testing can
produce the same effect (which is not to imply that tests for Haskell programs
are useless).

~~~
silentbicycle
Looks like I simultaneously replied to your comment with the same comment you
were posting on mine. :)

