
The truth about TDD - _sdegutis
https://sdegutis.com/blog/2018-08-29-the-truth-about-tdd
======
mikekchar
This is pretty far down the page. Possibly only a discussion between me and you
;-) I enjoyed your article. Thank you for writing it. However, I have a
different point of view which you might find interesting.

There is a difference between TDD and testing. Testing is something you do to
see if what you've done is correct. Quite frequently I work the way you
described and it's very nice to feel confident that I have implemented what I
set out to implement.

At that point, I often just throw the test away. Sometimes, if I have a pair
programming partner with me, they'll be shocked. "Why did you throw away that
perfectly good test?", they will ask. "Because I wrote it to find out whether
the code works, and now I know that it works."

TDD is about something else. With TDD you aren't so much worried about the
correctness of the behaviour; you're worried about whether the behaviour has
changed since the last time you touched the code. Ideally, if it _has_ changed, it
would be nice to know _what_ changed. Even better, it would be nice to know
what assumption was violated that caused the change in behaviour.

Imagine that you have an application that you have tested inside and out. It
works perfectly. Next imagine that you have a special magical device that
tells you if the behaviour changes every time you modify the code. It doesn't
tell you if the behaviour is _correct_ , it simply tells you that it changed,
what the change was, and what caused it to change.

Now every time you add code, if the behaviour doesn't change, you know that it
is still operating correctly -- because it was before and it hasn't changed.
If the behaviour _does_ change, you can observe the behaviour and decide if
the change is good or bad. If the change is good, then the code is _still_
operating correctly. If the change is bad, then the code is operating
correctly except for the change. If your magic device also tells you where
your assumptions are violated, then it becomes easy to decide how to make the
behaviour good again.

There are a couple of really cool things about this magic device. It doesn't
need to know if the system is operating correctly or not (which is difficult
in most cases and impossible in the general case). It just needs to know if
the behaviour is the _same_ as before (which is a much simpler problem). The
other really cool thing is that if you make a change to the behaviour of the
system and the magic device doesn't detect it, then you know the magic device
is broken. Since the magic device is very useful, it is probably a good idea
to fix it right away.

Of course the magic device is a test suite. But it's important to understand
that it's a very special kind of test suite. It measures behaviour and detects
if the behaviour changes -- not if the behaviour is correct (you can test that
separately, either in a manual or automated fashion). Often in legacy code
I'll write a whole bunch of "tests" by running functions with various inputs
and recording the results. That's all I need. I don't need to know if the
results are correct or not. As I modify the code and watch the differences, I
can determine if the differences are good or not, and I may find bugs. But
bugs are not my main worry with this style of test suite. I'm using the test
suite to inform (and later confirm) my assumptions about the behaviour.
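
To make that concrete, here is roughly what those recorded "tests" look like.
This is just a sketch in Python; legacy_price and recorded.json are stand-ins
I made up for whatever legacy code and recording you happen to have:

    import json

    def legacy_price(quantity, discount_code):
        # Stand-in for the legacy code whose behaviour we want to pin down.
        base = quantity * 9.99
        return round(base * (0.9 if discount_code == "VIP" else 1.0), 2)

    INPUTS = [(1, ""), (10, ""), (10, "VIP"), (0, "VIP")]

    def snapshot():
        # Run the code over a spread of inputs and record what it does today.
        return {repr(args): legacy_price(*args) for args in INPUTS}

    def test_behaviour_has_not_changed():
        # recorded.json was written once by dumping snapshot(); nobody ever
        # checked that those numbers are "correct", only that they are what
        # the code did last time.
        with open("recorded.json") as f:
            assert snapshot() == json.load(f)

The point is that the expected values are whatever the code happened to do on
the day I recorded them, not values I worked out by hand.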

Second, this kind of test suite needs to tell you what is actually wrong. For
example, you might have a test that determines if a particular result was
produced. If the test "fails", you might say "the test fails". This is pretty
useless, though. Now I have to go and debug the code. Instead, I want to be
told exactly what is different between what I expected and what I received. I
also want to be told what context the program was in when I got the result. So
I need to be able to see at a glance the input, output and processing (big
hint: fixtures, as convenient as they are, are usually bad because you can't
see the input).
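
As a made-up illustration of what I mean by seeing the input at a glance
(rather than burying it in a fixture):

    def parse_quantity(text):
        # Invented function under test.
        return max(0, int(text.strip()))

    def test_negative_quantities_are_clamped_to_zero():
        raw = "  -3 "                 # the input is right here in the test,
        result = parse_quantity(raw)  # not hidden away in a fixture file
        assert result == 0, f"input {raw!r} gave {result!r}, expected 0"

When that assertion fails, the message already tells me the input, the
expected value, and the actual value, so I don't have to go off and debug to
find out what context produced it.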

Third, this kind of test suite needs to tell you _where_ in the code the
problem is likely to be. If you change the behaviour in one place,
then ideally a single test will fail. This underlies the difference between a
test suite for testing correctness and a test suite for testing changes in
behaviour. If you were testing correctness and you have the same behaviour in
several different contexts, then you would expect to have several tests to
ensure that the behaviour is correct in each context. With this kind of test
suite, you actually want to test once and then simply ensure that the same
code paths are followed in each context. Then when the behaviour changes, you
only get a single failure -- in the tests related to the behaviour.
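
A tiny, invented example of what I mean: the behaviour gets asserted once,
and the tests for each context only check that the same code path is used:

    def normalise_email(addr):
        return addr.strip().lower()

    class SignupForm:
        def __init__(self, normalise=normalise_email):
            self._normalise = normalise

        def submit(self, addr):
            return {"email": self._normalise(addr)}

    def test_normalise_email_behaviour():
        # The one and only place the behaviour itself is asserted.
        assert normalise_email("  Bob@Example.COM ") == "bob@example.com"

    def test_signup_form_uses_the_shared_normaliser():
        # Only checks the delegation, so a change to normalise_email fails
        # exactly one test, not one per context.
        calls = []
        form = SignupForm(normalise=lambda a: calls.append(a) or a)
        form.submit("x@y.z")
        assert calls == ["x@y.z"]

A change to normalise_email now shows up as exactly one failure, in the test
about that behaviour.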

When writing these kinds of tests, you will find something a bit strange. In
order to ensure that a test fails whenever behaviour changes, you need 100%
test coverage and _also_ 100% branch coverage. This means that you need to be
able to write tests that hit all of your code.

If you want to ensure that only one test fails whenever the behaviour changes,
then you will have to break up your code so that you have access to just the
functionality that you need.

What you will discover is that you are doing _white box testing_ and not
_black box testing_. Because you actually want to see what the implementation
is doing. The best analogy I've heard of is that it's like putting watch
points in a debugger. Then when the code is running, you can inspect the state
of the code to see what it's doing.

This is where the real kicker is: it forces you to expose state rather than
hide it.

And I'll leave that in a paragraph by itself, because the implications are
huge for design. Quite frequently we choose to abstract away state. Sometimes
it is not even measurable. "As long as my function returns the correct value
(no matter how complicated it is to generate), there is no need to see the
inner workings. In fact I don't _want_ to see the inner workings because it
will add complexity to the system. I want a simple API and I'll push all the
complexity inside it". This is the antithesis of TDD, because then we can't
inspect the running state of the system and pinpoint where behaviour changes
(except in really large chunks).
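
Here's the same (made-up) computation written both ways, to show what I mean
by exposing state:

    # Hidden state: only the final number is observable, so a change in
    # behaviour anywhere inside can only be detected in one big chunk.
    def invoice_total_opaque(items, tax_rate):
        return round(sum(q * p for q, p in items) * (1 + tax_rate), 2)

    # Exposed state: each intermediate value is reachable, so a test can
    # pinpoint whether the subtotal, the tax, or the rounding changed.
    def subtotal(items):
        return sum(q * p for q, p in items)

    def tax(amount, tax_rate):
        return amount * tax_rate

    def invoice_total(items, tax_rate):
        s = subtotal(items)
        return round(s + tax(s, tax_rate), 2)

Both return the same number, but only the second lets me pin down which
step's behaviour changed.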

Obviously, there are lots of really good advantages to abstraction and
encapsulation. However, there are ways of providing those while still
exposing state (at every level). Essentially, you are forced to create a
many-layered system, and you are forced to think hard about those layers
because...

There is another important result of TDD (or at least of the TDD I describe
above). I mentioned that you have to be able to see at a glance the contexts
that you are creating when you are writing the test. This means that it must
be simple (and concise) to create the collaborators that you need to create
that context. This, in turn, means that you have to reduce the complexity of
the inter-dependencies of collaborators, and you especially need to be able
to construct the dependencies clearly. Things like global variables and
singletons become big liabilities in this kind of environment.
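
For example (again, the names are invented), a collaborator whose
dependencies are handed to it is trivial to put into whatever context the
test needs:

    class RecordingMailer:
        # A throwaway stand-in that just remembers what was sent.
        def __init__(self):
            self.sent = []

        def send(self, to, body):
            self.sent.append((to, body))

    class Reminder:
        def __init__(self, mailer):
            # The dependency is explicit; it is not a global or a singleton
            # dug up from somewhere deep inside the implementation.
            self._mailer = mailer

        def nag(self, user):
            self._mailer.send(user, "Your build is red.")

    def test_reminder_sends_one_mail():
        mailer = RecordingMailer()
        Reminder(mailer).nag("alice@example.com")
        assert mailer.sent == [("alice@example.com", "Your build is red.")]

The test builds exactly the context it needs in two lines; with a global or
singleton mailer, that context would be invisible and awkward to set up.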

The result of all of this is that you need to have collaborators with simple
dependencies (low coupling), you need to have many layers, where the state of
the system is always expressible, and you need to have simple interfaces on
your systems.

The most important thing to understand is that TDD does _not_ design your code
for you. Of course, you need to do that yourself. Rather it provides a series
of constraints (as a result of always being informed of the behaviour of the
system at all levels), which enforces certain good qualities in the resultant
design.

Having said all that, I know many people who do not like designs that follow
the constraints that TDD enforces. They prefer to have larger, more
complicated interfaces. They prefer to have APIs that hide the inner behaviour
and make it impossible to examine the state of the code (without actually
running it in a debugger). This is clearly a choice. However, I'm coming up
on about 20 years of doing it the TDD way and my experience has been that a
system designed in that fashion is more flexible, easier to read and reason
about, and considerably easier to work with. As always YMMV.

I hope that provided some insight into a different way to look at TDD.

