
Why Most Unit Testing Is Waste [pdf] - quickthrower2
https://rbcs-us.com/documents/Why-Most-Unit-Testing-is-Waste.pdf
======
alkonaut
The normal practice for large-scale codebases in complex domains is "the code
is the spec". That is, the only specification for how the system should work
is how it worked yesterday. In that case, unit tests serve as a great
specification. Even tests that just duplicate the business code under test and
assert that it behaves the same (a huge waste in normal cases) are useful,
because a unit test is much better than a Word document at describing how code
should behave. I very much prefer a massive, hard-to-maintain set of poorly
written unit tests to a large number of outdated documents describing how
every bit of the system should work.

So is a big/bad test suite a burden? Sure. But is it a burden compared to
maintaining specifications of other kinds? Is it a burden compared to working
in a system with neither type of specification?

Further, the people writing long articles like this are (or at least were)
very good developers. There is often an element of "good developers write good
code, so just use good developers" in them. But writing good software with
good developers was never the problem. The problem is making software that
isn't terrible, with developers ranging from good to terrible, and most being
mediocre.

~~~
mattmanser
The article addresses your first point in a big section:

1.4 The Belief that Tests are Smarter than Code Telegraphs Latent Fear or a
Bad Process

Your other point, that only good programmers don't need unit tests, is moot
because you haven't followed it through to its conclusion. Namely, if bad
programmers write bad code that needs unit tests, they're also going to write
bad unit tests that don't test the code correctly. So what's the point?

~~~
ekidd
> Namely, if bad programmers write bad code that needs unit tests, they're
> also going to write bad unit tests that don't test the code correctly. So
> what's the point?

Let's assume you hire _good_ programmers, because otherwise you're doomed. But
oftentimes, the "good" programmer and the "bad" programmer are the same
person, six months apart:

1. John writes some good code, with good integration tests and good unit
tests. He understands the code base. When he deploys his code, he finds a
couple of bugs and adds regression tests.

2. Six months later, John needs to work on his code again to replace a low-
level module. He's forgotten a lot of details. He makes some changes, but he's
forgotten some corner cases. The tests fail, showing him what needs to be
fixed.

3. A year later, John is busy on another project, and Jane needs to take over
John's code and make significant changes. Jane's an awesome developer, but she
just got dropped into 20,000 lines of unfamiliar code. The tests will help
ensure she doesn't break too much.

Also, unit tests (and specifically TDD) can offer two additional advantages:

1. They encourage you to design your APIs before implementing them, making
APIs a bit more pleasant and easier to use in isolation.

2. The "red-green-refactor-repeat" loop is almost like the "reward loop" in a
video game. By offering small goals and frequent victories, it makes it easier
to keep productivity high for hours at a time.

Sometimes you can get away without tests: smaller projects, smaller teams,
statically-typed languages, and minimal maintenance can all help. But when
things involve multiple good developers working for years, tests can really
help.

~~~
falcolas
Here's the crux of your argument, and I honestly think it's a flawed premise:
that the failure of unit tests indicates something more than "something
changed".

Were the failing tests due to John/Jane's correctly coded changes,
regressions, or bad code changes? The tests provide no meaningful insight into
that - it's still ultimately up to the programmer to make that value judgement
based on their understanding of what the code is supposed to do.

What happens is that John and Jane make a change, find the failing unit tests,
and they deem the test failures as reasonable given the changes they were
asked to make. They then change the tests to make them pass again. Again, the
unit tests are providing no actual indication that their changes were the
correct changes to make.

WRT the advantages:

1. "design your APIs before implementing them" - this only works out if we
know our requirements ahead of time. Given that Agile methodologies exist
precisely because requirements are usually absent up front, this benefit
typically vanishes with the first requirement change.

2. "makes it easier to keep productivity high for hours at a time" - this
tells me that we're rewarding the wrong thing: the creation of passing tests,
not the creation of correct code. Those dopamine hits are pretty potent,
agreed, but not useful.

~~~
Floegipoky
>> Here's the crux of your argument, and I honestly think it's a flawed
premise: that the failure of unit tests indicates something more than
"something changed".

Even if all you know is "something changed", that's valuable. Pre-existing
unit tests can give you confidence that you understand the change you made.
You may find an unexpected failure that alerts you to an interaction you
didn't consider. Or maybe you'll see a pass where you expected failure, and
have a mystery to solve. Or you may see that the tests are failing exactly how
you expected they would. At least you have more information than "well, it
compiled".

And if "the unit tests are providing no actual indication that their changes
were the correct changes to make", even the staunchest proponents of TDD would
probably advise you to delete those tests. If there's one thing most
proponents and opponents of unit testing can agree on, it's probably that
low-value tests are worse than none at all.

I have no idea if this describes you, but I've noticed a really unfortunate
trend where experienced engineers decide to try out TDD, write low-value tests
that become a drag on the project, and assume that their output is
representative of the practice in general before they've put the time in to
make it over the learning curve. People like to assume that skilled devs just
inherently know how to write good unit tests, but testing is itself a skill
that must be specifically cultivated.

~~~
krfsm
Somebody could do the TDD movement a great big favour and write a TDD guide
for people who actually already know how to write tests.

There probably are good resources about this somewhere, but compared to the
impression that "TDD promotes oceans of tiny, low-value tests", they lack
visibility.

------
h0l0cube
People seem to have not read the article properly and are conflating unit
testing with all testing. The article doesn't say to avoid all unit testing,
but to restrict the test suite to that which can be validated by business
logic or some formalized oracle - this would imply unit testing for critical
systems like hardware drivers, banking/avionics, crypto, etc., where there are
known and eternally consistent results that relate to the feature spec. But it
also suggests that, in most cases, maintaining systems and integration tests
is a more economical use of time, and further implies that they are more
'correct' in light of the fact that they directly relate to the feature spec.

I believe I came across this document (or something close to it) a few years
ago, and it changed my life. I was able to get great velocity and low bug-per-
feature count (1 or 2), using systems/integration testing combined with
exploratory testing by a QA team. I was even able to find subtle bugs in the
unit-tested back-end via integration tests from a mobile app. At the end of
the day, if your testing isn't serving the business logic, it's wasted effort.

The most bad-ass part I remember was the suggestion that if you can't elicit a
code path using a systems test, that code is a good candidate for deletion.

~~~
lewisl9029
Agreed on all points.

These days, a large majority of the frontend tests in my codebase aren't tests
that I've written myself. They're automatically generated Jest snapshot tests
that capture the virtual DOM output of entire components in different states
as dictated by product requirements, using the StoryShots plugin for React
Storybook.

These tests prove to be extremely effective in catching bugs, because they
directly reflect product requirements, and because they end up exercising a
large majority of the codebase for very little cost compared to covering the
same amount of code with individual unit tests.

The latter point is an important consideration too, because a lot of the code
these integration-level tests exercise is code that I'd have considered too
trivial to be worth the ongoing maintenance cost of unit tests, when
considered in isolation.

This means it'd have been easy for me to be satisfied with selectively and
arbitrarily deciding which pieces of code qualify for test coverage, leaving
coverage at some arbitrary number. What I do now instead is strictly require
100% code coverage, but explicitly and deliberately exclude code that I have
good reason not to test. That feels like a much more solid framework for
ensuring code quality.

I still write unit tests for functionality that component snapshot tests can't
reasonably cover, but usually only at a level of granularity where the tests
themselves can map cleanly to product requirements as well, instead of
painstakingly testing every little function in perfect isolation.

------
geebee
I'm afraid that the TDD movement went too far. The proponents became so
convinced that they were right that they started comparing themselves to
people who argued that washing hands was important in early surgery, and that
those who questioned it would soon be unemployable in the field. Not those who
didn't practice it, those who _questioned_ it.

I dealt with some of this, including a rather bullying type who tried to
browbeat people into writing tests first. And while that may be a bad example,
I'm afraid it really wasn't unrelated to the movement itself.

When people said "TDD is dead" a while ago, they weren't actually arguing that
the technique isn't useful. They were saying that the whole "you're wrong, I'm
right, if you want to stay employed you'll do as I say and practice TDD"
attitude is no longer defensible.

It's probably time to leave that in the past, and hope that people who promote
a methodology this way have learned from the mistake of aggressively cramming
things down everyone's throat. Truth is, I was actually a frequent
practitioner of TDD, and I was pretty appalled by how it was getting pitched
to the programming community.

Now, let's leave that in the past and take a fresh look at whether TDD is a
very beneficial practice in many contexts. I certainly agree it isn't crap.

~~~
adrianmonk
As in many areas of thought, when the pendulum swings too far in one
direction, it then swings too far in the other direction. Hopefully we'll
arrive at a reasonable middle ground at some point, and unit tests will be
valued (and prioritized) neither too much nor too little.

On a side note, unfortunately I think the software industry tends to be
particularly bad about this pendulum swinging back and forth between extremes
thing. There is so much emphasis on innovation that people are biased toward
making radical changes and believing they are going to revolutionize
everything. And the culture is such that the more you go out on a limb and
push something radical, the more you are respected, because we often value
guts more than good judgement.

~~~
eric_h
> the pendulum swings too far in one direction, it then swings too far in the
> other direction

> Hopefully we'll arrive at a reasonable middle ground at some point, and unit
> tests will be valued (and prioritized) neither too much nor too little.

My coding style has very much gone through a similar evolution. When I first
learned about TDD, I went in whole hog: test everything - UI tests, request
tests, unit tests - tests for everything.

Then I started to notice that the development costs associated with extreme
coverage did not, in fact, pay off.

The bugs found were vastly outweighed by the time wasted dealing with the
peculiarities of various UI testing frameworks, let alone the time wasted
waiting for those tests to run.

My metric is now that I've written enough tests to feel confident that the
code works. It's an extraordinarily qualitative metric, and I cannot find a
way to objectively quantify it, yet it very much works for me.

Really, it all comes down to visibility. Is there somewhere that is up to
date, that will tell you what the code is supposed to do? Do you have tooling
in production that will alert you when it's not doing what it's supposed to
do? Do you feel confident that the code works?

------
henrik_w
I disagree: [https://henrikwarne.com/2014/09/04/a-response-to-why-most-un...](https://henrikwarne.com/2014/09/04/a-response-to-why-most-unit-testing-is-waste/)

~~~
crdoconnor
"In my experience, unit tests are most valuable when you use them for
algorithmic logic. They are not particularly useful for code that is more
coordinating in its nature."

I agree in principle, but IME:

* This kind of algorithmic code often makes up a fairly small proportion (~5%) of the code written for most business-driven applications - it's usually _mostly_ coordination (see the sketch below). Some domains may be different (and for them, unit testing will appear to be much more effective), but I think this is the norm.

* Integration tests can test algorithmic code acceptably well, but unit tests do _not_ test integration code in an acceptable fashion.

* Where you have algorithmic and integration code smushed together (this type of technical debt is, sadly, the norm 'in the wild'), again, integration tests work acceptably well whereas unit tests require a mess of mocks.

* Integration tests do not have to be high level or slow.
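
To make the distinction concrete, here's a minimal sketch (hypothetical names, Java just for illustration):

    interface OrderRepository { Order find(long id); void save(Order o); }
    interface Mailer { void notifyDiscountApplied(Order o); }

    class Order {
        int quantity;
        int discountPercent;
    }

    class DiscountExample {
        // Algorithmic code: a pure function, trivially unit-testable.
        static int bulkDiscountPercent(int quantity) {
            if (quantity >= 100) return 20;
            if (quantity >= 10) return 10;
            return 0;
        }

        // Coordinating code: glue between collaborators. Unit-testing this
        // means mocking the repository and the mailer; an integration test
        // just runs it against real (or in-memory) implementations.
        static void applyDiscount(OrderRepository repo, Mailer mailer, long id) {
            Order order = repo.find(id);
            order.discountPercent = bulkDiscountPercent(order.quantity);
            repo.save(order);
            mailer.notifyDiscountApplied(order);
        }
    }

Most business apps are overwhelmingly the second kind of code, which is why the unit-test payoff is so skewed.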

"Refactoring breaks tests. Sometimes when you refactor code, you break tests.
But my experience is that this is not a big problem. For example, a method
signature changes, so you have to go through and add an extra parameter in all
tests where it is called. This can often be done very quickly, and it doesn’t
happen very often. This sounds like a big problem in theory, but in practice
it isn’t."

This is a _big_ problem in practice when you are working on large-scale
codebases in which the developers have not been super strict about decoupling
everything.

~~~
RHSeeger
> Integration tests can test algorithmic code acceptably well, but unit tests
> do not test integration code in an acceptable fashion.

True, but you can have both types of tests, which lets you test algorithms
(unit tests) and higher level combinations (integration tests) both in the
best way. The right tool for the right job.

> Where you have algorithmic and integration code smushed together (this type
> of technical debt is, sadly, the norm 'in the wild'), again, integration
> tests work acceptably well whereas unit tests require a mess of mocks.

In many cases, you'd be better off separating the logic, which would let you
have both types of tests, each for a different part of the logic.

> Integration tests do not have to be high level or slow.

No, but they are a different tool.

~~~
crdoconnor
>True, but you can have both types of tests

You can, but if your code is 95% coordination and 5% algorithmic, at _most_
you'd want 5% unit tests.

>In many cases, you'd be better off separating the logic

Which is refactoring, and if you're refactoring, you need to have your code
surrounded with tests to do it safely.

And, once you've done that, if you've already written integration tests for
that logic and they perform acceptably well, there may be no point in
rewriting the integration test as a unit test.

------
fermigier
This has already been discussed to death, on 3 occasions:

[https://news.ycombinator.com/item?id=7353767](https://news.ycombinator.com/item?id=7353767)
(268 comments)

[https://news.ycombinator.com/item?id=11799272](https://news.ycombinator.com/item?id=11799272)
(280 comments)

[https://news.ycombinator.com/item?id=13815779](https://news.ycombinator.com/item?id=13815779)
("only" 24 comments)

~~~
fermigier
Also, there's a followup by Coplien which _has not_ been discussed:
[https://rbcs-us.com/documents/Segue.pdf](https://rbcs-us.com/documents/Segue.pdf)

So maybe this document should be the focus of the discussion now.

~~~
vog
I'd love to submit this document to HN, so it has a chance to get a wider
audience than provided by this sub thread.

However, the title "Segue" is completely unspecific and hence totally useless.
Given that title, I don't see how to submit this to HN in a way that it
attracts readers.

This is really a pity.

~~~
ballenf
The guidelines on titles allow for writing your own title as long as accurate
and more informative than the original title.

The bigger problem, I think, is that the velocity of articles appearing on and
disappearing from the front page gives follow-up articles an inherently
smaller starting audience.

~~~
DanBC
> The guidelines on titles allow for writing your own title as long as
> accurate and more informative than the original title.

Mods appear to be really clear about this. They want you to use the original
title unless it's clickbait, inflammatory, or misleading. Only then can you
change it, and they say an informative sentence from the article might do.

This means that articles with terrible, unclear titles get posted to HN all
the time.

See eg this thread:
[https://news.ycombinator.com/item?id=15540388](https://news.ycombinator.com/item?id=15540388)

(Of course, I'm not a mod, so maybe I'm wrong.)

~~~
lmm
Certainly mods actively edit useful, explanatory titles to replace them with
the terrible, unclear original title (Stephen Hawking's PhD thesis is a good
recent example).

------
dkarl
_Don’t underestimate the intelligence of your people, but don’t underestimate
the collective stupidity of many people working together in a complex domain._

Sometimes I find that unit testing is an attempt to spend as much time as
possible experiencing the first reality (you are a smart, competent
programmer) as a relief from the pain of the second reality (you have a
gnawing fear that your understanding of the context is fatally flawed.) Can't
figure out if the work you're doing is constructive? Anxious about the lack of
requirements? Go soothe yourself writing unit tests. Kick back and watch the
integration tests run. You're doing your job, and the rest will take care of
itself, right?

------
mosselman
This is such a great document. I referred to it quite a lot in my last job as
we were restructuring our tests. We mainly had unit tests that we'd just try
to make green every time after changing code. It didn't sit well with me that
we'd just rewrite the code in a different syntax (the RSpec DSL). You don't
really test anything that way.

Just be prepared for a lot of shocked reactions when you say 'Most Unit
Testing Is Waste' to people who have not been softened a bit to this concept.

I really like this way of thinking. The next chapter of getting into this
subject is the live discussions (5 videos) between Kent Beck, David Heinemeier
Hansson and Martin Fowler talking about TDD (which relies a lot on unit
tests): [https://martinfowler.com/articles/is-tdd-dead/](https://martinfowler.com/articles/is-tdd-dead/)
I enjoyed these a lot too, and the combination of these sources has really
improved my testing.

------
lowbloodsugar
This article is an anti-resume, and I recommend not reading it. I'll take a
look at two claims.

 _Unit tests are unlikely to test more than one trillionth of the
functionality of any given method in a reasonable testing cycle. ... Trillion
is not used rhetorically here, but is based on the different possible states
given that the average object size is four words, and the conservative
estimate that you are using 16-bit words)_

An int may contain "four billion states", but for the requirements, it's
highly likely that we can classify the integer into three states: less than
zero, zero, greater than zero. As a bank, I might not care how much money you
have, only that you have more than zero. In a transaction, I don't care how
much money changes hands, as long as no money is lost.
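
That's just equivalence partitioning: one test per class covers what the
requirement actually cares about. A minimal sketch (hypothetical rule, run
with java -ea):

    class BalanceRules {
        // The requirement cares about the sign of the balance, not which of
        // the four billion possible int values it happens to be.
        static boolean canWithdraw(int balanceInCents) {
            return balanceInCents > 0;
        }

        public static void main(String[] args) {
            // One representative value per equivalence class.
            assert !canWithdraw(-1); // less than zero
            assert !canWithdraw(0);  // zero
            assert canWithdraw(1);   // greater than zero
            System.out.println("three classes, three tests, requirement covered");
        }
    }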

Pointing at memory-as-bits, as if we're still using punch cards, and then
hand-waving "I can't possibly test this", ignores sixty years of progress. The
refusal to imagine a class with range checking is a damning statement about
the author's own ability as an engineer.

 _Programmers have a tacit belief that they can think more clearly (or guess
better) when writing tests [than] when writing code, or that somehow there is
more information in a test than in code._

Consider writing a sorting algorithm vs. testing a sorting algorithm. Would
you feel more confident writing the _test_ for a sorting algorithm than
writing the algorithm itself? The test is simple: is every item in the list no
greater than the next item? The code is far more complex. We're in the same
realm as NP problems: I can write a test to verify that a graph is correctly
3-colored, but the code might be a _bit_ harder.
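
The asymmetry is easy to sketch. The verifier below checks order only (a full
oracle would also check that the output is a permutation of the input), yet it
already shows the point: checking is one linear pass, sorting is the hard part.

    import java.util.Arrays;
    import java.util.Random;

    class SortCheck {
        // Verifying order: one linear pass.
        static boolean isSorted(int[] xs) {
            for (int i = 1; i < xs.length; i++) {
                if (xs[i - 1] > xs[i]) return false;
            }
            return true;
        }

        public static void main(String[] args) {
            Random rng = new Random(42);
            for (int run = 0; run < 1000; run++) {
                int[] xs = rng.ints(100).toArray();
                Arrays.sort(xs); // stand-in for the algorithm under test
                if (!isSorted(xs)) {
                    throw new AssertionError("sort failed: " + Arrays.toString(xs));
                }
            }
            System.out.println("1000 random inputs verified");
        }
    }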

Perhaps, then, the author's experience of other developers believing they can
"think more clearly" is actually his observation that the developers are
_solving simpler problems_, and are thus more confident. And that is the point
of tests: it is easier to verify than to solve.

In short, every conclusion in this article begs the question, "Might there be
another explanation?"

------
siliconc0w
The best workflow I've found is to define an empty function, add a breakpoint,
and code the function in a REPL. At this point I'm pretty confident the code
works for at least a 'happy path' input, so I copy the code to the editor and
add tests that call it with a variety of other inputs.

Being able to inspect the state of the program in real time is invaluable and
gives me a lot of confidence that I understand how the code works. For some
reason most programmers I see don't even run their code locally and just use
log statements to guess at state when it invariably doesn't work as expected.

Another big problem is test data. I see way too much naive mocking. You really
need to exercise your code with data that is as close to real-life input as
possible; ideally it's a sanitized version of production data. Other tests are
great too (e.g. large lists of 'naughty strings'), but if you're manually
specifying your test data you are a) spending a lot of time doing something
that should be automated and b) only exercising your code with what you
_think_ it might see, which is usually not good enough.

~~~
joshribakoff
Log data is how unit testing started, I think: people would log all the output
and compare the sheet of actual log output to the sheet of expected log
output.

Working in a GUI debugger/REPL is great for visualizing code, and I often code
that way myself, but let's not fool ourselves: it still requires manually
setting breakpoints and pressing keys to step through code. You can't do this
for every method after every change, whereas unit testing has that advantage.
I do agree with most things mentioned in the article though. A lot of people
end up writing tests that hit databases and are too slow, or fail randomly due
to chained state, or are over-specified and end up just getting in the way.
What it comes down to is that you can have good tests or bad tests, and it's
still entirely subjective, just like whether the code itself is good or bad. I
recommend the book xUnit Test Patterns; it's basically a bunch of "rules of
thumb".

------
hacker_9
_" my team told me the tests are more complex than the actual code. (This team
is not the original team that wrote the code and unit tests. Therefore some
unit tests take them by surprise. This current team is more senior and
disciplined.)"_

Despite the inflammatory title, the point I believe the PDF is making is that
unit tests should be kept short and simple, and convey the _intent_ of your
code.

Intent is akin to the 'why' - written code explains the 'how' and 'what'
extremely well, but without the 'why' it loses all meaning. Writing tests to
convey intent is essential because no computer or programming language can
currently do this for us, so it's left up to us.

~~~
abritinthebay
Yes, I agree. Unit tests should test the smallest unit of work and build up
like LEGO bricks.

What the article describes are _tests_, but I would not call them _unit_
tests.

------
snarfy
I pretty much agree with all of this.

Unit tests are great when you have a large code base with dozens of apps and
need to modify a core library to remove side effects. How do you know you
didn't break one of the apps?

But if we go deeper, why does the core library have side effects? Because it
was a crummy, poorly designed piece of crap to begin with. If it had
originally been written with high quality, it wouldn't need the refactor now.
The author noticed this. He wrote good code to begin with, so it didn't need a
lot of testing.

When you have a large team of mediocre engineers, you need unit tests to guard
against more bad code getting in. They might even be brilliant engineers,
stuck in a horrible process of churning out features to unrealistic deadlines.

If you have a great team, a great process, and a great budget with realistic
goals, you can crank out amazingly good software without the need for a lot of
unit tests. But that's not the real world. In the real world we have lots of
unit tests. They are a band-aid over the other problems without addressing
them specifically.

~~~
josteink
> I pretty much agree with all of this.

It's easy to disagree with someone who portrays an enemy which doesn't exist.

I disagree with this piece everywhere it manages to land somewhere concrete,
where its claims can be verified and assessed.

> But if we go deeper, why does the core library have side effects

Let's not get ahead of ourselves in theoretical, non-applied functional
programming.

As even Haskellers recognize, all computing would be meaningless if there were
ultimately no side effects. In fact, we use computers for their "side
effects".

Sometimes you need side effects. And sometimes unit tests are a great way of
automatically verifying that you have the _right_ side effects.

~~~
lmm
Side effects are by definition outside the thing itself, so they're not really
amenable to unit testing; you have to use integration tests.

~~~
BlackFly

        class AccessCountedInteger {
            private final int value;
            private int accessCount = 0;

            AccessCountedInteger(int value) {
                this.value = value;
            }

            // Side effect: reading the value mutates internal state.
            public int getValue() {
                accessCount++;
                return value;
            }

            // Observable result of the side effect above - easy to unit test.
            public int getAccessCount() {
                return accessCount;
            }
        }
    

This unit has a side effect in the method getValue() that impacts what is
returned by getAccessCount(). While it is contrived, if you are following the
builder pattern, the methods called before the final build() will in general
produce a lot of side effects that build() then consumes. This is quite
amenable to unit testing.
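
A slightly less contrived sketch of the builder case (hypothetical types):
every intermediate call is a side effect that only becomes observable through
build(), and a unit test can check exactly that.

    import java.util.ArrayList;
    import java.util.List;

    class PizzaBuilder {
        private final List<String> toppings = new ArrayList<>();
        private int inches = 12;

        // Each call mutates builder state: a side effect consumed by build().
        PizzaBuilder topping(String t) { toppings.add(t); return this; }
        PizzaBuilder size(int i) { inches = i; return this; }

        String build() { return inches + "-inch pizza with " + toppings; }

        public static void main(String[] args) {
            String pizza = new PizzaBuilder().size(16).topping("basil").build();
            // The unit test observes the accumulated side effects via build().
            if (!pizza.equals("16-inch pizza with [basil]")) {
                throw new AssertionError(pizza);
            }
            System.out.println("builder side effects verified");
        }
    }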

~~~
lmm
That is not the kind of side effect that Haskellers would acknowledge as
necessary (and I'd see it as a bad idea). We can call it a "side effect" but I
think there's a qualitative difference between that and something like file
I/O or async.

------
pqwEfkvjs
"Unit tests are pointless" \-- this is often said by developers who suffer
Dunning-Kruger syndrome. Except the devs who are actually working on something
more exotic that cannot be tested well automatically.

~~~
wtetzner
You can have automated integration tests.

~~~
pqwEfkvjs
Yes, but devs who don't write unit tests are probably not going to write
integration or acceptance tests either.

Maybe except for one guy I talked to a while ago. He does not write unit
tests, because static type checking in C++ takes care of everything that unit
tests do (according to him), but I have actually seen his code, and it
contained a few system tests, so there are exceptions.

~~~
wtetzner
Except that the featured article isn't saying not to worry about testing at
all, it's about how to actually get value out of your testing.

~~~
pqwEfkvjs
I think it advocates the wrong attitude towards software development, and I
think most points made in the article are just plain wrong and stupid.

I guess the main reason for my condescending sentiment is that the author
gives zero examples of code that requires tests vs. code that does not, so it
is actually not actionable.

------
luord
> In most businesses, the only tests that have business value are those that
> are derived from business requirements.

This is, imo, the most important takeaway and it is, or should be, obvious,
but it often isn't (maybe because "common sense is the least common of senses"
or something like that). It's also what I strive for, with one caveat[1].

> One is to use it as a learning tool: to learn more about the program and how
> it works.

Another great takeaway. _Good_ tests should work as documentation, imo. I
consider this another test quality metric, even: If looking at the tests only
confuses me further, those tests need to be changed (and the code they're
testing too, most likely).

[1]: Striving for this can turn code coverage into a useful metric. Not of
correctness, of course, but of how much code we're writing that doesn't solve
a business requirement. That code should be refactored away into separate
libraries or replaced with third-party libraries that already do the job. I'm
firmly in the "avoid NIH" camp.

------
groby_b
> 1.4 The Belief that Tests are Smarter than Code Telegraphs Latent Fear or a
> Bad Process

That's not why you write unit tests. They are, in a roundabout way,
programming's version of double-entry bookkeeping. Tests and code are there to
double-check each other. Neither one is "smarter" than the other.
------
juandazapata
It'll be fun to refactor a codebase without unit tests </sarcasm>

~~~
wtetzner
Refactoring code usually means you have to refactor unit tests as well, since
they are so closely tied to the structure of the code.

Integration tests, however, should not have to change, since they are
typically run against the external interface to the software, which shouldn't
change if you're just doing a refactoring.

~~~
juandazapata
Funny thing. Unit tests force you to think about good code design. Integration
tests don't. If you only have integration tests in your system, you'll most
likely end up with a big ball of mud. I've seen this so many times in so many
real world projects that it hurts.

~~~
mannykannot
> Funny thing. Unit tests force you to think about good code design.
> Integration tests don't.

They absolutely do, but at higher levels of abstraction, in terms of
interfaces, interactions, constraints, obligations and responsibilities. In
fact, unit and integration tests make you think about design in essentially
the same way.

~~~
juandazapata
Whatever floats your boat. My experience in large enterprise products is
exactly the opposite, but I'm glad it works for you.

------
jasode
The SQLite project seems to have an alternative view:
[https://www.sqlite.org/testing.html](https://www.sqlite.org/testing.html)

A lot of us would consider SQLite to be high quality and relatively bug-free.
The extensive test suite they've built up that exercises each release is a
huge reason for it.

Of test code _quantity_, Coplien writes:

 _> If your coders have more lines of unit tests than of code, it probably
means one of several things. They may be paranoid about correctness; paranoia
drives out the clear thinking and innovation that bode for high quality._

 _> - Keep regression tests around for up to a year_

 _> - Throw away tests that haven’t failed in a year._

SQLite appears to keep their tests. Somebody files a bug; SQLite writes a test
that reproduces that bug; the test code remains long after the bug is fixed.
This prevents the bug from _reappearing_. (Isn't that the main purpose of
"regression" in the phrase "regression test"?)

I also think it's better to keep relevant regression tests for _years_. E.g.,
consider a piece of code that's been working correctly for 5 years, with 5
years of passing regression tests. Imagine a new programmer wants to rewrite
the code to optimize it for speed and reduced memory usage. I think we'd feel
much more confident if the new code passed those same regression tests that
were accumulated over 5 years.

As for test code size ratio... A lot of good comprehensive tests will have LOC
outnumbering the actual code being tested. This is especially true for library
code that's used in many places up the stack. I wrote string parsing routines
and a reverse Boyer-Moore search routine where the test code (test edge cases,
test nulls, test string sizes at 2^32 boundaries, etc) was 10 times larger
than the actual code.

Of testing's _utility_, Coplien writes:

      - Testing can’t replace good development
      - [...] Tests don’t improve quality: developers do

... which looks like a strawman and a false dichotomy. Can anyone cite a
credible development philosophy that believes testing can _replace_ bad
developers or bad process?

We could say that about <ANYTECHNOLOGY> such that <ANYTECHNOLOGY> can't
replace quality developers. Garbage Collection doesn't improve quality,
developers do. Array boundary checking doesn't improve quality, developers do.
And so on.

Or maybe there's a difference in terminology? I wonder if Coplien considers
SQLite testing "system test" or a "unit test"? Does he consider SQLite "white
box testing" or "black box testing"?

~~~
softwarefounder
Is there a reason why someone would throw away unit tests? I've never
understood this. Time was taken to write the test, unearth the bug, and fix
it, and now we want to remove the safeguards that we spent time/money on?
Leaving the possibility for the bug to infest again?

Scrapping unit tests is scrapping time.

~~~
vkou
If you are testing at the wrong layer of abstraction (Your test is tightly
coupled to the implementation of your class, as opposed to its observable
behavior), then refactoring your code will require refactoring the test.

The correct solution to this is to not throw passing tests out, but to stop
doing white-box testing.

(Not to mention that a test that may pass in your CI environment may fail -
frequently - in your local workspace.)
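
A sketch of the difference, with a hypothetical Cache class:

    import java.util.HashMap;
    import java.util.Map;

    class Cache {
        private final Map<String, String> entries = new HashMap<>();

        void put(String key, String value) { entries.put(key, value); }
        String get(String key) { return entries.get(key); }

        // Exposed only so a white-box test can poke at internals - a smell.
        Map<String, String> internalMap() { return entries; }
    }

    class CacheTest {
        public static void main(String[] args) {
            Cache cache = new Cache();
            cache.put("k", "v");

            // Black-box: asserts observable behavior. Survives any refactor
            // that keeps put/get working (e.g. swapping HashMap for an LRU map).
            if (!"v".equals(cache.get("k"))) throw new AssertionError();

            // White-box: coupled to the implementation. Any change to the
            // internals forces a test rewrite even when behavior is unchanged.
            if (cache.internalMap().size() != 1) throw new AssertionError();
        }
    }

The first check is the one worth keeping.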

------
valuearb
Management keeps asking when we will write unit tests, and I tell them I’m too
busy fixing bugs, because each time I fix a bug I

1) examine the entire code base to find any similar problems,
2) implement full parameter validation in that code to try to ensure it can't
happen again, and
3) add exception logging so that, if it does happen, we get an immediate error
reported directly on the line where it occurred.

If I have any extra time left over, I refactor to eliminate code duplication,
update/verify/add more parameter and input data validation, and increase code
encapsulation.

Then I point out our crash rate has dropped by 70% in the 6 months I’ve been
working on the project.

~~~
bradenb
I'm curious what you consider examining the entire code base to find similar
problems. If this is more than a 5-10 minute search then I would feel like
it's a significant waste of time. And if it is more than that amount of time
and you are continuing to find bugs then I would ask what is the problem that
allows the same bug to surface in multiple places where it is easy to identify
but hard to search for?

~~~
valuearb
Here’s an example. We got a crash report with a stack trace that pinpointed
the location in our code. The people who wrote the code used a design pattern
called VIPER, which uses 5 classes instead of the classic 3 from MVC. Our code
base is in Swift, which uses optional values to protect against nil: the
protection is unwrapping with an if statement, and if the value exists, the
body of the if executes.

The cause was that one of the VIPER classes force-unwrapped an optional
reference to its view. Force unwrapping skips the if: you just assume the
value is always valid, and crash if it's not.

The problem was they made this assumption because the view is never nil except
for one uncommon edge case: a network operation completing after the view was
disposed. So the fix was easy, remove the force unwrap (you should never force
unwrap in Swift, it's terribly bad).
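
Not Swift, but the same hazard and fix sketched with Java's Optional
(hypothetical View type):

    import java.util.Optional;

    interface View { void render(String data); }

    class ViewUpdater {
        // The crash: assume the view is always present. get() throws
        // NoSuchElementException when the view has already been disposed.
        static void badUpdate(Optional<View> view, String data) {
            view.get().render(data);
        }

        // The fix: only act when the view still exists.
        static void goodUpdate(Optional<View> view, String data) {
            view.ifPresent(v -> v.render(data));
        }
    }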

But what about our other VIPER views? It took me a couple of minutes to review
all 20, and they all had the identical flaw. Fixing them took seconds, but
individually testing the fixes took a few hours.

Should I have left those other 20 views alone so our app could mysteriously
crash for some of our hundreds of thousands of users?

A more recent bug I found on my own was a memory leak caused by our main
views' network closures holding strong references to the ViewController. This
caused code to fail because it was still running in old, dead copies. So I
fixed the closures, and we no longer have old copies of the VC hanging around
forever. I didn't have time to do a search at that moment, but I took notes in
my dev log so I can review every view controller (50 or 60) for similar
problems. Why wouldn't I? How many known bugs and random unreproduced problems
would go away if we got our ViewControllers' memory management right? I'm
betting at least a few, and that I'll eliminate some future problems before
they can even be found.

Software engineering is usually 80% fixing bugs. Developing rigorous standards
to prevent their formation can give you much more time to build new and
improved features.

------
vkou
I strongly agree with many parts of this, and strongly disagree with others.

One part I strongly disagree with is this passage:

> Most programmers want to "hear" the "information" that their program
> component works. So when they wrote their first function for this project
> three years ago they wrote a unit test for it. The test has never failed.
> The question is: How much information is in that test? That is, if "1" is
> the passing of a test and "0" is the failing of a test, how much information
> is in this string of test results:

> 11111111111111111111111111111111

> There are several possible answers depending on which formalism you apply,
> but most of the answers are wrong. The naive answer is 32, but that is the
> bits of data, not of information.

Just because a test has passed in your Continuous Integration environment 100%
of the time doesn't mean that test is worthless. I have checked in many tests
that have never failed in CI - but _have_ failed when I was working on my
code. However, since there are no shiny team-visible metrics with bar charts
about how often a test failed in a local workspace, people can wrongly assume
that a unit test is worthless.

> Now, how many bits of information in this string of test runs?

> 1011011000110101101000110101101

> The answer is... a lot more. Probably 32.

If I see that in our CI environment, the answer is "this test is nearly
worthless." The long answer is "if it's a unit test, there's a race condition;
if it's an integration test, there's a race condition, or it's flaking for
reasons outside our control."

> Another client of mine also had too many unit tests. I pointed out to them
> that this would decrease their velocity, because every change to a function
> should require a coordinated change to the test. They informed me that they
> had written their tests in such a way that they didn't have to change the
> tests when the functionality changed. That of course means that the tests
> weren't testing the functionality, so whatever they were testing was of
> little value.

... That's the whole point of blackbox testing. If observed behaviour is not
expected to change, neither should the test. If a refactoring forces you to
update the test, then, yes, you are testing at the wrong level of abstraction.

This piece could have saved a dozen pages if it just told us to stop testing
private methods, and write more integration tests.

------
regularfry
Why the concern that the volume of tests is greater than the volume of code?
Why should that be a constraint, other than some vague sense of wasted time
spent typing them out?

------
softwarefounder
One of the best uses for Unit Tests and Integration Tests, in my experience,
is for testing things that would be incredibly difficult to test through QA,
or the UI.

------
josteink
This is clearly an opinion piece, and as such I guess it's not _terrible_, but
it certainly has enough points to disagree with.

From quickly reading through it, for instance, I found the following things
directly objectionable:

> "Unit testing was a staple of the FORTRAN days".

Such an attempt at discrediting something can be applied to anything, and it
comes off as disingenuous, dishonest.

> "Unit tests are unlikely to test more than one trillionth of the
> functionality of any given method in a reasonable testing cycle. Get over
> it." ... Trillion is not used rhetorically here, but is based on the
> different possible states given that the average object size

Who says you have to test the method? Who says you're not allowed to test
functions which operate on a known, closed subset of data?

False dichotomy and plain bad math.

> If you find your testers splitting up functions to support the testing
> process, you’re destroying your system architecture and code comprehension
> along with it.

That may be right. Or it may be possibly completely backwards. It's quite
impossible to tell really, without asking _why_ they are splitting up those
functions.

If people do this for gaming some sort of system about "at least 80% coverage"
or whatever, then clearly you should ask why people feel the need to game the
system. Gaming the system, no matter what aspect, leads to bad choices.

Me, however? I'm not splitting up my functions to game the system. I'm
splitting up my system to separate data-retrieval from data-processing (so I
can directly test processing without depending on whatever retrieval depends
on).
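
A sketch of what that separation buys (hypothetical names; run with java -ea):
the processing half is a pure function I can test directly, without caring
where the data came from.

    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.List;

    class LatencyReport {
        // Data retrieval: talks to the outside world, awkward to unit test.
        static List<Integer> fetchLatencies(String serviceUrl) {
            throw new UnsupportedOperationException("HTTP call elided");
        }

        // Data processing: pure, directly testable with literal inputs.
        static int p95(List<Integer> latencies) {
            List<Integer> sorted = new ArrayList<>(latencies);
            Collections.sort(sorted);
            int index = (int) Math.ceil(0.95 * sorted.size()) - 1;
            return sorted.get(Math.max(index, 0));
        }

        public static void main(String[] args) {
            // The processing is tested without any retrieval at all.
            assert p95(List.of(10, 20, 30, 40, 50, 60, 70, 80, 90, 100)) == 100;
            System.out.println("processing tested in isolation");
        }
    }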

I'm splitting up my functions to give them names and intents; this makes the
code speak much more clearly about what it is doing and why. The function name
should be the "what". The contents should be the "how".

Basically I'm splitting up my function for good reasons.

If a function contains enough actions, intermediate variables, loops and
conditionals that you need to stop and wonder what they do, and what their
role is inside that function, then that function should be split up, because
you do not have a clear delimiter between "what" and "how".

And guess what? That also assists testability. You can test that the "how"
correctly delivers the "what".

This is not destroying your system.

Some of the functions you end up with may even turn out to be reusable across
the class/system, meaning you increase consistency and correctness as a result
too.

Again: This is not destroying your system. Quite the opposite.

I could go on, but I'm only at the 4th page of 21, my comment is already the
biggest in the thread, and I'm not planning on making a blog-post-length
response.

Just saying this document is severely biased, contains factual errors, and is
not a good thing to rely on to present an argument.

------
vannevar
The author is correct, in the sense that over the lifetime of a software
project 90% of the value from unit testing will come from 10% of the tests.
The problem is that you won't know in advance which 10% that's going to be, so
you have to write and maintain all of them.

------
eeZah7Ux
Somebody should tell them about
[https://en.wikipedia.org/wiki/Code_coverage#Other_coverage_c...](https://en.wikipedia.org/wiki/Code_coverage#Other_coverage_criteria)
and mutation testing.

------
jroseattle
While some points here are worthy of discussion, this blames the arrow instead
of the archer.

Write good tests, automate what you can, and make sure you're using your
chosen implementation correctly and appropriately.

------
icedchai
As usual, "it depends"...

I worked on projects that tested every getter and setter. Many of these
getters and setters existed only for other tests! Complete waste of time
testing those...

------
deliveryninja
The host of this PDF is a testing consultancy which seems to have posted the
same article about 10 times to Hacker News already. It's just spam.

~~~
quickthrower2
I posted it, and I have no connection to the consultancy. I thought there were
some good arguments in the PDF which I agree with after >10 years of dev
experience.

------
xyz-x
The article mentions object orientation several times as a hindrance to unit
testing. Perhaps that should be its biggest take-away?

Having written an accounting system, including web/user interface and
asynchronous/deferred coordination (such is HTTP/browser programming), for the
last three years, I can say that functional programming is increasingly
helping my team stay sane.

We do TDD always, sometimes in the form of unit tests, sometimes in the form
of integration tests, and we tend to write as many of our tests as possible as
random/generative tests, to avoid having to hand-write large test code-bases.
I've spent the last two days making a piece of our domain monoidal, having
defined the three laws as property/generative tests; the rest of the time is
interactive play with the generator, to see if it can come up with
counter-examples to the code I just wrote.
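
We use a proper generative-testing library for this, but a hand-rolled sketch
of checking the three monoid laws (Java just for illustration; example monoid:
non-negative ints under max, with identity 0) looks something like:

    import java.util.Random;

    class MonoidLawsTest {
        static int combine(int a, int b) { return Math.max(a, b); }
        static final int IDENTITY = 0;

        public static void main(String[] args) {
            Random rng = new Random(7);
            for (int i = 0; i < 10_000; i++) {
                int a = rng.nextInt(1000), b = rng.nextInt(1000), c = rng.nextInt(1000);

                // Law 1: associativity.
                if (combine(combine(a, b), c) != combine(a, combine(b, c)))
                    throw new AssertionError("associativity failed");

                // Laws 2 and 3: left and right identity.
                if (combine(IDENTITY, a) != a || combine(a, IDENTITY) != a)
                    throw new AssertionError("identity failed");
            }
            System.out.println("laws held for 10,000 random cases");
        }
    }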

I normally go about coding by writing a very high-level integration test (at
the top-most layer that I still have code in the service/frontend); then I
write a huge chunk of the code until I think it's correct and looks pristine
and easy to maintain. Now I run the test. If it fails and the test is correct
— it tests the right thing and is easy to read — then I start at the top
(highest level) of the stack, and write down the assumptions I have thought
about while designing the code — as unit tests. Until one passes (at some
level n); at the higher level n+1, an assumption/unit test is now broken and I
can divide-and-conquer until I find the line of code that doesn't work.

This, together with purification, the methodical extraction of pure functions
(things that only ever have one output for a particular input), makes it
possible to avoid testing any side-effecting/async things (they simply flow
non-async data between pure functions).

This, together with first-class values for control flow (i.e. not using
exceptions) and generative/random testing, makes it so that ALL input has
valid output, i.e. all functions are total functions. And this in turn makes
the code uncrashable and bug-free (for the domains of bugs the above
methodology removes).

The domains of bugs that the above doesn't eradicate are primarily cross-
browser bugs on the GUI side, or UX bugs, where a feature is hard to
use/understand. On the server side we sometimes crash when our logging storage
in ElasticSearch goes down and never comes back up: the intermediate buffer
(Logstash) fills up, then the app buffer fills up, and then the app livelocks,
waiting for the logging to drain (=> operations). The second most frequent
source of exceptions/errors/bugs is DNS not working.

The first year of writing the software, we still invoked libraries that threw
exceptions, but now we've rewritten them all to do control flow with first-
class values, so that is not an issue any longer.

I just wanted to share how we do stuff at qvitoo :), in case it helps anybody.

------
singularity2001
Not only is it often waste, but it can tremendously add to code bloat and
'debt'. I saw a dev spending 80% of his time 'fixing test setup' instead of
fixing code.

~~~
octix
To me that would be a red flag. If it's difficult to set up within your tests,
then it's difficult within your code and/or too much magic is happening.
------
abritinthebay
Most unit testing is wasteful because most unit tests... are not unit tests.
Or are bad unit tests, at least.

I can count on one hand the number of well-written project test suites I've
seen - even on code bases with 100% coverage.

We, as a group, do not seem to be good at writing them. I’m not sure why, and
would love a discussion about it...

------
susankim
Unit tests are not a silver bullet, but if you require passing tests and
coverage before anything goes to prod, you basically get a minimal and always
up-to-date form of documentation.

------
danjoc
This is truly the dumbest thing I've read all year about software development.
It isn't just dumb though, it's harmful. Now moron developers everywhere will
hold it up as an excuse why their code is so good it doesn't need testing.
Now, especially after Equifax, is a bad time to have this attitude. There are
already journalists calling for developer licensure.

[https://www.nytimes.com/2017/09/11/opinion/equifax-accountab...](https://www.nytimes.com/2017/09/11/opinion/equifax-accountability-security.html)

~~~
dkarl
This is a critique of popular testing practices by an experienced engineer who
is trying to improve the practice of testing and increase software quality.
This is a big part of how engineering practices improve. A culture where we
call something "dumb" without reading it because we only care what "moron
developers" will make of it (presumably also without reading it) isn't going
to increase software quality or protect against the next Equifax. It's easy to
have a knee-jerk reaction to a title and rationalize the desire to respond on
that basis. Surely nobody else will read the article, and therefore _I have
already determined its true significance_ by having a knee-jerk response to
the title or a few haphazardly skimmed sections. If that reasoning prevailed,
though, people should just write headlines and never anything more thoughtful,
and we should be stuck forever with "UNIT TESTING GOOD" and "UNIT TESTING BAD"
as the two competing pinnacles of software testing wisdom.

In case anyone was fooled by this comment into thinking the article takes a
lazy approach to testing, it argues for keeping a certain class of unit tests;
for another class, preferring system tests instead; and for a third class,
turning them into assertions that ship with production code where possible.
That's not lazy. Depending on how you currently test your code, it may
actually be a harder standard to meet. It's possible you personally have
nothing to learn from it, but I would wager there's something in it that will
make you see your tests in a different way.

------
fvdessen
In my experience the usefulness of Unit Tests is very dependent on the quality
of the team and its grasp on the business requirements. Some programmers can
write high quality code that doesn't need unit/integration testing at all,
some can't.

~~~
hacker_9
Legacy code = code without tests. You might write HQ code today, but without
tests it'll become technical debt within a month or even a week down the line.
Programmers come and go, unit tests are omnipresent.

~~~
bearjaws
All code is legacy code as soon as you type the semi-colon. It is wishful
thinking that a suite of unit tests will prevent a refactor or small change
from producing technical debt.

~~~
mnsc
In a sense you are right. That state where you have the complete program as a
mental model goes away when you stop for that session. But if you pick it up
the morning after, that mental model can be reconstructed in under an hour,
and you can finish the feature and deliver it as value to a user. If you
created good tests (unit/integration/regression) that capture the gist of what
the code should do, two-years-older you or another developer has a good chance
of building a similar mental model and changing the code with confidence. So I
consider "legacy code" a scale where on one end you go "The reqs have changed,
better rewrite it all!" and on the other "The reqs have changed, but I can
confidently add this functionality and keep the unchanged behavior!"

