
Test coverage only matters if it's at 100% - charlax
http://www.dein.fr/2019-09-06-test-coverage-only-matters-if-at-100-percent.html
======
cbanek
As someone who worked somewhere where 100% code coverage was practically
required: while it does point out flaws in your testing where you may have
"missed a spot," it still doesn't prove that you're testing everything. You
never know if a library function you are calling beneath your code has a
different path it can also follow.

> I believe the only level of test coverage that's worth it is 100% of the
> lines you want to be covered.

I vehemently disagree with this statement. It seems like the kind of thing
someone would hit you with as an excuse for why testing isn't useful at all
because it's not perfect. In my experience, simple end-to-end tests give me
the best bang for my buck, and even a test suite that covers 80% of the code
is pretty good.

To paraphrase from the Google code reviewer guide: No code is perfect, just
try to make it better than it was.

100% test coverage doesn't tell you if you have race conditions, security
problems, or networking issues. 100% test coverage doesn't matter much in
distributed systems, where you can individually test each component to 100%
but still have the system not work as a whole. 100% coverage doesn't mean
you're checking that all your returned results are verified and what they
should be.

> The problem with having a goal that is not 100% is that it leaves room for
> interpretation and negotiation.

By handing out edicts like these, you're taking away tools like negotiation
and common sense and replacing them with blanket rules. Time is a finite
resource and test coverage is only one small part of testing. Going from 95%
to 100% might take more time and provide less value than other kinds of
testing and other concepts (like UX, or market fit, where you test that what
you're building is usable and useful).

Just because you answer all the questions right on the test doesn't mean you
have perfect knowledge of the situation. Thinking such is hubris and leads to
errors/defects.

By excluding lines and saying to only cover what you want tested, you're
intentionally creating blind spots. Just let the coverage number be what it
is. Look at your coverage reports. Check the coverage of key components.

> I think it is a perfectly fine decision to exclude some code from coverage
> if the tests are too expensive to write (because of the required setup for
> instance).

No, no, no, no, no x a million. This is the fallacy that if you unit test each
component, then when you put it together it's perfect because each part is
perfect. You still have to do integration tests and complicated setup tests.
If a test takes a long time, try to make it shorter, don't try to avoid
testing it. Anything you avoid testing is where bugs will nest and code will
churn.

To sum up, most of the important, complicated, and hairy bugs are not found by
unit testing. It is the simple bugs that are found by unit testing. These are
important to get out of the way, because a system with a thousand cuts can
still kill you, but in no way does 100% coverage with tests passing mean
anything other than 100% coverage with tests passing.

~~~
virgilp
I would go further and claim that 100% test coverage is worse than 90% test
coverage. You're probably testing irrelevant stuff, which just means that you
have more test code to maintain, and also likely means that your tests are bad
(i.e. not testing business logic, but testing irrelevant implementation
details). Last, but not least, 100% test coverage means that your code is bad
or you're not programming defensively. How exactly did you cover all those
error-handling bits for unexpected exceptions, the "stuff that theoretically
should never happen, but in practice it might, so let's have some sensible
treatment/logging/etc."?

~~~
plorkyeran
Or it could mean that you've contorted your code to make sure that you can hit
every single defensive edge case, and in the process possibly introduced new
bugs due to the increased complexity.

------
aluminum96
This is totally ridiculous.

Test coverage clearly matters even if it's not 100%, as anyone who has ever
worked on an older codebase knows.

As long as you're not causing coverage regressions, it is perfectly acceptable
to leave untested code that "should" be tested, but hasn't been changed in
years.

Test it when you change it; don't test for the sake of achieving a coverage
benchmark.

~~~
rightbyte
And implying that there is 100% coverage just because the instruction register
had all valid values is silly.

When writing tests, one should worry about the other registers too...

~~~
DoofusOfDeath
Thank you for stating this so succinctly!

This concern has been nagging me in the back of my mind, but I hadn't
convinced myself that code-coverage was insufficient (in practice) until you
framed it this way.

Now all kinds of supporting examples come to mind:

- divide-by-0 errors

- floating-point exceptions

- numeric underflow / overflow (both silent and signaled)

The alternative code paths exercised by these situations are generally outside
the purview of code-coverage tools, because they reside in the silicon /
firmware, process' default signal handlers, etc.

~~~
m4r35n357
Mutation testing can expose weaknesses like these in your automated tests.

------
aeternum
This is the type of article I'd expect from a junior engineer.

Test coverage matters even if it is not 100%, especially if it covers the most
important use-cases. Cost and time always need to be considered; 100%
coverage may not be worth it. Coverage based on LOC is also an arbitrary
metric in some sense; the author himself has examples of that. From a business
POV, use-case coverage is more important.

------
lvturner
We had some mild debates at an old company on this matter (I don't think
anyone was really ever pro 100% coverage, but we were under pressure to
increase coverage in order to provide momentum and to emphasise that automated
testing should be routine)

It resulted in a mug being made in the same style as our company mugs but with
the words "Fuck Coverage" instead of the company name.

Eyebrows were raised, however, when it was mistaken for a 'real' mug and taken
into a board meeting by a manager...

------
kempbellt
So, don't bother writing any tests unless you can test everything?

At a previous shop, for a while, until we created more coverage, we had a
simple test that ensured a user could log in. We didn't test every aspect of
the code. This simple sanity check saved us once or twice.
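
Something as small as this is enough (the endpoint and credentials here are
hypothetical, a minimal sketch of the kind of sanity check described above):

    import requests

    def test_user_can_log_in():
        # Hypothetical staging endpoint and test account.
        resp = requests.post(
            "https://staging.example.com/login",
            data={"user": "test", "password": "test"},
        )
        # A broken login shows up here before any user sees it.
        assert resp.status_code == 200
        assert "session" in resp.cookies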

Should we have not written this test just because we couldn't test every
conceivable piece of code?

Don't let perfect be the enemy of good...

~~~
AgloeDreams
Exactly this, plus, the entire concept of 100% coverage is hilariously wrong.

Are you covering 100% of code interactions? How much of your integration logic
are you mocking? Are you covering 100% of HTML and interface logic? How much
code is not in your code? What about packages and imports? It's incredibly
arbitrary, for no good reason except that it slows development of a product
that may depend on fast development for survival, in exchange for testing
bugs that honestly may never happen. To be clear, testing is good, but you
write tests to serve the code and product, not the other way around.

------
josepot
Test coverage is a very dangerous metric. IMO it will only be useful the day
we find a way to discount tests that merely test implementation details.
Every time I see a high number being enforced, the end result is a disaster:
lots of poor-quality tests that are only there to bump up the metric. Those
tests are poison: they don't aim to test correctness, they make
refactors/improvements a PITA, they lock you down to a concrete
implementation, and they don't even give you any guarantees that your code
works correctly.

~~~
AgloeDreams
Bingo, this *1000.

Your team ends up spending half their time writing painful depressing tests
that are bad afterthoughts. Generally with TONS of mocks and little real
meaning.

~~~
cbanek
The other half of the time is spent editing the tightly coupled mocks/tests
when you refactor the real code. Bad tests can be a liability, not an asset.

------
amflare
The title is disingenuous. A better title would be "100% test coverage should
be the goal". Though that is laughably obvious and would generate no clicks,
so I can see why the author made the choice they did.

That said, if you have a single test, that's better than no tests. And 90% is
better than 80%. Yes, 100% is ideal, but if you let perfect be the enemy of
good, then you waste time when low-hanging fruit breaks.

------
t-writescode
The irony to me is that the examples he provides for things not to cover are
exactly things I think should be covered. Short-circuit logic is ripe for bugs
and the code that causes the short-circuit should definitely be tested!

~~~
dbcurtis
Yes, and exceptions. Tests must tickle all the exception paths. An exception
that contains a dumb typo _crashes the system_ instead of propagating up to
get handled. Just touching the exception path roots those out.

    
    
      # bug: str.join() takes one iterable, so this line raises a
      # TypeError instead of the intended ValueError
      raise ValueError(' '.join('meaningful message with:', parameter))
    

Is a stupid mistake that I _still_ make all too often inside an exception
handler.
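
A minimal sketch of what touching an exception path looks like with pytest
(the function and message here are hypothetical):

    import pytest

    def widget_count(items, limit):
        # Hypothetical function with a guarded error path.
        if len(items) > limit:
            # The message construction is itself code that can be wrong,
            # so the test below forces this line to execute.
            raise ValueError(' '.join(['too many items:', str(len(items))]))
        return len(items)

    def test_widget_count_raises_on_overflow():
        # Exercising the raise statement catches typos in the message
        # construction before production does.
        with pytest.raises(ValueError, match='too many items'):
            widget_count([1, 2, 3], limit=2)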

------
rraghur
- Don't write unit tests to test code; use them to guide your design.

While that statement is itself somewhat questionable, it's a little better
and IMO reflects the purpose of unit tests better. Usually, if I'm finding it
hard to write or set up a test, that's a signal that something's off. Also,
UTs are absolutely lovely when you're refactoring code, as long as the code
tests invariants and publicly visible behavior instead of implementation. So
even if the part you're refactoring has, say, 60% coverage, you'd still be
much more confident making changes.

Code coverage under anything other than unit tests can be very misleading -
for ex I've seen places where system & integration tests are run on
instrumented code (and little to no unit tests) and since a huge swathe of
code will be hit (mostly happy path), there's a false sense of code quality
and safety which is probably more dangerous than no tests.

Overall, "When a measure becomes a target, it ceases to be a good measure."
applies most often to test coverage.

------
everyone
The author is either crazy, or else maybe works in a very niche area where
this makes sense. I work in games, and some modules are very conducive to
automated testing.

Eg. complex math functions, it'd be very easy for them to be not quite right
and you would never notice without tests, and also they will probably never
change.

An area that would be stupid to test (with automated tests) for example, would
be a UI menu.

* The test (i.e. replicating a user pressing buttons via code) is probably gonna be more complex and less maintainable than the code it's testing.

* If there is a bug it will be obvious in human acceptance testing.

* UI menus are gonna change drastically during development.

Anyway, I could see a misguided requirement for 100% code coverage massively
hampering a project and potentially killing it.

~~~
cbanek
And yet, UI menus are one of the things that always break, and break right in
front of the user ("I can't click this button. This is greyed out and it
shouldn't be.") Menus should definitely be tested, and I think testing them in
an automated way is fine. Without a working UI, you have no control.

You can even do model based testing.

For example, say you're testing a word processor. You know your starting
state is no document. Then you click the open-file button. You press some
other buttons, and knowing what each button does, you should expect a
specific change.

That being said, usually you can "press" the buttons virtually, for example
by sending window messages or calling the same functions the buttons are
wired up to, rather than knowing the pixel layout.

People have been doing these things for years.
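
A minimal sketch of the idea in Python (the Editor class and its handlers are
hypothetical stand-ins for a real UI): keep a model of the expected state,
"press" buttons by calling the same handlers the UI wires up, and compare
model to reality after each step.

    class Editor:
        # Hypothetical word processor under test.
        def __init__(self):
            self.document = None

        def press_new(self):
            self.document = ""

        def press_type(self, text):
            self.document = (self.document or "") + text

    def test_model_based_editing():
        editor = Editor()
        model = None                # our model of the expected state

        editor.press_new()          # "click" New by calling its handler
        model = ""
        assert editor.document == model

        editor.press_type("hello")  # each press updates model and reality
        model += "hello"
        assert editor.document == model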

~~~
everyone
You're sort of reiterating my point.

I know UI stuff breaks a lot, but my point is that when it does break, _it's
obvious_. Bugs should be easily caught in human acceptance testing.

Also an automated test UI is never going to be the same as a human test, a lot
of UI bugs can result from the vagaries of various physical devices
touchscreens / gamepads / motion controls etc.

Also, anything that tests such high-level functionality is going to be very
complicated and touch many parts of your code; it will probably be more
complicated and less maintainable than the code it's testing, which defeats
the purpose IMO. Do you want to write tests for your tests?

UI stuff tends to change radically over development, so you'd be more than
doubling your work with all the test code you'd have to write.

And it would still never replicate an actual human running the actual game;
you absolutely are still going to be doing human acceptance tests, and the
closer you try to get automated tests to that, the more complex and less
maintainable they get.

------
dbcurtis
100% of what? Oh... line coverage, in Python.

Well, with Python, the compiler doesn't do type checking, so 100% coverage (or
close) is very important. One of his cases of "what do you not need to check"
is error handling. Yet the difference between production code raising an
exception that gets handled and the exception blowing up and crashing the
system is a critical system-stability defect. And in fact, in my personal
code, since I tend to try to create exception messages that are as specific as
possible, I must admit with chagrin that screwing up a ''.join() to create the
exception message is one of my most common errors. I at least touch-test my
exception code, without exception.

And then... line coverage? Oh my, that is just the beginning. I pretty much
always use the hypothesis package so that everything gets exercised with a
large variety of inputs. Inputs that I didn't have to think up, and that avoid
my pre-existing biases about what might get thrown at the function.
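
For example, a property-based test with hypothesis might look like this (the
clamp function under test is hypothetical):

    from hypothesis import given
    from hypothesis import strategies as st

    def clamp(value, low, high):
        # Hypothetical function under test.
        return max(low, min(value, high))

    @given(st.integers(), st.integers(), st.integers())
    def test_clamp_stays_in_range(value, low, high):
        # hypothesis generates the inputs, including the extremes and
        # edge cases that hand-written examples tend to miss.
        if low <= high:
            assert low <= clamp(value, low, high) <= high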

Line coverage doesn't do much to really ensure that you have reached
"interesting conditions". I wish Python had a good tool for that kind of
instrumentation. Line coverage is really only a "touch-test", mainly useful
as a check-in gate, not as production testing.

~~~
DoofusOfDeath
> Well, with Python, the compiler doesn't do type checking, so 100% coverage
> (or close) is very important.

Intuitively I agree with you: In my personal experience, the more dynamically
typed a language is, the more unpleasant surprises I get at runtime.

Some years ago I did a project in MLTON [0], and I was delighted by how many
bugs were caught at compile time that, in other languages, would likely have
gone unnoticed until runtime.

[0] [http://www.mlton.org/](http://www.mlton.org/)

------
pacaro
For me the absence of coverage, or low coverage, is a useful piece of
information. It's not the whole story, but it is a smell, and it should be
taken into consideration with the other smells/signals.

Conversely, perfect or high coverage is not a positive signal _per se_, but
rather just the absence of a negative signal.

------
alephnan
Edgy, but wrong.

Is there a difference between 79.9% and 80%? Maybe not. My team is increasing
test coverage for a legacy code base. 60% to 80% DOES matter. And it’s not
just that the tests flag regressions: the very process of increasing test
coverage uncovers bugs / dead code.

------
nrook
There's an insight here I haven't seen elsewhere. Everyone but the most blind
ideologue knows 100% test coverage isn't a good use of your time. But it's
hard to care about increasing coverage from 89% to 90%, or to feel justified
in objecting during a code review because the change under review would
decrease coverage from 90% to 89%.

The solution to this dilemma is to make it easy to mark code as "OK not to be
tested". It's annoying to have test annotations in code, but I'd expect mostly
entire modules will be exempted, so it's not that big a deal. And this way,
the coder has to either write tests or make a specific decision not to.
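
With coverage.py, for instance, marking code as excluded is a one-line
annotation (the function here is a hypothetical example):

    def handle_impossible_state(state):  # pragma: no cover
        # Explicitly excluded: the decision not to test this is now
        # visible in the code and in code review, not hidden in a
        # percentage.
        raise RuntimeError(f"impossible state: {state}")

Whole patterns (and thus entire modules' worth of lines) can also be excluded
centrally via exclude_lines in the [report] section of .coveragerc.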

------
choward
> With this setup, test coverage will be at 100% even though we did not test
> the case where the toaster is not plugged because it is too simple. Writing
> a test for this "functionality" would not be worth it. Therefore we
> explicitly exclude it from coverage. You should revisit this decision if it
> becomes more complex and needs to be tested.

This is ridiculous. He writes unreachable code and says he doesn't test it
because it's "too simple". No, you're not testing it because you can't test
it, because it's code that will NEVER run. Instead of deleting the pointless
code he marks it as not needing test coverage.

------
rbongers
Not my experience at all... Tests need to be intelligently written for the
project at hand. Sometimes that means testing something to freeze the API,
writing tests for small functions that are easier to test in a suite than
anywhere else, or catching a recurring failure, such as testing against an
older version of a library or runtime, or testing something that you know
could be fragile.

------
ihaveajob
Seems like the Intelligent Design nuts arguing "what good is half an eye?".
Well, it's half as good as a full eye, and a lot better than no eye.

------
the_narrator
I think what really matters is that your test coverage percentage increases
over time and never goes down. A goal of 100% might be ideal, but is rarely
realistic.

------
mabbo
This is such bad advice that's so well written I can only assume the author is
trolling HN-types with it.

I'm serious. I think we're being trolled here everyone.

------
al2o3cr

        Some lines of code should be tested twice to ensure they're correct.
    

Translation: "100% coverage" is a vanity metric

------
m4r35n357
Not only that, you should subject your automated tests to mutation testing.
No link because Wikipedia seems to be down for me.
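
The idea behind mutation testing: a tool makes small changes ("mutants") to
your code, and if the suite still passes, your tests weren't really checking
that line. A hand-rolled illustration in Python (tools like mutmut automate
this):

    def is_adult(age):
        return age >= 18          # original

    def is_adult_mutant(age):
        return age > 18           # mutant: >= flipped to >

    def test_is_adult_boundary():
        # A test that only checked age=30 would pass against both
        # versions, failing to "kill" the mutant. The boundary check
        # below does kill it.
        assert is_adult(18) is True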

------
matt2000
Given unlimited time, money and importance of the code in question - go for
100% coverage. Otherwise, probably not.

------
tracker1
100% coverage only ensures that you've looked at everything twice. Nothing
more, nothing less.

------
mpweiher
There is actual empirical evidence that this is false.

~~~
wvenable
Feels like you didn't read the (whole) article.

The author makes a good point -- everything should either be covered by tests
or _explicitly marked as excluded_. Having a gray area of code that is
included but not covered by tests is a recipe for missing something. That much
seems obvious.

~~~
nimblegorilla
Seems like 50% coverage is better than 0% coverage and 75% coverage is better
than 50% coverage.

The author argues about the horrors of going from 80% to 79.9%, but I don't
see how that is any different from the suggestion to explicitly exclude code.
The excluded code is still in the gray area of code that is included but not
covered by tests. What happens when someone starts making breaking changes to
excluded code?

~~~
derefr
> What happens when someone starts making breaking changes to excluded code?

You... don't. You un-exclude it first, and then, because of the rule, you're
not allowed to commit the code until you write the tests that bring it back up
to 100% coverage. _Then_ you make breaking changes.

~~~
nimblegorilla
I'm not following the benefit of excluding code in the first place. Your
method seems like more work, more error prone, and more likely to leave
unintended gaps in coverage.

