
Most Unit Testing Is Waste (2014) [pdf] - ptr
http://www.rbcs-us.com/documents/Why-Most-Unit-Testing-is-Waste.pdf
======
thatswrong0
> If you want to reduce your test mass, the number one thing you should do is
> look at the tests that have never failed in a year and consider throwing
> them away. They are producing no information for you — or at least very
> little information. The value of the information they produce may not be
> worth the expense of maintaining and running the tests.

This seems... entirely unreasonable to me. Just because some area of code is
not touched frequently doesn't mean that we should throw out tests pertaining
to that section of code. Can someone justify this to me? This seems like a
_terrible_ mistake.

~~~
ryanmarsh
If you want to reduce your code mass, the number one thing you should do is
look at the code paths that have not been reached in a year and consider
throwing them away.

Still make sense?

~~~
plandis
So remove all code for handling unlikely failure scenarios? ;)

~~~
zachrose
This would depend on the kind of software you're making and the potential for
an unhandled error to cause injury, expense or damage, but yeah.

I would really like some system that can tell me how many times a line of
production code has executed in the last year. For an ecommerce web app, I
can't imagine how a line of code that hasn't run in the last two years would
still need to be there.
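
The mechanics of such a system are simple, for what it's worth. Here is a rough sketch in Python using `sys.settrace` (a real deployment would use coverage.py or similar, since tracing every line this way is far too slow for production; `checkout` is a made-up example function):

```python
import sys
from collections import Counter

# Per-(file, line) execution counter. Illustrative only: real setups
# should use coverage.py, and this tracer is very slow.
line_hits = Counter()

def tracer(frame, event, arg):
    if event == "line":
        line_hits[(frame.f_code.co_filename, frame.f_lineno)] += 1
    return tracer

def checkout(cart):
    total = 0
    for price in cart:
        total += price
    return total

sys.settrace(tracer)
result = checkout([500, 1000])
sys.settrace(None)

# line_hits now maps (file, line) -> hit count; lines inside the loop
# show more hits than lines outside it.
```

Dumping `line_hits` periodically over a year would answer exactly the "has this line ever run?" question, at least in sketch form.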

------
guessmyname
In my previous job I maintained a big project which originally had zero unit
tests. It was apparently working well, but once I started applying TDD I found
several security issues, edge cases, and memory overflows. By the end of my
contract I had written more than 2,800 unit and integration tests, and
although code coverage was still low at around 60%, I am confident that new
developers will be able to take ownership of the project from the beginning
without much training, just because I wrote those tests. I consider testing a
useful way to ease new contributors into the development process: they will
not be afraid to touch the code, because the tests will warn them when
something breaks after a modification. Without tests they would surely spend
more time checking the details of their changes than integrating into the
team.

~~~
_RPM
What was the definition of code coverage at your shop? Does it mean testing
every line of execution?

~~~
exception_e
He/she probably used a tool to capture the coverage %.

~~~
_RPM
This comment reminds me of a typical response from a politician to a pressing
question.

~~~
exception_e
Sorry for the very literal answer. I honestly thought you were interested in
how the coverage was calculated.

As an aside, I've worked on projects where coverage wasn't used and the rule
was to "test the important bits" and I've also been on projects where coverage
had to be >= ~98%. I wonder if a middle-ground approach would be effective.

------
jondubois
I agree that unit testing is a waste if the project requirements change
quickly. Unit tests can take ages to write and if the project requirements are
changing constantly, then you keep having to update them and it slows down
development and kills productivity - I've seen this happen many times in
previous companies.

Unit tests only make sense for components of a system which are BOTH critical
and stable. I think that integration tests are often way more useful for most
projects.

~~~
stcredzero
_Unit tests can take ages to write and if the project requirements are
changing constantly, then you keep having to update them and it slows down
development and kills productivity - I've seen this happen many times in
previous companies._

In some environments, refactorings to the code will also refactor the tests.
Doing test first is only a win in the situation where you don't have to
duplicate work, and only if management wouldn't otherwise give you the time to
write tests after the fact. Unfortunately, the former is less common, and the
latter is more common.

------
swift
Reading over the comments, a lot of people seem to be concerned that unit
tests compromise their ability to change code quickly as requirements change -
they find themselves spending too much time updating tests instead of doing
real implementation work.

In the experimental stages of a project, I'd buy that. But once a project has
matured to the point where it's working and the architecture is broadly in
place, requirements changes are _usually_ not so fundamental that there is no
resemblance between the old and new requirements. If you're finding yourself
having to rewrite large swathes of unit tests when the requirements change,
you need to ask yourself if the real problem is that your code is poorly
factored. If you're breaking down your code into simple, independent, cleanly
composable pieces, you'll find that changing requirements poses much less of a
testing burden.

------
bit_logic
The biggest problem with unit tests is they are abused as a metric for code
quality. Pretty code coverage graphs and percent numbers are easy to present
to managers. And outsource firms love these because it adds pointless work
they can justify as "code quality".

But what happens is that the unit tests become filled with completely useless
logic, like a test with only mock objects that just checks whether a method
can be called. Useless, but good for padding those code coverage numbers. And
ironically it greatly decreases code quality, because when these useless unit
tests cover everything, refactoring becomes very difficult. And so no one
refactors to improve the code, because it's too much work.
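
The anti-pattern described above might look like this in Python (the `OrderService` and `repo` names are hypothetical, made up to illustrate the point):

```python
from unittest.mock import MagicMock

# Hypothetical service, invented for illustration.
class OrderService:
    def __init__(self, repo):
        self.repo = repo

    def place(self, order):
        # Real validation/pricing logic would live here.
        self.repo.save(order)

def test_place_calls_save():
    # Everything is mocked; the test only pins the internal wiring and
    # asserts that a method was called, checking no actual behavior.
    repo = MagicMock()
    OrderService(repo).place({"id": 1})
    repo.save.assert_called_once()

test_place_calls_save()
```

This marks every line of `place` as covered, yet any refactor of the wiring (say, batching saves) breaks the test even when the observable behavior is unchanged.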

~~~
stcredzero
_Because when there's these useless unit tests covering everything, it makes
refactoring very difficult._

We had this solved in Smalltalk in the early 2000's. Why has the rest of the
programming industry screwed the pooch on this, since?

~~~
initram
Care to provide any details on how this was already solved? I'd love to know
and I'm sure the many people here who write tools for developers would also be
interested.

~~~
stcredzero
_Care to provide any details on how this was already solved?_

Our standard refactoring tool, the original Refactoring Browser (RB), covered
Unit Tests just the same as any other code. If you fired off a "canned"
refactoring in it, it was guaranteed to be correct, you could undo/redo it
multiple times, and it covered everything currently loaded in the image, which
typically included Unit Tests.

I notice in Visual Studio and also with CLion and some other "refactoring"
tools for C++ I've played around with, that some "refactorings" amount to:
"We'll generate/move code for you, but you're on your own with regards to
correctness." This reduces the usability and productivity of the tool
considerably.

In addition, the Refactoring Browser was considerably more responsive. Most
"modern" refactoring tools feel heavy and ponderous by comparison; the
Refactoring Browser was very snappy. Brant and Roberts, the authors of the RB,
did this cute demo where they went and replaced every reference to Object in
Smalltalk, transforming Smalltalk into a Thingy-Oriented Programming language.
It took under a minute. (This was in VisualWorks 3.1.)

On top of that, you could also write SQL-like queries of the code base in a
parser tool, that had full syntactic equivalence to the entire language, with
wildcards, and even fully scriptable conditional parse-node matching. You
could pop the results of this up in a browser, then rinse, repeat. On top of
that, you could use the same parser engine to do wild-card replace! In doing
this, you would have gone beyond the "canned" refactorings, so the undo/redo
stacks in the Refactoring Browser would no longer be valid, but in Smalltalk,
you had the Change Log, which is basically a checkpointed transactional log
for all the changes to the image/loaded-codebase. You could even search and
filter this log in a GUI tool. As this was entirely local to your install,
this was very handy and fast as well. (Until it came time to re-run your
change log, and you hadn't saved your image (checkpointed) in like a week. But
if you were such a lazy dev, you were getting what you deserved!)

Most of the above has to do with the easy access to the meta-level of the
Smalltalk language, combined with the simplicity of the language itself. This
makes it 100x easier to write tools, so more tools get written. (1) In
particular, very high quality parsing/rewriting tools can get written. Such
tools are much lighter weight, and so are more responsive. On the other hand,
the high cost of parsing and accessing the meta level in languages like C++,
Ruby (parsing only), and Java, just to name a few, has a huge impact on the
"landscape" of those languages' programming environments. It's akin to the
effect that different kinds of landscape and climate can have on communities
of people living there. So the lessons here aren't just for the tool makers.
Really, it starts with the language designers, who set what becomes the
landscape for the tool makers.

(1) - You can literally start writing a full Smalltalk parser by hand, using
top-down LL(1) parsing, have most of the language parsed the same afternoon,
and reasonably expect to finish in 1 or 2 days. You can also literally start
writing a Smalltalk debugger and have something that lets you browse exception
stack frames in 10 minutes. EDIT: And to head off the predictable
"objection" -- in VisualWorks 7+ the resulting parser would actually run
faster than the equivalent YACC/Lex parser in naive C.

------
programmarchy
For the most part I find myself writing unit tests to save time debugging.
With a unit test I can create a limited context for my module to execute
within, instead of having to manually run through several steps in the larger
application to test out some piece of functionality.

------
SomeCallMeTim
I think most unit testing is a waste because it duplicates what a good type
checker would do for you. I'm using TypeScript now, and after a break from
using typed languages, it's a huge breath of fresh air.

Much of the code I write doesn't get unit tests at all, because it's simple
enough that _it won't ever fail_. Refactoring major blocks of code is safe,
even without unit tests, because the type checker ensures that everything is
wired up in a sane manner when you're done. Good design can obviate the need
for many unit tests.

When people talk about test code coverage in JavaScript/Ruby/Python, I think
the main reason they want close to 100% coverage is that many runtime failures
in those languages occur because some line of code somewhere is accessing a
type incorrectly. That doesn't happen if you're using static typing.
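
As a small illustration of that class of bug (using Python type hints in place of TypeScript, since the point is the same), a checker like mypy rejects the wrong-type call before the code ever runs:

```python
def total_cents(prices: list[int]) -> int:
    """Sum an order's line-item prices, given in cents."""
    return sum(prices)

# In a dynamic language this misuse only blows up (or silently
# misbehaves) at runtime; a static checker rejects it at build time:
#
#     total_cents("19.99")  # mypy: incompatible type "str"
#
correct = total_cents([1999, 500])  # checked usage
```

No unit test is needed to catch the `"19.99"` call; the checker already refuses to compile it.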

If you've got some complex logic, making sure it works using unit tests is
fine. I still do that with anything I consider non-trivial. But if you've got
a really simple function that obviously works, and TypeScript ensures the
function will always get the types it's expecting, writing tests to ensure it
will keep working forever is just a waste of time, unless it's to verify for
your own sake that the "obvious" function does what you think it should. But
in that "TDD" case, keeping the test around just makes the code base more
brittle, since if you decide you need to change the way the function works you
now have two functions to maintain instead of just one.

~~~
aprdm
TDD "got mainstream" with Java which is a compiled language. Therefore I don't
think you're correct.

It's usually the most simple functions that hide the most subtle bugs.

~~~
SomeCallMeTim
Really? I saw it mostly in the Ruby community first, followed by Python and
JavaScript. Looking at the Wikipedia page, it looks like it came from Extreme
Programming and the C3 project [1], which was Smalltalk. And in case you
aren't sure, Smalltalk is dynamically typed. [2]

Java has a huge enterprise presence as well, so I'm not too surprised that
it's popular in Java circles. Enterprise developers aren't famously top-notch
developers in general.

>It's usually the most simple functions that hide the most subtle bugs.

I can't remember the last time a type-checked simple function I wrote hid a
subtle bug. But I'm admittedly a few sigmas above average, so my experience is
probably not typical.

In studies, TDD is a wash: It neither improves overall programmer throughput
nor makes it worse. So use it if it's what makes you happy. I've used it for
more complex goals myself, so I understand the draw. I just usually don't need
it.

[1]
[https://en.wikipedia.org/wiki/Chrysler_Comprehensive_Compens...](https://en.wikipedia.org/wiki/Chrysler_Comprehensive_Compensation_System)

[2]
[https://en.wikipedia.org/wiki/Smalltalk-80](https://en.wikipedia.org/wiki/Smalltalk-80)

~~~
aprdm
You're funny with the "Enterprise developers aren't famously top-notch
developers in general."

and

"But I'm admittedly a few sigmas above average, so my experience is probably
not typical."

I strongly suggest you get off your high horse :)

I also suggest that you read some of the Kent Beck books, mainly on TDD; he
also wrote JUnit for Java. He considers himself an average developer who
follows good processes... perhaps you're above him as well.

~~~
SomeCallMeTim
>I strongly suggest you get off your high horse :)

:P

My "a few sigmas above average" comment was to just point out that I'm not
typical. It's literally true, though. I really am that good. At least when I'm
not writing comments on HN. ;)

Relative to Kent Beck: I've never met him, much less worked with him. But
being famous doesn't automatically make you a top 0.1% developer. A great
manager and process person? Sure, I'd buy that, based on his books. But based
on the relative skill of developers who I've encountered through my career,
I'm at least top 1%, and maybe 0.1%. I've met hundreds of developers, and only
a very few were in the same league.

I did read Extreme Programming when it first came out. I actually feel that
many of the practices in XP, including test-first and pair programming, are
far more important for average to below-average developers. I think Kent Beck
himself said that XP was best for broad-but-shallow problems (things like
accounting software with a million simple rules). Broad-but-shallow just
doesn't require the same strength of programmer to conquer -- though it does
require good process, especially if your team size (and breadth) is large,
which is Beck's strength.

------
Roboprog
One key thing the author started to allude to: the value of functions - pure
functions, vs a cyclic graph of mutable objects with temporal coupling of
state changes and spaghetti inheritance out the wazoo.

OK, he didn't say _that_ exactly, but he really did start heading that way.

The "static types will save us crowd" is drowning out a lot of the rest of the
discussions that need to happen.

* Somebody mentioned Eiffel in other comments. Eiffel's design-by-contract assertions about invariants are vastly preferable to Java's magic beans that eventually (maybe) reach a valid useful state. I guess unit tests _sort of_ compensate for this, but not really.

* Mutable state needs to be pushed to the peripheries of our apps, not plastered everywhere.

* Inheritance might not be such a good idea, due to reasoning about which code runs when you call a method / send a message. Do like Go-lang, and have interfaces for polymorphism, but skip inheritance - the use of an interface is a flag for later binding. Likewise, higher order functions and closures help you reuse code without resorting to spaghetti inheritance.

* We need languages that make programming with types easy, using type inference where it is clear to do so (e.g. - local identifier initialization, but perhaps not for multi-line function headers?). BUT, we still need to allow for dynamic runtime types, perhaps with a few minimal flags on identifiers/modules, rather than forcing people into monstrous reflection frameworks and XML situps.
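
The "interfaces for polymorphism, but skip inheritance" point can be sketched in Python with structural `typing.Protocol` (the names here are illustrative), which behaves much like a Go interface:

```python
from typing import Protocol

class Notifier(Protocol):
    """Structural interface: anything with a matching send() conforms."""
    def send(self, msg: str) -> None: ...

class EmailNotifier:
    # Note: no inheritance from Notifier. Conformance is structural,
    # as with Go interfaces; the interface is the "flag for later binding".
    def __init__(self) -> None:
        self.sent: list[str] = []

    def send(self, msg: str) -> None:
        self.sent.append(f"email: {msg}")

def alert(n: Notifier, msg: str) -> None:
    # Accepts any conforming object; no base class required.
    n.send(msg)

notifier = EmailNotifier()
alert(notifier, "build failed")
```

There is exactly one `send` that can run here, so reasoning about "which code runs when you call a method" stays trivial.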

FWIW, I agree (with the paper) that feature/integration tests are very good,
but unit tests are often a waste.

------
fermigier
This is from 2014 and has already been discussed (~300 comments on HN):
[https://news.ycombinator.com/item?id=7353767](https://news.ycombinator.com/item?id=7353767)

See also: [https://henrikwarne.com/2014/09/04/a-response-to-why-most-unit-testing-is-waste/](https://henrikwarne.com/2014/09/04/a-response-to-why-most-unit-testing-is-waste/)

------
sakopov
Unit tests enforce good design and naturally provide decent documentation on
how components fit together. Unit tests also allow you to spend a bit more
time on the details of each component and find issues you'd otherwise find in
production. If nothing else at least have some integration tests so that you
can confidently refactor knowing that you're not breaking existing
functionality somewhere.

------
aab0
The obsession with information theory here seems like a classic nail-hammer
thing. The number of bits my tests convey is totally useless to think about
and certainly not worth spending pages on. All I want from the tests of a
code base I maintain across thousands of patches is a tiny fraction of a bit:
did my latest change break an important behavior or invariant encoded in a
unit test? If I only screw up once in every 100 patches, then formally my
unit tests are doing all that work to emit 0.01 bits of information
(-log(99/100)), which is formally a totally irrelevant thing to know about my
unit testing framework. ('Hey Joe, what have you been up to?' 'Fixing my unit
testing framework - I'm up to 0.03 bits per patch!' 'I see.')
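
For what it's worth, the parenthetical arithmetic holds up; the 0.01 figure is in natural-log units, and in bits it comes out to roughly 0.0145:

```python
import math

p_pass = 99 / 100              # a patch survives the tests 99% of the time

info_nats = -math.log(p_pass)  # self-information of a pass: ~0.0101 nats
info_bits = -math.log2(p_pass) # the same quantity in bits: ~0.0145
```

Either way, the number is tiny, which is the commenter's point.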

------
nickpsecurity
In summary with my comments in brackets:

1\. Keep regression tests around for up to a year - but most of those will be
system-level tests rather than unit tests. [System-level since context of how
software was being used is important.]

2\. Keep unit tests that test key algorithms for which there is a broad,
formal, independent oracle of correctness, and for which there is ascribable
business value.

3\. Except for No. 2, if X has business value and you can test it with either
a system test or a unit test, use a system test for X. [Context again.]

4\. Design a test with more care than you design the code. [They need to do
something meaningful rather than just be there as a metric.]

5\. Turn most unit tests into assertions. [Assertions describe properties your
code should always have before, during, or after execution. Use assertions
for such checks instead of hiding them in tests.]

6\. Throw away tests that haven't failed in a year. [Controversial claim. Says
they tell you nothing. I believe the author thinks the design and assertions
should make the software right from the beginning, with the tests telling you
where you're screwing up.]

7\. Testing can't replace good development: a high test failure rate suggests
you should shorten development intervals, perhaps radically, and make sure
your architecture and design regimens have teeth.

8\. If you find that the individual functions being tested are trivial,
double-check the way you incentivize developers' performance. Rewarding
coverage or other meaningless metrics can lead to rapid architectural decay.

9\. Be humble about what tests can achieve. Tests don't improve quality:
developers do.

There are a lot of good points in there, though. Talks about the combinatorial
explosion where tests often don't even measure correctness. Talks about how
the maintenance burden goes up relative to just good system testing and
assertions with minimal unit testing. Mentions that the Toyota Production
System showed you keep a human in the loop and let them do any analysis that
requires brains, while automating just the mundane stuff. Mentions how
hardware engineers using a Design-for-Test philosophy embed little probes in
most of their blocks to catch violations of correctness conditions during
testing. Kind of black box plus deep, white box. Says that systems could
similarly be wired so the system tests would set off alarms when they should.
Just a few gems I saw.

I agree a shorter version of this essay would be beneficial.

~~~
specialist
Pondering #5, assertions.

When doing "dynamic" programming (Java + Spring + Maven, Python, anything
JavaScript), I feel compelled to write more unit tests. Because the sand
shifts under my feet and my crap breaks in surprising ways.

I despise most modern development, where entropy continues to creep in and
nothing stays done.

Maybe more assertions would help mitigate my feelings of hopelessness and
despair.

~~~
nickpsecurity
I tried to find you a decent intro to the best approach at this, Design-by-
Contract. The Eiffel site has a nice, short intro with examples and benefits:

[https://www.eiffel.com/values/design-by-contract/introduction/](https://www.eiffel.com/values/design-by-contract/introduction/)

It has been implemented in Java with the stuff inside JavaDocs. I know
JML/Krakatoa integrated it with provers. Simplest is using it for interface
checks, though: expectations of what a component is supposed to do vs what
it's actually doing. The DbC style lets you document your assumptions, gives
you formal properties to derive tests from, works right there with the code
itself, and can even automatically generate tests in some tools. That last
one is a huge benefit, as you know the tests will always be relevant to the
module and you don't have to maintain them. You can do your own tests at the
system level on stuff the module-level contracts might not capture.
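
To make the style concrete, here is a minimal hand-rolled flavor of it in Python. The `contract` decorator and `mean_abs` function are made up for illustration (real libraries such as icontract exist), and `python -O` strips the asserts, matching the turn-off-in-production idea:

```python
import functools

def contract(pre=None, post=None):
    """Tiny design-by-contract decorator (illustrative, not a real
    library): check a precondition on the arguments and a
    postcondition on the result."""
    def wrap(fn):
        @functools.wraps(fn)
        def inner(*args, **kwargs):
            if pre is not None:
                assert pre(*args, **kwargs), f"precondition of {fn.__name__} violated"
            result = fn(*args, **kwargs)
            if post is not None:
                assert post(result), f"postcondition of {fn.__name__} violated"
            return result
        return inner
    return wrap

@contract(pre=lambda xs: len(xs) > 0, post=lambda r: r >= 0)
def mean_abs(xs):
    """Mean absolute value of a non-empty list."""
    return sum(abs(x) for x in xs) / len(xs)
```

The contract documents the assumptions right next to the code and doubles as a property you could derive generated tests from.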

I'm not sure I can help you with most of the others. Remember, though, you can
implement the DbC assertions in code that lives in the modules themselves,
called before and after they run. These can be set to turn off with a flag
during production compiles. You can also write generators that pull them out
of comments in your source files and produce a version of your program with
the checks in it as tests. I can't find my original articles after a PC
crash, but the one below builds, in Ruby, a generator for C-language DbC:

[http://www.onlamp.com/pub/a/onlamp/2004/10/28/design_by_cont...](http://www.onlamp.com/pub/a/onlamp/2004/10/28/design_by_contract_in_c.html)

EDIT: Two for Python on StackOverflow that are kludgey but seem to help.

[https://stackoverflow.com/questions/8563464/using-design-by-contract-in-python](https://stackoverflow.com/questions/8563464/using-design-by-contract-in-python)

------
vpeters25
I think it would be worth considering that we are a long way from the punch
card days.

There are programming languages, frameworks and tools which would be unlikely
to exist without "advances" such as TDD.

~~~
joesmo
Please name _one_ technology that wouldn't exist without TDD.

~~~
alblue
JUnit was created by Kent Beck et al to support automated testing with what
came to be known as XP and subsequently TDD. This inspired subsequent testing
frameworks such as TestNG and implementations for other languages like nUnit.

~~~
joesmo
No, JUnit would not exist without unit tests. Unit tests are not TDD. Try
again?

------
z3t4
I'm a fan of built-in tests/traps, like assertions that make bugs "explode":
file x, line 22 did not expect state x to be y. Call Allen. When a bug is
discovered, make a test that repeats the steps that led up to the error.

------
EugeneOZ
Tests are for refactoring and other changes in code. To make them safe.

------
choward
Can we get a tl;dr? 21 pages? Ain't nobody got time for that.

~~~
dang
It's important for HN to keep its distance from tldr culture. Longer articles
are ok, even if many of us won't read them. It's true of any post that most
people won't read it.

Snap summaries make for reflexive reactions. That's the shallow kind of
discussion we're hoping to avoid here. What we want is reflection, which is
slower, takes more energy, and leads to more considered exchanges.

More on the reflexive/reflective distinction:
[https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que...](https://hn.algolia.com/?dateRange=all&page=0&prefix=true&query=by%3Adang%20reflexive%20reflect&sort=byDate&type=comment)

~~~
arjie
Scientific papers aim for a similar objective. They also recognize the
importance of being able to summarize. Abstracts are a good idea, and don't
necessarily mean people will make snap judgments.

There's a lot of long rubbish on the Internet. Being able to quickly
distinguish it from good stuff is essential to using your time sensibly.

~~~
noir_lord
The abstracts are often written by the people who wrote the paper; a tldr
often isn't. This allows, at best, inaccuracies in the summation and, at
worst, complete bias, which then results in a big argument completely
unrelated to the source material.

Frankly, if you aren't willing to read the content, you simply shouldn't
comment on it until you have.

~~~
corecoder
HN is not just for commenting, it is also for reading other people's
submissions and comments. So, granted, it's better not to write comments
without reading the article first, but why should that mean that people who
have read the article should not write a summary?

------
whatnotests
Totally agree 100%.

Everyone please stop testing your code.

...

...

...

...

Ok folks - everyone who's still testing can keep their jobs (or take the jobs
of those who stopped).

~~~
lj3
There are more types of tests than manually written unit tests. There's a
whole other world out there:
[http://danluu.com/testing/](http://danluu.com/testing/)

------
k__
Did my first big project with >200 unit tests, like we learned at university,
haha.

Well, then requirements changed, and most of them started failing. In the end
I spent most of my time fixing the now-wrong tests.

On the other hand, stability goes down if I don't do any testing.

At the moment I do automated UI testing. I'm a front-end dev, so this seems to
catch many things, especially thanks to test videos and screenshots.

I'll try TypeScript in my next production project, simply because I don't know
how to write the right amount of good unit tests. But I know that TypeScript
doesn't prevent all bugs, so I guess I'll have a blind spot between the type
checker and the UI tests, but hopefully it won't be too big :\

~~~
ajmurmann
I think the cost of unit tests when changing code also varies with the quality
of the design. If your code manages to some degree to follow the open/closed
principle, you should be able to just throw out some of your classes. On the
other hand, that also implies that in those cases the unit tests could maybe
have been tossed earlier. However, there will always be things that won't be
extended but actually need to be changed, and you won't know in advance which
is which. I do however like the idea of unit tests as a scaffold that guides
you to build the right thing, after which you toss them. Usually I have only
very few construction sites in my code base where I am confident enough to
burn the scaffold down. It would be an interesting experiment to TDD a
project, always delete all unit tests after you are done with them, and only
keep cheaper acceptance tests around. Maybe you would actually go faster.

