Why Most Unit Testing is Waste (2014) [pdf] (rbcs-us.com)
282 points by pmoriarty on May 30, 2016 | 280 comments



Ok let's stop writing UT and see what happens. Wait... we've already tried that, and we know the result pretty well:

1. In dynamic languages, simple type mismatches, wrong variable names etc. are now caught in "top level system level test". Yes these are bugs that should have been caught by a compiler had we had one.

2. There's no documentation as to how something should work, or what functionality a module is trying to express.

3. No one dares to refactor anything ==> Code rots ==> maintenance hell.

4. Bugs are caught by costly human beings (often used to execute "system level tests") instead of pieces of code.

5. When something does break in those "top level system tests", no one has a clue where to look, as all the building blocks of our system are now considered equally broken.

6. It's scary to reuse existing modules, as no one knows if they work or not, outside the scope of the specific system with which they were previously tested. Hence re-invention, code duplication, and yet another maintenance hell.

Did I fail to mention something?

UT cannot assert the correctness of your code. But it will constructively assert its incorrectness.


> 3. No one dares to refactor anything ==> Code rots ==> maintenance hell.

In my experience, with enough poorly written unit tests and a hard line policy on unit test coverage (which often results in poorly written tests), the unit tests often turn out to be as much a barrier to code maintenance as the regular code is.

I'm an advocate of judicious unit testing, but taking it too far is worse than not doing it at all if you have integration or system / functional tests IMHO.


Sounds like my life these days. Been given some smelly code (2k loc) that I cannot fully comprehend, so I just went out and naively refactored it.

The code was covered but without assertions, so I insisted on spending some time writing at least a few, while everyone else just asked me to test. And of course, once in production, tons of bugs started to appear. I still believe I can write system tests, though.


This is the real trick. Unit tests encourage you to maintain abstractions and layers. If your abstractions are good it works well. But if you need to change abstractions it now becomes a much harder problem.

One nice part of having more system-level tests than unit tests: go ahead and refactor a bunch. If the same inputs still give the same outputs you should be golden still.


> If the same inputs still give the same outputs you should be golden still.

That's subtly but dangerously wrong. It only holds if your inputs include both positive and negative cases. If you only push positive cases through your code and it still gives you the same outputs, then it might be broken in subtle but non-obvious ways, aka happy-path testing.

Better be very careful with this.
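
For instance (a hypothetical Python sketch): a refactor can keep the same outputs for every input the tests actually exercise, yet change behaviour on a case nobody pushed through:

    def is_adult_v1(age):
        return age >= 18

    def is_adult_v2(age):      # the "refactored" version
        return age > 17        # same outputs for every integer age the tests cover...

    # ...but not for an input nobody exercised:
    assert is_adult_v1(17.5) != is_adult_v2(17.5)   # v1 says False, v2 says True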


You can refactor anything if you are conscious of backwards compatibility.


I think there's some arguing past each other in discussion of this. You've seen some terrible programmers who don't write unit tests, and I (and others) have seen terrible programmers that do write (terrible) unit tests. Tests that mock everything and test nothing. Tests that assert bad implementation details but not important results. Huge volumes of terrible tests that are a huge burden.

I agree, having some good tests is very good. Obviously.

But I disagree with the hypothetical statement "someone writing a lot of tests is always doing a better thing than someone writing a lot fewer tests".

No silver bullet and all that. Also annoyed by "agile methodologies" and "best practices" in general. Good sense and taste really can't be turned into a list of rules you can blindly apply anywhere. I suggest reading http://alistair.cockburn.us/Characterizing+people+as+non-lin...


You are correct that writing lots of tests doesn't imply good programming. But let's consider the logical converse: does good programming imply lots of testing (not necessarily unit tests)?

I say yes. Of course this is a big, complicated, world; it probably contains good programs without tests. But in my usual experience, the moment a programmer - even a good one - neglects testing, bad stuff starts creeping into the code.


In OCaml (or any similar language) 95% of what you "unit test" for in Javascript just goes away.

If people cared about software quality, they would use strongly typed languages before even thinking about "unit tests". What people actually want is a) to hack it out quickly and b) to have a lot of busywork to do as a form of job security. Hence 50,000 lines of "unit tests" per 10,000 lines of application...


Sorry to be pedantic, but I just have to point this out, since the mistake is made over and over again in this thread:

It's not about strongly vs weakly typed languages, but statically vs dynamically typed languages.

OCaml is statically typed. Javascript and Python are dynamically typed.

What you are talking about are benefits of statically typed languages.

For more information, please see:

http://blogs.perl.org/users/ovid/2010/08/what-to-know-before...


Probably more accurate to say strong AND static typing. There are plenty of languages that are statically typed but have very weak type enforcement in the compiler and where idiomatic usage tends to use Any types and reflection and non-hygienic macros and many other type shenanigans, allowing tons of bugs to seep through. OCaml is not one of these languages.


What are some examples of such languages? I can only think of two...


Go for instance, with the use of «interface» everywhere you need generics.


Well, if we're really going to be pedantic, then it's about typed vs untyped languages[1].

[1] http://stackoverflow.com/a/9166519


It is good to maintain an understanding about the difference.

On the other hand, many of the most egregious problems are due to weak typing rather than, as static-typing advocates would have it, to dynamic typing.

Having a really good type inference engine helps make statically typed languages much more productive - but lisps and the like have real advantages too.


This is very patronizing, you're not better than some other developer because you fell in love with OCaml or similar strongly-typed languages.

I don't think this kind of criticism is healthy. You're sitting in your ivory tower calling out everyone who isn't like you as lazy and looking for job security. Do you really think there's no other reason for people to use languages that you aren't so fond of?

Honest question, because this kind of message is just unnecessary aggression.


It really depends on what kind of software you build.

There are certain kinds of software for which strongly-typed languages (in the sense of ML, not C/Java) are heavily underused. For those cases, parent's rant is quite appropriate and mostly justified.

For example, we really, really want to have our browsers written in a language like Rust, rather than C or C++. Same with web servers, and public-facing web applications.


Exactly, I can agree completely with this kind of statement and I'm also one expecting to see more ML-family languages being used for what they are good at.


Well, the ML-family languages are general purpose languages, so I'd say they are good at most things.


It isn't patronising to point out the failings of weakly typed languages.


Only he doesn't - he just makes claims. You happen to agree so you are satisfied - but the test is, would you like that post if it stated something you don't already agree with?


> he just makes claims.

When was the last time you asserted a variable's type in a statically typed language in a unit test where you weren't testing reflection?


I never did that in a dynamically typed language, so what's the difference?

The whole point of dynamic/duck typing is that you don't care about the specific type, only that it works as you expect it to.


But I'm certain that you have written tests to assert that an object implements a method or that a null isn't propagated in some case or another (or if you haven't written those, you likely have some bugs somewhere).

I totally agree with the sentiment up thread that it's super condescending to think you're a better programmer if you use a language with a good static type system, but from a personal perspective, it's so much more pleasant to let the compiler check this trivial stuff so that I can focus on testing actual logic.


You wouldn't write a unit test to assert an object implements a method. You would test if the method works as expected. This will of course also fail if the method is not implemented at all.

You test for functionality, not types.


Here's a concrete example: say you are writing a test for a method that takes an object and asks that object some question, eg. `object.is_something()`. You can either explicitly assert that `object` implements `is_something` or just let it throw the normal error if it doesn't. Either way, you should test the functionality of the method when it receives an object that does not implement that method, because the behavior in that case is visible to callers, may come to be relied upon, and may cause a regression if changed.

I find that lots of tests end up being of this sort, that I'm uncomfortable if I don't write them, that they clutter up test files and obscure the more interesting tests, and that they are exactly what type checking catches automatically.
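
A minimal sketch of what those tests tend to look like in Python (names hypothetical); the second test is exactly the kind a static checker makes unnecessary:

    import pytest

    def describe(obj):
        # asks the object a question; anything lacking is_something() blows up here
        return "yes" if obj.is_something() else "no"

    def test_describe_a_widget():
        class Widget:
            def is_something(self):
                return True
        assert describe(Widget()) == "yes"

    def test_describe_object_without_is_something():
        # this behaviour is visible to callers, so it gets pinned down in a test;
        # it is also precisely what a static type checker would catch for free
        class NotAWidget:
            pass
        with pytest.raises(AttributeError):
            describe(NotAWidget())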


To some extent, verbose UT is the tax I'm paying for the joy of not using a statically typed language.

I'm happy to pay that, but by no means suggest it's the right trade-off for every scenario.


Here is a list of basic errors in Python. Only builtin ones, I'm not talking about ones from other libs. And I'm talking of error categories, as we all know one exception can be used in a lot of different contexts:

  BaseException
   +-- SystemExit
   +-- KeyboardInterrupt
   +-- GeneratorExit
   +-- Exception
        +-- StopIteration
        +-- StopAsyncIteration
        +-- ArithmeticError
        |    +-- FloatingPointError
        |    +-- OverflowError
        |    +-- ZeroDivisionError
        +-- AssertionError
        +-- AttributeError
        +-- BufferError
        +-- EOFError
        +-- ImportError
        +-- LookupError
        |    +-- IndexError
        |    +-- KeyError
        +-- MemoryError
        +-- NameError
        |    +-- UnboundLocalError
        +-- OSError
        |    +-- BlockingIOError
        |    +-- ChildProcessError
        |    +-- ConnectionError
        |    |    +-- BrokenPipeError
        |    |    +-- ConnectionAbortedError
        |    |    +-- ConnectionRefusedError
        |    |    +-- ConnectionResetError
        |    +-- FileExistsError
        |    +-- FileNotFoundError
        |    +-- InterruptedError
        |    +-- IsADirectoryError
        |    +-- NotADirectoryError
        |    +-- PermissionError
        |    +-- ProcessLookupError
        |    +-- TimeoutError
        +-- ReferenceError
        +-- RuntimeError
        |    +-- NotImplementedError
        |    +-- RecursionError
        +-- SyntaxError
        |    +-- IndentationError
        |         +-- TabError
        +-- SystemError
        +-- TypeError
        +-- ValueError
        |    +-- UnicodeError
        |         +-- UnicodeDecodeError
        |         +-- UnicodeEncodeError
        |         +-- UnicodeTranslateError

52 exceptions.

Now, on this list, how many can be linked to language typing?

        +-- TypeError
        +-- ValueError
        |    +-- UnicodeError
        |         +-- UnicodeDecodeError
        |         +-- UnicodeEncodeError
        |         +-- UnicodeTranslateError
        +-- AttributeError
And ALL those errors can also be triggered by something a type system would not catch: runtime class generation, unexpected user input, corrupted database, wrong data header, etc.

So basically, relying only on typing will cover about 10% of the error types, most of which are caught with linters such as flake8 and code intelligence tools such as Jedi. You are making a very weak case.


> And ALL those errors can also be triggered by something a type system would not catch: runtime class generation, unexpected user input, corrupted database, wrong data header, etc.

Runtime class generation and unexpected user input can actually be handled in types. Any language with Type:Type, generative modules, first-class modules, and other variations all handle various forms of runtime type generation. Something like lightweight static capabilities [1] can ensure runtime input conforms to expectations encoded in types.

Furthermore, you are way underselling types with that restricted list. NameError, SyntaxError, ImportError wouldn't occur in any typed languages at runtime. LookupError, AttributeError, AssertionError, ArithmeticError wouldn't occur in some of them.

Finally, your breakdown doesn't cover how frequent any of these errors occur. For example, even if typing were to solve only TypeError, if TypeError consists of 90% of runtime errors, that would be a huge win.

[1] http://okmij.org/ftp/papers/lightweight-static-capabilities....
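
As a rough Python analogue of that "validate at the boundary, trust the type afterwards" idea (a sketch using typing.NewType and mypy, not the paper's actual technique; names hypothetical):

    from typing import NewType

    CheckedEmail = NewType("CheckedEmail", str)

    def parse_email(raw: str) -> CheckedEmail:
        # the only place a CheckedEmail is created, so the check cannot be skipped
        if "@" not in raw:
            raise ValueError(f"not an email address: {raw!r}")
        return CheckedEmail(raw)

    def send_welcome(to: CheckedEmail) -> None:
        ...  # no re-validation needed: the type records that parsing already happened

    send_welcome(parse_email("alice@example.com"))   # fine
    # send_welcome("bob")   # rejected by mypy: expected "CheckedEmail", got "str"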


> Runtime class generation and unexpected user input can actually be handled in types. Any language with Type:Type, generative modules, first-class modules, and other variations all handle various forms of runtime type generation. Something like lightweight static capabilities [1] can ensure runtime input conforms to expectations encoded in types.

How would you, if you can't determine the type in advance, check for the type it will be? I don't understand.

> Furthermore, you are way underselling types with that restricted list. NameError, SyntaxError, ImportError wouldn't occur in any typed languages at runtime.

Absolutely not. Those are not linked to typing in any way, and any decent Python editor catches them.

> LookupError, AttributeError, AssertionError, ArithmeticError wouldn't occur in some of them.

AssertionError? Seriously? Do you even know what it does in Python?

And LookupError: unless you have constant-sized containers, which has nothing to do with types, you can't check that.

ArithmeticError? Come on! Are your numbers constants? You need to check that the inputs belong to the domain of your problem, and that's it. Nothing to do with types.

> Finally, your breakdown doesn't cover how frequent any of these errors occur. For example, even if typing were to solve only TypeError, if TypeError consists of 90% of runtime errors, that would be a huge win.

Yes, but it's not. My last week was 90% KeyErrors and empty values. The input is usually the source of errors.

Types are useful, and they come at a cost. Whether you want to use them or not is a technical choice to make. But selling types the way it's been done in this thread is dishonest. Or you are all here working on algo problems with very little user input, which are the easy programs to code: the hard part is the algo, not the code. If you are dealing with a user app, a web site, or a video game, typing will not save you from 90% of the bugs, and you DO need unit tests.


Your view of what static type checking is or isn't seems to be limited to your understanding of types (and what can be expressed with them) in Python. There's a world of languages outside Python, even for languages with dynamic typing!


> How would you, if you can't determine the type in advance, check for the type it will be? I don't understand.

Compilers do this all the time. Just consider all those programs that dynamically generate code based on objects that they haven't seen in advance, like object-relational mappers. Those can all be statically typed and ensure that they generate correct code [1].

> Absolutely not. Those are not linked to typing in any way, and any decent Python editor catches them.

It's all part of compilation, and furthermore, we can type check code generation so that name errors don't even occur in runtime-generated code. Sorry, but all of these errors are related to type checking.

> AssertionError? Seriously? Do you even know what it does in Python?

http://research.microsoft.com/apps/pubs/default.aspx?id=1386...

Just google "static contract checking" to find plenty of work, including compilers already available for Haskell. Heck, code contracts have been widely deployed on .NET for years now.

> And LookupError: unless you have constant-sized containers, which has nothing to do with types, you can't check that.

Please read the lightweight static capabilities paper I already provided. It demonstrates using the Haskell and OCaml type systems to statically assure that all array bounds are in range, even for dynamically sized structures. So yes, you can check that, which doesn't even go into dependent typing where checking such dynamic properties is the whole point.

> ArithmeticError? Come on! Are your numbers constants? You need to check that the inputs belong to the domain of your problem, and that's it. Nothing to do with types.

You don't seem to realize that types are simply logical propositions about a program. They can be ANY proposition about a program, including that indexing is within bounds, that a concurrent program has no race conditions or deadlocks [2], that programs terminate, that HTML forms correctly serialize/deserialize values from/to structures [1], and more.

Like most dynamic typing enthusiasts, you don't have a proper appreciation for the true scope of static typing. You are correct that typing has its costs that aren't always warranted, but you are incorrect about where you draw that line because I don't think you fully understand how powerful static typing truly is.

[1] http://www.impredicative.com/ur/ [2] http://www.kb.ecei.tohoku.ac.jp/~koba/typical/


Just a minor note that contracts are available in Python too, via PyContracts.[1] They are very useful.

[1] - https://andreacensi.github.io/contracts/


I think there's a reasonable confusion here - people always talk about "static type checking" when what they really mean is a superset that might be more clearly referred to as "static checking" or maybe even "static analysis". It's true that type systems often improve static analysis capabilities, but I don't see why some useful analyses (like catching SyntaxError and probably NameError) couldn't be done for a language like Python. After all, the editors have basically already implemented it.


"relying only on typing will cover about 10% of the error types". It's not about the percentage of error types. Its about the percentage of errors. A very high percentage of unit tests in a dynamic language are written to catch type errors. That work is a waste of time when a type system could check it for you. The reason for this is that type errors are involved in all code. A FileExistsError only applies if I am working with files. (Plus with types you get the added benefit of inline documentation and better tooling.)


> A very high percentage of unit tests in a dynamic language are written to catch type errors.

I'm not sure where you got that information. You don't actually write tests to catch types. You write tests to check behavior. While it is possible that actual type errors come out of the woodwork while doing that, that is not the reason why the test was written in the first place; and after fixing such a type error, the test itself is still valid (again because it is making sure that the code exhibits the desired behavior).

I don't think I have EVER written a test just to check for types (in Python). Neither do I pepper my code with 'assert isinstance(x, some_class)' because it just goes against the grain of a dynamically typed language.


In strongly typed languages you embed constraints about your program's behavior in the type system.


Really? What language allows you to encode business logic into its type system?

I'm talking about things like, what happens if you press this button; is this calculation correct; does this lookup give me the desired results; does X get stored in the database correctly (or retrieved from it); etc. I don't know of any type system that lets you check these things.

(Also, I assume s/strongly/statically; Python is a strongly typed language. Granted, not everybody agrees about the terminology...)


You're moving the goalposts now; those are integration tests, not unit tests. But the type system certainly can validate/enforce that the column you select from the DB into an integer really is an integer in the DB too. OCaml does this with Postgres. It can also make sure the end result of a calculation is of the correct dimensions, that you're not adding cms to inches, etc.


A lot can be said with types beyond simply checking things like "is this a String or an Int?".

Depending on the power of your type checker, you can say (with varying degrees of convenience) things like "this integer will always be positive", "this is always going to be a non-empty list", "these two functions can be composed", "I've exhausted all possible values for this variable; I know it because the compiler told me so", "it is safe to call this function, and I know it won't touch the database or do any kind of I/O", and many more.

Whenever you called a function and it wrote to a file when you didn't want that, that's also a type error!
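
Even Python gets a rough version of one of these, exhaustiveness checking, via mypy and typing.assert_never (Python 3.11+; names hypothetical):

    from enum import Enum
    from typing import assert_never

    class Payment(Enum):
        CARD = "card"
        CASH = "cash"

    def fee(p: Payment) -> float:
        match p:
            case Payment.CARD:
                return 0.30
            case Payment.CASH:
                return 0.0
            case _:
                # if someone adds Payment.WIRE and forgets a branch,
                # mypy flags this line instead of a test failing later
                assert_never(p)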


It's also about the compiler guaranteeing to enumerate every code path in and out of every function. A human writing tests is limited by what they think will happen, but the compiler knows.


> A very high percentage of unit tests in a dynamic language are written to catch type errors.

Absolutely not.

If you do it you are doing it wrong.

Check this out: https://github.com/Tygs/tygs/blob/master/tests/test_componen...

The type checks are the "isinstance" ones. They're 10% of the checks, and most of the time I don't even have those; my colleague insisted.

The rest are the most important tests: they test behavior, not types.

Do not assert that using a dynamic language means testing types; it's just not true. Bad testers test types.


"isInstance" doesn't cover the full range of type errors. It simply covers whether something is, say, an Int or a String.

Types can encode much more. They can encode behavior.


Most of these would be caught at compile time in a language like Elm or (to a lesser extent) Haskell. A common pattern in some of these languages with actually strong type systems is to encode these failure possibility points in the type of the associated functions.

Also, in Elm as an example, there is no runtime class generation, the type of user input that is possible is checked at compile time, and you're forced to explicitly handle many of these potential failures. If you want to crash the program if you get unexpected user input, you literally need to type "Debug.crash" and then you pretty much deserve what you get.

If you're actually interested in maintainable code you should concentrate on reducing the state space of your program wherever possible, then using exhaustive (if possible) or property testing to fill in most of the rest of the gaps. Unit tests can help too, but I really think of them as less useful than all of the above.


You can not catch a KeyError at compile time. You can not check user input with a type system. Those have nothing to do with types. Input and output are out of your control; they need sanitizing checks, and dynamic languages are actually great at that.


You are wrong. You can very much catch "key errors" at compile time.


I'd also exclude:

        +-- StopIteration
        +-- StopAsyncIteration
which are used for mere communication rather than describing real error cases.

Also, you should exclude:

   +-- SystemExit
   +-- KeyboardInterrupt
   +-- GeneratorExit

        +-- ImportError
        +-- MemoryError
        +-- SyntaxError

        |    +-- IndentationError
        |         +-- TabError
because unit tests won't help you to catch these, either (they are either thrown directly on program start, or depend on specific system circumstances rather than bugs in your code).

Finally, I would argue that most instances of this exception are caught by a type system:

        |    +-- KeyError
Why? Because most of the time the key is not user input but a hard-coded string, and accessing non-existing properties of an object/struct is caught by most type systems at compile time.
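
A small illustration of that last point, contrasting a plain dict with a typed struct (checked with mypy; names hypothetical):

    from dataclasses import dataclass

    raw = {"host": "localhost", "port": 8080}
    # raw["prot"]          # typo in a hard-coded key: KeyError, but only when this line runs

    @dataclass
    class Config:
        host: str
        port: int

    cfg = Config(host="localhost", port=8080)
    # cfg.prot             # same typo: mypy reports it before the program ever runs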


    > I'd also exclude:
    > 
    > +-- StopIteration
    > +-- StopAsyncIteration
    > 
    > which are used for mere communication rather than describing real error cases.
Absolutely not. If you implement very low-level logic, you can use those. I have had to catch them several times in my life.

    > Also, you should exclude:
    > 
    >    +-- SystemExit
    >    +-- KeyboardInterrupt
    >    +-- GeneratorExit
    > 
Certainly not. You should DEFINITELY unit test for those. I do it all the time:

    from unittest.mock import Mock  # 'aioloop' and 'app' appear to be project pytest fixtures

    def test_ready_keyboard_interrupt(aioloop, app):
        beacon = Mock()
        real_stop = app._finish

        def beacon_stop():
            beacon()
            real_stop()

        app._finish = beacon_stop

        @app.on('running')
        def stahp():
            beacon()
            raise KeyboardInterrupt()

        @app.on('stop')
        def stop():
            beacon()

        app.ready()

        assert beacon.call_count == 3
You must check that your app behaves well when killed.

    >         +-- ImportError
That's something a type system can't check for, but a good linter can. I never have import errors because I have proper tooling.

    >         +-- MemoryError
That you should test for of course. Last week I worked on a split/merge algo on big files. I had to test that.

    >         +-- SyntaxError
    > 
    >         |    +-- IndentationError
    >         |         +-- TabError
Well, doh :) Type system or unit tests won't catch that. Tooling, tooling, tooling.

    > Finally, I would argue that most instances of this exception are caught by a type system:
    > 
    >         |    +-- KeyError
No, no, no!

KeyError should be super mega tested:

  - keys can be any mutable in Python, not string;
  - keys are very often generated on the fly, not constants;
  - dict are mutable, you can add or remove keys.


Well, in many typed languages, a lookup on a map will return an option type. If the key is in the map, it returns Some(value), but if it isn't, it returns None.

By encoding the possibility of a missing key in the return type, you force the programmer to deal with it in the program.

So in fact, having a good type system can help deal with those kinds of errors as well.
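
Python with a type checker gets a similar effect: dict.get returns an Optional, and mypy won't let you use the value until the None case is handled (a minimal sketch):

    from typing import Optional

    ages: dict[str, int] = {"alice": 30}

    def age_next_year(name: str) -> Optional[int]:
        age = ages.get(name)   # inferred as Optional[int]: the missing key is part of the type
        if age is None:        # mypy rejects "age + 1" until this check narrows age to int
            return None
        return age + 1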


>> [SyntaxError, IndentationError, TabError]

> Well, doh :) Type system or unit tests won't catch that. Tooling, tooling, tooling.

You need neither unit tests nor a type system to catch those. And you also don't need any tooling. The Python interpreter will throw these exceptions as soon as your program starts.

(Unless, of course, you rely on stuff like heavy dynamic importing at runtime, but that's really rare. I usually see this only for web servers in debug mode, where they auto-reload the app after a changed source file. But then my above comment applies: The faulty program crashes right away, you can't miss that.)


    > - keys can be any mutable in Python, not string;
You mean immutable.


The bugs are way more subtle than that. You're lucky if your bad programming is caught somewhere at runtime. In most cases, Python and other weakly-typed languages will do everything they can to fit the data you're providing with what the code is expecting. This means converting strings to numbers and the other way around, plus a lot of other oddities you would not always expect (see PHP's intval for a crash course in bad language design). All that results in bugs that are far from obvious, and that don't always raise neat exceptions.

EDIT: I stand corrected, I don't know enough about Python in particular. The argument is still valid for other languages with weak types.


You clearly haven't done much Python. It doesn't do implicit conversion. Bytes are separate from strings and you need to decode explicitly. You can't add '1' and 1.

Python is dynamically typed. It is NOT weakly typed.


> In most cases, Python and other weakly-typed languages will do everything they can to fit the data you're providing with what the code is expecting. This means converting strings to numbers ...

Python is strongly typed, as well as dynamically typed.

  In [11]: 1 + '1'
  ---------------------------------------------------------------------------
  TypeError                                 Traceback (most recent call last)
  <ipython-input-11-861a99da769e> in <module>()
  ----> 1 1 + '1'
  
  TypeError: unsupported operand type(s) for +: 'int' and 'str'


This is actually something Python doesn't do... generally, if you have a type error in Python, you get an exception at runtime.

I certainly don't disagree with what I guess will be the overall direction of your argument.


Python is not "weakly typed", nor does it ever automatically convert strings to numbers or the other way around.


Heck with Python3 it won't even convert bytes to String.


You absolutely can handle runtime errors via the type system just using a Maybe.


How do unit tests protect you from a corrupted database?

You are making a very weak case.


The ones you write: this month I'm working on medical data, and of course we have tests where we scramble the database to check how our software reacts. Error reporting, cleaning, logging. You mean you don't do that?


I don't remember the last time I made a type-related screw-up. In my experience almost all bugs in my code (and there are plenty of them, of course) are either something completely stupid like a typo, or some edge case not covered properly by the code logic (which only proper testing and QA can catch). Back in the days of working in C/C++ things were different: I'd quite often forget to do some required casting. Of course, the compiler would warn me about it, but it would still kill my flow, so I honestly never regretted moving to the land of dynamically typed languages. To me it significantly reduced the number of errors, since you take one layer of complexity out and thus have fewer dull errors you can possibly make. Not the other way around, as you imply.


I am not saying you are wrong, and I think the OP of this thread overstated the benefits of static typing, but here are a couple of counterpoints:

> either something completely stupid like a typo...

Typos are a bigger problem in languages that silently instantiate a new variable in response. If it's on the rhs, would you prefer your IDE to warn you about using an uninitialized variable, or let it go? Your coding flow is paid for by more failures (some of which will be WTFs) in testing (and testing is part of coding these days, right?). That may be the right trade-off, in which case you may be able to disable warnings in your IDE.

>...or some edge-cases not covered properly by code logic (which only proper testing and QA can catch).

The 'only' suggests that they are inevitable, and they are in a statistical sense, but whenever my code logic fails to cover an edge case, I ask myself why I overlooked it. Sometimes it is not something I could have anticipated even if I had thought more about what I was doing, but often it is.


Just as you said, statistically speaking, being stupid from time to time is an inevitable part of the process :) When I manage to anticipate every angle correctly there are usually no bugs, so it's not that relevant for this discussion. I was talking about those "other" moments, and usually, yes, you are completely right: it's just me not being focused or not looking at the problem from the right angle. The thing that really sucks in programming is that computers almost never make mistakes on their own; it's always us who screw up the code :P


> I don't remember when was the last time I've made a type related screw up

Maybe the languages you are using aren't capable of encoding the kind of screw ups you do make as type errors? Typos are trivially covered by most static type checkers, but like you say that's not a very interesting kind of error (except, of course, with languages that silently introduce fresh variables. Those can introduce hard to find errors...). More interestingly, they can also handle other problems like dereferencing nulls, not considering all possible patterns of a value (think "case" conditions), calling a function on the wrong kind of value, etc.


I'd agree that Haskell/OCaml/... will get rid of some unit tests that are to be written in other languages, but you still need to test if the logic and semantics of your code are correct.


Sure, but removing entire classes of things you have to check is a really good win. Combine that with the power of fuzzers derived from legal values of the type, and you can knock out a lot of bugs from your programs.


Sure, but just because a type system can't catch all bugs doesn't mean it isn't valuable to use one to completely eliminate certain classes of bugs.


When you are writing unittests for JavaScript, does 95% of your asserts check that values are of the correct type?


I've been using Flow, which has made the transition from F# tolerable, so none of my assertions are type checks, but there are still a lot of null checks because of the lack of language support for option types.


But if you develop without flowtype, do 95% of your JavaScript unittests consist of type checks?


In a proper type system a la Haskell, ML, Rust, etc. you can encode significantly more into the type system than you can with JavaScript's rather anemic offerings, even if it were statically typed. It feels almost like contract programming in that you must handle everything or else the code won't compile.


I have never met anyone who doesn't care about code quality yet writes unit tests...


then you have never worked in anything but a small focussed team.

The majority of development is performed by people in corporate environments, 9-to-5. They simply know that they "have to" write unit tests. No reasoning is performed; they simply churn out the code to satisfy the boss and walk out at 5pm.


Sorry, you are correct; what I wrote was simply stupid. I have, in fact, encountered one method with a cyclomatic complexity of 20+, which had a single test (it verified that the result was not None).

I just tried to put that code behind me and yes, those "tests" were there because they were mandatory.

Blech.


"Unit tests as documentation" is, in my experience, true only in a very minor, almost trivial sense, and for reasons that are very much in line with the subject of the original article.

For the most part, unit tests aren't organized around what a unit is supposed to do from a use case perspective, but more about how it happens to be implemented today. The article refers to this with the inability of most developers to indicate which business requirement would fail if the test failed.

When you're attempting to understand or refactor something where the testing for the most part tests the implementation of this particular unit rather than its purpose, the tests themselves offer little more in either documentation or refactoring confidence than the implementation itself does.

What's worse, making your code "more testable" often compounds this effect. As code is taken to an extreme level of modularity, and unit isolation is enforced rigidly in tests, the tests often do nothing other than specifically test every line of the implementation of the unit (i.e. method 'foo' on class 'Bar' calls method 'a' on class 'B' and passes its results to method 'x' on class 'Y', returning that result).


IMHO "unit tests as documentation" isn't quite the right description. It's more like "guaranteed-to-work example code and tutorials".

One example is where there might be preconditions for calling a function. This type of information is necessary to know but can be hard to work out even with rich documentation. Without documentation, you are going to be reading a lot of code to try and reverse-engineer the internals.

With unit tests, you can do a quick "find all references", see how the method is called within different tests cases and be confident that the setup actions in the given test fixtures will work for you.


If a method has preconditions, say X cannot be null, Y must be positive, and X > Y, you could test this in a unit test. But honestly make it the top 3 lines of the method with Asserts. Living with the code is going to be generally better than living off in some test. You can read it and see it very easily.
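
For instance (a hypothetical sketch):

    def allocate(x, y):
        # preconditions documented right where they apply
        assert x is not None, "x must not be None"
        assert y > 0, "y must be positive"
        assert x > y, "x must be greater than y"
        ...  # the actual work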


Assertions are most of the time completely useless and just a waste of time. If you put these assertions in one method, it will break only at runtime. Usually it is a rarely used method, the code that caused the assertion to fail is shipped to production, and you realise only too late that it's broken. If you had a proper unit test instead of an assertion, you would have caught the bug as soon as you pushed the code and the continuous integration kicked in.


Why not put an assertion, then add a test that deliberately checks if it's triggered?
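
That's cheap to do with pytest, e.g. (hypothetical function):

    import pytest

    def withdraw(balance, amount):
        assert amount > 0, "amount must be positive"
        return balance - amount

    def test_withdraw_rejects_non_positive_amount():
        with pytest.raises(AssertionError):
            withdraw(100, -5)

    # caveat: assert statements are stripped under "python -O",
    # so don't rely on them for production input validation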


The comment that I replied to was about using assertions instead of tests. You can certainly write an assertion and test it after, but unless it is absolutely necessary I would prefer to avoid polluting the code with "documentation". The right place for that is in the unit tests, not in the middle of production code.


I disagree; the right place is in the code where the functionality is defined. And I didn't read the comment you replied to as "assertions instead of tests". That's absurd. You should have tests that exercise the code that contains the assertions. It's the best of both worlds: you get code automatically tested, and you get inline documentation that can't lie to you.


I say code the happy path, and then have assertions.

Saves writing 12 unit tests that show passing null breaks the code. No one cares about that.


And the comment train I was on was about using unit tests for documentation of valid inputs and outputs.

Asserts are documentation of valid inputs and outputs that very cleanly breaks when violated.


Sure. I don't mean simple local preconditions like arg validation, but more system-level issues that are inappropriate for asserts, e.g. maybe you need to hook an output sink to an event trigger before you invoke a function to see any results. Not having done this, or doing it in the wrong order, is not strictly a mistake from the system's perspective, but it would leave you stuck as a coder.


Integration test should catch if for a given input your entire app does nothing, right?


The issue isn't how to test pre-conditions, its how to know they exist and how to meet them correctly. An integration test just gives a big ball of mud of every production concern whereas a unit test provides simple example code for a single concern that is maintained and tested.

It feels like discussing this without real code is likely to cause confusion. The type of systems I have in mind are those that invert control through e.g. pub/sub events, plugins, strategies, dependency injection etc. Unit tests help document how to use components in a system that has traded clarity of control flow in exchange for extensibility.


I have found that these kinds of systems that put a ton of effort into unit testing are wasting time. And when it comes time to refactor, you end up creating weeks of extra work to make the new unit tests pass.

How a system is wired together is not that exciting and not worth unit tests. If you screw up loading a config file on startup, I bet your first integration test fails. Why would you unit test this?


There is a difference between unit tests and integration/ system/ component tests. I think the author is talking about pure unit tests, i.e. the ones that test only one specific piece of code and mock everything else away.

Too fine grained unit tests actually make refactoring more difficult, because a refactor typically touches so much code that they have to be rewritten. But component tests do provide the safety net that you had in mind when you wrote your reaction.


What.

> Ok let's stop writing UT and see what happens. Wait... we've already tried that, and we know the result pretty well:

Stuff still doesn't work and doesn't get delivered on time. That's all we know, and unit tests didn't change that.

> Yes these are bugs that should have been caught by a compiler had we had one.

WHAT? What kind of dynamic languages are you working with?

> 2. There's no documentation as to how something should work, or what functionality a module is trying to express.

Unit tests - at least those made in TDD style - tend to obfuscate that; they usually reflect the implementation instead of the interface anyway.

> 3. No one dares to refactor anything ==> Code rots ==> maintenance hell.

You seem to know very timid programmers. Or the kind who think tests are their safety net, so they don't bother to actually read the code they're changing and understand what it does.

> 5. When something does break in those "top level system tests", no one has a clue where to look, as all the building blocks of our system are now considered equally broken.

Look at the crash log. Or think for just a second instead of saying "it's broken" and giving up.

> 6. It's scary to reuse existing modules, as no one knows if they work or not, outside the scope of the specific system with which they were previously tested. Hence re-invention, code duplication, and yet another maintenance hell.

Re-invention, code duplication, etc. have nothing to do with tests (hell, most unit tests I've seen have quite a lot of code duplication in them). It's not scary to reuse existing modules without tests; it just requires you to... use them.

Pretty much all the points you listed are examples of people shutting down their brains and hoping someone else's unit tests will think for them. This is a very bad approach to programming. It's the wrong kind of laziness.


Pretty much all your answers show that you have never worked on a big enough project and that you don't understand unit tests. When working in a codebase with adequate test coverage, your bugs are more likely to be caught. Please, don't tell me that you don't need tests because you don't introduce bugs. I have heard this argument multiple times in the last 10 years, and it has always proved untrue.

The point number 2 simply proves that you don't know what a proper unit test looks like if you say that a unit test obfuscates the documentation.

The point number 3 just confirms what I thought: that you believe you don't need tests because you don't introduce bugs. The most interesting part is that you think that other people who write unit tests are lazy. Very interesting position; I would say that whoever tries to refactor a big chunk of legacy code without adding appropriate coverage is certainly lazy and, worse, overconfident, and probably never did anything similar in the past.

The point number 5 proves that you never, ever worked in a big project if you think that it's enough to look at the logs to find a bug. It's common knowledge nowadays that the earlier a bug is caught, the less effort it takes to fix it.

Finally I think that should be clear that the very bad and toxic approach to programming is not the one that involves writing unit tests. This approach tries to formalise the requirements in a set of scenarios, describing each condition that must be satisfied for a given event/input. Your approach seems focused on just writing code, without a proper design phase, without understanding completely the requirements with a close feedback cycle with the users and with a complete disregard of existing functionality that you think you can write better because you are a better programmer given that you don't need unit tests.

In my opinion not writing tests is laziness.


> You seem to know very timid programmers. Or the kind who think tests are their safety net, so they don't bother to actually read the code they're changing and understand what it does.

Muhahahaha! Sometimes it's almost impossible to understand a piece of code, especially if that code is old and has had comb-overs from a lot of developers. Unit tests at least might give some hints to what's happening.


Unit tests are mostly useless. If one wants their tests to be anything more than a happy path and sad path that the developer thought up (combined with an accretion of sad paths that were found later as bugs), then one would do themselves a tremendous favor by adopting property-based testing and mutation testing.

Doing so forces you to define several constraint-based models for your program, and then layering mutation testing over the top helps you figure out how good your tests actually are.
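
For anyone unfamiliar, a property-based test with Hypothesis looks roughly like this (here just testing properties of the built-in sorted, as an illustration):

    from collections import Counter
    from hypothesis import given, strategies as st

    @given(st.lists(st.integers()))
    def test_sorted_preserves_elements(xs):
        assert Counter(sorted(xs)) == Counter(xs)

    @given(st.lists(st.integers()))
    def test_sorted_output_is_ordered(xs):
        ys = sorted(xs)
        assert all(a <= b for a, b in zip(ys, ys[1:]))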


I see an incredible difference in architecture quality between projects with and without unit tests. Testing forces people to think about the interface, forces them to use their code, torture it, break it, know it inside and out, and see its limits. Just for the intellectual tool UT are, they are valuable.

Now they have a cost, and sometimes they are not worth it, but they are certainly not useless.


Writing property-based tests will incentivize you to think about those things too, and they will actually exercise your code more strenuously across a wide spectrum of inputs and corner cases.

Unit tests seem mostly worthwhile as documentation or tutorial for how to use something or for what something is "supposed to" do. But as validation of completeness, correctness, or coverage of your design and implementation... they fall extremely short and leave a lot to be desired.

In either case, if software quality and reliability actually matters to your problem domain, then mutation testing should be considered an absolute prerequisite to any kind of testing method (unit, property-based, integration, etc.) otherwise you have almost no way to measure the quality of your tests.


>> Unit tests are mostly useless

The speed difference though, integration tests just take too long to be able to completely replace unit tests.


It depends. Both on project size and how your tests parallelize. If you can integration test an entire project in 30 seconds, that is fast enough.

Fast enough to have it in a loop for by the book TDD? No but I'm not sure by the book TDD adds any value. You mash test once before committing and if all is good you go ahead.


With a decent hierarchical decomposition, integration testing at component/module/subsystem boundary achieves both good coverage and low latency test results.

I worked on a big ETL database migration tool in Python and it had a top level integration test with the smallest possible subset of data, like 500k end to end. Then each subsystem (it was architected like a nanopass compiler) had a more complex set of integration tests that only checked that pass but more rigorously. The whole project (10k lines of code) had under 10 unit tests that double checked tricky low level primitives. Every test was independent and could be run in parallel, my longest test was just under a minute, my shortest was about 1 sec. The shortest tests were run on module import ensuring both an easy affordance to testing and a tight feedback loop with no unit tests to impede refactoring.


There's one insidious thing about having no tests when dealing with dynamic languages: when you have no tests, no one knows what the data looks like (typical code: `bigJson = getBigJsonFromServer()`). Is 'foobar' an array or a string? What kind of string is it exactly? Can it be null, or will it be "unknown"? Put two years of cowboy coding on top and you're screwed.

Of course you can create unit tests with fake data, though IMO it's a bad practice. I think it's best to have unit tests working on real production-like data (in addition to having edge-casey data tested as well).
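
For example, a test pinned to a recorded, anonymized production-like response (fixture path and keys hypothetical) at least documents what the data looks like:

    import json

    def test_big_json_shape():
        with open("tests/fixtures/big_response.json") as f:
            data = json.load(f)
        # these assertions double as documentation of the payload's shape
        assert isinstance(data["foobar"], list)
        assert data["status"] in ("ok", "unknown")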


>2. There's no documentation as to how something should work, or what functionality a module is trying to express.

Unit tests of complex code are often more complex and obfuscated than the code itself, hence their usefulness as documentation is very limited.

>3. No one dares to refactor anything ==> Code rots ==> maintenance hell.
>4. Bugs are caught by costly human beings (often used to execute "system level tests") instead of pieces of code.

Lack of unit tests doesn't necessarily mean lack of automated tests. Functional tests, when applicable, are far more useful in my opinion than unit tests.


> Unit tests of complex code are often more complex and obfuscated than the code itself, hence their usefulness as documentation is very limited.

I've found that expressing unit tests in the language of the problem domain and never stressing implementation details wherever possible is the ONLY way to prevent this from happening. Avoid mocking and test doubles where possible, but sometimes they're the right tool. The tests should be treated as production code that gets boyscout rule'd as people go into it for whatever reason.


>"Avoid mocking and test doubles where possible, but sometimes they're the right tool. The tests should be treated as production code that gets boyscout rule'd as people go into it for whatever reason."

I agree, but a lot of people (at least in my environment) would play the "that's not an unit test" card, which is fine to me, we should care about the least intrusive way to automate code tests regardless of the philosophy behind it.


Only bad unit tests are obfuscated. In the good ones even the users can easily understand what the test is doing and can even add new test cases.


>Did I fail to mention something?

Yep. Integration tests under "top level system test" and above "tightly coupled unit tests" are not only excellent at catching bugs in integration code, they don't require hours of tedious mocking and don't have to be completely rewritten when you refactor the code base.

>No one dares to refactor anything

Is a problem with heavily unit-tested code. Unit tested = tests that are tightly coupled to the code = refactoring turns the tests red whether or not anything was broken = people prefer to leave the code alone.


Don't agree. Refactoring does not mean that you need to refactor the tests, but that you will be testing other pieces.

That is, when you refactor the old pieces are still there (in the libraries), and their tests still work, but you are maybe not using them anymore higher up in the hierarchy.


> 4. Bugs are caught by costly human beings (often used to execute "system level tests") instead of pieces of code.

The author went to some length to point out (correctly) that test automation is an orthogonal issue. Reading beyond the title, we see the issue is that an obsession with unit tests, and especially an obsession that is focused entirely on a measure as simplistic as code coverage, tends to result in a sub-optimal use of development resources (which are always limited.) A redundant or otherwise unhelpful test (of any sort) does not gain value by being run automatically.


Speaking of (3): I have a bunch of newbies I'm mentoring. Every time I explain about unit tests, I show them a class I wrote that had 37 lines altogether. After I finished working on that project, a new team took over. They didn't write unit tests (or maintain the existing ones). A month later, that class had a method with 1500 lines.

It's been around a year since development has basically frozen on that project.


What has the number of lines got to do with the unit tests?

That functionality had to go somewhere, no matter how badly written it was.

Unit tests don't magically save you from code bloat.

Also, I work on 3 code bases written by other people with zero unit tests, and I refactor them all the time. Other developers have refactored them. Not having unit tests is not a big barrier to refactoring.


> Unit tests don't magically save you from code bloat.

No, but the idea is that such a method would be hard to test, considering it (presumably) does so much. So in order to test it properly, you are encouraged to break it up into smaller methods. (Whether somebody actually does that is a different story, of course...)


Are you saying the requirement to write unit tests is the main reason why those developers don't write 1500-line methods?


I'm saying that they would be prevented from writing those things if they also had to test them. Since they're not required to test their methods, they're free to write code that's pretty much impossible to reason about.


I think people who write such methods without a good reason (like inner loop of very performance critical C++, or code in any language that’s generated automatically from something else) aren’t terribly good programmers.

Demanding those unit tests might indeed force them to write shorter methods. It doesn't force them to write good code. By "good" I mean fit for the particular project and requirements: I don't believe there are universal criteria for code quality.

The only way to fix that and make them better programmers is to teach them, not force them to waste their time writing those unit tests.

Besides, writing and running system tests teaches the developers about other system and OS components — and they better know and understand (to some extent) your complete technology stack, not just the layer they’re currently working on. Writing and running unit tests teaches them nothing interesting; it’s boring and involves a lot of copy-pasting.


To add to that - if the only thing keeping programmers from writing 1500-line monstrosities is the requirement to test their code, that's even more worrying, as anyone who's willing to write garbage like that is going to do a terrifically bad job of breaking out abstractions and modules, and now you have 1500 lines that are impossible to reason about split into 20 different classes.


I don't know... I find it much easier to reason about smaller methods. They didn't write just THAT one, there's plenty of code in that project... it's just a couple of methods that are horrible to work in.


It's absolutely true that all else being equal, it's easier to reason about smaller methods than larger ones. Most of the time though, what you're trying to reason about is an entire process rather than a method in isolation, and how small methods interact with each other is an important part of that.

It's absolutely true that there's almost never a time when a 1500 line description of a business process is the optimal approach. Conversely though, if abstractions are done poorly, those same 1500 lines split into a maze of different classes that don't cleanly abstract a particular, human-comprehensible part of the process can be just as hard to deal with.


Oh god, you are describing my previous project. Especially 3. and 5., which makes the most tiny bit of changes take months and endless amount of money and manpower.


Agree!


The funny thing about unit tests is that it's actually possible to write unit tests that don't really help you at all, and that this is the way many unit tests are written (a pessimistic me would say "most"), even though nobody thought that would be possible.

The initial people who came up with the idea thought about writing down the execution of a use case, or a small part of it, as a test. Then they ran their code against it while developing. That gave them insight into the use case as well as the API and the implementation. This insight could then be used to improve the tests, the API, and the implementation.

But most professionals aren't about making quality. They are about paying their rent. So when they started to learn unit tests, they just wrote their code as always, and then tried to write tests, no matter how weird or unreasonable, to increase the line coverage of their test suite. The proudest result for them is not a much more elegant implementation, but finding the weird test logic that moved them from 90% coverage to 91%.

I believe that's how you get a lot of clutter in your unit tests. However, what is described in the document is sometimes an example of people really trying, but who are just early in their development. Of course, when you learn to do something by a new method you will first do crappy, inefficient stuff. The question here is how much you listen to feedback. If the team that broke their logic to get higher coverage learned that this was bad, then they probably adapted after some time, and then they did exactly what unit tests are there for.


Yes, it was a bit of a shock the first time I realized how misleading coverage was. Being executed doesn't mean tested. Tested doesn't mean it works. Working doesn't mean it does what you want. Doing what you want doesn't mean it will do what the client wants. Doing what the client wants now doesn't mean it will keep doing it in the future.

But anybody who has edited a 100,000-line project, run the unit tests, seen red, fixed it, run them again and seen green knows the feeling: it's great. You are way more confident.


> But anybody who has edited a 100,000-line project, run the unit tests, seen red, fixed it, run them again and seen green knows the feeling: it's great. You are way more confident.

This is the key insight: unit tests are about feelings more than anything else. People get so defensive in unit testing threads because eliminating unit tests means eliminating their source of confidence.


It depends. In a language like Python, having 100% test coverage is beneficial in and of itself. A unit test prevents the kind of problems that would be caught by the compiler in a compiled language.


static-analysis tools, like pyflakes, can also find some of those things (semi-obvious problems like variable name typos that would remain undetected until that actual line was executed)
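
For instance (a hypothetical snippet, not from any real codebase), a misspelled name inside a rarely taken branch survives import and only blows up when that line runs; pyflakes flags it statically, whereas a unit test only catches it if it actually drives that branch:

    def apply_discount(order, code):
        """Apply a discount code to an order dict (made-up example)."""
        if code == "VIP":
            # Typo: 'totl' is undefined. The module still imports fine;
            # the NameError only appears when this branch executes.
            return order["total"] - totl * 0.1
        return order["total"]

Running pyflakes over the file reports the undefined name without executing anything.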


I'm currently at a phase where I'm just not as worried about TDD or Unit testing. I've realized that most of my buggy code is in integration points.

For example, I'm working on an internal project that creates VMs with some provider (be it VirtualBox, AWS, etc.) and then deploys a user-defined set of Docker containers to it. I've found that I don't have bugs in situations I would typically test using mocking/stubbing/etc. in traditional unit tests. I usually need to have the real AWS service with the Docker service running to get any value out of the test. And at that point it's more work to mock anything else than it is to just start up the app embedded and do functional testing that way.

I'm becoming more of a fan of verifying my code with some good functional tests in areas that feel like high risk and then some contract testing for APIs other apps consume. Then if I find myself breaking areas or manually testing areas often I fill those in with automated tests.


Abandoning unit tests was a thing many companies were proud to tell me while I was interviewing earlier this year. I always thought that was ridiculous, but I suppose their products didn't have so many users, or they thought they could tolerate some bad behavior. I'm happy I ended up at a place where we're big on tests and even have SDETs embedded within our team. Besides being useful, I often use a unit test as my main method of completing a feature. I also believe it's a code smell when it's hard to write a test. Maybe it's a regional thing - I've heard it said that here in NYC we're much more strict with testing (a carryover from financial roots).


The article's top recommendations are strong:

* Keep system level integration tests [for up to a year].

* Keep unit tests that test key algorithms for which there is a broad, formal, independent oracle of correctness, and for which there is ascribable business value.

In my experience, an overarching test layering strategy works best in providing maximum coverage for minimum friction: Add tests bottom-up, at each layer testing the core functionality of that layer, using directly the layers beneath. The most valuable tests are the system-level tests, which, hopefully, exercise a large swath of the underlying codebase. This reduces to the author's recommendation for most cases.

Some people are proud to ditch UnitTesting[TM] to avoid the unit test cargo cult, in particular the proliferation of "decoupled" tests via overuse of mocking libraries. There is very little value in "unit tests" that reproduce the code flow via mocking asserts, yet a lot of codebases that are heavy on unit testing degenerate into gratuitous mock fests, which become incredibly onerous to work with.


No.

It should have been.

* Keep system-level integration tests until the requirements for system-level integration change.

* Keep unit tests, and increase code coverage. Any code that is not executed in a unit test will eventually break, especially in interpreted languages.

* Obviously code coverage is not enough. But in my experience code coverage is a minimum requirement, not something to be happy about once reached.

* The business cases also need to be tested, which is usually done by a combination of unit tests, integration tests, system tests and end-to-end tests.


Honest question. What's the value of a test like this?

    class MyObjectBuilder
      def create(object)
        if validator.valid?(object)
          factory.save(object)
        else
          raise "invalid"
        end
      end
    end

    def object_builder_test_success
      object_builder.mock(validator, mock_validator)
      object_builder.mock(factory, mock_factory)

      expect_method_call(validator.valid?).return(true)
      expect_method_call(factory.save).return(true)

      object_builder.create(object)
    end

    def object_builder_test_failure
      object_builder.mock(validator, mock_validator)
      object_builder.mock(factory, mock_factory)

      expect_method_call(validator.valid?).return(false)

      assert_raise { object_builder.create(object) }
    end

This test is both extremely typical for strong advocates of TDD and "testable" designs, and also almost completely useless in my mind. The test is literally a more painful restatement of the implementation of the method (not to mention that there's typically a lot more ceremony around setting up mocks than my pseudocode indicates).

It adds value in the sense that if someone randomly types garbage into that file it will break, but it acts as a barrier to refactoring or business requirements change, as pretty much nothing but the exact implementation of the method will satisfy the test, and offers no documentation benefit over the code itself.


Lol I was on a Ruby project like this, I said these tests are tautologies since we're verbosely rewriting the implementation as a test. My yearly review from that project? "Bill does not understand unit testing"

Nobody gets fired for writing too many tests.


There is no value in a test like this. This is basically a test of a precondition (object must be valid). In fact there is not much value in the entire if statement since it is also testing a precondition. That leaves:

    def create(object)
      factory.save(object)
    end
which is just simple delegation. Perhaps just inline the call to create and get rid of the entire class, method, and tests.

It is a waste of time testing for a pre-condition. Ensuring a pre-condition is the responsibility of the caller and tests in that part of the code, or integration tests, should be responsible for identifying problems in that regard.


And now the burden of changing tests prevents any meaningful refactoring.


I'm all for testing until it gets in the way of a proper solution. I've spent more time fixing tests because we changed something on purpose than I ever have because they caught something we missed. I've also spent time working with developers who cared more about their tests than their code or customers.

Like anything else, all things in moderation.


> I've spent more time fixing tests because we changed something on purpose than I ever have because they caught something we missed.

That's why I love the "dump results into a single large string and compare" style of testing. Declaring a given output as known good (until proven otherwise) is a bit of a gamble, because yes, you do think harder when you are writing individual asserts. But when something changes it is very easy to tell a bad, unexpected change in the output from a good one and replacing the previous known good copy with the new one is just a few keystrokes away. A comfortable anything-to-JSON tool and a good diff viewer should be part of any testing arsenal.
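
A minimal sketch of that style in Python (the file path and generate_report are stand-ins for whatever your system actually produces):

    import json
    import pathlib

    GOLDEN = pathlib.Path("tests/golden/report.json")  # hypothetical location

    def generate_report():
        # Placeholder for the real code under test.
        return {"orders": 3, "revenue": "120.50", "currency": "EUR"}

    def test_report_matches_golden(update=False):
        actual = json.dumps(generate_report(), indent=2, sort_keys=True, default=str)

        if update or not GOLDEN.exists():
            # Declaring the current output as known-good is a deliberate, reviewed step.
            GOLDEN.parent.mkdir(parents=True, exist_ok=True)
            GOLDEN.write_text(actual)
            return

        # On failure, diffing two pretty-printed JSON strings makes it obvious
        # whether the change is a regression or a new known-good state.
        assert actual == GOLDEN.read_text()

When the diff looks right, overwriting the golden file is those few keystrokes.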


When your cost to fix a bug is very low, maybe you don't need unit tests.

Every middle manager on the planet thinks their product/system/whatever is the most critical part of the business, and any errors are unrecoverable and must be avoided, regardless of cost.

People tend to forget that testing comes with cost, and pretending like there are zero benefits to not performing certain types of testing is just asking for your competitors to outpace you.

If you build your system the right way, bugs simply can't kill you.


The author doesn't argue that unit tests should be abandoned, so I don't see the point of your comment.

It's also contradictory to say that "features" can be tested as a single unit, aka function or class.


James O. Coplien has a lot of experience and is steeped in theory and practice. His list of publications is about 200 entries long: https://sites.google.com/a/gertrudandcope.com/info/Publicati...

Unit tests are not free, as they are also code; that much is obvious. Coplien, however, also delves into less obvious aspects of the impact of unit tests on design, as well as the organizational aspects. Ultimately, coding patterns are going to reflect the incentives that govern the system.

Software development is a lot about trade-offs, and there is plenty to be learned here about how to do it. An addendum by him can be found here: http://rbcs-us.com/documents/Segue.pdf but the meat is in the 2014 article.


One thing that is really imposed by unit testing is the ability to exercise single parts of the code independently.

It really brings out code smells: if you need mocks injected everywhere instead of being able to use dependency injection cleanly, it shows. If you have code paths that can only be triggered within events, it shows, etc.

Having "wasteful" unit testing is more an investment in the future: when users come with real bugs, the ability to reproduce their steps in code and fix them in a no-regression suite is invaluable, but that requires your app to be testable in the first place, lacking which you are stuck with manually testing stuff or, even worse, cough Selenium cough.


A lot of times, the argument for why a lot of things are "code smells" turns into circular logic:

Code that is hard to test is poorly architected. Poorly architected means it's hard to test.

Example. Take this code (I'm reusing an example from this thread since I think it represents typical "well-factored" code, and isn't an obtuse example to prove a point):

   class FooBuilder
     def create(object)
       if FooValidator.valid?(object)
         FooFactory.save(object)
       else
         raise "Can't save an invalid object"
       end
     end
   end
It's reasonably easy to intuit the purpose of this code, and what exactly makes for a valid object (since the validator is an explicit dependency of this class.) I've often seen this called "poorly architected code", however, since an isolated unit test needs to depend heavily on mocks to implement, and you end up with something like this:

   class FooBuilder
     def initialize(foo_validator, foo_factory)
       @foo_validator = foo_validator
       @foo_factory = foo_factory
     end

     def create(object)
       if @foo_validator.valid?(object)
         @foo_factory.save(object)
       else
         raise "Invalid - can't save object"
       end
     end
   end
From the perspective of a coder coming in and trying to understand what's happening here, this code is much more difficult to understand, despite being more "testable". What makes a foo valid? How do I know what a "foo_factory" is? I suppose I could assume that the class defined in foo_factory.rb is probably one - but I can't actually be sure.

The code is more extensible, for sure, but in a way that probably doesn't matter. I can pass in any validator I want! Amazing! Except, in 99% of cases, I'm going to have one way of validating something. The same goes for saving.

I would posit that at least 90% of the time that I see dependency injection in a codebase, it's there solely to aid testing and almost never adds practical (as in, actually being used and not just theoretical) value to a codebase.


The main advantage of IoC is to decrease coupling. The first code that you posted is highly coupled with FooValidator and FooFactory; every change to those objects (name, namespace, etc.) will have effects on your code. Your code is also less flexible because it is bound to exactly that validator, and you have to explicitly change it in all the places where it is used if you want to use another one. The better testability of the second code is just a nice side effect of IoC. The fact that you cannot tell which type your parameters are is a Ruby problem, certainly not an IoC shortcoming.


That's a false dilemma. One doesn't start criticizing from the most extreme angle, because then it'd be fair to assume the alternative is equally extreme, and I don't see anyone advocating for testing backends only by exercising them from the UI level.

In other words, "if we take things to the extreme, bad stuff will happen" contains its own solution.


I don't think my point was extreme at all. The example I posted was an extremely typical, even tame example of dependency injection and architecture for the benefit of testability. I've seen countless variations of pretty much exactly that or something quite similar (pulling out explicit dependencies to be injected that should never reasonably change in normal circumstances).

Regarding "should we only exercise them from the UI level" - I'm not 100% sure what you're getting at - but if your point is that we should focus our testing on business-facing use cases and not trivialities of what class calls what method, then we're speaking past each other and are in complete agreement.


I think a method called "create" that does in fact save the object is a bad example. Also, why is it possible to have invalid objects in the first place? Wouldn't it be the object's responsibility to make sure it's valid?


The create vs. save is a typo, those could easily both be create and my point is identical.

Regarding validation - depends on who you ask. Single Responsibility Principle taken to an extreme would probably support the idea of having a single class whose purpose is to validate an object.

Regardless of the nitpicking though, the point is that dependency injection often makes it harder to reason about code (as large parts of the "business logic" are relegated to a dependency that isn't obvious to locate), and it is often done strictly for the benefit of the tests, rather than for the functionality or comprehensibility of the code.


Unit tests function as a kind of REPL for me and allow me to code considerably faster than without them. Without them, it takes me considerable time each time I want to test the smallest code change since in order to get my app to a testable state I have to click around in the UI, enter a few values in inputs, etc. This is just a waste of time. Moreover, there's a slightly costly context switch which happens when I go from coding the feature to setting up my app to test the feature. With judicious mocking, however, I save a ton of time getting my app to a state where I can actually test the functionality I'm coding and do away with that context switch.


That's great, and I do the same. But you really need to ask yourself what value keeping those tests around in your repository is adding. In some cases, absolutely they will have value, but I don't think there's anything wrong with saying at least some of the tests you write were there as a tool to help you build.

For example, I almost always do "gold master" testing when refactoring a large unit or module of code (test the big picture input / output given a few cases without regard for fine-grained tests within, refactor away as long as you can keep the tests green). It's an amazing way of refactoring as it acts almost like a safety harness - you have immediate feedback when you've done something wrong and changed the behaviour of the class. After the refactoring is done, however, those tests are almost useless, as they don't test the purpose of a class but just dumbly look at the input and output.

I think a lot of the tests done via TDD should be looked at in the same way.


There are things that a machine can do in a much better and more reliable way than a human. Comparing lists of items is one of them (e.g. outputs of functions to be verified against expected results). And that's the entire point of using computers in the first place. Otherwise we can go back to pen and paper and process forms by hand.

Does it make more sense for a human to do all the aspects of the testing by hand? Of course not. Nobody has the budget for that. It's much better to automate as much testing as possible so testers can focus on higher-level tasks, like the risk assessment involved in marking a build as releasable.

Then, unit testing encourages people to construct their software for verification. This software construction paradigm in itself is enough of a benefit even if unit tests are absent.

Construction for verification diminishes coupling, and encourages developers to separate deterministic logic from logic depending on unreliable processes that require error handling. Doing this frequently trains you to become a better developer.

Unreliable processes can be mocked and error handling can be tested in a deterministic way.
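
A rough sketch of that separation (all names invented): the pricing logic is a pure function, the unreliable network call lives in a thin wrapper, and the test replaces only the wrapper so the error path can be exercised deterministically:

    import urllib.request
    from unittest import mock

    def fetch_exchange_rate(currency):
        # Unreliable process: a network call that can time out or fail.
        with urllib.request.urlopen(f"https://example.com/rates/{currency}") as resp:
            return float(resp.read())

    def price_in_currency(amount_eur, rate):
        # Deterministic logic: trivially testable, no mocks needed.
        return round(amount_eur * rate, 2)

    def quote(amount_eur, currency, fetch=fetch_exchange_rate):
        try:
            return price_in_currency(amount_eur, fetch(currency))
        except OSError:
            return None  # error handling for the unreliable dependency

    def test_quote_handles_failure():
        # Only the unreliable part is mocked, so the failure case is repeatable.
        failing = mock.Mock(side_effect=OSError("timeout"))
        assert quote(100, "USD", fetch=failing) is None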


I have not read the whole article - but I don't think he argues against automated testing in general. Unit testing is only one type of automated testing.


I share a similar sentiment about unit tests. When you have a lot of code with fast changing requirements, unit testing can be a HUGE waste of time. In my previous job, we spent much more time writing and fixing unit tests than actually adding new features. I was operating at like 1/10th productivity.

It might make sense if you're working for a huge corporation with a LOT at stake. Unit tests then become a form of risk management - It forces employees to think REALLY LONG AND HARD about each tiny change that they make. It's good if the company doesn't trust their employees basically.

I MUCH prefer integration tests. I find that when you test a whole API/service end-to-end (covering all major use cases), you are more likely to uncover issues that you didn't think about. They're also much easier to maintain, because you don't have to update integration tests every time you rename a method of a class or refactor private parts of your code.

About the argument regarding using unit tests as a form of documentation engine; that makes sense but in this case you should keep your unit tests really lightweight - Only one test per method (no need to test unusual argument permutations) - At that point, I wouldn't even regard them as 'tests' anymore, but more like 'code-validated-documentation'; because their purpose then is not to uncover new issues, but rather to let you know when the documentation has become out of date.

I think if you're a small startup and you have smart people on your team (and they all understand the framework/language really well and they follow the same coding conventions), then you shouldn't even need unit tests or documentation - Devs should be able to read the code and figure it out. Maybe if a particular feature is high-risk, then you can add unit tests for that one, but you shouldn't need 100% unit test coverage for every single class in your app.


We often see criticism of unit testing based on observations such as: tests are badly written, unmaintainable, incomprehensible, and they drag development down.

Which may very well be true! But I am amazed at the conclusion: That because tests are badly written, writing tests is a bad thing. No! Any code can be badly written, it doesn't mean that writing code is a bad thing. Tests, like any other piece of code, also need to be designed and implemented well. And this is something you need to learn and get experience with.

As to whether well-written unit tests are worth it, I cannot imagine how someone could efficiently maintain a codebase of any size without unit tests. Every little code change is a candidate to break the whole system without them, especially in dynamic languages.



Thanks for the link!


And here's some SQLite perspective: https://news.ycombinator.com/item?id=4616548


I think I know why I don't enjoy writing test. Most of my enjoyment from programming comes from feeling of power when I am able to write concise code that does a lot for me.

Testing spoils the fun as now I need to write another piece of code for each, single thing that my original piece of code is doing.

I am no longer a wizard casting fireball into a room. I'm also the guy that has to go over the corpses and poke each one with a stick strong enough and for long enough to check if they are absolutely totally dead.


Extending that analogy, I enjoy having a robot that will tell me if all the enemies are dead. Then I can try out different spells to my heart's content without having that nagging doubt about their effectiveness.


Except it's not a robot. It's a robot per body. Sure, some components are common, but you still have to build and grease each bugger yourself and adjust them whenever you change the spell in any meaningful way, because enemies will drop differently. I'm too lazy for that. I'll just walk with the people I should get through the room and kill the enemies that are meaningfully not dead. Besides, after this room there's another, and who has the time.

From time to time some of the people I lead will die due to my sloppiness, but such is life. It's not like I'm leading the royal family through this dungeon.


I once had a talk with Kent Beck way, way back in the day (early 2000s) about how many unit tests there should be, etc., and I think it's nicely captured by his reply here:

http://stackoverflow.com/questions/153234/how-deep-are-your-...

A lot of people seem to miss much of Kent's subtly but intentionally phrased advice. Unit tests are a liability, so use them responsibly and as little as possible, but not at the expense of removing confidence in your software.

Also, delete tests that aren't doing you any favors.


Thanks for the link! I never knew that Kent Beck, one of the fathers of TDD, said such a thing as early as 2008. And in the meantime herds of developers have started to chase goals like 90% test coverage, thinking that this is the way that guys like Kent Beck do TDD. If only they would have known earlier!


The problem is that to follow his advice you need judgement. From my experience more and more devs want strict self-imposed rules that need to be followed so they don't have to think.


If I'm writing Perl or something, I'll write unit tests just to verify the code runs in the basic cases.

I like Haskell because I can skip most of the unit tests. Integration tests are still good, and some unit tests like "confirm that test data are equal under serialization and then deserialization" help with development speed. But I can usually refactor vast swathes of code all I want without having to worry about breaking anything.

If you do write unit tests and your test passes on the first try, make sure you change the output a little bit to ensure it fails. It's more common than you'd think to accidentally not run a test.
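
The serialization round-trip check mentioned above translates to any language; a minimal Python sketch (the Order structure is invented for illustration):

    import dataclasses
    import json

    @dataclasses.dataclass
    class Order:
        id: int
        items: list
        total: float

    def test_order_round_trips_through_json():
        original = Order(id=7, items=["book", "pen"], total=12.5)
        # Serialize, deserialize, and check that nothing was lost on the way.
        restored = Order(**json.loads(json.dumps(dataclasses.asdict(original))))
        assert restored == original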


I couldn't agree more. I am doing a lot of Node.js development these days and it's usually a pain to verify that the code works the way it should. Writing a quick bunch of unit tests is far superior to loading your code in an interactive console and manually fiddling with inputs/outputs. These unit tests often break at the first refactoring and we rewrite them at that point. I'm a bit surprised that people argue that unit tests facilitate "aggressive" refactorings. Aggressive refactoring to me means architecture changes, and unit tests almost never survive those. Integration testing is what ensures proper functioning in those situations for me.


I generally try to keep my HN comments positive, but this is total Bullshit, yes, with a capital 'B'.

Unit tests are not albatrosses around the neck of your code, they are proof that the work that you just did is correct and you can move on. After that they become proof that any refactor of your code was correct, or, if a test fails and doesn't make sense, that the expectations of your test were incorrect. When you go to connect things up after that, and they don't work, you can look at the tests to verify that at least the units of code are working properly.

I am no TDD fan, but I do believe that writing your code in a way that makes it easy to test generally also improves the API and design of the entire system. If it's unit testable, then it has decent separation of concerns; if not, then there may be something wrong (and yes, this applies to all situations). I use this methodology for client/server interactions as well, where I can run the client code in one thread and the server in another, with no sockets, to simulate their functioning together (thus abstracting out an entire area of potential fault that can be tested in isolation from network issues).

The article/paper raises good points about making sure that the tests are not just being written for the sake of code coverage, but to say they are useless is just sloppy. Utilize the testing pyramid [1]; if you adhere properly to that, everything about your system will be better.

I have a serious question, given that this was written by a consultant, is it possible that tests get in the way of completing a project in a timely manner, thus causing a conflict of interest in terms of testing?

[1] - http://martinfowler.com/bliki/TestPyramid.html


System level integration tests are also proof that the work we did is correct and we can move on. And they are better proof than unit tests, because we aren't mocking everything. And they are far more likely to survive a large and risky refactor than unit tests.

Don't confuse testing in general with unit testing. Just use the right tool for the job. If your unit tests aren't catching a material number of bugs for the effort spent, compared to other testing methods, then don't write them. Unit tests have benefits such as quicker execution time, etc., but that has to be weighed against their cost.


The biggest benefit of unit test isn't catching bugs, but making aggressive refactorings possible.


My experience is that more often than not, unit tests get thrown away in aggressive refactorings, and system-level tests survive. If you have a system that depends on units A, B and C, and refactor it to use a totally different structure divided into units D, E and F, then the unit tests for A, B and C get thrown away, as they don't really have an equivalent in D, E and F. But the system-level test stays, as the system as a whole does the same thing. So the system-level test is what guarantees the safety of the refactor.

That is not always the case, but it has happened in most of the large refactorings I've been involved in.


This is true of any code you throw out. If you throw out the code, you should definitely throw out the corresponding tests.

It doesn't invalidate the fact that unit tests increase your confidence in your own code.


The point is, as your tests get more and more fine-grained, and the units themselves get more and more fine-grained (in my experience these go hand in hand with a TDD approach), the chances that the units themselves remain intact approach zero, because otherwise the only refactorings possible are completely trivial.


One could also say that bad unit tests make aggressive refactoring impossible. Both sides of the coin don't necessarily make for a persuasive argument.


Huh. Here was me thinking that was what static type systems were about. ;-)


Static typing does not allow you to refactor code with confidence that you didn't break the application logic. Only automated tests can do that, regardless of type system.


Sure. But an awful lot of failures in refactoring in dynamically typed languages would be avoided by a static type system.

As it happens, I am a Rust fanboy with all that entails, so I (like everyone else in such discussions) am clearly biased.

My point was in jest, as the “;-)” (because HN simply discarded a U+1F609) indicated.


static type systems != unit tests


That's what strong type systems are about. A strong explicit dynamic type system with appropriate tests can track down bugs more easily than a statically typed weak type system.


Unit tests can also prevent aggressive refactoring by being heavily coupled to the implementation, due to the nature of mocking essentially private details of the code under test. You then have to aggressively refactor your tests along with your code, and all the safety that the tests would give is gone.


units are likely to be aggressively refactored, no?


You can achieve this with non unit tests as well.


Sure. I should have just said "test".


System level integration tests also tend to be more flaky. Both unit and integration tests are useful.

Unit tests are also pretty quick to write once you have the mocking setup.


That's one of the dirty little secrets with end to end tests that almost no one talks about. You will probably spend more time running after ghosts in the machine than finding actual bugs.


I've had that experience too. But also the opposite.

I've worked both on code where writing the tests was more effort than the code, and on code where writing the tests was easy, quick and helpful. The latter makes sense, after all a good test is straightline code, zero ifs, zero loops. But the former?

I think the key is that mocking should be used sparingly, but without hesitation.


Plenty of people talk about it:

http://googletesting.blogspot.co.uk/2015/04/just-say-no-to-m...

The dirty little secret that nobody talks about is the cause of the ghosts: poorly engineered tests.


>System level integration tests also tend to be more flaky.

That's usually a sign that they've been engineered poorly or you have bugs in your code.

System level integration tests need appropriate environmental isolation and solid asynchronous & multithreaded code. Nobody can be bothered to write these properly for tests, hence the flakiness ("ooh let's just insert a sleep here" / "eh, does it really matter which version of postgres we run?").


They usually tend to be quick to run too; integration tests are rarely fast enough to run during development.


"Unit tests are [...] proof that the work that you just did is correct and you can move on."

Unit tests can only prove your software to be buggy. Even for the extremely simple "two, a function that, given no arguments, returns the number 2", your unit test can't verify that to be _always_ true. It may fail to return on Tuesdays (https://bugs.launchpad.net/ubuntu/+source/file/+bug/248619), internally use a web service that doesn't work on leap days (http://www.wired.com/insights/2012/02/leap-day-azure-outage/), other calls may overwrite the 'constant' it returns (http://programmers.stackexchange.com/questions/254799/ever-c...), etc.


> Even for the extremely simple "two, a function that, given no arguments, returns the number 2", your unit test can't verify that to be _always_ true

That is true; I don't think anyone argues against that point. What unit tests do is verify that the code works under specific conditions (as defined in the test's setup). When unit tests fail under those specific conditions, then you know there is a problem.

Another thing I like about unit tests is that it is much easier (and faster) to test different combinations of conditions.


You can verify it to always be true as long as the dependencies (which you've mocked) hold. If they don't... if that's a possibility then you might need some error recovery somewhere, not necessarily in this particular unit.


In functional programming languages, the type of "two" would indicate whether it has side effects. If not, then "two() == 2" is conclusive proof.


And they are also "executable documentation". If you as a developer want to know how to use something, unit tests can be of help.

Then, writing tests for the sake of code coverage is like taking a driving license exam and being judged on the streets you visited or how much gas you used. Tests need to be written with logic verification in mind.

I am deeply sad this article got upvoted. People should have a sense of responsibility when giving visibility to stuff like this.


Have you seen people reading tests much in practice?

I've heard the "executable documentation" line but my general impression is that people usually read the production code. They'll only look at the tests when they fail or they need to modify them.


Sorry if this is what you were specifying when saying "executable documentation", but have you seen the way Rust handles documentation tests [1]? The idea is to provide documentation via comments before a function definition, and inside this documentation you can provide example uses. The test-runner automatically extracts these examples and verifies them. In this way, your documentation and tests are always in agreement with each other, and a developer can be aware of expected behavior by reading either the code comments or the auto-generated html docs.

I'm rather fond of this system, personally.

1: https://doc.rust-lang.org/book/testing.html#documentation-te...
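
Python has a rough analogue in the standard doctest module: examples embedded in a docstring are executed by the test runner, so the documentation and the tests can't drift apart. A tiny sketch:

    def area(width, height):
        """Return the area of a rectangle.

        >>> area(3, 4)
        12
        >>> area(2.5, 2)
        5.0
        """
        return width * height

    if __name__ == "__main__":
        import doctest
        doctest.testmod()  # runs the examples found in docstrings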


Thanks, you got me trying out Rust's testing facilities. Very enjoyable. Makes you want to do low-level programming 10x more than with C/C++.


That's assuming the code even makes sense. Have you ever been looking at some code, having no precise idea what it does, with no commit history to give an explanation? That's when unit tests would be useful documentation.


If I start working on a new project and they have tests, I always start there. Heck, if I resume working on an old project of mine, I start with the tests.


> I am deeply sad this article got upvoted. People should have a sense of responsibility when giving visibility to stuff like this.

Did you actually read it? Because what you just wrote, is almost exactly the same as what the article author wrote.

His point, as I read it, was that unit tests can have purpose when you don't blindly make them for code coverage.

His case for having unit tests was:

> Keep unit tests that test key algorithms for which there is a broad, formal, independent oracle of correctness, and for which there is ascribable business value

But I don't think he would disagree with the case of providing some "executable documentation" as well. A few lines of code that demonstrate how to use a class or library, along with expected results, are definitely valuable. And those are exactly the kind of interfaces you don't expect to change even when refactoring.


> Unit tests are not albatrosses around the neck of your code

All code is a liability. The less code, usually the better. KISS principle.

> writing your code in a way that makes it easy to test generally also improves the API and design of the entire system

You can get the same result by applying SOLID principles, without the cost of maintaining extra code. The presence of unit testing doesn't guarantee easy-to-test code either (and this is the worst-case scenario: badly designed code with a lot of test coverage that breaks on every refactor).

> is it possible that tests get in the way of completing a project in a timely manner?

Are you dismissing this as a point? Time is a resource and writing unit tests adds to that cost.

Unit testing needs to:

* Make you faster because there are less bugs in the long run.

and/or

* Prevent money loss caused by bugs compared to the money lost caused by the opportunity cost of writing unit tests instead of new features.

Also, when writing unit tests, developers are less prone to refactoring when requirements change, as it also requires refactoring all the affected unit tests.

You can agree or disagree with the points in the article, but it is NOT just "bullshit".


Tests are not proofs, types provide proofs. Tests demonstrate that the exact conditions being tested actually work. That's not a bad thing but when working with static typing and robust type systems the need for unit testing goes down tremendously compared to dynamic languages.


While I agree with you to an extent (I'm a huge Rust fan and a longtime C/C++ and Java engineer), I disagree that a strongly typed language allows you to not write unit tests.

Unit tests verify logic; compilers only guarantee syntax. Yes, that handles a huge set of issues that in a dynamic language would require 100% test coverage, but you still need to check the logic of the code. Thus, unit tests are still a huge benefit.

One reason I love Rust is the built-in #[test] feature. Built-in tests from the ground up, awesome.


I didn't say don't write tests, but that it reduces the volume of tests needed, largely for the reasons you stated.


Types provide semantic information as well so checking verifies semantics as well (extent depends on the strength of the type system).

Saying you're a Rust fan and then saying compilers only guarantee syntax when the entire point of that language is to have a constrained subset in which types guarantee no data races/leaks at compile time doesn't make sense to me.


I didn't mean to imply that rust doesn't provide more than just syntax. You are correct that rust provides a huge number of guarantees around semantics, but honestly, I think it's not a traditional compiler, it provides a ton of features that you don't find in C compilers for example.

Rust is the greatest thing since sliced bread, my comment was only about statically typed and compiled languages in general. I think you read more into my comment than I intended.


Of course it's a compiler. A compiler verifies that its input conforms to the syntax and semantics of its own language, and generates machine code (or some other language) equivalent to the input.

The only difference is that Rust's semantics is waaayyy more strict than C's.


I'd counter that with a dynamic language, easy modularization and the ability to create composable single-function modules, code written that way is more easily understood and doesn't require the same type of testing as more complex systems in a static language, which lend themselves to layers of indirection and complexity for the sake of testability.

I'm not saying that testing isn't needed in either place, but a lot of the time "enterprise" practices that leak into scripted environments make code far more complicated and difficult to understand than it needs to be.


Why do you assume that modules in a statically-typed language are necessarily more complex? (Except for the type annotations, of course.)

Not every statically-typed language is Java. ;)


True, it does depend on the language... C#, for example, is limited to namespaces and classes; you can't write a simple function without a lot of wrapping... partial and static classes help, but it's not the same.

Though, I will say writing tests against and with a dynamic language tends to be far easier when things are written with modularity in mind.


It gets voted up because there's a grain of truth. Some (usually inexperienced) programmers really do write a lot of mock-heavy, boilerplate tests that don't verify any behavior of importance.

Maybe you haven't seen it. But keep in mind that consultants get called in when something has gone wrong. They tend not to see clean codebases by teams that are working well.


Actually, unit testing only ensures two things. One, that you've at least looked at and/or considered every piece of code at least twice. Two, it adds friction when writing complex code that isn't modular or a functional composition.

It is not a guarantee of anything beyond that. However, having something close to automated deployment means that you should have very close to 100% coverage if only to ensure the above two things.

One thing I've railed against at several points, more so in scripted environments, is DI/IoC. It's often not needed, and there are often simpler solutions than what is generally used. The benefits are being able to have multiple targets for an interface, and being able to more cleanly unit test some systems in a strongly typed compiled language.

All of that said, I don't always go for thorough testing, but I try to write code in such a way that it's more modular, and would be easier to test, should the need arise.


> The benefits are being able to have multiple targets for an interface, and being able to more cleanly unit test some systems in a strongly typed compiled language.

The benefit for DI/IOC for me is mainly not having to worry about how to compose the system.

If some new module I'm writing needs a particular dependency, then I just ask for that in my constructor.

I neither need to know nor care how to set it up or initialise it. That's taken care of by the container - either automatically (convention based registration) or by the maintainer of FooService.

This makes it dead easy to create smaller integration/test harnesses: run the normal container build step, and then resolve an instance of my class, out it pops with all the dependencies handled.


> they are proof that the work that you just did is correct and you can move on

Well... they should be. The number of Tests I've seen that are functionally useless is far too high.

Because what you test is much more important. I know I'm not imparting new wisdom here but if your test can't survive a refactor it's probably a) far too fragile b) poorly written and c) testing the wrong thing.

I actually would not be surprised if a large percentage of Unit Tests are useless, other than coverage stats, but I agree with you that this article is very full of hyperbole to the point of ruining any point it might have had.


Yes, I totally agree with this. I've worked in large orgs where QA engineers who needed work would come and ask what tests they could write. My response was always, "well, I think I handled all the positive use cases, so you can add negative test cases", which I've never really seen a huge point in (though that depends on the interface contract).


> given that this was written by a consultant, is it possible that tests get in the way of completing a project in a timely manner, thus causing a conflict of interest in terms of testing?

That's not really what conflict of interest means, but it's certainly possible that his being in that position gives him a different set of priorities than, say, a project manager, and he's working backwards to come up with an argument that justifies his opinion.


Perhaps it would have been better to phrase that as 'competing priorities', but you got the point.


>they are proof that the work that you just did is correct

Nope. They may be evidence, but they aren't proof. Proof comes from formal systems like type systems.


I've also found unit testing to be a fairly good way of documenting code. A unit test will often times shine light on an angle traditional documentation doesn't always cover.


> I have a serious question, given that this was written by a consultant, is it possible that tests get in the way of completing a project in a timely manner, thus causing a conflict of interest in terms of testing?

Doubt so, he works for a software testing consultancy. They've been around since 1994 apparently. http://rbcs-us.com/


I don't believe that you read the document. You are arguing against points which have not been made in the article, or which are more nuanced than one would believe by reading posts such as yours.

The article is very reasonable, logical and a good counterpoint to the countless TDD anecdotes which obsessively focus on unit-tests to the detriment of any other types of testing or design practices.


It's not a proof that your work is correct. It's a proof that a few cases, out of potentially millions, or billions, work.


Yeah, anyone that invokes Turing machines to claim unittests don't work has completely missed the point of testing their code. Perhaps if the article was talking about "formal verifications of code" it would bear some resemblance to good advice.


> If it's unit testable, then it has decent separation of concerns

But then the separation of concerns is driven by testability, not by the problem per se. It means that there's possibly a lot of complexity just for supporting testing.


I don't ever unconditionally trust code, but my faith rises:

* a little, if it passes a decent unit and/or integration test suite;

* a fair bit more, if it survives a round with humans who didn't write it but did set out to break it;

* quite a bit more, once it's been in production for a while without complaints tracing back to it;

* gradually approaching, but never reaching, perfect trust the longer it survives exposure to the world.

A lot of bugs will make it past step 1--even through tests with 100% code coverage. Hopefully no one takes them as anything near proof of correctness.


I'm not a strict TDD follower, yet a byproduct of my development process leaves me with unit tests. They give me confidence that the code I just wrote behaves the way I want it to, and deleting them after that would just ensure that refactoring isn't on the table without a significant time investment.

The drawback of this is that the test suite grows fast and needs to be groomed pretty often, but it's a fair price to pay; in the end it's faster for me and produces far fewer bugs.


>Unit tests are not albatrosses around the neck of your code, they are proof that the work that you just did is correct and you can move on.

IMHO, this is a failing of the type system. If you have functional purity, you don't have a need for unit tests, as they are blended into the code you write. Functional purity gives you the power to formally verify the correctness of your program.


> but to say they are useless is just sloppy

or link bait


Yes, that.


Hilarious:

They informed me that they had written their tests in such a way that they didn't have to change the tests when the functionality changed.


We have a test team that requires the product to stay backwards compatible with tests, not the other way.


People start writing tests like they start coding: badly. They forget all the DRY principles and don't architect their tests to reduce their coupling to the code they are trying to test. The result is tests that are a drag on additional new functionality. Even these are better than no tests, as you can refactor the tests to reduce the coupling.

The debate about whether UT or system tests or something in the middle is better is missing the point. A test should be understandable at any level. 5+ mocks per test generally doesn't help the next guy understand what you are trying to test.

If you can abstract your system behind an API to drive and test it, you'll have much longer lasting tests that are more business focused and importantly are clearer for the next person to understand.

I can see great value in identifying the slow and rarely failing tests and running them after the quick, more information-producing tests. Is there any CI support for such things? I know TeamCity can run failing tests first...


That would be true also if the next one trying to understand what you are trying to test is actually a girl.


There was also a follow-up article [1]. My take on the two articles is that he argues that integration tests should be able to replace unit tests in most cases. However, in my own experience, both kinds of tests have their places.

Why unit tests are good:

- You get well-tested parts that you can use in your integration tests, so that the integration tests truly catch the problems that couldn't be caught at a lower level. This makes trouble-shooting easier.

- Decoupled design - one of the key advantages of TDD

- Rapid feedback. Not all integration tests can be run as quickly as unit tests.

- Easier to set up a specific context for the tests.

There are more details in the blog post I wrote as a response [2].

[1] http://rbcs-us.com/documents/Segue.pdf

[2] https://henrikwarne.com/2014/09/04/a-response-to-why-most-un...


If the code is decoupled it's easier to reason about. The tricky part is breaking things down to the right level of granularity.


Operative word 'most', and then only when done by someone who doesn't understand the goal of unit testing. Any tool can be abused, including testing.

I became a 'convert' after having to clean up a fairly large mess. Without first writing a bunch of test code there would have been no way whatsoever to refactor the original code. That doesn't mean I'm a religious test writer and that there is 150% test code for each and every small program I write. But unit testing, when done properly, is certainly not wasteful, especially not in dynamic languages and in very low-level functions. The sooner you find out you've broken your code after making changes, the quicker you can fix the bug and close the black box again. It's all about mental overhead and trust.

Unit tests are like the guardrails on the highway: they allow you to drive faster, confident that there is another layer that will catch you if something goes wrong, rather than you ending up in the abyss.


Why are so many people saying that if some developers write bad unit tests, then all unit tests are pointless and a waste of time?

Yes, I've seen thousand line files of boilerplate unittests that don't actually say anything useful about the system. I've also written unit tests that tell me in 2 minutes rather than in 3 weeks that somebody has broken my code.

If your standard for a system of testing is that it guarantees that people can only write good code, you're insane.


Just to add another perspective. There is a (pretty sad but real) business case for unit tests (they don't need to be good, they just need to exist :( ). In some B2B/enterprise fields it is an excellent sales pitch to be able to say we have X% test coverage or 500 unit tests, OMG. The people who make the buy decisions want to sleep well, no matter whether that is based on sound reasoning or not. Test coverage is almost a "feature" (in the bad enterprise IT sense). It sounds less risky to buy the product with more test coverage. Less risk I can understand; it's like buying insurance... which is great, since insurance turns risk into a budgetable item... yes, please sell me this awesome software with that high test coverage.

/cynic


Throw away tests that haven't failed in a year? That's just a ridiculous point IMO. Written properly, tests act as perfect documentation for the system. Just because a screen or a process hasn't changed in a year, and therefore the tests have been consistently passing, does not mean you throw the tests away! You never know when someone will need to maintain that piece of the system.


I think that the author never understood why unit tests were introduced. Nowadays, with BDD slowly gaining momentum, unit tests are even more indispensable and they really do add business value. In the past, TDD used correctly added business value, but BDD takes it one step further, helping to write more valuable tests. When he speaks about testing all the register states and all the possible combinations of inputs, it is apparent that he is missing the point. The only thing that a well written unit test should accomplish is to make sure that a requirement is implemented correctly. And it must be written while designing the code. When I write some new feature I try to understand the requirements and model them into scenarios. At the same time I start writing the code to implement the required feature, which is complete only when the respective scenario is green. During this workflow I change both the code and the tests multiple times and, contrary to what the author experienced, my velocity is not hampered by the tests; it's actually increased. This happens because I can improve my design thanks to the tests, which force me to see the problem from the user's point of view. I don't really think that a person who only focuses on writing code without writing any scenarios can really tackle the problem in the best way.


>The only thing that a well written unit test should accomplish is to make sure that a requirement is implemented correctly.

A common side effect of writing these tests is discovering missing or conflicting cases. A recent example for me in eCommerce is our algorithm for determining if a particular product can be added to your cart. Things start simple (is it in stock?), but they get complicated really quickly. Is there a dropship vendor with stock? Do we expect to receive more stock in the next 24 hours? Is it discontinued? Clearance? Does it look like we'll run out in the next day and accidentally oversell the stock we have? (realtime inventory is a little fuzzy still).

When we unit test the module that handles all of this, we don't care about internal state or exhaustive input testing; we care about whether the user sees "In Stock" or "Out of Stock" for the situations we encounter.
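
A stripped-down sketch of what such tests can look like (the rules and names here are invented for illustration, not our actual logic): one decision function, and tests that pin down the user-visible outcome for the situations that matter:

    def availability(stock, incoming_within_24h=0, dropship_stock=0, discontinued=False):
        """Decide what the shopper sees for a product (invented rules)."""
        if discontinued and stock <= 0:
            return "Out of Stock"
        if stock > 0 or dropship_stock > 0 or incoming_within_24h > 0:
            return "In Stock"
        return "Out of Stock"

    def test_dropship_vendor_keeps_product_sellable():
        assert availability(stock=0, dropship_stock=5) == "In Stock"

    def test_discontinued_without_stock_is_not_sellable():
        assert availability(stock=0, discontinued=True) == "Out of Stock"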


There's actually a section of this where he basically says that a test that always passes is not useful (provides no information) and a test that fails sometimes is useful (provides lots of information). I'm not sure exactly what failure of reasoning led to this conclusion, but it's totally bogus. They are both useful.

The ideal case is that your codebase is entirely made up of code that never fails and tests that always pass. Obviously sometimes you are going to have tests that fail and introduce bugs that cause tests that used to pass to fail. But that's the reason that you write those tests, to find those problems.

The author gives the silly example of a method that always sets x to 5, and a test that calls it and makes sure x is now 5. That seems like a bad test, but anyone who's actually done work as a developer understands why it isn't. If you skip the tests that are simple and straightforward and seem like a waste of time, and only write more complicated tests, then you will have a hard time reasoning about what failed when the complicated test fails. Was your x = 5 method faulty? You don't think so, but you don't have proof since it wasn't tested. Having the test, as silly as it seems, lets you know that method is working.

Anyone who has been on a team that skips easy/simple tests knows what a mistake it is. And if you don't, you will eventually.


The beginning of the essay discusses the move from traditional bottom-up programming to top-down programming (using object orientation).

I have written a very large amount of Java code in my career, but after having spent a lot of (personal) time on a Common Lisp project (web application) I can safely say it's still possible to build modern applications using a bottom-up approach. I recommend people try it, it can be quite refreshing.


Any recommended reading to describe the bottom-up approach you are taking?


Unfortunately there isn't much that I know of. There are several frameworks out there, but I built my own. It only requires very little code to get something that mimics, for example, Ruby on Rails.

As I mentioned in another reply, I intend to do some writeups on this stuff, but unfortunately it doesn't have as high priority as it should. But at least a video or two should be doable soon enough.


I recently also spent some time on a CL webapp; do you have any details about what you did?


Well, the project is a Slack-like application. It's open source, so you're welcome to have a look at the code: https://github.com/cicakhq/potato

I have written a few blog posts about the architecture, but unfortunately not too much on the web part. I intend to write some more, and also make some videos showing how nice it is to develop a web application in the same process as the server is running.


Mandate from Above: "All code that is checked in must have seventy percent code coverage"

Developer Response: Keep tests the way they are. Puff up shipping code by adding layers that just check and transpose arguments; enough of these layers and you get your 70% metric of code that's basically guaranteed not to have issues, and check in.

My response, trying to take that stuff and port it: Rip out about 75% of the junk code while cursing clueless management and developers who, ultimately, didn't write very good tests (none of the tests ran anyway, because they required an elaborate lab setup that involved undocumented APIs, installs of binaries meant for OSes that were nearly end-of-life, a SQL server and a bunch of other shit infrastructure. Beware test environment creep).


I've read this advice a couple of times: "convert your unit tests to assertions". What does it actually mean? Say, in the context of web dev, do you add assertions to the code, and when they fail you log an exception and move on?

Any links related to it will be helpful.


Assertions are usually only run in development or test mode. So when you run an integration test, the checks you would otherwise put in unit tests happen inline, through the assertions.
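
A small Python sketch of that idea (the function is made up): the assertions live in the production code, fire whenever an integration test drives execution through it, and disappear when the interpreter runs with -O.

    def apply_discount(price, fraction):
        # Inline contract instead of a separate unit test: these checks fire
        # during any development or test run, and are stripped under `python -O`.
        assert 0 <= fraction <= 1, "fraction must be between 0 and 1"
        discounted = price * (1 - fraction)
        assert 0 <= discounted <= price
        return discounted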


    > When I look at most unit tests — especially those
    > written with JUnit — they are assertions in disguise.
Very often they're assertions that aren't in disguise at all. Python's `unittest`, for example.

I'd always assumed the point was to run these assertions at 'test-time', prior to distribution, and not have that code in the 'real' program.

Besides, most (?) of the time we probably want to fail more gracefully than that. (Okay we could `except AssertionError`, but typically it's going to be better to return something else, or raise and handle a more specific exception.)
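
A minimal sketch of the "fail more gracefully" alternative, assuming a hypothetical config loader: raise and handle a specific exception instead of leaning on assert.

    class ConfigError(Exception):
        """Raised when configuration values are missing or malformed."""


    def load_port(config):
        # Raise a specific, documented exception instead of using `assert`,
        # so callers can handle it without catching AssertionError.
        try:
            port = int(config["port"])
        except (KeyError, ValueError) as exc:
            raise ConfigError("invalid or missing port: {}".format(exc)) from exc
        if not 0 < port < 65536:
            raise ConfigError("port out of range: {}".format(port))
        return port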


>I'd always assumed the point was to run these assertions at 'test-time', prior to distribution, and not have that code in the 'real' program.

That's the point of assertions, as they are usually not included in release mode builds in compiled languages. In Python you could just write an assert function that does nothing if some global "release" variable is set to true.
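
A sketch of that, with a made-up soft_assert helper; note that Python's built-in assert statement already behaves this way, since it is stripped when the interpreter runs with -O (i.e. when __debug__ is False).

    RELEASE = False  # flipped to True for production builds/config


    def soft_assert(condition, message=""):
        # No-op in release mode, loud in development and test runs.
        if not RELEASE and not condition:
            raise AssertionError(message)

    # The built-in statement has an equivalent switch: `assert` statements are
    # removed entirely under `python -O`, when __debug__ is False.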

You only gain something from unit-tests if they test something that can't be equally well tested by writing the assertions into the regular code and running the software with typical input.


Is it possible to do TDD without unit tests? I ask because a place where I interviewed had no concept of mocking or injection and would just test the whole chain of classes. But the dev leads insisted on TDD. Hmm?


Yep:

https://en.wikipedia.org/wiki/Acceptance_test-driven_develop...

I much prefer this approach. The looser coupling gives you more freedom to undo your initial architectural mistakes.

You also can't effectively do unit test driven development on big balls of mud.

I dislike Gherkin-based languages though. The syntax design was not particularly well thought through.


No, I think they are conflating TDD with "having automated tests".


Yes. You could do BDD with something like Cucumber or Codeception. It's also possible that the tests use a test database without mocks. You'll generally see tests without mocks or injection in older codebases that were not built with injection in mind, where adding mocks would mean rewriting the entire application.


Wow.

> That means that tests have to be at least as computationally complex as code.

My BS sense is tingling. No matter how complex the code, in the end it comes down to comparing the output of a function (or state after execution) against what you expected.

Granted, OO regularly leads to designs where unit testing goes straight to hell, and after way too many lines of test setup you essentially only test that your mocking framework works. But spare me these incorrect blanket statements; they don't help. Thank you.
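
For example, a minimal sketch (build_report and its fields are made up): however complex the implementation becomes, the test itself is just an output comparison.

    def build_report(orders):
        # orders: list of (name, quantity, unit_price_in_cents) tuples.
        # However complex this grows internally, a test still just compares
        # its output (or the resulting state) against an expected value.
        total = sum(qty * price for _, qty, price in orders)
        return {"total_cents": total, "line_count": len(orders)}


    def test_build_report():
        report = build_report([("widget", 2, 999), ("gadget", 1, 2450)])
        assert report == {"total_cents": 4448, "line_count": 2}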


My knee-jerk reaction is that this is BS, but the text has good points and can serve as a cautionary example of how not to do things.

"... Large functions for which 80% coverage was impossible were broken down into many small functions for which 80% coverage was trivial. This raised the overall corporate measure of maturity of its teams in one year, because you will certainly get what you reward. Of course, this also meant that functions no longer encapsulated algorithms. It was no longer possible to reason about the execution context of a line of code in terms of the lines that precede and follow it in execution,"

Unit tests which break the code are stupid. Refactoring is good, but just splitting a large function into smaller pieces does nothing to improve the value of the code unless it's done in a way that makes the algorithm understandable and communicates that understanding.

Everything can be abused if not used with craft.


At my company, we (the developers) mainly write automated functional tests, and we test our product in black-box mode. I also prefer favouring functional tests, and I've found that many integration tests become useless. I keep unit tests for pieces of code that are used heavily in different contexts, i.e., code that could be turned into an external library, and/or code that is hard to cover with integration/functional tests, because it makes regression testing much faster.

As for black-box mode, I am not convinced it is always the proper strategy, especially when the product is being built incrementally. The typical example is when an entity's state is modified in a use case and the function to read that state has not been developed yet. In that case, I'd prefer the test to verify directly in the DB that the state has been updated properly.
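
A hedged sketch of that kind of check using Python and sqlite3 (the schema and the approve_order use case are invented): the test drives the use case, then asserts on the entity's state straight from the database, because no read function exists yet.

    import sqlite3


    def approve_order(conn, order_id):
        # Stand-in for the use case under test; normally invoked through the app.
        conn.execute("UPDATE orders SET status = 'approved' WHERE id = ?", (order_id,))
        conn.commit()


    def test_approve_order_updates_state_in_db():
        conn = sqlite3.connect(":memory:")
        conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
        conn.execute("INSERT INTO orders (id, status) VALUES (1, 'pending')")

        approve_order(conn, 1)

        # No "get order status" feature exists yet, so assert directly on the DB.
        status = conn.execute("SELECT status FROM orders WHERE id = 1").fetchone()[0]
        assert status == "approved"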
