Computational code handles your business logic. This is usually in the minority in a typical codebase. What it does is quite well defined and usually benefits a lot from unit tests ("is this doing what we intended"). Happily, it changes less often than plumbing code, so unit tests tend to stay valuable and need little modification.
Plumbing code is everything else, and mainly involves moving information from place to place. This includes database access, moving data between components, conveying information from the end user, and so on. Unit tests here are next to useless because (a) you'd have to mock everything out, (b) this type of code seems to change frequently, and (c) its behaviour is less clearly defined.
What you really want to test with plumbing code is "does it work", which is handled by integration and system tests.
A ton of dense, mathy code like hash computation, de/serialization, sin/cos computation, etc. is usually best implemented in a memory efficient C-style way but lends itself to be used in a very functional way; inputs and outputs without any retained state or side effects.
I think that subtlety is hard to articulate and gets lost.
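To make the point concrete, a hash like FNV-1a is implemented as a tight C-style loop internally, but is purely functional from the outside: bytes in, value out, no retained state. A minimal Python sketch (the constants are the standard 64-bit FNV parameters):

```python
def fnv1a_64(data: bytes) -> int:
    """FNV-1a 64-bit hash: a pure function with no retained state or side effects."""
    h = 0xCBF29CE484222325                                 # standard FNV-1a offset basis
    for byte in data:
        h ^= byte
        h = (h * 0x100000001B3) & 0xFFFFFFFFFFFFFFFF       # FNV prime, wrapped to 64 bits
    return h

# Same inputs, same outputs, every time -- so unit tests stay cheap and stable:
assert fnv1a_64(b"hello") == fnv1a_64(b"hello")
assert fnv1a_64(b"hello") != fnv1a_64(b"hellp")
```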
The idea that your business logic should be isolated from external dependencies (in his case, by making the code (pure) functional). That makes it easy to unit test the business logic, and your integration tests should be minimal (basically testing a single path to make sure everything is talking to each other).
It has advantages but it's an expensive waste of resources if you have cheap, effective integration tests.
Gary was coming from the land of Ruby-on-rails where a full set of integration tests could take hours. In that environment, structuring your code to enable easy testing of complex logic makes a lot of sense.
Likewise in a large enterprise environment, where integration testing across a (usually messy) set of interconnected dependencies is a pipe dream.
It's true that over-architecting is something to be wary of, but as usual, there's no one-size-fits-all answer.
It doesn't matter if the whole test suite takes hours. CI servers don't need to be supervised.
It's a really expensive way of discovering that you wrote shit code.
Operate only on your inputs. Return all of your outputs. No side effects.
You can, after all, write "functional C". (It can be hard, though.)
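A sketch of the same discipline in Python terms: operate only on your inputs, return all of your outputs, mutate nothing (the Account type here is made up for illustration):

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Account:
    owner: str
    balance: int

def apply_deposit(account: Account, amount: int) -> Account:
    # Operates only on its inputs and returns all of its output;
    # the original value is never mutated.
    return replace(account, balance=account.balance + amount)

a = Account("alice", 100)
b = apply_deposit(a, 50)
assert a.balance == 100   # input untouched
assert b.balance == 150   # all output returned
```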
Abstract code solves made-up problems, while concrete code solves real ones. Normally the best way to solve a real problem is by rewriting it as a series of made-up problems, and solving those made-up ones instead.
The made-up problems don't need to be purely computational; if you restrict them to pure ones, you'll lose a lot of powerful ones. They also don't need to fit functional programming well, but there is no loss of generality in imposing that restriction.
Also, the more abstract you make that code, the less it will need changing and the better unit tests will fit. At the extreme, once debugged it will never change. Instead, if your needs change too much, your concrete programs will simply stop using it and use something completely different.
For example, let's say you want to get some users from your DB in response to an HTTP call. We rewrite this problem in terms of crafting an SQL query, taking some data from the HTTP request to create that query. We can of course easily test that the code creates the query we designed, that the query contains the right information from the HTTP request, etc. But if we don't actually run the query on the actual DB with the actual users, we don't really know whether our query does the right thing, even if we know our code creates the query we intended. And if the DB changes tomorrow, our very abstract code that parametrizes a particular SQL query will still need to change, so our existing unit tests will be thrown away as well.
This is the kind of plumbing code the OP was talking about, and I don't think you can reduce the problem in any way to fix this (especially if the DB is an external entity).
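A minimal sketch of the split being described, with a hypothetical users table; the pure query-construction half is easy to unit test, but only an integration test against the real DB tells you the query is actually right:

```python
def build_user_query(min_age: int, limit: int) -> tuple:
    """Pure query construction: trivially unit-testable, but a passing test
    only proves we built the query we intended -- not that the query is
    right for the actual database. (Schema and names are made up.)"""
    if limit <= 0:
        raise ValueError("limit must be positive")
    sql = "SELECT id, name FROM users WHERE age >= %s ORDER BY id LIMIT %s"
    return sql, (min_age, limit)

# The unit test can check parameterization...
sql, params = build_user_query(min_age=18, limit=10)
assert params == (18, 10)
assert "WHERE age >= %s" in sql
# ...but only running it against a real DB with real users can tell us
# whether it returns the users we meant.
```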
I agree with this and would go even further. Divide your code into "stateless" functional code and "stateful" objects code.
Original OO was encapsulating things like device drivers that did I/O--it didn't represent data.
If you don't interleave your stateless business logic with your stateful persistence, it's easy to mock "objects" that do the plumbing, and all the meat of the program is unit tests.
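A rough sketch of that separation, assuming a hypothetical repository and discount rule; the plumbing "object" is trivially fakeable because the business logic never touches it directly:

```python
class InMemoryUserRepo:
    """Stand-in for the stateful persistence layer; easy to fake because
    the stateless logic below is never interleaved with it."""
    def __init__(self):
        self._rows = {}
    def save(self, key, record):
        self._rows[key] = record
    def load(self, key):
        return self._rows.get(key)

def apply_discount(record: dict, percent: int) -> dict:
    # Stateless business logic: pure data in, new data out.
    return {**record, "price": record["price"] * (100 - percent) // 100}

repo = InMemoryUserRepo()
repo.save(1, {"price": 200})
discounted = apply_discount(repo.load(1), percent=10)
assert discounted["price"] == 180
```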
Fwiw, the DI model (Guice, Spring, etc.) in modern Java/Scala shops closely hews to this, even if people don't mentally categorize it as such.
IINM you are basically referring to the difference between static and instance methods in languages like C++ and Java.
Putting code that neither reads nor writes the object state in instance methods is a common mistake made in both those languages.
That said, both stateful and stateless code are good candidates for unit testing, especially when the code under test is a state machine rather than just a data encapsulation mechanism.
Your _program_ should have the flow of a function. At the architectural level, who-the-ef cares about static vs instance methods in Java (I say this as a person with 23 years of Java experience). It has nothing to do with languages. You can do this in any language you want.
You want to have your inputs go through a process where you have (1) INPUT state transfer, (2) some computation F(INPUT), (3) some output and state transfer, or RESULT = F(INPUT).
If you do not have (1) or (3)--I hate to break it to you--but all your program does is burn CPU. If you don't have (2), your program does nothing at all.
The key thing with scalable systems is they manage complexity well. If you're at the level where you're worried about "static or instance methods", you're not dealing with how data changes in large systems at all. Those words are at the level of state within a language.
You need to optimize at the global systems level.
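The (1)-(2)-(3) shape described above can be sketched as (names are illustrative; the I/O streams are passed in so the shape itself is testable):

```python
import io
import json

def f(inp: dict) -> dict:
    # (2) the computation: without this, the program does nothing at all
    return {"total": sum(inp.get("values", []))}

def main(stdin, stdout) -> None:
    inp = json.load(stdin)            # (1) INPUT state transfer
    result = f(inp)                   # RESULT = F(INPUT)
    json.dump(result, stdout)         # (3) output and state transfer

out = io.StringIO()
main(io.StringIO('{"values": [1, 2, 3]}'), out)
assert out.getvalue() == '{"total": 6}'
```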
Who-the-ef should care is anyone who has to implement or maintain the code. After all, the debate at hand is what is worth unit testing, which very much concerns the programming language and the actual implementation. Don't know about you, but I both architect the system and write the code.
> If you do not have (1) or (3)--I hate to break it to you--but all your program does is burn CPU.
I haven't written production code that doesn't have (1) or (3) in my 25 years of programming, so not sure who you are talking to here.
> If you're at the level where you're worried about "static or instance methods", you're not dealing with how data changes in large systems at all. Those words are at the level of state within a language.
You have to tend to this stuff at both the generic data processing and language level. Using a given language's constructs for differentiating between stateful and stateless code is an important part of making the code document itself.
Coding style matters.
It. does. not.
If it did, PHP wouldn't be running half the world. Structure and systems matter.
OK 'brah', whatever works for you!
> If it did, PHP wouldn't be running half the world
PHP has a style guide, and there is such a thing as clean, readable PHP code.
I bet massive scale PHP based apps (like you know, Facebook) probably enforce style in their codebase.
If you are writing a one-shot script to transmute data from one format to another for, say, an upgrade, I don't care whether you have unit tests, as long as I'm confident it has been manually tested to satisfaction. No repeatability, no regression requirement. There could be (and likely is) value in TDD, so tests might still be a thing if that is how you work. No objection there.
If you are developing the plumbing code that will ensure my system adheres to financial regulations and, if it were to break, land me in jail for negligence, you can be damn sure I'm demanding a test that will be run every time that system is built/deployed.
I wrote unit tests >10 years ago for formatting a string for postal codes that I know are still run to this day on every commit because if they get it wrong there is legal recourse for the company that owns that system.
It's also super quick to fix and failing at build is quicker and cheaper than failing in prod, even without the recourse. That test took me all of 1 minute to write. Bargain.
If it's critical for your business I'd categorize that as business logic, not plumbing code, well deserving of unit test coverage.
Unit tests and automated tests are two completely different concepts.
Unit test algorithmic code; use integration tests for everything else, i.e. plumbing code.
I agree very strongly with this but a lot of people will be very unhappy with this idea.
That allows strict separation of all I/O from testable business logic.
If you can separate pure logic from your I/O, then instead of a Russian-doll program you get a flat pipeline that looks like:
a <- readFromApi
b <- doBusinessLogic(a)
c <- writeToPersistence(b)
If you do things this way, you can always isolate your business logic from your dependencies.
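Here's one way that pipeline might look in Python, with the two I/O edges injected so the pure core needs no mocks at all (names are illustrative):

```python
def do_business_logic(payload: dict) -> dict:
    # Pure: unit test this exhaustively, no mocks needed.
    return {"name": payload["name"].strip().title()}

def handle_request(read_from_api, write_to_persistence) -> None:
    a = read_from_api()              # I/O at the edge
    b = do_business_logic(a)         # pure core
    write_to_persistence(b)          # I/O at the edge

# In tests the two edges are trivial stubs:
stored = []
handle_request(lambda: {"name": "  ada lovelace "}, stored.append)
assert stored == [{"name": "Ada Lovelace"}]
```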
By all means, if the transformation is non-trivial and it is captured entirely in the logic of this method, not in the shape of the API and the DB, then you should unit test it (e.g. say you are enforcing some business rules, or computing some fields based on other fields). But if you're just passing data around, this type of testing is a waste of time (you have no reason to change the code if the API or DB don't change, so the tests will never fail) and brittle (changes in the API or in the DB will require changing both the code and the tests, so the tests failing doesn't help you find any errors you didn't know about).
So I would argue you don't actually have business logic then. Your service is anemic, and you have a data transformation you need to do. I definitely think that you should do an integration test for that.
Moving JSON -> Postgres or whatever is something that you absolutely still can test with the output of the DML statement by your DB library. It may be a silly test, but that's because if there's no business logic, it's a silly program _shrug_.
a <- readFromApi ( Input x )
b <- doBusinessLogic(a) ( f(x) )
c <- writeToPersistence(b) ( Output y = f(x) )
You can also imagine that there is more than one lookup from the DB, or service calls as I/O in different parts of the pipeline (g(f(x)) etc.), but it's always possible to have state pulled in explicitly and pushed down explicitly into business logic as an argument. It tends to make programs have flatter call stacks as well.
>> Write tests. Not too many. Mostly integration.
If errors in your system result in death, and if changes must go through an expensive and time consuming process to be approved, and then an expensive and time consuming process to be applied, you should spend a lot of time ensuring your design is sound, and your implementation matches your design. A good place for formal methods.
If you're writing server side code, and deploy takes 5 minutes, you can be a cowboy for most things that won't leave a persistent mess or convince customers to leave.
If you're writing client side code that needs to go through a pre-publication review, neither cowboy nor formal methods is a good choice.
A lot of code can be in this area where it is absolutely unit testable, but the unit tests are almost entirely useless, as the code only ever changes because the input or output types change, so the tests also need to change.
I think of this in terms of code that is 'authoritative' for its logic or not.
For example, a sorting method is authoritative - it is the ultimate definition of what sorting means. Also, a piece of code that validates some business rule defined in a document is the authority for that business rule.
But a piece of code that takes input from the user and passes it to some other piece of code is not authoritative for this transformation. The functionality of this kind of code is not defined by some spec, but by 'whatever the other piece of code wants to receive', which may be arbitrarily hard to define.
Depending on the complexity of the transformation, there may still be reasons to test parts of this code, at least to ensure that a new field here doesn't affect the way we transform that other field there, but often only small pieces of it are actually worth testing.
I unit test business logic, since that is the core of the application and MUST work as expected.
I'm not going to unit test that a link someone clicks on goes to the page they expect.
What I've been doing is writing as many parts of the game as libraries as is possible, and then implementing the minimal possible usage of that library as a semi-automated test. For instance, our collision system is implemented as a library, and you can load up a "game" that has the simplest possible renderer, no sound, basic inputs, etc. and has a small world you can run around in that's filled with edge cases. This was vastly easier than trying to write automated tests for 3d collision code, and you get the benefit of testing the system in isolation, if not automatically. For other libraries like networking, the tests are much more automated, but they poke the library as a unit, rather than testing all the little bits and pieces individually.
I test the parts that are actually mine as best I can, but most of my debugging consists of driving it by hand.
More importantly, the fact that your app works with the mocks doesn't give you good information about whether your app works with the actual services.
I think of the "computational" type more as a "deterministic data transformation" type. That applies to transformations of any data whether text, images, or the state of a machine.
I think of plumbing as the movement of data without any transformation, or, if a transformation occurs, it occurs at an abstracted layer that must itself be unit tested independently.
Using the old Asteroids arcade game as an example: The business logic is how many lives the player has, what happens when you shoot asteroids (they break up, or disintegrate if they're small), what happens when you reach the edge of the map (you wrap around to the other side), what kind of control scheme there is (there's momentum in Asteroids, you don't stop on a dime), etc.
Speaking as a formerly young and arrogant programmer (now I'm simply an arrogant programmer), there's a certain progression I went through upon joining the workforce that I think is common among young, arrogant programmers:
1. Tests waste time. I know how to write code that works. Why would I compromise the design of my program for tests? Here, let me explain to you all the reasons why testing is stupid.
2. Get burned by not having tests. I've built a really complex system that breaks every time I try to update it. I can't bring on help because anyone who doesn't know this code intimately is 10x more likely to break it. I limp to the end of this project and practically burn out.
3. Go overboard on testing. It's the best thing since sliced bread. I'm never going to get burned again. My code works all the time now. TDD has changed my life. Here, let me explain to you all the reasons why you need to test religiously.
4. Programming is pedantic and no fun anymore. Simple toy projects and prototypes take forever now because I spend half of my time writing tests. Maybe I'll go into management?
5. You know what? There are some times when testing is good and some times where testing is more effort than it's worth. There's no hard-set rule for all projects and situations. I'll test where and when it makes the most sense and set expectations appropriately so I don't get burned like I did in the past.
- Is the language you're using dynamic? Large refactors in Ruby are much harder than in Java, since the compiler can't catch dumb mistakes
- What is the likelihood that you're going to get bad/invalid inputs to your functions? Does the data come from an internal source? The outside world?
- What is the core business logic that your customers find the most value in / constantly execute? Error tolerances across a large project are not uniform, and you should focus the highest quality testing on the most critical parts of your application
- Test coverage != good testing. I can write 100% test coverage that doesn't really test anything other than physically executing the lines of code. Focus on testing for errors that may occur in the real world, edge cases, things that might break when another system is refactored, etc.
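A small illustration of that last point: both "test suites" below give 100% line coverage of clamp, but only the second actually pins down the behavior:

```python
def clamp(value: int, low: int, high: int) -> int:
    if value < low:
        return low
    if value > high:
        return high
    return value

# This "suite" executes every line, yet asserts nothing:
clamp(5, 0, 10); clamp(-1, 0, 10); clamp(99, 0, 10)

# These actually test the logic, including the boundaries:
assert clamp(-1, 0, 10) == 0
assert clamp(99, 0, 10) == 10
assert clamp(0, 0, 10) == 0 and clamp(10, 0, 10) == 10
```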
For lexer and parser tests, I tend to focus on the EBNF grammar. Do I have lexer test coverage for each symbol in a given EBNF, accepting duplicate token coverage across different EBNF symbol tests? Do I have parser tests for each valid path through the symbol? For error handling/recovery, do I have a test for a token in a symbol being missing (one per missing symbol)?
For equation/algorithm testing, do I have a test case for each value domain? For numbers: zero, negative number, positive number, min, max, and values that yield the min/max representable output (plus one above/below to overflow).
I tend to organize tests in a hierarchy, so the tests higher up only focus on the relevant details, while the ones lower down focus on the variations they can have. For example, for a lexer I will test the different cases for a given token (e.g. '1e8' and '1E8' for a double token), then for the parser I only need to test a single double token format/variant as I know that the lexer handles the different variants correctly. Then, I can do a similar thing in the processing stages, ignoring the error handling/recovery cases that yield the same parse tree as the valid cases.
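A toy version of that hierarchy, using a made-up two-token grammar: the lexer tests cover every variant of a token, so the layer above only needs one representative variant:

```python
import re

# Hypothetical token rule from a small EBNF: number ::= digits [("e" | "E") digits]
TOKEN_RE = re.compile(r"(?P<NUMBER>\d+(?:[eE]\d+)?)|(?P<PLUS>\+)")

def lex(src: str) -> list:
    """Return (token_name, lexeme) pairs for the toy grammar."""
    return [(m.lastgroup, m.group()) for m in TOKEN_RE.finditer(src)]

# Lexer tests cover every variant of the NUMBER token...
assert lex("1e8") == [("NUMBER", "1e8")]
assert lex("1E8") == [("NUMBER", "1E8")]

# ...so the higher layer can use a single representative variant:
assert lex("1e8+2") == [("NUMBER", "1e8"), ("PLUS", "+"), ("NUMBER", "2")]
```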
A bug can be critical (literally life-threatening) or unnoticeable. And this includes the response to the bug and what it takes. When I write code for myself I tend to put a lot of checks and crash states rather than tests because if I'm running it and something unexpected happens, I can easily fix it up and run it again. That doesn't work as well for automated systems.
High test coverage comes from a history of writing tests there. Sadly, people include feature and functional tests in the coverage.
The missing bit in the discussion is 1) churn, and 2) a dev's ability to write fairly clean code.
Early stage and 'toy' projects may change a lot, in fundamental ways. There may be total rewrites as you decide to change out technologies.
During this phase, it's pointless to try to 'harden' anything because you're not sure what it's entirely supposed to do, other than at a high level.
Trying Amazon DynamoDB, only to find a couple of weeks in that it's not what you need, means it probably wouldn't make sense to run it through the gamut of tests.
Only once you've really settled on an approach, and you start to see the bits of code that look like they're not going to get tossed, does it make sense to start running tests.
Of course the caveat is that you'll need to have enough coding experience to move through the material quickly, in that no single bit of code is a challenge; it's just that 'getting it on the screen' takes some labour. The experience of 'having done it already many times' means you know it's 'roughly going to work'.
I usually try to 'get something working' before I think too hard about testing, otherwise you 3x the amount of work you have to do, most of which may be thrown out or refactored.
Maybe another way of saying it, is if a dev can code to '80% accuracy' - well, that's all you need at the start. You just want the 'main pieces to work together'. Once it starts to take shape, you've got to get much higher than that, testing is the way to do that.
When you’re starting out a project and “discovering” the structure of it, it makes very little sense to lock things in place, especially when manual testing is inexpensive.
Once you have more confidence in your structure as it grows you can start hardening it, reducing the amount of manual testing you do along the way.
People that have hard and fast rules around testing don’t appreciate the lifecycle of a project. Different times call for different approaches, and there are always trade offs. This is the art of software.
If you do make a slight tweak somewhere, the compiler will tell you there’s something broken in obscure place X that you would find out at runtime say with Ruby or Python.
THAT'S the winning formula. I've written so many tests for Python that ensure a function's arguments are validated, rather than its core logic/process.
Not so fast. For some problems it's great, for other ones it's not.
Have you tried writing a numeric or machine learning core in Haskell? You'll notice that the type system just doesn't help you enforce correctness. Have you tried writing low level IO? The logic is too complex to capture in types; if you try to use them you'll have a huge problem.
Rust's got a very Haskell-like type system, but it's a systems programming language. People are literally writing kernels in it. I think this is a pure-functional-is-a-bad-way-to-do-real-time-I/O thing, not a typing thing.
That said, I don't think it's impossible to type IO. https://lexi-lambda.github.io/blog/2020/01/19/no-dynamic-typ... isn't the same problem, but it's related.
If you try to verify the kind of state machines that low level I/O normally use with Haskell-like types, you will gain a huge amount of complexity and probably end with more bugs than without.
Let's say you're writing a /dev/console driver for an RS-232 connection. Trying to represent "ring indicator", "parity failure", "invalid UTF-8 sequence", "keyboard interrupt", "hup" and "buffer full" at the same level in the type system will fail abysmally, but that's not a sensible way of doing it.
I could definitely implement this while leveraging the power of Rust's type system – Haskell would be a stretch, but only because it's side-effect free and I/O is pretty much all side-effects.
If you're doing React + TypeScript, give ReasonML a go; it's syntactic sugar on top of OCaml that compiles via BuckleScript. OCaml has one of the fastest compilers out there.
Meanwhile the plugins and IDE integrations for Reason/Ocaml and F# are ready to go from the start and work pretty well.
So then I came along and said, "hey, why don't we have any unit testing?" and it turns out because it was pretty impossible to write unit tests with our code. So I refactored some code and gave a presentation on writing testable code - how the point of unit testing isn't just to have lots of unit tests, how it's more that it encourages writing testable code, and that the point of having testable code means that your codebase is then easier to change quickly.
I even showed a simple demonstration based off of four boolean parameters and some simple business logic, showing that if it were one function, you'd have to write 16 tests to test it exhaustively, but if you refactored and used mocking, you'd only have to write 12. That surprised people. Through that we reinforced some simple guidelines of how we'd like to separate our code, focusing on pure functions when possible, making layers mockable. We don't even have a need for a complicated dependency injection framework as long as we reduce the # of dependencies per layer.
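One plausible way the arithmetic works out (the split and names here are illustrative, not the actual demo): a single function of four booleans needs 2^4 = 16 exhaustive cases, while two functions of two booleans each need 2^2 + 2^2 = 8, plus 4 cases of the composition with both checks mocked, for 12 total:

```python
from itertools import product

def check_a(p: bool, q: bool) -> bool:
    # Hypothetical rule over the first two flags.
    return p and not q

def check_b(r: bool, s: bool) -> bool:
    # Hypothetical rule over the other two flags.
    return r or s

def decide(p, q, r, s, check_a=check_a, check_b=check_b) -> bool:
    # Composition layer: the checks are injectable, hence mockable.
    return check_a(p, q) and check_b(r, s)

# Monolithic: exhaustive testing of decide() needs 2**4 = 16 cases.
assert len(list(product([True, False], repeat=4))) == 16

# Layered: 4 cases per check (2**2 each), plus 4 mocked composition cases = 12.
for pq in product([True, False], repeat=2):
    assert check_a(*pq) == (pq[0] and not pq[1])
for rs in product([True, False], repeat=2):
    assert check_b(*rs) == (rs[0] or rs[1])
for a, b in product([True, False], repeat=2):
    assert decide(0, 0, 0, 0, check_a=lambda *_: a, check_b=lambda *_: b) == (a and b)
```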
Since that time we've separated our test suite into integration tests and unit tests, with instructions to rewrite integration tests to unit tests if possible. (Some integration tests are worthwhile, but most were just because unit tests were hard at that time.) We turned parallelism back on for the unit test suite. The unit tests aren't flaky, and now people are running the unit test suite in an infinite loop in their IDE. Over that time our codebase has gotten better structured, we have less interdependence and merge conflicts, morale has improved, velocity has gone up.
Anyway, according to this article it sounds like we've done basically the opposite of what we should have done.
And that by following those three principles, it kind of drives you to writing testable code. Because if you don't, you might have tests that are only small (simple integration tests), or only fast and reliable (testing unfactored code with lots of mocking) - and that the only way to do all three is by refactoring to write testable code that has good layer separation and therefore minimal mocking requirements.
There was stuff in there about how mutable state and concurrency leads to non-determinism and therefore unreliable tests, which is part of what justifies pushing towards pure functions that can be easily unit tested without mocking.
Only half your time? You're doing testing wrong if it doesn't take 80% of the time ;-)
I have a love-hate relationship with testing. Working for myself as a company of one, some of the benefits testing brings just don't apply. I have a suite of programs built in the style of your point (1). The programs were quick to market and hacked out whilst savings ran out, not knowing if I would make a single sale.
Sales came, customer requests came, new features were wanted, sales were promised "if the program could just do xyz". More things were hacked on. The promise of "I will go back and do this properly and tidy up this unholy mess of code" slowly slipped away, until I stopped lying to myself that I would do it.
Yes, there was a phase of fix one problem, add another, but I have most of that in my head now, and it has been a long time since that happened.
Not a single test. Developing the programs was "fun" and exciting. Getting requests for features in the morning and having the build ready by lunch kept customers happy.
Now I am redoing the apps as a web app, for "reasons". This time I am doing it properly, testing from the start. I know exactly what the program should do and how to do it, unlike the first time, when I really had no idea. But still, I come to a point and realise the design is wrong and I hadn't taken something into consideration. Changing the code isn't so bad; changing the tests, O.M.G.
I am so fed up of the project, I do all I can to avoid it, it is 2 years late, I wish I never started it. The codebase has excellent testing, mocks, no little hacks, engineering wise am proud of it. The tests have found little edge cases that would have been found out by customers so avoided that. But there is no fun in it. No excitement. Is just a constant drudging slog.
I'm trying to avoid dismissing testing altogether, as I really want to see the benefit of it in a substantial production code base, if I ever get there. At the moment, the code base is the best-tested unused software ever written, IMO.
The thing about testing that never really gets talked about is: what's the penalty for regressions? What are the consequences if you ship a bug so bad the whole system stops working?
Well, if you're building a thing that's doing hundreds of millions in revenue, that might be a big deal. But you? You're a team of one! You rollback that bad deploy and basically no one cares!
Your customers certainly don't care if you ship bugs. If it was something important enough where they REALLY cared, they wouldn't be using a company of one person.
So, go for it. Dismiss tests until you get to a point where you fear deploying because of the consequences. Then add the bare minimum of e2e tests you need to get rid of that fear, and keep shipping.
Having said all that, I find it's better to skip some unit tests when building your own project. It can be better to do the high-level tests (some integration, focused on the system) to make sure the major functionality works. In many cases, for an app that's not too complicated, you can just have a rough manual test plan. Then move to automated tests later on if the app gets popular, or the manual testing becomes too cumbersome.
It's still good to have a few unit tests for some tricky functions that do complicated things so you aren't spending hours debugging a simple typo.
Human lives, customer faith in the product, GDPR violations, HIPAA violations, data, time/resources in space missions.
I somehow doubt that comparing this 'team of one project' to the Mars Climate Orbiter leads to any useful conclusions. It's a nice bit of hyperbole though!
Anyway, this was to address the issue of a bug. I took the comment of "it's just a team of one" as a way of trying to justify not putting your engineering due diligence into delivering a product to the customer.
I've delivered a number of products (in the early days of my career) to clients where data loss happened and while not fun, it also didn't significantly harm the product or piss off said client. I saw my responsibility primarily to do the best I could and clearly communicate potential risks to the client.
> I took the comment of "it's just a team of one" as a way of trying to justify not putting your engineering due diligence into delivering a product to the customer.
That I do agree with, but 'due diligence' is a very vague concept. I guess honest communication about the consequence of various choices is perhaps the core aspect?
And of course 'engineering due diligence', in my opinion, includes making choices that might lead to an inferior result from a 'purely' engineering perspective.
Yes. This is exactly what this person should do. Stop worrying about arbitrary rules and just deliver the damn product already. A hacky, shitty, unfinished product in your customer's hands that can be iterated on beats one that never got shipped at all every day of the week.
I've worked for myself as well and know what you mean. In my situation, I was able to save myself from testing by telling my customers "this is a prototype so expect some issues".
Overall, the blog post says, unit tests take a long time to write compared to the value they bring - instead (or also) focus on more valuable automated integration tests / e2e tests because it is much easier than it was 10-20 years ago.
Your comment on the other hand, less so...
One of the things that distinguishes great engineers is that they make good judgment calls about how to apply technology or which direction to proceed. They understand pragmatism and balance. They understand not to get infatuated with new technologies but not to close their minds to them either. They understand not to dogmatically apply rules and best practices but not to undervalue them either. They understand the context of their decisions, for example sometimes code quality is more important and other times getting it built and shipped is more important.
As in life, good and bad decisions can be the key determiner of where you end up. You can employ a department full of skilled coders and make a few wrong decisions and your project could still end up a failure.
Some people never develop good engineering judgment. They always see questions as black and white, or they can't let go of chasing silver bullet solutions, etc.
Anyway, it's one thing to understand how to do unit tests. It's another thing to understand why you'd use them, what you can and can't get out of them, what the costs are, and take into account all that to make good decisions about how/where to use them.
I keep tests together with the code, because of their documentation/specification value.
I do not write tests for functions which are compositions of library functions. I do not test pre/post-conditions (these are something different).
And I definitely do not try to have "100% test coverage".
Personally, I fast-tracked through 2-4 out of sheer laziness, but that's definitely my progression with regard to testing and pretty much everything related to code quality. It includes comments, abstraction, purity, etc.
- Initially, you are victim of the Dunning–Kruger effect, standing proudly on top of Mount Stupid. You think you can do better than the pros by not wasting time on "useless stuff".
- Obviously, that's a fail. You realize the pros may have a good reason for working the way they do. So you start reading books (or whatever your favorite learning material is), and blindly follow what's written. It fixes your problems and replaces them with other problems.
- After another round of failure, you start to understand the reasoning behind the things written in the books. Now, you know to apply them when they are relevant, and become a pro yourself.
One thing I do religiously all the time is putting asserts everywhere. It's the only thing you can go crazy on. The rest is indeed always a balancing act.
What are you testing for?
This is critical because it basically gives you immediately what you should and should not test, and how. While mindless, dogmatic, metric oriented testing is a waste, testing with higher intent and purpose is extremely useful.
An example: test that something working on current vX also works on vA to vW, and when vZ comes out, have the answer ready. Or that a biz feature fulfills the requirements. Or that someone not as well versed in the intricate details of your piece of ownership will be confident in that piece still working after a simple fix while you're on vacation. It can be one, some, but probably not all.
With that in mind, what to test, what doesn’t make sense to test, and what to test against becomes more clear: should I mock this? or should I run it against some staging environment? Should I perform (yikes? not!) manual testing?
The answers are highly dependent on the piece of code being tested.
Tests are here to help you answer a question, if you aren’t sure what the question is then your tests will miss the point.
100% coverage of what exactly? Tests that go through all your lines of code without testing any of the logic are useless. If you want to be thorough, you need to do mutation testing, which is a system that tests the quality of your unit tests by mutating your logic (changing a > to >=, a + to -, etc.) and then expects at least one test to fail. If no test fails, that piece of logic wasn't tested.
Without that, it's entirely possible your high code coverage doesn't actually test anything meaningful. Also, this sort of logic is exactly the kind of stuff you want to unit test. All the standard plumbing boilerplate code is not something that needs to be unit tested. The logic does.
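As a minimal illustration of the idea (C++, all names invented): the "weak" test below gives 100% line coverage of the logic yet lets the mutant survive, while a boundary test kills it.

```cpp
#include <cassert>

// The logic under test.
bool is_adult(int age) { return age >= 18; }

// A mutant a mutation-testing tool might generate: ">=" flipped to ">".
bool is_adult_mutant(int age) { return age > 18; }

// A coverage-only test: it executes the line but never probes the
// boundary, so it passes against both versions -- the mutant survives.
bool weak_test_passes(bool (*version)(int)) {
    return version(30);
}

// A boundary test kills the mutant: original and mutant disagree at 18,
// so running this test against the mutated logic fails.
bool boundary_test_passes(bool (*version)(int)) {
    return version(18);
}
```

Mutation-testing tools such as PIT (Java) or mutmut (Python) automate generating mutants like this and reporting the survivors.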
I’d also like to add that if you contribute code to an open source project it is extremely beneficial to have iron-clad unit tests. Since there are so many devs, it would be easy for someone to accidentally break something you fixed already.
It's often easier to just aim for 100% test coverage instead (with excluding some categories of files).
EDIT: I would not and did not start with 100% unit testing. But if there are ongoing culture wars and discussions didn't lead to a workable compromise, 100% test coverage worked for me, and after some days test coverage was a non-issue.
That's where the 'gaming' comes in.
The tests start just going through lines without hitting a single expect statement.
The ignore files start becoming battlegrounds in the PRs because people just exclude half the damn project.
We just have a simple rule... if you wrote code, you have to write coverage for it. If it breaks and your test doesn't catch the breakage, the bug fix goes back to you. Some people will ask "but what about what I'm working on now", you'll have to communicate that you feel your previous work was far more important.
this feels punitive, especially in the eyes of management. unless you're in a safety critical area where fully testing every code path is a hard requirement, people will eventually write bugs.
i'd rather work somewhere that recognizes defects occur and has a fast iterative process to push out new changes rather than one based on shame for having written a bug.
That is the fastest, most iterative process we have found so far... as the expert on the original code, you are able to deliver the best outcome.
You're not being shamed for writing a bug, you're being shamed for not testing your code.
Property based testing and random values in unit tests find lots of bugs you didn't think of though.
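As a sketch of what that looks like in practice (C++, toy code and invented names): instead of asserting on a few hand-picked examples, a property test checks an invariant, here decode(encode(s)) == s, over many random inputs.

```cpp
#include <cstdlib>
#include <string>
#include <utility>
#include <vector>

// Toy code under test: a run-length encoder/decoder.
std::vector<std::pair<char, int>> rle_encode(const std::string& s) {
    std::vector<std::pair<char, int>> runs;
    for (char c : s) {
        if (!runs.empty() && runs.back().first == c) runs.back().second++;
        else runs.push_back({c, 1});
    }
    return runs;
}

std::string rle_decode(const std::vector<std::pair<char, int>>& runs) {
    std::string out;
    for (const auto& r : runs) out.append(r.second, r.first);
    return out;
}

// The property must hold for EVERY input, not just the examples we
// happened to think of. Random inputs probe corners (empty strings,
// long runs) that hand-written cases often miss.
bool roundtrip_holds(int trials, unsigned seed) {
    std::srand(seed);
    for (int t = 0; t < trials; ++t) {
        std::string s(std::rand() % 30, ' ');
        for (char& c : s) c = static_cast<char>('a' + std::rand() % 3);
        if (rle_decode(rle_encode(s)) != s) return false;
    }
    return true;
}
```

Dedicated frameworks (QuickCheck, Hypothesis, rapidcheck) add shrinking of failing inputs on top of this basic loop.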
Rather, you won't find bugs that you choose not to think of because you've let the "100%" number lull you into complacency, even though you know it's 100% of lines/branches, not 100% of inputs.
My problem in 40 years of programming is still that I make bugs, and the ones I make come from not thinking about edge cases or from wrong assumptions, not from being lulled into writing tests to meet a 100% number.
But personalities differ, and if being lulled into a false sense of security by writing toward a 100% number is a problem for you, I would be careful; I totally agree there.
I'd prefer <100% coverage plus discussions about what to test (and how) much more than working with a test suite built on the wrong incentives.
When my colleagues are knowledgable and open minded I would embrace every opportunity to have a good discussion.
But in this case I think the cure might be worse than the disease. Tests for plumbing code often end up being brittle tests of methods getting called on mocks in the right order. People will notice that these require a lot of toil to keep them running as code changes while providing very little benefit in avoiding mistakes. People will rankle at being told that they must write these tests, which they can see are a waste of time.
I've done it both ways. I'm much happier with my work when I'm not trying to write tests that are tedious and don't seem to provide any value, in order to hit an arbitrary coverage metric. I suspect my teammates feel the same way, so on teams where I have input into the decision on this, I do not advocate for 100% coverage. It does make it harder to have the discussion of which tests should and shouldn't be written, but I think it's worth that cost.
Writing good testing code is harder than writing business code. Junior developers especially struggle with this, most often because many companies don't write enough tests for them to learn how to write good ones.
And if you're in an environment where this is a non-issue, I think that's great. Don't fix something that doesn't need to be fixed.
It has a side benefit that it forces devs to write testable code, which inclines them toward reasonably factored code.
Congratulations, now you have a war over which categories of files are excluded from the "100% test coverage" rule. ;)
Perhaps I am wrong, and I would not start with a 'diktat' for obvious reasons.
As a manager, did you not have discussions about the necessary level of code coverage? I'd be interested in how you managed unit testing without a 'diktat', and how it fit into integration testing and exploratory testing. What level did developers in your department usually find "adequate"? If you considered it too low, how did you raise test coverage as a manager without defining a coverage level?
I strive for working code. Sometimes I miss something in the TDD cycle and don’t have 100% and it is that which usually comes back to bite you.
I have never found 100% test coverage has bitten me, dogmatic or otherwise.
throw InternalException("Unsupported type!");
assert(type == Y);
Is worse code worth getting 100% code coverage? In my eyes, absolutely not. I think good code + testing should be able to reach at least 90% typically, likely 95%, but 100% is often not possible without artificially forcing it and messing up your code and/or making it much harder to change your code later on.
You can be defensive to various degrees about assertions:
1. You can just use assert() to fail in Debug and do nothing in Release.
2. You can be more defensive and define your always_assert() to fail in Release as well.
3. You can double down on the UB with hints to the compiler and provide assume(), which explicitly compiles to UB when it's triggered in Release (using __builtin_unreachable() for example).
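A rough sketch of those three levels (the macro names come from the comment above; the exact definitions are an assumption):

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>

// 1. Plain assert() from <cassert>: checks in Debug, compiled out
//    entirely when NDEBUG is defined (typical Release builds).

// 2. always_assert(): a hand-rolled variant that keeps the check in
//    Release builds too, failing loudly instead of invoking UB.
#define always_assert(cond)                                          \
    do {                                                             \
        if (!(cond)) {                                               \
            std::fprintf(stderr, "assertion failed: %s\n", #cond);   \
            std::abort();                                            \
        }                                                            \
    } while (0)

// 3. assume(): checks in Debug, but in Release tells the optimizer the
//    condition is guaranteed -- violating it is then undefined behavior.
#if defined(NDEBUG) && (defined(__GNUC__) || defined(__clang__))
#define assume(cond) do { if (!(cond)) __builtin_unreachable(); } while (0)
#else
#define assume(cond) assert(cond)
#endif

int safe_div(int a, int b) {
    always_assert(b != 0);  // survives Release: fail loudly, never UB
    return a / b;
}
```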
About the organization of the if statement: I agree that the former is better, I would use assert(false) though.
Throwing an exception here is basically free (just another switch case) and gives the user a semi-descriptive error message. When they then report that error message, I can immediately find out what went wrong. Compared with a report about a segfault (with maybe a stacktrace), the error message is significantly easier to debug and reason about.
assert_always would provide a similar report, of course. However, as we are writing a library, crashing is much worse than throwing an internal error. At worst an internal error means our library is no longer usable, whereas a crash means the host program goes down with it.
Better yet, omit that default case, so that in the future when you do add a new value to the enum, the compiler will warn you and force you to add a new case.
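For example (hypothetical enum), leaving out the default case lets GCC/Clang's -Wswitch warning (enabled by -Wall) flag any enumerator the switch doesn't handle:

```cpp
// With no default case, the compiler can check the switch for
// exhaustiveness -- adding Shape::Triangle later produces a -Wswitch
// warning here, instead of silently falling into a default.
enum class Shape { Circle, Square };

int corner_count(Shape s) {
    switch (s) {
        case Shape::Circle: return 0;
        case Shape::Square: return 4;
        // no default: keep the exhaustiveness check alive
    }
    return -1;  // unreachable today; silences -Wreturn-type
}
```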
But I agree with your general thesis that it's just not worth getting to 100% coverage.
Tell that to the SQLite guy.
SELECT code execution from using SQLite - DEF CON 27 Conference
Don't get me wrong, it's a great product and I use it often, but 100% test coverage does NOT equal 100% safe.
Try that with a networked application that takes user input though...
It will always break on the user's internet, because it's too diverse to predict.
Doesn't mean you can't have some networking unit tests, just that you shouldn't believe in them too much.
Edit: you said services. You thinking of server? I'm thinking of clients.
And it doesn't have to be a static mock. It's not too hard to inject a fuzzer in your mock service response, although that's probably left to a separate testing routine, and not part of your unit test setup. But if you have no mock for your network service, you can't fuzz it either.
Which shows 100% unit test coverage is not better than spending that time on other kinds of tests.
SQLite is suitable for 100% coverage.
It's hard to reach 100% coverage on a lot of application or workflow-style code, since many of its paths are rarely triggered.
That's an opportunity to describe, in code, what that path is supposed to do, and then make sure it does it.
But the original signature is just this:
public async Task<SolarTimes> GetSolarTimesAsync(DateTimeOffset date)
* The SolarCalculator needs to be able to work out its own location, so it needs a LocationProvider
* SolarCalculator needs to be IDisposable since it owns a LocationProvider
* The SolarCalculator will need more methods if it ever needs to calculate the times in a different location
* If fetching the location is slow, but the application needs to calculate times for multiple dates (eg to build up a table of times), then the SolarCalculator will need a method that takes in an array of dates to be efficient
But all that could be solved by making the function take all of the arguments it needs to return its value:
public SolarTimes GetSolarTimes(DateTimeOffset date, Location location)
Unit testing this is now just:
var calculator = new SolarCalculator();
var actual = calculator.GetSolarTimes(new DateTimeOffset(...), new Location(...));
var expected = new SolarTimes(...);
Assert.Equal(expected, actual);
If your code is broken down clearly into logic and plumbing, unit testing the logic becomes super easy. It allows you to construct software using blocks you have absolute confidence in. Unit testing plumbing is harder, and that's when integration testing shines.
100% of the time, it was the right idea and the code became a lot better.
The author's tests are overly complex. Instead of gleaning the actual value of this insight, which is that you're not cleanly separating your inputs and your outputs, the author concludes that unit tests are a waste of time.
Nope. Unit tests are a tool, but writing proper unit tests and understanding the value they give you is an art and a science. It requires experience and deliberate design.
It seems that most devs (me included) learn at school to write pure functions, which is great. Then they come to the industry and all of a sudden the "parseXml" function takes an ftp port as a parameter... ("but in my case the xml was on an ftp server!")
Why is there no CS course that explains this kind of stuff?
(And I am sure a bunch of other similar but differently-named concepts)
public SolarTimes GetSolarTimes(Location location, DateTimeOffset date)
TL;DR: GetSolarTimes(Location, Date) is a unit-testable function.
Had some thought been put into writing with unit tests in mind, there would be no problems with that example.
seems like you're all saying that
var actual = calculator.GetSolarTimes(new Date(...), new Location(...));
is better than
public async Task<SolarTimes> GetSolarTimesAsync(DateTimeOffset date)
but i think that ignores the reason why DI containers were invented in the first place and assumes that the solar calculator is just a simple entrypoint-type application, rather than a component in a real application. You might have 20 layers of THING, somewhere inside which this solar time calculator lives and is used... and you still have to get Location from SOMEWHERE to pass it into the calculator.
so what happens when Whatever uses the location provider to get the location and pass it along needs to be tested? and through how many layers of stack do you need to pass Location before you realize that every test of every intermediate layer needs to know about location, but only for the purpose of passing it along?
I think it's a more nuanced case than you're making it seem. Beyond some level of complexity in an application, it becomes simpler to co-locate dependencies where they're actually used.
Refactoring is even worse. Refactoring after you've split something up into multiple parts and tested their interfaces in isolation is far more work. Any refactoring worth a damn changes the boundaries of abstractions. I frequently find myself throwing away all the unit tests after a significant refactoring; only integration tests outside the blast radius of the refactoring survive.
I find the same issue of throwing away tests when I'm writing small-scale integration tests with JUnit. Usually I'm mocking out the DB and a few web service calls, so those tests become more volatile because their surface is exposed more. But smaller-level, function- and class-level tests can have a really good ROI, and they do push you to design for testing, which makes everything a bit better imo.
If you unit test all of the objects (because they're all public) and then refactor the organisation of those objects, all your tests break. Since you've changed the way objects talk to each other, all your mock assumptions go out the window.
If you define a small public api of just a couple of entry points, which you unit test, you can change the organisation below the public api quite easily without breaking tests.
Where to define those public apis is a matter of skill: working out which objects work well together as a cohesive unit.
One of his examples from the article is injecting, IOC-style, the HttpClient instance into his LocationProvider class. He insists that this is a waste of time, and that the automated tests (if you have any at all) should be calling out to the remote service anyway. I can't disagree more! Hopefully you're configuring the automated tests to interact with a test/dev instance of the service and not the production instance (!). But what invariably happens is that the tests fail because the dev instance happened to be down when they ran. And they take a long time to run, so everybody stops running them since they don't tell you anything useful anyway. This is even worse when the remote service is not a web service but a database: now you have to insert some rows before you run the test and then remember to delete them... and hopefully nobody else is running the same test at the same time! To be useful in any way, automated tests must be decoupled from external services, which means mocking, which means some level of IOC.
On the other hand, he also introduces the example of SolarCalculator mocking LocationProvider. I agree that that level of isolation is overkill and will unapologetically write my own SolarCalculator unit test to invoke a "real" LocationProvider with a mocked-out HttpClient, and I'll still call it a unit test. (On the other hand, the refactored design with the ILocationProvider really is better anyway.)
So I think the reason people argue about this is because they can't really agree on what constitutes a unit test. I'd rather step back from what is and isn't a unit test and focus on what I want out of a unit test: I want it to be fast, and I want it to be specific. If it fails, it failed because there's a problem with the code, and it should be very clear exactly what failed where. A bit of indirection to permit this is always worthwhile.
Good, this time you can get it right.
If you change the implementation for a unit, a small piece of code, then the unit test doesn't change; it continues to test that the unit does what it's supposed to do, regardless of the implementation.
If you change what the units are, like in a major refactor, then it makes sense that you would need whole new unit tests. If you have a unit test that makes sure your sort function works and you change the implementation of your sort, your unit test will help. If you change your system so that you no longer need a sort, then that unit test is no longer useful.
I don't see why the fact that a unit test is limited in scope as to what it tests makes it useless.
Of course, you don't know ahead of time exactly which tests will catch bugs. But given finite time, if one category of test has a higher chance of catching bugs per time spent writing it, you should spend more time writing that kind of test.
Getting back to unit tests: if they frequently need to be rewritten as part of refactoring before they ever catch a bug, the expected value of that kind of test becomes a fraction of what it would be otherwise. It tips the scales in favor of a higher-level test that would catch the same bugs without needing rewrites.
That's like saying you shouldn't have installed fire alarms because you didn't wind up having a fire. Also, tests can both 1) help you write the code initially and 2) give a sense of security that the code is not failing in certain ways.
> It tips the scales in favor of a higher-level test that would catch the same bugs without needing rewrites.
Writing higher-level tests that catch the same bugs as smaller, more focused tests is harder, likely super-linearly harder. In my experience, you get far more value for your time by combining unit, functional, system, and integration tests rather than sticking to one type because you think it's best.
To go with the fire alarm analogy and exaggerate a little, it would work like this: you could attempt to install and maintain small disposable fire alarms in the refrigerator as well as every closet, drawer, and pillowcase. I'm not sure if these actually exist, but let's say they do. You then have to keep buying new ones since the internal batteries frequently run out. Or, you could deploy that type mainly in higher-value areas where they're particularly useful (near the stove), and otherwise put more time and money in complete room coverage from a few larger fire alarms that feature longer-lasting batteries. Given that you have an alarm for the bedroom as a whole, you absolutely shouldn't waste effort maintaining fire alarms in each pillowcase, and the reason is precisely that they won't ever be useful.
There are side benefits you mentioned to writing unit tests, of course, like helping you write the API initially. There are other ways to get a similar effect, though, and if those provide less benefit during refactoring but you still have to pay the cost of rewriting the tests, that also lowers their expected value.
To avoid misunderstanding, I also advocate a mixture of different types of tests. My point is that, based on the observation that unit tests depending on change-prone internal APIs tend to need more frequent rewrites, their expected value should be lowered accordingly, which in turn affects how the mixture is allocated.
> unit tests depending on change-prone internal APIs
This in particular is worth highlighting. I now tend not to write unit tests for things that just get data from one place and pass it to another, unless the code is complex enough that I'm worried it might not work or will be hard to maintain. And generally, I try to break out the testable part into a separate function (so it's get data + manipulate (testable) + pass data).
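A minimal sketch of that get/manipulate/pass split (C++, all names invented): keep the middle step pure so it's the only part that needs a unit test.

```cpp
#include <algorithm>
#include <vector>

// The "manipulate" step: a pure function with no I/O, trivially unit
// testable on its own.
std::vector<int> keep_positive_sorted(const std::vector<int>& xs) {
    std::vector<int> out;
    for (int x : xs)
        if (x > 0) out.push_back(x);
    std::sort(out.begin(), out.end());
    return out;
}

// Imagined shell around it -- fetch, manipulate, store. Only the middle
// step carries logic worth a unit test; the plumbing around it is
// better covered by an integration test:
//
//   auto raw = api.fetch_scores();             // get data (plumbing)
//   auto cleaned = keep_positive_sorted(raw);  // manipulate (testable)
//   db.store_scores(cleaned);                  // pass data (plumbing)
```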
I'm not arguing unit tests are useless.
They're not for catching unknown bugs, they're for safer updates.
Striving to make your code testable is almost always worth it. Someone might ask this guy to add some error handling to his code for example. :)
Then he will find out that code, however simple, that merely "works on my machine" (i.e. is proven to work in a single happy-path context) is painful to change. Writing code that runs in multiple contexts (composed as an app, or decomposed for testing) is intrinsically easier to work with and change.
Can you expand more on this? I think this is where the author would disagree.
E.g., how is the code easier to reason about or refactor after having introduced a location service interface that has only one implementation?
I've been on projects that focused almost exclusively on unit tests and on projects that focused almost exclusively on integration tests. The latter were far better at shipping actually working code, because most of the interesting problems occur at the boundaries between components. Testing each piece with layer after layer of mocks won't address those problems. Yay, module A always produces a correct number in pounds under all conditions. Yay, module B always does the right thing given a number in kilograms. Let's put them together and assume they work! Real life examples are seldom this obvious, but they're not far off. Also note that the prevalence of these integration bugs increases as the code becomes more properly modular and especially as it becomes distributed.
I firmly believe that integration tests with fault injection are better than unit tests with mocks for validating the current code. That doesn't mean one shouldn't write unit tests, but one should limit the time/effort spent refactoring or creating mocks for the sole purpose of supporting them. Otherwise, the time saved by fixing real problems more efficiently - a real benefit, I wouldn't deny - is outweighed by the time lost chasing phantoms.
Unit tests protect you against current mistakes. They're tied to the exact implementation.
"Right now my function X should call Y on its dependency Z before it calls A on its dependency B.
I know that my method should do this, because this is how I designed it now.
Let me write a test and expect exactly that."
Integration and system tests will tell you whether in the future your code will still work when you refactor.
"Okay, we rewrote the whole class containing the function. Does running my thing still end up writing ABC into that output file?"
Otherwise I agree with you mostly.
If unit tests are tied to an exact implementation, they'll fail on correct behavior, and that's definitely wrong. It shouldn't matter whether X calls Z:Y or B:A first, whether it calls them at all, whether it calls them multiple times, or whether it calls them differently. All that matters is that it gets the correct answer and/or has the same final effect.
Unit tests should be based on a module's contract, not its implementation. This is in fact exactly what's wrong with most unit tests, that they over-specify what code (and all of its transitive dependencies) must do to pass, while by their nature leaving real problems at module interfaces out of scope.
b) Even if you have an output, it's dependent on more complex input of arbitrary types.
Assume that there's a method that returns a value based on summing the outputs of method calls on its abstract dependencies.
To do dogmatically correct unit testing you'd pass those 2 mocked dependencies, and have the mocks return fixed values when the right method is called on them.
Then you'd assert that B was called on A, that D was called on C, and that the method under test returns the sum of those returns.
As soon as you move into passing implementations of those 2 dependencies, to anyone dogmatic you're doing integration testing.
Even if the tester isn't being dogmatic, in a lot of cases these inputs are complex enough that building enough actual inputs that are consistent and realistic to cover all the cases is prohibitively costly, so they opt for mocks.
Now, suddenly you just have more code to maintain when making changes, but you feel good about yourself.
O -> int
O -> int // of specific value based on dependencies
A -> C -> int // of specific value
M(A) -> M(C) -> int // of specific value
M(A) -> int
M(B) -> int
M(A) -> 3
M(B) -> 5
3 -> 5 -> int // of specific number
3 -> 5 -> 8
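Spelled out as hand-rolled C++ mocks (all names invented), the dogmatic version of that test looks something like this:

```cpp
// The abstract dependencies (the A and B of the shorthand above).
struct Dep {
    virtual ~Dep() = default;
    virtual int get() = 0;
};

// O: the method under test sums the outputs of its two dependencies.
int sum_of(Dep& a, Dep& b) { return a.get() + b.get(); }

// Hand-rolled mocks: M(A) -> 3, M(B) -> 5, with call counting so the
// dogmatic test can also assert that each method was actually invoked.
struct MockDep : Dep {
    int value;
    int calls = 0;
    explicit MockDep(int v) : value(v) {}
    int get() override { ++calls; return value; }
};

// The dogmatic unit test: 3 -> 5 -> 8, plus interaction assertions.
bool dogmatic_test_passes() {
    MockDep a(3), b(5);
    bool sum_ok = sum_of(a, b) == 8;
    bool calls_ok = a.calls == 1 && b.calls == 1;
    return sum_ok && calls_ok;
}
```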
The designer of the above monstrosity could learn a lot from the phrase "imperative shell, functional core". It sounds like dogma until you are knee deep in trying to test the middle of a large object graph!
It's integration testing that validates that all your units still combine (integrate) into a working end product. That's not about testing your implementation nor your internal interfaces, that's about testing your program's inputs and outputs.
All tests protect the programmer against future mistakes. All tests are a protection against regressions.
But yes agreed, integration tests absolutely carry much more value than any unit tests might. Specifically because units tests tend to target things that are essentially implementation details.
The only time I'd say unit tests carry any value is if they're testing some especially important piece of business logic e.g. some critical computation. Otherwise, integration tests rank the highest in the teams I lead.
I don't have anything against objects, per se, but I think they tend to make unit testing much more difficult to accomplish. The closer your code resembles pure functions, the easier it is to do dependency injection and unit testing.
Plus the same problem arises with modules instead of objects, which traditionally are even harder to customize.
You can get pretty far with good abstractions and dependency injection. Go's io.Reader and io.Writer interfaces are a great example of this. The resulting functions aren't pure in a technical sense, but they're pretty easy to unit test nonetheless.
> Plus the same problem arises with modules instead of objects, which traditionally are even harder to customize.
Maybe you could elaborate. I really don't understand what you mean here.
From what I understand, modules just scope names, they don't maintain state. I don't see how they have the same problems as objects.
Which goes back to the article's point of having to write code that is unit test friendly.
Now architecture decisions have to integrate interfaces that wouldn't be needed otherwise.
> Maybe you could elaborate. I really don't understand what you mean here.
Modules keep state via global variables and module-private functions, plus whatever surface control they might expose via the module's public API.
Additionally on languages that support them, they can be made available as binary only libraries.
> Now architecture decisions have to integrate interfaces that wouldn't be needed otherwise.
You're not wrong.
But in the context of functions, that doesn't seem to me to be particularly onerous. If the worst I'm forced to do is change the type of my parameters to an interface instead of a concrete type, that seems like a pretty small price to pay for easy testability. Certainly a much smaller price than the examples in the article.
Imagine doing unit tests for a C application, where modules == translation unit/static/dynamic library, thus you can only do black box testing.
Now one needs to clutter it with function pointers everywhere, or start faking interfaces with structs, just for the benefit of unit tests.
And with static/dynamic libraries than one might need to start injecting symbols into the linker to redirect calls into mocking functions.
All just to keep QA dashboards green.
The fact that it makes unit testing easier is just icing on the cake.
Mainly due to the linking hacks and low level debugging sessions required to mock all necessary calls.
Plus that was just an example, there are plenty of languages with modules and binary libraries.
The libraries dependencies should all be indirected through whatever context struct you pass to all your calls.
Sadly not all code is great.
You probably want it rigged up to your own logger instead of just blindly writing to stdout. You probably want the library's allocations tagged somehow on the heap so you can track down memory leaks. You probably don't want it doing IO directly, because of how many different ways there are to do IO.
It's all more a function of how incredibly varied C environments are than design for testability. It just happens to be very testable as an aside.
Keep the impure code and the pure code separated.
It's where you need to handle mutable state with objects that things get trickier.
Unfortunately, these are exactly the places where you most need tests.
I'm a big fan of constantly returning things rather than holding state in objects, for specifically this reason.
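As a small illustration (invented names), compare holding a running total in an object with just returning the new value:

```cpp
// Stateful version: the answer depends on hidden accumulated state, so
// each test must first replay the object's history.
class RunningTotal {
    int total_ = 0;
public:
    void add(int x) { total_ += x; }
    int total() const { return total_; }
};

// "Constantly returning things" version: the new total is returned
// instead of held, so every call is a pure input -> output mapping
// that a one-line assertion can test, in any order, with no setup.
int add_to_total(int total, int x) { return total + x; }
```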
If the only thing you inject is data, can we still call that "dependency injection"?
I suppose that's a philosophical question.
Probably the 2 most common functions I 'dependency inject' are rand() and time.now(). I feel like they count, but you might not.
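A minimal sketch of that kind of injection (hypothetical names): pass the clock in as an argument, so production supplies the real one and a test can pin "now" to a fixed instant.

```cpp
#include <cstdint>
#include <functional>

// The function takes "now" as a dependency instead of calling the
// system clock directly, making its behaviour fully deterministic
// under test.
bool is_expired(std::int64_t expires_at_unix,
                const std::function<std::int64_t()>& now) {
    return now() >= expires_at_unix;
}

// Production call site (sketch):
//   is_expired(deadline, [] { return std::int64_t{std::time(nullptr)}; });
```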
I also tend to avoid HOF when I can instead pass data around explicitly.
Foo foo = new Foo(mock(Bar.class))
In my experience mock objects can be brittle. A few sprinkled in judiciously can be ok, but once the density gets high enough, it starts to feel like the test becomes decoupled from the actual code it's supposed to test.
Everyone started writing unit tests, and the code broke less. Developers became more confident in deploying, and eventually most PRs looked roughly the same: 10-20 line diff on the top, unit tests on the bottom. If there were no tests, the reviewer asked for tests. It became a fun and safe project to work on, rather than something we all feared might break at any moment.
I've since started insisting on having them as well, especially when I'm using dynamically typed languages. A lot of the tests I write in Python for example are already covered in a language like Go just by having the type system.
So we started adding unit tests. Utesting code that wasn't written for utests is painful: you often need to choose between refactoring or just patching the hell out of it. The latter is highly undesirable, since it leads to verbose tests, failures when you move a module, and the inability to do blackbox testing.
But utests encourage our new code to be clean and readable. We've found that functional programming is much easier to test than object-oriented, and is easier for engineers to grok. We just sprinkle a little dependency injection and the whole thing works nicely.
Itests have their place, but utests lead to faster feedback and more readable code.
Unit tests are an easy path to fall down, because they're clearly easier to set up and write, require less effort to maintain, and execute more quickly.
But you don't realise their significant downside until after you attempt a major refactor - you begin to see that unit tests are testing at the layer that changes the most anyway.
What's a better term than "dependency injection"? What should I call an argument whose default is always used in production code, but is there to make passing a mock easy? I'm not trying to be snide -- I'm genuinely curious.
1) The use of unit tests as the exclusive automated test type. ie; No functional tests, integration, etc.
2) Test doubles for most or every dependency, even purely functional dependencies like math libraries.
3) Not using the appropriate kind of test double for the test at hand (dummies vs. fakes vs. spies vs. stubs vs. mocks).
4) The overuse of mocking libraries.
Mocking libraries have their place, but in my opinion are used approximately a hundred, perhaps even a thousand, times more often than they should be. I use them to create test doubles in exactly three scenarios:
1) A dependency that does not have an interface, usually a third party library. This usually happens in one place only, and is used for writing the wrapper code test.
2) A dependency that has an incredibly large interface and/or dependency graph where building a set of stubs or spies is simply not worth the effort.
3) I want to test weird edge cases that aren't reachable any other way, such as theoretically unreachable code.
These should not be the majority of your unit tests!
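To illustrate the fake-vs-mock distinction the list above draws (the repository interface here is hypothetical): a hand-rolled fake is a working in-memory implementation whose observable behaviour the test checks, while a library mock (e.g. from `unittest.mock`) can only verify which calls were made.

```python
from unittest.mock import Mock

class FakeUserRepo:
    """Fake: a real, in-memory implementation of the repo interface."""
    def __init__(self):
        self._users = {}
    def save(self, user_id, name):
        self._users[user_id] = name
    def get(self, user_id):
        return self._users.get(user_id)

def rename_user(repo, user_id, new_name):
    """Code under test: depends only on the repo's get/save interface."""
    if repo.get(user_id) is None:
        raise KeyError(user_id)
    repo.save(user_id, new_name)

# With a fake, the test asserts on observable behaviour:
repo = FakeUserRepo()
repo.save(1, "Ada")
rename_user(repo, 1, "Grace")
assert repo.get(1) == "Grace"

# With a mock, the test can only assert on the interaction itself:
mock_repo = Mock()
mock_repo.get.return_value = "Ada"
rename_user(mock_repo, 1, "Grace")
mock_repo.save.assert_called_once_with(1, "Grace")
```

The fake-based test survives refactors that change how `rename_user` talks to the repo; the mock-based test is coupled to the exact call sequence, which is one reason over-mocked suites break so often.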
It feels like the industry has blindly pushed for unit testing everything and 80% or more code coverage as the gold standard.
I’ve given up arguing about the cost/benefit of unit tests at work. I feel that the teams I’ve worked on over the past couple of decades still produce about as many bugs as before unit testing came along. I’m not building pacemakers or aviation software, mostly LOB applications.
Unit tests provide a false sense of security (especially to management.) Yes sometimes they help catch refactoring bugs, but at what cost?
For a startup with a small team and few customers building an MVP? Unit testing is overrated.
For a company with 50 engineers in 10 teams building a product that moves $500,000/day in revenue? Unit testing may or may not be overrated.
For a company with 1,000 engineers working in the same repo, shipping a product that moves $50M in revenue per day? Unit testing is most likely underrated - and essential.
You cannot ignore how the organization works, and the cost of a defect that a unit test could have caught. I happen to work at the third type of organization, and while unit tests might not be the most efficient type of safety net, it is a very big one. We have other types of testing layers on top of unit: integration and E2E tests as well.
Also, one more fallacy in the article:
"If we look back, it’s clear that high-level testing was tough in 2000, it probably still was in 2009, but it’s 2020 outside and we are, in fact, living in the future. Advancements in technology and software design have made it a much less significant issue than it once was."
This is not true everywhere. High-level / E2E testing on native mobile applications in 2020 is just as bad as it was on the web in 2009.
You are right, but it still doesn't mean you should aim for high coverage. In the big-company case you'll want to cover the interfaces and dependencies, and less of your own team's code.
I know that part of this will fall under "integration" but definitions are sneaky.