Computational code handles your business logic. This is usually in the minority in a typical codebase. What it does is quite well defined and usually benefits a lot from unit tests ("is this doing what we intended"). Happily, it changes less often than plumbing code, so unit tests tend to stay valuable and need little modification.
Plumbing code is everything else, and mainly involves moving information from place to place. This includes database access, moving data between components, conveying information from the end user, and so on. Unit tests here are next to useless because (a) you'd have to mock everything out, (b) this type of code seems to change frequently, and (c) its behaviour is less clearly defined.
What you really want to test with plumbing code is "does it work", which is handled by integration and system tests.
A ton of dense, mathy code like hash computation, de/serialization, sin/cos computation, etc. is usually best implemented in a memory efficient C-style way but lends itself to be used in a very functional way; inputs and outputs without any retained state or side effects.
I think that subtlety is hard to articulate and gets lost.
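To make the point concrete, a hash like FNV-1a is implemented as a tight C-style loop internally, but is purely functional from the outside: bytes in, value out, no retained state. A minimal Python sketch (the constants are the standard 64-bit FNV parameters):

```python
def fnv1a_64(data: bytes) -> int:
    """FNV-1a 64-bit hash: a pure function with no retained state or side effects."""
    h = 0xCBF29CE484222325                                 # standard FNV-1a offset basis
    for byte in data:
        h ^= byte
        h = (h * 0x100000001B3) & 0xFFFFFFFFFFFFFFFF       # FNV prime, wrapped to 64 bits
    return h

# Same inputs, same outputs, every time -- so unit tests stay cheap and stable:
assert fnv1a_64(b"hello") == fnv1a_64(b"hello")
assert fnv1a_64(b"hello") != fnv1a_64(b"hellp")
```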
The idea that your business logic should be isolated from external dependencies (in his case, by making the code (pure) functional). That makes it easy to unit test the business logic, and your integration tests should be minimal (basically testing a single path to make sure everything is talking to each other).
It has advantages but it's an expensive waste of resources if you have cheap, effective integration tests.
Gary was coming from the land of Ruby-on-rails where a full set of integration tests could take hours. In that environment, structuring your code to enable easy testing of complex logic makes a lot of sense.
Likewise in a large enterprise environment, where integration testing across a (usually messy) set of interconnected dependencies is a pipe dream.
It's true that over-architecting is something to be wary of, but as usual, there's no one-size-fits-all answer.
It doesn't matter if the whole test suite takes hours. CI servers don't need to be supervised.
It's a really expensive way of discovering that you wrote shit code.
Operate only on your inputs. Return all of your outputs. No side effects.
You can, after all, write "functional C". (It can be hard, though.)
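A sketch of the same discipline in Python terms: operate only on your inputs, return all of your outputs, mutate nothing (the Account type here is made up for illustration):

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Account:
    owner: str
    balance: int

def apply_deposit(account: Account, amount: int) -> Account:
    # Operates only on its inputs and returns all of its output;
    # the original value is never mutated.
    return replace(account, balance=account.balance + amount)

a = Account("alice", 100)
b = apply_deposit(a, 50)
assert a.balance == 100   # input untouched
assert b.balance == 150   # all output returned
```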
Abstract code solves made-up problems, while concrete code solves real ones. Normally the best way to solve a real problem is by rewriting it as a series of made-up problems, and solving those made-up ones instead.
The made-up problems don't need to be purely computational; if you restrict them to pure ones, you'll lose a lot of powerful ones. They also don't need to fit functional programming well, but there is no loss of generality in imposing that restriction.
Also, the more abstract you make that code, the less it will need changing and the better unit tests will fit. At the extreme, once debugged it will never change. Instead, if your needs change too much, your concrete programs will simply stop using it and use something completely different.
For example, let's say you want to get some users from your DB in response to an HTTP call. We rewrite this problem in terms of crafting an SQL query, taking some data from the HTTP request to create that query. We can of course easily test that the code creates the query we designed, that the query contains the right information from the HTTP request, etc. But if we don't actually run the query on the actual DB with the actual users, we don't really know whether our query does the right thing, even if we know our code creates the query we intended. And if the DB changes tomorrow, our very abstract code that parametrizes a particular SQL query will still need to change, so our existing unit tests will be thrown away as well.
This is the kind of plumbing code the OP was talking about, and I don't think you can reduce the problem in any way to fix this (especially if the DB is an external entity).
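A minimal sketch of the split being described, with a hypothetical users table; the pure query-construction half is easy to unit test, but only an integration test against the real DB tells you the query is actually right:

```python
def build_user_query(min_age: int, limit: int) -> tuple:
    """Pure query construction: trivially unit-testable, but a passing test
    only proves we built the query we intended -- not that the query is
    right for the actual database. (Schema and names are made up.)"""
    if limit <= 0:
        raise ValueError("limit must be positive")
    sql = "SELECT id, name FROM users WHERE age >= %s ORDER BY id LIMIT %s"
    return sql, (min_age, limit)

# The unit test can check parameterization...
sql, params = build_user_query(min_age=18, limit=10)
assert params == (18, 10)
assert "WHERE age >= %s" in sql
# ...but only running it against a real DB with real users can tell us
# whether it returns the users we meant.
```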
I agree with this and would go even further. Divide your code into "stateless" functional code and "stateful" objects code.
Original OO was encapsulating things like device drivers that did I/O--it didn't represent data.
If you don't interleave your stateless business logic with your stateful persistence, it's easy to mock "objects" that do the plumbing, and all the meat of the program is unit tests.
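A rough sketch of that separation, assuming a hypothetical repository and discount rule; the plumbing "object" is trivially fakeable because the business logic never touches it directly:

```python
class InMemoryUserRepo:
    """Stand-in for the stateful persistence layer; easy to fake because
    the stateless logic below is never interleaved with it."""
    def __init__(self):
        self._rows = {}
    def save(self, key, record):
        self._rows[key] = record
    def load(self, key):
        return self._rows.get(key)

def apply_discount(record: dict, percent: int) -> dict:
    # Stateless business logic: pure data in, new data out.
    return {**record, "price": record["price"] * (100 - percent) // 100}

repo = InMemoryUserRepo()
repo.save(1, {"price": 200})
discounted = apply_discount(repo.load(1), percent=10)
assert discounted["price"] == 180
```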
Fwiw, the DI model (Guice, Spring, etc.) in modern Java/Scala shops closely hews to this, even if people don't mentally categorize it as such.
IINM you are basically referring to the difference between static and instance methods in languages like C++ and Java.
Putting code that neither reads nor writes the object state in instance methods is a common mistake made in both those languages.
That said, both stateful and stateless code are good candidates for unit testing, especially when the code under test is a state machine rather than just a data encapsulation mechanism.
Your _program_ should have the flow of a function. At the architectural level, who-the-ef cares about static vs instance methods in Java (I say this as a person with 23 years of Java experience). It has nothing to do with languages. You can do this in any language you want.
You want to have your inputs go through a process where you have (1) INPUT state transfer, (2) some computation F(INPUT), (3) some output and state transfer, or RESULT = F(INPUT).
If you do not have (1) or (3)--I hate to break it to you--but all your program does is burn CPU. If you don't have (2), your program does nothing at all.
The key thing with scalable systems is they manage complexity well. If you're at the level where you're worried about "static or instance methods", you're not dealing with how data changes in large systems at all. Those words are at the level of state within a language.
You need to optimize at the global systems level.
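The (1)-(2)-(3) shape described above can be sketched as (names are illustrative; the I/O streams are passed in so the shape itself is testable):

```python
import io
import json

def f(inp: dict) -> dict:
    # (2) the computation: without this, the program does nothing at all
    return {"total": sum(inp.get("values", []))}

def main(stdin, stdout) -> None:
    inp = json.load(stdin)            # (1) INPUT state transfer
    result = f(inp)                   # RESULT = F(INPUT)
    json.dump(result, stdout)         # (3) output and state transfer

out = io.StringIO()
main(io.StringIO('{"values": [1, 2, 3]}'), out)
assert out.getvalue() == '{"total": 6}'
```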
Who-the-ef should care is anyone who has to implement or maintain the code. After all, the debate at hand is what is worth unit testing, which very much concerns the programming language and the actual implementation. Don't know about you, but I both architect the system and write the code.
> If you do not have (1) or (3)--I hate to break it to you--but all your program does is burn CPU.
I haven't written production code that doesn't have (1) or (3) in my 25 years of programming, so not sure who you are talking to here.
> If you're at the level where you're worried about "static or instance methods", you're not dealing with how data changes in large systems at all. Those words are at the level of state within a language.
You have to tend to this stuff at both the generic data processing and language level. Using a given language's constructs for differentiating between stateful and stateless code is an important part of making the code document itself.
Coding style matters.
It. does. not.
If it did, PHP wouldn't be running half the world. Structure and systems matter.
OK 'brah', whatever works for you!
> If it did, PHP wouldn't be running half the world
PHP has a style guide, and there is such a thing as clean, readable PHP code.
I bet massive scale PHP based apps (like you know, Facebook) probably enforce style in their codebase.
If you are writing a one-shot script to transmute data from one format to another for, say, an upgrade, I don't care whether you have unit tests, as long as I'm confident it has been manually tested to satisfaction. No repeatability, no regression requirement. There could be (and likely is) value in TDD, so tests might still be a thing if that is how you work. No objection there.
If you are developing the plumbing code that will ensure my system adheres to financial regulations and, if it were to break, land me in jail for negligence, you can be damn sure I'm demanding a test that will be run every time that system is built/deployed.
I wrote unit tests >10 years ago for formatting a string for postal codes that I know are still run to this day on every commit because if they get it wrong there is legal recourse for the company that owns that system.
It's also super quick to fix and failing at build is quicker and cheaper than failing in prod, even without the recourse. That test took me all of 1 minute to write. Bargain.
If it's critical for your business I'd categorize that as business logic, not plumbing code, well deserving of unit test coverage.
Unit tests and automated tests are two completely different concepts.
Unit test algorithmic code; use integration tests for everything else, i.e. plumbing code.
I agree very strongly with this but a lot of people will be very unhappy with this idea.
That allows strict separation of all I/O from testable business logic.
If you can separate pure logic from your I/O, then instead of a Russian-doll program you get a flat pipeline that looks like:
a <- readFromApi
b <- doBusinessLogic(a)
c <- writeToPersistence(b)
If you do things this way, you can always isolate your business logic from your dependencies.
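Here's one way that pipeline might look in Python, with the two I/O edges injected so the pure core needs no mocks at all (names are illustrative):

```python
def do_business_logic(payload: dict) -> dict:
    # Pure: unit test this exhaustively, no mocks needed.
    return {"name": payload["name"].strip().title()}

def handle_request(read_from_api, write_to_persistence) -> None:
    a = read_from_api()              # I/O at the edge
    b = do_business_logic(a)         # pure core
    write_to_persistence(b)          # I/O at the edge

# In tests the two edges are trivial stubs:
stored = []
handle_request(lambda: {"name": "  ada lovelace "}, stored.append)
assert stored == [{"name": "Ada Lovelace"}]
```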
By all means, if the transformation is non-trivial and it is captured entirely in the logic of this method, not in the shape of the API and the DB, then you should unit test it (e.g. say you are enforcing some business rules, or computing some fields based on other fields). But if you're just passing data around, this type of testing is a waste of time (you have no reason to change the code if the API or DB don't change, so the tests will never fail) and brittle (changes in the API or in the DB will require changing both the code and the tests, so the tests failing doesn't help you find any errors you didn't know about).
So I would argue you don't actually have business logic then. Your service is anemic, and you have a data transformation you need to do. I definitely think that you should do an integration test for that.
Moving JSON -> Postgres or whatever is something that you absolutely still can test with the output of the DML statement by your DB library. It may be a silly test, but that's because if there's no business logic, it's a silly program _shrug_.
a <- readFromApi ( Input x )
b <- doBusinessLogic(a) ( f(x) )
c <- writeToPersistence(b) ( Output y = f(x) )
You can also imagine that there is more than one lookup from the DB, or service calls as I/O in different parts of the pipeline (g(f(x)) etc.), but it's always possible to have state pulled in explicitly and pushed down explicitly into business logic as an argument. It tends to make programs have flatter call stacks as well.
>> Write tests. Not too many. Mostly integration.
If errors in your system result in death, and if changes must go through an expensive and time consuming process to be approved, and then an expensive and time consuming process to be applied, you should spend a lot of time ensuring your design is sound, and your implementation matches your design. A good place for formal methods.
If you're writing server side code, and deploy takes 5 minutes, you can be a cowboy for most things that won't leave a persistent mess or convince customers to leave.
If you're writing client side code that needs to go through a pre-publication review, neither cowboy nor formal methods is a good choice.
A lot of code can be in this area where it is absolutely unit testable, but the unit tests are almost entirely useless, as the code only ever changes because the input or output types change, so the tests also need to change.
I think of this in terms of code that is 'authoritative' for its logic or not.
For example, a sorting method is authoritative - it is the ultimate definition of what sorting means. Also, a piece of code that validates some business rule defined in a document is the authority for that business rule.
But a piece of code that takes input from the user and passes it to some other piece of code is not authoritative for this transformation. The functionality of this kind of code is not defined by some spec, but by 'whatever the other piece of code wants to receive', which may be arbitrarily hard to define.
Depending on the complexity of the transformation, there may still be reasons to test parts of this code, at least to ensure that a new field here doesn't affect the way we transform that other field there, but often only small pieces of it are actually worth testing.
I unit test business logic, since that is the core of the application and MUST work as expected.
I'm not going to unit test that a link someone clicks on goes to the page they expect.
What I've been doing is writing as many parts of the game as libraries as is possible, and then implementing the minimal possible usage of that library as a semi-automated test. For instance, our collision system is implemented as a library, and you can load up a "game" that has the simplest possible renderer, no sound, basic inputs, etc. and has a small world you can run around in that's filled with edge cases. This was vastly easier than trying to write automated tests for 3d collision code, and you get the benefit of testing the system in isolation, if not automatically. For other libraries like networking, the tests are much more automated, but they poke the library as a unit, rather than testing all the little bits and pieces individually.
I test the parts that are actually mine as best I can, but most of my debugging consists of driving it by hand.
More importantly, the fact that your app works with the mocks doesn't give you good information about whether your app works with the actual services.
I think of the "computational" type more as a "deterministic data transformation" type. That applies to transformations of any data whether text, images, or the state of a machine.
I think of plumbing as the movement of data without any transformation, or, if a transformation occurs, it occurs at an abstracted layer that must itself be unit tested independently.
Using the old Asteroids arcade game as an example: The business logic is how many lives the player has, what happens when you shoot asteroids (they break up, or disintegrate if they're small), what happens when you reach the edge of the map (you wrap around to the other side), what kind of control scheme there is (there's momentum in Asteroids, you don't stop on a dime), etc.
Speaking as a formerly young and arrogant programmer (now I'm simply an arrogant programmer), there's a certain progression I went through upon joining the workforce that I think is common among young, arrogant programmers:
1. Tests waste time. I know how to write code that works. Why would I compromise the design of my program for tests? Here, let me explain to you all the reasons why testing is stupid.
2. Get burned by not having tests. I've built a really complex system that breaks every time I try to update it. I can't bring on help because anyone who doesn't know this code intimately is 10x more likely to break it. I limp to the end of this project and practically burn out.
3. Go overboard on testing. It's the best thing since sliced bread. I'm never going to get burned again. My code works all the time now. TDD has changed my life. Here, let me explain to you all the reasons why you need to test religiously.
4. Programming is pedantic and no fun anymore. Simple toy projects and prototypes take forever now because I spend half of my time writing tests. Maybe I'll go into management?
5. You know what? There are some times when testing is good and some times where testing is more effort than it's worth. There's no hard-set rule for all projects and situations. I'll test where and when it makes the most sense and set expectations appropriately so I don't get burned like I did in the past.
- Is the language you're using dynamic? Large refactors in Ruby are much harder than in Java, since the compiler can't catch dumb mistakes
- What is the likelihood that you're going to get bad/invalid inputs to your functions? Does the data come from an internal source? The outside world?
- What is the core business logic that your customers find the most value in / constantly execute? Error tolerances across a large project are not uniform, and you should focus the highest quality testing on the most critical parts of your application
- Test coverage != good testing. I can write 100% test coverage that doesn't really test anything other than physically executing the lines of code. Focus on testing for errors that may occur in the real world, edge cases, things that might break when another system is refactored, etc.
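A small illustration of that last point: both "test suites" below give 100% line coverage of clamp, but only the second actually pins down the behavior:

```python
def clamp(value: int, low: int, high: int) -> int:
    if value < low:
        return low
    if value > high:
        return high
    return value

# This "suite" executes every line, yet asserts nothing:
clamp(5, 0, 10); clamp(-1, 0, 10); clamp(99, 0, 10)

# These actually test the logic, including the boundaries:
assert clamp(-1, 0, 10) == 0
assert clamp(99, 0, 10) == 10
assert clamp(0, 0, 10) == 0 and clamp(10, 0, 10) == 10
```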
For lexer and parser tests, I tend to focus on the EBNF grammar. Do I have lexer test coverage for each symbol in a given EBNF, accepting duplicate token coverage across different EBNF symbol tests? Do I have parser tests for each valid path through the symbol? For error handling/recovery, do I have a test for a token in a symbol being missing (one per missing symbol)?
For equation/algorithm testing, do I have a test case for each value domain? For numbers: zero, negative number, positive number, min, max, and values that yield the min/max representable output (plus one above/below to overflow).
I tend to organize tests in a hierarchy, so the tests higher up only focus on the relevant details, while the ones lower down focus on the variations they can have. For example, for a lexer I will test the different cases for a given token (e.g. '1e8' and '1E8' for a double token), then for the parser I only need to test a single double token format/variant as I know that the lexer handles the different variants correctly. Then, I can do a similar thing in the processing stages, ignoring the error handling/recovery cases that yield the same parse tree as the valid cases.
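A toy version of that hierarchy, using a made-up two-token grammar: the lexer tests cover every variant of a token, so the layer above only needs one representative variant:

```python
import re

# Hypothetical token rule from a small EBNF: number ::= digits [("e" | "E") digits]
TOKEN_RE = re.compile(r"(?P<NUMBER>\d+(?:[eE]\d+)?)|(?P<PLUS>\+)")

def lex(src: str) -> list:
    """Return (token_name, lexeme) pairs for the toy grammar."""
    return [(m.lastgroup, m.group()) for m in TOKEN_RE.finditer(src)]

# Lexer tests cover every variant of the NUMBER token...
assert lex("1e8") == [("NUMBER", "1e8")]
assert lex("1E8") == [("NUMBER", "1E8")]

# ...so the higher layer can use a single representative variant:
assert lex("1e8+2") == [("NUMBER", "1e8"), ("PLUS", "+"), ("NUMBER", "2")]
```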
A bug can be critical (literally life-threatening) or unnoticeable. And this includes the response to the bug and what it takes. When I write code for myself I tend to put a lot of checks and crash states rather than tests because if I'm running it and something unexpected happens, I can easily fix it up and run it again. That doesn't work as well for automated systems.
High test coverage comes from a history of writing tests there. Sadly, people include feature and functional tests in the coverage.
The missing bit in the discussion is 1) churn, and 2) a dev's ability to write fairly clean code.
Early stage and 'toy' projects may change a lot, in fundamental ways. There may be total rewrites as you decide to change out technologies.
During this phase, it's pointless to try to 'harden' anything because you're not sure what it's entirely supposed to do, other than at a high level.
Trying Amazon DynamoDB, only to find a couple of weeks in that it's not what you need, means it probably wouldn't make sense to run it through the gamut of tests.
Only once you've really settled on an approach, and you start to see the bits of code that look like they're not going to get tossed, does it make sense to start running tests.
Of course the caveat is that you'll need to have enough coding experience to move through the material quickly, in that no single bit of code is a challenge; it's just that 'getting it on the screen' takes some labour. The experience of 'having done it already many times' means you know it's 'roughly going to work'.
I usually try to 'get something working' before I think too hard about testing, otherwise you 3x the amount of work you have to do, most of which may be thrown out or refactored.
Maybe another way of saying it, is if a dev can code to '80% accuracy' - well, that's all you need at the start. You just want the 'main pieces to work together'. Once it starts to take shape, you've got to get much higher than that, testing is the way to do that.
When you’re starting out a project and “discovering” the structure of it, it makes very little sense to lock things in place, especially when manual testing is inexpensive.
Once you have more confidence in your structure as it grows you can start hardening it, reducing the amount of manual testing you do along the way.
People that have hard and fast rules around testing don’t appreciate the lifecycle of a project. Different times call for different approaches, and there are always trade offs. This is the art of software.
If you do make a slight tweak somewhere, the compiler will tell you there’s something broken in obscure place X that you would find out at runtime say with Ruby or Python.
THAT'S the winning formula. I've written so many tests for Python that ensure a function's arguments are validated, rather than its core logic/process.
Not so fast. For some problems it's great, for other ones it's not.
Have you tried writing a numeric or machine learning core in Haskell? You'll notice that the type system just doesn't help you enforce correctness. Have you tried writing low level IO? The logic is too complex to capture in types; if you try to use them you'll have a huge problem.
Rust's got a very Haskell-like type system, but it's a systems programming language. People are literally writing kernels in it. I think this is a pure-functional-is-a-bad-way-to-do-real-time-I/O thing, not a typing thing.
That said, I don't think it's impossible to type IO. https://lexi-lambda.github.io/blog/2020/01/19/no-dynamic-typ... isn't the same problem, but it's related.
If you try to verify the kind of state machines that low level I/O normally use with Haskell-like types, you will gain a huge amount of complexity and probably end with more bugs than without.
Let's say you're writing a /dev/console driver for an RS-232 connection. Trying to represent "ring indicator", "parity failure", "invalid UTF-8 sequence", "keyboard interrupt", "hup" and "buffer full" at the same level in the type system will fail abysmally, but that's not a sensible way of doing it.
I could definitely implement this while leveraging the power of Rust's type system – Haskell would be a stretch, but only because it's side-effect free and I/O is pretty much all side-effects.
If you're doing React + TypeScript, give ReasonML a go; it's syntactic sugar on top of OCaml that compiles via BuckleScript. OCaml has one of the fastest compilers out there.
Meanwhile the plugins and IDE integrations for Reason/Ocaml and F# are ready to go from the start and work pretty well.
So then I came along and said, "hey, why don't we have any unit testing?" and it turns out because it was pretty impossible to write unit tests with our code. So I refactored some code and gave a presentation on writing testable code - how the point of unit testing isn't just to have lots of unit tests, how it's more that it encourages writing testable code, and that the point of having testable code means that your codebase is then easier to change quickly.
I even showed a simple demonstration based off of four boolean parameters and some simple business logic, showing that if it were one function, you'd have to write 16 tests to test it exhaustively, but if you refactored and used mocking, you'd only have to write 12. That surprised people. Through that we reinforced some simple guidelines of how we'd like to separate our code, focusing on pure functions when possible, making layers mockable. We don't even have a need for a complicated dependency injection framework as long as we reduce the # of dependencies per layer.
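One plausible way the arithmetic works out (the split and names here are illustrative, not the actual demo): a single function of four booleans needs 2^4 = 16 exhaustive cases, while two functions of two booleans each need 2^2 + 2^2 = 8, plus 4 cases of the composition with both checks mocked, for 12 total:

```python
from itertools import product

def check_a(p: bool, q: bool) -> bool:
    # Hypothetical rule over the first two flags.
    return p and not q

def check_b(r: bool, s: bool) -> bool:
    # Hypothetical rule over the other two flags.
    return r or s

def decide(p, q, r, s, check_a=check_a, check_b=check_b) -> bool:
    # Composition layer: the checks are injectable, hence mockable.
    return check_a(p, q) and check_b(r, s)

# Monolithic: exhaustive testing of decide() needs 2**4 = 16 cases.
assert len(list(product([True, False], repeat=4))) == 16

# Layered: 4 cases per check (2**2 each), plus 4 mocked composition cases = 12.
for pq in product([True, False], repeat=2):
    assert check_a(*pq) == (pq[0] and not pq[1])
for rs in product([True, False], repeat=2):
    assert check_b(*rs) == (rs[0] or rs[1])
for a, b in product([True, False], repeat=2):
    assert decide(0, 0, 0, 0, check_a=lambda *_: a, check_b=lambda *_: b) == (a and b)
```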
Since that time we've separated our test suite into integration tests and unit tests, with instructions to rewrite integration tests to unit tests if possible. (Some integration tests are worthwhile, but most were just because unit tests were hard at that time.) We turned parallelism back on for the unit test suite. The unit tests aren't flaky, and now people are running the unit test suite in an infinite loop in their IDE. Over that time our codebase has gotten better structured, we have less interdependence and merge conflicts, morale has improved, velocity has gone up.
Anyway, according to this article it sounds like we've done basically the opposite of what we should have done.
And that by following those three principles, it kind of drives you to writing testable code. Because if you don't, you might have tests that are only small (simple integration tests), or only fast and reliable (testing unfactored code with lots of mocking) - and that the only way to do all three is by refactoring to write testable code that has good layer separation and therefore minimal mocking requirements.
There was stuff in there about how mutable state and concurrency leads to non-determinism and therefore unreliable tests, which is part of what justifies pushing towards pure functions that can be easily unit tested without mocking.
Only half your time? You're doing testing wrong if it doesn't take 80% of the time ;-)
I have a love-hate relationship with testing. Working for myself as a company of one, some of the benefits testing brings just don't apply. I have a suite of programs built in the style of your point (1). The programs were quick to market and hacked out whilst savings ran out, not knowing if I would make a single sale.
Sales came, customer requests came, new features were wanted, sales were promised "if the program could just do xyz". More things were hacked on. The promise of "I will go back and do this properly and tidy up this unholy mess of code" slowly slipped away, until I stopped lying to myself that I would do it.
Yes, there was a phase of fix one problem, add another, but I have most of that in my head now, and it has been a long time since that happened.
Not a single test. Developing the programs was "fun" and exciting. Getting requests for features in the morning and having the build ready by lunch kept customers happy.
Now I am redoing the apps as a web app, for "reasons". This time I am doing it properly, testing from the start. I know exactly what the program should do and how to do it, unlike the first time, when I really had no idea. But still, I come to a point and realise the design is wrong and I hadn't taken something into consideration. Changing the code isn't so bad; changing the tests, O.M.G.
I am so fed up of the project, I do all I can to avoid it, it is 2 years late, I wish I never started it. The codebase has excellent testing, mocks, no little hacks, engineering wise am proud of it. The tests have found little edge cases that would have been found out by customers so avoided that. But there is no fun in it. No excitement. Is just a constant drudging slog.
I'm trying to avoid dismissing testing altogether, as I really want to see the benefit of it in a substantial production code base, if I ever get there. At the moment, the code base is the best-tested unused software ever written, IMO.
The thing about testing that never really gets talked about is: what's the penalty for regressions? What are the consequences if you ship a bug so bad the whole system stops working?
Well, if you're building a thing that's doing hundreds of millions in revenue, that might be a big deal. But you? You're a team of one! You rollback that bad deploy and basically no one cares!
Your customers certainly don't care if you ship bugs. If it was something important enough where they REALLY cared, they wouldn't be using a company of one person.
So, go for it. Dismiss tests until you get to a point where you fear deploying because of the consequences. Then add the bare minimum of e2e tests you need to get rid of that fear, and keep shipping.
Having said all that, I find it's better to skip some unit tests when building your own project. It can be better to do the high-level tests (some integration, focused on the system) to make sure the major functionality works. In many cases, for an app that's not too complicated, you can just have a rough manual test plan. Then move to automated tests later on if the app gets popular, or the manual testing becomes too cumbersome.
It's still good to have a few unit tests for some tricky functions that do complicated things so you aren't spending hours debugging a simple typo.
Human lives, customer faith in the product, GDPR violations, HIPAA violations, data, time/resources in space missions.
I somehow doubt that comparing this 'team of one project' to the Mars Climate Orbiter leads to any useful conclusions. It's a nice bit of hyperbole though!
Anyway, this was to address the issue of a bug. I took the comment of "it's just a team of one" as a way of trying to justify not putting your engineering due diligence into delivering a product to the customer.
I've delivered a number of products (in the early days of my career) to clients where data loss happened and while not fun, it also didn't significantly harm the product or piss off said client. I saw my responsibility primarily to do the best I could and clearly communicate potential risks to the client.
> I took the comment of "it's just a team of one" as a way of trying to justify not putting your engineering due diligence into delivering a product to the customer.
That I do agree with, but 'due diligence' is a very vague concept. I guess honest communication about the consequence of various choices is perhaps the core aspect?
And of course 'engineering due diligence', in my opinion, includes making choices that might lead to an inferior result from a 'purely' engineering perspective.
Yes. This is exactly what this person should do. Stop worrying about arbitrary rules and just deliver the damn product already. A hacky, shitty, unfinished product in your customer's hands that can be iterated on beats one that never got shipped at all every day of the week.
I've worked for myself as well and know what you mean. In my situation, I was able to save myself from testing by telling my customers "this is a prototype so expect some issues".
Overall, the blog post says, unit tests take a long time to write compared to the value they bring - instead (or also) focus on more valuable automated integration tests / e2e tests because it is much easier than it was 10-20 years ago.
Your comment on the other hand, less so...
One of the things that distinguishes great engineers is that they make good judgment calls about how to apply technology or which direction to proceed. They understand pragmatism and balance. They understand not to get infatuated with new technologies but not to close their minds to them either. They understand not to dogmatically apply rules and best practices but not to undervalue them either. They understand the context of their decisions, for example sometimes code quality is more important and other times getting it built and shipped is more important.
As in life, good and bad decisions can be the key determiner of where you end up. You can employ a department full of skilled coders and make a few wrong decisions and your project could still end up a failure.
Some people never develop good engineering judgment. They always see questions as black and white, or they can't let go of chasing silver bullet solutions, etc.
Anyway, it's one thing to understand how to do unit tests. It's another thing to understand why you'd use them, what you can and can't get out of them, what the costs are, and take into account all that to make good decisions about how/where to use them.
I keep tests together with the code, because of their documentation/specification value.
I do not write tests for functions which are compositions of library functions. I do not test pre/post-conditions (these are something different).
And I definitely do not try to have "100% test coverage".
Personally, I fast-tracked through 2-4 out of sheer laziness, but that's definitely my progression with regard to testing and pretty much everything related to code quality. It includes comments, abstraction, purity, etc.
- Initially, you are victim of the Dunning–Kruger effect, standing proudly on top of Mount Stupid. You think you can do better than the pros by not wasting time on "useless stuff".
- Obviously, that's a fail. You realize the pros may have a good reason for working the way they do. So you start reading books (or whatever your favorite learning material is), and blindly follow what's written. It fixes your problems and replaces them with other problems.
- After another round of failure, you start to understand the reasoning behind the things written in the books. Now, you know to apply them when they are relevant, and become a pro yourself.
One thing I do religiously all the time is putting asserts everywhere. It's the only thing you can go crazy on. The rest is indeed always a balancing act.
What are you testing for?
This is critical because it basically gives you immediately what you should and should not test, and how. While mindless, dogmatic, metric oriented testing is a waste, testing with higher intent and purpose is extremely useful.
An example: test that something working on current vX also works on vA to vW, and when vZ comes out, have the answer ready. Or that a biz feature fulfills the requirements. Or that someone not as well versed in the intricate details of your piece of ownership will be confident in that piece still working after a simple fix while you're on vacation. It can be one, some, but probably not all.
With that in mind, what to test, what doesn’t make sense to test, and what to test against becomes more clear: should I mock this? or should I run it against some staging environment? Should I perform (yikes? not!) manual testing?
The answers are highly dependent on the piece of code being tested.
Tests are here to help you answer a question, if you aren’t sure what the question is then your tests will miss the point.
100% coverage of what exactly? Tests that go through all your lines of code without testing any of the logic are useless. If you want to be thorough, you need to do mutation testing, which is a system that tests the quality of your unit tests by mutating your logic (changing a > to >=, a + to -, etc.) and then expects at least one test to fail. If no test fails, that piece of logic wasn't tested.
Without that, it's entirely possible your high code coverage doesn't actually test anything meaningful. Also, this sort of logic is exactly the kind of stuff you want to unit test. All the standard plumbing boilerplate code is not something that needs to be unit tested. The logic does.
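As a minimal illustration of the idea (C++, all names invented): the "weak" test below gives 100% line coverage of the logic yet lets the mutant survive, while a boundary test kills it.

```cpp
#include <cassert>

// The logic under test.
bool is_adult(int age) { return age >= 18; }

// A mutant a mutation-testing tool might generate: ">=" flipped to ">".
bool is_adult_mutant(int age) { return age > 18; }

// A coverage-only test: it executes the line but never probes the
// boundary, so it passes against both versions -- the mutant survives.
bool weak_test_passes(bool (*version)(int)) {
    return version(30);
}

// A boundary test kills the mutant: original and mutant disagree at 18,
// so running this test against the mutated logic fails.
bool boundary_test_passes(bool (*version)(int)) {
    return version(18);
}
```

Mutation-testing tools such as PIT (Java) or mutmut (Python) automate generating mutants like this and reporting the survivors.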
I’d also like to add that if you contribute code to an open source project it is extremely beneficial to have iron-clad unit tests. Since there are so many devs, it would be easy for someone to accidentally break something you fixed already.
It's often easier to just aim for 100% test coverage instead (with excluding some categories of files).
EDIT: I would not and did not start with 100% unit testing. But if there are ongoing culture wars and discussions didn't lead to a workable compromise, 100% test coverage worked for me, and after some days test coverage was a non-issue.
That's where the 'gaming' comes in.
The tests start just going through lines without hitting a single expect statement.
The ignore files start becoming battlegrounds in the PRs because people just exclude half the damn project.
We just have a simple rule... if you wrote code, you have to write coverage for it. If it breaks and your test doesn't catch the breakage, the bug fix goes back to you. Some people will ask "but what about what I'm working on now", you'll have to communicate that you feel your previous work was far more important.
this feels punitive, especially in the eyes of management. unless you're in a safety critical area where fully testing every code path is a hard requirement, people will eventually write bugs.
i'd rather work somewhere that recognizes defects occur and has a fast iterative process to push out new changes rather than one based on shame for having written a bug.
That is the fastest, most iterative process we have found so far... as the expert on the original code, you are able to deliver the best outcome.
You're not being shamed for writing a bug, you're being shamed for not testing your code.
Property based testing and random values in unit tests find lots of bugs you didn't think of though.
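As a sketch of what that looks like in practice (C++, toy code and invented names): instead of asserting on a few hand-picked examples, a property test checks an invariant, here decode(encode(s)) == s, over many random inputs.

```cpp
#include <cstdlib>
#include <string>
#include <utility>
#include <vector>

// Toy code under test: a run-length encoder/decoder.
std::vector<std::pair<char, int>> rle_encode(const std::string& s) {
    std::vector<std::pair<char, int>> runs;
    for (char c : s) {
        if (!runs.empty() && runs.back().first == c) runs.back().second++;
        else runs.push_back({c, 1});
    }
    return runs;
}

std::string rle_decode(const std::vector<std::pair<char, int>>& runs) {
    std::string out;
    for (const auto& r : runs) out.append(r.second, r.first);
    return out;
}

// The property must hold for EVERY input, not just the examples we
// happened to think of. Random inputs probe corners (empty strings,
// long runs) that hand-written cases often miss.
bool roundtrip_holds(int trials, unsigned seed) {
    std::srand(seed);
    for (int t = 0; t < trials; ++t) {
        std::string s(std::rand() % 30, ' ');
        for (char& c : s) c = static_cast<char>('a' + std::rand() % 3);
        if (rle_decode(rle_encode(s)) != s) return false;
    }
    return true;
}
```

Dedicated frameworks (QuickCheck, Hypothesis, rapidcheck) add shrinking of failing inputs on top of this basic loop.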
Rather, you won't find bugs that you choose not to think of because you've let the "100%" number lull you into complacency, even though you know it's 100% of lines/branches, not 100% of inputs.
My problem in 40 years of programming is still that I make bugs, and the ones I make come from not thinking about edge cases or from wrong assumptions, not from being lulled into writing tests to meet a 100% number.
But personalities differ, and if being lulled into a false sense of security by writing toward a 100% number is a problem for you, I would be careful; I totally agree there.
I'd prefer <100% coverage plus discussions about what to test (and how) much more than working with a test suite built on the wrong incentives.
When my colleagues are knowledgable and open minded I would embrace every opportunity to have a good discussion.
But in this case I think the cure might be worse than the disease. Tests for plumbing code often end up being brittle tests of methods getting called on mocks in the right order. People will notice that these require a lot of toil to keep them running as code changes while providing very little benefit in avoiding mistakes. People will rankle at being told that they must write these tests, which they can see are a waste of time.
I've done it both ways. I'm much happier with my work when I'm not trying to write tests that are tedious and don't seem to provide any value, in order to hit an arbitrary coverage metric. I suspect my teammates feel the same way, so on teams where I have input into the decision on this, I do not advocate for 100% coverage. It does make it harder to have the discussion of which tests should and shouldn't be written, but I think it's worth that cost.
Writing good testing code is harder than writing business code. Junior developers especially struggle with this, most often because many companies don't write enough tests for them to learn how to write good ones.
And if you're in an environment where this is a non-issue, I think that's great. Don't fix something that doesn't need to be fixed.
It has a side benefit that it forces devs to write testable code, which inclines them toward reasonably factored code.
Congratulations, now you have a war over which categories of files are excluded from the "100% test coverage" rule. ;)
Perhaps I am wrong, and I would not start with a 'diktat' for obvious reasons.
As a manager, did you not have discussions about the necessary level of code coverage? I'd be interested in how you managed unit testing without a 'diktat', and how it fit into integration testing and exploratory testing. What level did developers in your department usually find "adequate"? If you considered it too low, how did you raise test coverage as a manager without defining a coverage level?
I strive for working code. Sometimes I miss something in the TDD cycle and don’t have 100% and it is that which usually comes back to bite you.
I have never found 100% test coverage has bitten me, dogmatic or otherwise.
throw InternalException("Unsupported type!");
assert(type == Y);
Is worse code worth getting 100% code coverage? In my eyes, absolutely not. I think good code + testing should be able to reach at least 90% typically, likely 95%, but 100% is often not possible without artificially forcing it and messing up your code and/or making it much harder to change your code later on.
You can be defensive to various degrees about assertions:
1. You can just use assert() to fail in Debug and do nothing in Release.
2. You can be more defensive and define your always_assert() to fail in Release as well.
3. You can double down on the UB with hints to the compiler and provide assume(), which explicitly compiles to UB when it's triggered in Release (using __builtin_unreachable() for example).
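A rough sketch of those three levels (the macro names come from the comment above; the exact definitions are an assumption):

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>

// 1. Plain assert() from <cassert>: checks in Debug, compiled out
//    entirely when NDEBUG is defined (typical Release builds).

// 2. always_assert(): a hand-rolled variant that keeps the check in
//    Release builds too, failing loudly instead of invoking UB.
#define always_assert(cond)                                          \
    do {                                                             \
        if (!(cond)) {                                               \
            std::fprintf(stderr, "assertion failed: %s\n", #cond);   \
            std::abort();                                            \
        }                                                            \
    } while (0)

// 3. assume(): checks in Debug, but in Release tells the optimizer the
//    condition is guaranteed -- violating it is then undefined behavior.
#if defined(NDEBUG) && (defined(__GNUC__) || defined(__clang__))
#define assume(cond) do { if (!(cond)) __builtin_unreachable(); } while (0)
#else
#define assume(cond) assert(cond)
#endif

int safe_div(int a, int b) {
    always_assert(b != 0);  // survives Release: fail loudly, never UB
    return a / b;
}
```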
About the organization of the if statement: I agree that the former is better, I would use assert(false) though.
Throwing an exception here is basically free (just another switch case) and gives the user a semi-descriptive error message. When they then report that error message, I can immediately find out what went wrong. Compared with a report about a segfault (with maybe a stacktrace), the error message is significantly easier to debug and reason about.
assert_always would provide a similar report, of course. However, as we are writing a library, crashing is much worse than throwing an internal error. At worst an internal error means our library is no longer usable, whereas a crash means the host program goes down with it.
Better yet, omit that default case, so that in the future when you do add a new value to the enum, the compiler will warn you and force you to add a new case.
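For example (hypothetical enum), leaving out the default case lets GCC/Clang's -Wswitch warning (enabled by -Wall) flag any enumerator the switch doesn't handle:

```cpp
// With no default case, the compiler can check the switch for
// exhaustiveness -- adding Shape::Triangle later produces a -Wswitch
// warning here, instead of silently falling into a default.
enum class Shape { Circle, Square };

int corner_count(Shape s) {
    switch (s) {
        case Shape::Circle: return 0;
        case Shape::Square: return 4;
        // no default: keep the exhaustiveness check alive
    }
    return -1;  // unreachable today; silences -Wreturn-type
}
```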
But I agree with your general thesis that it's just not worth getting to 100% coverage.
Tell that to the SQLite guy.
SELECT code execution from using SQLite - DEF CON 27 Conference
Don't get me wrong, it's a great product and I use it often, but 100% test coverage does NOT equal 100% safe.
Try that with a networked application that takes user input though...
It will always break on the user's internet, because it's too diverse to predict.
Doesn't mean you can't have some networking unit tests, just that you shouldn't believe in them too much.
Edit: you said services. You thinking of server? I'm thinking of clients.
And it doesn't have to be a static mock. It's not too hard to inject a fuzzer in your mock service response, although that's probably left to a separate testing routine, and not part of your unit test setup. But if you have no mock for your network service, you can't fuzz it either.
Which shows 100% unit test coverage is not better than spending that time on other kinds of tests.
SQLite is suitable for 100% coverage.
It's hard to reach 100% coverage on a lot of application or workflow-style code, since many of its paths are rarely triggered.
That's an opportunity to describe, in code, what that path is supposed to do, and then make sure it does it.
But the original signature is just this:
public async Task<SolarTimes> GetSolarTimesAsync(DateTimeOffset date)
* The SolarCalculator needs to be able to work out its own location, so it needs a LocationProvider
* SolarCalculator needs to be IDisposable since it owns a LocationProvider
* The SolarCalculator will need more methods if it ever needs to calculate the times in a different location
* If fetching the location is slow, but the application needs to calculate times for multiple dates (eg to build up a table of times), then the SolarCalculator will need a method that takes in an array of dates to be efficient
But all that could be solved by making the function take all of the arguments it needs to return its value:
public SolarTimes GetSolarTimes(DateTimeOffset date, Location location)
Unit testing this is now just:
var calculator = new SolarCalculator();
var actual = calculator.GetSolarTimes(new DateTimeOffset(...), new Location(...));
var expected = new SolarTimes(...);
Assert.Equal(expected, actual);
If your code is broken down clearly into logic and plumbing, unit testing the logic becomes super easy. It allows you to construct software using blocks you have absolute confidence in. Unit testing plumbing is harder, and that's when integration testing shines.
100% of the time, it was the right idea and the code became a lot better.
The author's tests are overly complex. Instead of gleaning the actual value of this insight, which is that you're not cleanly separating your inputs and your outputs, the author concludes that unit tests are a waste of time.
Nope. Unit tests are a tool, but writing proper unit tests and understanding the value they give you is an art and a science. It requires experience and deliberate design.
It seems that most devs (me included) learn at school to write pure functions, which is great. Then they come to the industry and all of a sudden the "parseXml" function takes an ftp port as a parameter... ("but in my case the xml was on an ftp server!")
Why is there no CS course that explains this kind of stuff?
(And I am sure a bunch of other similar but differently-named concepts)
public SolarTimes GetSolarTimes(Location location, DateTimeOffset date)
TL;DR: GetSolarTimes(Location, Date) is a unit-testable function.
Had some thought been put into writing with unit tests in mind, there would be no problems with that example.
seems like you're all saying that
var actual = calculator.GetSolarTimes(new Date(...), new Location(...));
is better than
public async Task<SolarTimes> GetSolarTimesAsync(DateTimeOffset date)
but i think that ignores the reason why DI containers were invented in the first place and assumes that the solar calculator is just a simple entrypoint-type application, rather than a component in a real application. You might have 20 layers of THING, somewhere inside which this solar time calculator lives and is used... and you still have to get Location from SOMEWHERE to pass it into the calculator.
so what happens when Whatever uses the location provider to get the location and pass it along needs to be tested? and through how many layers of stack do you need to pass Location before you realize that every test of every intermediate layer needs to know about location, but only for the purpose of passing it along?
I think it's a more nuanced case than you're making it seem. Beyond some level of complexity in an application, it becomes simpler to co-locate dependencies where they're actually used.
Refactoring is even worse. Refactoring after you've split something up into multiple parts and tested their interfaces in isolation is far more work. Any refactoring worth a damn changes the boundaries of abstractions. I frequently find myself throwing away all the unit tests after a significant refactoring; only integration tests outside the blast radius of the refactoring survive.
I find the same issue of throwing away tests when I'm writing small-scale integration tests with JUnit. Usually I'm mocking out the DB and a few web service calls, so those tests become more volatile because their surface is exposed more. But smaller-level, function- and class-level tests can have a really good ROI, and they do push you to design for testing, which makes everything a bit better imo.
If you unit test all of the objects (because they're all public) and then refactor the organisation of those objects, all your tests break. Since you've changed the way objects talk to each other, all your mock assumptions go out the window.
If you define a small public api of just a couple of entry points, which you unit test, you can change the organisation below the public api quite easily without breaking tests.
Where to define those public apis is a matter of skill: working out which objects work well together as a cohesive unit.
One of his examples from the article is injecting, IOC-style, the HttpClient instance into his LocationProvider class. He insists that this is a waste of time, and that the automated tests (if you have any at all) should be calling out to the remote service anyway. I can't disagree more! Hopefully you're configuring the automated tests to interact with a test/dev instance of the service and not the production instance (!). But what invariably happens is that the tests fail because the dev instance happened to be down when they ran. And they take a long time to run, so everybody stops running them since they don't tell you anything useful anyway. This is even worse when the remote service is not a web service but a database: now you have to insert some rows before you run the test and then remember to delete them... and hopefully nobody else is running the same test at the same time! To be useful in any way, automated tests must be decoupled from external services, which means mocking, which means some level of IOC.
On the other hand, he also introduces the example of SolarCalculator mocking LocationProvider. I agree that that level of isolation is overkill and will unapologetically write my own SolarCalculator unit test to invoke a "real" LocationProvider with a mocked-out HttpClient, and I'll still call it a unit test. (On the other hand, the refactored design with the ILocationProvider really is better anyway.)
So I think the reason people argue about this is because they can't really agree on what constitutes a unit test. I'd rather step back from what is and isn't a unit test and focus on what I want out of a unit test: I want it to be fast, and I want it to be specific. If it fails, it failed because there's a problem with the code, and it should be very clear exactly what failed where. A bit of indirection to permit this is always worthwhile.
Good, this time you can get it right.
If you change the implementation for a unit, a small piece of code, then the unit test doesn't change; it continues to test that the unit does what it's supposed to do, regardless of the implementation.
If you change what the units are, like in a major refactor, then it makes sense that you would need whole new unit tests. If you have a unit test that makes sure your sort function works and you change the implementation of your sort, your unit test will help. If you change your system so that you no longer need a sort, then that unit test is no longer useful.
I don't see why the fact that a unit test is limited in scope as to what it tests makes it useless.
Of course, you don't know ahead of time exactly which tests will catch bugs. But given finite time, if one category of test has a higher chance of catching bugs per time spent writing it, you should spend more time writing that kind of test.
Getting back to unit tests: if they frequently need to be rewritten as part of refactoring before they ever catch a bug, the expected value of that kind of test becomes a fraction of what it would be otherwise. It tips the scales in favor of a higher-level test that would catch the same bugs without needing rewrites.
That's like saying you shouldn't have installed fire alarms because you didn't wind up having a fire. Also, tests can both 1) help you write the code initially and 2) give a sense of security that the code is not failing in certain ways.
> It tips the scales in favor of a higher-level test that would catch the same bugs without needing rewrites.
Writing higher-level tests that catch the same bugs as smaller, more focused tests is harder, likely super-linearly harder. In my experience, you get far more value for your time by combining unit, functional, system, and integration tests rather than sticking to one type because you think it's best.
To go with the fire alarm analogy and exaggerate a little, it would work like this: you could attempt to install and maintain small disposable fire alarms in the refrigerator as well as every closet, drawer, and pillowcase. I'm not sure if these actually exist, but let's say they do. You then have to keep buying new ones since the internal batteries frequently run out. Or, you could deploy that type mainly in higher-value areas where they're particularly useful (near the stove), and otherwise put more time and money in complete room coverage from a few larger fire alarms that feature longer-lasting batteries. Given that you have an alarm for the bedroom as a whole, you absolutely shouldn't waste effort maintaining fire alarms in each pillowcase, and the reason is precisely that they won't ever be useful.
There are side benefits you mentioned to writing unit tests, of course, like helping you write the API initially. There are other ways to get a similar effect, though, and if those provide less benefit during refactoring but you still have to pay the cost of rewriting the tests, that also lowers their expected value.
To avoid misunderstanding, I also advocate a mixture of different types of tests. My point is that, based on the observation that unit tests depending on change-prone internal APIs tend to need more frequent rewrites, their expected value should be lowered accordingly, which in turn affects how the mixture is allocated.
> unit tests depending on change-prone internal APIs
This in particular is worth highlighting. I now tend not to write unit tests for things that just get data from one place and pass it to another, unless the code is complex enough that I'm worried it might not work or will be hard to maintain. And generally, I try to break out the testable part into a separate function (so it's get data + manipulate (testable) + pass data).
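A minimal sketch of that get/manipulate/pass split (C++, all names invented): keep the middle step pure so it's the only part that needs a unit test.

```cpp
#include <algorithm>
#include <vector>

// The "manipulate" step: a pure function with no I/O, trivially unit
// testable on its own.
std::vector<int> keep_positive_sorted(const std::vector<int>& xs) {
    std::vector<int> out;
    for (int x : xs)
        if (x > 0) out.push_back(x);
    std::sort(out.begin(), out.end());
    return out;
}

// Imagined shell around it -- fetch, manipulate, store. Only the middle
// step carries logic worth a unit test; the plumbing around it is
// better covered by an integration test:
//
//   auto raw = api.fetch_scores();             // get data (plumbing)
//   auto cleaned = keep_positive_sorted(raw);  // manipulate (testable)
//   db.store_scores(cleaned);                  // pass data (plumbing)
```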
I'm not arguing unit tests are useless.
They're not for catching unknown bugs, they're for safer updates.
Striving to make your code testable is almost always worth it. Someone might ask this guy to add some error handling to his code for example. :)
Then he will find out that code, however simple, that merely "works on my machine" (i.e. is proven to work in a single happy-path context) is painful to change. Writing code that runs in multiple contexts (composed as an app, or decomposed for testing) is intrinsically easier to work with and change.
Can you expand more on this? I think this is where the author would disagree.
E.g., how is the code easier to reason about or refactor after having introduced a location service interface that has only one implementation?
I've been on projects that focused almost exclusively on unit tests and on projects that focused almost exclusively on integration tests. The latter were far better at shipping actually working code, because most of the interesting problems occur at the boundaries between components. Testing each piece with layer after layer of mocks won't address those problems. Yay, module A always produces a correct number in pounds under all conditions. Yay, module B always does the right thing given a number in kilograms. Let's put them together and assume they work! Real life examples are seldom this obvious, but they're not far off. Also note that the prevalence of these integration bugs increases as the code becomes more properly modular and especially as it becomes distributed.
I firmly believe that integration tests with fault injection are better than unit tests with mocks for validating the current code. That doesn't mean one shouldn't write unit tests, but one should limit the time/effort spent refactoring or creating mocks for the sole purpose of supporting them. Otherwise, the time saved by fixing real problems more efficiently - a real benefit, I wouldn't deny - is outweighed by the time lost chasing phantoms.
Unit tests protect you against current mistakes. They're tied to the exact implementation.
"Right now my function X should call Y on its dependency Z before it calls A on its dependency B.
I know that my method should do this, because this is how I designed it now.
Let me write a test and expect exactly that."
Integration and system tests will tell you whether in the future your code will still work when you refactor.
"Okay, we rewrote the whole class containing the function. Does running my thing still end up writing ABC into that output file?"
Otherwise I agree with you mostly.
If unit tests are tied to an exact implementation, they'll fail on correct behavior, and that's definitely wrong. It shouldn't matter whether X calls Z:Y or B:A first, whether it calls them at all, whether it calls them multiple times, or whether it calls them differently. All that matters is that it gets the correct answer and/or has the same final effect.
Unit tests should be based on a module's contract, not its implementation. This is in fact exactly what's wrong with most unit tests, that they over-specify what code (and all of its transitive dependencies) must do to pass, while by their nature leaving real problems at module interfaces out of scope.
b) Even if you have an output, it's dependent on more complex input of arbitrary types.
Assume that there's a method that returns a value based on summing the outputs of method calls on its abstract dependencies.
To do dogmatically correct unit testing you'd pass those 2 mocked dependencies, and have the mocks return fixed values when the right method is called on them.
Then you'd assert that B was called on A, that D was called on C, and that the method under test returns the sum of those returns.
As soon as you move into passing implementations of those 2 dependencies, to anyone dogmatic you're doing integration testing.
Even if the tester isn't being dogmatic, in a lot of cases these inputs are complex enough that building enough actual inputs that are consistent and realistic to cover all the cases is prohibitively costly, so they opt for mocks.
Now, suddenly you just have more code to maintain when making changes, but you feel good about yourself.
O -> int
O -> int // of specific value based on dependencies
A -> C -> int // of specific value
M(A) -> M(C) -> int // of specific value
M(A) -> int
M(B) -> int
M(A) -> 3
M(B) -> 5
3 -> 5 -> int // of specific number
3 -> 5 -> 8
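Spelled out as hand-rolled C++ mocks (all names invented), the dogmatic version of that test looks something like this:

```cpp
// The abstract dependencies (the A and B of the shorthand above).
struct Dep {
    virtual ~Dep() = default;
    virtual int get() = 0;
};

// O: the method under test sums the outputs of its two dependencies.
int sum_of(Dep& a, Dep& b) { return a.get() + b.get(); }

// Hand-rolled mocks: M(A) -> 3, M(B) -> 5, with call counting so the
// dogmatic test can also assert that each method was actually invoked.
struct MockDep : Dep {
    int value;
    int calls = 0;
    explicit MockDep(int v) : value(v) {}
    int get() override { ++calls; return value; }
};

// The dogmatic unit test: 3 -> 5 -> 8, plus interaction assertions.
bool dogmatic_test_passes() {
    MockDep a(3), b(5);
    bool sum_ok = sum_of(a, b) == 8;
    bool calls_ok = a.calls == 1 && b.calls == 1;
    return sum_ok && calls_ok;
}
```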
The designer of the above monstrosity could learn a lot from the phrase "imperative shell, functional core". It sounds like dogma until you are knee deep in trying to test the middle of a large object graph!
It's integration testing that validates that all your units still combine (integrate) into a working end product. That's not about testing your implementation nor your internal interfaces, that's about testing your program's inputs and outputs.
All tests protect the programmer against future mistakes. All tests are a protection against regressions.
But yes agreed, integration tests absolutely carry much more value than any unit tests might. Specifically because units tests tend to target things that are essentially implementation details.
The only time I'd say unit tests carry any value is if they're testing some especially important piece of business logic e.g. some critical computation. Otherwise, integration tests rank the highest in the teams I lead.
I don't have anything against objects, per se, but I think they tend to make unit testing much more difficult to accomplish. The closer your code resembles pure functions, the easier it is to do dependency injection and unit testing.
Plus the same problem arises with modules instead of objects, which traditionally are even harder to customize.
You can get pretty far with good abstractions and dependency injection. Go's io.Reader and io.Writer interfaces are a great example of this. The resulting functions aren't pure in a technical sense, but they're pretty easy to unit test nonetheless.
> Plus the same problem arises with modules instead of objects, which traditionally are even harder to customize.
Maybe you could elaborate. I really don't understand what you mean here.
From what I understand, modules just scope names, they don't maintain state. I don't see how they have the same problems as objects.
Which goes back to the article's point of having to write code that is unit test friendly.
Now architecture decisions have to integrate interfaces that wouldn't be needed otherwise.
> Maybe you could elaborate. I really don't understand what you mean here.
Modules keep state via global variables and module-private functions, plus whatever surface control they might expose via the module's public API.
Additionally on languages that support them, they can be made available as binary only libraries.
> Now architecture decisions have to integrate interfaces that wouldn't be needed otherwise.
You're not wrong.
But in the context of functions, that doesn't seem to me to be particularly onerous. If the worst I'm forced to do is change the type of my parameters to an interface instead of a concrete type, that seems like a pretty small price to pay for easy testability. Certainly a much smaller price than the examples in the article.
Imagine doing unit tests for a C application, where modules == translation unit/static/dynamic library, thus you can only do black box testing.
Now one needs to clutter it with function pointers everywhere, or start faking interfaces with structs, just for the benefit of unit tests.
And with static/dynamic libraries than one might need to start injecting symbols into the linker to redirect calls into mocking functions.
All just to keep QA dashboards green.
The fact that it makes unit testing easier is just icing on the cake.
Mainly due to the linking hacks and low level debugging sessions required to mock all necessary calls.
Plus that was just an example, there are plenty of languages with modules and binary libraries.
The libraries dependencies should all be indirected through whatever context struct you pass to all your calls.
Sadly not all code is great.
You probably want it rigged up to your own logger instead of just blindly writing to stdout. You probably want the library's allocations tagged somehow on the heap so you can track down memory leaks. You probably don't want it doing IO directly, because of how many different ways there are to do IO.
It's all more a function of how incredibly varied C environments are than design for testability. It just happens to be very testable as an aside.
Keep the impure code and the pure code separated.
It's where you need to handle mutable state with objects that things get trickier.
Unfortunately, these are exactly the places where you most need tests.
I'm a big fan of constantly returning things rather than holding state in objects, for specifically this reason.
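As a small illustration (invented names), compare holding a running total in an object with just returning the new value:

```cpp
// Stateful version: the answer depends on hidden accumulated state, so
// each test must first replay the object's history.
class RunningTotal {
    int total_ = 0;
public:
    void add(int x) { total_ += x; }
    int total() const { return total_; }
};

// "Constantly returning things" version: the new total is returned
// instead of held, so every call is a pure input -> output mapping
// that a one-line assertion can test, in any order, with no setup.
int add_to_total(int total, int x) { return total + x; }
```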
If the only thing you inject is data, can we still call that "dependency injection"?
I suppose that's a philosophical question.
Probably the 2 most common functions I 'dependency inject' are rand() and time.now(). I feel like they count, but you might not.
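A minimal sketch of that kind of injection (hypothetical names): pass the clock in as an argument, so production supplies the real one and a test can pin "now" to a fixed instant.

```cpp
#include <cstdint>
#include <functional>

// The function takes "now" as a dependency instead of calling the
// system clock directly, making its behaviour fully deterministic
// under test.
bool is_expired(std::int64_t expires_at_unix,
                const std::function<std::int64_t()>& now) {
    return now() >= expires_at_unix;
}

// Production call site (sketch):
//   is_expired(deadline, [] { return std::int64_t{std::time(nullptr)}; });
```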
I also tend to avoid HOF when I can instead pass data around explicitly.
Foo foo = new Foo(mock(Bar.class))
In my experience mock objects can be brittle. A few sprinkled in judiciously can be ok, but once the density gets high enough, it starts to feel like the test becomes decoupled from the actual code it's supposed to test.
Everyone started writing unit tests, and the code broke less. Developers became more confident in deploying, and eventually most PRs looked roughly the same: 10-20 line diff on the top, unit tests on the bottom. If there were no tests, the reviewer asked for tests. It became a fun and safe project to work on, rather than something we all feared might break at any moment.
I've since started insisting on having them as well, especially when I'm using dynamically typed languages. A lot of the tests I write in Python for example are already covered in a language like Go just by having the type system.
So we started adding unit tests. Utesting code that wasn't written for utests is painful: you often need to choose between refactoring or just patching the hell out of it. The latter is highly undesirable, since it leads to verbose tests, failures when you move a module, and the inability to do blackbox testing.
But utests encourage our new code to be clean and readable. We've found that functional programming is much easier to test than object-oriented, and is easier for engineers to grok. We just sprinkle a little dependency injection and the whole thing works nicely.
Itests have their place, but utests lead to faster feedback and more readable code.
Unit tests are an easy path to fall down, because they're clearly easier to set up and write, require less effort to maintain, and execute more quickly.
But you don't realise their significant downside until after you attempt a major refactor - you begin to see that unit tests are testing at the layer that changes the most anyway.
What's a better term than "dependency injection"? What should I call an argument whose default is always used in production code, but is there to make passing a mock easy? I'm not trying to be snide -- I'm genuinely curious.
1) The use of unit tests as the exclusive automated test type. ie; No functional tests, integration, etc.
2) Test doubles for most or every dependency, even purely functional dependencies like math libraries.
3) Not using the appropriate kind of test double for the test at hand (dummies vs. fakes vs. spies vs. stubs vs. mocks).
4) The overuse of mocking libraries.
Mocking libraries have their place, but in my opinion are used approximately a hundred, perhaps even a thousand, times more often than they should be. I use them to create test doubles in exactly three scenarios:
1) A dependency that does not have an interface, usually a third party library. This usually happens in one place only, and is used for writing the wrapper code test.
2) A dependency that has an incredibly large interface and/or dependency graph where building a set of stubs or spies is simply not worth the effort.
3) I want to test weird edge cases that aren't reachable any other way, such as theoretically unreachable code.
These should not be the majority of your unit tests!
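To illustrate the fake-vs-mock distinction the list above draws (the repository interface here is hypothetical): a hand-rolled fake is a working in-memory implementation whose observable behaviour the test checks, while a library mock (e.g. from `unittest.mock`) can only verify which calls were made.

```python
from unittest.mock import Mock

class FakeUserRepo:
    """Fake: a real, in-memory implementation of the repo interface."""
    def __init__(self):
        self._users = {}
    def save(self, user_id, name):
        self._users[user_id] = name
    def get(self, user_id):
        return self._users.get(user_id)

def rename_user(repo, user_id, new_name):
    """Code under test: depends only on the repo's get/save interface."""
    if repo.get(user_id) is None:
        raise KeyError(user_id)
    repo.save(user_id, new_name)

# With a fake, the test asserts on observable behaviour:
repo = FakeUserRepo()
repo.save(1, "Ada")
rename_user(repo, 1, "Grace")
assert repo.get(1) == "Grace"

# With a mock, the test can only assert on the interaction itself:
mock_repo = Mock()
mock_repo.get.return_value = "Ada"
rename_user(mock_repo, 1, "Grace")
mock_repo.save.assert_called_once_with(1, "Grace")
```

The fake-based test survives refactors that change how `rename_user` talks to the repo; the mock-based test is coupled to the exact call sequence, which is one reason over-mocked suites break so often.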
It feels like the industry has blindly pushed for unit testing everything and 80% or more code coverage as the gold standard.
I’ve given up arguing about the cost/benefit of unit tests at work. I feel that the teams I’ve worked on over the past couple of decades still produce about as many bugs as before unit testing came along. I’m not building pacemakers or aviation software, mostly LOB applications.
Unit tests provide a false sense of security (especially to management.) Yes sometimes they help catch refactoring bugs, but at what cost?
For a startup with a small team and few customers building an MVP? Unit testing is overrated.
For a company with 50 engineers in 10 teams building a product that moves $500,000/day in revenue? Unit testing may or may not be overrated.
For a company with 1,000 engineers working in the same repo, shipping a product that moves $50M in revenue per day? Unit testing is most likely underrated - and essential.
You cannot ignore how the organization works, and the cost of a defect that a unit test could have caught. I happen to work at the third type of organization, and while unit tests might not be the most efficient type of safety net, it is a very big one. We have other types of testing layers on top of unit: integration and E2E tests as well.
Also, one more fallacy in the article:
"If we look back, it’s clear that high-level testing was tough in 2000, it probably still was in 2009, but it’s 2020 outside and we are, in fact, living in the future. Advancements in technology and software design have made it a much less significant issue than it once was."
This is not true everywhere. High-level / E2E testing on native mobile applications in 2020 is just as bad as it was on the web in 2009.
You are right, but it still doesn't mean you should aim for high coverage. In the big-company case you'll want to cover the interfaces and dependencies, and less of your own team's code.
I know that part of this will fall under "integration" but definitions are sneaky.