Unit Testing is Overrated (2020) (tyrrrz.me)
236 points by ivanvas on April 7, 2022 | 228 comments



The team I'm on is obsessed with 100% unit test coverage, which means it takes 3-5x as long to ship a feature while engineers write tests that are:

1) Almost always better off as part of an integration test

2) Offer no real ROI because the framework usually handles most of these issues gracefully to begin with, so it's just a few hundred lines of cute code trying to mock out shit (or better yet getting multiple people in a Zoom and wasting their time trying to test 4 lines of code - WHICH THE FKIN LIBRARY/FRAMEWORK ALREADY HANDLES)

3) Complete garbage. They write the code first and then add tests. That's fine if these were integration tests, but they run their little commands to print out a report that says 92% test coverage. Oh no No nO nO NO No meme-1.gif.jpeg. We have incidents regularly compared to my previous team, which was granted all Senior/Staff engineers and we didn't spend our time chasing dumbass metrics. If you want 100% unit test coverage, just do TDD and not this half-ass yolo cowboy approach. Or just write simple, single responsibility code so it can be tested easily. But nah, you know how it is.

These people are annoying as hell to work with, and unfortunately I work at a bigger company so this kind of BS is normal as people try to chase promotions instead of actually shipping quality products. I personally wouldn't hire people like this, nor keep them around this long.


Does your team work on a codebase that uses a dynamically typed language? If so, the 100% coverage requirement is perfectly reasonable and arguably the only sane way to develop anything more than a tiny project. If a statically typed language is being used then sure, that requirement is counterproductive, but if a team uses a dynamically typed language I don't see a way to have any reasonable amount of confidence that the code is even syntactically and semantically valid without 100% coverage. The 100% coverage is just a tax that you need to pay in dynamically typed codebases (hence my preference for a statically typed language on any reasonably sized work-related project).

For 3) TDD has nothing to do with test coverage or aiming for high coverage. The fact that TDD evangelists have tried to couple "good testing" with TDD is a huge scam. I don't care when you write your tests as long as they get written before the code is merged in. TDD is good for parts of the code which are very pure or where you know exactly what you are doing, but if what you are doing is more exploratory in nature, or is not pure and has lots of external dependencies, TDD is simply not practical for a lot of people.


In my experience, scaling a large project in a dynamically typed language with 100% unit test coverage just gets you production outages with all unit tests passing. A reasonable threshold is probably closer to 70-90% _plus integration tests_.

Hopefully you also picked a language where a bigco has created tooling that lets you incrementally add static types -- TypeScript, Closure Compiler, Python typing and/or pytype, etc.


Lol, this is so true. A project that I had joined (late) some years ago had 98%! code coverage, full sonar/fortify code compliance and was an utter piece of garbage that had regular outages in production.

It taught me a couple of lessons:

1. Integration and System Tests are FAR more valuable than unit tests. By several orders of magnitude.

2. Performance Tests are vital. This is actually more true in a cloud world. There is only so much hardware you can throw at a problem before your costs become infeasible.


This is false. 100% UT in a dynamically typed language like JS is no assurance of anything really. You need to cover all the observable behavior in any case (no matter the type of language) to get any kind of assurance. UT cannot deliver this.


The law of diminishing returns for sure applies here


It applies everywhere.


Also, this kind of mentality frequently results in what I consider negative-value tests. A test suite is supposed to do two things:

  a) identify bugs by failing on incorrect behavior, and 
  b) enable refactoring by continuing to pass when underlying code doesn't change behavior

Focusing on 100% unit test coverage all the time (particularly in code that isn't doing simple, single responsibility stuff) often results, in my experience, in tests that overly mock other parts of the system. The net result is that you end up writing tests that simply check that "the code was written the way it is currently written". Tests like this fail on the first count because they fundamentally cannot detect incorrect behavior, and they fail on the second count because a non-trivial refactor inevitably changes the implementation even if the resulting behavior stays the same.

Not that mocks aren't useful in some scenarios, but if you find yourself writing entire test suites based on them you're likely creating tests that are negative value. They won't catch undiscovered bugs, and they'll give you bad information when you're refactoring.
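
A minimal sketch of the difference, assuming a hypothetical `total_price` function and tax service (neither comes from the article). The first test only mirrors how the code is written; the second checks what it returns.

```python
from unittest.mock import Mock


def total_price(cart, tax_service):
    """Sum the item prices and add the tax reported by the injected service."""
    subtotal = sum(item["price"] for item in cart)
    return subtotal + tax_service.tax_for(subtotal)


def test_mirrors_implementation():
    # Negative-value style: asserts only that a collaborator was called.
    # It would still pass if total_price returned the wrong number, and it
    # breaks if the tax lookup is refactored without changing behavior.
    tax_service = Mock()
    tax_service.tax_for.return_value = 0
    total_price([{"price": 10}], tax_service)
    tax_service.tax_for.assert_called_once_with(10)


class FlatTax:
    """Tiny hand-written stand-in for the real tax service (hypothetical)."""

    def __init__(self, rate):
        self.rate = rate

    def tax_for(self, amount):
        return amount * self.rate


def test_checks_behavior():
    # Behavior style: asserts on the observable result only, so it survives
    # refactors and fails when the arithmetic is wrong.
    cart = [{"price": 10}, {"price": 30}]
    assert total_price(cart, FlatTax(0.5)) == 60.0
```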


That's precisely what I did at the beginning of my career: test every method and mock all the dependencies. Because we usually didn't use a dependency injection framework, I had to add an interface for almost every class, add factories with interfaces almost everywhere, and then write tests that expected certain method calls, basically debugging the tests while writing them until all calls had been mocked. Thus, whenever I changed the production code I had to change the tests as well.

IMHO in my own projects I tend to write tests which test certain parts of a system as well as some dependencies. So usually I'd say they are integration tests instead of unit tests. But I can refactor stuff, and once I break something it's usually detected by one of the tests. Maybe unit tests should be reserved for certain border cases which are hard to test with integration tests.


I heartily agree with your general sentiment, but think you're not mentioning a very important function of a test suite which is:

c) explicitly and accurately document expected behavior including in a variety of edge cases

...this is sort of implicit in your first 2 but I think worth breaking out.

Going for high test coverage (in places where I've done TDD-style dev) has really forced me to think through those edge cases and what I want my code to do in all of them.

I think negative-value testing comes from the mindset of "how do I write a test which exercises this branch?" but a positive version of that is "I put this branch to handle the case where foo is getting close to the boundary. Let's put in test cases with foo at and close to the boundary and make sure it does what it is supposed to." If the test focuses on behavior then you find the bugs and enable refactoring and you get high coverage because if you can't exercise all the paths probably some of them are dead and can just get deleted.
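
As a rough illustration of the "at and close to the boundary" idea, here is a sketch with a hypothetical `shipping_fee` function; the 20 kg limit and the fee values are made up for the example.

```python
INCLUDED_WEIGHT_KG = 20  # hypothetical boundary the branch exists for
BASE_FEE = 5
SURCHARGE_PER_KG = 2


def shipping_fee(weight_kg):
    """Flat fee up to the included weight, then a per-kg surcharge."""
    if weight_kg <= 0:
        raise ValueError("weight must be positive")
    if weight_kg <= INCLUDED_WEIGHT_KG:
        return BASE_FEE
    return BASE_FEE + (weight_kg - INCLUDED_WEIGHT_KG) * SURCHARGE_PER_KG


# Cases chosen from the intended behavior: just below, at, and just above
# the boundary, rather than "whatever input reaches this line".
BOUNDARY_CASES = [(19, 5), (20, 5), (21, 7)]


def test_fee_around_boundary():
    for weight, expected in BOUNDARY_CASES:
        assert shipping_fee(weight) == expected, weight


def test_rejects_non_positive_weight():
    for weight in (0, -1):
        try:
            shipping_fee(weight)
        except ValueError:
            continue
        raise AssertionError(f"expected ValueError for weight {weight}")
```

Both branches end up covered, but coverage falls out of the behavior being specified instead of being the goal.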


That's the thing that pisses me off about how testing and the periphery around testing (like coverage) are commonly treated. They're seen as being virtuous in and of themselves and no one wants to measure whether they are counterproductive. Even if one took that initiative, good luck getting anyone to care.

Sure, I suppose one could say that any tests are better than no tests, but that also dismisses whether testing can be done better.

Code coverage is the worst. I've worked with codebases with various degrees of code coverage, and I've never seen a correlation between coverage and the number of bugs encountered. There may be an asymptotic nature to how much coverage is needed, but I would bet the threshold is low in a general sense. Once you've reached even 25% coverage, it's really dubious whether more coverage than that is actually telling you anything.

Of course, we could measure success by number of bugs and company revenue, but no, we can't do that; those could make people look bad.


Worse still, I feel like doing unit testing at the expense of integration testing can create a false sense of security.

Unit tests often skip the error cases that are really important for real life scenarios.

We've recently introduced a regression test suite for our data pipeline which simply compares all the data before with the data after running the pipeline. This gave us about 65% code coverage and the certainty that we're testing what actually matters for the product instead of how somebody decides to implement a function.
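
A minimal sketch of that kind of before/after comparison. The `run_pipeline` step, the row shape, and the stored snapshot are all hypothetical stand-ins, not the commenter's actual setup.

```python
def run_pipeline(rows):
    """Hypothetical pipeline step: drop empty orders and compute totals."""
    return [
        {"id": r["id"], "total": r["qty"] * r["unit_price"]}
        for r in rows
        if r["qty"] > 0
    ]


def diff_outputs(expected, actual):
    """Compare two result sets keyed by id and report what differs."""
    expected_by_id = {r["id"]: r for r in expected}
    actual_by_id = {r["id"]: r for r in actual}
    shared = expected_by_id.keys() & actual_by_id.keys()
    return {
        "missing": sorted(expected_by_id.keys() - actual_by_id.keys()),
        "unexpected": sorted(actual_by_id.keys() - expected_by_id.keys()),
        "changed": sorted(k for k in shared if expected_by_id[k] != actual_by_id[k]),
    }


INPUT_ROWS = [
    {"id": 1, "qty": 2, "unit_price": 5},
    {"id": 2, "qty": 0, "unit_price": 9},
    {"id": 3, "qty": 1, "unit_price": 7},
]

# Output recorded from a known-good run (in practice, loaded from a file).
SNAPSHOT = [{"id": 1, "total": 10}, {"id": 3, "total": 7}]


def test_pipeline_matches_snapshot():
    report = diff_outputs(SNAPSHOT, run_pipeline(INPUT_ROWS))
    assert report == {"missing": [], "unexpected": [], "changed": []}, report
```

The test says nothing about how `run_pipeline` is implemented, only that the data coming out has not changed.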


The false sense of security sold by the conference-industrial complex is my biggest issue with unit testing. I once saw a junior dev write copious amounts of unit tests to achieve near 100% code coverage. When this "perfect" code went into prod it almost took the whole system down, because the junior dev was joining close to a dozen tables and each one of the tables had at least a million rows.

If the junior dev had taken the time to understand basic DB performance instead of writing dozens of unit tests, it wouldn't have happened. I blame the gurus who don't write code for a living, but peddle magic solutions like "clean code", unit testing etc. to sell more books and seminars.


I'm pretty sure if he didn't write the unit tests it would have equally failed. There is something to blame here, but not the unit tests.


No doubt the junior dev would benefit a lot more from only getting up to 60-70% code coverage.

The extra time could then be used to learn basic DB performance, which is hugely important when tables have millions or billions of rows.

Stuff like indexes and isolation levels should be taught to all junior devs, it's just too risky not to.


I feel the opposite way, with unit tests you can test failure scenarios as easily as any other scenario. So it makes it simple to exhaustively test every code path including failure scenarios that would be very difficult to set up in integration tests.


There are bugs that don't appear as such at the unit level. Two objects might individually be in perfectly valid states, but those states can be invalid at a system level.

Which is why there can be a false sense of security with unit testing. When you see 100% code coverage, it implies that you've tested every possible scenario, but with most non-trivial software, you haven't. Did you test what happens when the methods are passed in 100GB of data?


I've become a fan of using mutation testing to drive unit test coverage. The goal of mutation testing is to ensure there is test coverage around areas with a higher likelihood of being brittle. Work smarter, not harder.
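
For anyone who hasn't seen the idea, here is a hand-rolled sketch of a single mutation. Real mutation-testing tools generate and run the mutants automatically; the `is_adult` function and both "suites" below are hypothetical.

```python
def is_adult(age):
    return age >= 18


def is_adult_mutant(age):
    # The kind of small change a mutation tool would make: ">=" became ">".
    return age > 18


def weak_suite(fn):
    """Covers every line of the function, but never probes the boundary."""
    return fn(30) is True and fn(5) is False


def strong_suite(fn):
    """Adds the boundary case, which is what distinguishes ">=" from ">"."""
    return weak_suite(fn) and fn(18) is True


# The mutant "survives" the weak suite: the tests pass on broken code,
# which reveals a gap that line coverage alone would not show.
mutant_survives_weak = weak_suite(is_adult) and weak_suite(is_adult_mutant)

# The strong suite "kills" the mutant: it passes on the original and
# fails on the mutated version.
mutant_killed_by_strong = strong_suite(is_adult) and not strong_suite(is_adult_mutant)
```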


> Once you've reached even 25% coverage, it's really dubious whether more coverage than that is actually telling you anything.

For sure there are diminishing returns on test coverage, and I share your frustration with the myopic focus on coverage, but there's no way the threshold is anywhere close to 25%. Perhaps 25% is sufficient (or way more than enough) for new, small projects using languages with good compilers and type checkers. But it's nowhere near enough, even under those ideal conditions, when you have several generations of developers working on big multi-year projects.

Tests do reduce bugs. That's why automated testing was invented in the first place. Number of bugs vs company revenue isn't ignored because it'd make people look bad, but because it's an incredibly difficult metric to calculate.


I think this approach comes out of highly dynamic languages like Python.

If a typo in a variable name doesn't cause any problems until it gets evaluated, then you really need to execute every piece of the code to be sure that you don't have incredibly simple errors.

If you are using a language with even a primitive static type system, you can catch a lot of errors at compile time (or when the interpreter loads the code), which provides most of the benefits of 100% coverage.
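
A small sketch of the typo case in Python, using a made-up `describe` function. The module loads fine and the common path works; the misspelled name only blows up when the rarely taken branch actually runs.

```python
def describe(n):
    if n >= 0:
        return "non-negative"
    mesage = "negative"  # typo: meant "message"
    return message       # NameError, but only when this line executes


def negative_branch_fails():
    """Return True if calling the untested branch raises NameError."""
    try:
        describe(-1)
    except NameError:
        return True
    return False
```

A test suite that never calls `describe` with a negative number would pass and ship the bug, which is the argument for executing every line in this kind of language.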


> If a typo in a variable name doesn't cause any problems until it gets evaluated

Linters, even in languages like Python, will catch that kind of thing.


For undeclared variables PHP is even worse. It just keeps going and just gives you a notice!


PHP has a lot in common with BASIC at that.


IMO it's a huge problem that TDD and unit testing get conflated way too often. TDD just means you write the test first and don't write code not required to get the tests to pass. It does NOT tell you what tests you should write. I for one am a huge believer in outside-in TDD. This would have you start with an acceptance test that uses the product as a user would. This may exercise the feature or reproduce the bug. It's for you to decide what the right abstraction level is. I typically will end up testing edge cases either one level down in integration tests or in unit tests, but it depends and there are no hard rules. It's all a trade-off. What's definitely wrong to me is relying solely on unit tests for your TDD; that's a disaster waiting to happen. TDD is about fast feedback loops and quality, not about proxy metrics like coverage from unit tests. As others have pointed out, depending on your language you also might need more or fewer tests. When I write Ruby or plain JS, I'll write a lot, a lot of tests on different levels. When I'm writing Rust, I write very few, and those that I do write will be focused on complex business logic (granted, all the Rust code I've written was for toy projects or puzzles where "business" logic might be the wrong term).

Edit: we've also corrupted what "unit" is referring to. AFAIK it originally referred to a unit of functionality. Many now take it to refer to a single class.


100% test coverage is really the starting point for testing. Once all the lines of code you've typed in actually run before they're delivered to the customer, then you can start adding the real test scenarios!


And how, pray tell, do you discover all of the scenarios that need to be covered?

If only there were some sort of coverage tool that could tell you...


You are forgetting one important part: unit tests serve as an always updated, always tested documentation of how you use the functionalities of your software.

I agree with your point on 3 - you should use the tests to start figuring out how your unit is going to be used. If you write code first, you frequently end up with untestable code that you need to mock the universe to get it running in artificial conditions. I don't recommend writing your tests first unless it's a trivial case, but you should at least have the unit calls in tests before you start coding.


This has been my interpretation of unit testing for some years now. They should be renamed to "Examples" or "Documentation Tests", because all they really do when I read them is (i) show how to use the unit and (ii) show common error / success cases of that unit.

I would never rely on them for actually testing the expected properties of a public API, for example.


Depends on how extensive the unit tests are. But, again, you also need to do integration tests for making sure you are not getting anything funny from external sources you rely on.


I like this post, because I avoided unit tests for so long lol. After finally doing them, I did notice that I would at least format my code better, and this had other benefits. Tests caught cases I had missed, made me start thinking about edge cases further in advance, and made the code easier for others to read.

I think the primary point is made

> Or just write simple, single responsibility code so it can be tested easily

there it is.


This describes my experience with unit testing perfectly.

What I try to do is just write small functions that feel like they would be well suited for a unit test. I find that makes them much easier to read, and reason about. I don’t even bother writing tests.


Unit tests are still underrated in my experience. It pains me when I see developers testing their functionality end to end, manually: they discover a formatting error in their input, then do the whole thing again. When I was working in the office I could see that easily 90% of a lot of developers' time was spent like this. Then I showed that a unit test can discover this, but people were still in denial, saying it only helped with this one small error. But software development is largely about nitty-gritty details, and unit testing handles them.


> It pains me when I see developers testing their functionality end to end, manually

On the other hand I know developers that only unit test. Hundreds of lines of mock code, dozens of tests. No documentation, no sample configuration for the final application and dozens of data races because the unit tests don't cover multi threading. But hey an unusable buggy mess is a net positive as long as it makes the test coverage statistic happy and that is one of those things reported to management.

While that is a bit exaggerated, one of the downsides of unit tests is that they quickly become part of a metric without regard for when they add value.


> Hundreds of lines of mock code

Mocking should be kept to a minimum and ideally avoided.

Mocking is most useful when adding tests to legacy code, but new code should be designed so it can be tested without the need for mocking.


Mocking your own code should be avoided if possible, but that does not apply to external resources like databases, HTTP responses, etc.


I don't think it's ever a good idea to mock a database. Invest in a test framework that can performantly spin up a fresh database instance and throw it away at the end of the test run instead.

Mocking outbound HTTP calls is worthwhile, but thankfully most testing frameworks I've used (at least in Python world) make that pretty easy.


I mock the database by using an SQLite in-memory database. It is not 100% compatible with my production database but still good enough for most of my tests. This way I have a fresh database for each unit test and the tests are still fast.
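
Roughly what that looks like with Python's standard `sqlite3` module. The `users` table and the helper functions are hypothetical, only there to show the fresh-database-per-test pattern.

```python
import sqlite3


def make_test_db():
    """Create a brand-new in-memory database with the (hypothetical) schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    return conn


def add_user(conn, name):
    cursor = conn.execute("INSERT INTO users (name) VALUES (?)", (name,))
    conn.commit()
    return cursor.lastrowid


def count_users(conn):
    return conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]


def test_add_user_persists_row():
    conn = make_test_db()
    try:
        add_user(conn, "ada")
        assert count_users(conn) == 1
    finally:
        conn.close()  # the in-memory database disappears with the connection


def test_each_test_starts_empty():
    conn = make_test_db()
    try:
        assert count_users(conn) == 0  # nothing leaks from the previous test
    finally:
        conn.close()
```

The caveat from the comment still applies: SQLite's dialect and type handling differ from most production databases, so this checks query logic, not compatibility.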


Also can be pretty handy when there's hardware that needs something very specific to happen to work properly. Mocks can give you (some) confidence that when you poke the I2C bus, you won't hang it up.

But it's very easy to overdo it and end up testing irrelevant implementation details instead.


Hence why you need automated integration tests, where possible. Especially for regressions.

They tend to have much higher value than pure unit tests.


I agree that automated integration tests typically offer higher ROI than equivalent unit tests written with mocks, because mock code is so difficult to write and fragile.

I disagree with your assertion that integration tests offer higher value than unit tests in general. Don't judge all unit tests by mock code.


I will counter that integration tests tend to have a much bigger footprint: one test covers a lot more ground and so per line of code will catch more things.

But, the failures from integration tests are often a lot harder to read. Even half-good unit tests tend to show exactly where the error is, and make the fix cycle much shorter. It is also much harder to write exhaustive integration tests for all of the inputs that might happen.

As a long-term QA-ish person (I usually write testing systems), I see real value in having a mixture of system, integration (multiple systems), and unit tests, as well as end-to-end manual tests. Anything big enough needs all of these.


If errors from your integration tests are hard to understand, that just means you need to improve the debugging information emitted by your software such that your developer can debug it properly using it, no?


This is the point. Both are tools. They're both needed.


TBH mock code just sucks to write, no matter how good your unit tests are.


That is one of the big benefits I have found from writing code while planning to write unit tests for it (even if you don't write all of them). You will generally avoid patterns that require a lot of mocks to test, which generally makes that code easier to refactor and reason about. If you are able to make big chunks of your code into "pure" functions, they are now really easy to test.


Also agree.

I've seen code so mocked it was dubious that it could be testing anything, accurately at least.


A classic approach is that if you have to make some complicated API call with a bunch of arguments, you can put all the logic wrangling the arguments into a pure helper function and unit test the helper.
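
A sketch of that approach, with a made-up search API whose parameter names (`q`, `offset`, `limit`, `tags`) are assumptions for illustration. All the fiddly logic lives in a pure function, so the test needs no mock and no network.

```python
def build_search_params(query, page=1, page_size=20, tags=()):
    """Pure helper: turn caller-friendly arguments into request parameters."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    params = {
        "q": query.strip(),
        "offset": (page - 1) * page_size,
        "limit": page_size,
    }
    if tags:
        # Sort so the same set of tags always produces the same request.
        params["tags"] = ",".join(sorted(tags))
    return params


def test_build_search_params():
    params = build_search_params(" cats ", page=3, page_size=10, tags=["b", "a"])
    assert params == {"q": "cats", "offset": 20, "limit": 10, "tags": "a,b"}


def test_tags_omitted_when_empty():
    assert "tags" not in build_search_params("cats")
```

The code that actually sends the request shrinks to a thin wrapper around the client call, which is the part better left to an integration test.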


Agree.

I use strategy pattern with Java functional interfaces to avoid using magic mocking frameworks. That way, my mocks are just trivial impls of the plugin parts.


Setting up sample data is the biggest pain for mocking in my experience. But it's true however you're testing.


How do you accomplish that? I often end up in a situation where mocks are needed (like in the article).


Often times, you will have chunks of code that are either creating input for what would go into an interface/mock, or handling the output of it. You could pull those bits, even if they are relatively simple (maybe only a few lines, and only used in that one spot), and extract those out to a pure function to sidestep the mock.

Sometimes you can't really avoid it. In those cases, I would judge how much of the effort is testing your code, and how much is testing your ability to write mocks. Unless it is a really important functionality, or something complex enough that it would be easy to get wrong, I wouldn't bother with most tests that need extensive mocking. Instead handle those later with integration / e2e tests.


> and that is one of those things reported to management.

Yeah unit testing as I see it is just optimising for the success of middle management, at the expense of both the organisation's goals, and the developers.

You are making predictability the only focus, because that's the metric which middle management is evaluated by. At the expense of performance, end product quality, and work satisfaction. "It was done in exactly the excruciatingly slow and mind numbing boring way, and was exactly as shitty as designed" = success for management.

If you would try to increase performance, end product usability, and/or work satisfaction, you would have to put the predictability at risk, which means that the success of the middle manager will not be the highest priority at all cost anymore.


yep. I've worked at places with unit tests/e2e/and static typing. And I've worked at places with zero unit tests/no e2e/dynamic typing.

The push for more testing and static typing (to a degree) always came from managers. They trade momentum for predictability.

The reality in both situations is that bugs happen. I've not noticed a difference in frequency of "critical" bugs in either scenario. The only difference is that at the no unit test/no e2e places, getting your code into production was trivial. Which always meant when bugs happened, they were fixed much faster than at the unit-test-everything place. We're talking minutes vs. days here. Because at a risk-averse business (that is, any place with more than a few hundred people), you have multiple stakeholders that have to sign off if you want to go take a piss in the bathroom.


While I totally agree that process can go nuts and get in the way, I will disagree that testing always makes things slower. Once you are past the prototyping phase and get into the actual production phase (some projects never make this turn), some testing infrastructure can absolutely make things faster. Rather than everyone having to deal with all of the bugs that fellow developers are just dumping into the main branch, tests can help weed out some of the bugs before others have to deal with them.


Your comment is downvoted perhaps because it sounds so cynical, but after thinking about it I agree to a certain extent. Throughout my career I've noticed the biggest pushers for more unit testing are usually middle managers who love to see that coverage number go up since it's an observable metric, but nuanced opinions like "more unit tests are not always good and can be detrimental" are generally not welcome.


> Hundreds of lines of mock code

No wonder they are not getting value out of unit tests.

https://blog.metaobject.com/2014/05/why-i-don-mock.html


Many developers seem to lack a focus on automation in the most basic sense of the word as well. While consulting, I see lots of backend developers constantly making a change, waiting for the server to reload with the new changes, switching to Postman/cURL/whatever, firing off their request and seeing if it's accurate, rather than just having the request handler as a function that is under test with assertions, and having the test re-run on each change. So much of a developer's day is spent on just repetition, and they don't seem to care.
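
A minimal sketch of "the request handler as a function under test". The handler name, status codes, and payload shape are hypothetical and not tied to any particular web framework.

```python
import json


def handle_create_user(body):
    """Hypothetical handler: takes the raw request body, returns (status, payload)."""
    try:
        data = json.loads(body)
    except json.JSONDecodeError:
        return 400, {"error": "invalid JSON"}
    if not isinstance(data, dict):
        return 400, {"error": "expected a JSON object"}
    name = data.get("name")
    if not isinstance(name, str) or not name.strip():
        return 422, {"error": "name is required"}
    return 201, {"name": name.strip()}


def test_creates_user():
    assert handle_create_user('{"name": " Ada "}') == (201, {"name": "Ada"})


def test_rejects_malformed_input():
    # The formatting mistakes people otherwise find by hand, one request at a time.
    assert handle_create_user("{oops")[0] == 400
    assert handle_create_user("[]")[0] == 400
    assert handle_create_user("{}")[0] == 422
```

With a test runner in watch mode, every save re-checks all of these cases without a server reload or a hand-built request.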


Plenty of time to check Reddit while waiting for the server to reload.


> Then I showed that a unit test can discover this

Yeah but the catch is that in order for your code to be "testable" you have to rewrite it completely in a way that comes with large sacrifices, making it much more confusing, abstract, larger and much harder to work with, in every way except testing it.


My experience is the exact opposite of this. Modular code that has well defined boundaries and responsibilities can be mocked and tested in isolation. It can also be reasoned about in isolation, reducing complexity. New functionality can be assigned to the proper module because each module has a well defined purpose.


I think the focus that everything should be testable is a wrong one, but almost every codebase has functionality that can be easily unit tested without major refactors. For everything else, integration tests can already help a lot.


Ideally, everything should be testable or replaceable. The non-testable parts of your code have network effects on the testability of the other parts that depend on them. This is where dependency injection (e.g. via a constructor or function parameters) really shines.

I totally agree that if you have to choose (and usually you do, unless you have unlimited time and funding), choose to implement end to end integration tests first.


The simple verification aspect of unit tests makes writing code easier and more purposeful in my opinion. At the most basic level it's kind of like a small informal proof. When the code base gets complex and larger in size this is invaluable, if the unit tests are written correctly. Tests have to make sense and be written well just like any other piece of code.

It also helps with thinking about test cases and logic before writing code. Before I used exhaustive unit tests I would just start writing code right away, and often this would result in constant refactoring as the code evolved or during debugging. But unit tests encouraged me to scaffold out abstract functions and write tests before implementing the code. It got me thinking about the logic and workflows first; implementing the functions then meant getting the tests to pass as verification.


So if you have an existing, functional code base, I totally agree. Rewrites in general are rarely worth doing. If you have a new code base, it can be incredibly valuable to design the code such that externalities can be injected. Usually this means being able to swap out file system, RNG, system time, and networking layers. If your code uses a particular hardware platform, it can be useful to put the platform specific code behind an abstraction layer.

Not only does it make your code more unit testable, it makes your code much more ad hoc testable. Again, I would not refactor a large, non-disposable application just to achieve this.
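
A sketch of injecting two of those externalities, the clock and the RNG, as plain function parameters. The `greeting` and `roll_dice` functions are made up for the example.

```python
import random
from datetime import datetime, timezone


def greeting(clock=None):
    """Pick a greeting from the current hour; the clock can be swapped out."""
    now = clock() if clock is not None else datetime.now(timezone.utc)
    return "Good morning" if now.hour < 12 else "Good afternoon"


def roll_dice(rng=None, count=2):
    """Roll dice using an injectable random source."""
    rng = rng if rng is not None else random.Random()
    return [rng.randint(1, 6) for _ in range(count)]


def test_greeting_with_fixed_clock():
    morning = lambda: datetime(2022, 4, 7, 9, 0, tzinfo=timezone.utc)
    evening = lambda: datetime(2022, 4, 7, 18, 0, tzinfo=timezone.utc)
    assert greeting(morning) == "Good morning"
    assert greeting(evening) == "Good afternoon"


def test_dice_are_reproducible_with_seed():
    # Same seed, same rolls: the test is deterministic without mocking.
    assert roll_dice(random.Random(42)) == roll_dice(random.Random(42))
    assert all(1 <= value <= 6 for value in roll_dice(random.Random(7), count=50))
```

The same parameters also help ad hoc: you can reproduce a bug report by replaying its timestamp or seed.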


Imho one of the benefits of unit testing is precisely that you have to write/refactor your code to make it more testable and in the process you uncover and fix flaws in lacking abstraction and isolation.


If you write code that is not testable, how do you test it? What you propose is false dichotomy really. Code that isn't testable is almost always confusing, large, hard to work with etc.


I once wrote my third game in Unity with unit test coverage of about 80%. It was the least buggy game of my entire career. It was also the hardest time, as I had to start thinking more about creating testable components rather than the messy spaghetti code I used to write before.


This is one of the best "side-effects" of testing, the reasoning being that testable code is good code (as in modular, cohesive, etc...)


Games are likely to benefit significantly from unit tests: architecture is complex and likely to be improved by adding clean boundaries and layers for the sake of testability; supporting automation is likely to enable useful features (e.g. deterministic simulations); tests are likely to be meaningful and important.


Have you developed games at all, or are you just speaking from an armchair?


As a gamedev who writes unit tests, I agree with GP. Games benefit from unit tests. That, however, doesn't change the systemic incentives that prevent tests from being written. The way gamedev is budgeted/funded is a more important factor than more tests = a more stable/less buggy game.


What engine do you use that supports a unit test framework?


Since testing in game development is very much underrated, it would be interesting to hear more about this. Was this just as a hobby or is the game available somewhere? And have you ever done a demo/talk about how you've approached unit testing with Unity?


not unit testing but a good talk on one team's testing

https://m.youtube.com/watch?v=m5zrfTFKf_E


I’ve found that while I did spend a lot of time at first figuring out how to write testable code, I now am so much better at writing in this way that I am much more productive since everything has great test coverage. Saves me going back to modules I wrote 2 years ago to add features.


Property based tests have the benefit of being fun to actually think about (as far as testing goes) and also often expose more bugs than unit tests (which are usually more like sanity checks than an exploration of the boundary where your functions break or start behaving weirdly).
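
A hand-rolled sketch of the idea using only the standard `random` module. Dedicated libraries (Hypothesis, in Python) additionally shrink failing inputs to a minimal example; the run-length encoder here is just a hypothetical function to test.

```python
import random


def rle_encode(text):
    """Run-length encode a string into (character, count) pairs."""
    runs = []
    for ch in text:
        if runs and runs[-1][0] == ch:
            runs[-1] = (ch, runs[-1][1] + 1)
        else:
            runs.append((ch, 1))
    return runs


def rle_decode(runs):
    return "".join(ch * count for ch, count in runs)


def check_properties(trials=500, seed=0):
    """Check invariants on many generated inputs instead of a few examples."""
    rng = random.Random(seed)  # seeded so a failure can be reproduced
    for _ in range(trials):
        # A two-letter alphabet makes long runs (the interesting case) likely.
        text = "".join(rng.choice("ab") for _ in range(rng.randint(0, 12)))
        runs = rle_encode(text)
        # Property 1: decoding an encoding gives back the original input.
        assert rle_decode(runs) == text, repr(text)
        # Property 2: neighbouring runs never share a character.
        assert all(a[0] != b[0] for a, b in zip(runs, runs[1:])), repr(text)
    return trials
```

Nothing here says what the output should be for any specific input; the test states what must hold for every input and lets the generator hunt for a counterexample.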


Property based tests have caught 10-100x as many buggy edge cases as the handwritten unit tests I could have come up with. The material gains of getting a combinatorial explosion of test cases from property composition are extremely underrated.

However, there seems to be a general skepticism around property based tests in my experience. Any idea where it stems from?


IMO, you have to code in a certain way to make them ergonomic or even worthwhile - which is also true for example tests. They have some overlap with example tests but have a different advantage. Example tests document function/interface usage and typical or historical errors/exceptions. Property based tests are more about data integrity and invariants.


It requires more mathematical / analytical thinking and code IME.


It pains me when I see developers assiduously write unit tests for every line of their code and every single bug in the code ends up encoded in those unit tests so that they fail when the bug is fixed.

Integration tests need to be the default and people need to learn when to use one or the other.


Are these bugs in the sense that the code's plain wrong or that the developer misunderstood the spec? Good unit tests (when written alongside the code they're testing) should go a long way towards preventing the former.


The latter. Also misunderstandings about APIs. IME these combined are usually something like 70% of all bugs so the ROI on unit tests is not particularly high.

I find integration tests paired with BDD do a good job of catching "code plain wrong" bugs as well as misused APIs and misunderstood specs. They're harder to build and run slower but the ROI is still higher.


Because that seems like they're doing things backwards...

You should know what result you want before you even start writing the code.


>You should know what result you want before you even start writing the code.

In theory, but IME about 50% of bugs end up being specification bugs at the high level anyway.

Once you drill down and start writing unit tests on lower levels to imitate higher level APIs that % only goes up.


Unit tests are extremely boring for people who don't particularly like programming but do it anyway as a means to an end. It's just another piece of code you are writing instead of releasing the dopamine of the finished product. It's anticlimactic.

Writing the tests first and then trying to satisfy them (test driven development) can work to an extent, as it divides a large task into small pieces where you get a prize each time, but it also alienates the developer from the larger picture, diminishing their value to the project since they can no longer put their intellectual output into it.


In my experience, any complex and widely used library is dead in the water without unit tests - if you make a change, you WILL break some use case, ending up in an endless cycle of fixing your changes based on user feedback, and souring your users on your library, who will refuse to update.

If you don't believe this, just write some (good) unit tests, then make the change - it's very very likely they will catch some weird corner case you've forgotten even existed.


I don't doubt the usefulness of unit tests.


I find it makes programming so much more enjoyable both in an immediate gratification and long term way.

More green checks = more dopamine. Higher quality code = long term satisfaction.


But there's nothing long term about unit tests, they only verify a particular version of the code and must be tweaked whenever that code is changed to return different results. The "happy green checks" that really help in the long term are comprehensive type checks, reflecting a reasonably stable underlying design.


I don't understand.

Code does things, tests verify that it does the things I thought it did. Getting a bit of validation makes me feel good while working.

If it now needs to do completely different things then it needs different tests.

Also, if you are making huge changes to tests whenever you change a single detail, that's an issue with your code, not testing in general.

Type checks are good too. Using a typed language is even better.


This reads to me as means/ends confusion. If you write your tests afterwards, to reify the implementation of the system, this is true. They may then require radical changes when the implementation changes. On the other hand, if you write your tests first to reflect business rules and requirements, then those tests will change only when the business rules and requirements change. (Which they will, and that's OK.)

I'm not an all-TDD-all-the-time type by any means. You kind of have to at least have something to start writing tests against or it'll just be compiler errors all day long. But for the projects where I've most consistently found success in building correct systems, writing tests as soon as the basic interfaces are in place has resulted in more flexible, more reliable tests that last over the long term.

As for type checks...isn't that what a compiler is for?


Don't you think that it also alienates you from the projects? Green checks are green checks anyway, you can no longer put your input into the project and the best strategy for you becomes the minimal viable effort to earn a green check.


> minimal viable effort to earn a green check

That's literally the thesis of the Kent Beck book.

I'm not some idiot copy and pasting `expect(true).toEqual(true)`

I add functionality one test at a time. I get satisfaction of knowing it works as specified in the form of an increasing count of green checks.

I don't understand why it would alienate me.


Personally, I hate any system designed to give me lots of small bits of validation. I don't want to see green check marks, just show me any errors that I need to think about.


That's literally the same thing. I write a failing test, I fix it. Those are the errors to think about and the green check marks when I don't need to think about them.


It's the opposite. One is motivation from being completionist all-green checkmarks. Where checkmarks are good. I only want to see/think about errors. I don't care if it passed 100 tests or a 1000 tests successfully.

Edit: I get that logically they are the same, but that's true of a half-empty/half-full glass. Psychologically they are different.


Most test frameworks only really show you the failing tests in any detail as the top level results.

As you say, you don't care that there are 20k passing tests. You just care that there is 1 failure.


Unit testing is a tool, you don't need a 100% test coverage. But sometimes it's the right tool for the job. Sometimes it's not. If I have a piece of code... usually an algorithm that handles a handful of different use cases, and making a change might unexpectedly cause an issue to another use case, i'll unit test the heck out of it. If I have some code where it has a simple input, and output but it takes some time to test it, i'll write a unit test to make development faster.

But i'm not going to add an interface to every concrete class in my project, and design literally every component so I can mock it.

Writing software is a business, i can sprinkle TDD around for a high ROI. If I use it EVERYWHERE the ROI is very little if not negative.


Very much agree with "sprinkle TDD around for a high ROI" This requires both developer expertise and trust from stakeholders.

But, VeryBigHugeBankCo requires me to have 70% unit-test coverage for 'new' code, as measured by Sonar or whatever, or else I am simply not able to deploy my change to PROD.

So here I am refactoring a DAO, using strategy pattern and functional interfaces, so that I can cover my DB interactions adequately for that 70% metric.

No, I can't just use H2 DB, as the SQL is Oracle-specific.

Yes, there is very little actual value in the new refactor, except to check the box.

This is how we do...


Oh and don't get me started about how, especially when using mocking frameworks, the test code is sometimes wayyyy harder to understand than the actual code under test. NOT A WIN.

I have told mgmt that test code is equally likely to have bugs as production code, but, again, this is just box-checking, not actual attention to detail.


It gets worse than that. Often, code is pushed into an absurdly wrong level of the development stack, just because it makes testing it easier. For example, something should be a database trigger, but the in-memory testing database gets in the way. Other times it strikes all the way to architectural topics: message queues replaced by REST calls, because the former are difficult to mock.


Yup. I open a 'test-mode-only' socket on our ingestion piece, for integration testing, as I can't easily get onto our MQ (even non-prod).


> in-memory testing database gets in the way

Why would using an in-memory database prevent use of triggers?


This seems particularly bad scenario to be in, as being forced to write more lines of code increases any attack / bug surface (?)


Some time ago I stopped debating the definition of "unit" in testing, and instead started focusing on whether my tests were fast, reliable, and provided a useful signal about the health of the system when they failed. I've been much happier since then.


This is what Working Effectively with Legacy Code says:

Here are qualities of good unit tests:

  1. They run fast.

  2. They help us localize problems.

A unit test that takes 1/10th of a second to run is a slow unit test.


Nice joke, it takes longer to upload firmware with the unit test suite than that.

The test that is too short is most likely not being thorough enough, or the functionality it tests is so simplistic the test likely brings no value.


What about integration or end to end tests?


That's surprising to me, because the author of that book recently re-advertised his long post about a hard-line definition of a "unit" test

https://www.artima.com/weblogs/viewpost.jsp?thread=126923

I think the properties you list are good ones, but if you drop the word unit from your post the advice is just as good and you can't get mired in ontological discussions


This is the sticking point I've found. The definition of "unit" is different depending on who I talk to.


> Although the design above is perfectly valid in terms of OOP, neither of these classes are actually unit-testable. Because LocationProvider depends on its own instance of HttpClient and SolarCalculator in turn depends on LocationProvider, it’s impossible to isolate the business logic that may be contained within methods of these classes.

The design of those classes is bad, and the perceived need to reach for mocks (or interfaces) is telling you that.

The next step the author takes is to go for interfaces. That's not what I'd do. Instead, I'd decouple network from calculation with intermediate dumb data structures. The network side produces the data structure and the calculation side consumes it. The network side can further be broken down with a transformation of the API call result into the dumb data structure used for calculations.

This idea comes from an old Bob Martin presentation [1]. The whole thing is worth watching to put it into context.

This is the kind of thing that never seems to get discussed in these pieces praising integration tests over unit tests. For even better results, ditch the classes altogether and just use functions. It's quite surprising how badly designs come off the rails with the "everything is a class" model.

[1] https://www.youtube.com/watch?v=WpkDN78P884
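
A minimal sketch of that split in Python, under stated assumptions: the `Location` record, the `lat`/`lon` field names, the placeholder URL and the crude solar-noon estimate (four minutes per degree of longitude, ignoring the equation of time) are all hypothetical and only there for illustration. The point is that the calculation consumes a dumb data structure and never sees the network.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Location:
    """Dumb data structure passed from the network side to the calculation side."""
    latitude: float
    longitude: float


def parse_location(payload: str) -> Location:
    """Transform a raw API response body into a Location (field names assumed)."""
    data = json.loads(payload)
    return Location(latitude=float(data["lat"]), longitude=float(data["lon"]))


def approximate_solar_noon_utc(location: Location) -> float:
    """Pure calculation: 12:00 UTC shifted by one hour per 15 degrees of longitude."""
    return (12.0 - location.longitude / 15.0) % 24.0


def fetch_location(http_get) -> Location:
    """The only impure step; http_get is any callable that returns a response body."""
    # Placeholder URL, not a real service.
    return parse_location(http_get("https://example.invalid/location"))
```

With this shape, `parse_location` and `approximate_solar_noon_utc` can be tested with plain values, and `fetch_location` is thin enough that a lambda returning a canned string stands in for the HTTP client.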


Which in turn comes from Bertrand Meyer, who probably stole it from somebody else in 1990. Code should act or decide, but not both.


This!


I stopped reading right here:

"Considering the factors mentioned above, we can reason that unit tests are only useful to verify pure business logic inside of a given function."

That just isn't true and it makes the rest of the blogpost also not true.

A unit test should test "a unit of functionality" not just a method or a class. Your unit tests also shouldn't be coupled to the implementation of your unit of functionality. If you are making classes or methods public because you want to unit test them, you're doing it wrong.

The exception is maybe those tests you are writing while you're doing the coding. But you don't have to keep them around as they are.


> I stopped reading right here:

Me too. The author shows a deep misunderstanding and even obliviousness to testing, particularly from a practical point of view, that takes any credibility from any argument listed in the article.

It's even perplexing how the best example the author could come up with was this absurd chain of strawmen that a) it's impossible to test code that uses instances of HttpClient, b) dependencies used in dependency injection "serve no practical purpose other than making unit testing possible."

There are plenty of people talking about unit tests. This article is not one that justifies a click.


That person never wrote portable code. I wouldn't envy anyone maintaining or refactoring it to support more platforms.

The main reason to have dependency injection is to get properly runtime-switchable code with less shared state. That it makes testing easier is a side effect.


> If you are making classes or methods public because you want to unit test them, you're doing it wrong.

I've heard this a lot, but how do I test private methods, then? If my public method just calls 8 private methods (which each call out to a bunch of other private methods), how do I test to make sure all those private methods do what I want them to do, so that I know which one is broken when my public method breaks?


That your public function calls private functions is an implementation detail. It does not matter, test the public function. Subsequently that tests the private functions for all relevant input.

There was recently a submission here on HN that talked about this and different perspectives on that topic. It's not a universally shared opinion, surprisingly.


This seems obviously correct to me and it's still not appreciated by many developer teams.

Then there's the world of behavior / property / invariant -based testing which slims down your tests to essentially data generation and testing observable behavior which seems like magic to people still.


> how to I test to make sure all those private methods do what I want them to do so that I know which one is broken when my public method breaks.

You don’t. You test only the public API. This way, you can refactor the public method to your heart’s content without any tests breaking.


I just try to be practical and make them protected so they can be called from the test class which lives in the same package. Perhaps adding a @VisibleOnlyForTesting annotation.

The reality is that these private functions are building blocks that can be easier to test. If you only test the public methods, testing gets a lot more challenging and the test classes get a lot more complicated. You end up creating more mocks or other doubles and bending over backwards.


> how to I test to make sure all those private methods do what I want them to do so that I know which one is broken when my public method breaks.

Others have pointed out that you test only public methods. The thing I'd add to that: Having large classes is a code smell. If your class is big, it is probably several classes in one. Break it up accordingly. Some of your private methods in the big class will become public methods in the small class.


IMHO the purpose of unit tests is to ensure the code is correct and allow you to safely refactor and introduce new functionality without breaking existing behavior. The purpose is not to pinpoint the exact line where the bug is, if a test fails.

What you describe is white-box testing, where the test is coupled to implementation details of a class. This gives you more fine-grained reporting in case of an error but it makes it impossible to refactor the code safely. So I don't think that is a worthy trade-off.

If you often have hard-to-locate bugs in particular large component, the solution is probably to refactor into smaller more loosely-coupled components with well-defined interfaces.


That's a sign that you're burying too much functionality to make testing reasonable. It's a sign that you should start splitting that out into more generic functions.

Scala is great about making these functions generic. Java has an issue with this, where lots of things are easy to bury but hard to reason about when pulling them out.


As others have said: you need to think about your functionality as a public API. How the API works under the hood shouldn't affect your tests.

So how do you test private methods? You don't, with the exception of testing during development, but those tests aren't supposed to be kept around.


You're expecting too much detail from your tests. As a consequence, you're probably over-specifying them as well. Test through your public interface.


> "Considering the factors mentioned above, we can reason that unit tests are only useful to verify pure business logic inside of a given function."

> That just isn't true and it makes the rest of the blogpost also not true.

It's not? Writing unit test is all about minimizing side effects and interaction with other systems.


> It's not? Writing unit test is all about minimizing side effects and interaction with other systems.

Not really. Writing unit tests is all about verifying behavior from combinations of inputs, and side effects are inputs as well.

You can write unit tests that inject delays and timeouts and retries and throw exceptions under specific circumstances.
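
As a rough sketch of what that can look like in Python (the `call_with_retries` helper and the `FlakyOperation` test double are hypothetical names, not from any library): the failure and the sleep are both injected, so the test controls the timeouts and never actually waits.

```python
import time


def call_with_retries(operation, attempts=3, sleep=time.sleep, delay=0.5):
    """Call operation until it succeeds or the attempts run out.

    Only TimeoutError is retried; sleep is injectable so tests never wait.
    """
    last_error = None
    for attempt in range(attempts):
        try:
            return operation()
        except TimeoutError as error:
            last_error = error
            # Back off between attempts, but not after the final one.
            if attempt < attempts - 1:
                sleep(delay)
    raise last_error


class FlakyOperation:
    """Test double that times out a fixed number of times, then succeeds."""

    def __init__(self, failures):
        self.failures = failures
        self.calls = 0

    def __call__(self):
        self.calls += 1
        if self.calls <= self.failures:
            raise TimeoutError("simulated timeout")
        return "ok"
```

A test then passes `FlakyOperation(failures=2)` plus a list's `append` as the `sleep` argument and asserts on the return value, the number of calls and the recorded delays.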


> Not really. Writing unit tests is all about verifying behavior from combinations of inputs, and side effects are inputs as well.

Well, except it's not. If you use a side effect, you have to account for it in some way via mocks or whatever.


> stopped reading

> it makes the rest of the blogpost also not true

what do you know of the rest of the blogpost if you stopped reading?


Wild speculation with a sprinkle of logic (garbage assumptions lead to garbage conclusions)

HOWEVER, I have skimmed through it by now and the last paragraph is actually quite good. It takes a while to get there and the chosen examples aren't great, though.


It reeks of the same argument that DHH made about "I hate unit tests because I'm testing getters and setters" (from DHH's point of view, yes, it's tedious and unit tests on getters/setters are low signal, but that's a defect in the language, not in testing).


As a predominantly fullstack web developer, I will only ever voluntarily write unit tests and end to end tests, but avoid in-code integration tests in most circumstances.

This is frankly the easiest and best bang for buck.

Unit tests force you to make your code more flexible, they're fast to write, fast to run. Maintenance depends on how overboard you go with your verifications - try to verify the most important parts.

End to end tests require no elaborate scaffolding of internals and allow you to test it as a real user interacts with it. Generally fast to make, slow to run, and maintenance depends on how disciplined you are with stable identifiers/interaction points.

As for in-code integration tests - I love the idea of them, but they're absolutely miserable 9 times out of 10 due to extremely convoluted processes to "bring up" parts of your application. If you use DI it shouldn't be as bad, but almost nobody does and it becomes a total clusterfuck not worth the maintenance burden.

As an idealist, I want all 3. But most codebases frankly cannot support all 3, so the most value will come from whatever is easiest. I've found that's most often unit and e2e.


The article actually argues the opposite. Developers should move their focus to integration / "real-world" tests. The major summary bullet point being:

"Aim for the highest level of integration while maintaining reasonable speed and cost"

My experience mirrors the author's. In any "real" business application, the unit tests end up mocking so many dependencies that changes become a chore, in many cases causing colleagues to skip certain obvious refactors because the thought of updating 300 unit tests is out of the question. I've found much better success testing at the integration level. And to be clear, this means writing tests inside the same project that run against a database. They should run as part of your build, both locally and in CI. The holy grail is probably writing all your business logic inside pure functions, and then unit testing those, while integration testing the outer layers for happy and error paths. But good luck trying to get your coworkers to think in pure functions.
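
A small sketch of that "pure core, integration-tested shell" split in Python. Everything here is an assumption made for illustration: the `orders` table, the function names, and the use of an in-memory SQLite database as a stand-in for whatever database the project really runs against.

```python
import sqlite3


def order_total(unit_price_cents, quantity, discount_percent=0):
    """Pure business rule: no I/O, so a plain unit test covers it."""
    gross = unit_price_cents * quantity
    return gross - gross * discount_percent // 100


def create_schema(conn):
    """Create the (hypothetical) orders table."""
    conn.execute(
        "CREATE TABLE orders ("
        "id INTEGER PRIMARY KEY, "
        "customer TEXT NOT NULL, "
        "total_cents INTEGER NOT NULL)"
    )


def save_order(conn, customer, unit_price_cents, quantity, discount_percent=0):
    """Outer layer: computes the total with the pure rule and persists it."""
    total = order_total(unit_price_cents, quantity, discount_percent)
    cursor = conn.execute(
        "INSERT INTO orders (customer, total_cents) VALUES (?, ?)",
        (customer, total),
    )
    conn.commit()
    return cursor.lastrowid
```

The unit tests call `order_total` with plain numbers. The integration tests open a connection, call `create_schema` and `save_order`, and then read the row back for the happy path, or pass a missing customer and expect the database's constraint error for the error path.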


> The holy grail is probably writing all your business logic inside pure functions, and then unit testing those, while integration testing the outer layers for happy and error paths. But good luck trying to get your coworkers to think in pure functions.

I've come to a similar conclusion. Functions don't necessarily have to be pure in the academical sense, though - but I feel like the more the business logic is decoupled from dependency injection and the less it is relying on some framework, the better.

It makes testing a lot easier, but also code reuse. I've just been writing a one-off migration script where I could simply plug in parts of the core business logic. It would have been very annoying if that was relying on Angular, NestJS or whatever.


I've had the same experience. Suboptimal code isn't refactored because of the test code overhead, or, much worse, the tests on that same subpar code somehow morph into a perceived "gold standard" for how that code should work.

I avoid tests (aside from hands-on end user testing) as much as possible, actually, since they rarely seem to tell you anything you didn't already know.


> in many cases causing colleagues to skip certain obvious refactors because the thought of updating 300 unit tests is out of the question.

Good! They shouldn't do the refactor.

Because "obvious" refactors often introduce bugs (e.g. copy/paste errors), and if developers can't be bothered to write tests to catch them, they're going to screw over the other team members and users who will be forced to deal with their bugs in production.

> The holy grail is probably writing all your business logic inside pure functions, and then unit testing those, while integration testing the outer layers for happy and error paths.

So settle for half a loaf.

Write all the easy unit tests first. The coverage will be very incomplete, but something is better than nothing.

Write all the easy integration tests next.

Never write the hard tests if you can help it.


> Good! They shouldn't do the refactor.

> Because "obvious" refactors often introduce bugs (e.g. copy/paste errors), and if developers can't be bothered to write tests to catch them, they're going to screw over the other team members and users who will be forced to deal with their bugs in production.

In my opinion, useful tests should be able to survive a refactor. That is the only sane way I've ever done refactoring.

If I'm doing a large refactor on a project and there are no tests, or if the tests will not pass after the refactor, the first thing I do is write tests at a level that will pass both before and after refactoring.
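
For illustration, a minimal Python sketch of such a test. The two `slugify` functions are hypothetical stand-ins for the code before and after a refactor; the check only goes through the public behaviour, so the same function is run against both versions.

```python
import re


def legacy_slugify(title):
    """Implementation before the refactor: one hand-written loop."""
    out = []
    for ch in title.lower():
        if ch.isascii() and ch.isalnum():
            out.append(ch)
        elif out and out[-1] != "-":
            out.append("-")
    return "".join(out).strip("-")


def refactored_slugify(title):
    """Implementation after the refactor: a single regular expression."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def check_slugify(slugify):
    """Behavioural test that knows nothing about how slugify is implemented."""
    cases = {
        "Hello, World!": "hello-world",
        "  Leading and trailing  ": "leading-and-trailing",
        "a--b": "a-b",
        "": "",
    }
    for title, expected in cases.items():
        assert slugify(title) == expected, (title, slugify(title))
    return len(cases)
```

Running `check_slugify(legacy_slugify)` before the change and `check_slugify(refactored_slugify)` after it is the whole safety net; nothing in the test has to be rewritten along the way.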

Rewriting tests during refactoring doesn't protect from regression in my experience.


Unit tests which can survive a refactor are a nice-to-have.

I would not rule out a refactor merely because I'd have to refactor some unit tests too. That's just part of the cost benefit analysis.

> Rewriting tests during refactoring doesn't protect from regression on my experience.

Your experience is completely at odds with mine. Every time I change code, there is the possibility for simple errors such as copy/paste mistakes. Trivial, cheap-to-write unit tests have saved me time and again from having to debug something down the line.

Overconfident devs who act as though they're above making such simple mistakes make for bad team members.


I couldn't disagree more.

Tests that will survive a refactor are the most important tests to have.

The other tests are, at best, a false sense of security and often an active detriment that slows down future development. They might sometimes catch actual mistakes, but just as often they fail when nothing is broken, leading to the tests not being trusted and broken tests being updated even when something was actually broken.


> I couldn't disagree more.

*sigh*

I know where you're coming from. This is the classic argument for limiting unit tests to black-box testing of public APIs exclusively, avoiding clear-box testing altogether.

I agree that it's possible to write absolutely wretched fragile clear-box tests. And I agree that if you have a black-box test and a clear-box test which provide equal validation of functionality, the black-box test is superior because it will survive a refactor.

I generally dislike absolutist rules of any kind when it comes to unit testing and prefer to think of things in terms of ROI. Sometimes you can add a lot of value with a clear-box test because the functionality is impossible to write a black-box test for without a ton of extra work and time.

But sometimes you may be in an environment where absolutist rules are the only way to go.


Yeah a refactor that changes how you expect the interfaces to work sounds like much more than a refactor.


>Good! They shouldn't do the refactor. Because "obvious" refactors often introduce bugs.

& if your tests arent catching those bugs and require extra maintenance to go green again you are doing them wrong.


I understand what the article is arguing. I agree with it, but think it's idealistic. If swaths of your code are a mess, integration testing is super painful. You can't easily add it until you clean up the mess, so the other forms of testing are more practical more often in my experience. If you get to a point where your code isn't a mess, I'd agree that you should start introducing meaningful integration tests.

I think this is just one of those cases where there is a context-sensitive strategy to testing. It depends completely on the cleanliness of your code and experience working with it.


Trying to write your code such that it can work with arbitrary data and arbitrary amounts of it is a step towards the holy grail I think.

Then you get to use fuzzer & Arbitrary for basically what is a coverage guided property based test.

But it's hard to maintain that idea at all times when you are writing.


> As for in-code integration tests - I love the idea of them, but they're absolutely miserable 9 times out of 10 due to extremely convoluted processes to "bring up" parts of your application.

I've definitely worked on projects like this (actually, I'm working on one now at $dayjob), but I found that with a little bit of effort you can get this right, and it usually isn't that much effort.

What I often see people do is "oh I need a thingy for this test, and a thaty for this other test" and create it ad-hoc when needed, instead of creating a good convenient API to do all of this, which is then also used in the application itself.

The big advantage of these tests is that they tend to be a lot faster than E2E tests, at least in my experience, and often also easier to reason about once you get it right.


"Overrated" is the right word here. It's not like unit tests are useless, instead, the problem is that everybody uses them much more than it makes sense. And yes, I'm a critic of them since the fashion started at the turn of the century and I also use them more than they make sense.

Unit tests are very cheap, and people love cheap things. The problem is that they also provide almost no value, so their cost and benefit are on similar levels and they can easily get into negative net value.


I disagree. I also feel UTs are a waste of time. I think in some cases UTs can work such as helper classes, but for business logic they are pretty useless and a waste of time. I guess it gets at that verification level DEPENDS ON what is being verified.


Some observations after using many different testing philosophies on many different teams in the past 20 years:

1. Unit tests for code with no side effects (code that takes well defined input and produces well-defined output) are easy to write, easy to understand, and I've never regretted writing them.

2. I've had good experiences transforming as much error-prone logic as possible into code with no side effects.

3. Integration tests that execute in an environment that is similar to production have proven to be incredibly valuable, especially when refactoring. When possible, I prefer to make technology-stack choices that make such an environment quick to setup and teardown such that these tests can run in seconds.

4. Many bugs occur because team A makes bad assumptions about how team B's system behaves, and they encode these assumptions into their (passing) test cases.

5. Unit tests with lots of mocks should be avoided; their cost-to-benefit ratio is terrible. Sometimes the best solution is to delete these tests. Relying on mocks for a few error-case tests that are hard to reproduce on the real system is ok.

6. If they cannot be avoided, fake implementations should be written by the same team that writes the real implementation. They understand how it actually behaves much better than their users, and are in a good place to reuse critical logic from the real implementation in the fake to make it more realistic.

7. If my project includes firmware running on custom hardware, building an emulator for the hardware that can run on standard computing infrastructure is valuable for writing test cases against.

8. More tests or coverage is not always better. We have a finite amount of time to improve our systems, and it is our duty as engineers to ensure that the benefit we get from adding a specific test-case is higher than using that time to make some other improvement to the project.
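
To illustrate point 6, a rough Python sketch with hypothetical names throughout (`validate_key`, `InMemoryStore`, `remember_last_login`): the fake reuses the same key validation the real client would apply, so a consumer's test fails on input the real system would also reject.

```python
def validate_key(key):
    """Validation rule shared by the real client and the fake (assumed rule)."""
    if not key or len(key) > 64 or " " in key:
        raise ValueError(f"invalid key: {key!r}")
    return key


class InMemoryStore:
    """Fake key-value store, shipped by the team that owns the real one."""

    def __init__(self):
        self._items = {}

    def put(self, key, value):
        # Reuses the real validation instead of accepting anything.
        self._items[validate_key(key)] = value

    def get(self, key):
        return self._items.get(validate_key(key))


def remember_last_login(store, user_id, timestamp):
    """Consumer code under test; it only needs something with put/get."""
    store.put(f"last-login:{user_id}", timestamp)
```

A consumer's test constructs `InMemoryStore()`, calls `remember_last_login`, and reads the value back. A user id containing a space raises `ValueError` in the test, instead of passing against a permissive mock and failing later against the real system.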


I forgot who, but someone semi-famous said unit tests should be called "programmer tests". That is, unit tests are for the programmers, not for verification. They are used as a tool by the programmer to:

Tighten up the development loop (manual e2e is often too slow)

Prove to yourself that the code does what you want

Communicate intent to other programmers

Documentation to other programmers

Help programmers debug regressions

I really think this holds up because, really, who except for programmers that can read and comprehend unit tests are going to trust they verify a product works? The more distant a stakeholder is from a project the more holistic the testing needs to be in order for that stakeholder to trust it.


This is exactly right. I write unittests first and foremost to gain confidence this shit won't blow up on me on prod. I also take extra care of them because it's by far the best documentation of the code. To me these are good enough reasons to write extra potentially unnecessary tests.


"Oooo...yeahhhh, ummm...I'm gonna have to go ahead and sort of disagree with you there." - Office Space

UT is not a panacea and won't do your laundry for you, but well-tested code is better than sloppily- or un-tested code. Always will be, IMHO. UT is the core of that, for languages where a unit of functionality can be executed directly. Integration tests are also critical, but IMHO it's foolish to do only one or the other.

"But dependencies ..." yeah that's been solved many times over with mock tools and such, and refactoring to reduce complexity. If your code is too complex to test properly, it's too complex to release. Don't be lazy, do it right.


But you are writing as if a single person were writing the whole project.

I might not be too lazy - but I have to collaborate with 5-10 people.

Well-tested code is also some kind of utopia: no one has seen such code, but everyone claims that once you get there it is all unicorns and roses.

I can write perfect code if I do it alone for my toy project - when I have team of people that has to agree on stuff it is not going to happen.


I really enjoyed this article, and it spoke to a lot of the lived experience I've had in over-abstracted codebases that had tests incredibly coupled to implementation.

I think this title could maybe be rewritten as "Unit Testing that Requires Mocks is Overrated".

Unit testing something that has no (or very simple) dependencies is great. For example:

- some kind of encryption that takes in a string as a key

- serialization that expects the same output as the input passed in

- a transform function that takes in one type and expects a different type out

As soon as you rely on a DB, file system, etc... you're probably better off with an e2e test.

At the end of the day, it comes down to data contracts. This could be the functions a package exposes or the GQL/REST/gRPC/whatever API. That is the most important thing to not break the behavior of. Write good tests that target the external facing data contracts and treat implementation like a black box and you'll be empowered to do the important things that tests should enable like refactoring, reworking abstractions that may have made sense at one point, but no longer do, and let your codebase evolve as you learn.

Tests that are a barrier to reworking internal abstractions are not good tests.


Oh god this resonates so much with what I'm going through right now. I'm on a team that rotates members pretty much every 6 months and have been put in charge of writing unit tests for a series of repositories that I have never seen in my life. I must also add that I had 0 experience with unit testing beyond knowing what the concept is when asked to start testing. Now, I wouldn't be too bothered, because it was a sort of learning experience, but what was supposed to be a two-week task at most has been dragging on for a couple of months because they keep adding APIs and other backends, which further complicates the issue. And of course the code isn't really unit testable, so I have to modify code that I haven't seen before, made by people who aren't on the team anymore, just to make the tests work.

All of this to achieve a very arbitrary 80% coverage that's required by the business on a few REST APIs that aren't even ours!! And don't get me wrong, I get the importance of testing, but the emphasis on unit testing these days seems ridiculous.


> I'm on a team that rotates members pretty much every 6 months and have been put in charge of writing unit tests for a series of repositories that I have never seen in my life. I must also add that I had 0 experience with unit testing beyond knowing what the concept is when asked to start testing.

I don't think the concept of "unit testing" is the main problem here.


Oh, it certainly isn't. I have always been used to learning on the run, and with proper support it wouldn't be such a hassle, but I believe the emphasis they put on unit testing doesn't really help either.


> I'm on a team that rotates members pretty much every 6 months

What kind of software do you make?

Where I am, I was still very much learning after 6 months, and it wasn't until a year or so in that I became really effective.


It's mostly line of business software and yeah, it took me a year before I got to know the ins and outs of the system so you can absolutely imagine the clusterfuck it is when user acceptance and deployments start coming in


Hot take.

Unit testing is not overrated. If you feel this way, you likely just suck at optimising your suite for high-ROI tests, or you chase some stupid metric like code coverage and get mad when you find out you wasted a bunch of time writing worthless tests.

https://youtu.be/z9quxZsLcfo


There's nothing that cements the value of unit tests more, for me, than surfacing bugs in new code almost immediately that would otherwise need debugging "in situ" in the application.

Figuring out where your bug is is so much harder when you have to do it through the lens of other code, whereas in your unit test, you can just see "oh, I'm returning foo.X, not foo.Y".

It also makes sure you actually can construct your object at all without dragging in a dependency on the entire system. Code without tests tends to accrete things that can only be set up in a very long and complex process. This is both hard to reason about (because your system can only be thought about as an ensemble) and fragile when the system changes: you now have to unpick the dependencies to allow your "SendEmail" function to not somehow depend on an interface to some SPI hardware!

But there's certainly value in not spending hours testing obvious code to death: a getter is almost certainly fine, and even if it isn't, the first "real" test that uses it will fail if you're returning the wrong thing. But if, down the line, you do find a bug in it, then something was probably not that obvious!
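
As a rough sketch of the "SendEmail" point: if the function only asks for the one thing it needs, a test can construct it without the rest of the system. `Transport`, `RecordingTransport` and `send_email` are hypothetical names invented for this example:

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Protocol


class Transport(Protocol):
    # Hypothetical interface: the only thing send_email depends on.
    def deliver(self, recipient: str, body: str) -> None: ...


@dataclass
class RecordingTransport:
    """Test double that remembers what it was asked to deliver."""

    sent: list[tuple[str, str]] = field(default_factory=list)

    def deliver(self, recipient: str, body: str) -> None:
        self.sent.append((recipient, body))


def send_email(transport: Transport, recipient: str, body: str) -> None:
    """Check the address, then hand the message to whatever transport was passed in."""
    if "@" not in recipient:
        raise ValueError(f"not an email address: {recipient!r}")
    transport.deliver(recipient, body)


def test_send_email_needs_nothing_but_a_transport():
    transport = RecordingTransport()
    send_email(transport, "ada@example.com", "hello")
    assert transport.sent == [("ada@example.com", "hello")]
```

If writing that test had required standing up an SPI bus first, the test itself would have been the early warning.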


Nothing cements the uselessness of unit tests more, for me, than not surfacing bugs in new code and having wasted all that time preemptively debugging code that works.


Testing isn't only about bugs. Suggesting that's the only benefit is a strawman. Testing forces you to design better code. Sure you can mock the crap out of things and still create trash but if you think critically it usually affords making more robust code.


Then don't write tests for that code. Writing tests for trivial stuff is pointless. Write tests for code that could actually be wrong, or where you need to prove to yourself or others that the invariant is maintained. Critically, where it's important that the invariant stays true even after refactoring.

It's also valuable to have some testing framework available, just so that it's then easy to write tests when they're needed (which comes back around to the "make sure that your objects can ever be constructed" thing). Not all unit tests have to be preemptive, even if the ability to write them quickly has to be there up front.


> Write tests for code that could actually be wrong, or where you need to prove to yourself or others that the invariant is maintained.

Yes, what you need to do in the end, is to use your judgement. But you will get a surprising amount of pushback against this obvious common sense solution. From management who thinks it's too risky, and from developers who have chosen a technical career just because they want to rely on their intelligence, and black and white thinking, and now you're asking them to make trade offs and decisions which is management territory. The way out in my opinion is to make technical leaders that have actual authority, down to the details.


If you ever meet a developer who doesn't like making trade-offs, show them the door; it's basically all we do.


My reaction to this is: if you're writing tests for pointless things, it means that your testing framework/approach may not be giving you what you need, or it could be a defect in the language.

In unit testing, you should only be testing the code you wrote. (Contract testing is more of an assurance test.) In Java you have to write unit tests for your getters and setters because they might do more than you expected (that's within the realm of possibility). If you find yourself writing getter or setter tests, you should be looking into a language like Groovy or Scala that generates them for you, or something like Lombok or Records in later JDKs.


The first question I’d ask if I were writing unit tests for their first example is why a solar calculator containing mathematical business logic should depend on a location provider rather than just accepting a location as a method parameter.

Unit tests help you to design your classes based on use cases. If a simple class is hard to test, you should think about whether it might have a simpler design. If you have a brute-forced design where each of your classes has dozens of dependencies, and your tests require pages of setup, are hard to reason about, and are tightly coupled to the implementation, then the tests might not be providing any benefit at all.


Totally agree. Functions should never accept an object reference or another function as a parameter when the necessary information could be provided directly as data (where "data" means C structs without pointers, protobuf, JSON, ...). Even code that mutates its arguments is easy to test if it's all simple data.


Dogmatic unit testing is overrated.

For backend testing, I've had amazing success following a few simple principles:

1. Abstract and mock code you don't control that requires network calls.

2. Use frameworks that support database abstraction out of the box.

3. Test end to end functionality. Every test should basically be getting your DB into a specific state, then hitting an API endpoint and validating the output and/or state change.

It's not that hard, you can get decent and worthwhile code coverage for not much work, and even if you only write a few tests with low coverage, you can easily add regression tests later if you notice a bug.
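
A minimal sketch of the third principle, assuming an in-memory SQLite database stands in for the real one and a plain function stands in for the API endpoint. The `accounts` table and `handle_deposit` are made up for illustration:

```python
from __future__ import annotations

import json
import sqlite3


def create_schema(db: sqlite3.Connection) -> None:
    db.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)")


def handle_deposit(db: sqlite3.Connection, request_body: str) -> tuple[int, str]:
    """Hypothetical endpoint handler: returns (status code, JSON body)."""
    payload = json.loads(request_body)
    cursor = db.execute(
        "UPDATE accounts SET balance = balance + ? WHERE id = ?",
        (payload["amount"], payload["account_id"]),
    )
    if cursor.rowcount == 0:
        return 404, json.dumps({"error": "unknown account"})
    db.commit()
    (balance,) = db.execute(
        "SELECT balance FROM accounts WHERE id = ?", (payload["account_id"],)
    ).fetchone()
    return 200, json.dumps({"balance": balance})


def test_deposit_updates_balance():
    # 1. Get the DB into a known state.
    db = sqlite3.connect(":memory:")
    create_schema(db)
    db.execute("INSERT INTO accounts (id, balance) VALUES (1, 100)")
    # 2. Hit the endpoint.
    status, body = handle_deposit(db, json.dumps({"account_id": 1, "amount": 50}))
    # 3. Validate the output and the state change.
    assert status == 200
    assert json.loads(body) == {"balance": 150}
    assert db.execute("SELECT balance FROM accounts WHERE id = 1").fetchone() == (150,)
```

The test only knows the request, the response and the table contents, so the handler's internals can be reworked freely.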


The biggest issue in my experience is trying to target metrics. E.g. some % of coverage.

If you encourage developers to write tests with the mindset that "these should make my life easier", you'll get better and more maintainable tests.

If you encourage developers to write tests "just because" you will get very useless tests.


Maybe, in a type-safe language like C#. But in a dynamic language, we rely on unit tests to catch obvious typos and regressions.


I'm wondering if that's part of the problem, the fact that unit testing started out in Smalltalk. It does make a lot more sense in a dynamic language, because any line that's not touched is a potential typo.


My problems with unit testing:

- There's no consensus on what a "unit" is.

- There's no consensus on how much coverage is needed.

- There's no consensus on what should be mocked.

- There's no consensus on if private methods should be tested independently, even though they obviously shouldn't.

- It violates "don't repeat yourself".

- The developers who write the tests usually write the code. Most bugs occur when developers overlook a use case. Do the math.


The first example is such a straw man. The LocationProvider class has no business being a class. It should just be a function. Problems then go away.


And the sunset calculator doesn't need a location provider. All it needs are location coordinates which can be given to the calculate function.

In fact the whole thing then becomes a standalone function which takes Location and other relevant parameters and returns the computed sunrise/sunset values. A pure function, and super easy to unit test.

If one needs to do a lot of "mockups" for one's unit tests, then maybe one needs to reconsider the API and class design. Removing needless coupling helps testability by removing the need to use mocks in the first place.
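
Roughly what that looks like, as a sketch only. The real sunrise/sunset math is more involved, so a deliberately simplified stand-in (mean solar noon from longitude, ignoring the equation of time) is used here; `Location` and `mean_solar_noon_utc_minutes` are hypothetical names:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Location:
    latitude: float   # degrees, north positive
    longitude: float  # degrees, east positive


def mean_solar_noon_utc_minutes(location: Location) -> float:
    """Minutes after 00:00 UTC at which the mean sun crosses the local meridian.

    Simplified stand-in for the real calculation: the Earth turns 360 degrees
    in 24 hours, so each degree of longitude shifts noon by 4 minutes.
    """
    return (12 * 60 - 4 * location.longitude) % (24 * 60)


def test_noon_on_the_prime_meridian():
    assert mean_solar_noon_utc_minutes(Location(51.48, 0.0)) == 720


def test_east_is_earlier():
    assert mean_solar_noon_utc_minutes(Location(0.0, 15.0)) == 660
```

The coordinates come in as plain data, so there is no location provider to mock.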


In design terms, I would claim that Sunset Calculator is a "component" (an integration of units), with a "required interface" (i.e. abstracted) that provides the coordinates.

A role of one unit inside this component is to use the provided coordinates and only perform the calculation.


Unit tests are essential. I've developed some devops code around self-healing where the degrading signals were not reproducible. We knew they occurred because we lost a non-trivial number of nodes a day. A weird set of bugs between Linux, Kubernetes & Docker. The problem was we could recognize the signals and take action. The entire daemon I created to trap and execute on these was built and deployed on unit tests. In fact, I had more lines of unit test code than actual functional code because I had to mock the hell out of what the actual system would look like.

Another situation - where my wife used to work in medical devices. Feature development velocity was a problem because there was a queue on the very limited set of devices (CT scanners, PET scanners, etc.) that the team could use for testing. Debugging was very hard. Fixing bugs was hard because to debug you needed a device. With unit testing and mocks they made a lot of contention on the devices go away.

Write your unit tests. It will help you and the people coming in after you to maintain the code.


Previous HN discussion from 2020: https://news.ycombinator.com/item?id=23778878


Thanks! Macroexpanded:

Unit Testing Is Overrated - https://news.ycombinator.com/item?id=23778878 - July 2020 (387 comments)


> In this article I will be relying on the “standard” definitions, where unit testing targets smallest separable parts of code

Maybe that is the standard understanding, but I don't think it was the original meaning, and I don't think it is a helpful definition. Testing only "smallest separable parts" is what leads to an over-reliance on mocking and brittle tests.

A unit test should test a "unit of behavior", not a unit of code. That is, a unit test should test a single assertion about the behavior of a component. The test should not care how the behavior is implemented, i.e. it should be possible to refactor or fully rewrite the implementation without changing the test, which means the test should not care if one or many sub-components are involved (as long as no external systems are involved).
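
A small sketch of what that could look like. `Cart` and `PriceList` are invented for the example; the point is that the test only talks to `Cart`:

```python
from __future__ import annotations


class PriceList:
    """Sub-component; the test below never mentions it."""

    def __init__(self, prices: dict[str, int]) -> None:
        self._prices = prices

    def price_of(self, item: str) -> int:
        return self._prices[item]


class Cart:
    def __init__(self, prices: dict[str, int]) -> None:
        self._price_list = PriceList(prices)
        self._items: list[str] = []

    def add(self, item: str) -> None:
        self._items.append(item)

    def total(self) -> int:
        return sum(self._price_list.price_of(item) for item in self._items)


def test_total_is_the_sum_of_added_items():
    cart = Cart({"tea": 300, "cake": 450})
    cart.add("tea")
    cart.add("cake")
    cart.add("tea")
    assert cart.total() == 1050
```

Inlining `PriceList` into `Cart`, or splitting it further, would leave this test untouched.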


Does anyone else use unit tests as a form of interactive development?

For languages like TypeScript and Kotlin, I’ve found them useful in situations where I don’t have a REPL available.

Custom React hooks are much easier (and faster) to develop in unit tests without having to load in a whole application. Exploring data models and their functions before implementing views is easier in unit tests.

I find unit tests to be a really productive playground and I think it’s great to have a place to later jump in and extend functionality in a safe space.


Tests ensure you always have two consumers for an API (the API "in-place" as used by the functional portion of your system, as well as "how the tests use it"). This ensures new uses of the API have a path/model to follow and prevents you from having to "instantiate the world" just to call your particular function.

I also think of tests as "scaffolding" when building a skyscraper. They make it easier to access certain "things" before they're fully complete and are a cost that makes building the actual building less risky and quicker. Why does no one question the value of paying to put up and tear down scaffolding when building or repairing a building, yet so many can't see the value of unit testing?


I don't consider myself an expert on this topic, but I can't shake the feeling that this person defines unit tests in a way that sets them up to fail, and that it is not how people who use unit tests effectively would necessarily define or understand the term.

It's certainly not compatible with what Michael Feathers describes as unit tests in his "Working Effectively with Legacy Code" classic, a book I consider the bible on unit tests (even if a bit dated by now).

If anything, many of the things the article describes as drawbacks of "unit tests" are exactly the things that Michael Feathers seems to say unit tests must "not" be if they are to be effective.

Obviously I'm being awfully vague here, but, yeah. That was the feeling I got from the article. It's just another case of "My particular definition of X, which is not necessarily what others understand by X, is bad. Use what I define as Y instead, which may be what some people used to mean by X in the first place, but let's call it Y from now on because X is bad."


In my current work, which is a GraphQL backend API, I've identified only two use cases for unit tests:

- complex pure functions. For instance, I have a street address "cleaner". The input is an unclean street address (weird capitalisation and so forth) and the output is a clean address. Unit tests are perfect here because they're fast, and it's an isolated piece of logic that has no side-effects (no writing to DB or network request).

- wrappers around third-party libraries. Yes, in theory you're supposed to trust that a library will respect its API unless major version changes etc. In practice, it's not the case. And I'm also lazy and don't want to read all the release notes of all the libraries when doing updates. By wrapping a utility function from a third-party library and unit testing that function, I can have real confidence that the library still works in the way I'm expecting it to work. And it fails loudly when it no longer does.
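
For the first bullet, a toy version of such a cleaner might look like this. The rules are assumptions made for the example (collapse whitespace, capitalise each word); a real cleaner would know far more about addresses:

```python
import re


def clean_address(raw: str) -> str:
    """Collapse runs of whitespace and normalise the capitalisation of each word."""
    words = re.sub(r"\s+", " ", raw).strip().split(" ")
    return " ".join(word.capitalize() for word in words)


def test_weird_capitalisation_and_spacing():
    assert clean_address("  12   mAIN  strEEt ") == "12 Main Street"


def test_cleaning_is_idempotent():
    once = clean_address("4 pRIVET   drive")
    assert clean_address(once) == once
```

No DB, no network, and the whole suite for it runs in milliseconds.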

Outside of these, I only have end-to-end tests that test the whole stack: send a GraphQL query and check that the resulting JSON is what I expect.


> - wrappers around third-party libraries.

That's an "acceptance test" or "fit for purpose" test in case you're looking for industry terms.

Closely related is the "Law of Demeter" (the law of one dot), as it buffers you from changes in dependent APIs.


My interpretation of UT has evolved after I began using Clean Architecture. My original approach was to test functions and methods, which had as a consequence the calcification of the code: changes were painful because they broke a lot of tests. Their granularity level was too low. Now, my units of test are use cases. They naturally offer great coverage, test and document business logic, and easily allow refactoring.

OP criticizes dependency injection and comments that its sole purpose is making UT easier. I strongly disagree. I've been able to easily replace web frameworks and OORDB libraries using DI, and even develop business logic before selecting infrastructure technology. In that sense, the fact that UTs are easier to write is an indicator that the system is easily adaptable.

A good UT allows me to reduce integration testing to the minimum. What my ITs do is verify that two systems can exchange messages, but as soon as I get a request or response representation stripped of any I/O reference, it's UT time for me.


I don't like:

- manual testing

- fixing bugs found in production under extreme time pressure

- people finding embarrassing bugs in my code

- spending lots of time explaining what code should be doing to other developers

Unit testing massively reduces all of these. Therefore I assume people that don't like unit tests either enjoy doing these things or are masochistic.


My gut reaction to the title was to disagree. Looking at the article, the thing that stands out strongest is this graph:

https://tyrrrz.me/static/b606743d039cfb56391c5793d0dda8f2/bb...

It shows test cost vs test isolation. I agree with the assumed correlation. My experience though is the real correlation is above the assumed correlation, not below it. My guess is that the author's organization moves part of the cost somewhere else where the author doesn't have to see it. At the start of my career, I was one of those people who had this cost dumped on them so that the devs could ignore it: I was an on-team QA engineer.


A lot of the software I use is extensively unit tested.

This software is still buggy and broken.

Ergo, unit testing is not working.


One example: sqlite3 has more testing code than its own code, so it can swap out the whole design at will once in a while. It is working there. Unit testing is essential to me.


fuzz testing be damned.


One argument against 100% test coverage that I've never heard articulated in my decade career as a professional software engineer is that we need to learn to just trust trustworthy software engineers. Yes, there will sometimes be mistakes because we're human. But I'm convinced 100% test coverage is a form of extremism. Once an engineer has proven their ability, and as long as they continue to prove their ability, then a test suite becomes a new kind of tool: not one to prove to management or stakeholders that the code works, but one to prove to the senior engineers that a certain spot of the code works as they intended it.


Unit tests are never a tool to prove stuff to managers/customers/stakeholders. They are a tool to prove to yourself and your teammates that your code works and have some reasonable amount of confidence that it will continue to work when your teammate (who might be fresh college grad hire with no experience) changes it.

Testing has nothing to do with engineer ability or experience. Regardless of that you can always make a mistake. And even if you are a rockstar Sr dev who never makes a mistake, in the real world any piece of code is gonna get worked on by people of all skill levels and experience so the idea that "a senior engineer wrote it so we can just trust it" doesn't work even if one accepts the assumption that you can just trust the code from an experienced dev. I am against 100% coverage requirements as well (except for perhaps dynamically typed languages) but I would still expect a reasonable amount of coverage and that coverage include good tests.


> Unit tests are never a tool to prove stuff to managers/customers/stakeholders.

They are in certain agile circles. Ones I used to work in.


> we need to learn to just trust trustworthy software engineers

Absolutely agree, and when we have identified trustworthy software engineers, we also need to give them authority. This doesn't work however, because it puts the middle manager on the spot to make this judgement call, and it also goes against the grand master plan of software engineering which is to turn it into outsourceable factory work.



So, they suggest that you spin up the dependent services as part of the test runner and run tests for your service against those? I've worked on a codebase that did this extensively. Over time it led to a test suite that takes a long time to run, slowing down the release process considerably, and inevitably requiring further engineering effort to make the tests run in a reasonable amount of time. I can't say I would recommend it as an approach based on my experience with it, but YMMV.


Didn't even know that Spotify wrote an article about this "methodology", thanks for sharing.

No, all you have to do is spin up your service under test, its database and the service bus/event hub/whatever you use to communicate between services.

Nowadays with Docker it's really easy and fast to spin up test instances during your test run. (4-5 LoC with a good library in C#).

For the past 1 or 2 years I've been doing exactly the same thing on my projects, and it works great to speed up the development process. If a bug pops up in production, it's really easy to add it as a future test case.

You're only attached to the input request (endpoint, query params, request body) and the response body. But that's ok because it's already a part of the contract between the API and its consumers. That's it.


>Over time it led to a test suite that takes a long time to run, slowing down the release process considerably

Slower, but the test suite is far more effective at catching bugs.

These tests do take more up-front engineering, but since they aren't pale imitations of the code they also don't need to be totally rewritten every time the code is refactored. Higher capex, lower opex.


You don’t use persistent databases to run these tests but put them on a tmpfs or in memory, right?


Can you (or someone else) tell me more about how to run database tests efficiently?

I'm working on an app that has 200+ tables and requires a lot of data to be in the database before anything even works at all. Of course good tests interacting with the database need a fresh database every time so even a test suite of about 10 tests can take 10-20 minutes already.


Yes, correct. Then you can run them in parallel.
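
On the "fresh database every time" question above, one possible approach, sketched here under the assumption that the database is SQLite (or can be swapped for it in tests): pay for the schema and seed data once, then hand every test its own in-memory copy. The `customers` table is a placeholder for the real schema:

```python
import sqlite3


def build_template() -> sqlite3.Connection:
    """Create the schema and seed data once per test run (the slow part)."""
    template = sqlite3.connect(":memory:")
    template.executescript(
        """
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        INSERT INTO customers (name) VALUES ('Ada'), ('Grace');
        """
    )
    template.commit()
    return template


def fresh_copy(template: sqlite3.Connection) -> sqlite3.Connection:
    """Give one test its own private in-memory copy of the template."""
    copy = sqlite3.connect(":memory:")
    template.backup(copy)  # copies every page of the template into the new database
    return copy


TEMPLATE = build_template()


def test_delete_does_not_leak_into_other_tests():
    db = fresh_copy(TEMPLATE)
    db.execute("DELETE FROM customers")
    assert db.execute("SELECT COUNT(*) FROM customers").fetchone() == (0,)
    # A second copy still sees the seeded rows.
    other = fresh_copy(TEMPLATE)
    assert other.execute("SELECT COUNT(*) FROM customers").fetchone() == (2,)
```

Since every copy is independent, nothing stops the tests from running in parallel. Whether the same idea carries over to a server database depends on what that database offers.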


Tests are an extension of documentation, in my opinion. I go on Github and I read the readme then I go to the test folder and try to see some valid usage examples.

This is what writing useful tests means: from a client perspective I just want to understand what's valid and what's invalid.

Also useful to make sure we fixed specific scenario bugs.

I've seen a lot of people doing things wrong, but we can summarise it in two cases:

- "We've got 100% coverage, it must mean we've got no bugs, right?"

- We've got tests covering the happy path, everything else is tested manually. It takes forever and it's not shared anywhere, but "trust me", I've tested this.


A similar discussion from 17 days ago: "The Big TDD Misunderstanding" -> https://news.ycombinator.com/item?id=30734241


To me it seems that the author's quarrel is not with unit testing per se, but with the idea of 100% unit test coverage. And I agree with that: coverage is as useless a metric as the number of lines of code; its sole purpose is to make managers happy.

Tests themselves are not overrated, just like with everything else in engineering one has to use some common sense when choosing what to test, and what not, it's as simple as that - but because this can't be easily bottled and prescribed as a rule of thumb, managers/leads prefer to cover their asses by insisting on 100% coverage...


Unit tests are useful because they improve software design by encouraging you to develop modular software units that can be tested independently.

However you need integration tests and end-to-end tests to improve software correctness.


> the interfaces we’ve defined serve no practical purpose other than making unit testing possible

That's actually almost never true - abstracting the sorts of external dependencies (like databases and web services) that actually cause problems for unit tests makes the code more manageable in terms of maintenance and troubleshooting.

The problem is when people go mock-crazy and mock every class that the unit depends on. Rather, just mock the things that can actually make the unit test fail if they're unreachable for some reason independent of the actual code.


This is more of a criticism of mock heavy testing than anything else. The author should read Beck’s original TDD book and experiment with testing without mocks.


Used to feel this way, and I largely agree when there are fewer contributors on the project or the tests are trivial, like checking that things do what they obviously do. However, unit tests can describe behavior and, written succinctly, can be beneficial for other people’s eyeballs. I try not to get into discussions about “clean” or “elegant” code for the most part, as most people prefer the smell of their own farts.


I remember being on a team where I'd frequently point out that the tests were not only flaky, but useless. I suggested that we delete the useless & redundant tests and either fix or delete the flaky ones.

That suggestion was met with not only resistance, but almost hatred. I think some of them got way too attached to these tests and could no longer see the forest for the trees.


Unit Testing is software construction's "we have to do something to validate, this is something, we have to do it".


I was building an API and started adding some tests, then added coverage, and it got me thinking about the complexity of the codebase: how many branches/statements I had per file to test, etc. I think it helps greatly to reason about modular and good software. But it takes time too. YMMV


For a contrary view, see:

https://quii.gitbook.io/learn-go-with-tests/meta/why

The book is Go specific but there's a lot of wisdom in this particular chapter that applies to any language.


To me, the value of unit tests really depends on the power of the type system. If you're working with Haskell, OCaml, ..., unit tests have almost no value. However, if you're using Python, PHP, JavaScript, ..., unit tests will help you to avoid most silly mistakes.


You're only talking about the subset of tests which the type system can supplant. Logic bugs are outside of the type system's domain, and the generalization "unit tests have almost no value" does not apply to unit tests which validate logic.
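
A tiny illustration, with made-up function names. Both versions below have exactly the same signature and would pass any type checker, but only one of them computes the discounted price:

```python
def apply_discount(price_cents: int, percent: int) -> int:
    """Return the price after taking `percent` off, rounded down to whole cents."""
    return price_cents * (100 - percent) // 100


def apply_discount_buggy(price_cents: int, percent: int) -> int:
    """Same types, wrong logic: returns the discount itself, not the discounted price."""
    return price_cents * percent // 100


def test_apply_discount():
    assert apply_discount(1000, 25) == 750
    assert apply_discount(1000, 0) == 1000
    assert apply_discount(1000, 100) == 0


def test_types_alone_would_not_have_caught_the_bug():
    assert apply_discount_buggy(1000, 25) != 750
```

A richer type system can rule out a lot, but it can't tell which of these two int-returning formulas the business wanted.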


To me, Unit Tests are live documentation for your code. You definitely do not need 100% coverage, but if you can exercise a good percentage of your code, you are making life easier for your future "you".

Just consider that in many cases code lifespan is 20% dev , 80% maintenance.


>Tests, whose scope lies higher on Mike Cohn’s pyramid are either written as part of a wider suite (often by completely different people) or even disregarded entirely.

My experience is the exact opposite. Actually my experiences are the opposite of many things in this article.


unit testing is essential, TDD might be a stretch in practice.

instead of having a separate test/ directory with multiple unit test files, I put unit test code right in the source code (at the end of the file), so when I modify the source I can easily update the unit test in the same place, and they never get out of sync. It worked for me very well. They can be turned on/off easily too. Sometimes it's easier to understand the code by reading the unit tests than reading the whole lengthy code, and it's helpful when they're in the same place. For me, unit test code is part of the real code, not something 'additional'.



the testing pyramid is worth knowing about; unit tests double code size and maintenance cost for the unit; I've found unit tests only fit long-lived code and coverage doesn't generally correlate with reliability of product performance; black box end to end testing at the interface level for the general scenarios does correlate with reliability, for those scenarios; writing tests that properly separate concerns in this way seems to make it easier to write and maintain tests, and the product


> Consider mocking only as a last resort

Consider trying to write tests in a non-trivial Spring Boot app without mocks


> Unit tests lead to more complicated design

You will hear the exact opposite from proponents of test-driven design.

The most complicated code I've seen is integration-test heavy code. It turns into a big ball of mud very quickly as new features added simply add more lines of code to already long procedures. Over a hundred lines of database calls, HTTP calls, variable bindings, conditions and branches. Code like this doesn't compose well and is a refactoring nightmare.

Where writing testable code can lead to complicated designs is where developers are not thinking about abstractions and are, instead, introducing indirection. This is a common solution in many object-oriented languages with inheritance and virtual methods. There's a whole essay and reams of textbooks on what makes a good abstraction in a precise way but most developers think abstraction means indirection.

However, if developers are concerned about writing abstractions it often reduces indirection and leads to easier-to-test code. This is because abstractions introduce precise semantic layers and code under test then shouldn't be concerning itself with underlying layers. A test-driven approach can lead to better designs but I think the claim that it will guarantee them is a bit over-sold.

> Unit tests are expensive

It appears the author means in the form of developer time. In terms of execution time if the tests are slow there's an underlying problem that should be addressed.

In general is it more expensive to ship errors to production or to take the extra time to validate your assumptions, invariant properties, etc?

> Unit tests rely on implementation details

If you write them that way, they will. If you spend most of your effort mocking out dependencies in your code it means you're missing opportunities for abstraction. Users of your code must then care how the underlying dependency is implemented, and therefore there is no abstraction.

If folks are forging ahead with mocking out dependencies like this across the code base you have bad abstractions and tight coupling in your code. You could step back from this and treat it like a bug and fix it... but this isn't a problem of the practice itself if people are ignoring the signs.

The author's solution to their straw man is... more end-to-end, behaviour-driven tests. While valuable, these are demonstrably more time-consuming activities, both in execution time and in developer time.

I think a good peppering of unit tests is healthy; it aids in refactoring different layers of the application and gives decent indicators if your refactoring changes behaviours in a rapid-feedback loop.

However I think unit tests are... still not great. There's not a lot of empirical evidence to support them in reducing error rates in code. And I think that tracks with their use in verifying systems: at best they're an example of a single input producing a single expected output... but the domain and range of most units is quite large and even a hard-core unit-testing team will not cover most of the problem space. In essence they only prove the existence of errors, not their absence.

For that we need better tools like property-based tests and verification techniques like refinement typing, separation logic, etc as suits the problem space.
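
To show the difference from a single input/output example, here is a hand-rolled property check using only the standard library. Dedicated libraries such as Hypothesis or QuickCheck do this far better (input shrinking, smarter generation); the run-length codec is just a stand-in subject:

```python
from __future__ import annotations

import random


def rle_encode(text: str) -> list[tuple[str, int]]:
    """Run-length encode: 'aaab' -> [('a', 3), ('b', 1)]."""
    runs: list[tuple[str, int]] = []
    for ch in text:
        if runs and runs[-1][0] == ch:
            runs[-1] = (ch, runs[-1][1] + 1)
        else:
            runs.append((ch, 1))
    return runs


def rle_decode(runs: list[tuple[str, int]]) -> str:
    return "".join(ch * count for ch, count in runs)


def check_properties(trials: int = 500, seed: int = 0) -> None:
    """Check two properties over many random inputs instead of one hand-picked case."""
    rng = random.Random(seed)  # seeded so a failure can be reproduced
    for _ in range(trials):
        text = "".join(rng.choice("ab") for _ in range(rng.randint(0, 20)))
        runs = rle_encode(text)
        # Property 1: decoding an encoding gives back the input.
        assert rle_decode(runs) == text, text
        # Property 2: adjacent runs never share a character.
        assert all(a[0] != b[0] for a, b in zip(runs, runs[1:])), text
```

This still only samples the input space, so it finds errors rather than proving their absence, but it covers far more of the domain than one example does.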


I stopped reading half way through his claim.

TL;DR The guy doesn't understand how to test code and he doesn't understand test isolation. He responds to requests to eat his vegetables with "but I dont hafta"

Unit tests are intended to test small, mostly uncoupled parts of the code (usually scoped to single functions). Integration tests confirm the combination of functions, or the class in a context.

Testing of user behavior is intended to be done in the functional tests, and only there. When you get to functional tests you should assume that most of the code underneath is trustworthy, since the lower-level tests support that.

The intention of the separation of the testing strategies means you don't have to do a ton of combination tests in a higher level (where the test is more expensive to run, write and maintain).


i really wish unit testing was easier for graphics / webgl / 3d development


Testing in general is overrated :)


This is how we get Windows Vista


amen.



