
Write tests. Not too many. Mostly integration - wslh
https://blog.kentcdodds.com/write-tests-not-too-many-mostly-integration-5e8c7fff591c
======
moultano
I'd offer a slightly different take:

\- Structure your code so it is mostly leaves.

\- Unit test the leaves.

\- Integration test the rest if needed.

I like this approach in part because making lots of leaves also adds to the
"literate"-ness of the code. With lots of opportunities to name your
primitives, the code is much closer to being self documenting.
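
For illustration, a minimal Python sketch of what "mostly leaves" can look like (all names here are hypothetical):

    
    
        # Leaves: small, pure functions that are trivially unit-testable.
        def normalize_email(raw):
            return raw.strip().lower()
        
        def is_valid_email(email):
            return "@" in email and "." in email.split("@")[-1]
        
        # Thin trunk: composes the leaves and touches the outside world.
        # Integration-test this part if needed; the leaves carry the logic.
        def register_user(raw_email, db):
            email = normalize_email(raw_email)
            if not is_valid_email(email):
                raise ValueError("invalid email: " + email)
            db.insert("users", {"email": email})
        
        # Unit tests for the leaves need no mocks at all:
        def test_normalize_email():
            assert normalize_email("  Bob@Example.COM ") == "bob@example.com"
    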

Depending on the project and its requirements, I also think "lazy" testing has
value. Any time you are looking at a block of code, suspicious that it's the
source of a bug, write a test for it. If you're in an environment where bugs
aren't costly, where attribution goes through few layers of code, and bugs are
easily visible when they occur, this can save a lot of time.

~~~
1_player
I have adopted the same philosophy. A few resources on this, part of the so-
called London school TDD:

\- [https://github.com/testdouble/contributing-tests/wiki/London-school-TDD](https://github.com/testdouble/contributing-tests/wiki/London-school-TDD) (and the rest of the Wiki)

\- [http://blog.testdouble.com/posts/2015-09-10-how-i-use-test-doubles](http://blog.testdouble.com/posts/2015-09-10-how-i-use-test-doubles)

\- Most of the screencasts and articles at
[https://www.destroyallsoftware.com/screencasts](https://www.destroyallsoftware.com/screencasts)
(especially this brilliant talk
[https://www.destroyallsoftware.com/talks/boundaries](https://www.destroyallsoftware.com/talks/boundaries))

\- Integration Tests Are A Scam:
[https://www.youtube.com/watch?v=VDfX44fZoMc](https://www.youtube.com/watch?v=VDfX44fZoMc)

All of these basically go the opposite way of the article's philosophy:

Not too many integration tests, mostly unit tests. Clearly define a contract
between the boundaries of the code, and stub/mock on the contract. You'll be
left with mostly pure functions at the leaves, which you'll unit test.
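
As a hedged sketch of what "stub on the contract" can look like in Python (names made up for illustration):

    
    
        from dataclasses import dataclass
        
        # The contract: the pricing logic only ever sees plain values.
        @dataclass
        class Rate:
            currency: str
            cents_per_unit: int
        
        # Pure leaf, unit-tested directly with no mocks.
        def price(units, rate):
            return units * rate.cents_per_unit
        
        # Collaborators get stubbed at the contract, not at their internals.
        class StubRateSource:
            def rate_for(self, currency):
                return Rate(currency=currency, cents_per_unit=100)
        
        def test_price_against_stubbed_contract():
            assert price(3, StubRateSource().rate_for("USD")) == 300
    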

~~~
GFischer
Thanks for the links, they make sense - I've always had trouble with blanket
"you should unit test" advice, but the video in particular explains the
reasoning very well :)

------
kartan
So many people in this thread are talking about different domains without
seeing that different domains need different rules.

Creating a library that is going to be used to launch a multi-billion-dollar
rocket to Mars is not the same as developing a mostly graphical mobile app
where requirements change daily as you A/B test your way to better business
value.

The article has really good points and explains the reasons why they work.
Apply them wisely. Make the right decisions for your project. Don't be
dogmatic.

~~~
protonfish
The fight against the TDD cargo cult is vicious, barely started, and far from
over.

~~~
dang
Please don't do this here. If you have a substantive point to make, make it
thoughtfully; if you don't, please don't comment until you do.

[https://news.ycombinator.com/newsguidelines.html](https://news.ycombinator.com/newsguidelines.html)

------
hinkley
Good lord. Why integration tests?

    
    
        I think the biggest thing you can do to write more integration tests is to just stop mocking so much stuff. 
    

Okay. The biggest problem I see with people trying to write unit tests is that
they don’t want to change how they write code. They just want tests for it.
It’s like watching an OO person try their hardest to write OO code in a
functional language.

So they try to write E2E tests which work for about 5 or 6 quarters and then
fall apart like a cheap bookcase. If you can find a new job before then, you
never have to learn to write good tests!

I agree with the author that the trick is to stop using mocks all the time,
but you don’t have to write integration tests to get rid of mocks. You have to
write better code.

Usually if I have a unit test with more than one mock, it’s because I’m too
invested in the current shape of the code and I need to cleave the function in
two, or change the structure of the question asked (e.g., remake two methods
into two other methods).

Almost always when I accept that the code is wrong, I end up with clearer code
and easier tests.
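
A hedged sketch of that "cleave the function in two" move (hypothetical code):

    
    
        # Before: logic tangled with IO, so a test needs two mocks.
        def apply_discount(order_id, db, mailer):
            order = db.load(order_id)
            order["total"] = int(order["total"] * 0.9)
            db.save(order)
            mailer.notify(order["email"], order["total"])
        
        # After cleaving: the decision is a pure function, tested mock-free...
        def discounted_total(total):
            return int(total * 0.9)
        
        def test_discounted_total():
            assert discounted_total(1000) == 900
        
        # ...and the remaining IO shim is too thin to hide bugs.
        def apply_discount_v2(order_id, db, mailer):
            order = db.load(order_id)
            order["total"] = discounted_total(order["total"])
            db.save(order)
            mailer.notify(order["email"], order["total"])
    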

Unit tests run faster, are written faster, and not only can they be fixed
faster, they can be deleted and rewritten if the requirements change. The most
painful thing to watch by far is someone spending hours trying to recycle an
old test because they spent 3 hours on it last time and they’ll be damned if
they’re going to just delete it now.

~~~
josteink
I work in a company where quite a few of the developers simply are incapable
of writing anything but integration-tests.

The reason? They don’t “believe” in unit-tests. They don’t think unit-testing
“works in the real world”.

They absolutely fail to accept that they need to write their code differently
for automated testing to work well.

How do you change such a mindset?

~~~
arkh
It depends on what you unit test and why.

If it's for 100% test coverage: forget about it.

If you test private methods: you're doing something wrong.

What people usually see from the "unit test evangelists" are codebases with
tests for every method in the code. Then you do some refactoring and have to
rewrite tons of tests. And since those tests were only written to get 100%
coverage, you end up with logic bugs, because most of the unit tests were
written to go through the code, not to check limits and edge cases. When you
stumble upon this kind of test harness you can only see cons (more to write
upfront, less willingness to refactor) and no pros (the code is still
brittle). Then your integration tests feel like your real harness: you can
change anything in your code and they'll tell you what breaks when it's used.

Now if you treat your unit tests as a kind of integration test for the API
your classes present, then you get the benefits of unit tests. But this means
testing only public methods. And mutation-testing resilience is a better
metric than test coverage.

Also: those tests do not replace real documentation, which can be a lot
faster to read and understand than code.

~~~
growse
> If you test private methods: you're doing something wrong.

Maybe a silly question, but why?

If I refactor a class to pull a common piece of functionality into a private
method, why would I not want a test for that?

One of the principal benefits of tests, as I see it, is allowing me to change
the implementation without worrying about the behaviour, and I'm not sure why
that wouldn't apply to private methods?

~~~
chewbacha
One reason why is because you should be testing the public behavior of a
function/class not the details. The reason for this is because the public
interface is what other parts of the codebase will come to rely on.
Refactoring generally shouldn’t change the public interface as it will break
other pieces of code within your codebase, or other codebases if it’s a
library, and other systems if it’s a network api. So, if you test the public
interface, generally refactors won’t break the tests.

Testing private functions also seems to be a smell that the overall setup for
testing the class or function is too difficult. This can be because the class
has too many branches, the argument list is too large, or too many other
systems must be in place for it to function correctly. This, to me, indicates
a public interface that is hard to use and will pass many of these issues on
to the caller.

Lastly, if you are testing private functions to gain coverage then arguably
the behavior in the private method isn’t actually useful to the public
interface. The reason I say this is that testing the behavior of the class
should end up touching all branch conditions inside the class or the public
interface isn’t fully tested. By only testing the public interface it then
also becomes easier to locate dead/unreachable code.
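
As a hedged illustration of the difference (a hypothetical Stack with a private helper):

    
    
        class Stack:
            def __init__(self):
                self._items = []  # private detail, free to change
        
            def _grow_if_needed(self):  # private helper: not tested directly
                pass
        
            def push(self, item):
                self._grow_if_needed()
                self._items.append(item)
        
            def pop(self):
                if not self._items:
                    raise IndexError("pop from empty stack")
                return self._items.pop()
        
        # The test exercises only push/pop; _grow_if_needed is covered
        # indirectly, so swapping the internals never breaks the test.
        def test_push_then_pop_returns_last_item():
            s = Stack()
            s.push(1)
            s.push(2)
            assert s.pop() == 2
    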

Hope that answers the why.

~~~
BlackFly
I would argue you absolutely need to be testing the internal details. That is
the entire point of measuring branch coverage and performing mutation testing.
Unit tests are not black box tests. They need to know that for special values,
the unit has to follow a different code path but still produce sensible
output. Reading the documentation of a function is not sufficient to determine
what edge cases the unit has, but testing those edge cases is often critical
to verifying that the unit adheres to its specified behavior under all
conditions.

As for the smell, sometimes things are irreducibly complex. Some things in
this world do require tedious bookkeeping. All the refactoring in the world
cannot change the degrees of freedom of some problems.

Tests on consumers should not test branches of subordinate units. If you did
this then the number of tests would explode exponentially with the number of
branch conditions to handle all the corner cases. If a private unit produces a
list of objects, but has special cases for some values of its argument, test
those branches to verify it always produces the correct list. Then just make
sure each caller does the correct thing with the list of objects. That is the
purpose of separation of concerns: the consumer does not need to know that
some values were special.

~~~
afarrell
> the number of tests would explode exponentially with the number of branch
> conditions to handle all the corner cases

Then wouldn't you want to write something that was able to iterate through
those edge-case interactions and ensure they are correct?
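
One hedged sketch of doing exactly that: enumerate the combinations mechanically instead of hand-writing 2^n tests (the function under test is hypothetical):

    
    
        import itertools
        
        # Hypothetical unit under test.
        def make_report(values, truncate, uppercase):
            text = ",".join(str(v) for v in values)
            if truncate:
                text = text[:10]
            return text.upper() if uppercase else text
        
        # Iterate through the edge-case interactions and check each one.
        def test_make_report_edge_cases():
            for values, truncate, uppercase in itertools.product(
                [[], [0], [-1, 10**6], ["x"] * 50],  # edge-case inputs
                [False, True],
                [False, True],
            ):
                report = make_report(values, truncate, uppercase)
                assert isinstance(report, str)
                if truncate:
                    assert len(report) <= 10
                if uppercase:
                    assert report == report.upper()
    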

------
throwaway5752
Why does everyone rethink a working strategy? Write lots of unit tests that
are fast. Write a good number of integration tests that are relatively fast.
Write fewer system integration tests that are slower. The testing pyramid
works. He even talks about it in this post, and then ignores the point of it.

You write lots of unit tests because you can run them inline pre-commit or in
a component build. If your integration tests are numerous and interesting
enough, that won't work; they are better suited to gating feature-branch
merges. System integration tests (the actual name for "end to end") take
longer and usually gate progressively more stable branches upstream, or
nightlies, depending on where you are.

~~~
a-saleh
Because for certain kinds of projects the strategy stops working.

I work as QE on a fairly large, ~7-year-old project. Microservice
architecture has been attempted. We always merge to master, which means that
more or less _everything_ is a feature-branch merge. We have too many
repositories to count.

And what we learned is that most of the components we have are just too thin
to allow for useful unit-test coverage. Almost everything is
[Gui]--request-->[Middleware]--request-->[Proxy]--request-->[Backend]-->[Database].

In reality, [Middleware] and [Backend] probably should have been a single
component, but the devs wanted to do microservices and be scalable without
really understanding the bounded contexts of their services.

All of this leads us to a place where unit tests don't tell us much.

On the other hand, we managed to spawn [Middleware]-->[Backend]-->[Database],
and we can run a useful integration test suite in ~2 minutes.

So, on one hand, if we had designed this better, the good old pyramid might
be a working strategy. On the other hand, if I can get actual services
running in a minute and test them end-to-end, I don't think I will bother
with true unit tests on my next projects. I.e., why mock the database if I
can spawn it in seconds :-)
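
For what it's worth, the "spawn it in seconds" part can look like the sketch below with today's tooling (assuming Docker, the testcontainers-python package, SQLAlchemy, and a Postgres driver are available; the table and query are hypothetical):

    
    
        import sqlalchemy
        from testcontainers.postgres import PostgresContainer
        
        def test_users_roundtrip():
            # A real, throwaway Postgres in Docker instead of a mock.
            with PostgresContainer("postgres:13") as pg:
                engine = sqlalchemy.create_engine(pg.get_connection_url())
                with engine.begin() as conn:
                    conn.execute(sqlalchemy.text(
                        "CREATE TABLE users (id serial PRIMARY KEY, name text)"))
                    conn.execute(sqlalchemy.text(
                        "INSERT INTO users (name) VALUES ('alice')"))
                    name = conn.execute(sqlalchemy.text(
                        "SELECT name FROM users")).scalar()
            assert name == "alice"
    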

~~~
miskin
So, if I understand correctly, Middleware and Backend should have been a
single component, since it's one bounded context and splitting it gives one
of them feature envy? Is there some benefit to keeping these separate, or is
the cost of change too high at this point? If it's not about features but
more about the API, have you tried the consumer-driven contract testing
approach?

~~~
a-saleh
The reason was that you can have more instances of the backend for a single
middleware, which should have helped with scalability.

If we had the resources to do the refactoring, we would probably end up with
two-three different backends for various contexts, and without the middle-man
between the gui and the backends.

On the other hand, the cost of change is probably too high, and most probably
this version of our product will be kept on minimum-resource life support.

We are looking into doing consumer-driven testing for the new set of services
we are working on.

------
shados
I go the complete opposite way.

I've tried various testing strategies at ~15 different companies in all
sorts of environments, and unit tests are the only thing that really works
(IF you can convince the team to do it... and that's a big IF).

The article starts with a point I agree with: the lower in the pyramid, the
cheaper the tests but the lower the confidence level they bring. That's true.

Where I disagree is on how big the differences in confidence and cost are.

I can bang out 500 unit tests faster than I can do just a few E2E tests in
most large apps. They require almost no trial and error, no real engineering
(I feel strongly that abstraction in unit tests is bad), and all around are so
easy to write that I don't mind if I have to toss out 150 of them when I make
a significant refactor.

E2E tests are amazingly brittle and require a careful understanding of the
whole system. They're impossibly expensive to write. They're the only thing
that tells you that stuff works though. So you want at least a few of these.

Integration tests are just flat out awkward: you need an understanding of a
significant portion of code you did not write or touch, they often require
complex fixtures (because your test will go through several code paths and
might depend on a lot of arguments), they're slower (because a lot of code
runs), and while you don't throw them away when changing implementation
details (unless they involve side effects), you still throw them away when
refactoring or changing public interfaces. I've worked with a lot of people
who were very vocal about these being so much better, then in the same breath
complained that they spent all day writing integration tests.

There's an exception here, which is acceptance tests for libraries,
especially when doing a full rewrite: the tests that tell you the public
interfaces used outside of the current context work (as opposed to the public
interfaces of objects used in the implementation). E.g., if I were to test
lodash or React, that's how I'd do it.

Unit tests, to me, are about a lot more than "is this change breaking my
code". And if that's all you care about, you're missing a big part of the
point.

If you have 3 units, A, B and C, where A calls B which calls C: if you have a
test for A in the context of B, a test for B in the context of C, and a test
for C, and they all pass, you know that A + B + C will work. But when writing
the tests, you only had to care about itty-bitty tiny pieces of code, which
made things super cheap.
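
A hedged sketch of that layering (hypothetical functions; each test stays tiny because the layer below is already covered):

    
    
        # C: a pure leaf.
        def parse_amount(text):
            return int(text.strip())
        
        # B: tested in the context of C (it calls the real C, no mock).
        def total(lines):
            return sum(parse_amount(line) for line in lines)
        
        # A: tested in the context of B.
        def invoice_summary(lines):
            return "total: %d" % total(lines)
        
        def test_parse_amount():
            assert parse_amount(" 42 ") == 42
        
        def test_total():
            assert total(["1", "2", "3"]) == 6
        
        def test_invoice_summary():
            assert invoice_summary(["10", "5"]) == "total: 15"
    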

Then you get other huge benefits: the quality of the entire code base is
higher (a side effect of it having testable interfaces all across), the
reasoning behind each piece of code is explicit (no one wrote a function that
works but they're not sure why, or else the test would be very hard to
write), and you automatically have a document representing "intentions".

And yes, if you change a module, even if it's not exposed to your customers,
the public interface of that module has tests and the tests will break. But
they usually take nothing but a few minutes (often a few seconds) to write.
They're cheap enough to be disposable.

And once you have 80%-ish unit test coverage, you actually have a very high
confidence level. I've gone through major refactorings of multi-million-line
apps with almost no bugs on pure unit tests. You'd think the 20% of untested
code would be a source of bugs, but statistically, that's just not how it
happens.

In terms of person-hours to ROI, pure unit tests just straight up win out.

The reason software engineers fight back so hard against them is that they're
braindead and repetitive to write, and engineers can't resist
overengineering. "This is such a simple test for such a simple piece of code,
why should I test it?!" That's the point. All unit tests should be like this.

~~~
hinkley
The second group I worked with that was earnestly interested in mature testing
developed the 5/8ths rule.

To move a test one level down the pyramid, it takes about 5x as many tests.
But the tests run 8 times as fast, so moving a test down takes more than 35%
off the run time (5 × 1/8 = 62.5% of the original), and _it fails the build
minutes sooner_. If you drop it down two levels it's about 60% off the run
time.

Interesting enough on its own, but maintaining those tests after one
requirements change, plus the cost of rewriting them in the first place, is
less work than the cost of maintaining the original tests. We didn't come up
with a number for this but the difference was measured in man-days and missed
deadlines about once a month, and we were convinced by the evidence.

I also agree with both your 'braindead' comment and your 80% estimate. The
big payoffs come between 75% and 85%, and above 85% you start getting
artifacts. That 'data' distracts more than it helps.

~~~
shados
Yup. I think one big issue is that an E2E or an integration test is useful on
its own, while a single unit test is almost totally worthless. You don't have
confidence in anything until at least 50% coverage (and at 80% you have
almost perfect confidence).

So when people get started, especially on an old code base, they feel it's
pointless and doesn't pay off. Can't blame them, I suppose.

Good that you bring up build time. I forgot to mention that. We have repos
with thousands of tests where the whole suite runs in <1 minute and gives us
very high confidence (actually the only other tests we run on that repo are
visual regression tests for CSS, and even E2E tests don't catch those
issues...). During that time I'm watching other teams waiting 20 minutes on
their integration test suite. Nope nope nope.

------
smizell
One of the most transformative things I've come across for how to structure
and test code has been Gary Bernhardt's talk on Boundaries [0]. I've watched
it at least ten times. He also has an entire series on testing where he goes
deeper into these ideas.

In this video, he talks about a concept called functional core, imperative
shell. The functional core is the code that contains your core logic and can
be easily unit tested because it just receives plain values from the outside
world. The imperative shell is the outside world: it talks to disks,
databases, APIs, UIs, etc., and builds the values to be used in the core.
I'll stop there—Gary's video will do this 100x better than I can here :)

[0]
[https://www.destroyallsoftware.com/talks/boundaries](https://www.destroyallsoftware.com/talks/boundaries)
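
A hedged, minimal rendering of the idea in Python (not Gary's code; names invented here):

    
    
        # Functional core: pure logic over plain values, trivially unit-tested.
        def overdue_invoices(invoices, today):
            return [inv for inv in invoices if inv["due"] < today]
        
        def test_overdue_invoices():
            invoices = [{"id": 1, "due": "2018-01-01"},
                        {"id": 2, "due": "2018-12-31"}]
            assert overdue_invoices(invoices, "2018-06-01") == [invoices[0]]
        
        # Imperative shell: all the IO lives here and makes no decisions.
        def send_reminders(db, mailer, today):
            invoices = db.fetch_all("invoices")            # world -> values
            for inv in overdue_invoices(invoices, today):  # core decides
                mailer.remind(inv["id"])                   # values -> world
    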

------
edem
I agree with the part that you should write tests, but I definitely disagree
with the part that most of your tests should be integration tests.

As you pointed out, the testing pyramid suggests that you should write more
unit tests. Why? Because if you have ever tried TDD, you know that unit tests
make you write good (or at least acceptable) code, since testing bad code is
hard. By writing mostly integration tests you lose one of the advantages of
unit testing and sidestep that check on bad code.

The other reason is that unit tests are easy to write. If you have interfaces
for your units of code then mocking is also easy. I recommend stubbing though,
I think that if you have to use mocks it is a code smell.

Also, the .gif with the man in pieces is a straw man. The fact that you have
to write at least one integration test to check that the man has not fallen
apart is not a valid reason to write mostly integration tests! You can't test
your codebase reliably with them, and they are also very costly to write, run
and maintain!

The testing pyramid exists for a reason! It is the product of countless hours
of research, testing and head scratching! You should introspect your own
methods instead; you might arrive at the conclusion that the codebase you are
working on is bad and hard to unit test, and that's why you've chosen to
write mostly integration tests.

------
pleasecalllater
Sounds good in theory. In practice there is one problem with having only
integration tests. Tests are generally simple: they pass or they fail. A unit
test tests just a small piece of functionality, so when it fails, it's quite
easy to find the problem. When an integration test fails, we can spend hours
debugging the whole stack of layers trying to find the real problem.

I had this situation once. Every failing integration test ended with hours
spent on writing unit tests for all the places used by the test.

~~~
fpoling
From my experience an integration test failure that requires significant
efforts to investigate can only be covered with unit tests after one knows
where the problem comes from. One cannot realistically write a bunch of unit
tests and expect them to cover the problem unless one already knows about the
problem.

~~~
ndh2
It's called shotgun unit testing.

------
methodover
Is there solid evidence to back up some of the assertions that have been made
about testing?

It feels like an area where lots of people have opinions, and there is not
much in the way of facts.

~~~
p0nce
There are very serious books about software quality with actual data, but
it's much easier to tell each other anecdotal experiences on the internet -
in a weird mix of bragging and strawman arguments. That's how our field is
stagnating.

------
merb
I think the answer is that it heavily depends on what you are doing. If you
are creating a library that operates on a protocol, unit tests are necessary
/ extremely important.

If you are writing an ERP where a lot of your code NEEDS to operate WITH the
database, you are better off with integration tests, because mocking away the
database would lead to so many bugs, especially if your database is extremely
important (and not just a dumb datastore).

Edit: having any tests is always better than having none.

------
nathan_f77
The puffing-billy [1] library is awesome, and has changed the way I write
integration tests. I also use VCR [2], and now my entire application (both
backend and front-end) is wrapped with a proxy that records and replays every
request. I can run all my tests once using test Stripe API keys, a test
Recaptcha response, or any other external services that I want to test. I
don't have to mock anything, which is nice. Then everything is recorded, and I
can run all my integration tests offline.

I've also really enjoyed using stripe-ruby-mock [3] when testing specific
webhooks, jobs, and controller actions. I don't always aim for 100% test
coverage, but I try to write a LOT of tests for any code that deals with
billing and subscriptions.

Ooh, I've also been enjoying rswag [4]. It's quite a cool idea - you write
RSpec tests for your API endpoints, and the tests also serve as a Swagger
definition for your API. So when your tests pass, you can use the output to
generate documentation or API clients for any language.

[1] [https://github.com/oesmith/puffing-billy](https://github.com/oesmith/puffing-billy)

[2] [https://github.com/vcr/vcr](https://github.com/vcr/vcr)

[3] [https://github.com/rebelidealist/stripe-ruby-mock](https://github.com/rebelidealist/stripe-ruby-mock)

[4]
[https://github.com/domaindrivendev/rswag](https://github.com/domaindrivendev/rswag)
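
(VCR is Ruby; for readers on Python, the vcrpy port does the same record/replay trick. A hedged sketch against a hypothetical endpoint:)

    
    
        import requests
        import vcr
        
        # First run hits the network and records the exchange to a YAML
        # "cassette"; later runs replay the cassette, so the test is offline.
        @vcr.use_cassette("fixtures/cassettes/plans.yaml")
        def test_lists_plans():
            resp = requests.get("https://api.example.com/v1/plans")
            assert resp.status_code == 200
    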

------
breatheoften
I think the testing pyramid reflects a false correlation — it asserts that
tests higher up the pyramid are more expensive to write and maintain and take
longer to run.

In reality the execution time of a test says nothing about how hard the test
is to write. Sometimes a very fast to execute unit test can be much harder to
write/maintain than a longer running test that avoids mocking an api and
perhaps utilizes abstractions in the test definition that are already written
to support the program’s features.

I think test suite execution speed is the real metric to focus on for most
projects — to get the most value, test suites should accelerate the time to
useful feedback. Write tests in the simplest way that provides useful feedback
into the behavior of the system and runs quickly enough that you can receive
that feedback with low latency during development.

I quite like tools like Jest and Wallaby.js that use code coverage data to
figure out which tests to rerun as code changes — it means you can have a
test suite that includes slow(ish) tests and still get feedback quickly as
you make changes to the code.

~~~
Vinnl
> to get the most value, test suites should accelerate the time to useful
> feedback

Well, they should also optimise the usefulness of the feedback they provide.
Typically, tests higher up the pyramid are also more brittle (e.g. end-to-end
tests might fire up an entire browser and Selenium), and thus are more likely
to fail when in actuality, nothing is wrong. That's an additional reason for
limiting the number of those tests.

~~~
breatheoften
Brittle tests aren't useful in general though, are they?

I'm not sure it's necessarily true that brittleness must correlate with
height in the pyramid or with execution time -- in my experience brittleness
correlates with Selenium more than it does with pyramid height (that's a
statement about Selenium more than about any particular category of the
testing pyramid).

It's possible to write very useful non-brittle tests using something like
headless Chrome ...
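
For reference, driving headless Chrome through Selenium's Python bindings looks roughly like this (a sketch; the page and selectors are hypothetical):

    
    
        from selenium import webdriver
        from selenium.webdriver.common.by import By
        
        options = webdriver.ChromeOptions()
        options.add_argument("--headless")  # no visible browser window
        
        driver = webdriver.Chrome(options=options)
        try:
            driver.get("https://example.com/login")  # hypothetical happy path
            driver.find_element(By.NAME, "email").send_keys("user@example.com")
            driver.find_element(By.NAME, "password").send_keys("hunter2")
            driver.find_element(By.CSS_SELECTOR, "button[type=submit]").click()
            assert "Dashboard" in driver.title
        finally:
            driver.quit()
    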

~~~
Vinnl
No they're not.

But yes, Selenium is brittle. That said, Google engineers actually did some
investigation into this, and although I think their methods were probably a
bit heavyweight, they did conclude that it's mostly RAM use that leads to
brittleness.

[1] [https://testing.googleblog.com/2017/04/where-do-our-flaky-tests-come-from.html](https://testing.googleblog.com/2017/04/where-do-our-flaky-tests-come-from.html)

~~~
breatheoften
Interesting, thanks for the link!

I’m curious how many tests were in the small size range for that chart, which
provides evidence that the size-flakiness correlation holds even in tests
that use tools associated with higher-than-average flakiness...

I’m also feeling like I want more clarity around the mechanism for measuring
flakiness — the definition they use is that a test is flaky if it shows both
failing and successful runs with the “same code” — does “same code” refer to
a freeze of only the codebase under test, or does it also cover changes to
the tools in the testing environment ...?

I wonder what the test suites for tools like selenium/WebDriver look like ...
do they track a concept of “meta-flakiness” to try and observe changes to test
flakiness results caused by changes to the test tooling ...?

~~~
Vinnl
Yeah, good questions, the post leaves some to be desired. And meta-flakiness
tooling actually sounds like it could be really useful!

------
branko_d
Integration testing is especially important when talking to a database.
People seem to like mocking the data, but that completely misses the
subtleties of how databases actually work: concurrency, transaction
isolation levels, locking, and so on.

There should be a few well crafted tests that modify the database from several
parallel threads, and afterwards verify that no invariants have been broken.
This is pretty much the only opportunity to catch a race condition in a
controlled environment.
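
A hedged sketch of the shape of such a test (the `db` handle and `transfer` function are hypothetical, and details like one connection per thread are elided):

    
    
        import threading
        
        def test_transfers_preserve_total_balance(db):
            # Two accounts with a fixed combined balance.
            db.execute("UPDATE accounts SET balance = 500 WHERE id IN (1, 2)")
        
            def worker():
                for _ in range(100):
                    transfer(db, src=1, dst=2, amount=1)  # unit under test
                    transfer(db, src=2, dst=1, amount=1)
        
            # Hammer the same rows from several parallel threads.
            threads = [threading.Thread(target=worker) for _ in range(8)]
            for t in threads:
                t.start()
            for t in threads:
                t.join()
        
            # Invariant: money is neither created nor destroyed.
            total = db.execute("SELECT SUM(balance) FROM accounts").scalar()
            assert total == 1000
    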

~~~
gipp
This is the part I don't get when people preach the gospel of ultra-isolated
test suites where nothing talks to an external system. The most nontrivial
parts of your code, the ones most likely to break, are those that talk to
external systems.
systems.

Building a mock of that system sophisticated enough to capture even a
significant fraction of the "real" failure modes is just as, if not more,
error-prone and daunting a task.

Totally isolated tests make it impossible to test the most failure-prone parts
of your code in anything approaching a satisfactory manner. Maybe I'm just
misunderstanding people's advice but it drives me crazy when I see that.

~~~
MartinCron
A lot of the ultra-isolated test advocates seem to come from the kind of
development world where you can exercise most of the value without needing to
interact with external dependencies. With aggressive enough mocking, you can
deceive yourself into thinking that any application is in that domain.

I know I have done it, and the application worked PERFECTLY until I hooked it
up to a real database.

------
oweiler
As always it depends. There are projects where integration is the hard part
and others where the business logic is impossible to get right without unit
tests.

Unit tests also have the added benefit that they tell you exactly why they
failed (because there can only be one reason), whereas integration tests can
fail for multiple reasons.

~~~
toasterlovin
Yeah, but usually an integration test tells you what line it failed on, which
gets you fairly close to knowing the exact reason it failed.

------
ezanmoto
Just reading the comments, I agree with the common sentiment that you should
have more unit tests than integration tests, but I have come around to the way
of thinking that if I only have time to write a few tests then I would rather
write E2E tests. This way, at the very least your entire stack is being
exercised, and you have a way of ensuring that the happy path is passing
consistently, which is the most important flow for an application (even if I'm
personally more interested in keeping other flows sane). While I prefer unit
tests due to their simplicity, speed and the speed at which they can aid
debugging, these days I will only implement them after I have added a few E2E
tests.

------
jph
Behavior driven development (BDD) and test driven development (TDD) enable
much faster coding when done well in my experience, and that includes full
coverage fast unit tests, functional tests, benchmark tests, and integration
tests.

Unit tests are worth their weight in gold for quickly finding issues, both in
our team's code and especially in cases of subtle changes among language
releases, or unanticipated input changes, or dependency changes that are
supposed to work but don't.

IMHO unit tests lead to better functional approaches, better long range
maintainability, better security, and much better handling of corner cases.

~~~
hacker_9
I think the main selling point of TDD for me was breaking the boring build
loop where I would have to wait for the project to compile and the webpage to
load, click through all the buttons to get to the bit I'm testing, and then
finally see if my code worked - then repeat as many times as it failed.

With TDD testing functionality is always just one button click away - I
actually have fun again at work.

------
ivan1931
I disagree with this article.

Software projects can be extremely large and have wildly different
requirements. Software that needs to operate at high scale with high
reliability will have different requirements from, say, a web application
that has low traffic.

I think making rules of thumb like the title of this article defeats the
purpose of one of the essential tasks of being an engineer - making good
tradeoffs between different approaches to solving problems. It's not hard to
see that some projects will require more unit testing, some may require more
integration testing, and others may require more of both.

------
MichaelMoser123
The problem with this kind of advice is that projects are all very different
- there is no one-size-fits-all way to test them.

~~~
misja111
This is the only true statement that can be made about this whole
integration/unit testing debate, but it will never be popular because it
doesn't seem simple enough.

Most people like black and white guidelines. 'Unit testing is good,
integration testing is bad', something along those lines. Simple to remember,
simple to apply. Unfortunately it doesn't match reality.

The truth is, every testing strategy is a tradeoff. Integration tests are
great at catching regressions and verifying business requirements, but they
can take a lot of time to construct, tend to be slow, and don't give you many
clues about where to find the error when they fail.

Unit tests are fast, easy to build, and when they are well written they will
point you right at the part of your code that is wrong. However, the more
specific they are, the more they tend to test your implementation choices
instead of real business requirements.

But even the above is not always right. When you are building a library, you
might be able to create unit tests that test your API, and then you have the
best of both worlds: your unit tests actually verify your business
requirements. On the other hand, you might be building an application which
doesn't depend on a lot of data and which can be orchestrated pretty well in
a test setup, so integration tests are actually easy to build; in that case
you'd want to focus more on integration tests.

TL;DR: there is no silver bullet; in the end the only right thing to do is
let your test strategy depend on the characteristics of the project.

------
waibelp
Why are most people using huge hero-graphics (495kb?!) for short blog postings
nowadays?

On topic: testing gets worse when the codebase that needs to be tested is
garbage. In my experience, developers learning testing don't need to learn
how testing works - they need to learn basic development rules: components,
loose coupling, dependency injection, ...

------
andmarios
Integration tests are very important and in many cases mandatory. I'd argue,
though, that unit tests with good coverage are more important. The latter
help you make sure your application has the correct output; nothing is worse
than having a piece of software you think is working silently introduce
errors. The former (integration tests) make sure your application starts and
probably works in some scenarios.

With docker and containers, integration testing is easier than ever.

I've written a very simple integration testing tool (a quick hack, to be
honest) that executes commands, checks their exit codes and output (via
regexp, etc.), and produces a nice HTML report.

[https://github.com/landoop/coyote](https://github.com/landoop/coyote)

Originally written for testing some configuration tools and scripts, we
quickly found a use for it for our Kafka connector collection (Stream
Reactor). Our connector integration tests are at github as well:

[https://github.com/Landoop/kafka-connectors-tests](https://github.com/Landoop/kafka-connectors-tests)

Obviously unit tests make sure that the connectors work as expected.
Integration tests catch simpler errors that can be showstoppers: for example,
an internally renamed configuration option that didn't trickle down to the
configuration parser, class shadowing issues between the connectors,
unexpected errors in logs that the developers should check out, etc. As the
tests are written by people who don't do development, they also expose
problems in the documentation. In some cases they provide us an easy way to
quickly run a connector locally and catch some issues manually, like
excessive CPU usage or way too much logging (e.g., a function with a log
statement inside a loop that ran hundreds or thousands of times per second).

It gives us confidence in what we ship with every release.

------
sandos
I am currently in a project where we do not use TDD, even remotely, but we do
have almost 100% coverage, since we need it for certification purposes.

These tests are fairly easy to write, actually, with a test framework that
helps mock everything automatically.

Then we have some internal tools we've written that make these tests worth a
lot less: we produce a lot of the test data and then just record the outputs
as "correct". We have no real idea if they are correct, but at least the
tests as they are now work as regression tests. This is what happens when you
realize you need unit tests for tens of thousands of lines of code in a few
weeks' time, instead of writing the tests when you write the code.

There are of course higher levels of testing for this industrial system,
multiple levels even, and I have to say that the unit tests are fairly
useless in comparison.

------
choward
Why did I just have to scroll three pages to get past a pineapple? Nobody
wins. It takes extra effort to add. There is no ad revenue. The reader gets
annoyed. WHY?!?!?!

------
laurent123456
On a hobby project recently I've only written backend integration tests. I
found these to be extremely useful, though, since they touch, directly or
indirectly, all the critical parts of the software, so if something's broken
it will eventually be caught there. Also, since they're relatively high
level, there's rarely a need to change them whenever I change something in
the backend. All in all it's definitely a time saver, both in terms of
catching bugs and maintaining the tests.

I think there's still value in detailed unit tests, but mostly for library
code, where you want to test each function properly with various inputs.

------
didibus
I agree very much with this. I'd add one thing: adding tests and testing your
code are not the same thing. You should write tests, mostly integration
tests, testing the public, well-defined boundaries of your class, your
component, your service. Mock only IO if needed, not other classes. But also
test the rest, even with no test added for it: just run the code, try out the
private functions, make sure they work.

And also go read Testivus:
[https://www.artima.com/weblogs/viewpost.jsp?thread=204677](https://www.artima.com/weblogs/viewpost.jsp?thread=204677)

------
therealdrag0
I trust my own sense of when unit-tests and integration tests are appropriate.

If I'm working on a bit of code that has actual internal logic, then I'm happy
to write unit tests so I can iterate on the test with scenarios while I code.

OTOH, if I'm writing a glorified CRUD pass-through (api -> model -> db-entity
-> do a thing -> return result), there is nothing to unit test. Writing a few
integration tests that call the API gives me much more confidence that
everything is hooked up correctly than mocking a bunch of shit and asserting
that a method was called that was obviously called.
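
A hedged sketch of that kind of test (here `client` stands in for any framework's HTTP test client, e.g. a pytest fixture around Flask's; the endpoint is hypothetical):

    
    
        def test_create_then_fetch_widget(client):
            created = client.post("/widgets", json={"name": "sprocket"})
            assert created.status_code == 201
        
            widget_id = created.get_json()["id"]
            fetched = client.get("/widgets/%d" % widget_id)
        
            # One test proves routing, (de)serialization, the model and the
            # DB mapping are hooked up -- no mocks asserting "method was called".
            assert fetched.get_json()["name"] == "sprocket"
    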

------
TimJYoung
The one thing that I didn't see mentioned is the fact that you _can_ test
very specific branches of code with higher-level testing, but the same cannot
be done in reverse. In fact, higher-level tests, especially tests driven with
real-world data, routinely execute branches of code in ways that are
unforeseen by the developer and unlikely to have been tested at a lower
level. I suspect that fuzz testing will some day remove much of the need for
lower-level testing.

------
danrspen
I started writing a response to this ... but it got a bit too long and became
my first ever Medium post: [https://medium.com/@danrspen/write-tests-a-sensible-amount-of-appropriate-types-d19819d39bf0](https://medium.com/@danrspen/write-tests-a-sensible-amount-of-appropriate-types-d19819d39bf0)

------
Jach
I don't practice TDD (see Norvig, Jeffers, and Sudoku) but I do like unit
tests. I also like integration, e2e, property, mutation, and fuzz tests. I
like proving specs with TLA+. Sometimes type theory proofs are useful. There's
a lot of ways to improve quality. Some of those I've only been able to toy
with personally, though, because the dirty secret of software engineering is
that for most things we don't need very high levels of quality; we just need
most customers to not be mad. So that leads to articles like this, where
people claim their experience showed the most bang for the buck with a
certain approach, despite not always even trying other approaches...

> One thing that it doesn’t show though is that as you move up the pyramid,
> the confidence quotient of each form of testing increases. You get more bang
> for your buck. So while E2E tests may be slower and more expensive than unit
> tests, they bring you much more confidence that your application is working
> as intended.

In my experience this is wrong. Working with an inverse pyramid, you'd think
confidence would be high, bugs few. It's the exact opposite.

> just stop mocking so much stuff

Hear, hear. But this gets into subtle arguments over "is this really a unit
test if it's using RealFoo even though we're trying to test Bar, such that if
RealFoo breaks this test will also break but not because of Bar?" I'm not too
strict on my definition of unit test, my best attempt is something like
"relatively small, isolated from the broader module/library/application, runs
fast, easy to find root failure when test fails, avoids testing the
implementation rather than behavior (often hard), and asserts something." It
leaves open the possibility for technical 'integrations' but there's a pretty
big space of possibilities between a minor integration to avoid mostly useless
mocking and suddenly requiring the whole application server to have started up
before you can do anything.

------
maxxxxx
Reading all the comments reminds me why I really don't like working in large
enterprise teams. People argue a lot about semantics ("what is a unit test?")
and about the only right way to do things, but there is almost nothing
practical to learn from it.

I have found that in some projects unit tests are really easy to write and
helpful but in others you spend more time on writing mocks and dependency
injection than writing stable code.

I don't even know what I want to say exactly other than that we should focus
more on practical solutions to real problems and less on debating semantics.

------
cbhl
I feel like there's definitely an art to mocking. Rather than mocking out the
module being called (A calls B, A_test mocks B), you want to mock out its
dependencies (A calls B calls C, A_test mocks C). But if you swap out said
dependency, now you have to go update other modules' integration tests to mock
out the new service.

------
tybit
I think this is poor advice because it doesn't give context.

I think unit tests are good for testing logic, and integration tests are good
for testing functionality. Lots of complex logic? Then write lots of unit
tests. Got a service that just wraps a database? Then you're going to want to
write a lot of integration tests.

------
scyclow
Does anyone have any good resources on writing front-end integration tests?
I've had nothing but terrible experiences with selenium. And I recently tried
Nightmare (which is Electron-based), but it wasn't much better.

~~~
MartinCron
I too have had selenium nightmares but have landed someplace pretty good. Here
are some quick tips. YMMV.

Don’t think “integration” — think “full stack”: these will find configuration
and connectivity bugs more than business logic bugs. They can’t be your only
tests.

They need to run as part of your CD/CI pipeline automatically, otherwise, they
won’t get run and will decay from disuse.

Headless browsers (HTMLUnit and PhantomJS) are easier to work with than “real”
browsers. Haven’t used Chrome headless yet.

Front end bugs are often fiddly and visual. Screenshots + human review can be
a cost effective supplement to manual testing, but can never replace it.

Good error logging and reporting is also key. If your front end tests break
something, having the backend tell you what broke will save you time.

I tend to keep these “full stack” tests to happy path scenarios, as they are
slower to write and to run than lower level integration tests.

Good luck.

------
juandazapata
Unit tests force you to think about good design. Integration tests don't. If
you only use integration tests, you'll end up with a big ball of mud.

Certainly I wouldn't want to touch a codebase by the author of this post.

~~~
mannykannot
This is exactly the wrong way round. Only by thinking about interfaces can you
avoid a big ball of mud, and interfaces are what integration testing tests.

------
lucidguppy
Clean architecture: [https://smile.amazon.com/Clean-Architecture-Craftsmans-Software-Structure/dp/0134494164](https://smile.amazon.com/Clean-Architecture-Craftsmans-Software-Structure/dp/0134494164)

Anything that touches the real world should be as small as possible.

I've been writing code using tests for 4+ years now, and I can't imagine
writing code any other way.

Without tests I would be scared of refactoring. Also, I'm testing the code
anyway - why not write it down so it gets done every time?

Run integration tests too for sanity - they're testing the integration of
things.

To me, coding without tests is like going caving without a flashlight. You
don't really know what your code does until you run tests. Your confidence
rises as more code is covered.

No it isn't perfect - but not writing tests is not _better_.

~~~
wolco
The problem with always relying on testing is that you lose your ability to
create without the test crutch. Everything is red/green and your brain can
get a little lazy. Balance is best.

~~~
lucidguppy
What is science without tests? I say coding is similar.

------
luord
> You should very rarely have to change tests when you refactor code.

I'm going to say this is poor, if not outright dangerous, advice. I'd argue
the opposite: every time the code changes, it would be great if a test
somewhere broke.

Hell, if the tests originally were for a gigantic class with a ton of mocks
but the code changed so there was lower coupling and the dependencies were
injectable or separated in different modules, the tests _should_ be changed.

> even a strongly typed language should have tests

Indeed, yet I've seen so many people in the strong typing "camp" mentioning
"the compiler catches some bugs right away!" as an advantage, which always
makes me think "... and?"

> It doesn’t matter if your button component calls the onClick handler if that
> handler doesn't make the right request with the right data!

... But there should be another unit test for that handler, checking that it
makes the right request with the right data.

In general, I agree with some points and disagree on others.

~~~
marcosdumay
> every time the code changes, it would be great if a test somewhere broke

So, it doesn't matter if your code is correct or not?

> "the compiler catches some bugs right away!" as an advantage, which always
> makes me think "... and?"

And that leaves more time to design things right and test the stuff that
actually matters.

~~~
luord
> So, it doesn't matter if your code is correct or not?

I'm going to need you to walk me through how that part you quoted implies
that the code being correct doesn't matter. I honestly don't see the
connection.

> And that leaves more time to design things right and test the stuff that
> actually matters.

I'm not gonna get into a religious flamewar; if you prefer static typing, all
the power to you and I'm not interested in convincing you otherwise.

However, I fail to see how having to write "int", "str", etc before or after a
variable name impacts the design process in _any way_.

As for testing:

\- A test in Go:

    
    
        func TestAvg(t *testing.T) {
        	for _, tt := range []struct {
        		Nos    []int
        		Result int
        	}{
        		{Nos: []int{2, 4}, Result: 3},
        		{Nos: []int{1, 2, 5}, Result: 2},
        		{Nos: []int{1}, Result: 1},
        		{Nos: []int{}, Result: 0},
        		{Nos: []int{2, -2}, Result: 0},
        	} {
        		if avg := Average(tt.Nos...); avg != tt.Result {
        			t.Fatalf("expected average of %v to be %d, got %d\n", tt.Nos, tt.Result, avg)
        		}
        	}
        }
    

\- The exact same test in python:

    
    
        def test_average():
            for param in [
                {'nos': (2, 4), 'res': 3},
                {'nos': (1, 2, 5), 'res': 2},
                {'nos': (1,), 'res': 1},
                {'nos': (), 'res': 0},
                {'nos': (2, -2), 'res': 0},
            ]:
                assert average(*param['nos']) == param['res']
    

In a real situation, tests like this one need to be written and the fact that
in Go we're specifying types doesn't change a thing, so I'm not convinced that
not writing the types means that in a dynamic language I have to test stuff
that doesn't matter. For that matter, I'm having a hard time imagining a
scenario where the type wouldn't be tested anyway so back to my "... and?"

But, again, I'm also not trying to convince you otherwise. Different
approaches for different folks.

~~~
marcosdumay
> I'm going to need you to walk me so I can see how that part you quoted
> implies that the code being correct doesn't matter.

Well, that quote was your entire sentence. It wasn't out of context. So:

> every time the code changes, it would be great if a test somewhere broke

There isn't anything anywhere about the code being incorrect. Thus it is
irrelevant.

About the types, their entire point is that they save you from writing the
tests. What the compiler proves, you do not test. If your compiler won't prove
anything, you'll have to write all the tests.

As an example, you forgot to test if `'nos': ("", None)` yields the correct
error.

~~~
luord
> As an example, you forgot to test if `'nos': ("", None)` yields the correct
> error.

It isn't in the go code either.

> There isn't anything anywhere about the code being incorrect. Thus it is
> irrelevant.

Why, exactly? I'm not following your inference there; again, walk me through
that one.

Let me be more explicit: what exactly did you think I meant with that line?
Again, I don't see how it implies that "it doesn't matter whether the code is
correct".

> About the types, their entire point is that they save you from writing the
> tests.

Ok, this is getting circular so I'll just ask you to give me an example of a
test that would be absolutely required in a dynamic language but not in a
static one.

Mind you, that isn't going to convince me one way or another in the "static vs
dynamic" flamewar since I subscribe to the idea that more tests is better. I'm
asking mostly out of curiosity.

------
spopejoy
The original tweet is a takeoff on the Michael Pollan maxim, right? "Eat
food. Not too much. Mostly plants."

------
egeozcan
This is my version: Don't stop writing tests. Not until 100% coverage. Mostly
unit.

Especially if you are developing a library. An untested code branch is a
ticking time bomb, in my book.

An application for end users does indeed benefit from integration tests, a
lot. The problem is running them efficiently: if they take an hour to run,
nobody will bother to analyse them.

~~~
fiatjaf
Analyse? Shouldn't tests just pass?

~~~
egeozcan
After a point they just fix them to make them pass. Integration tests are
really easy to cheat when the people writing them and using them are both
really demotivated.

------
jrs95
Reminds me of this: [http://david.heinemeierhansson.com/2014/tdd-is-dead-long-live-testing.html](http://david.heinemeierhansson.com/2014/tdd-is-dead-long-live-testing.html)

------
agentultra
Testing.

My current theory for why there is so much confusion about testing is that
developers are not often taught the difference between _specification_
(theorems) and _implementation_ (proofs), and why you want to separate the
two.

It seems like business and investors want us to write code, more of it, and
faster. Value, value, value!

So my question is: what is valuable?

Do you value your customers' data? Do you value their time? Their safety?
Your brand and reputation? If you answered yes to any of these (and the other
questions I may have forgotten or elided) then you should be encouraging your
developers to write specifications.

One such specification, and a weak one that developers can write and maintain
on their own without involving stakeholders, is unit tests. It's a weak form
of specification for the library/module/component you would like to have,
because it specifies properties and behaviors by example: the spec gives an
example of use and the expected outcomes. A good test is a verbose Hoare
triple: _given some context_ , _when this method is called_ , _then this
result is expected_.

Your implementation of that specification is what you're after and it's the
reason why you should write the tests first. Writing tests first has little to
do with productivity or design. You should write them first because your
implementation should _prove_ the _theorems_ in your specification. Theory
first. Proof after.

Sometimes you have to revise your theories after attempting the proof.. but
that's a story for another day.

But we can write better specifications! Unit tests are weak because each test
only demonstrates a single expectation. It doesn't quantify over the space of
possible inputs! If you want to write a better specification for your parser
or transformation, try _property based testing_. Use a library like
QuickCheck. You give it a theorem that quantifies over the input space of
your function under test and it will find out for you if your proof holds
(limited only by how many examples you want to try... say 10000). It's not a
_proof_ that your implementation is correct, but it's a much stronger
guarantee than a suite of unit tests and it doesn't cost much more to use.
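
In Python, for instance, the Hypothesis library plays the QuickCheck role; a hedged sketch:

    
    
        from hypothesis import given, strategies as st
        
        # Hypothetical unit under test: a round-trippable encoding.
        def encode(s):
            return s.encode("utf-8")
        
        def decode(b):
            return b.decode("utf-8")
        
        # The theorem quantifies over all strings: decode inverts encode.
        # Hypothesis generates hundreds of examples, including nasty ones.
        @given(st.text())
        def test_roundtrip(s):
            assert decode(encode(s)) == s
    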

Integration tests, though. I'm not sure I agree with the advice. You should
definitely write them, but they can become a time sink and cost quite a bit
to run if you need to test a non-trivial system. Where they lack is at the
level of quantification, and this is where the real errors lie in software
systems with many components. Integration tests, like unit tests, are proof
by example. You write a specification for how a given configuration of
components should interact, supply a context, run the test, and see if your
assertions hold. It's wise to be aware that they won't catch most safety
errors and will never catch liveness errors.

Safety errors have to do with the _correctness_ of expected values over the
lifetime of a computation.

Liveness has to do with maintaining invariants over the lifetime of a
computation.

So integration tests are useful, do write them, but I wouldn't advise spending
most of your time on them if you're dealing with more than 2 or 3 components.

Once you have messaging and co-ordination you're going to want a stronger
specification and that would probably look more like a theorem written in a
language that can be verified by something called a _model checker_. Something
like TLA+ is making good progress breaking into industry.

... this is getting long. To summarize: developers should be taught and given
time to write specifications. Most errors in software arise from poor,
incorrect, or missing specifications. Weak specifications are better than
none. Think of tests as specifications, write them first, and prove your
software meets those specifications. Then you can change your implementation
as you refine your specification and write better, faster, more reliable
software.

~~~
hwayne
> Liveness having to do with maintaining invariants over the lifetime of a
> computation.

Nit: liveness is about reaching 'good' states over the lifetime of your
computation. If your invariant is violated, even if it's a multistate
invariant, it's still a safety error. Liveness would be something like "x is
eventually true", or "the program always terminates."

It's not just specifications we need. We also need better tests. "Better" here
doesn't mean "integration" or "acceptance", it means things like "fuzzing
through contracts" or "comparing snapshots" or "rules-based state machines".
Testing is vast and we're not very good at it.

------
titzer
> I’ve heard managers and teams mandating 100% code coverage for applications.
> That’s a really bad idea. The problem is that you get diminishing returns on
> our tests as the coverage increases much beyond 70%...

I call bullshit.

I work on V8, on JITs and WebAssembly. 70% coverage for these code bases would
be absurdly low. We would never ship code that is that poorly tested, and you
shouldn't either.

> You may also find yourself testing implementation details just so you can
> make sure you get that one line of code that’s hard to reproduce in a test
> environment. You really want to avoid testing implementation details because
> it doesn’t give you very much confidence that your application is working
> and it slows you down when refactoring. You should very rarely have to
> change tests when you refactor code.

What in the. serious. fuck. Of course tests test implementation details.
Because _implementation details_ are where the goddamn bugs are.

> ... Maintaining tests like this actually really slow you and your team down.

That's the whole _point_. It slows you down in the short term but it keeps you
from experiencing a full-on system meltdown when everything seems to be
breaking at once.

Please don't follow the advice of this. It's total crap.

If you've never worked on a system that has survived more than 3 years, sure,
go right ahead, run against the wall. But when you work on a system that
survives 5, 10 (V8), or 20 years (HotSpot JVM), then you really, really want
to have good tests.

~~~
away2017throw
No offence, but you work in a bubble of sorts. In enterprise we are absolutely
expected to run against a wall - preferably fast.

~~~
titzer
The bubble happens to be at the bottom of everything everyone runs. If we--or
kernel folks for that matter--applied the advice of the article to our
development practices, our system meltdown would be your system meltdown.

Sure, you have requirements from management. So do architects and engineers
for building bridges. Yet they still have a duty to build bridges that don't
fall down.

~~~
jcrben
Are you suggesting that the Linux kernel is unit tested? Last I checked, I
couldn't find them. And discussions online say the same (e.g.,
[https://news.ycombinator.com/item?id=9543336](https://news.ycombinator.com/item?id=9543336)
and
[https://news.ycombinator.com/item?id=9544306](https://news.ycombinator.com/item?id=9544306)).
Kernel bugs tend to show up in userspace.

Fortunately, integration-oriented projects have arisen more recently, such as
[https://kernelci.org/](https://kernelci.org/) and [https://github.com/os-autoinst/openQA/](https://github.com/os-autoinst/openQA/)

Anyhow, it would make me a bit sad if the v8 team is writing unit tests
tightly coupled to the implementation. I've messed around with the codebase
and I didn't see tests like that - could you point to some?

~~~
titzer
[https://cs.chromium.org/chromium/src/v8/test/unittests/](https://cs.chromium.org/chromium/src/v8/test/unittests/)

~~~
jcrben
The ones I glanced at appear to test output without much mocking, which isn't
as tightly coupled as unit tests commonly end up being.

------
C7H8N4O2
"Eat food. Not too much. Mostly plants."

-Michael Pollan

~~~
tantalor
So much for "deep" & "profound".

~~~
emodendroket
What? Does it take away from it somehow that it's a twist on someone else's
famous phrase? I immediately recognized the reference despite hardly being a
Pollan-head, so I doubt anyone's idea was to conceal the connection.

------
killjoywashere
The title, by the way, is a play on Michael Pollan's famous essay "Unhappy
Meals". The top line becomes the subtitle, the lesson, and eventually the
title everyone googles for: "Eat food. Not too much. Mostly plants."

[http://www.nytimes.com/2007/01/28/magazine/28nutritionism.t.html](http://www.nytimes.com/2007/01/28/magazine/28nutritionism.t.html)

------
jlebrech
And keep as much logic as possible out of templates; that way you can unit
test the model.

Problem is, templating frameworks are too smart.

