
Test Desiderata 8/12: Tests should be isolated from each other [video] - KentBeck
https://www.youtube.com/watch?v=HApI2cspQus
======
umvi
Seems to me like your test bed would be more reliable if each test is
responsible for resetting its own global state before it is run. If you rely
on everybody else cleaning up after themselves, you're gonna get burned by
another developer.

To use the dryer example, if my test always cleans the lint filter _before_
running, I know my test will always run with a clean lint filter.

If I instead rely on _all other tests_ cleaning the lint filter _after_ their
tests, it only takes one developer to make a mistake and break my tests'
assumptions.
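
Concretely, what I mean is something like this rough pytest sketch (the
"lint trap" global and the helper are made up for illustration):

    import pytest

    _lint_trap = []  # stand-in for shared global state (the "lint filter")

    def add_lint(fluff):
        _lint_trap.append(fluff)

    @pytest.fixture(autouse=True)
    def clean_lint_first():
        # Runs before every test in this module: start from a known-clean
        # state instead of trusting earlier tests to clean up after themselves.
        _lint_trap.clear()

    def test_trap_starts_empty():
        assert _lint_trap == []

    def test_trap_collects_lint():
        add_lint("fluff")
        assert _lint_trap == ["fluff"]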

As for tests "cleaning up after themselves", that's generally a good idea if
you are, for example, altering a database, generating a file, etc. However,
that is also error prone (it only takes one developer forgetting to delete a
generated file). The way I've made that more reliable is to run tests in a
Docker container, so any new files created during the run are destroyed along
with the container at the end of testing. I've also toyed with the idea of
using the VCS to clean test fingerprints (e.g. git clean), but the main
problem I've run into with that is that some developers' tests write stuff to
/tmp.
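
One lighter-weight guard (not a substitute for the container approach) is to
register the cleanup with the test framework as soon as the file is created,
so it runs even when the test fails. A rough Python sketch; the file name and
the test are made up for illustration:

    import os
    import unittest

    class GeneratedFileTest(unittest.TestCase):
        def test_writes_output_file(self):
            path = "generated-output.txt"
            with open(path, "w") as f:
                f.write("data")
            # Registered right after creating the file, so the cleanup runs
            # even if a later assertion fails or errors out.
            self.addCleanup(os.remove, path)
            self.assertTrue(os.path.exists(path))

    if __name__ == "__main__":
        unittest.main()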

~~~
wfleming
In my experience a mixed approach can be valuable, but whichever approach is
taken, I think the real key is to build it so that it's not an individual
test's responsibility to clean up either before or after; the work is
extracted into the test runner/framework so every test gets the appropriate
behavior automatically.

As an example of what I mean by a mixed approach: if the thing being tested
primarily interacts with, say, a SQL database, then the "clean up before"
approach (which might mean running `TRUNCATE foo` on every table) works, but
as the schema gets bigger it gets slower, and even a modest per-test cost
becomes very noticeable when every test pays it (particularly as the test
suite is presumably also growing). An effective way to implement the "clean up
after" approach is to set up the test runner so every test runs within a
transaction that is rolled back after the test; but for that to be reliable,
the database must be in a clean state when the test suite starts. So the mixed
approach I've used (and which is close to the default for Rails & Phoenix apps
nowadays, I think) is to do the "clean" op once at the start of the entire
suite and configure the runner so every test runs in a transaction.
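
Very roughly, with pytest and the stdlib sqlite3 module (the schema, table,
and fixture names are made up; Rails and Phoenix ship their own version of
this):

    import sqlite3
    import pytest

    @pytest.fixture(scope="session")
    def db():
        # The "clean" op runs once per suite: fresh in-memory DB, fresh schema.
        conn = sqlite3.connect(":memory:", isolation_level=None)
        conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT)")
        yield conn
        conn.close()

    @pytest.fixture
    def tx(db):
        # Every test runs inside a transaction that is rolled back afterwards,
        # so nothing a test writes can leak into any other test.
        db.execute("BEGIN")
        yield db
        db.execute("ROLLBACK")

    def test_insert_is_visible_inside_the_test(tx):
        tx.execute("INSERT INTO people (name) VALUES ('Ada')")
        count, = tx.execute("SELECT COUNT(*) FROM people").fetchone()
        assert count == 1

    def test_nothing_leaks_between_tests(tx):
        count, = tx.execute("SELECT COUNT(*) FROM people").fetchone()
        assert count == 0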

The transaction/snapshotting features of modern SQL databases make that
approach really easy, of course. If a project's main external interaction is
instead filesystem interaction or something, it may not be so simple, and more
of the infrastructure would have to be hand-rolled; but if it has to be done
at scale (many tests, many developers), I do think it's worth abstracting into
the test runner so that individual tests don't need to remember those details.
I've done that for some CLI tools where the test runner sets up a sandbox dir
for each test, so the tests just need to use the appropriate patterns for FS
interaction (which is easy, since all the existing tests do it and people
imitate those; in my case it also helped stop the "just write to /tmp"
approach that some had taken before).

The container isolation approach certainly has the advantage that it's
difficult to mess up no matter what any particular test does, but if you run
each _test_ within a container, in the long term I'd be worried about the
overhead causing the same headaches as the "truncate DB tables before every
test" approach. And if you're running the entire test suite in one container,
it seems like each test would still need to be responsible for its own
cleanup unless some of the other approaches are also taken? Sorry if I didn't
understand the container approach you described.

------
Strilanc
In my experience, it's a bad idea to share even _immutable_ data between
tests. At least certain kinds of it.

Basically what happens is you write a bunch of tests, and they all operate on
a Person let's say. So you get into the habit of making a 20 year old Tom with
bla bla bla each time. Very repetitive. So you decide to move Tom into a
common file that every test can import. Typical DRY stuff.

Then, over time, Tom grows tentacles. Every test that uses Tom needs just one
more little detail to be added. Tom needs to have a height for the height
test. For the shopping test Tom needs a walking speed and a preference for
ladders. These properties keep accumulating, and Tom grows huge. His growing
complexities, and his presence in so many tests, make it impossible to remove
or edit anything about him. You resort to making Tom variants. In each test
you clone Tom then tweak his weight and shoe type or whatever. Of course
nobody is making sure that all these properties and property edits are
mutually consistent, and you find that because of this even adding properties
to Tom has somehow become difficult. Understanding all facets of Tom becomes
a black art. New tests fail sometimes, and no one knows why. You poke and
prod Tom for a couple of hours, and the problem goes away. Eventually even
basic usage of Tom is a black art.

You should have just repeated yourself.
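
By "repeat yourself" I mean something like this (a rough sketch; Person and
its fields are stand-ins, not from any real codebase):

    from dataclasses import dataclass

    @dataclass
    class Person:
        name: str
        age: int
        height_cm: float = 0.0
        walking_speed: float = 0.0

    def test_height():
        # This test spells out exactly the Tom it needs, nothing more.
        tom = Person(name="Tom", age=20, height_cm=180.0)
        assert tom.height_cm == 180.0

    def test_walking_speed_defaults():
        # A little repetition, but there's no shared Tom to grow tentacles.
        tom = Person(name="Tom", age=20)
        assert tom.walking_speed == 0.0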

------
mceachen
@kentbeck, you might want to fix the Medium link in that YouTube video's
description. It 404s for me.

Kudos for pushing best practices!

