
Questions to ask yourself when writing tests - michalc
https://charemza.name/blog/posts/methodologies/testing/questions-to-ask-yourself-when-writing-tests/
======
discreteevent
"consider instead writing higher level tests that each each test more of the
code."

I really like the way they keep repeating this as the answer to so many
questions. To me it reads as a softly-softly approach to weaning people off
what appears to be a mania for micro-level unit testing, driven by people like
Uncle Bob.

~~~
Ace17
> ... micro level unit testing, driven by people like Uncle Bob.

"The structure of your tests should not be a mirror of the structure of your
code. The fact that you have a class named X should not automatically imply
that you have a test class named XTest."

"The structure of the tests must not reflect the structure of the production
code, because that much coupling makes the system fragile and obstructs
refactoring. Rather, the structure of the tests must be independently designed
so as to minimize the coupling to the production code."

[http://blog.cleancoder.com/uncle-bob/2017/10/03/TestContravariance.html](http://blog.cleancoder.com/uncle-bob/2017/10/03/TestContravariance.html)

~~~
ancarda
I really wish Uncle Bob would provide code examples with his blog posts and
videos. I struggle to imagine what my tests would look like if I followed
this. Possibly as my tests are so tightly coupled right now that refactoring
is actually not possible in some cases.

Does anyone know of explanations of this with a more hands-on approach, or is
this simply a collection of ideas that can’t really be shown?

~~~
hdi
I've been wondering about that for a while.

So far, I've not been able to find any examples of test architecture that
follows those principles online or in my day job.

I've also been too lazy to do a side project and explore a more decoupled and
scalable test suite. Maybe I should get off my arse and finally do it.

~~~
twic
If you're writing a web application, I've had a reasonably happy time with
tests at the controller level, or perhaps right below it. You want an
interface where values go in and out that correspond to fairly user-level
ideas, but which are still tractable for testing.
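A framework-free sketch of what that can look like: the "controller" is a plain function, plain request-like data goes in, plain response-like data comes out, and a tiny in-memory fake stands in for storage. The handler shape and the fake db here are illustrative assumptions, not any particular framework's API.

```javascript
// Controller-level unit under test: data in, data out, storage injected.
function createTodoHandler(db) {
  return function handle(request) {
    if (!request.body.title) {
      return { status: 400, body: { error: 'title is required' } };
    }
    const id = db.insert({ title: request.body.title });
    return { status: 201, body: { id, title: request.body.title } };
  };
}

// In a test, no HTTP server and no real database are needed:
const fakeDb = {
  rows: [],
  insert(row) { this.rows.push(row); return this.rows.length; },
};
const handle = createTodoHandler(fakeDb);
const ok = handle({ body: { title: 'write tests' } });
const bad = handle({ body: {} });
```

The test exercises user-level behaviour ("creating a todo returns its id", "a missing title is rejected") without coupling to the routing layer or the storage layer.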

------
conatus
The most important thing here seems to be "For every part of my code change,
if I break the code, will a test fail?". Or as I'd put it "if you break the
code deliberately does the test actually fail?".

Seen quite a few tests in my time that don't capture the functionality they
think they do. They pass, but wouldn't be able to tell if the underlying
functionality they capture is genuinely broken. This is, I guess, why the
standard practice is to go test red before you go test green.
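A minimal sketch of the failure mode (the function names are made up): a test that exercises the code but never asserts on the result will stay green even if you deliberately break the code, which is exactly what going red first is meant to catch.

```javascript
function add(a, b) {
  return a + b;
}

function badTest() {
  add(2, 2); // exercises the code but never asserts on the result
  return true; // "passes" no matter what add() does
}

function goodTest() {
  return add(2, 2) === 4; // goes red if add() is broken, e.g. `return a - b`
}
```

Changing `add` to `return a - b` and watching `goodTest` fail (while `badTest` keeps passing) is the five-second sanity check.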

------
simonhamp
The most important question to ask when writing tests: Can I do this faster if
I don’t write a test?

The answer is always no. Even if you are the only person building something,
future you will lick your boots clean in gratitude if there are tests.

Because even the best developers have to work with their own code sometimes.

~~~
crdoconnor
>Can I do this faster if I don’t write a test? The answer is always no.

I think the answer is mostly no but it's dangerous to think that it's always
no.

I've been given many stories in the past for which writing a realistic
automated test would have taken days, manual verification took minutes and the
code was fairly isolated and did not get changed very frequently.

Writing a test under those circumstances is actually a pretty poor investment.

~~~
simonhamp
I would be very surprised if an automated test literally took days to write
when a manual test is just minutes.

Also the time savings still pay off later, as automated tests usually take
seconds to run and there’s no training required - once it’s in the test suite
and the test suite runs are automated, it will always run and quickly identify
a failure - no “oops, we forgot to show Jim the Intern that he had to test
that part manually...”

Setting up your test suite and automation takes longer for sure, but not days.
Even a complex manual process can be automated relatively quickly... the
manual process should be fairly scriptable in any OS nowadays, and most
platforms have great test frameworks.

~~~
andrewflnr
Think about something that necessarily involves hardware interaction, or a
GUI, or where all the interesting error cases are non-deterministic
(concurrency, network error handling). Ok, on the last one you're pretty much
hosed anyway. But we're not all writing nice data-in/data-out apps.

~~~
crdoconnor
Actually, I probably wouldn't count network error handling, because you can
probably use something like vaurien -
[https://github.com/community-libs/vaurien](https://github.com/community-libs/vaurien)
- to deterministically mimic bad network conditions in an integration test in
a fairly reasonable amount of time.

Also, integrating that tool would probably have applications beyond a single
story so even if the change takes 5 minutes and making the test work with
vaurien takes half a day, it's probably still worthwhile.

Assuming that no tool like vaurien exists, though (and there are plenty of
scenarios out there where you'd have to build it from scratch), building the
test scenario could become prohibitively expensive.

------
falsedan
Here are some more questions:

How much business value is this test adding? That is, if this test failed and
we ignored it, how much would the business suffer?

Is the code easy to test? That is, does the design have lots of self-contained
components with well-described input/outputs & conditions/assumptions? Do the
docs clearly communicate that?

Will the test still work if we change the implementation? How much work to
update/remove the tests if the behaviour has to change to follow new business
requirements?

------
nexfitter
> Have I just made something public in order to test it?
>
> If yes, consider instead writing higher level tests that each test more of the code.

This is one that I struggle with in JS with React.js components. If you have a
little helper component in a file that isn't exported but used in the same
file by a component that is exported, it is sometimes difficult to test that
non-exported component. Because of how enzyme shallow rendering works you
don't get the full tree so that component, if sufficiently nested, might not
ever be touched. This forces me to export the component just to test it.

example: [https://imgur.com/a/gqgcS](https://imgur.com/a/gqgcS)

~~~
reledi
I run into the same problem with React and it bugs me from a testing
standpoint.

Extracting code from a big component to helper functions and extracting those
functions from the component can lead to cleaner code, and it makes it much
easier to test the behaviour of the helpers directly than having to render the
component with enzyme.

A good example of this is moving state changes to pure functions [1] which
makes them much easier to test, but you'll need to export those functions to
test them.

1:
[https://twitter.com/dan_abramov/status/824310320399319040](https://twitter.com/dan_abramov/status/824310320399319040)
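A sketch of that idea (the state shape here is made up for illustration): the updater is extracted into a pure function and exported, so it can be tested as a plain function call, with no rendering and no enzyme.

```javascript
// Pure state-update function, extracted from the component and exported.
function incrementCounter(state, step = 1) {
  return { ...state, count: state.count + step };
}

// In a React component this would be used roughly as:
//   this.setState(state => incrementCounter(state));
// In a test, it's just a function call:
const next = incrementCounter({ count: 1, label: 'clicks' });
```

Because the updater returns a new object rather than mutating its argument, the test can also cheaply check purity, which is what makes it safe to pass to `setState`.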

~~~
nexfitter
that is a really cool technique, thanks for the link!

------
sethammons
> Mocking introduces assumptions, which introduce risk.

This boils down to something I've had on my mind a lot of late, though with a
different spin. I write a lot of Go. I prefer testing interfaces while some
others want to use mock generators. This quote captures part of my reasoning
behind avoiding mocks. I plan to write a detailed post at some point, full of
examples. I think this quote will work its way in there.
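Sketched in JavaScript rather than Go (all names here are illustrative), the distinction might look like this: a hand-written fake satisfies the same interface as the real dependency, and the test asserts on observable state, instead of a generated mock pinning down exactly which methods were called in which order.

```javascript
// Hand-written fake that satisfies the same interface as a real store.
class InMemoryStore {
  constructor() { this.items = new Map(); }
  save(id, value) { this.items.set(id, value); }
  load(id) { return this.items.get(id); }
}

// Code under test depends only on the store interface:
function renameUser(store, id, newName) {
  const user = store.load(id);
  store.save(id, { ...user, name: newName });
  return store.load(id);
}

// The test asserts on the resulting state, not on call sequences,
// so refactoring renameUser's internals won't break it:
const store = new InMemoryStore();
store.save(1, { name: 'old', email: 'x@example.com' });
const updated = renameUser(store, 1, 'new');
```

A generated mock that asserted "`load` was called once, then `save` once" would embed the implementation's call pattern as an assumption; the fake only embeds the interface.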

~~~
michalc
I would be interested in reading it... if/when you write it, and you can send
me a link, that would be great!

------
reledi
Couple more which haven't been mentioned yet:

- Can I run the tests in random order?

- Are the tests optimised for readability?

- Are the tests unnecessarily testing third-party code?

- Do individual tests contain the whole story? (Or do you have to look in
many places to understand each test?)

~~~
baristaGeek
Being able to run tests in a random order is indeed pretty important. It's an
indicator not only that the features were written with high quality, but that
the tests are high quality as well. Things such as mocks mutating shared state
do happen, and the developer writing the tests should guard against them.

------
KirinDave
I'm glad to see that increasingly people are recognizing that shallow mocks of
an API interface are not very good at testing anything.

------
securingsincity
I'll add a couple more

Can I run these tests more than once? Will they ever go stale? Can I run these
tests and they'll clean themselves up?

Having to update tests because they didn't take into account the date changing
(Happy birthday Joe Test!) or manually cleaning up data is a huge time suck.
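One common way to keep date-dependent tests from going stale is to pass the clock in as a parameter instead of reading it inside the function (the names here are illustrative):

```javascript
// Production code defaults to the real clock; tests supply a fixed one.
function isBirthdayToday(birthday, now = new Date()) {
  return birthday.getMonth() === now.getMonth() &&
         birthday.getDate() === now.getDate();
}

// The test pins "today", so Joe Test's birthday never sneaks up on it:
const fixedNow = new Date(2020, 5, 15); // 15 June 2020, local time
```

The same injection trick works for random seeds and environment lookups, the other usual sources of tests that rot over time.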

------
spion
Here are 3 more questions:

- Are these tests relying on an API that is likely to change?

- Can I make the API surface area used by all tests even smaller?

- Can I make a library that wraps the existing API of the unit to get a
smaller and/or more stable API for use in the tests only?

These 3 help get tests that withstand refactors.

An example:

An acceptance test for an editing form is relying on the save button having a
certain CSS style to find it and click it. This is API that is likely to
change. An unrelated change in the looks of a button may break the test.

If we switch to using the text of the button ("Save"), that's better, because
that is what the user is likely to rely on too when they try to find a save
button. But it's still not perfect, as the text of the button could change.

The final step is to make a little library function that finds a save button
within a certain form. Then we can encode the logic of the test but vary the
kinds of text that are considered "save" (or even the method of finding a save
button - perhaps a standard CSS class of save buttons in the future!); the
test logic itself remains "permanent" since it doesn't rely on any
implementation detail anymore.
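A sketch of that last step, with the form modeled as plain data so it runs standalone (in a real suite this helper would wrap enzyme / Testing Library / WebDriver queries, and the accepted labels are made up):

```javascript
// The only place in the whole test suite that knows how save buttons
// are found. Change the mechanism here; every test stays untouched.
function getSaveButton(form) {
  const saveLabels = ['Save', 'Save changes', 'Apply'];
  return form.find(button => saveLabels.includes(button.text)) || null;
}

// Test code calls the helper, never a selector:
const editForm = [{ text: 'Cancel' }, { text: 'Save changes' }];
const saveButton = getSaveButton(editForm);
```

If the product later standardises on a CSS class for save buttons, only the body of `getSaveButton` changes; the tests' logic remains "permanent".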

~~~
atoko
A small library function like document.getElementById?

~~~
spion
The example wasn't perfect, but no, it wouldn't be that. What if in the future
we have a screen that needs to show more than one save form? Then we'll need
to stop using the "save" element id as a mechanism to denote save buttons; to
do that, we need to update all the tests that rely on this mechanism.

The small library function would be `getSaveButton(form)` or even `save(form)`
- now every form save test no longer encodes the knowledge of how a form's
save buttons are made, whether that's by using a certain ID, class, text or
anything else.

Now when we get that requirement for a screen with two forms, we'll no longer
get mad and try to attack the idea (two save forms on one screen? that's
inconsistent with our product, it's confusing the users, etc. etc.) when the
real reason is that it creates pain updating our tests. Instead we just update
the save function.

The general idea is to encode the meaning of the test and separate out the
implementation details (clicking the save button might even be too concrete -
"saving the form" is probably about the right level of abstraction). A good
way to do this is to describe the test in prose and check whether it's
encoding details that may vary - does this sound like something universally
true / something that will be true forever, or accidentally true due to
current circumstances?

------
lambdabit
Here's one more. Is maintaining this test more time and work than testing
manually?

------
atticusCr
The author does not cover any question related to application security. Things
like is this parameter/input value properly sanitized, does this piece is/is
not vulnerable to injection attacks, does this piece of code performs
authentication/authorization checks? Is RBAC properly implemented for this
method?

~~~
viraptor
I agree with some cases, but "is this parameter/input value properly
sanitized" is a bit weird. It should only ever apply to a) the db framework,
b) those N really weird cases that have to break the abstraction and don't use
the db framework. If you have to test every input, then the problem is on a
completely different level than a missing test.

~~~
atticusCr
Kind of. If you have a centralized place to perform input data validation, as
you should, then it is just a matter of testing that piece of code, same as if
you were using a framework. However, I don't understand why you refer to a db
in the first place? Is it because I used an injection attack as an example? If
that's the case, bear in mind that injection targets other interpreters as
well, not only a db.

But getting back to my original idea, what I want to highlight is the need to
add cases that cover application security.
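A sketch of what testing such a centralized validator might look like (the allow-list rule and the payloads are illustrative, not from any particular framework): one function owns the validation, so one focused set of security tests covers it.

```javascript
// Centralized validator: a conservative allow-list, so anything outside
// the expected character set is rejected rather than sanitized.
function validateUsername(input) {
  return /^[A-Za-z0-9_]{1,32}$/.test(input);
}

// Security-focused cases: representative payloads for SQL, HTML and
// shell interpreters, per the point that injection isn't just a db issue.
const injectionPayloads = [
  "' OR 1=1 --",
  '<script>alert(1)</script>',
  'a; rm -rf /',
];
```

Because validation lives in one place, these cases don't need repeating per endpoint; each endpoint's tests only need to check that the validator is actually applied.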

~~~
viraptor
Yeah, my mind substituted parameter with query parameter. Too much database
stuff at my $dayjob recently and I get tunnel vision ;-)

~~~
atticusCr
lol! thanks for your comments.

------
V2hLe0ThslzRaV2
>> "For every part of my code change, if I break the code, will a test fail?"

Since "breaking code" is super subjective, and generally speaking, trying to
"cover everything" is a recipe for hell.

Anyone able to expand on what the author meant by this?

~~~
michalc
Perhaps this part is not clear / well defined, but roughly, I meant that code
coverage is (about) 100% for the lines added/changed, and some "reasonable"
subset of possible breaking changes would be picked up by failing tests.

What I had in mind specifically in the answer, was the case of changing
"interfaces" between parts of code. For example, the case of changing a
function's arguments, or how it uses them, but omitting to change a call site.
Checking that the call site just calls the function would not be enough to
produce a failing test, especially without type safety. The test would
actually need to assert on what the function does, e.g. its return value for a
pure function.
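A hypothetical illustration of that call-site case (the functions are made up): suppose `greet()` changed from taking a string to taking a user object. A test that only checks the call site doesn't throw passes either way, because a stale call still returns a string, just the wrong one.

```javascript
function greet(user) {
  return `Hello, ${user.name}!`;
}

function welcomePage(user) {
  return greet(user); // up to date; a stale site would be greet(user.name)
}

// A stale call site like greet(user.name) would yield "Hello, undefined!"
// without throwing, so only asserting on the value turns it into a red test:
const page = welcomePage({ name: 'Ada' });
```

With static types the stale `greet(user.name)` wouldn't even compile, which is part of the static-typing discussion further down the thread; without them, the assertion on the return value is what catches it.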

Yes, I think trying to test against every single possible breaking change is
not valuable.

------
yawaramin
Looks like Michal has been immersed in Haskell for at least the past year. I
wonder if he will have something to say about balancing testing and coverage
with static typing.

~~~
michalc
Thinking about this, at the moment, I don't think static typing is a
replacement for testing (or vice versa!). Although, apparently, with something
like LiquidHaskell, you can get more logic "into" the types and checked by the
compiler, but I'm unsure how much you can do when it comes to more complex
business logic.

Regarding testing "glue": static typing often gives evidence (but not proof)
that code is glued together appropriately [refactoring even small projects
without tests in Haskell is a joy: the compiler essentially tells you what to
change]. However, it doesn't give evidence that the high level behaviour is
what it needs to be. So I think higher level tests are still needed.

I think maybe changing the first question from...

> Am I confident the feature I wrote today works because of the tests and not
> because I loaded up the application?

to...

> Am I confident the feature I wrote today works because of the tests and type
> checking, and not because I loaded up the application?

will probably help you to answer the question about how much static types
allow you to forgo tests. My instinct is that in most cases, high level tests
are still worthwhile.

~~~
yawaramin
I'm thinking much the same, though I do think of tests as looking after the
dynamic behaviour and types after the static behaviour. Which seems obvious in
retrospect, but once you wrap your head around it you can build neat
abstractions like lightweight static capabilities:
[https://github.com/yawaramin/lightweght-static-capabilities](https://github.com/yawaramin/lightweght-static-capabilities)

------
dvh
1. Am I paid to write tests?

~~~
jyriand
Yes. You are paid to write working software.

