
A survey of testing techniques we've found useful - lukes386
https://vector.dev/blog/how-we-test-vector/
======
joshvm
One thing that is rarely discussed (I think?) is how to test things which
don't have a single correct answer. It's not just "refactor until you can
test"; the output itself may be subjective. For example, suppose you write
some image-processing code like a stereo matcher. How do you check that your
code works? Usually you have some ground truth to compare against, but it's
difficult because you'll never get 100% accuracy. At best you can declare a
baseline, eg that your algo should be, say, 90% accurate (if you implemented
it properly, based on literature results), and error out if you don't reach
it. In that case you can use a numerical metric, but in other applications
you might care about the result being aesthetically pleasing (eg a video ISP
where you do colour correction on the stream coming from a low-level camera).
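
Concretely, the baseline check can be a perfectly ordinary test. A
pytest-style sketch (stereo_match, load_stereo_pairs, and the exact numbers
are all invented for illustration):

    import numpy as np

    # Hypothetical imports: your own matcher and a ground-truth dataset.
    from mymatcher import stereo_match
    from mydata import load_stereo_pairs  # yields (left, right, true_disparity)

    BASELINE_ACCURACY = 0.90  # baseline taken from literature results
    TOLERANCE_PX = 1.0        # within 1 px of ground truth counts as correct

    def test_matcher_meets_baseline():
        correct = total = 0
        for left, right, truth in load_stereo_pairs():
            pred = stereo_match(left, right)
            valid = ~np.isnan(truth)  # ground truth is sparse; skip unknowns
            errors = np.abs(pred[valid] - truth[valid])
            correct += np.count_nonzero(errors <= TOLERANCE_PX)
            total += np.count_nonzero(valid)
        accuracy = correct / total
        assert accuracy >= BASELINE_ACCURACY, f"regressed to {accuracy:.1%}"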

Or hardware, where the advice is usually to mock the device under test. But
if you don't own the hardware, the most you can do is try to emulate it and
maybe check that your simulated state machine works. In my experience it's
easier to run with the hardware connected and just skip those tests
otherwise. There are also extremely subtle bugs that can crop up with
hardware interfaces, like needing to insert delays into code (eg when sending
serial) that will otherwise fail in the real world.
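
The skip-unless-connected pattern looks something like this (a pytest sketch
assuming pyserial; the env var, port settings, and 5 ms pacing are
illustrative, not gospel):

    import os
    import time
    import pytest
    import serial  # pyserial

    HW_PORT = os.environ.get("HW_PORT")  # eg /dev/ttyUSB0; unset on CI

    # Skip the whole module unless real hardware is attached.
    pytestmark = pytest.mark.skipif(HW_PORT is None, reason="no hardware connected")

    def send_command(port, payload):
        # The subtle real-world bug: some devices drop bytes without a
        # small inter-byte delay, which no mock will ever catch.
        for b in payload:
            port.write(bytes([b]))
            time.sleep(0.005)  # illustrative 5 ms pacing

    def test_device_echoes_command():
        with serial.Serial(HW_PORT, 115200, timeout=1) as port:
            send_command(port, b"PING\n")
            assert port.readline().strip() == b"PONG"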

OpenCV has some interesting approaches to this, for example a test that
stores a video in a certain format, inserts a frame with a known shape (like
a circle), then reads the video back and checks that the shape can still be
detected.
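
Roughly this idea, as a sketch (not OpenCV's actual test code):

    import cv2
    import numpy as np

    def test_codec_roundtrip_preserves_shapes(tmp_path):
        path = str(tmp_path / "out.avi")
        frame = np.zeros((240, 320, 3), dtype=np.uint8)
        cv2.circle(frame, (160, 120), 40, (255, 255, 255), -1)

        # Write a single-frame video in the format under test.
        writer = cv2.VideoWriter(path, cv2.VideoWriter_fourcc(*"MJPG"),
                                 30.0, (320, 240))
        writer.write(frame)
        writer.release()

        # Read it back and check the known shape survived encode/decode.
        cap = cv2.VideoCapture(path)
        ok, decoded = cap.read()
        cap.release()
        assert ok

        gray = cv2.cvtColor(decoded, cv2.COLOR_BGR2GRAY)
        circles = cv2.HoughCircles(gray, cv2.HOUGH_GRADIENT, dp=1, minDist=50,
                                   param1=100, param2=20,
                                   minRadius=30, maxRadius=50)
        assert circles is not None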

~~~
Ididntdothis
Testing with hardware is hard. You can emulate to some degree, but that just
helps make sure your tests are written correctly. In the end you have to run
tests against the real thing. I deal with complex UIs that interact with
hardware. If you are smart you can split things up so they are easier to test
in isolation, but the whole system has a ton of potential interactions that
are hard to write test cases for.

The OpenCV example is a pretty easy one. You have clear inputs with clearly
defined outputs. The only thing you have to do is create sample data.

~~~
dbcurtis
> Testing with hardware is hard.

Yup. I work in robotics.

I try to isolate the actual hardware interaction layer so that for testing
you can mock the driver and hardware in one piece. Of course that does not
test the driver. With any luck, the driver is pretty stable once it works,
though. And the driver+hardware piece can have its own (physical) test bench
so that at least manual testing is, well, maybe not efficient, but at least
not painful.
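
The seam ends up looking something like this (a Python sketch with invented
names, not our actual code):

    from typing import Protocol

    class MotorInterface(Protocol):
        """The seam: everything above this layer talks to the interface,
        never to the driver or hardware directly."""
        def set_velocity(self, rad_per_s: float) -> None: ...
        def velocity(self) -> float: ...

    class MockMotor:
        """Mocks driver + hardware in one piece. A crude first-order
        model; real dynamics would of course differ."""
        def __init__(self) -> None:
            self._target = 0.0
            self._actual = 0.0

        def set_velocity(self, rad_per_s: float) -> None:
            self._target = rad_per_s

        def velocity(self) -> float:
            # Simple lag so control code sees plausible behaviour.
            self._actual += 0.5 * (self._target - self._actual)
            return self._actual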

Simulators are great but not always available. Or are too much work to get
going.

One configuration often used for robots is the "boneless chicken": take a
bench and bolt all the guts down to it in a configuration where they are easy
to probe. Put the wheel motors someplace safe, with a synthetic load like a
Prony brake. Of course you can't test the nav stack that way. (I once
interviewed a firmware engineer who was coming off of the Juicero shutdown --
say what you want about Juicero, but from the sounds of it their boneless
chicken was outstanding, even integrated into the CI automation pipeline. Of
course, they didn't have the nav problem.)

Speaking of nav, I once saw a warehouse robot company's micro-warehouse for
testing nav PRs. Not the full test warehouse, just a 500-square-foot or so
area dedicated to testing nav PRs. It was integrated with CI automation. I
could tell from the accumulated tire marks on the floor that they had nav
pretty much nailed.

I have done several robot-to-elevator interfaces (probably more than anyone
else). In the end, final testing always required something akin to a few
midnight to 4 AM test blocks on the real elevator. And then of course as you
point out:

> the whole system has a ton of potential interactions that are hard to write
> test cases for.

They often don't show up until the system is under load.

~~~
ptsneves
Nice write-up. Just wanted to add that the problem with using simulators or
mocks is that you now have one extra code base to maintain, for a total of
three: the code, the tests, and the mock. For mock drivers this can be quite
a big task. In the end I just preferred to run on real hardware as much as
possible and go with unit tests. This from a person who generally does not
like unit tests, but sometimes there is no cheap way to do simulation.

------
Lichtso
I think this is a fairly good summary of the most important testing styles and
where / when to (not) use them.

One more category of tests I would add is meta tests (like mutation tests).
These are tests which test the tests, checking whether they would actually
catch errors / bugs or just always report that everything is fine.
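
The idea in miniature (tools like mutmut or cargo-mutants automate the
mutating step; everything here is a toy example):

    # Code under test and its suite.
    def clamp(x, lo, hi):
        return max(lo, min(x, hi))

    def test_clamp():
        assert clamp(5, 0, 10) == 5
        assert clamp(-1, 0, 10) == 0
        assert clamp(99, 0, 10) == 10

    # A mutation tester swaps an operator and reruns the suite:
    def mutated_clamp(x, lo, hi):
        return min(lo, max(x, hi))  # the mutation: max/min swapped

    # If test_clamp still passed against the mutant, the suite would
    # be too weak to trust. Here the mutant is killed: clamp(5, 0, 10)
    # becomes 0 instead of 5, so the first assertion fails.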

------
discreteevent
"If something is difficult to unit test, refactor until it's easy."

This is often a good idea but if you only need the flexibility to enable unit
testing then it may make your system more complex than it needs to be. Only
introduce indirection where it's really needed. See also "test induced design
damage" and "write tests, not too many, mostly integration".

~~~
pkolaczk
Only if you want to minimize the total complexity, which is almost never a
good idea. What is more important is the minimum amount of complexity that
needs to be understood in order to make a change. If you are able to test a
small subset of components in isolation, you can also understand them in
isolation, and modify them without needing to understand the whole system.
I'd rather read 20% of 110% code than 100% of 100%.

The advice to write mostly integration tests is a terrible one, particularly
when they test the integration of everything. When such tests catch bugs,
they don't tell you where the problem happened. They also take a long time to
execute.

~~~
taeric
The problem I have been exposed to is unit tests that have locked in what
should be just an implementation detail. That can be fine, if it is an
important detail. That said, your tests should not have to import all of the
same pieces that your code does. I actually prefer to use alternative
implementations in the tests, where possible.

For example, if your code zips something, your test could use several other
zip engines to verify the output.
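
Eg a differential round-trip against the standard library (a sketch;
my_gzip_compress/my_gzip_decompress stand in for your own code):

    import gzip

    # Hypothetical: the compressor/decompressor under test.
    from mylib import my_gzip_compress, my_gzip_decompress

    def test_roundtrip_against_stdlib():
        payload = b"hello " * 1000
        # Our compressor, stdlib decompressor...
        assert gzip.decompress(my_gzip_compress(payload)) == payload
        # ...and stdlib compressor, our decompressor.
        assert my_gzip_decompress(gzip.compress(payload)) == payload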

------
epdlxjmonad
When writing code, we often think "according to the design of our system,
this condition must be true at this point of execution." Examples are:

1. The argument x must not be 0.

2. The variable x must be smaller than the variable y.

3. The list foo must be non-empty.

4. The variable x should have the value 'Success' if it had the value 'Try'
at the beginning of the function call.

These 'invariants', or assertions, can be extremely useful for testing the
correctness of the code. Put simply, if an invariant is violated (during
unit, integration, or system tests), it indicates that either the design or
the implementation is wrong. An article on testing methodology would be more
appealing if it had some discussion of exploiting invariants/assertions.
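
For instance, the four examples above as plain runtime assertions (a sketch;
do_work is a hypothetical helper):

    def process(x: int, y: int, foo: list, status: str) -> str:
        # Invariants from the design; a violated assert during unit,
        # integration, or system tests flags a design or coding error.
        assert x != 0, "x must not be 0"              # example 1
        assert x < y, "x must be smaller than y"      # example 2
        assert len(foo) > 0, "foo must be non-empty"  # example 3

        entered_as_try = (status == "Try")
        status = do_work(x, y, foo, status)  # hypothetical helper

        # example 4: 'Try' at entry must become 'Success' by the end
        assert not entered_as_try or status == "Success"
        return status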

~~~
AdieuToLogic
I believe "Design by Contract"[0], or "DbC", is the concept you are
describing. The Notes and External Links sections of the wiki page have some
good resources, IMHO.

0 -
[https://en.wikipedia.org/wiki/Design_by_contract](https://en.wikipedia.org/wiki/Design_by_contract)

~~~
epdlxjmonad
Yes, closely related, but invariants can appear anywhere in the code (like
loop invariants) and are less restrictive than pre-conditions and post-
conditions, which must appear at the beginning and end of methods. So
invariants are more about testing than design.

Arguably, invariants are especially powerful in testing distributed systems:

0 - [https://www.datamonad.com/post/2020-02-19-testing-mr3/](https://www.datamonad.com/post/2020-02-19-testing-mr3/)

------
choeger
Very nice overview. I agree that all these techniques are very useful. But
one thing still bothers me: how does one test with a notion of time? How do
you test a cron job, or a calendar with an alarm function?

~~~
ptsneves
Yeah, those are the hard ones. Worse, time is at the heart of a good number
of issues, from performance to startup stabilization to the deadliest error
of all: the race condition.
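
The one mitigation I know for cron/alarm-style logic is injecting the clock,
so tests advance time explicitly instead of sleeping (a minimal sketch of
that pattern):

    import datetime as dt

    class FakeClock:
        """Injectable clock: tests control time explicitly."""
        def __init__(self, start: dt.datetime) -> None:
            self._now = start

        def now(self) -> dt.datetime:
            return self._now

        def advance(self, **kwargs) -> None:
            self._now += dt.timedelta(**kwargs)

    class Alarm:
        """Toy alarm that reads time through the injected clock."""
        def __init__(self, clock, fire_at: dt.datetime) -> None:
            self._clock, self._fire_at = clock, fire_at

        def is_due(self) -> bool:
            return self._clock.now() >= self._fire_at

    def test_alarm_fires_after_deadline():
        clock = FakeClock(dt.datetime(2021, 1, 1, 23, 59))
        alarm = Alarm(clock, dt.datetime(2021, 1, 2, 0, 0))
        assert not alarm.is_due()
        clock.advance(minutes=2)  # cross midnight without sleeping
        assert alarm.is_due()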

------
RaoulP
I’m a complete newbie when it comes to writing tests. But I know that SQLite
is tested to hell and back, and I believe Richard Hipp (the creator) has said
he spends more time and lines of code on the testing suite than on the SQLite
code itself. I hope he shares some of his insights some day.

~~~
idoubtit
> I hope he shares some of his insights some day.

You didn't search much:
[https://sqlite.org/testing.html](https://sqlite.org/testing.html)

~~~
binarylogicben
I love this page! We (Vector) hope to post something similar in the future.

