
Tests Should be Specific [video] - Kent Beck
https://www.youtube.com/watch?v=8lTfrCtPPNE
======
azangru
I would like to hear some words of wisdom, either from your personal
experience or from the gurus, about whether or not to use random data in
unit tests.

On the one hand, I understand the argument that random data makes your tests
irreproducible; so if something breaks a test, it may take a while to figure
out exactly what went wrong, and why.

On the other hand, I feel that hard-coding test data is too restrictive. For
example, in the linked video, they have a 40-hour week at an 8-dollar hourly
rate, and they expect the result of the calculation to equal 320. An
immediate question arises: what's so special about these numbers? Would the
test pass if the input were different? What about a 38-hour week and 20
dollars per hour? And so on...

What's your take on this?

~~~
hexaga
IME, random data is extremely useful. Consider fuzzing, property testing à la
QuickCheck, automated fault injection, etc.

The key to making it work well is to keep track of your PRNG seeds so that
runs stay deterministic. If you find that seed `484382943` makes a test fail,
append it to a list of seeds to always re-test.
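A minimal sketch of that seed-tracking idea in Python; the sortedness property and all names are made up for illustration, with `sorted` standing in for the code under test:

```python
import random

# Toy property under test: the output of our "sort" is actually ordered.
def is_sorted(xs):
    return all(a <= b for a, b in zip(xs, xs[1:]))

# Seeds that produced failures in the past; always re-tested.
REGRESSION_SEEDS = [484382943]

def check_property(seed):
    rng = random.Random(seed)  # one seed => one deterministic input
    data = [rng.randint(-1000, 1000) for _ in range(rng.randint(0, 50))]
    assert is_sorted(sorted(data)), f"property failed for seed {seed}"

def test_sorting_property():
    fresh = [random.randrange(2**32) for _ in range(100)]
    for seed in REGRESSION_SEEDS + fresh:
        check_property(seed)

test_sorting_property()
print("ok")
```

On failure the assertion message names the seed, so the exact input can be regenerated and the seed appended to `REGRESSION_SEEDS`.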

~~~
DelightOne
Are random test cases only for checking that code throws (or doesn't throw)
exceptions? If not, how do you create the expected results from the random
data without reproducing the method you call in the first place?

The only idea I have is to fix the seeds/inputs, write the expected results
to a file (making them persistent), and check them manually the first time.
Subsequent test runs then check whether the output has changed.
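What's described here is usually called snapshot (or golden-file) testing. A minimal sketch in Python, with a made-up `render_report` standing in for the code under test and the snapshot kept in a temp directory for the demo:

```python
import json
import os
import tempfile

def render_report(user):
    # Stand-in for the code under test.
    return {"greeting": f"Hello, {user['name']}!",
            "age_next_year": user["age"] + 1}

SNAPSHOT = os.path.join(tempfile.gettempdir(), "report_snapshot.json")

def test_report_matches_snapshot():
    actual = render_report({"name": "Ada", "age": 36})
    if not os.path.exists(SNAPSHOT):
        # First run: record the output; a human reviews it once and commits it.
        with open(SNAPSHOT, "w") as f:
            json.dump(actual, f, indent=2, sort_keys=True)
        return
    with open(SNAPSHOT) as f:
        expected = json.load(f)
    assert actual == expected, "output changed; review and re-record the snapshot"

# Demo: start fresh, record on the first run, compare on the second.
if os.path.exists(SNAPSHOT):
    os.remove(SNAPSHOT)
test_report_matches_snapshot()
test_report_matches_snapshot()
print("ok")
```

In a real suite the snapshot file lives in version control, so a change in output shows up as a reviewable diff.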

~~~
dean177
You can test properties of the result without asserting the exact result:

Take adding two integers: you can test 3 + 18 = 21, or you can assert that,
when both arguments are positive, the result is greater than each argument.
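That addition property can be checked over many random pairs; a sketch in Python, where the fixed seed keeps the run reproducible:

```python
import random

def add(a, b):
    return a + b  # trivially correct here; the properties are the point

def test_add_properties():
    rng = random.Random(0)  # fixed seed: the same inputs every run
    for _ in range(1000):
        a = rng.randint(1, 10**6)
        b = rng.randint(1, 10**6)
        # With both arguments positive, the sum exceeds each argument...
        assert add(a, b) > a and add(a, b) > b
        # ...and the order of the arguments doesn't matter.
        assert add(a, b) == add(b, a)

test_add_properties()
print("ok")
```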

~~~
pavel_lishin
No offense, but that's a toy problem that's hard to generalize to the actual
issues we see in our code. This might apply to some financial code where some
complex situation _should_ always produce a greater or smaller result, but
what about the things people typically work on? I'd wager a significant
percentage of us are writing CRUD apps; what's the crudsmith's equivalent to
this?

    expect(add(A, B)).toBeGreaterThan(A);
    expect(add(A, B)).toBeGreaterThan(B);

~~~
hexaga
It works the same way with actual (non-contrived) problems. You'd identify a
property you want to test, and assert that it holds for randomized inputs.

Say you have a CRUD API (TheTrackerOfWidgets), and you want to test a
create-Widget call. Widgets have a name and a size, and each is associated
with a unique id.

Your property test might look like (in pseudo-rust):

    
    
        #[quickcheck]
        fn created_widgets_exist(name: String, size: u32) -> bool {
            let resp = create_widget(&name, size);
            if !resp.is_valid_json_or_whatever() {
                return false;
            }
    
            let id = resp.get_id();
    
            let resp = retrieve_widget(id);
            if !resp.is_valid_json_or_whatever() {
                return false;
            }
    
            resp.get_name() == name && resp.get_size() == size
        }
    

You would probably go further and use bounded types instead of the default
ones, based on whatever domain-specific validation logic you have. Note that
nowhere does it actually assert that a response looks exactly like what you
expect; you assert that it is valid and/or satisfies some property (that
created widgets exist and are accessible by further API calls).
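The pseudo-rust above, translated into a runnable Python sketch; the in-memory dict is a hypothetical stand-in for the real TheTrackerOfWidgets service:

```python
import random
import uuid

# Hypothetical in-memory stand-in for the real API.
_widgets = {}

def create_widget(name, size):
    wid = str(uuid.uuid4())
    _widgets[wid] = {"id": wid, "name": name, "size": size}
    return _widgets[wid]

def retrieve_widget(wid):
    return _widgets.get(wid)

def test_created_widgets_exist():
    rng = random.Random(1234)  # record the seed so any failure reproduces
    for _ in range(200):
        name = "".join(rng.choice("abcdefgh ") for _ in range(rng.randint(0, 20)))
        size = rng.randint(0, 2**32 - 1)
        created = create_widget(name, size)
        fetched = retrieve_widget(created["id"])
        # Property: a created widget is retrievable with the same fields.
        assert fetched is not None
        assert fetched["name"] == name and fetched["size"] == size

test_created_widgets_exist()
print("ok")
```

Against a real service you would replace the dict with HTTP calls; the property itself (create, then retrieve, then compare fields) stays the same.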

~~~
DelightOne
Ahh! So with random inputs (from a pre-defined range) you can assert about
properties of the result (if you can find some), because you have limited
your input range.

Exact results still have to be created manually, or generated automatically
(randomly) and then checked manually once.

------
rglover
If anybody is interested, this appears to be part of a series:
[https://www.youtube.com/watch?v=5LOdKDqdWYU&list=PLlmVY7qtgT...](https://www.youtube.com/watch?v=5LOdKDqdWYU&list=PLlmVY7qtgT_lkbrk9iZNizp978mVzpBKl)

------
bump64
Specific tests don't exactly mean "simple" tests. It is hard to balance the
two, but in my experience, when people try to write specific tests they often
end up writing extremely simple ones.

------
cammil
They seem to start talking about stacks and leaves of stacks, as if one
should only test at the leaves. Surely this is woefully inaccurate?

Isn't the point of tests to test behaviour? Sure, be specific about
behaviour. And sure, unit tests should not overlap too much, but that is a
separate matter from multiple tests failing and no longer knowing where the
bug is.

~~~
bluGill
The purpose of a test is to state that "this should NEVER change". Thus you
need to figure out what will never change before you can write any tests. You
can use architecture to guide you: if you have a horizontal layer or a
vertical slice, you know that its interface to the other layers/slices will
be hard to change, so you can always safely test there. The problem is that
your layer/slice is complex (if it isn't, your architecture is too
inflexible!), so you also need to pick internal points to test. Those
internal points are then asserted never to change; you could legally refactor
them later, if only you didn't have those pesky tests that fail on conditions
that might still be important.

Where to inject tests is an extremely hard problem.

------
2rsf
This is also important for system, integration, e2e, or any other complex
test.

The problem is that it is very hard to filter out failures due to the
environment or other unrelated issues, and even harder to pinpoint the actual
problem.

~~~
mihaigalos
High-level testing suffers from this symptom, but it catches overarching
system problems. A failure there usually means you are missing a test at the
unit (spec) level, and possibly at the contract/collaboration level.

------
jve
I rather prefer higher-level tests (integration tests?) that say: hey, you
have an error. Or: the user will experience an error. Or: the expected
high-level outcome is wrong.

Because with unit testing, I miss, and can miss, so many conditions. If I get
a high-level error, sure, I will figure it out, but that is part of the
development process rather than something breaking live that you then have to
figure out anyway.

Of course, unit tests have their place where I want to test that a given
input produces a particular output (for example, some parser or sanitizer
class) and I want to receive a signal when my future development breaks
something.

Tldr: I don't think that having a very specific "what went wrong" is that
important. I'm grateful when tests fail, because that saves me from mistakes
going into production. It's like a safety net.

~~~
bluGill
I too prefer higher-level tests, but for a different reason: deep unit tests
tend to make it harder to change the code in non-functional ways. If I decide
my program structure is wrong, most of my unit tests will not apply to the
new structure, but the high-level tests still need to pass.

------
cammil
Not sure I get the point of this

