
Types and Tests - ingve
http://blog.cleancoder.com/uncle-bob/2019/06/08/TestsAndTypes.html
======
thethirdone
> Is [not testing nil/invalid inputs] a risk? Sure, but only if some other
> part of the system was written without tests. If you call this function from
> a module that you wrote with tests, using something as effective as the TDD
> discipline, you won’t be passing nil or negative numbers, or any other form
> of invalid arguments. So I’m not going to worry about that. It’s on you.

The problem with this way of thinking is that it makes tracking down a bug
much harder, and even if all the software is correct in the end, it still
makes development slower.

Imagine a bunch of functions A, B, ... and Z where A depends on B and so on.
If every one of those doesn't test what happens on incorrect inputs, then an
incorrect input to A may cause an exception in Z. You would have to track all
the way back up to find the cause of the issue.

For the given simple example, not testing negative or nil inputs may be fine
because just looking at the code is enough to document it, but on a larger
scale, failure conditions can be very obscure.

Simply put, in a long chain of software you will want to define the values on
which your function is valid and accept only those (so users don't assume
something works just because one case does). And to keep your code correct to
that specification, you need to make sure it errors on everything else.
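A small JS sketch of that chain (the function names are illustrative, not from the thread): without a precondition check, a bad input to A only surfaces deep inside Z, far from the actual mistake.

```javascript
// Hypothetical chain A -> B -> Z. Without the guard in `a`, calling
// a(-1) would only fail deep inside `z` ("Invalid array length"),
// and you'd have to trace back up to find the real cause.
function a(n) {
  if (!Number.isInteger(n) || n < 0) {
    throw new RangeError("a: expected a non-negative integer, got " + n);
  }
  return b(n);
}

function b(n) {
  return z(n);
}

function z(n) {
  return new Array(n).fill(0); // RangeError here if n is negative
}

console.log(a(3)); // [0, 0, 0]
```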

------
coldtea
> _My answer to that is that even using a dynamically typed language I don’t
> have to write the test for nil because I know that nil will never be passed.
> I don’t need to make states unrepresentable, if I know those states will
> never be represented._

If only programming actually worked this way... You can never know what "will
never be passed" unless you prove it, and types offer that proof.

For example, tons of bugs in languages that allow null to break their type
system concern null pointer dereferencing...

------
mgraczyk
I think it is pretty easy to illustrate why fewer tests are needed in certain
statically-typed languages.

JS

    function max(a, b) {
      if (a < b) {
        return b;
      }

      return a;
    }

    console.log(max(1, 2));
    // 2
    console.log(max(2, 1));
    // 2
    console.log(max([1,2,3]));
    // [1,2,3]

Rust

    fn max<T: PartialOrd>(a: T, b: T) -> T {
        if a < b {
            b
        } else {
            a
        }
    }

    println!("{}", max(1, 2));
    // 2
    println!("{}", max(2, 1));
    // 2
    println!("{}", max([1, 2, 3]));
    // Compiler error

In JS, it would be prudent to test for reasonable misuse of this kind
(returning undefined, throwing an Error, etc.). The author claims _This is a
dynamically typed language. You get exceptions when you mess up the types._
That isn't true in this case, and it is not clear from the code what would or
should happen.

In Rust, you don't need any tests like this.
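To make that concrete, here is one way the JS version might pin down misuse explicitly. The validation strategy and the `checkedMax` name are my own sketch, not from the comment; the point is that this is extra code and an extra test that the Rust compiler provides for free.

```javascript
// A defensively written JS max: validate argument types up front so
// misuse throws immediately instead of silently returning an array.
function checkedMax(a, b) {
  if (typeof a !== "number" || typeof b !== "number") {
    throw new TypeError("checkedMax expects two numbers");
  }
  return a < b ? b : a;
}

console.log(checkedMax(1, 2)); // 2

try {
  checkedMax([1, 2, 3]); // second argument is undefined
} catch (e) {
  console.log(e instanceof TypeError); // true
}
```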

~~~
z3t4
A common philosophy in JavaScript is to make it _just work_. For example if an
array is passed, sort it and return the max value, which was probably what the
programmer intended.

~~~
almostdeadguy
The max value of what? The array in one of the arguments? What if there are
two arrays? Or an array and a number? Or a multi-dimensional array of numbers
and any of the above?

I think this attempt to come up w/ reasonable interpretations for
unanticipated usages of a function is kind of a rabbit hole. Beyond that I
think it's also problematic to assume that all usages of a function in a
language without a type system are intended usages, i.e. that programmers
_mean_ to pass arrays whenever an array appears as one of the arguments.

~~~
z3t4
Sometimes you want to call a function with something else, or are not using an
IDE (no argument hinting), and just call the function intuitively. Instead of
converting the input, or writing a middle-man, you edit the function to accept
it as is. You don't even have to pass values in the right order, the function
will figure it out.

~~~
almostdeadguy
There's nothing wrong w/ polymorphic functions, but there has to be a
reasonable rule for what kinds and shapes of data are permitted. Trying to
aggressively handle unforeseen cases instead of indicating an error doesn't
actually help anyone; it's akin to swallowing exceptions. It's why falsy
values and automatic type conversion are some of the most problematic
features in JavaScript.

Your example of how to handle max for array arguments is exactly the kind of
thing I would discourage in JS code, because it overloads the meaning of
"maximum" onto data that don't have a natural way of assigning an ordering if
heterogeneous arguments are permitted. (Actually, it sounds like you're
saying something like max([1], [9]) should return "9", which doesn't
correspond to an intuitive understanding of "maximum" at all, given that it
returns something that isn't either of the passed arguments.) The Rust code,
by contrast, can use the "PartialOrd" trait to ensure we only try to produce
a maximum of like elements with a defined ordering relation. That's fine for
the limited number of cases where it's easy to show type equivalence in JS,
but I think it's problematic as a general design principle, and because you
don't have ways to express the limits of polymorphism, I've seen this kind of
thing proliferate into unsafe uses of a function very quickly.

~~~
z3t4
Only after you have tried to call the function with an array many times would
it make sense. There's another option too: add a check and a friendly error.
But it should be a fairly rare thing to do.

Naming things can be hard, and maybe more so if you do not have type
annotations. But being used to dynamic languages, I always find type
annotations a chore. And I think it's better to have a good name and no types
than a bad name and types, e.g. max(...numbers) vs x(y: number, z: number) ->
number. Even though x carries more annotation, it could mean anything, but
with max(...numbers) I can be fairly sure.

~~~
mgraczyk
But then you have to test for this behavior, which was the counterpoint to
the article.

------
agentultra
When I’m writing Haskell I still practice _Test Driven Development_ but I
usually start with _Type Driven Development_.

The richer the type system, the more expressive the propositions I can
encode, and the easier it is to write a proof (a term with a type that
matches the proposition), leaving me free to write only the tests that matter
and eliminating the need to start by specifying degenerate cases.

You can express concepts like non-empty lists, and constraints on kinds and
types which encode the propositions you'd normally test by example using unit
tests. It's simply not necessary in some cases to start designing programs
with unit tests when you have a type system like Haskell's (or Idris's,
Agda's, Lean's, etc.).

~~~
johnday
The hidden thing about Haskell that makes it super-powerful for expressing
safe programs (in my experience) is not only that you can express tight
propositions in the types. It's that Haskell makes it very easy to write total
functions over wider types.

When writing OOP code you often start with a number of base assumptions about
the input and build the code around them. With Haskell, you make no such
assumptions initially and either encode total functions (most of them!) or
tighten the types until they match the true system requirements.

There are some Prelude functions which violate this understanding (`head`
being a classic example). Avoiding these as much as possible makes your code
almost guaranteed to be robust right out of the gate.

------
iainmerrick
_The odds that this test will fail are on the order of a million to one. I
could better those odds if I thought it necessary._

That seems... pretty bad? Am I missing something?

If you have 1000 tests like that in a large codebase with a continuous build,
you’re going to get an irritating number of random failures.

It’s not really a concern until you have a really large codebase. But if you
leave it until you have a really large codebase, it’s a lot more effort to go
back and fix the flaky tests.

~~~
hinkley
50 tests that each fail 1% of the time will cause roughly 40% of your builds
to fail (1 - 0.99^50 ≈ 0.39), and the odds of two failing in a row are about
one in six. As the number of builds per day increases, the probability of a
given day having multiple failed builds in a row (causing you to look closely
at the build and commit logs) approaches 100%.

~~~
Chyzwar
Unit tests should not be flaky. Integration tests and acceptance tests can be
flaky by the nature of networks, but not unit tests.

------
jeremyjh
I've spent a lot of time now writing code in both Haskell and Elixir. Where I
have found that static typing really shines is when you do large scale
refactoring. Especially in a large code base it is much, much faster to fix
compiler errors after refactoring, than to run the tests and figure out why
each one is failing. In Haskell by the time my tests all compile after a
refactor, they are already passing.

~~~
dgreensp
Yup. In a codebase without static types, large-scale refactors are probably
just not practical, which may be why fans of dynamic types don't have this
problem; they can't, and thus don't, do substantial refactors without
rewriting.

~~~
z3t4
There are also static analysis tools for JavaScript, for example the Google
Closure Compiler, which can infer about 80% of all types.

------
cjfd
I would like to ask what we are actually trying to achieve with both types
and tests. The answer is to find an error, in case there is one, as soon as
possible instead of in production when the angry customer calls us. The
sooner the error is found, the smaller the problem. If it is found at
compilation, it says 'you have an error on line 43'. If it is found by
automated testing, it says 'there is something fishy about functionality
such-and-such'. Pushing the discovery of an error from compile time to
automated-test time is a postponement, and therefore not an improvement.
Also, I really wish I could encounter a production system with test coverage
approaching 100%, but I have never been that lucky. In that case your type
system is still protecting you, but the tests that people failed to write are
not doing very much.

------
anaphor
I highly recommend reading this tweet thread from Shriram Krishnamurthi on
this subject:
[https://twitter.com/ShriramKMurthi/status/113641175359047270...](https://twitter.com/ShriramKMurthi/status/1136411753590472707)

------
danite
I'd say there's also a big difference in the sort of correctness that static
types guarantee vs the correctness that tests give you. Tests, to borrow some
phrasing from Donald Rumsfeld, are only really effective against "known
unknowns" not "unknown unknowns." Tests are only as thorough as the test
writer and the "known unknown" cases they can come up with to test against.
But as software developers we've all been in the situation where some case
that you didn't think to test for is what winds up causing a bug (an "unknown
unknown").

Static type systems on the other hand, can give you much greater guarantees of
correctness (if they're sound). Due to the Curry-Howard isomorphism we know
that programs in sound static type systems are the same as mathematical
proofs. That's a much stronger guarantee than what you're given by testing
alone. The problem of missing a case in your testing goes away if you have a
mathematical proof that that case cannot occur. You've taken away some of the
burden of thoroughness from the programmer and given it to the compiler
instead.

In a perfect programming utopia we would all encode the desired properties of
our programs in static types and have no need for testing because we'd have
proofs of correctness. The problem of course is that writing those kinds of
proofs into types is very time consuming and tedious and isn't realistic for
most software development. So we still need tests to cover what is too
expensive or complicated to type.

------
redact207
I'm bewildered because every senior dev I've ever spoken to has been somewhat
unanimous about the use of types and tests. Reading this I'm wondering if
there's a special context where this proposed approach is more suitable?
Possibly a single dev, tiny sealed program?

For every other system, be it with more than 1 developer or that will grow to
more than 10s of functions, types and tests eliminate being anchored to the
code you've written in the past.

The #1 benefit to them is that you can forget what you've written. It's
unlikely you'll remember in 6 months' time that one function where you can't
pass in negatives or nil. Do you want to have to read the internals of every
function you're calling? Isn't that a violation of some open/closed
principle?

There is nothing sweeter than defining your intent in code contracts and
confirming those contracts work before letting you or anyone else use them.
If you don't want negatives passed to your function, maybe unsigned integers
are a good idea. Maybe make it explicit that you're sending in an 'age' value
by defining it as a value object with its own validation. Make it blow up in
the hands of the consumer as fast as possible. If you're waiting for runtime
to tell you these things, then you're not going to enjoy development.

And your peers will hate maintaining your junk code.

------
whatshisface
The long-term benefit of typing doesn't show up in examples that fit in
tweets: if the desired behavior is encoded in the types, the checks appear for
free everywhere you use them, and you don't risk forgetting to test something
in the n+1th function that you remembered to test in the nth.

------
2T1Qka0rEiPr
I really enjoyed this style of writing. It respectfully expressed why he
disagreed with someone else, but acknowledged that both he and said person
achieved their results and had tests which covered that fact. Nothing
inflammatory or overly partisan, thanks :)

------
millstone
Regarding the Haskell code, what about infinite lists?

1. Why didn't QuickCheck test the infinite case?

2. How do we write tests that fail (instead of hang) if an infinite list is
mishandled?

It seems like infinite lists act somewhat analogously to nulls: rndSelect
cannot handle them, and that fact is not obvious in its type.

~~~
uryga
> _How do we write tests that fail (instead of hang) if an infinite list is
> mishandled?_

easy, just solve the halting problem!

~~~
millstone
Deciding type-safety of code also requires solving the halting problem, but
Haskell still tries!

I guess the question could be practical. Some functions are supposed to
handle infinite lists: foldr, map, etc. How are these functions (or other,
nontrivial ones) tested in practice, to ensure that they do not become
accidentally strict?
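One practical trick, sketched here with JS generators standing in for Haskell's lazy lists (the helper names are my own): feed the function an input that throws if too many elements are ever forced, which turns "accidentally strict" into an immediate test failure rather than a hang.

```javascript
// A generator that yields 0, 1, 2, ... but throws if more than `limit`
// elements are ever pulled, so strictness bugs fail loudly.
function* bounded(limit) {
  for (let i = 0; ; i++) {
    if (i >= limit) {
      throw new Error("forced more than " + limit + " elements");
    }
    yield i;
  }
}

// A lazy map over any iterable.
function* lazyMap(f, xs) {
  for (const x of xs) {
    yield f(x);
  }
}

// Pull exactly n elements from an iterable.
function take(n, xs) {
  const out = [];
  const it = xs[Symbol.iterator]();
  for (let i = 0; i < n; i++) {
    const { value, done } = it.next();
    if (done) break;
    out.push(value);
  }
  return out;
}

// Passes: lazyMap forces only the 3 elements that take() demands.
console.log(take(3, lazyMap(x => x * 2, bounded(1000)))); // [0, 2, 4]
// A strict version, e.g. (f, xs) => [...xs].map(f), would throw immediately.
```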

~~~
uryga
> _Deciding type-safety of code also requires solving the halting problem, but
> haskell still tries!_

does it? afaik typechecking is decidable for most type systems, and only
becomes undecidable with something like dependent types (or c++ templates) but
i could be wrong

------
kybernetikos
I agree that you don't generally /need/ to test preconditions of internal code
(although you should document them). However, if you're writing a library with
an API that will be called by others, you should make sure your code fails
fast, and you should assert that behavior in your tests.

Adding extra code to ensure you fail fast is often well worth it in future
debugging time, even for internal code.

------
adamnemecek
This discussion is so played out.

[https://ro-che.info/ccc/17](https://ro-che.info/ccc/17)

~~~
pcl
I want to know what the people who are familiar with type theory but are not
proponents of either static or dynamic type systems are up to. Did they move
on to management? Switch to physics or painting? Retire?

~~~
tel
Serious answer: type theory as a subject almost never discusses any idea like
dynamic types (they wouldn't even be considered "types"). Presumably, if you
only ever study it academically then you might actually have never even heard
of dynamic types.

~~~
neel_k
The founding act of denotational semantics was Scott's model of the untyped
lambda calculus, which is surely the paradigmatic dynamically typed language.
Moreover, its creation led to an immense body of work on type theory, because
there are two distinct, but related, intuitions underpinning what a type is.

On the one hand, you can think of starting with some given notion of
computation (the lambda-calculus, say, or x86 machine code), and then you can
think of types as picking out particular subsets of the set of valid programs.
(In jargon, this is "types as retracts".)

In this view, a type is a property of a program. On the other hand, you can
say that only things which pass type-checking are programs, and things which
don't type-check are simply not well-formed programs. In this view, a type is
a kind of grammaticality constraint.

If you don't have a model of the untyped lambda calculus, you can't speak
formally about the first view, which is very limiting. This is because being
able to shift between the views is important for thinking flexibly about
different kinds of problems.

Of course, you can also try to combine these two views. A few years back, Noam
Zeilberger and Paul-Andre Mellies wrote a gorgeous paper, "Functors are Type
Refinement Systems", which lays out the principles of how these two views
interact.

[http://noamz.org/papers/funts.pdf](http://noamz.org/papers/funts.pdf)

~~~
tel
I hear what you're saying (and appreciate the references), but I guess what I
intended to discuss above was the "practice of dynamic types/tags" as it is
carried out in software development.

LC is clearly "untyped" in that it doesn't carry any type system, but that's a
distinct idea from "dynamically typed introspection". That's an engineering
system.

So maybe my error is to read into "dynamic types" as something more than the
most utilitarian reading, but I also think that's usually what the "debate"
ends up being about.

------
skybrian
If you're writing a standard library (or similar), then you do want tests to
check that runtime errors are reported correctly with a nice error message,
because that is what the customers of your library see.

Within a single app of reasonable size, this isn't so important.

------
truth_seeker
> _Static typing is an attempt to make software more mathematical. Type
> correctness is deductive and provable. However, type correctness does not
> imply behavioral correctness. Even when fully type correct, the behavior
> must be demonstrated empirically._

I think this is the crux of the problem, and my experience agrees with this
statement.

At the end of the day, it's behaviour that matters most.

------
lgas
_What about invalid arguments? What if someone calls: (random-elements -23
nil)? Should I write tests for those cases?_

 _The function already handles negatives by returning an empty list for any
count less than one. This isn’t tested; but the code is pretty clear. In the
case of the nil, an exception will be thrown. That’s OK with me. This is a
dynamically typed language. You get exceptions when you mess up the types._

 _Is that a risk? Sure, but only if some other part of the system was written
without tests. If you call this function from a module that you wrote with
tests, using something as effective as the TDD discipline, you won’t be
passing nil or negative numbers, or any other form of invalid arguments. So
I’m not going to worry about that. It’s on you._

This attitude coming from an experienced software developer is hard to
fathom. Sure, if you're an experienced and meticulous developer who fully
embraces TDD, and you only work with other experienced and meticulous
developers who fully embrace TDD, on the same type of software handling the
same types of problems over and over, then sure, you might be able to dodge a
pretty high percentage of bugs a pretty high percentage of the time.

But most teams in the real world have a mix of inexperienced and experienced
developers, meticulous and careless developers, developers that embrace TDD to
varying degrees, and work on a variety of problems in a mix of domains with
which the team members have varying degrees of experience.

So now your choices are:

1) code each and every function in a way that gives you high confidence that
it works for the happy path and rely on the discipline of your team to ensure
that no unhappy paths are hit

or, for approximately the same level of effort:

2) code each and every function in a way that gives you high confidence that
it works for the happy path and which you have an extremely high degree of
confidence that no unhappy paths even exist at all

It's just not a tough decision for me.

~~~
mikeash
What stands out for me is the casual acceptance of this behavior for negative
parameters. It returns an empty list so it’s all good? No way! Asking for a
list of negative length is a conceptual error. This should at least be an
assert. That ensures nobody accidentally starts relying on bad behavior by
prohibiting it. Of course, if you can use the type system to do this by using
an unsigned type (and one without implicit non-failing conversions from signed
types such as C has) then so much the better.

I think the author has the foundation of a good idea: don’t write tests for
things that aren’t supposed to happen. (This is not the same as writing tests
for error handling: errors are supposed to happen!) But he builds the wrong
thing on top. You don’t just ignore things that aren’t supposed to happen and
hope that they don’t. You _make sure_ they don’t happen, either by asserting
at runtime or making them logically impossible at compile time.
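As a sketch of the "at least an assert" option, in JS and with names that only loosely mirror the article's Clojure (random-elements): the guards below are my own choice of contract, not the article's code.

```javascript
// Fail-fast version of a random-selection helper: asking for a
// negative count is a conceptual error, so assert instead of quietly
// returning an empty list.
function randomElements(count, items) {
  if (!Number.isInteger(count) || count < 0) {
    throw new RangeError("randomElements: count must be a non-negative integer");
  }
  if (!Array.isArray(items)) {
    throw new TypeError("randomElements: items must be an array");
  }
  const result = [];
  for (let i = 0; i < count; i++) {
    result.push(items[Math.floor(Math.random() * items.length)]);
  }
  return result;
}

console.log(randomElements(2, ["a", "b", "c"]).length); // 2
// randomElements(-23, null) now fails loudly at the call site, so
// nobody accidentally starts relying on the bad behavior.
```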

Once you reach that point, then you see that a type is just an assert that’s
nicer to write, checked earlier, and is more comprehensive. The tradeoff is
that they are less expressive, so you still need traditional asserts
sometimes.

~~~
jghn
> You make sure they don’t happen, either by asserting at runtime or making
> them logically impossible at compile time.

One thing I find common in arguments from people like the author is that they
don't consider that types can be more powerful than the ones they're used to
dealing with in the languages they use. It seems impossible to them that one
could simply encode the parameter as a type which _does not allow_ for
negative values or `nil`. As you say, that asserts at compile time that the
program is correct.

~~~
mikeash
I guess this is a specific example of the Blub Paradox.

I saw this a lot when Swift came out. Suddenly a lot of Objective-C
programmers were learning a new language with a fairly different type system,
and it was tough. The Optional type was an especially strong pain point.

What I found particularly interesting is that C (and Objective-C) _has_
optional types. All pointer types are optionals. What it lacks is optional
non-pointer types, and _non-optional_ pointer types. Because optionality is an
inherent part of pointer types in the language, you never learn to think about
optionality separately. It's like a fish in water: it's so common you don't
even see it.

I suppose it would show up even more strongly with someone used to a language
like Ruby. If C programmers have a hard time really grasping the idea of
optionality because the language doesn't make it explicit, then a Ruby
programmer might have trouble with static types at all. (And, to be fair, the
same no doubt applies in the other direction: someone coming from a strong
statically typed language probably has some hurdles to understanding dynamic
type systems.)

------
tpush
"My answer to that is that even using a dynamically typed language I don’t
have to write the test for nil because I know that nil will never be passed. I
don’t need to make states unrepresentable, if I know those states will never
be represented."

You _don't_ know that nil or whatever "won't ever be passed" in a dynamically
typed language!

The fact that Bob Martin, a supposedly experienced software developer,
genuinely, sincerely argues a variation of "I don't need to handle this case,
since I _just know_ that this can't happen because... Obviously no one,
certainly not me would do that" is inconceivable.

Does he not even consider what happens when someone else, i.e. _not him_,
passes "unrepresented" values to his function? WTH?

~~~
chickenfries
How a consultant could ever have this view baffles me.

------
crimsonalucard
There is a misguided notion about testing. Testing cannot establish the
correctness of a program unless you test every possible input and output. A
test only proves that a specific test case is correct; that's it. This is not
actual program correctness.

Type checking is an actual proof, light years ahead of a test, in that you
need zero test cases to prove your program is correct in terms of types.

To achieve the same effect with testing, you must test your function with
every possible input of every possible type. An untyped addition function
must test what happens when you give it a pointer to every possible
permutation of a nested hashmap of strings and arrays; otherwise you are not
establishing correctness. What you are actually doing is taking a statistical
sample of random inputs and guessing that the function is correct based on
that sample.

For greater predictability, the test samples should be randomized following
statistical procedures, but this is rarely done, as most testers don't
realize that they are taking a small statistical sample of a domain that
covers an almost unlimited number of possible inputs.

Tests are essentially selected with high programmer bias so a lot of bugs make
it through.

