
The Dark Path - runesoerensen
http://blog.cleancoder.com/uncle-bob/2017/01/11/TheDarkPath.html
======
DannyB2
The article asserts that authors are not testing their code. The problem is
that you can't ensure that every possible bug is exercised in tests. Think of
the static typing of the compiler as the very first level of testing. If you
can have an NPE at this line of code, then it won't compile. The programmer is
forced to deal with it. It has failed the first round of 'testing' done by the
compiler.

Whose job is it: the programmer's or the language's?

In this case, it is the programmer, who is being forced to fix it by the
language. We don't want code with bugs. Don't 'fix' it in testing when it is
way cheaper to put that null test into the code when it is written. The
compiler can absolutely ensure that no NPEs can be generated. The testing
absolutely CAN NOT make this guarantee.
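To make the "first round of testing" concrete, here is a minimal TypeScript sketch (assuming `strictNullChecks` is enabled; `findUser` is a hypothetical function, not from the article): a value typed `string | null` cannot be dereferenced until the null case is handled, so the would-be NPE never compiles.

```typescript
// Sketch: the compiler as the first level of testing.
// Assumes "strictNullChecks": true in tsconfig; findUser is hypothetical.

function findUser(id: number): string | null {
  // Hypothetical lookup; returns null when the user is absent.
  return id === 1 ? "alice" : null;
}

// const n = findUser(2).length;  // rejected: "Object is possibly 'null'"

function userNameLength(id: number): number {
  const name = findUser(id);
  if (name === null) return 0; // the check the compiler forces on us
  return name.length;          // name is narrowed to string here
}
```

The commented-out line is the "failed first round of testing": it never reaches a test suite because it never compiles.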

There are other languages that allow sloppy programming. Use them if you like.
Encourage their use if you like. If there were one perfect programming
language then everyone would be using it already.

------
AnimalMuppet
Every check that you can add to a language means ways that the programmer
cannot mess up. That's a gain. It also means things that programmers cannot
do. That's a loss. For some restrictions, the gains outweigh the losses. But
that won't be true for every restriction; kitchen knives are sharp for a
reason. Restricting everything so that nobody can do anything wrong is
unworkable; it is, if not _the_ dark path, at least _a_ dark path.

But so is relying solely on testing for safety. There's always a path through
the code that you didn't test. Even if your tests execute every path through
every function, there's always a combination of paths that wasn't, or an input
value that wasn't, or, or or. You _cannot_ achieve quality solely by tests.

You need a balance. Use language features that prevent errors - not that
prevent every error, but at least that prevent some. Use good design. Use
static code analysis. Use thorough testing. It all matters; no one thing will
save you.

~~~
kazinator
A static check adds a way in which the programmer can mess up.

A code path which fails a check was obviously not tested (at least, not since
the change was introduced which triggers the check).

The programmer can massage the code so the check goes away, yet leave the code
path still untested, due to the "safyness" feeling that the machine "got
his/her back", so all is well.

~~~
AnimalMuppet
> A static check adds a way in which the programmer can mess up.

True, though it is one that you can ignore. You can't ignore the compiler.

Reading through the output of a static checker is 95% "Stupid hyper-picky
static checker", and 5% "Oh #&%$@, we'd better fix _that_ right away". Ignore
the ones that should be ignored.

> A code path which fails a check was obviously not tested (at least, not
> since the change was introduced which triggers the check).

Or not tested with the right combination of inputs, or not tested looking for
the right thing...

> The programmer can massage the code so the check goes away, yet leave the
> code path still untested, due to the "safyness" feeling that the machine
> "got his/her back", so all is well.

Sure, they can, if they aren't trying to understand what the checker is
telling them, and instead just try to get it to shut up. And they can do the
same with a test.

~~~
kazinator
> Or not tested with the right combination of inputs, or not tested looking
> for the right thing...

That's the thing. Unless the code is exploiting dynamism, probing large input
spaces is not required to catch the same issues. Just enough variety to
achieve path coverage.

We don't need to probe the space of integers to catch that a string operation
is being applied to one; running that expression with any value will do.

I added a static check for uses of undefined functions and variables to a
dynamic language quite recently. It didn't find anything in any of my
application code. One instance was found in the language's standard library.
The function which had that error _was never tested_. Calling that function
with any argument at all triggered the dynamic check.

The static check only pointed out this fact: hey, you didn't test this! That's
a poor substitute for an actual path coverage tool. You can develop untested
code which passes various static checks; only a coverage tool will show you
the untested paths.

~~~
Jweb_Guru
A coverage tool will not come remotely close to showing you untested paths.
Array indexing, for example, is an implicit branch between every possible
element in the array, but most coverage tools I've seen count just evaluating
the indexing expression as covering the line. On the other hand, judicious use
of indexed data types can guarantee a data path for all indices into the
array. Simple sum types like Maybe and Either guarantee a data path for the
simpler use cases of null or errors.
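The "data path for all indices" idea can be sketched in TypeScript (assuming the real `noUncheckedIndexedAccess` compiler option): indexing then has type `T | undefined`, so the missing-element path must exist in the source before the code compiles.

```typescript
// Sketch: a guaranteed data path for an index miss.
// Assumes "noUncheckedIndexedAccess": true in tsconfig.

function increment(xs: number[], i: number): number {
  const x = xs[i];            // type: number | undefined
  // return x + 1;            // rejected: 'x' is possibly 'undefined'
  return x === undefined ? -1 : x + 1; // explicit path for a miss
}
```

A coverage tool would count `xs[i]` as covered after any single call; the type system instead forces the out-of-range case to be written out.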

Uses of undefined functions and variables are not what people are talking
about here (though those do happen in dynamic languages); we're talking about
stuff that can't be checked locally (like "is a null allowed to come into the
function at all, and if it does, are you covering it") and that is hard or
impossible to catch with coverage tools.

~~~
kazinator
If for some crazy reason we choose to regard an array access as if it were a
big switch or case statement with a branch for each index, we are wrong in
concluding that this creates a path coverage problem. These cases are machine-
generated and do not require separate coverage treatment. An analogy might be
found in, say, machine-unrolled loops. I don't care about coverage of multiple
copies of a loop body caused by a compiler optimization.

~~~
Jweb_Guru
How do you know that each array index will succeed, without a proof (or type)
to that effect? You don't. You're arguing from the technical definition of
path coverage (cases that are machine-generated "don't count"), but I'm
arguing from "actually covering the behavior of your program," which means
they absolutely do. You're also assuming that the behavior on failure will
always be a crash, which isn't even close to being the worst kind of failure
from an array index that returns something unexpected.

~~~
kazinator
I don't care about that within the scope of coverage testing. I am not
laboring under any misconception that path coverage is correctness proof. Path
coverage is not "coverage of the behavior of the program".

My thesis is that path coverage will catch a type error, like a floating-point
value being used as the array index, or a non-array being indexed. Programs
have to be contrived to harbor a type error in spite of complete path
coverage.

Complete path coverage means this: every node of the program's graph is
visited at least once at run-time. At that visit, type is determined. This is
similar to traversing the program graph statically and determining the type at
each node.

Regarding your last point, yes, I'm working under the assumption that we catch
errors rather than behave unpredictably.

~~~
Jweb_Guru
You're assuming we are only talking about very simple types, and simple
program semantics. In the vast majority of dynamic languages, variables are
unityped with several tags (string, number, object, etc.) so they can
trivially never experience a type error, with or without full path coverage
(with the possible exception of attempting to access undefined variables,
which is _still_ not considered a type error in all languages). However, full
path coverage does not guarantee that a value assigned a certain tag (which
often drives which interfaces are provided) will have that same tag in all
executions, and tags are often extremely imprecise descriptions of the value
(often they are only precise enough to identify data layout), which means that
many straightforward operations like property access and operator application
have to be turned into potentially-throwing expressions or perform automatic
casting to avoid causing a runtime error (both of which hinder many
optimizations and make it hard to reason about your program's behavior).

For instance, if you grab a value out of an array, assuming that the value you
grab is an object with a certain property, and then access that property,
requiring 100% path coverage still admits programs where the property will not
be found (or where the property will be found, but have a different type than
what is expected). Failing to account for the possibility that a tag is wrong
is exactly what sum types rule out; failing to account for the existence of a
property is what records (product types) rule out; the possibility of failing
to grab an element from an array in the first place is what indexed data types
(GADTs) rule out. None of these are type errors in most dynamic languages, and
hence none can be replaced by 100% path coverage, except under very limited
circumstances (which I can attest are not satisfied by real-world programs).

~~~
kazinator
> _so they can trivially never experience a type error_

(car a), a contains 42. Application bombs.

> _However, full path coverage does not guarantee that a value assigned a
> certain tag ... will have that tag in all executions._

Not having the same tag in all executions isn't necessarily an error, and the
guarantee is not required. It being vanishingly improbable that there is a
type error is good enough. A guarantee in this area has limited value; it
doesn't anywhere near prove correctness.

> _For instance, if you grab a value out of an array, assuming that the value
> you grab is an object with a certain property, and then access that
> property, requiring 100% path coverage still admits programs where the
> property will not be found (or where the property will be found, but have a
> different type than what is expected)._

I'm skeptical. An example program (Common Lisp, Ruby, Python, Scheme, ...)
would be useful: one that is fully exercised via external inputs such that no
errors occur, yet for which more inputs can be found that trigger a
slot-not-found type problem, or a wrongly typed value pulled from a slot. It's
not that I don't think such a program is possible, but that it has to be
contrived. I don't care about catching fraud, only mistakes.

~~~
Jweb_Guru
> (car a), a contains 42. Application bombs.

That's an example of having the wrong tag, not type.

> Not having the same tag in all executions isn't necessarily an error, and
> the guarantee is not required. It being vanishingly improbable that there
> is a type error is good enough. A guarantee in this area has limited value;
> it doesn't anywhere near prove correctness.

I absolutely agree that it isn't necessarily an error (though I do not agree
on "good enough"; see below), but I was responding to your own assertion that
the guarantee you got was just as good as what you got in a statically typed
system, which it's not. Simple statically typed systems verify that the tag is
the same by only allowing one tag, eliminating certain classes of error caused
by trying to access data as though it were a particular type when it's not (at
the cost of limiting the flexibility of your program); more complex ones allow
multiple tags if desired, but require you to check to make sure that the tag
is what you expect before going on to perform operations on the wrapped data.
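The "multiple tags, but check before use" idea maps onto a TypeScript discriminated union (the `Shape` type here is illustrative, not from the thread): the compiler refuses access to tag-specific data until the tag has been inspected.

```typescript
// Sketch of a checked multi-tag value via a discriminated union.

type Shape =
  | { kind: "circle"; radius: number }
  | { kind: "square"; side: number };

function area(s: Shape): number {
  // Accessing s.radius here would be rejected: the tag is unchecked.
  switch (s.kind) {
    case "circle": return Math.PI * s.radius * s.radius;
    case "square": return s.side * s.side;
  }
}
```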

> I'm skeptical. An example program (Common Lisp, Ruby, Python, Scheme, ...)
> would be useful, which is fully exercised, via external inputs, such that no
> errors occur, yet more inputs can be found that trigger a slot-not-found
> type problem, or a wrongly typed value pulled from a slot. It's not that I
> don't think the program is possible, but that it has to be contrived. I
> don't care about catching fraud, only mistakes.

It's really, really easy to do this. JSON parse a list of objects (your
external input) that are expected to have the same shape. Then walk through
the list, accessing a certain property (say, name). It is very easy to write a
test of this that achieves 100% path coverage (harder if you need to count
exercising every path of the JSON parser, but not material for the purposes of
my point). It is also very easy to construct an input for which this will
fail. It happens routinely when you store something in a JSON blob store or
accept input over an API endpoint (I've seen it numerous times), so your
skepticism that it can happen in real programs is unwarranted. If you want, I
can write out a program that does it, but the description I just gave should
suffice.
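The description above can be sketched in a few lines of TypeScript (the `name` property is the thread's example; the function name is illustrative). A test with uniformly shaped input exercises every line, yet a different, perfectly valid JSON input fails at runtime.

```typescript
// Sketch: full line coverage, runtime failure anyway.

function firstNameUpper(json: string): string {
  // The cast is an unchecked assumption about the input's shape.
  const items = JSON.parse(json) as { name: string }[];
  return items[0].name.toUpperCase();
}

firstNameUpper('[{"name":"ada"}]'); // "ADA" -- covers every line

// Valid JSON, wrong shape: `name` is undefined, so `.toUpperCase()`
// throws a TypeError at runtime despite the "complete" test above.
// firstNameUpper('[{"id":7}]');
```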

Note that this can happen even if you carefully check to make sure that all
input is of _valid_ shapes and explicitly put them in structures (let's say
you have three or four shapes that you consider valid in most lists). You can
get a similar failure in a language like Java if you try to, e.g., downcast to
a subtype of Object, while iterating through the list, without first checking
what type it is (something I've seen people do not infrequently, occasionally
with reassuring comments about how it can't fail; the comments are often
wrong).

Examples of array indexing failures with 100% path coverage are even easier
to produce, as should be evident from the numerous array-out-of-bounds errors
that exist in C; while they don't turn into buffer overflows in memory-safe
languages, they're still there.

Overall, I find your skepticism that bugs like this show up to be pretty
unwarranted, especially since I work in dynamically typed languages every day,
and my coworkers and I cause bugs like this in production pretty frequently,
including in repositories with 100% code coverage.

------
gumby
I like the sentiment but don't at all agree with the conclusion.

> Why are these languages adopting all these features? Because programmers are
> not testing their code.

I have a more uncharitable view. Most programmers don't really understand
their tools and don't really take a lot of time to think about what they are
doing (to be fair, management, especially in enterprise environments,
discourages programmers from thinking about what they are doing).

Languages like Swift and Go are designed to constrain the programmer to make
it harder for them to make mistakes. The strong type systems also make it
easier for programmers to provide information to the compilers to help them
generate better code.

Testing happens so late in the process I don't think it has anything to do
with it.

There are powerful grownup tools like Lisp and C++ which are paradigm-neutral,
but which also hand you enough rope to asphyxiate yourself before you can even
get as far as hanging yourself.

~~~
kazinator
> _Testing happens so late in the process I don't think it has anything to do
> with it._

Most testing happens as late as possible, in fact. Which is to say, never.

------
seertaak
I think this is an excellent article, and mostly agree with the author's somewhat
controversial thesis.

Broadly speaking, I have noticed that language features fall into two
categories. The first gives you some kind of new ability. For example, late
binding allows polymorphism. On the other hand, the second category entails
deliberate restrictions and hurdles imposed upon the programmer with the
putative goal of reducing errors. An example of this category is the checked
exception mechanism of e.g. Java.

For reasons I don't claim to fully understand, many programmers seem to be
rather attracted to the features from the second category, whereas I myself am
more attracted to features in the first category. Languages with a strong
emphasis on the second category include Haskell and Rust.

My criticism of those languages is that they seem to neglect that language
design almost always involves tradeoffs. This is because there exists a
complexity budget -- related to our ability to hold multiple pieces of
information in our very imperfect brains. During my forays into Haskell --
which I tested on simple scripts at the hedge fund where I was working -- I always
was struck by the beauty and power of the language. That said, it was hard not
to notice how concept-heavy even simple scripts were; scripts that would be
absolutely trivial in other languages now involved tortuous use of monads
within monads.

C++, the language I use for my current app
([https://zenaud.io](https://zenaud.io)), is a bit of a Jekyll and Hyde
language in this sense. In my opinion, it is telling that the category 1
feature that is templates was wildly successful. By contrast, the category 2
feature that is concepts appears to be struggling (although in fairness, the
jury is still out [and anyone who has read The Design and Evolution of C++
knows that it's a brave man who bets against the language-design savvy of
Bjarne Stroustrup]).

Last point: Rust seems to me a category 2 heavy language. For this reason, I'm
steering clear. What I really want is C with templates, destructors, late
binding (but with value types a la Sean Parent), and operators, and very
little of the rest of the crap you get in C++.

~~~
steveklabnik
Rust may or may not be for you, but we definitely understand there are
tradeoffs. I even wrote a blog post about it:
[http://words.steveklabnik.com/the-language-strangeness-budget](http://words.steveklabnik.com/the-language-strangeness-budget)

There's a language you might like, but I can't _quite_ find it right now, gah.
EDIT: Oh! Zig!
[https://github.com/andrewrk/zig](https://github.com/andrewrk/zig)

~~~
seertaak
Zig looks interesting -- thanks for the pointer!

------
qubyte
I like the emphasis this article places on testing. Languages which emphasise
testing or make testing easy are my favourites to code in. For example, JS
being dynamic and having mature testing and assertion libraries lowers the
barrier to getting stuff tested, Go has testing in its standard library, and
Rust has amazing support for tests (including testing examples in comments)
both in the language and from Cargo.

Perhaps it is more important to provide tooling and libraries for testing a
language than it is to provide new features in the language proper to avoid
foot-guns. It's much harder to back out of the latter.

------
spion
The main argument here apparently is "let's not add safety features because we
have a limited pool of features before the language becomes too complex". I
don't think that's a convincing argument. It's a question of which features you
value. For example, I'd totally give up `public`, `private` and `protected` to
be able to get rid of exception and `null` pointer problems forever.

> It is programmers who create defects – not languages.

In the case of nulls in typed languages, it's the language's fault. The
language is literally lying to you.

Let's say that Java is claiming, for example, that a method always returns a
`string`. What is a string? It's a value of a certain class, one that supports
certain methods that return other values. In structurally typed ("duck typed")
languages, the type can be defined by the methods you can call on the value
(the interface it supports).
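That structural notion can be shown in two lines of TypeScript (names here are illustrative): any value with the right members satisfies the interface, without ever naming it.

```typescript
// Sketch of structural ("duck") typing: the value never declares
// that it implements Quacker, yet it satisfies the interface.

interface Quacker { quack(): string }

function poke(d: Quacker): string { return d.quack(); }

const duck = { quack: () => "quack" }; // never mentions Quacker
```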

But this claim is just not true. The method may return null, and if you call
the string methods on that value, your code may die a violent death. It's clear
that the value `null` doesn't belong to the type `string`, since it doesn't
support the contract that a `string` provides. So why is it the programmer's
fault to expect that the type system doesn't lie to them? I think the
programmer has the right to expect an actual string - not a number, not a
`StringLike`, and definitely not null - none of which fully satisfy the
`string` contract.

Unfortunately, the language doesn't give you the ability to discern between a
string that cannot be null and a string-or-null (could be string, but also
could be null). Because of this, you need to track all the places that might
be null in your head to make sure you check for them, or look up the test code
to remind yourself all the time.

It's a similar problem with exceptions. The language is claiming that functions
that don't cause exceptions act the same way as those that do. But that's
painfully untrue, and you would find that out the moment you try to `fopen` a
file, read something from it, and then `fclose` it.

    
    
      f = fopen('file.txt')
      data = fread(f)
      fclose(f)
    
    

Oops, but the `fread` call may throw, and you now have a dangling file handle.
Good thing you knew that about `fread`! Now how about the other thousand
methods in your code? Do you know whether any of those throw?

What's that? You should be using `try-with-resources`, I hear? Okay, then what
do I do with this code:

    
    
      atomicIncrement(inFlightRequests)
      result = doRequest()
      atomicDecrement(inFlightRequests)
    

I should use try-finally. That's true, but what if doRequest() doesn't throw?
Then I don't need to. Why do those two look the same? Why is the language
lying that they are the same?

Now let's see what's available in Swift:

* `try!` means - I don't expect this piece of code to throw, even though it "might". If it does, crash horribly.

* `try` means - I know that this piece of code throws, but I can't handle that exception here, so propagate it.

* no keyword - This code should not throw. If it returns a Result that may be an error, I want to handle it right here.

* `try?` similar as above, but discards the specific error and returns an Optional.

That's it. You manage the risk; the language only makes that explicit. It also
makes it really clear what's going on during code review.

Here is how the problematic piece of code looks when `doRequest` might throw:

    
    
      atomicIncrement(inFlightRequests)
      result = try doRequest()
      atomicDecrement(inFlightRequests)
    
    

The possible bug is now very easy to spot. We don't need to remember whether
`doRequest` throws or to look that up. It's clear that a finally or defer block
is necessary here.
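A rough TypeScript rendering of that fix (the counter and `doRequest` are the thread's hypothetical names, modeled here as a plain number and a possibly-throwing function):

```typescript
// Sketch: the counter is restored on both the normal and the
// throwing path, so it can never leak.

let inFlightRequests = 0;

function doRequest(fail: boolean): string {
  if (fail) throw new Error("request failed");
  return "ok";
}

function guardedRequest(fail: boolean): string {
  inFlightRequests += 1;
  try {
    return doRequest(fail);
  } finally {
    inFlightRequests -= 1; // runs whether or not doRequest throws
  }
}
```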

What's even better is how the code looks when `doRequest` doesn't throw!

    
    
      atomicIncrement(inFlightRequests)
      result = doRequest()
      atomicDecrement(inFlightRequests)
    

Since we know the compiler would force a `try` keyword there if there were any
possibility of an exception, we know this code is correct just by glancing at
it. No need for `defer` or `finally`.

So are these two features not worth the cost? Maybe, if you are arguing that
we don't need types as a tool to prevent errors. In that case, it would indeed
be the programmer's fault for not writing a test for it.

But if you are going to have types, you might as well have types that tell the
truth. Otherwise they're useless.

Finally, IMO this is not a good reason to hate on types. Stop complaining that
the compiler won't accept your buggy code and take the time to make the
changes. I'm happy when the compiler points out all the places that will be
affected by the change - that's something that unit tests with mocks will
_never_ catch!

Imagine that a change in doRequest caused it to throw, where previously it did
not. The compiler will now tell you about ALL the places that may be affected.
What would happen if it didn't? The `inFlightRequests` counter would suddenly
start behaving strangely and increasing indefinitely. And what if new requests
stop being queued above a certain value? A bug that leads to denial of
service.

Is it worth the hassle of changing all affected code? I don't know, what do
you think?

There are much better complaints against type systems, like the fact that most
are not able to understand certain advanced styles of metaprogramming yet. Or
that many don't do control flow analysis, e.g. if I type

    
    
      if (x != null) { doCode(x); }
    

then `x` should have its nullability removed in the block

or

    
    
      if (x == null) { x = nonNullValue; }
    

then below this line, x should not be considered nullable. TypeScript for
example does both of the above.
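Both narrowings can be shown in a minimal runnable sketch (assuming `strictNullChecks`; function names are illustrative): inside the guard, and after the reassignment, `x` has type `string` and needs no further null check.

```typescript
// Sketch of TypeScript's control-flow narrowing of string | null.

function shout(x: string | null): string {
  if (x !== null) {
    return x.toUpperCase(); // x: string inside the guarded block
  }
  return "(none)";
}

function shoutOrDefault(x: string | null): string {
  if (x === null) {
    x = "default";
  }
  return x.toUpperCase();   // below the if, x: string
}
```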

