
Ultratestable Coding Style - luu
http://blog.jessitron.com/2015/06/ultratestable-coding-style.html
======
kojoru
This is very similar to Gary Bernhardt's awesome Boundaries talk:
[https://www.destroyallsoftware.com/talks/boundaries](https://www.destroyallsoftware.com/talks/boundaries)

Watch it if you still haven't

~~~
eterm
Thanks for this link, that's a great talk, in fact it deserves its own
submission as I rate it better than this article.

~~~
BerislavLopac
It's here, but rather unappreciated:
[https://news.ycombinator.com/item?id=8093389](https://news.ycombinator.com/item?id=8093389)

------
tmoertel
Take this idea to the limit and you end up with free monads:

[http://www.haskellforall.com/2012/07/purify-code-using-free-monads.html](http://www.haskellforall.com/2012/07/purify-code-using-free-monads.html)

~~~
pekk
Even if that takes the idea to the limit, does that make the code any easier
to understand?

~~~
chowells
No, it makes it more flexible (in ways including testability) without adding
any cost in understandability.

~~~
zak_mc_kracken
Adding the Free monad to this approach doesn't add any cost in
understandability?

~~~
chowells
Not in a language with decent abstraction capabilities. You end up changing
code from looking like:

    
    
        myThing :: Foo -> ProblemDomain Bar
        myThing f = do
            x <- subthing1 f
            y <- subthing2 x 23
            return $ g y
    

to looking like

    
    
        myThing :: Foo -> ProblemDomain Bar
        myThing f = do
            x <- subthing1 f
            y <- subthing2 x 23
            return $ g y
    

There's no difference, don't spend too much time looking for one.

The underlying representations of the type change. How you interact with the
type does not. There's no mental overhead. There _is_ a small bit of added run
time cost. You pay somewhere for adding the ability to interact with the type
in additional ways, but not by breaking any existing code.
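
To make the comparison concrete, here's a minimal sketch of one way ProblemDomain could be defined as a free monad (the `Op` functor, the `readLine'`/`writeLine'` helpers, `greet`, and `runPure` are all hypothetical names, not from this thread): the client do-block stays exactly as written above, while a pure interpreter turns the program into a plain value you can test.

```haskell
{-# LANGUAGE DeriveFunctor #-}

-- The free monad over a functor f: either a result, or one layer of f
-- wrapping the rest of the program.
data Free f a = Pure a | Free (f (Free f a))

instance Functor f => Functor (Free f) where
  fmap g (Pure a)  = Pure (g a)
  fmap g (Free fa) = Free (fmap (fmap g) fa)

instance Functor f => Applicative (Free f) where
  pure = Pure
  Pure g  <*> x = fmap g x
  Free fg <*> x = Free (fmap (<*> x) fg)

instance Functor f => Monad (Free f) where
  return = pure
  Pure a  >>= k = k a
  Free fa >>= k = Free (fmap (>>= k) fa)

-- Primitive operations of a hypothetical problem domain:
data Op next = ReadLine (String -> next) | WriteLine String next
  deriving Functor

type ProblemDomain = Free Op

readLine' :: ProblemDomain String
readLine' = Free (ReadLine Pure)

writeLine' :: String -> ProblemDomain ()
writeLine' s = Free (WriteLine s (Pure ()))

-- Client code is ordinary do-notation, unchanged by the representation:
greet :: ProblemDomain ()
greet = do
  name <- readLine'
  writeLine' ("hello " ++ name)

-- A pure interpreter: feed in canned input, collect output as a value.
runPure :: [String] -> ProblemDomain a -> (a, [String])
runPure _   (Pure a) = (a, [])
runPure ins (Free (WriteLine s next)) =
  let (a, outs) = runPure ins next in (a, s : outs)
runPure (i:ins) (Free (ReadLine k)) = runPure ins (k i)
runPure []      (Free (ReadLine k)) = runPure [] (k "")
```

Swapping `runPure` for an IO interpreter changes nothing in `greet` itself.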

~~~
zak_mc_kracken
> Not in a language with decent abstraction capabilities.

My question has nothing to do with the language. You are adding a relatively
advanced construct which requires a lot of prior knowledge to understand: Free
monad, monad, applicative functor, functor, monoid. Then you need to
understand the laws that monads must obey, which are not enforced by the type
system. Before you get there, a passing grasp of Haskell and higher-kinded
types will help, and some basics of category theory too.

Saying that adding the free monad doesn't have any cost in understanding is
very naïve. You forgot how long it took you to reach that understanding.

------
erikb
This agrees with the general conclusion that you can't just write tests for
code, you must also write testable code.

Why it's _ultra_ testable I didn't get, though. To me it looks like what you
should do in normal mode, not ultra mode.

~~~
Ensorceled
Considering Boris Beizer was giving seminars on designing and coding for
testability in the 80's ... yeah, not ultramode.

------
couchand
The diagrams are very helpful, but they fall victim to an unfortunately common
mistake: labeling the boxes but not the arrows. We know from the text that not
every arrow is the same here. Clarify that in the diagram. Label your arrows!

------
calpaterson
For a filesystem integration, perhaps you can generally rely on the
documentation being correct. There are lots of integrations (main cause of
side effects) where this is not the case, for example if you are integrating
with APIs or with a SQL database. For these cases I would want to have more
than an umbrella test to check that my assumptions about the integration are
correct.

~~~
kabdib
Race conditions in file systems are fun. So are edge conditions around
partially written data (for instance, Windows NT will transactionally
guarantee a file's existence, but not its content).

------
pron
Only, what the article describes is _exactly_ -- down to a T -- what mocks
are (in their more narrow definition, i.e. as opposed to fakes/stubs). They
capture your calls to actual real-world interaction and turn them into a list
of operations that can later be queried and verified, and they do all that
without requiring you to create a new DSL for capturing and describing
effects.

The article starts with discounting mocks, and later describes how you should
hand-code them each time.

~~~
chriswarbo
> They capture your calls to actual real-world interaction and turn them into
> a list of operations that can later be queried and verified

The author's point is to avoid interleaving logic and side-effects (as much as
possible) in the first place. Yes, we can use dependency injection to send
mocks deep into the bowels of our application where they get called.
Alternatively, we can have the bowels of our application return values,
describing what to call, back up to the top of our application. These are
exactly dual.

The nice thing about the latter is that every action gets reified into a
concrete value, on which we can perform any kind of computation we like.
Dependency injection can't do that, since a method call is not a value (it can
_return_ a value, but it isn't _itself_ a first-class value).

Mocks allow us to inspect which calls have been made, etc. but mocks are for
testing. We can't use those interfaces in our real code. If we use a list-of-
actions we can, for example, run optimisation passes before execution.
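
A minimal sketch of this list-of-actions approach, assuming a hypothetical `FsAction` algebra (`planBackup`, `lastWrites`, and `runFs` are illustrative names, not from the article): the pure core returns a plan, an ordinary function optimises it, and only a thin outer layer performs IO.

```haskell
-- Effects reified as plain, comparable values:
data FsAction = WriteFile FilePath String | DeleteFile FilePath
  deriving (Eq, Show)

-- Pure core: decides what should happen, performs nothing.
planBackup :: [(FilePath, String)] -> [FsAction]
planBackup files = [WriteFile (p ++ ".bak") c | (p, c) <- files]

-- An optimisation pass over plain values, run before execution:
-- keep only the last write (or delete) for each path.
lastWrites :: [FsAction] -> [FsAction]
lastWrites = foldr keep []
  where
    keep a as | any (samePath a) as = as
              | otherwise           = a : as
    samePath (WriteFile p _) (WriteFile q _) = p == q
    samePath (DeleteFile p)  (DeleteFile q)  = p == q
    samePath _               _               = False

-- Only this thin outer layer touches the real world:
runFs :: [FsAction] -> IO ()
runFs = mapM_ exec
  where
    exec (WriteFile p c) = writeFile p c
    exec (DeleteFile _)  = return ()  -- elided for the sketch
```

Tests exercise `planBackup` and `lastWrites` directly, with no filesystem and no mock.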

> they do all that without requiring you to create a new DSL for capturing and
> describing effects.

Mocks _do_ require a new DSL for capturing and describing effects; it's
normally called a "public interface", but it _is_ a DSL (an algebra) for
describing effects.

~~~
pron
> The author's point is to avoid interleaving logic and side-effects

But with mocks you're not really interleaving side effects. They are the same
as generating "effect values", only they use the same API as the actual
effect.

> The nice thing about the latter is that every action gets reified into a
> concrete value, on which we can perform any kind of computation we like.

That's what mocks do. They capture calls and turn them into reified values
that can then be manipulated and queried.

> Mocks do require a new DSL for capturing and describing effects; it's
> normally called a "public interface", but it is a DSL (an algebra) for
> describing effects.

Except that API is identical to that of whatever effect API we're using.

A mock _is_ a mechanism for turning effects into values. It just works
automatically by means of interfaces (in the language sense) and reflection.
It is an imperative construct that achieves -- without extra work -- the very
same functional effect the author is striving for.

~~~
chriswarbo
> with mocks you're not really interleaving side effects.

Yes, you are. Side effects aren't limited to filesystems, databases, etc.
Here's the very first sentence of
[http://en.wikipedia.org/wiki/Side_effect_(computer_science)](http://en.wikipedia.org/wiki/Side_effect_\(computer_science\))

> In computer science, a function or expression is said to have a side effect
> if, in addition to returning a value, it also modifies some state or has an
> observable interaction with calling functions or the outside world.

Every time we call a mock, we cause observable changes (ie. we alter the
results of any queries made on the mock). Therefore, calling a mock _is_ a
side-effect.

The point of mocking is to replace unwanted side-effects with harmless side-
effects. The side-effects are still there, mixed in with the logic.

> That's what mocks do. They capture calls and turn them into reified values
> that can then be manipulated and queried.

But mocks are only used in tests! My application can't ask, say, a FileSystem
object for a list of calls which have been made. Only my _tests_ can do that,
because they're using a FileSystemMock object instead. My application can't
use a FileSystemMock since, by design, it can't do the filesystem manipulation
that I need.

Even if I add call-logging to the API of my FileSystem object, it doesn't give
me any control, since they've already happened by the time they appear in my
queries.

For example, let's say my application ends up with a list of effect values
[e1, e2, e3, etc.] and, after performing some logic, it chooses to only return
the first one: [e1].

How would mocks help me do this? Firstly, since this is application code
rather than test code, I can't use mocks at all. Secondly, even if I _could_
perform a query to get [call1, call2, call3, etc.], there's no way to turn
that into [call1]; I can remove the _logs_ of call2, call3, etc. but I can't
undo any side-effects they've caused.

To achieve the same outcome as the effect value example, I must alter the
control flow of the program such that call2, call3, etc. are never made. That
has nothing to do with mocking.
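
That worked example fits in a few lines (the `Effect` type here is hypothetical): trimming the plan is plain list manipulation, and the dropped effects were never performed, so there is nothing to undo.

```haskell
data Effect = SendEmail String | ChargeCard Int | LogEvent String
  deriving (Eq, Show)

-- Pure logic trims [e1, e2, e3] down to [e1] before anything runs;
-- e2 and e3 are simply never executed, not "rolled back".
firstOnly :: [Effect] -> [Effect]
firstOnly = take 1
```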

> Except that API is identical to that of whatever effect API we're using.

Exactly! How is an "effect API" _not_ a DSL for describing effects? They're
the same thing. Mocks don't

~~~
chriswarbo
Oops, cut off my last sentence there and didn't spot it in time to edit.

I meant to say, mocks (+ dependency injection) don't require more or less work
than using "effect values"; certainly for applications written with that
approach in mind.

Yes, mocks use reflection to do meta-programming. "Effect values" also allow
meta-programming; but we don't need reflection since our "programs" (eg. lists
of side-effecting actions) are already first-class citizens. We can use any
built-in or off-the-shelf list library to manipulate such "effect values";
which I would consider just as "automatic" as using a built-in or off-the-
shelf reflection library.

Taking an _existing_ application and refactoring it to use one of these
approaches is a different matter; which one takes more effort would vary from
project to project.

I think many people dismiss functional approaches like these "effect values"
by thinking of them at too low a level, and imagining that they must involve
an awful lot of work to use. That would be like dismissing linked lists as
involving too much work, since we have to keep pushing and popping values in
and out of the lists; so why not just use individual variables?

Of course, in reality the whole point of lists is that we can ignore their
contents and manipulate them as whole entities; mapping, slicing, merging,
sorting, etc. The same applies to "effect values"; we don't just declare an
enum and throw a load of switch statements everywhere. We can ignore the
particular effects in a value and instead manipulate them as whole entities;
mapping, slicing, merging, sorting, etc.
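
As a sketch of that point (the `Effect` type and helper names are made up for illustration), plain `Data.List` functions already give us merging and reordering of effect values, with no reflection involved:

```haskell
import Data.List (nub, sortOn)

data Effect = Fetch String | Store String Int
  deriving (Eq, Ord, Show)

-- Merge two plans with an off-the-shelf function, dropping duplicates:
merge :: [Effect] -> [Effect] -> [Effect]
merge a b = nub (a ++ b)

-- Reorder a plan so all fetches run before any stores:
fetchesFirst :: [Effect] -> [Effect]
fetchesFirst = sortOn isStore
  where isStore (Store _ _) = True
        isStore _           = False
```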

~~~
pron
> Side effects aren't limited to filesystems, databases, etc.

As the author programs in Scala and Clojure, two impure languages that make
liberal use of memory side effects, I think she was concerned mostly about IO
(she even said so).

> The side-effects are still there, mixed in with the logic.

Imperative languages don't differentiate between memory side effects and
logic. That is either an advantage or a disadvantage -- mostly depending on
your personal preference. The strength of the imperative approach is that when
computation is a combination of logic and memory effects, it yields a more
powerful computational model than pure functions alone (that's why Turing
machines can describe computations that lambda calculus cannot -- at least not
directly; their equivalence mostly rests on LC's ability to be used as a meta-
language that can simulate a Turing-machine language). It allows representing
many algorithms that PFP simply cannot (in Haskell, many algorithms are
implemented with unsafe). Finally, it yields better performance (computational
complexity is a very vague notion in lambda calculus).

> But mocks are only used in tests!

I was referring to a mock as a concept, not a particular implementation. If
you can't find a mocking library that replays operations, write one that does.
It would fit much better and be much more idiomatic. See below for an
explanation.

> I think many people dismiss functional approaches like these "effect values"
> by thinking of them at too low a level, and imagining that they must involve
> an awful lot of work to use.

I don't dismiss pure functional approaches at all if you happen to be using a
pure functional language like Haskell. Imperative languages, however, already
have similar mechanisms in place, and eschewing those in favor of PFP
approaches _is_ more work, and tends to lose the benefit the imperative
approaches have (as those are more in line with the language), for example,
the stack context.

I will be giving a talk about this at the upcoming Curry On/ECOOP conference,
where I'll show that virtually all monads already have imperative counterparts
that are preferable (and more powerful) when writing in an imperative
language. The recent trend of PFP constructs being adopted by impure languages
is due to two things: 1. the false notion that the PFP approach is more
"mathematical" or more verifiable, and 2. monadic composition is better than
callbacks, and thus useful in avoiding blocking OS threads, which is expensive
(although simply fixing threads -- as Erlang and Go have done -- is so much
easier).

------
crdoconnor
>Writing code to verify code is so much harder than just writing the code

It's not supposed to be easy. Doing it _well_ is _really_ hard and way
underappreciated.

>Testing side-effecting code is hard. This is well established. It's also
convoluted, complex, generally brittle.

It's not necessarily brittle when done right but it's hard, it's convoluted
and it's complex.

>Before the test, create the input AND go to the filesystem, prepare the input
and the spot where output is expected. After the test, check the output AND go
to the filesystem, read the files from there and check their contents.
Everything is intertwined: the prep, the implementation of the code under
test, and the checks at the end. It's specific to my filesystem. And it's
slow. No way can I run more than a few of these each build.

This doesn't sound like it would take more than a few hundred milliseconds.
That's not slow.

Furthermore, you could definitely run it on multiple filesystems by firing up
a virtual machine for each filesystem and running the test. That _would_ be
slow, but you don't need to do it often.

~~~
mikeash
I generally want all tests in total to complete in a few seconds at most.
Longer than that and they become a serious impediment to the edit-build-test
cycle, and you either start screwing around with only running some tests
routinely, or only run the tests occasionally, neither of which is good.

If a single test takes a few hundred milliseconds then that means you can only
run a dozen or so such tests before you hit that problem, which isn't very
many. I'd say that a few hundred milliseconds is _quite_ slow in this context.

~~~
Alupis
There's nothing wrong with having a full test suite that runs as part of the
nightly build.

Sometimes it's just not practical to impose a hard limit of "a few seconds at
most" on your test suite, especially for larger projects. (some projects have
test suites that take hours to run, and simply cannot be any quicker unless
they were less thorough).

~~~
mikeash
I agree, but you want to avoid that when possible, and being able to put
tests into the fast set you run all the time is valuable. So I'd say that a
test which takes a few hundred milliseconds easily qualifies as "slow" and is
something to avoid where you can, even if it's not the end of the world when
you can't.

------
sqeaky
I read the title as "Untestable Coding Style" and I came to see some
bewildering and obfuscated code.

But this is good too.

------
gcb0
> mock the filesystem

it's beautiful in theory. but reality goes like this: "i only have this little
method. writing mocks will take 5x more time than the code. i will just test
the side effects on the functional tests"

then, a year later, the team added a dozen methods that would benefit from the
mock, but since you did the first one in functional only, the team just added
the new ones there as well.

the end. (because nobody refactors projects for tests)

:(

------
ignorabilis
Mocks, stubs, shims, etc. are usually just hiding the fact that the code to be
tested is terrible, even in OOP languages.

Furthermore side effects have nothing to do with testing one's code. Why?
Because you have to test your own logic, not I/O or something else. I/O is
already tested by someone else. So you actually don't care where the data
comes from - the filesystem or a hardcoded string - you care if the output is
correct after the respective transformations are applied to the input.
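
A tiny sketch of this separation, with hypothetical names: the logic under test is a pure function that neither knows nor cares whether its input came from a file or a literal.

```haskell
-- The transformation under test is pure:
countWords :: String -> Int
countWords = length . words

-- Production wiring (hypothetical file name) just feeds it real data:
-- main = readFile "input.txt" >>= print . countWords
```

Tests call `countWords` on a hardcoded string; the filesystem never enters into it.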

And when you think about it, testing shouldn't be needed that much at all. In
the OOP world, where state and identity are hopelessly tied together and we are
not working with values, but with references to values, tests might be a
necessary evil.

In the land of Functional, however, writing tests shouldn't be needed, at least
in theory. Why? Because if you write small composable functions you should be
able to test them right in the REPL. Corner cases? You have to think about
them right from the start. TDD forces you to do so - then why shouldn't you
when you have the immense power of a REPL? Once a function is ready you are
not supposed to change it much. If you do, you should change the inner
workings and not the input/output. If you need to change the i/o, most
probably you need a different function. More importantly, if you needed to
change the i/o and had tests for this function, you would need to change the
tests as well, which is more work (double? triple?) with no added value.

~~~
awinder
Honest question: how do you validate business rules are adhered to without
defect in FRP and how does FRP differ from OOP in that regard?

~~~
15155
Stolen from Wikipedia: "FRP is a paradigm using the building blocks of
functional programming." I doubt OP was referring to FRP.

In any case: in a functional programming language, you'd just unit test the
building blocks (functions). Assuming every function is pure and total, these
unit tests should be succinct and mirror your business logic quite closely.

~~~
ignorabilis
Actually the OP uses Clojure in her project.

------
michaelfeathers
I like the IO monad in Haskell because it places a tax on mingled code. It's
easier to keep IO at the top level and just call into pure code.
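
A minimal sketch of that shape (names hypothetical): all IO stays at the top level, which only shuttles data in and out of a pure, trivially testable function.

```haskell
-- Pure logic, testable in isolation:
summarise :: String -> String
summarise s = show (length (lines s)) ++ " lines"

-- The IO monad is confined to the boundary; main just calls into pure code.
main :: IO ()
main = interact summarise
```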

------
jheriko
this just falls out of writing code cleanly and properly. if you are doing
something so complicated with inputs and outputs that it becomes unclear what
is happening and where things need to go in and go out then you just haven't
used enough functions and classes to split up the responsibilities.

if you write clean, data driven code i can't imagine this ever being a problem
you need to solve.

~~~
jerf
"If you use the solution given in the article, I can't imagine why you'd ever
need the solution given in the article"?

~~~
Dewie3
There is always that one guy who gets offended over how obvious something is
to him.

~~~
AnimalMuppet
Given the number of comments that I've seen from jerf, and the value of what
he usually has to say, I don't think he's being "that one guy". I think he's
commenting on what he perceives to be a genuine inconsistency in the article.

~~~
Dewie3
My comment was actually referring to jheriko. :-]

~~~
AnimalMuppet
Ah, I see. My error.

