
Avoid Null Checks by Replacing Finders with Tellers - anon1385
http://michaelfeathers.typepad.com/michael_feathers_blog/2013/06/avoid-null-checks-by-replacing-finders-with-tellers.html
======
NateDad
This does not seem to be fixing null checks. It seems to be fixing lack of
proper error handling. The problem is not that you have to check for null, the
problem is that you don't know if null is a valid value, and even if you know
it's not, you don't know why the method failed. This is simply a lack of a
channel for returning errors.

There are a lot of ways to fix this problem... Many modern languages use
exceptions for this. Some languages use out parameters. Languages with
multiple return values often return a value and an error. They're all
essentially equivalent, and yes, they're all better than just returning null.

I don't really see a difference between:

    
    
      data_source.person(id) do |person|
          person.phone_number = phone_number
          data_source.update_person person
      end
    

and

    
    
      data_source.updatePhoneNumber(id, phone_number)
    

The latter one encapsulates the find, the error handling, and the setting.
Yes, you might have a null check in there. Or you might have a try/catch, or
you might be checking an out value or a multiply returned error value....
there's really no difference. You've changed the API to signify that the
person might not exist and this function might not actually do anything. And
that's fine, if that's fine. However, most of the time, if I go to set
someone's phone number, I want to know if the system thinks that person
doesn't exist.

~~~
dasil003
Exceptions for null handling? I admit there's room for healthy debate about
using exceptions for flow control in various situations, but for null handling
it sounds like an absolute nightmare. Instead of considering null at each
point it might occur in the code, you have to consider all the possible places
a null value might be stacked above you.

~~~
NateDad
Exceptions for error handling... this is just error handling, that was my
point. Returning a sentinel value (null) is just one (very suboptimal) way of
handling an error.

Instead of dataSource.getPersonById(id) returning null, it could throw a
NotFoundException. This is better because it's a specific error (as opposed to
a MalformedIdException, for example).
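
A minimal Python sketch of that distinction, assuming a hypothetical in-memory `DataSource` (the class, the exception names, and the id rules are made up for illustration, not taken from the article):

```python
class NotFoundError(Exception):
    """No person is stored under the requested id."""


class MalformedIdError(Exception):
    """The id itself is not something the data source can look up."""


class DataSource:
    def __init__(self, people):
        # people: dict mapping integer id -> person record (a plain dict here)
        self._people = dict(people)

    def get_person_by_id(self, person_id):
        # Assumed rule for this sketch: ids are non-negative integers.
        if not isinstance(person_id, int) or person_id < 0:
            raise MalformedIdError(f"bad id: {person_id!r}")
        try:
            return self._people[person_id]
        except KeyError:
            raise NotFoundError(f"no person with id {person_id}") from None


data_source = DataSource({1: {"name": "Joe", "phone_number": None}})

try:
    person = data_source.get_person_by_id(2)
except NotFoundError as err:
    outcome = f"not found: {err}"  # the caller learns *why* the lookup failed
else:
    outcome = "found"
```

The caller can now tell "no such person" apart from "that id could never be valid", which a bare null return cannot express.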

------
tbrownaw
Why? What is so bad about having your conditionals be explicit?

What happens when you need an "else" branch?

Is not having a person for the given ID an error, is it a "do nothing", or
does it mean to create a new person?

What does the callback version do if you forget to make your index unique, and
end up with multiple people with the same ID; does it get run multiple times?
The version that returns a Maybe<Person> at least has an obvious proper
response (throw an error).

~~~
masklinn
> What happens when you need an "else" branch?

Then you use one, but null checks generally aren't cases of if/else
conditionals, they're more along the lines of "either I have one or mayday
mayday". Sometimes slightly softened into "either I have one or use this
default thing"

~~~
tbrownaw
_more along the lines of "either I have one or mayday mayday"._

That would be the Perl way of

    
    
        my $person = $data_source->get_person($id) or die "No person available";
    

ie, there's code to be executed (to raise the error) in the case that there is
no person. The callback structure from the article doesn't allow for this.

------
com2kid
As someone who has only played with Ruby briefly, and at that many years ago,
I feel that this article could have done more to explain the author's concepts
to those not familiar with Ruby.

From what I can see, how is this much different than a Null check? I agree it
looks a bit better, but what is the real difference? Isn't there still a check
for if the Person object exists?

In terms of C++ or C#, is it correct to imagine this as an iterator over one
object where inside my iteration loop I am passing my object into a lambda?
(Seems overly complicated so I am guessing this is the wrong interpretation?)

Indeed a post at the bottom of the page mentions using foreach, but since sane
languages have closures nowadays (thankfully! Life is so much better now!),
how would I apply this to, say, C++?

~~~
Strilanc
The difference is that the null check is front and center, instead of hidden.

In C# or Java, the member access expression "a.b" contains a null check.
There's no way to do a member access that _doesn't_ do a null check (for
reference types)! It's impossible to distinguish member accesses that were
expected/assumed by the writer to not fail from member accesses where the
check is relied upon.

With an option type / a maybe monad / 'using telling instead of finding',
there are no implicit null checks. Only explicit checks, and only when
necessary. Either you use Option.map, to do a check and propagate nulls, or
you don't, in which case there's no null path. You don't have to worry about
null reference exceptions, making a check where you shouldn't, or missing a
check where you should have, because the compiler prevents you from making
those mistakes.
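
A rough Python sketch of the `Option.map` idea, assuming hypothetical `Some` / `Nothing` classes and a made-up `find_person` helper. Python has no compiler to enforce any of this, so the sketch only illustrates how `map` propagates the empty case:

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Some:
    value: Any

    def map(self, fn):
        # A value is present: apply fn and wrap the result again.
        return Some(fn(self.value))

    def get_or(self, default):
        return self.value


@dataclass(frozen=True)
class Nothing:
    def map(self, fn):
        # Nothing to apply fn to: propagate the empty case unchanged.
        return self

    def get_or(self, default):
        return default


def find_person(people, person_id):
    # Hypothetical finder that returns an option instead of None.
    return Some(people[person_id]) if person_id in people else Nothing()


people = {1: {"name": "Joe", "phone_number": "123-555-7890"}}
found = find_person(people, 1).map(lambda p: p["phone_number"])
missing = find_person(people, 2).map(lambda p: p["phone_number"])
```

Here `found` is `Some("123-555-7890")` and `missing` is `Nothing()`; the lambda is never called for the unknown id.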

Applying it to C++:
[https://github.com/simonask/simonask.github.com/blob/master/...](https://github.com/simonask/simonask.github.com/blob/master/maybe.markdown)

Applying it to C# (as an augment over null):
[http://twistedoakstudios.com/blog/Post1130_when-null-is-not-...](http://twistedoakstudios.com/blog/Post1130_when-null-is-not-enough-an-option-type-for-c)

Applying it to Java (as an alternative to null):
[http://java.dzone.com/articles/java-optional-objects](http://java.dzone.com/articles/java-optional-objects)

~~~
topbanana
Off-topic but I find the color scheme of your blog very offputting

------
ambrop7
All these attempts to eradicate nulls are getting ridiculous. The possibility
of nulls is most often rooted in the essential complexity of a problem.

If his function is intended to be called with a possibly incorrect personId,
the null check is perfectly fine, it's just expressing his problem. His
"solution" just hides that for no benefit - actually, it makes it less clear
what's happening.

On the other hand I think what he really needs is "assert
personExists(personId)" at the beginning of the function.

People who complain about nulls are either lazy or incompetent programmers,
and the blame just happens to fall on null due to null being an inherent
feature of a programming language. Dealing with nulls is exactly like dealing
with any other case where you need to think about multiple possibilities, and
possibly assert that some of those can't happen.

~~~
orclev
Implicit nulls are the underlying problem. If you default to non-nullable
types and only allow nulls when explicitly annotated then it becomes much
simpler to spot and handle all the cases where nulls might creep into your
code. There are multiple ways of accomplishing that of course. Rust forces you
to explicitly declare a variable as nullable and won't allow you to assign a
nullable value to a non-nullable variable without performing a null check
first. Haskell simply doesn't have a built in null value instead handling the
problem via the Maybe type which represents a nullable value. I'm sure there
are other examples in other languages, but those are the two I'm familiar
with.

The concept of null is fine, but it shouldn't be the default state of
everything. Force the programmer to declare his intention ahead of time and
have the compiler help him by trying to find and validate all the edge cases,
or at least as many as reasonably possible.

~~~
ambrop7
True, some language features can indeed make defensive code a bit less verbose,
but I believe their value is overestimated. My approach in C code is usually
to assume no pointer argument is permitted to be null unless explicitly
stated, and to _not_ add null assertions (since if is null program will most
likely crash which serves the purpose of the assertion). I don't remember
seeing a null dereference bug in my code within the last year.

~~~
orclev
Unfortunately the problem with that approach is contained in the sentence
"since if [a pointer] is null [the] program will most likely crash which
serves the purpose of the assertion". It's that most likely part, you can't
count on that, and in the case of a malicious actor if they can figure out a
way to control the declaration of the null pointer (and yes, there are ways of
doing this) you could be looking at an arbitrary code execution vulnerability.

There's also the fact that a program crashing is quite often not considered an
acceptable way to handle an error condition (it is true however that a crash
from a null de-reference vs. a crash from a failed assertion is, aside from
security implications, functionally identical). In a large code base, if
values default to null or even intentionally return null, it can be very
difficult to track down and handle every single case, particularly if someone
changes something way down in the guts of some function that changes a pointer
in a data structure to a null, but that pointer never gets accessed until
you're in some completely different area of code. Sure you can handle the null
in the location it causes a failure at, but what if what you really want to do
is figure out where the null came from and prevent it from being null in the
first place?

There's also the added advantage that adding extra effort to allow a value to
be null is a nice subtle encouragement to not allow null values if possible in
the first place.

------
jlarocco
This is a strange article. The biggest problem with null checks isn't that
they make code difficult to read - it's that people forget to add them and
that can cause bugs and security vulnerabilities. I've never known anybody to
complain that they made code difficult to read. Unless they're mixed in with
"real" program logic, like "if (obj != NULL && obj->value()<20)" then they're
really not too hard to ignore.

And the author doesn't seem familiar with many programming languages. Using
his "finders and tellers" idiom isn't as special as he thinks, and doesn't
require lambdas or blocks. Here's a similar idea in Python, not using its
lambdas:

    
    
        for person in data_source.person(id):
            person.setPhoneNumber(phone_number)
            data_source.update_person(person)
    
        # And for other situations:
        with open("whatever.txt") as inf:
            for line in inf:
                print(line)
    

C++ has had similar functionality in the STL using iterators since before
C++98.

Lambdas and blocks _are_ convenient and nice features to have, but they're not
required for this.

~~~
untothebreach
It seems like the reason for the author recommending the use of blocks/lambdas
is to put the null-checking code in the `data_source` library, rather than
user code. That is, if `data_source.getPersonById` doesn't find a person for
that id, the block/lambda doesn't get executed at all, and there only had to
be a null-check/not-found-check in one place. In your first Python example,
both `.setPhoneNumber` and `.update_person` would be executed, and both would
have to be smart enough to not persist anything from a "null object" (assuming
data_source.person(id) returned a valid `Person` object with null fields).

~~~
jlarocco
No, data_source.person(id) will return a collection of 0 or more Person
objects, and the for loop will iterate over them and call .setPhoneNumber and
.update_person methods on each of them. The null check is replaced by iteration
over a possibly empty collection.

The author's example code appears to be doing the same thing, but iterating
using Ruby's block syntax:

    
    
        data_source.person(id) do |person|
          person.phone_number = phone_number
          data_source.update_person person
        end
    

I'm assuming data_source.person(id) would return 0 or more Person objects.
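
A small sketch of what such a data source could look like in Python, assuming a hypothetical `DataSource` whose `person(id)` returns a list of zero or one `Person` (all names here are illustrative):

```python
class Person:
    def __init__(self, name):
        self.name = name
        self.phone_number = None

    def setPhoneNumber(self, phone_number):
        self.phone_number = phone_number


class DataSource:
    def __init__(self, people):
        self._people = dict(people)
        self.updated = []  # names of persons passed to update_person

    def person(self, person_id):
        # The "is it there?" check lives here, once, inside the library.
        found = self._people.get(person_id)
        return [] if found is None else [found]

    def update_person(self, person):
        self.updated.append(person.name)


data_source = DataSource({1: Person("Joe")})
for person_id in (1, 2):  # id 2 does not exist
    for person in data_source.person(person_id):
        person.setPhoneNumber("123-555-7890")
        data_source.update_person(person)
```

The loop body runs once for id 1 and not at all for id 2, so the caller never writes a null check.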

~~~
masklinn
> The author's example code appears to be doing the same thing, but iterating
> using Ruby's block syntax:

It's not iterating.

~~~
jlarocco
Does it matter? The null check is gone, so it achieves the author's primary
goal.

Conceptually, they both say "update this field for every entry having this ID"
and at a quick glance the code for each language looks almost the same.

~~~
masklinn
> Does it matter?

To an extent yes. I'd fully expect the Ruby version to run more code after the
block has been executed, the Python version much less so (even though it is
possible) for instance.

------
munificent
This is a pretty neat pattern. One nice bonus is that the code invoking the
closure has a chance to do work _after_ the closure. This lets you do nice
"scoped" behavior automatically. For example, in his Ruby code:

    
    
        data_source.person(id) do |person|
          person.phone_number = phone_number
          data_source.update_person person
        end
    

I would take that `update_person` call and have the `person(id)` method do
that implicitly:

    
    
        data_source.person(id) do |person|
          person.phone_number = phone_number
        end
    

(I'd likely optimize for the case where the person wasn't actually modified
too.) This way, the caller doesn't have to remember to explicitly update the
person.
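
A hedged Python sketch of that "do work after the closure" idea, assuming a hypothetical `DataSource` that stores flat dict records and takes the block as an ordinary function argument:

```python
class DataSource:
    def __init__(self, people):
        self._people = {pid: dict(rec) for pid, rec in people.items()}
        self.writes = 0  # counts how often a record was actually saved

    def person(self, person_id, block):
        record = self._people.get(person_id)
        if record is None:
            return  # nobody to tell, so the block never runs
        working = dict(record)  # shallow copy; enough for flat records
        block(working)
        if working != record:  # skip the save when nothing changed
            self._people[person_id] = working
            self.writes += 1

    def phone_number(self, person_id):
        return self._people[person_id]["phone_number"]


def set_phone(person):
    person["phone_number"] = "123-555-7890"


ds = DataSource({1: {"name": "Joe", "phone_number": None}})
ds.person(1, set_phone)  # modifies the record, so it is saved
ds.person(1, set_phone)  # same value again, so no save happens
ds.person(2, set_phone)  # unknown id, block is skipped
```

After these three calls `ds.writes` is 1: the caller never had to remember the update, and the unchanged case costs no write.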

Another way to look at this pattern is as a poor-man's pattern match. For
example, using the pattern-matching syntax of my language[1], you could do:

    
    
        match dataSource person(id)
        case person is Person then
          person phoneNumber = phoneNumber
          dataSource updatePerson(person)
        end
    

Granted, that's more verbose here, but it lets you have other cases if that
makes sense for your problem. Magpie has blocks too, so a literal translation
would be:

    
    
        dataSource person(id) as person do
          person phoneNumber = phoneNumber
        end
    

For very short blocks, you can use an implicit parameter name similar to
Scala:

    
    
        dataSource person(id) do _ phoneNumber = phoneNumber
    

The `do` notation is just syntactic sugar for passing a function as the last
argument, so you can also do:

    
    
        dataSource person(id, fn(person) person phoneNumber = phoneNumber)
    

How did I get derailed talking about Magpie?

[1]: [http://magpie-lang.org/](http://magpie-lang.org/)

------
batterseapower
This is just the Church encoding of Haskell's Maybe type. It is indeed a
powerful way to simulate algebraic sum types when you only have product types
in your language.
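
For readers who have not met the term, here is a minimal Python sketch of a Church-encoded Maybe: the value *is* a function that takes one handler per case. The names `just`, `nothing`, `lookup`, and `greet` are illustrative assumptions.

```python
def just(value):
    # A present value calls the first handler with the value.
    return lambda on_just, on_nothing: on_just(value)


def nothing():
    # An absent value calls the second handler with no arguments.
    return lambda on_just, on_nothing: on_nothing()


def lookup(people, person_id):
    return just(people[person_id]) if person_id in people else nothing()


def greet(maybe_name):
    return maybe_name(lambda name: f"hello {name}", lambda: "nobody home")


people = {1: "Joe"}
known = greet(lookup(people, 1))
unknown = greet(lookup(people, 2))
```

The article's block-passing version looks like the special case where the "nothing" handler is fixed to do nothing; passing both handlers gives callers the "else" branch discussed elsewhere in this thread.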

------
wwweston
Anybody who's written much JavaScript without a jQuery-like function should
appreciate this too -- used to be much more common to have code littered with
checks to see if a dom element existed, then do an operation if it did.
Performing an operation over a set (including empty) of elements seems easier.

~~~
ape4
Yes I thought of jQuery also. jQuery always returns an array. If it has no
elements that means the thing wasn't found.

------
masklinn
This dovetails into the classical "tell, don't ask": without good support for
anonymous functions or blocks, for a number of operations you'll have to ask
an object for parts of its internal state, munge that state then shove it back
into the object. Said object may return a command protocol instead, but it's
generally verbose and exceedingly rare.

Blocks provide a conduit to provide compound and possibly complex commands
while leaving the object itself in charge.

------
fexl
The problem I have with his example is that I would probably want to do this:

    
    
        Person person = dataSource.getPersonById(personId);
        if (person != null) {
            person.setPhoneNumber(phoneNumber);
            dataSource.updatePerson(person);
        } else {
            user.notify("That's an invalid ID.");
        }
    

So I need an "else", in effect, regardless of whether it's implemented as a
conditional, block, lambda, exception, or whatever.

------
Locke1689
If you did this everywhere your program would be slow as shit due to all the
closure allocation in inner loops. Of course, this doesn't matter much for
Ruby, which is already slow as shit, but C# and Java people would be pissed.

~~~
mrmekon
This is the response I was looking for. We keep making computers 2x faster,
and then making software architecture 4x more wasteful (given, it's also 20x
faster dev time).

"if (ptr == NULL)" is usually two instructions and an instruction pipeline
flush. That sucks, but I can't even imagine what the processor ends up
executing for the Ruby lambda callback. Slightly more readable code at what
cost?

~~~
masklinn
> That sucks, but I can't even imagine what the processor ends up executing
> for the Ruby lambda callback.

That's not a very good question, `if object == nil` will already be a pretty
huge number of instructions in ruby.

Now if the question is "could you have this pattern compile down to little
more than a null check", the answer is why not? A bit of flattening/inlining
should be able to handle it correctly.

And in most fields, the gain in safety way outweighs the almost unnoticeable
(against background noise) loss in efficiency.

~~~
Locke1689
_could you have this pattern compile down to little more than a null check_

The answer for C# and Java, which this issue was directed at, is no.

Email me if you want details.

~~~
masklinn
> The answer for C# and Java, which this issue was directed at, is no.

I'm not sure what you're talking about. _You_ focused on Java and C# for odd
and unknown reasons; TFA merely uses Java as an example of a language with
syntactically heavy (or even missing) closures to illustrate the pattern's
gains and differences.

~~~
Locke1689
Did you read the article? This is the beginning of it:

 _The other day, I saw this example on StackOverflow:

    
    
        Person person = dataSource.getPersonById(personId);
        if (person != null) {
            person.setPhoneNumber(phoneNumber);
            dataSource.updatePerson(person);
        }

This looks like a typical case where we need a null check._

If the canonical example under discussion is in Java, I think it is safe to
assume the article is about Java.

~~~
masklinn
> I think it is safe to assume the article is about Java.

No. The article is about a pattern of explicit null-check avoidance. Java is
an example of "a language without blocks or lambdas", but the article is no
more about Java than it is about Ruby. You're not even missing the forest for
the trees, you're missing the forest for a fallen leaf.

------
tome
The astounding lengths languages will go to to avoid implementing sum types.

------
robotresearcher
An idiomatic Java/C++ OO approach is to have an interface to the datastore
that eliminates the need for the NULL. That way you don't need to know the
internal details of the datastore object.

datastore.ChangePhone( personId, number );

could return a bool to indicate success/failure or throw a "personId not
found" exception according to taste.

This is normal Java style and is in line with the article's recommended "tell-
don't-get" approach.

Requires boilerplate inside datastore object, but hey, that's not exactly news
for OO.

~~~
ajanuary
OO doesn't mean boilerplate. Just because there's a correlation in languages
like C++ and Java doesn't mean there's causation. Most of their boilerplate
derives from the type system.

------
SigmundA
Surprised no one has mentioned Objective-C's sending a message to nil, which
is relied on throughout Apple's frameworks. It is basically what the article
says is a lot of work to implement, but it is given to you for free in Obj-C:
[http://stackoverflow.com/questions/156395/sending-a-message-...](http://stackoverflow.com/questions/156395/sending-a-message-to-nil)

C# kinda does this for nullable types, it is referred to as null lifting.

~~~
mikeash
Sadly, it's not too consistent there. For example, this is OK:

    
    
        [[obj find] doThing]
    

But this will crash on nil:

    
    
        [array addObject: [obj find]]
    

Nop-on-nil can be convenient, but without APIs that also nop on nil
parameters, it can be a little annoying when you still have to check half the
time.

------
tieTYT
I love the Object Mentor guys' blog articles, but they get a new blog every 2
years so it's really hard to find their articles in one place.

I remember reading one, I think by Michael Feathers, that was about how
sometimes your code gets larger when you refactor and that's expected and
good. He compared code to an orange. The rind is the class/method signatures.
The pulp is the implementation. If the rind gets larger by reducing the size
and DRY violations of the pulp, you can consider that worthwhile. If anyone
can find this article I'd be grateful. I haven't been able to track it down
myself.

~~~
tbrownaw
_He compared code to an orange. The rind is the class/method signatures. The
pulp is the implementation. If the rind gets larger by reducing the size and
DRY violations of the pulp, you can consider that worthwhile._

That sounds very very wrong, as you're making your code's clients do more work
/ be more complex just so you can simplify your implementation.

~~~
tieTYT
That's a huge assumption on your part. You should show me an example in code
of what you think I'm saying.

~~~
tbrownaw
It sounded like you were saying that making your code's interface bigger _in
order to_ make your code's implementation smaller, is a Good Thing.

~~~
tieTYT
No, I'm saying if avoiding a DRY violation creates more methods/classes it's
worth doing.

------
tmuir
Can we please dispense with the whole "How did these morons get anything
accomplished before I came along?" pattern?

The article, comments on the article's page, and a lot of comments here are
all variations on "This perfectly acceptable practice makes my eyes bleed, and
is the calling card of shitty programmers everywhere. After I switch some
semantics, rearrange the deck chairs, etc, it's now perfect in every way,
invulnerable to bugs or misuse."

No pattern is universally applicable. There are lots of ways to accomplish a
task. Coming up with a different solution does not invalidate what came
before.

~~~
nilliams
Eh, seems like you exaggerated the article/point only to then tear down your
own extreme interpretation of it.

>> After I switch some semantics, rearrange the deck chairs, etc, it's now
perfect in every way, invulnerable to bugs or misuse

I don't see anybody claiming that.

------
glurgh
If you're particularly bothered by this you can just make your query
interfaces uniformly return iterables. Which is what already typically happens
in your average low-level DB API, in languages lambdaful and lambdaless.

------
Pxtl
Wouldn't simply returning a list and then using your language's favourite
foreach construct have functionally the same semantics without the confusion
of passing code-blocks around?

~~~
dragonwriter
Never saw "passing code blocks around" as a source of confusion, and in many
languages (Ruby, for a salient example) "passing code blocks around" is the
same mechanism as is used in the language's main foreach construct.

------
stevepotter
Sounds great except often something needs to be done when a value is null,
like giving user feedback. If the answer then is to provide another block,
just go back to the if statement.

------
radiowave
It may be I'm missing something here. Do we really need to tack block handling
onto every method that can change some state? I don't see any benefit over the
following idiomatic Smalltalk:

    
    
      (self dataSource getPersonById: id) ifNotNil: [:person |
      	"operate on person here"	
      ].
    

(There's also ifNil:ifNotNil: for where you want to handle both cases.)

------
DanWaterworth
Congratulations, you invented monads.

~~~
Strilanc
You're probably being downvoted for being sarcastic about a good thing (at
least... I think that's sarcasm?).

Independently discovering Option.map and realizing its usefulness is _good_.
Point at the existing work instead of making fun.

[http://en.wikipedia.org/wiki/Option_type](http://en.wikipedia.org/wiki/Option_type)
(well.. maybe a bit more introductory than that..)

~~~
DanWaterworth
I didn't mean to come off sarcastic. Though, reading it back, I can see how
you could have thought that. I'm enthusiastic whenever anyone inadvertently
discovers functional concepts.

~~~
akkartik
Can you elaborate? I don't see how this pattern is a monad.

~~~
tmoertel
The underlying pattern is captured by the Maybe monad. Its bind rule takes (1)
a computation producing a value of type "Maybe a" and (2) a function accepting
an "a" and producing a computation producing a value of type "Maybe b". It
then returns a new computation that will produce "Nothing" if the first
computation produces "Nothing"; otherwise, it will take the value it received
from the first computation ("Just x") and pass the value "x" to the function
(2) to get a new computation, which it will then perform. Thus if either the
first computation or the second produces Nothing, the result will be Nothing.
Further, if the first produces Nothing, the second will be short-circuited
away and never evaluated.

These semantics are basically what you implement by hand when you join
together computations that may produce NULL results and check for NULLs in
between. But now the computer is doing the checking for you, so the checking
is hidden and yet _never gets forgotten_.

For example, here's some Haskell code that provides a simple lookup database
from names (type "a") to phone numbers (type "b"):

    
    
        --          name   phone number
        persons = [("Joe", "123-555-7890"),
                   ("Tom", "432-555-0987")]
    

The lookup service for this database _may_ return a result, thus it returns a
value of the Maybe type to indicate that it may not be able to find a "b" for
every "a" you give it:

    
    
        lookup :: Eq a => a -> [(a, b)] -> Maybe b
    

But since Maybe is a monad and its bind rule takes care of the didn't-get-a-
result checking for us, we can safely string together lookup computations
without having to do any checks by hand. Nevertheless, we can be assured that
all the checks will be done.

For example, let's create a function that takes two persons' names, looks up
their phone numbers, and (if both are found) connects them with a (simulated)
call.

    
    
        -- try to connect person a's phone to person b's phone
        connect a b = do
            phonea <- lookup a persons
            phoneb <- lookup b persons
            return $ "connected " ++ phonea ++ " to " ++ phoneb
    

If we try to connect two names that are known to our lookup service, the call
goes through as expected:

    
    
        *Main> connect "Tom" "Joe"
        Just "connected 432-555-0987 to 123-555-7890"
    

But if either or both of the name lookups fail, no call is made:

    
    
        *Main> connect "Tom" "SomeUnknownDude"
        Nothing
    
        *Main> connect "SomeUnknownDude" "Tom"
        Nothing
    
        *Main> connect "SomeUnknownDude" "AnotherUnknownDude"
        Nothing
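
For readers more at home outside Haskell, the `connect` example above can be approximated in Python. This is only a sketch under an assumption the Haskell version does not need: `None` stands in for `Nothing`, which conflates "not found" with a stored value that is legitimately `None`. The `bind` helper is written by hand here, not taken from any library.

```python
persons = {"Joe": "123-555-7890", "Tom": "432-555-0987"}


def lookup(name, table):
    return table.get(name)  # None plays the role of Nothing


def bind(value, fn):
    """Pass value to fn, unless value is None, in which case short-circuit."""
    return None if value is None else fn(value)


def connect(a, b):
    # Each bind performs the "did we get a result?" check, so the body
    # of the innermost lambda only runs when both lookups succeeded.
    return bind(lookup(a, persons), lambda phone_a:
           bind(lookup(b, persons), lambda phone_b:
                f"connected {phone_a} to {phone_b}"))


both_known = connect("Tom", "Joe")
second_unknown = connect("Tom", "SomeUnknownDude")
first_unknown = connect("SomeUnknownDude", "Tom")
```

As in the GHCi session, only `both_known` holds a connection string; the other two results are `None`, and when the first lookup fails the second one is never attempted.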

------
sambeau
I once wrote a programming language that had a localised 'this' called 'it'
exactly for this pattern.

    
    
      if (something_exists) {
        do_something_to(it)
      }
    

It was really useful (and readable).

I believe Perl has something similar but it was less readable.

~~~
draegtun
re: Perl - You're probably thinking of `$_`, which most Perl programmers call
_it_.

However this only gets topicalised in bare _for_ & _while_ but not _if_.

So you could do...

    
    
      for (something_exists) {
        do_something_to($_);
      }
    

... as long as _something_exists_ doesn't return a list :)

A more common approach I've seen in Perl (and other languages where variable
declarations return their value) is this:

    
    
      if (my $it = something_exists) {
        do_something_to($it);
      }

------
anon1385
Anybody know why this got flagged off the front page?
[http://hnrankings.info/5893950/](http://hnrankings.info/5893950/)

~~~
bfung
I don't know _why_, but take it as a signal that this isn't as interesting
to the HN community.

If you think about it, it's not avoiding null checks, it's just moving it to a
different place. The null check still needs to happen in order to know when to
call the block, it just happens to be in the datasource.whatever code.

~~~
masklinn
The point is to move the check out of the library user's way, and out of his
ability to forget about it. Same as block-scoped resource patterns (RAII,
context managers, unwind-protect, whatever).

------
bandushrew
that has got nothing to do with blocks.

you could have pretty much the same code in any language by doing something
like:

    
    
        List personList=database.personListForID(id)
        for(person in personList)
        {
            person.setPhoneNumber(phoneNumber)
        }
    

Personally I prefer that idiom over null checks, it is a lot easier to read. I
do like the OP names for 'Finder' vs 'Teller' though, that is a good way to
describe the difference, that I haven't heard before.

------
utf8guy
This is essentially the Maybe monad.

------
Rickasaurus
Nice try secret functional programming guy.

