
Early exit is a tail call optimization of procedural languages - vkhorikov
http://enterprisecraftsmanship.com/2015/12/14/early-exit-is-a-tail-call-optimization-of-procedural-languages/
======
jpollock
I've been in the "only a single return" and "early return plz" camps at
various points in my career.

Right now? I'm early return.

When I wrote C and C++ code with locks and malloc/free? One exit only.

It's all about avoiding bugs.

In C code heavy with locks and memory allocation, you need to make sure your
post-conditions are correct. That is near impossible when there is more than
one return [1].

In C++, with RAII and Java with "try-with-resources", I can ensure proper
cleanup. At that point, it's about making it easy for someone who is reading
the code to avoid screwing it up. I need to make it easy for them to flow
through the code, keeping the cognitive load as low as possible. Functions do
what they say on the box and _only_ what they say on the box. At that point,
my goal is to make a decision, act on it and move on. I don't want to carry
that state forward - it means I have to keep it in mind for every line after
it. I'm even trending towards having the majority of the methods in my classes
be static functions with no state.

Finally, if I've got a function that has two different flows and I'm deciding
between them with an if? I've probably actually got two different functions
pretending to be one. I then try to refactor and get rid of the if.

[1] http://programmers.stackexchange.com/questions/154974/is-this-a-decent-use-case-for-goto-in-c/154980#154980

~~~
legulere
But in C you usually use `goto Error;` which basically is just a `return` +
you can do the cleanup yourself.

~~~
CyberDildonics
That means that you have to put all your unlocks and deallocations there AND
you have to check them to see if they were allocated/locked in the first
place.
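
A rough analogue of the shared cleanup block, sketched in Python rather than C: `finally` stands in for the `Error:` label, and `Resource`, `run`, and `fail_at` are hypothetical names invented for this illustration. Because the cleanup code runs on every path, each release has to be guarded by a check that the resource was actually acquired:

```python
class Resource:
    """Hypothetical stand-in for a lock or an allocated buffer."""

    def __init__(self, name, log):
        self.name = name
        self.log = log
        log.append(f"acquire {name}")

    def release(self):
        self.log.append(f"release {self.name}")


def run(fail_at, log):
    lock = None
    buf = None
    try:
        if fail_at == "start":
            return "Error: bad input"  # nothing is held yet
        lock = Resource("lock", log)
        if fail_at == "after_lock":
            return "Error: could not allocate"  # only the lock is held
        buf = Resource("buf", log)
        return "Success"
    finally:
        # Shared cleanup: release in reverse order, but only what was acquired.
        if buf is not None:
            buf.release()
        if lock is not None:
            lock.release()
```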

------
Stratoscope
The error checking example conflates two different things:

• Early returns vs. single return.

• Excessive nesting vs. simpler flatter if/else.

The real problem with the "before" example is the excessive nesting. I do
prefer early returns myself, but even if you wanted to use the single-return
style, you could simply un-nest the whole thing:

    
    
        public string Execute( int integer, string str, bool boolean ) {
            string result;
         
            if( integer <= 42 ) {
                result = "Error: Integer is too small";
            } else if( string.IsNullOrWhiteSpace(str) ) {
                result = "Error: String is null or empty";
            } else if( ! boolean ) {
                result = "Error: Incorrect boolean";
            } else {
                result = "Success";
            }
         
            return result;
        }
    

It's a bit ironic that this is C# code, because ReSharper would offer to un-
nest the code for you.

Also, this really has nothing to do with tail call optimization.

~~~
barrkel
Frequently the checking is dependent on passing the previous clause. Something
like:

    
    
        if (x == null)
            return "x is required";
        if (x.length < 3)
            return "too few elements in x";
        if (x[0] == null)
            return "first element missing";
    

The point here being that each successive check is testing something that is
only valid if the previous check passed. This can't be so easily converted to
your if/else pattern without re-encoding the early exit in the form of a
complex boolean (&& and || have early exit built in).

The concept's applicability to tail call optimization is only by analogy with
the mental stack; just as tail call optimization sets up the call such that
the current stack frame no longer exists, early exit sets you up to read
the rest of the code without regard to any potential alternate code paths
(i.e. a potential else branch on an early if statement) later; the mental
frame of a long if statement no longer exists. The analogy is a little bit
stretched; of course it is not a technical isomorphism, but nobody ever said
it was.

~~~
Stratoscope
As I mentioned, I prefer early returns as well, and I wasn't recommending the
if/else structure as an alternative to early returns.

My only point in showing that code was to illustrate what I see as the bigger
problem in the example: the use of nested if and else statements when non-
nested ones could be used instead. Personally I would get rid of the nesting
by using early returns, but for someone who objects to that for any reason,
the non-nested if/else chain is a nice alternative.

Also, the series of returns you show here could easily be written as a chain
of if/else for those who prefer that style. I'm not sure why you said it
requires a complex boolean expression; this code is equivalent to the series
of returns:

    
    
        string result;
        if( x == null ) {
            result = "x is required";
        } else if( x.length < 3 ) {
            result = "too few elements in x";
        } else if( x[0] == null ) {
            result = "first element missing";
        } else {
            result = "the real result";
        }
        return result;
    

Again, I don't particularly like this style, although it is better than the
deeply nested if/else. I prefer the series of returns that you listed in your
comment.
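
For anyone who wants to check the equivalence claim, here is a small sketch in Python (chosen only because it is easy to run; `check_early` and `check_chain` are hypothetical names). Each later check depends on the earlier one having passed, and both styles still give the same answer, because an `elif` branch is only evaluated when the previous condition was false:

```python
def check_early(x):
    """Series of early returns."""
    if x is None:
        return "x is required"
    if len(x) < 3:
        return "too few elements in x"
    if x[0] is None:
        return "first element missing"
    return "the real result"


def check_chain(x):
    """Single return with a flat if/elif chain."""
    if x is None:
        result = "x is required"
    elif len(x) < 3:  # only reached when x is not None
        result = "too few elements in x"
    elif x[0] is None:  # only reached when x has at least 3 elements
        result = "first element missing"
    else:
        result = "the real result"
    return result
```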

------
tikhonj
Return statements and "early exit" are not built into functional languages. If
the language, like Scheme, supports continuations, we can use callCC to
implement an early return[1]:

    
    
        (define (example x)
          (call/cc (lambda (return)
            (when (< x 0) (return #f))
            ; more code, possibly including more calls to return
            0)))
    

However, most functional languages do not have callCC built in. Happily we can
get a similar effect by writing our code in continuation-passing style
(CPS)[2]. (Think Node.js.)

Nominally, CPS has some overhead because of the extra function calls involved.
However, a CPS transform also puts every call in _tail position_ , making them
eligible for tail call elimination. This means that we can use CPS with no
function call overhead in a language with proper tail calls.

What does this mean in practice? It means that calling a callback in CPS
style—which, being in tail position, can be eliminated—is effectively _just_ a
jump under the hood. Not too different from return!

Putting it all together, we get that tail call elimination lets us write code
in continuation passing style with no extra overhead, which gives us access to
continuations that we can use to implement an early return in languages
_without_ return.

This turns out to _not_ be what the article was on about, but I think it's
more interesting.
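
A minimal sketch of the idea in Python, assuming nothing beyond plain closures; `product_cps`, `product`, `k`, and `escape` are hypothetical names chosen for illustration. Every call is in tail position, and calling `escape` instead of `k` abandons the pending multiplications, which is the early return. One caveat: CPython does not eliminate tail calls, so this shows the shape of the technique, not the zero-overhead version, and very long lists would hit the recursion limit:

```python
def product_cps(xs, k, escape):
    """Multiply the numbers in xs, written in continuation-passing style.

    k      -- continuation that receives the product of xs
    escape -- continuation used to bail out early
    """
    if not xs:
        return k(1)
    head, *tail = xs
    if head == 0:
        # Early return: skip every pending multiplication held in k.
        return escape(0)
    # Tail call: the remaining work is packed into the new continuation.
    return product_cps(tail, lambda rest: k(head * rest), escape)


def product(xs):
    identity = lambda value: value
    return product_cps(list(xs), identity, identity)
```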

[1]: Code from a StackOverflow answer by Nathan Shively-Sanders which goes
into more detail:
http://stackoverflow.com/questions/2434294/scheme-early-short-circuit-return

[2]: I wrote a brief explanation of CPS on Quora:
https://www.quora.com/What-is-continuation-passing-style-in-functional-programming/answer/Tikhon-Jelvis?share=1

~~~
ridiculous_fish
Is it possible to use CPS with functional primitives like map? Or do you have
to implement CPS-savvy replacements? It doesn't seem like a CPS transform
would let you early-exit from map in tail position.

~~~
tikhonj
Right, you'd need a CPS version of map. Luckily map isn't really a
"primitive", it's just a library function, so using a different version of map
is not a big deal.

One way to think about it is that languages with callCC built in are just
transformed to CPS by the compiler before being run, which includes
transforming the definition of map. (In fact, this is a reasonable strategy
for compiling Scheme, I believe.)

It's pretty easy in Haskell which wraps CPS into a monad (Cont) because you
use mapM from the Control.Monad library instead of having to write a
specifically CPSed version of the function.
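
To make the "CPS version of map" concrete, here is a small sketch in Python (not Haskell's `Cont`; `map_cps`, `parse_all`, and `done` are hypothetical names invented for illustration). The mapped function receives a continuation for the rest of the traversal, so it can exit early simply by not calling it:

```python
def map_cps(f, xs, k):
    """CPS map: f takes (item, continuation); k receives the mapped list."""
    if not xs:
        return k([])
    head, *tail = xs
    # f decides whether the traversal continues by calling its continuation.
    return f(head, lambda y: map_cps(f, tail, lambda ys: k([y] + ys)))


def parse_all(strings):
    """Parse every string as an int, or give up at the first bad one."""
    done = lambda result: result

    def parse(s, k):
        if not s.isdigit():
            # Early exit: the remaining elements are never visited.
            return done(None)
        return k(int(s))

    return map_cps(parse, list(strings), done)
```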

------
fantasticsid
I would rather see more articles like this on hacker news!

~~~
dang
Anyone who wants to share their preference with us is welcome to email
hn@ycombinator.com. We're interested.

------
jkot
Original code seems like a really bad way to check validity of parameters. Is
this type of programming common in some languages? I have never seen similar
code in Java.

~~~
jpollock
Some groups have a "Functions can only have a single entry and single exit
point" rule, meaning you have to jump through a lot of hoops.

Sometimes it is required. In Java, before try-with-resources, if you had any
resource management (locks, file descriptors, sockets), a single exit point
made resource management simpler. C code tended to have similar rules (or
GOTO's to jump to the exit block in the function). C++ just used RAII, and
made it the destructor's problem.

~~~
SideburnsOfDoom
Right - that single-exit-point rule is generally a holdover from C.

It is pointless in languages like C# or Java which have GC and 'try' or
'using' blocks. In some cases the single return is _nicer_ , but this is not
universal nor is single return "required" in all cases.

In functional languages such as F#, Haskell or Erlang where the method is
usually side-effect free and uses pattern-matching, multiple return-points are
completely normal and no-one even sees an issue.

------
bjacks
I think that if the author re-wrote the first method in a cleaner way,
composed of lots of smaller methods with descriptive names then the code would
be a lot more readable, and you wouldn't even need to debate the whole
multiple returns versus single point of return thing because your "mental
stack" could handle either option equally well.

For example:

    
    
      public string Execute(int integer, string str, bool boolean) {
        string result;
    
        if (isValidInteger(integer)) {
           result = validateStringAndBool(str, boolean);
        } else {
           result = "Error: Integer is too small";
        }
        return result;
      }
    
      // Assumed helper: the original snippet left isValidInteger undefined.
      // The threshold mirrors the `integer <= 42` check shown earlier in the thread.
      private bool isValidInteger(int integer) {
         return integer > 42;
      }
    
      private string validateStringAndBool(string str, bool boolean) {
         if (!string.IsNullOrWhiteSpace(str)) {
           return isValidBoolean(boolean);
         } else {
            return "Error: String is null or empty";
         }
      }
     
      private string isValidBoolean(bool boolean) {
         if (boolean) {
            return "Success";
         } else {
            return "Error: Incorrect Boolean";
         }
      }
    

And yeah, I can't see how this has anything to do with tail call optimization
other than a fluffy analogy to stacks - mental and programmatic.

------
hellofunk
Would be nice to see more serious study of this for a language like C++. It
would take either TCO or Garbage Collection to make reliable, robust
persistent data structures a possibility in C++, and more would prefer TCO of
these two. That would open a big door into more powerful functional
programming idioms for C++ than currently exist, even with the latest language
standards.

~~~
adrianm
How would TCO help? I've never read anything about the connection between TCO
and persistent data structures. GC on the other hand is a known, and often
assumed, requirement in the literature. Sounds interesting!

~~~
jacquesm
TCO can be done more readily if you know for a fact that all the parameters
passed into the function were not modified during the execution of the
function.

~~~
GFK_of_xmaspast
More reasons to be aggressive about marking things const.

------
glastra
To me, this only makes sense if you always read source code sequentially.
Humans are not machines, and most often you will not be reading code
(especially code with branches) in a top-bottom manner.

Having multiple return statements in a method actually renders the method LESS
readable. I understand the first proposal is also usually implemented with
many return statements, but they could be replaced with an assignment to a
variable that is then returned at the end of the method (as in the example).
If that variable is final (or whatever the C# equivalent is), that's even
better.

~~~
adamc
I don't think I fully understand your comment. Regardless of whether I've read
the whole program top-down, it is definitely easier for me to understand
early-exit logic in a routine than to keep the mental stack of conditionals
going, particularly when (as is often the case) the early exits consist of
various deviations from the "main sequence" code (e.g., validations failing or
other edge-cases).

~~~
phamilton
"Oh... We call function foo. Let's go see what function foo returns. (Scroll
to the end of the function)"

With a single return, I can at least follow that thread fairly easily and
sometimes it's just a few lines back in a large function. With many returns,
I'm forced to start at the beginning of the function.

There are a dozen or so other culprits that make this difficult. Assignment in
a block can be just as bad, for example, since I can't know which assignment
occurred without looking at all of them.

~~~
camgunz
I like this ergonomics argument, particularly from a reader's perspective. But
having had to read some single return value code recently, I found that the
return value returned at the end of the function was rather unhelpful; it
usually boiled down to "return rv;". I then had to go back up to the top of
the function and follow through the nest of passed conditionals and side
effects, fall throughs, and failed conditionals to see how "rv" was modified
throughout the function. The non-linearity cost me a lot more thought, whereas
with multiple return value code the failed conditionals are "tossed" out of my
head very quickly. I think, given that you almost always end up reading the
whole function anyway (which is probably good), linear is better than non-
linear and consistency (bail early) is better than inconsistency (anything
goes) when it comes to error handling.

~~~
phamilton
I come from an FP background, and so I tend to avoid multiple assignment as
well. In such a scenario, there's a single line that assigns rv. The variables
used in assigning rv are each assigned just once. I can follow the graph
fairly easily until I get to the bit I need. Often it's something like
"what conversion rate for currency are we using?" and following that chain
gets me there quickly.

~~~
camgunz
Yeah, I can definitely see how if you're disciplined it can be effective. But
in my case, I've seen integer return values hold status codes from multiple
modules and libraries, file handles, and enum values all in the same function
just to twist everything together so that no matter where it stops executing,
"rv" has some kind of value so that the ending "return rv" won't just return
success. It was total madness, and it's really hard to do anything as
complicated and fragile as that with early returns.

I will admit though that no programming style will sufficiently protect
against bad coding :). I'm sure even if this person did everything I advised,
they'd still find a way to overcomplicate and confuse things. So maybe the
issue is moot, or at least bigger than style.

------
legulere
This would be pretty awesome to have as a lint in static code analysis tools.

------
askafriend
For startups too

