

The death of if/else cascades, a lightweight alternative - kyleburton
http://leftrightfold.com/?p=85

======
DrJokepu
Forgive my ignorance but how is this different from a switch statement?
Besides the slightly different syntax, of course. Many modern languages
implement switch statements as lookup tables. In compiled languages that
lookup table is generated at compile time, whereas this approach builds the
lookup table at runtime, possibly wasting expensive CPU cycles.

Basically we're dealing here with a special set of condition-action pairs. The
property that makes them special is that the conditions are mutually
exclusive, that is, the order of condition checks is not important. Of course,
most of the time, non-exclusive conditions can be restructured to exclusive
conditions as long as the condition checks themselves don't have side effects
(that is, they don't modify the state). This is a solved problem in software
construction. For binary conditions, you use if statements (or the equivalent
in Your Favourite Language); for simple multiple conditions, you use a switch
statement; and for complex state-condition checks, you create a class/closure
that takes the state and determines whether the condition holds, plus another
function that executes the action, and then you loop through the condition
checkers. We're talking about concepts that have been around for
almost half a century. There's no need to reinvent the wheel. Instead, read
some books and learn how developers solved these basic software construction
problems in the past.
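
A rough Python sketch of that last case (condition checkers paired with
actions, looped over in order). The names `RULES` and `dispatch` and the
channel fields are made up for illustration and are not from the article:

```python
# Hypothetical sketch: each rule pairs a side-effect-free condition
# with an action; the first condition that holds decides the action.
def record(channel):
    return "record " + channel["title"]

def skip(channel):
    return "skip " + channel["title"]

RULES = [
    (lambda ch: ch["genre"] == "football", record),
    (lambda ch: ch["genre"] == "comedy" and not ch["repeat"], record),
    (lambda ch: True, skip),  # catch-all, like a final else
]

def dispatch(channel):
    for condition, action in RULES:
        if condition(channel):
            return action(channel)
    raise LookupError("no rule matched")  # only reachable without a catch-all

print(dispatch({"genre": "comedy", "repeat": False, "title": "QI"}))  # record QI
```

Because the first two conditions are mutually exclusive, their order does not
matter; only the catch-all has to come last.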

The problem is, many developers are too smart for their own good. They come up
with new "design patterns" while ignoring the mountains of experience and
knowledge past programmers have built up. More often than not these
"groundshaking new ideas" lead to huge pains one way or another later down the
codebase lifecycle. This is why C++ is so hard; you can come up with many
"innovative" ways to do stuff only to discover a year later that your
"innovation" made your codebase completely unmaintainable. Don't get me wrong,
innovation is really cool and I support it but it doesn't hurt to consult the
existing knowledge (either by research or asking more experienced developers)
before going down your path.

~~~
nostrademons
Dispatch tables give you some additional flexibility. For example, say that
you want to wrap each function with some statistic-gathering in debug mode.
You could implement this easily with:

    
    
      if(DEBUG) {
        for(var key in actions) {
          actions[key] = wrapDebug(key, actions[key]);
        }
      }
    

In a switch or nested if-else, you would need to include that debug processing
inline with each alternative, instead of factoring it out into its own
closure-generating function.
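
The same idea in Python, as a minimal sketch. `wrap_debug`, `call_counts`
and the two handlers are invented for the example; the only point is that
the wrapping happens in one loop rather than inside every branch:

```python
import functools
from collections import Counter

DEBUG = True            # assumed flag, mirroring the DEBUG check above
call_counts = Counter() # hypothetical statistics store

def wrap_debug(key, func):
    """Return a closure that counts calls before delegating to func."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        call_counts[key] += 1
        return func(*args, **kwargs)
    return wrapper

actions = {
    "football": lambda title: "record " + title,
    "crime": lambda title: "skip " + title,
}

if DEBUG:
    # Rewrap every entry in one place; the handlers themselves are untouched.
    for key in list(actions):
        actions[key] = wrap_debug(key, actions[key])

actions["football"]("Cup Final")
actions["football"]("Replay")
print(call_counts["football"])  # 2
```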

Similarly, compilers often use this approach for instruction generation. The
set of available machine instructions is stored as a table; if you move to a
different architecture, you just replace the table with one appropriate to the
new architecture.

This is not a new idea. SICP has a whole section on data-directed programming,
and MapReduce and many other internal bits of Google infrastructure use this
extensively.

~~~
DrJokepu
Compilers are special because their input/output data is code and
instructions. However the code of the compiler itself is clearly separated
from the code the compiler emits. When the compiler uses the lookup table to
look up the code to emit it really looks up data for output and not code to be
executed (right away).

------
iamaleksey
This is much better handled in languages with pattern matching. And you can
match on arbitrary fields.

Erlang:

    
    
      surf(Channels) ->
        lists:foreach(fun process/1, Channels).
      
      process(#channel{genre = "football"} = Channel) ->
        record(Channel);
      
      process(#channel{genre = "comedy", repeat = false} = Channel) ->
        record(Channel);
      
      process(#channel{genre = "crime", show_title = "Cops"}) ->
        skip;
      
      process(#channel{genre = "crime"} = Channel) ->
        record(Channel);
      
      process(_) ->
        skip.
    

Full gist: <http://gist.github.com/250304>

~~~
daleharvey
the problem with pattern matching is that it's hard to dynamically change the
dispatch table; in JavaScript I can just do

    
    
      dispatch_table[newthing].handler = function() .....

~~~
silentbicycle
You can change the clause database in Prolog by asserta/assertz/retract, but
that runs headfirst into the typical trade-offs involving mutable state, as
does your JavaScript example. Erlang also lets you change patterns by
hot-loading code, but its design goes to great pains to do so with reasonable
safety.

Also, pattern matching doesn't require all of the variables to be bound. A
pattern may specify some values but just partially specify (a list with at
least one remaining value) or constrain (any positive integer) others. This
makes it vastly more flexible than just dispatching based on one value.

~~~
daleharvey
certainly, I am an Erlang programmer and do love pattern matching, but as much
as I pretend to hate JavaScript, I do love that modules/namespaces/whatever
are first-class citizens in a way they certainly aren't in Erlang (they are
at the VM level, but not so much in the language; smerl is quite handy for
that).

I should probably play with match specs more often

------
barrkel
Two points.

First, there's the distinction between essential complexity and non-essential
complexity. What the OP is talking about is trying to reduce the non-essential
complexity by using a more appropriate abstraction. Nested if-then-else can
introduce non-essential complexity merely by having dead nested cases which
are actually obviated by outer cases; looking at a complex version of this
code, it can become quite difficult to see where exactly the code is supposed
to flow under what circumstances, as there can seem to be conflicting
assumptions in different areas.

Secondly, once upon a time I invented a scheme to solve this problem in the
best way I thought possible, and called it "matrix inheritance". The problem
with inheritance and subclassing is that it only handles a single dimension of
customization. Suppose you have two dimensions, genre and type, such as
[comedy, drama] and [movie, series]. If you were to try and classify any given
thing under a classical type breakdown, you could subclass by one axis or the
other, but you would need to duplicate subclassing for the remaining axis. So,
you could end up with Programme, Comedy <: Programme, Drama <: Programme, but
then you'd need ComedyMovie, ComedySeries, DramaMovie, DramaSeries,
duplicating the type axis in the two different branches.

The matrix inheritance concept basically takes the cartesian product of all
option axes, essentially modelling it as an n-dimensional space, and then
applies behaviour to sub-spaces of this space. So, you could apply conditions
to [drama, _] and [_ , series], with these two representing slices of the
example 2-dimensional space described above. The advantage of modelling things
this way is that it is declarative: you can analyse overlaps and identify
uncovered space.
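
To make the overlap and coverage analysis concrete, here is a small Python
sketch over the two example axes. It is not the original "matrix inheritance"
implementation; `RULES`, `covers`, `analyse` and the behaviour labels are all
invented for illustration:

```python
from itertools import product

GENRES = ["comedy", "drama"]
TYPES = ["movie", "series"]
ANY = "_"  # wildcard, as in [drama, _]

# Each rule names a slice of the genre x type space and a behaviour label.
RULES = [
    (("drama", ANY), "needs_subtitles"),
    ((ANY, "series"), "record_all_episodes"),
]

def covers(pattern, point):
    """True if every axis of the pattern is a wildcard or equals the point."""
    return all(p == ANY or p == v for p, v in zip(pattern, point))

def analyse(rules, axes):
    """Return (points no rule covers, points covered by more than one rule)."""
    uncovered, overlapping = [], []
    for point in product(*axes):
        hits = [label for pattern, label in rules if covers(pattern, point)]
        if not hits:
            uncovered.append(point)
        elif len(hits) > 1:
            overlapping.append(point)
    return uncovered, overlapping

uncovered, overlapping = analyse(RULES, [GENRES, TYPES])
print(uncovered)    # [('comedy', 'movie')]
print(overlapping)  # [('drama', 'series')]
```

Since the rules are data, this kind of check can run before any behaviour is
executed, which is hard to do with nested if/else.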

~~~
sjs
Have you seen Subtext? It's an interesting semi-visual language from Jonathan
Edwards at MIT.

[1] <http://en.wikipedia.org/wiki/Subtext_programming_language> [2]
<http://subtextual.org/subtext2.html> (demo)

These days the idea is developed under the name Coherence. Jonathan recently
released another paper on the subject at OOPSLA.

[3] <http://coherence-lang.org/>

~~~
scrod
Thank you for posting this! I'd seen this system a while ago and have been
wishing I'd bookmarked it ever since!

------
amix
This is a common practice in Python and most of the implementations I have
seen (and used) use it in the following way:

    
    
        handler = {
            'football': handle_football,
            'comedy': handle_comedy,
            'crime': handle_crime
        }.get
    
        handler('crime')(...)

~~~
domnit
As a dynamic language, Python already has similar tables built in. You can use
something like:

    
    
      getattr(self, 'handle_%s' % action)

~~~
kyleburton
If action comes from outside the program, this can open up unexpected
behavior. The dispatch table is closed - there's less chance that user input
will invoke something unintended.
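
A small sketch of the difference, with hypothetical handler names
(`handle_shutdown` stands in for any method that was never meant to be
reachable from input):

```python
class Handlers:
    def handle_football(self):
        return "recording football"

    def handle_crime(self):
        return "recording crime"

    def handle_shutdown(self):  # internal, not meant for user input
        return "shutting down"

h = Handlers()

# Open lookup: any handle_* attribute is reachable from the input string.
def dispatch_open(action):
    return getattr(h, "handle_%s" % action)()

# Closed lookup: only the keys listed here can ever be invoked.
TABLE = {
    "football": h.handle_football,
    "crime": h.handle_crime,
}

def dispatch_closed(action):
    handler = TABLE.get(action)
    if handler is None:
        raise KeyError("unknown action: %r" % action)
    return handler()

print(dispatch_open("shutdown"))  # shutting down
try:
    dispatch_closed("shutdown")
except KeyError:
    print("rejected")             # rejected
```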

~~~
domnit
In a real world program you would validate input and make sure you get
something callable (with either technique). The only additional unexpected
behavior is when there could be unexpected handle_xxx attributes. If we take
for granted that the programmer defines the dict for the explicit dispatch
table, though, we can also assume that the programmer defines the object for
the implicit table.

~~~
eru
I feel the version from the original comment is most Pythonic. The other
version is close to an eval-function, which seems to be frowned upon in
Python.

~~~
cma
The second version is slower too. Here's what I use for e.g. a state machine,
which follows "don't repeat yourself" a little better than the first example:

    
    
      # {state_name: state_action}
      states = {}
      
      state = ['start']
      
      def reg(func):
        states[func.__name__] = func
        return func
      
      # define and simultaneously register the states
      @reg
      def start():
        state[0] = 'second'
      
      @reg
      def second():
        state[0] = 'last'
      
      @reg
      def last():
        state[0] = None
      
      
      # run the state machine
      while state[0]:
        print(state[0])
        states[state[0]]()

~~~
eru
Is the second version slower because the string won't get intern-ed?

~~~
cma
Yeah, and the string must be reinterpolated every time (possibly barring some
of the more exotic Python implementations).

------
thaumaturgy
I must be turning into a cranky oldpants programmer.

As I see it, this eliminates a large block of logic, but it doesn't actually
make the program less complex.

That is, the program is still making the same decisions, and you still have
the same amount of fragmented business logic handling all of your special
cases; it's just no longer dropping into those special cases by way of a
monolithic if/else statement.

On the other hand, this is potentially adding a dangerous bug in the program;
if your dispatch table gets corrupted for some reason, or an off-by-one (OBO)
error gets introduced (that _never_ happens!), you could end up trying to jump
to a random pointer address.

EDIT: Higher level languages, like JavaScript, will handle it somewhat more
gracefully of course. But, it would still cause an error of some kind.

~~~
fragmede
Cranky pants. :)

How would the dispatch table get corrupted? If the dispatch table is compiled
down into an executable in exactly the same manner an if/else would be, then I
can't think of a reasonable corruption vector that would corrupt a compiled-in
dispatch table, but not a similarly compiled-in if/else block of code.

As for an off-by-one, that /never/ happens in an if-else block either, and as
long as you do bounds checking, you might call the wrong function, but that's
not a 'random pointer address'.

You have a very good point that this technique doesn't actually lessen program
complexity. It does, however, help gather the fragmented business logic into
tables that are more easily dealt with when the business end changes.

The 'error of some kind' is just as much an error as a missing 'else' in a
long if/else if/else if block or a 'default' section in a switch.

~~~
swolchok
> How would the dispatch table get corrupted? If the dispatch table is
> compiled down into an executable in exactly the same manner an if/else would
> be, then I can't think of a reasonable corruption vector that would corrupt
> a compiled-in dispatch table, but not a similarly compiled-in if/else block
> of code.

Without doing any actual experiments and assuming native code, the dispatch
table is liable to go into the .data section of the binary with the rest of
the globals/statics, which should be read/write (if you're lucky, you made it
const and it's in .rdata which is read-only). An unmitigated buffer overflow
in a different item in .data is liable to corrupt the table. Typically, the
.text section (the program code) is not writable.

------
domnit
Schematic tables [<http://subtextual.org/OOPSLA07.pdf>] are an extension of
this idea that can handle arbitrarily complex conditionals. Unfortunately,
they only work in Subtext [<http://subtextual.org/>], an unreleased research
language. Schematic tables take advantage of the 2-dimensional visual layout
of Subtext (as opposed to linear text, like most languages); it would be very
interesting to see something similar in text.

------
randallsquared
> New behavior can be injected without changing any core code.

While it probably is easier to change the dispatching part of this code now,
I'd hesitate to call the dispatching part the "core" code; it seems like the
functions that actually do the work have a better claim to that, so what this
does is to move the core code away from the calling location, and removes the
name, so the core code has to be found by tracing or simulation, rather than
by reading. Of course, a comment pointing at the file and line of the actions
table would be nice, but might get out of date, and since the table is passed
rather than global, at some point there's going to be more than one actions
table that do other things as well...

This code will be a nightmare to debug when there's a few thousand lines of
similar redirections. :/

~~~
Deestan
> This code will be a nightmare to debug when there's a few thousand lines of
> similar redirections. :/

It is. The SCons build tool is a prime example of this technique being
used so much that it hampers debugging.

Using this pattern successfully requires hard discipline; you have to
concentrate to keep your code readable.

------
DannoHung
This technique is ancient for anyone that has an associative array and
function objects.

~~~
rbanffy
The design pattern of one is the syntax of another.

------
zaphar
Erlang's pattern matching is a superior form of this concept. Clojure's
multimethods also implement a similar style to Erlang's pattern matching
dispatch. Both of those avoid some of the objections described in other
comments here. The code is still in core and is easier to read than a dispatch
table.

------
xtho
Instead of mimicking method dispatch, shouldn't this be handled by a factory
that returns an appropriate action object?

Okay, the author writes:

> Skeptics may say, that problem has been solved in the Object Oriented world
> by [...] Wow, that is a lot of work, and think about the number of classes
> and lines of code you will end up with!

First, you don't have classes in a prototype-based language. Secondly, do it
right or don't do it at all.

------
Nycto
This looks like a functional (The paradigm, not the adjective) version of the
Visitor pattern: <http://en.wikipedia.org/wiki/Visitor_pattern>

------
clistctrl
So this strikes me as a fairly common problem. If I was going to solve it I
don't think I would use a "dispatch table"; I would prefer to use polymorphism.
In C# maybe I would have an interface with the common method, an abstract
class to provide a default implementation, and then several classes which
either implement the interface directly or inherit from the abstract class.
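
Roughly what that layout could look like, sketched in Python rather than C#
(an abstract base class plays the interface, and the default implementation
lives in an ordinary base class). All class names here are hypothetical:

```python
from abc import ABC, abstractmethod

class ChannelHandler(ABC):
    """Plays the role of the interface: one common method."""
    @abstractmethod
    def handle(self, title):
        ...

class DefaultHandler(ChannelHandler):
    """Base class supplying the default behaviour."""
    def handle(self, title):
        return "skip " + title

class FootballHandler(ChannelHandler):
    """Implements the interface directly."""
    def handle(self, title):
        return "record " + title

class CrimeHandler(DefaultHandler):
    """Inherits the default and overrides it except for one special case."""
    def handle(self, title):
        if title == "Cops":
            return super().handle(title)
        return "record " + title

class ShoppingHandler(DefaultHandler):
    pass  # uses the inherited default unchanged

print(FootballHandler().handle("Cup Final"))  # record Cup Final
print(CrimeHandler().handle("Cops"))          # skip Cops
print(ShoppingHandler().handle("Gadgets"))    # skip Gadgets
```

Something still has to map a genre to the right class, so a lookup of some
kind usually remains, just moved into a factory.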

------
clistctrl
if/else cascades will live as long as junior developers themselves.
Jeez, some of the code I've personally written when I was younger...

