
Designing state machines - adipasquale
https://drivy.engineering/designing-state-machines/
======
idbfs
A paper that changed my approach to designing state machines is "Statecharts:
A Visual Formalism for Complex Systems" by Harel
([http://www.wisdom.weizmann.ac.il/~harel/papers/Statecharts.p...](http://www.wisdom.weizmann.ac.il/~harel/papers/Statecharts.pdf)).

Statecharts (also called hierarchical state machines) are essentially
generalized state machines which allow for nesting and parallel composition of
states. The 'nesting' part is my favourite, since it allows one to delegate
event handling logic shared by multiple states to a 'parent' state, reducing
code duplication.

The great thing about this paper is that you can glean most of its key ideas
by just looking at the diagrams.

~~~
nickpsecurity
Exactly. UML looked like over-complicated crap when I saw it after reading on
Statecharts and Yourdon. The latter are still in use on occasion in high-
assurance with Tenix's Dats Diode being an example.

------
activatedgeek
While I understand the importance of state machines and its direct usage in
complex Business Logic Workflows (and indirectly almost everywhere), there is
nothing special in this post. It is a generic comment on State Machines. Am I
missing something here?

~~~
bri3d
A surprising number of entrepreneurs and developers are unfamiliar with the
application of state machines to business logic workflows, so it's useful to
see the topic presented in a succinct way. Good learning content tends to get
upvoted on HN, especially when the discussion around the concept (for example,
the discussion of State Machines and state machine libraries in this thread)
is interesting as well.

~~~
clairity
you don't even have to be unfamiliar with state machines to have found some
usefulness here. sometimes it's just nice to be reminded of a concept as
you're wrangling the 20 different problems you face as an entrepreneur on any
given day.

the blog post wasn't particularly deep, but the (short) discussion here was
worth perusing, as i'm considering employing a state machine for my rails app
and have been generally looking for options and best practices (in a mental
background thread). but it looks like most (all?) existing gems assume a fixed
set of states and transitions, and i was hoping to find something designed
primarily for user-defined states and transitions.

------
zcdziura
There was an article posted here a while back on how to implement a state
machine in Rust, purely using its type system to facilitate state transitions.
It's a neat concept, even if it's a bit more verbose than what you'll find in
other languages.

[https://hoverbear.org/2016/10/12/rust-state-machine-
pattern/](https://hoverbear.org/2016/10/12/rust-state-machine-pattern/)

~~~
nuclx
How does this compare to the expressivity of Idris' dependent types? I'm
intrigued by the idea of type-safe state-machines as e.g. file i/o and network
socket code is almost always error-prone. So it feels crucial to rely on the
support of a strong type system here.

[http://docs.idris-lang.org/en/latest/st/machines.html](http://docs.idris-
lang.org/en/latest/st/machines.html)

------
andreineculau
FWIW there is a standard to describing (finite) state machines despite the RFC
being expired [https://tools.ietf.org/html/draft-bortzmeyer-language-
state-...](https://tools.ietf.org/html/draft-bortzmeyer-language-state-
machines-01)

I have used an "improved" version of it
[https://github.com/andreineculau/cosmogol-
abnf](https://github.com/andreineculau/cosmogol-abnf) when working on
[https://github.com/for-GET/http-decision-diagram](https://github.com/for-
GET/http-decision-diagram)

------
estsauver
I can't recommend Akka's state machines enough. It's an incredibly simple and
rock solid state machines built right into the library.
([http://doc.akka.io/docs/akka/snapshot/scala/fsm.html](http://doc.akka.io/docs/akka/snapshot/scala/fsm.html))
We're already on a Play/Scala backend so adding these was incredibly easy for
us, but I think they're one of the hidden amazing features of the akka/scala
world.

~~~
djb_hackernews
I just looked through and I may have missed it but do you know if this does
distributed FSM management?

~~~
AdieuToLogic
Akka FSM in and of itself does not address distributed FSM's. However,
supporting such would likely involve their clustering[0] support and perhaps
distributed data[1].

0 -[http://doc.akka.io/docs/akka/current/scala/index-
network.htm...](http://doc.akka.io/docs/akka/current/scala/index-network.html)

1 - [http://doc.akka.io/docs/akka/current/scala/cluster-
usage.htm...](http://doc.akka.io/docs/akka/current/scala/cluster-
usage.html#distributed-data)

------
karboosx
There is good tool for making diagrams (so state machines as well) called
mermaid[1], just by writing simple readable text.

[1] [https://github.com/knsv/mermaid](https://github.com/knsv/mermaid)

~~~
nerdponx
I wanted to like Mermaid, then I realized it was just a neutered version of
GraphViz. I had wanted to write a Mermaid-to-Graphviz compiler but never got
around to it.

It produces pretty ugly graphs that can't be exported to any other format.
Meanwhile just about ever graph manipulation program (e.g. Gephi) can handle
GraphViz DOT format.

~~~
karboosx
I didn't know GraphViz. I have to try it. DOT language looks very similar to
Mermaid.

It was too hard to make that compiler or you have encountered another problem?

~~~
nerdponx
I didn't even know what BNF was when I had the idea ;)

It's still on my list of short projects. I have a lot of learning to do,
knowing nothing about writing parsers and such. Yes, I know I could probably
hack it together with a bunch of garbage Python. But that just does not sound
fun to write, let alone maintain.

~~~
karboosx
Apparently some people also wanted graphviz syntax:
[https://github.com/knsv/mermaid/issues/5](https://github.com/knsv/mermaid/issues/5)

I also find another project called Viz.js[1]. It's graphviz in javascript.
Looks promising. It's hard to choose between this two. On the one hand is very
simple syntax, and on the other hand more flexibility.

[1] [https://github.com/mdaines/viz.js/](https://github.com/mdaines/viz.js/)

------
troels
When dealing with business processes, I feel that the evented nature of a
formal fsm can be a bit off-putting. I have on several occasions used a batch-
processing approach, that maintains state in a persistent record, giving many
of the same characteristics as an fsm. Let's say we have our movie from the
example (pseudo-code, obviously):

    
    
        class MovieState < Model
          belongs_to :movie
        end
    
        class InitialStep
          def select
            MovieState.where(state: 'initial')
          end
    
          def process(movie_state)
            movie_state.state = 'in_production'
          end
        end
    
        class InProductionStep
          def select
            MovieState.where(state: 'in_production')
          end
    
          def process(movie_state)
            if movie_state.movie.finished_shooting?
              movie_state.state = 'in_theaters'
            end
          end
        end
    
        class MovieWorkflow
          def initialize
            @steps = []
            @steps << InitialStep.new
            @steps << InProductionStep.new
          end
    
          def run
            @steps.each do |step|
              step.select.each do |state|
                step.process(state)
                state.save
              end
            end
          end
        end
    

This makes it very clear when things are happening. It's also easy to debug
and test.

Of course, there are a lot of limitations, most notably that it can't deal
with an async event happening. In my experience though, this can be dealt with
by storing the payload of that event somewhere in the db, where the step-
selector can access it on next run. One thing this model deals very well with,
is delays, which is often part of business processes.

Any thoughts? Does this design have a more formal name?

~~~
rebootthesystem
That's the problem with kids these days. Everything has to be a friggin class.

You know we did some pretty amazing things with computers before OO. We even
landed people on the Moon.

Can anyone program without OO any more?

Just saying.

~~~
troels
I recognise that your tongue and cheek may be in close proximity here, but
care to elaborate?

~~~
rebootthesystem
It's just a silly observation. I come from a generation that had to create
products like AutoCAD to run on CP/M systems with 64K of RAM and precious
little permanent storage. As an aside, few people know that software like
AutoCAD started on such --by today's standards-- crippled systems.

Anyone who came up in software development through that path has a full
understanding of the cost of every bit of code written, every abstraction.

Today's programmers, unless they've taken university courses to expose them to
the low level stuff, are extremely isolated from the innards of what they
write. They reach for classes and objects by default and seem incapable of
thinking outside of these boxes. All of this adds overhead and bloat,
something you had to be keenly aware of with limited memory and horsepower.

I do understand that a higher level of abstraction in the context of massive
amounts of memory and horsepower can be nice. What gets me is that this is
often an unnecessary default.

Going back to state machines. One does not to create a bunch of classes and
methods to implement them. I see that almost as a violation of the conceptual
simplicity of state machines.

I mean, to go in a different direction, I use state machine in FPGA's all the
time. The synthesized hardware works fine and represents the bare minimum
necessary to implement these very high speed state machines. No classes
involved. One can do the same in software development.

~~~
troels
Well.

I'll call it idiomatic form to use OO style in Ruby, which the article (and my
examples) were written in. Considering the interpreted and managed-memory
model of the language runtime, it is utterly pointless to worry about a few
object instances. Heck, this is a language where primitives are objects.

I suspect we work in quite different areas though, which probably gives us
very different perspectives of what matters. The kind of applications I build
on a daily basis, deal with business processes, and integration with external
systems. I/O is going to be 90% of my machines load, if it's ever doing any
real work in the first place. Most of the time it'll just sit waiting for a
human to do something though.

~~~
rebootthesystem
Well, of course, "When in Rome do as the Romans".

Not saying OO has no place. Of course not. That would be crazy.

------
pwm
I have recently created some dead simple code to explain the concept to
colleagues whom, as the article notes, were not using this very simple and
effective construct at all to control state transitions in business apps.

As the article does not really explain how state machines work let me try to
do it here so maybe you will go back to your code and put one in place :) In
general, if anywhere in your code you have some property/field called
status/state/etc... that you update depending on events then chances are that
a state machine would improve the robustness of your code.

Ok, so you have some states. For business apps this usually comes from
business. For our dummy example we will simulate a day (using Haskell here for
conciseness, but no worry, it's trivial and you don't need to know any Haskell
to understand it):

    
    
        data State = Awake | Dressed | Fed | Ready | Work | Cafe | Home | Asleep
    

Now most apps stop here and thus they don't have a machinery to control how to
move between these. If we were to draw a graph where vertices are the above
states and edges are how we go from one to another then, without any
restriction, we would implicitly get a complete graph where you can go from
any state to any state. Eg. from "Ready" I could go straight to "Asleep"
without "Work" :) Our aim is thus to restrict transitions from one state to
another to transitions that actually make sense. Making sense obviously
depends on the problem we are solving. For our example let's introduce the
following events:

    
    
        data Event = Alarm | Dress | Eat | GoToWork | Think | Yawn | Coffee | GoHome | Read | WatchTV
    

Now that we have states and corresponding events we can draw a graph where
vertices are states and edges are events. The graph is now explicit in that we
can only move between states (vertices) via events (edges). To put this graph
into code, let's create a function that takes a pair of state and event, eg.
(Asleep, Alarm) and returns a new state, eg. Awake:

    
    
        myDay :: (State, Event) -> State
    

As said above, myDay is a function that takes a (state, event) pair, which we
can also call a transition, and returns a state. This function is nothing more
than a lookup table, aka. transition table, where we will write down how to
transition between states:

    
    
        myDay (Asleep,  Alarm)    = Awake
        myDay (Awake,   Dress)    = Dressed
        myDay (Awake,   Eat)      = Fed
        myDay (Dressed, Eat)      = Ready
        myDay (Fed,     Dress)    = Ready
        myDay (Ready,   GoToWork) = Work
        myDay (Work,    Think)    = Work
        myDay (Work,    Yawn)     = Cafe
        myDay (Cafe,    Coffee)   = Work
        myDay (Work,    GoHome)   = Home
        myDay (Home,    Read)     = Asleep
        myDay (Home,    WatchTV)  = Asleep
        myDay (_, _)              = error "invalid state transition"
    

That's it, we have codified our graph. We explicitly stated the valid
transitions and anything else is by definition invalid. We are pretty much
done at this point. A state machine itself can be thought of as a generic
function that takes a transition table, a transition and returns a state. This
is only to decouple a specific transition table from the general machine
itself, but it literally does nothing other than lookup whether the transition
is in the table. If yes then it gives back the corresponding new state
otherwise we get and error.

You can see a pretty picture of the above graph and some code here:
[https://github.com/pwm/fsm](https://github.com/pwm/fsm)

~~~
pavanto
Brilliant. Superb explanation! Just couple of days I was trying to understand
this from my team mates and it took him 30 mins to make it complicated.

------
Eclyps
For anyone else working in ruby, I've grown to love the "Workflow" gem[1] for
my state machines. It really helps to streamline things, and also has a built-
in way to generate a visual of your class's states and transitions.

[1] [https://github.com/geekq/workflow](https://github.com/geekq/workflow)

~~~
tcopeland
Looks interesting! I've always used
[https://github.com/aasm/aasm](https://github.com/aasm/aasm), but this looks
pretty good.

More generally I get worried about my ActiveRecord objects getting too bulky
and business logic getting all wound up in post-transition callbacks and such.
But that's something that can be managed with service objects and being
disciplined about triggering state changes.

~~~
Eclyps
I used this for a huge project with a lot of states. Separated my workflow out
into individual files for each class so they could just be included at the
top. Then for each transition method, I created a service object (Interactor
gem[1]) to handle all of my complicated transition logic, so the object-level
transition methods would just check for a success or failure and perform the
transition appropriately. Really love how it turned out.

[1]
[https://github.com/collectiveidea/interactor](https://github.com/collectiveidea/interactor)

------
jugg1es
Can't recommend state machines enough. A well designed state machine makes
troubleshooting straightforward and makes adding new features a breeze without
refactoring much.

------
siddhant
In case you're using Python, I can't recommend this library enough -
[https://github.com/pytransitions/transitions](https://github.com/pytransitions/transitions)
. An absolute pleasure to work with.

~~~
carapace
For goodness' sake, no! I just ripped that package out of our production code.
It's terrible! It adds methods to your object at runtime, crazy stuff.

I replaced it with just this:

    
    
            class StateMachine(object):
    	    '''
    	    A very simple FSM class.
    
    	    Params:
    	        initial: Initial state
    	        table: A dict (current, event) -> target
    	    '''
    
    	    def __init__(self, initial, table):
    	        self.current_state = initial
    	        self.state_table = table
    
    	    def __call__(self, event):
    	        '''Trigger one state transition.'''
    	        self.current_state = self.state_table[self.current_state, event]
    
    
    	class Foo(object):
    
    	    STATE_TABLE = {
    		(current_state, event): next_state,
    		...
    	    }
    
    	    def __init__(self, xid, token, config):
    		self.fsm = StateMachine('start', self.STATE_TABLE)
    
    	    def begin(self):
    	        while self.fsm.current_state not in {'success', 'error'}:
    	            method = getattr(self, self.fsm.current_state)
    	            self.fsm(method())
    	        if self.fsm.current_state == 'error':
    	            self.report_error()
    
    
    

If you're using Python anything more involved than a dict mapping (current,
transition) -> next is just overkill. Use it, don't abuse it. ;-)

------
jbritton
I typically try to avoid state machines. I like to have a function like model
= reduce(model, event) where model is an environment of variable:value
bindings. This function then just uses (if expr/else if expr) to find a
condition in which to process the event. Processing the event involves
changing the values in the model environment. I suppose on some level the two
approaches are the same, but they feel and look different to me. If the
"if/else if" pattern is deeply nested, then this is kind of like traversing a
state machine to reach the correct state for processing the event. So in this
case maybe a state machine is more efficient. However, a state machine,
involves a history. You really need an external diagram. You need to see how
you get to a state and how you transition out of a state. An "if/else if" type
approach puts all the decision making in the code right in front of me. I
don't need a diagram. The "if/else if" approach is often much more concise as
the expressions are a powerful modeling approach. Finally, I prefer to avoid
deeply nested if/else if" conditions, because it can be a chore to traverse
the tree and really understand the path taken to get to the event handling
spot. If the conditions can be written as linear list of rules where each rule
can be thought about in isolation, I find this easier. This often involves
repeating test conditions. Just flatten the decision tree by repeating the
preconditions. This is again less efficient, but if you save the results of
the expression computations then it becomes a simple test of Boolean values. I
like to think of this approach as a rules engine instead of a state machine. A
state machine encodes information in the state diagram so that it doesn't need
to be stored in variables. The rules engine encodes all the knowledge in
variables and rules.

~~~
rebootthesystem
You might want to read-up on table driven state machines. I'm not sure you
have a full view of what's possible.

------
jrmiii
In regard to tip 4, I've been using statesman[1] which allows you to store the
transitions as table backed active record objects along with any contextual
metadata.

Depending on how much auditability you need on why transitions happened, this
may be preferable to papertrail.

[1]
[https://github.com/gocardless/statesman](https://github.com/gocardless/statesman)

------
xtiansimon
The podcast Podcast.__init__('Python')[1] recently interviewed the developer
of Automat module [2]. The author explains when you should consider using a
state machine in your Python project. I didn't realize there were so many
types of state machines, and the author explains where his module fits into
this ecosystem.

[1]: [https://www.podcastinit.com/automat-state-machines-with-
glyp...](https://www.podcastinit.com/automat-state-machines-with-glyph-
lefkowitz-episode-116/)

[2]:
[https://pypi.python.org/pypi/Automat/0.5.0](https://pypi.python.org/pypi/Automat/0.5.0)

------
agopaul
What a coincidence, I've been working on a PHP state machine implementation
myself in the last few days.

Anyway, I find the concept of "events" a bit weird in this case and I prefer
to work with "states" only (state transitions rather than events that change
the state). It seems like an unnecessary abstraction to think in terms of an
event rather than a state.

~~~
pwm
Ultimately state is a function of time. Events are thus temporal effects from
the outside world that can, depending on their values, cause a transition from
a state to another state within the machine. If you model the machine as a
graph where vertices are states and edges are events then a transition can be
seen as a function from a (state, event) pair to a state, ie. (CurrState,
Event) -> NewState. I've posted a simple explanation below, check it out if
you like.

------
sandGorgon
What about the DB side of things ? How do you model the state graph in the DB
?

I have been thinking of using Postgresql CTE to do this. When a cycle occurs,
I just create a new step to the dB.

Anyone from CRM startups who have interesting learnings here ?

~~~
adipasquale
If you mean modeling the state graph itself, it's usually not modeled in the
db but only in the code. It could indeed be interesting to store the graph
itself if it evolves often, or at least a version number.

If you were speaking about storing the instances lifecycles, a simple model
using a RDBS is to store one row per transition event in a separate table.
This is what the papertrail[1] gem does for example.

[1]
[https://github.com/airblade/paper_trail](https://github.com/airblade/paper_trail)

~~~
arethuza
"Enterprise" applications such as ERPs or CRMs often have fairly complex
workflow features where _everything_ is in the underlying database - both high
level workflow/state-machine definitions and compiled code.

------
Olap84
Anyone got tips for working with FSMs in REST and/or MVC? Very rarely do you
find HTTP verbs match FSM events leading to a fudge somewhere

~~~
smizell
You might find it useful to spend some time looking into the hypermedia
constraint of REST itself (i.e. hypermedia as the engine of application
state). The point of REST is to represent state machines by including
hypermedia links in the responses of the API. Those links describe the state
transitions a client may invoke in a given state.

Instead then of mapping CRUD operations to HTTP, you would map your
application semantics to HTTP methods via link relations. Link relations let
you go as far past CRUD as you would like to go depending on your domain and
allow you to describe your states and state machine.

I always think a good way to think about a hypermedia state machine is in the
context of HTML. Consider a todo app example. You might enter the application
and get an empty list of todo items. If you have permissions to add a todo,
you might have a "create todo" HTML form. Once you use that form, that todo
has a new form called "mark complete." Once invoked, that todo has new
transitions called "mark incomplete" or "archive." Once archived, you might
see a new form for "unarchive." All of this captures the state machine and
transitions in the REST API itself using domain-specific semantics.

Of course, there are other ways of solving this problem, but REST with
hypermedia is a great way to work with state machines. There are lots of
hypermedia JSON formats out there if you're interested in exploring (e.g. HAL,
Siren, Collection+JSON, etc.).

------
starjockey
yes,and now skip all the bs and grab yourself a real tool
[http://twysf.users.sourceforge.net/#A6](http://twysf.users.sourceforge.net/#A6)
with a real manual
[http://twysf.users.sourceforge.net/ccide_man.shtml](http://twysf.users.sourceforge.net/ccide_man.shtml)
.combine this with m4 and generate efficient working code for luajit and c.and
as a bonus you learn m4,a macro processor that can take output from cli
programs and insert it inside your source code.much more pragmatic than lisp

