Hacker News new | comments | ask | show | jobs | submit login
Why developers should be force-fed state machines (shopify.com)
173 points by Titanous on June 13, 2011 | hide | past | web | favorite | 50 comments

This is mostly a description of state, not of state machines.

Status, published, or paid fields on your objects have little to do with state machines if the code is not written with assertions about the current state of the application. These fields are just... well plain state, and no machine.

One would learn more about state machines from looking at the Ruby gem that's linked to in the article: https://github.com/pluginaweek/state_machine

My post isn't meant as a description about state machines; I point to wikipedia for that. The Ruby gem also gives a very nice introduction indeed.

However, the point I try to make is very much about state machines, because modelling state using a state machine, forces you to think about all possible state transitions. This will make it less likely that you application will get into an unexpected state. Also, keeping an audit trail is very much about state transitions, because the transitions describe the behavior of the system, not the state itself.

I've always thought that gem's a bit silly. It makes sense to think in terms of state machines, but to code them explicitly?

Yet another misteriously killed post, by someone who's not deadbanned:

In [1] , fleitz [2] wrote

Explicitly using a state machine often yields benefits in terms of reusability and clarity. The state machine pattern also works really well with async code where it quickly becomes unapparent what's going on.

[1] http://news.ycombinator.com/item?id=2650347 [2] http://news.ycombinator.com/user?id=fleitz

Weird, that's one of the most insightful comment here. Why would it be killed?

The same can be said of any of the lesser patterns. People who turn _everything_ into a Singleton for instance. Or who want everything to be a Factory or a Command.

State machines should be force-fed simply because they are the simplest computational model, and if they are sufficient for a task, to use something more complicated would be illogical. Indeed, in computer science school, they generally are force-fed, followed of course by force-feedings of push-down automata and various Turing Machines. (And if your web app is modeled as a very long tape, that is probably bad.)

Great post. I think it is important that more developers learn the importance of managing state. Almost every application ends up with the kind of bugs where you have two properties set on an object that are mutually exclusive, and you can do nothing but scratch your head and try to reproduce the steps that got you there.

Even more important than in Rails-style server-side MVC, though, is using state management in stateful client-side MVC, like Mac, iOS/Android, and web applications. (See http://gmoeck.github.com/2011/03/10/sproutcore-mvc-vs-rails-... to understand the difference.)

At least with Rails, the flow through your application is pipelined M->C->V and the debugging is significantly easier. If you think about an iOS application, for example, your application is starting off in a different state every time, and is constantly being modified by the user. If you ever get into bad state, it can be very hard for the user to recover; especially if that bad state gets persisted to the file system.

One problem with state machines is that they grow in complexity very quickly, and the tools that were given to most people in their CS curriculum don't help you manage this fast growth. However, applications that are mission-critical still need the robustness of formalized state management.

David Harel, while working on software for fighter jets, came up with Harel statecharts, which describe a formalism for parent and child states. These are also very popular in embedded systems, such as pacemakers, where users could die if the system fails:


We've been preaching statecharts pretty hard in the SproutCore community, although largely internally at Apple, and included a built-in statechart library in our 1.5 release. Mike Cohen, the maintainer of the SC library, has a ton of great resources on his blog:


I think especially in the arena of web applications, we need to start spreading the word about the benefits of statecharts, not least of which is the easy ability to regenerate state from URLs. It requires discipline and effort upfront, but so does unit testing, and I think that's a battle that the development community has largely won.

This is increasingly important for client-side devs as we use the History API to transition between pages without reloading. This upends our whole predictable state model starting with a page load and a clean slate.

Case in point: I recently built a web app / site with simple content, but very complex page-to-page animated SVG transitions.

The easiest part of the project was implementing the animated transitions. By far the hardest part was managing UI state, since you could enter the app from any URL endpoint.

Thing is, it was only the hardest part because I thought about it last, after I'd built the whole thing assuming a predictable initial state. An approach starting with a statechart might have saved me a ton of trouble.

Almost every application ends up with the kind of bugs where you have two properties set on an object that are mutually exclusive, and you can do nothing but scratch your head and try to reproduce the steps that got you there.

I agree that managing state is important in any large application. I've studied formal machine but they have certain limitations - often a single module will have several different kind of states and some ad-hoc dependencies with the outside world.

Personally, really simple solutions have worked best for me. A) Name your states and state-variables really well - appropriately naming or renaming a state can clarify a huge amount of code. Editors that allow automated renaming of variables are fabulous. B) Add "choke-point" functions which guarantee a consistent state through each program "cycle" (each cycle of user interaction or whatever "main loop" a program has). C) ASSERT, ASSERT, ASSERT. Even if you assert too much, it forces you to think about your code. It frustrates me that Ruby doesn't has a costless ASSERT even though a minimal-cost equivalent can be Jury-rigged in a line.

One problem with state machines is that they grow in complexity very quickly, and the tools that were given to most people in their CS curriculum don't help you manage this fast growth.

This certainly isn't the case in all CS programs. In my university we had a notorious class where we had to use Rational Rose RT to create a state machine representing some distributed system (usually a game like pacman, or cops and robbers), then we actually added C++ code to those states and transitions to make the system actually run.

There are of course always teams that decide to make a single state with a single transition holding all their code, but most people utilized nested states, history states, and all those "fun" things.

The course however was usually a nightmare as the version of Rational Rose RT we were using was ancient and unstable. Also, since this was supposed to simulate a real project, the requirements changed half way through the semester causing lots of swearing, lots of all-nighters, and lots of delivery lasagna.

You can still see "I HATE ROSE" and "ROSE SUCKS" etched into desks and walls in certain computer labs.

Last time I ran into a business process coded as a state machine in the wild it was this horrible mess of code that no one could understand. After staring at it for a while, going through it line by line... finally the lightbulb went off and I was "oh! It's a state machine! I know what they're trying to do now!"

Kind of a Neo "I know Kung Fu" moment.

Unfortunately, no one else on the team knew/remembered anything about state machines, so my epiphany didn't help them out any, even if they had had the same kind of classical Comp Sci education as me.

Naturally, the first thing I did with this power was to leverage it into World Domin... no wait, that was something else. :D What I did with this knowledge was to hassle the people until they gave me a diagram of what the state transitions (or whatever they called them in Business Analyst land) were supposed to be, and then I went back and compared them, and they were not the same :(

Erlang comes with gen_fsm behavior because often when writting network protocols and servers, a state machine is a useful abstraction.


"Rage Against The Finite-State Machine" from "Learn you some Erlang For Great Good"


In Haskell, you can represent a state machine and its current state using an iteratee:


An iteratee is essentially just a function that takes an input symbol and returns either a final state value or another iteratee.

So, more often than not, you'll catch your state machines attaching behavior directly to your objects. Which sometimes bugs you, because you want your states to be less about "nouns" and more about "verbs".

And eventually you notice that, in most cases, you crave coordination across those "nouns" (e.g.: "a transition in model A provokes a transition in model B"). The real workflow now lies "at the intersection of two models". You start to wish for the equivalent of "process management" in addition to state management.

So this, coupled with general unhappiness for the sort of anti-modular/anti-abstraction problem you see in state machines, might incite you to look at "workflow engines" like Ruote.


TL;DR I am drinking the 'ruote' kool-aid right now. Augment your state machines with a great coordinator in the sky, a few levels higher in the architecture stack.

(BTW: I'm paraphrasing this argument from John Mettraux's "state machine != workflow engine" post, and Kenneth Kalmer's Ruote presentation from this year. Really changed my thoughts on the matter recently.)

Using the state_machine gem in one of our Rails apps, which tracks a workflow where items can be received, entered, reviewed, rejected, etc, helped me think through the business process and map that to our code.

We also logged all the changes in state. In hindsight, I should have added a "previous log entry" field to each log item, to make it easier to trace the history of any given item. Like "find all the rejections, hop back one through the log history and show me who created that item." If the rejection references the creation's id, that's easy; otherwise it's a query using widget ids and the date and sorting to get the most recent entry before the rejection.

It is interesting how few languages make state machines easily read when the appear in code ("what's with all the ifs?").

In languages with first class functions and hash tables, you can use this instead (here in Lua):

    state1.goto = {
        state2 = function() ... end,
        state3 = function() ... end    

Much cleaner (even though the Lua syntax for function literals is a bit heavy. It would look much nicer in CoffeeScript...).

Yeah, state as functions is really nice, but it gets messy in languages that don't have tail recursion elimination (eg Javascript), as you have to have a trampoline mechanism to call the next state, I generally return the new state function and get the trampoline to call it. Which is less readable then just calling the new state as in your Lua example...

You could use polymorphism - but that scatters the transition logic to the four winds.

I guess pattern matching would work well (haven't used it that way myself).

An actual state transition table seems the most straightforward way to represent it - but I agree, it's striking that there isn't native support for it. Perhaps there isn't a better way to do it...?

EDIT just thinking, a state machine can be modeled as a regular expression (they are formally equivalent). You could represent them as productions; or, as regular expressions. Composed of the input symbols causing the transitions - the states are implicit; but you could associate a function with each symbol, by inserting it afterwards; the 'symbol' itself could be a function that return true/false, for matches:

  isStart() setup() ( !isEnd() processInput() )* isEnd() teardown() | err()
I would guess that Lua deals with these problems a lot (as an embedded game scripting engine, both enemy AI and UI logic might be modeled well by state machines (in part) - perhaps it has a good way of handling them, or at least, good idioms have developed.

Honestly, modeling state with polymorphism can be really elegant and powerful. There are a few decisions to make (for instance, do States explicitly make transitions happen, or do they return the next state?)

But in general, anywhere you see a lot of conditionals, you probably want to consider a polymorphic approach. Especially if you have more than one function with parallel logic trees.

What do you think of my concern of it "scattering the transition logic" across the classes? Each class becomes a production, and only it knows which classes can come next. It's hard to get an overall grasp of how the state machine works, because you have to inspect all the classes to know the transitions.

This is a significant concern for me when reading others' code: the individual parts might be easy to understand, but there is usually no documentation of the overall design, of how the parts operate together. And there is no code that ties together all the parts - so you are forced to inspect all the local details in order to get a global overall view.

There are many ways to evaluate a program - how efficient it is, how many features it has, how complete it is, whether it is consistent, whether it is correct, how easy it is to prove things about... but I think how much work it is to understand is one of the most important - and usually neglected. For a complex program, having some way to quickly grasp the overall design is key to this, IMHO.

This would actually be a nice way to hijack lpeg.

In C it is possible to use functions to represent state. To cause a state change, you return a pointer to the function representing the next state. Each state function manages transitions away from that state. The main event loop repeatedly calls the current state function with any external input relevant to the system. There is also the giant switch block option, with the current state stored as an enum.

With a little discipline, I think it should be possible to represent state machines in a readable fashion in most popular programming languages.

this is why i like ragel (http://www.complang.org/ragel) a lot. seems that zed-shaw also has used it in couple of his projects e.g. mongrel most notably. would be particularly cool if ragel can be combined automagically for some protocol parsing tasks...

I'm also a big fan of ragel, and have used it for various protocol implementations. One of the really nice features is that you can have it output graphviz dot files, to get an actual visualisation of your state machine, so you know where you've missed a transition, or how you've accidentally hit a state explosion with a bad rule.

Zed Shaw also has a nifty blog post about explicitly hacking Ragel to model a state machine - iirc it has to do with an anti-griefing messaging system. Worth googling.

Learning HDL (e.g., Verilog or VHDL) is a great way to force-feed yourself state machines.

Yes, or PLC (ladder logic) programs. That will put hair on your chest and give you a real appreciation for the luxuries of high-level programming languages running on top of a traditional processor.

State machines are anti-modular.

State machines are anti-abstraction.

State machine code is hard to read.

I disagree.

State machines are anti-modular: By clearly defining the transitions from one state to another and when outputs can occur I see state machines as making it easier to develop modular applications. I don't see how it inevitably leads to non-modular code.

State machines are anti-abstraction: If you're modeling a process then some things can't be abstracted. At some level you have to deal with real world situations.

State machine code is hard to read: That's a function of the effort made to make it easy to read. It's no different than any other situation.

In general I find state machine design to be helpful in literally controlling the state of an application. By limiting what an application does and when it can do it, via an easily understood mental model, it's a straight forward design technique.

The problem is that many apps are looking at the wrong states when they design their machine.

For instance I worked on an app someone had build in which states were based on a multi-page html form. When I added an android app allowing offline data collection, I then had to hack around the state machine. We couldn't just create a instance of the object with the final data, we had to write code that replicated the submissions of the html form.

This echos my experiences exactly. I've found little value in state machines as a frequently used pattern. Abuses abound and I've also ran into the multi-page-html-form-as-a-state-machine anti-pattern. It's horrid and should not be done.

> By clearly defining the transitions from one state to another and when outputs can occur I see state machines as making it easier to develop modular applications. I don't see how it inevitably leads to non-modular code.

I have not seen a modular state machine yet. Perhaps it is possible, but it doesn't seem to be done very often.

> If you're modeling a process then some things can't be abstracted. At some level you have to deal with real world situations.

I don't think state machines are usually the most natural representation of a process. And very often, people use state machines to model things that don't really fit into that model.

> That's a function of the effort made to make it easy to read. It's no different than any other situation.

Perhaps I have only seen the bad apples.

All of these problems are solvable if you apply a bit of DRY, and use better languages. I would despair of trying to combine state machines and DRY without closures.

The final remaining "hard to read" that remains after DRY is usually a reflection of the problem domain, not the solution domain. That's a barrier you can't pass through; if you got seven states and ten events, you've got seventy things to deal with, one way or another. (Which doesn't have to mean 70 literal handlers, but does usually mean still quite a number.)

Indeed. If you do not use a state machine in a situation like that, you either get very complicated code as well, or you conveniently do not implement some of the state transitions. That will always get back to you: "how can this ever happen?!"

State machines force you to consider all possible state transitions. This makes sure that transitions are handled properly, or they are actively made impossible.

I reduced a buggy, crashing, network protocol (serial modem!) to a state machine (three actually) in 2 days; had it debugged in another few days and it chugged along like a workhorse for years.

In the process all the bugs in the original code were revealed - literally a hundred cases unhandled or incompletely handled. The code I found had a dozen flags and enourmous runon procedures at every event entry point, totally unreadable.

SO yes I am a keen fan of state machines. But I admit they dominate your code architecture and look funky. Closures are definitely a good thing, tho I haven't yet figured out the best pattern. Its good to have all the cases laid out in front of you, with each combination of state and event commented and discussed. What would that look like using closures?

"It depends." Whether you should show every combo of state and event or not is a local decision. What closures really make easy is something like "I always set something up, do something, and tear it down", and it allows you to put the set up and teardown in one place. Or maybe it lets you put the "something" in one place and it's the setup and tear down that are different every time. Or maybe you've got multiple state machines that are talking to each other, and it makes it possible to abstract them from each other a bit. It isn't like there's one design that they enable that is impossible without them, it's more like, it's a tool without which I would really be stuck.

But one concrete example: I have a state machine that manages connection to a network resource that requires a negotiation process. I have some code that would like to just use that connection, without having to manage it, and there are many ways to enter this code and many things it may want to do. During the connection process, if more requests come in, the connection state machine just stacks them up as closures to be executed when the machine completes its connection. (Or, called another way, notify the callers of an error.) These closures themselves contain other ones that have the details of how to back to their state machines and send a message along that particular socket to go back out to the right client along one of several protocols. None of this is impossible without closures, but keeping the machines ignorant of how the other machines work is a lot harder without them.

OO can do it too, albeit with a different flavor. Without one of OO or closures is a pain, though.

would love to see a blog post glancing at the 'before' code and overviewing the 'after' code. Even if it was just two pastes.

What about a .dot file? It's pretty simple.

  digraph publish{
voilla! a state diagram. What's that? you want metadata?

  digraph publish{
    AUTHOR => EDIT  [ key="SUBMIT" 
                      buttonText="Submit to Editor"

et voilla. The modularity can find it's way in the code implementing the state machine actions. Not sure what "anti-abstraction" means, but I find the above pretty easy to read.

Also, you get free diagrams via graphviz.

I don't think state machines are anti-modular, but I've seen situations where a cartesian product of several state machines was modeled as one big state machine, rather than as several largely independent state machines running in parallel. The one big state machine ends up as an anti-modular un-abstracted hard to read mess.

Imagine writing a state machine to handle TCP/IP connection establishment, and a state machine to handle parsing HTTP headers, and a state machine for the different things your web app can be doing. If you combine them all into one big state machine, you're going to have a huge mess. If you can write separate state-driven modules for each, interacting with each other at well-defined interfaces, you'll be much better off.

In other words, state machines, like most tools, can be abused. Don't do that!

guys, i think the parent makes more sense if you qualify as

state machines suck if: - the problem isn't inherently stateful - the stateful code is not carefully written

i think the problem is as the OP dscribes: people often write stateful code without realizing it, thus the code doesn't show explicit understanding the combinatorical number of state transitions and all the possible edge case bugs.

But that just reduces to sucky code sucks and doesn't say anything meaningful about state machines.

None of these has to be true. Of cause I'm a bit colored on the subject since I once worked on visualSTATE (http://www.iar.com/website1/ which is a graphical tool for state machines. In fact I see state machines in pretty much all software projects but developers are so attached to their way of thinking that they have troubles adopting to new programming concepts.

> State machine code is hard to read.

I disagree. A well-written state machine is largely declarative. And that's just as good as it gets for readability.

I skimmed the whole thing before realizing I had parsed the headline wrong.

Kinda like foie gras right?

Yep :)

For any .Net people here, this is a great piece of kit:


This is not an interesting thought for someone who has a degree.

Applications are open for YC Summer 2019

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact