
State Machines in Go - dpapathanasiou
http://denis.papathanasiou.org/?p=1190
======
PuerkitoBio
The end state could be a map[string]struct{}, since it is only used to test
the presence of the string. The bool value is unused, and an empty struct
takes no memory, so it would be an admittedly tiny optimization.

Related: Rob Pikes' lexical scanning talk (state machines in Go):
[http://blog.golang.org/2011/09/two-go-talks-lexical-
scanning...](http://blog.golang.org/2011/09/two-go-talks-lexical-scanning-in-
go-and.html)

------
JulianMorrison
Personally I'd model a state machine in Go as

    
    
        type State int
        type Transition int
    
        func (s State) Next(t Transition) (State, error) {
          /* doubly nested switch goes here */
        }

~~~
zemo

        type state struct {
            // add any fields you need to persist here
        }
    
        type transition func(*state) transition
    

same as in Lexical Scanning in Go
(<http://www.youtube.com/watch?v=HxaD_trXwRE>)

~~~
JulianMorrison
The reason I _didn't_ do that is a function from state to state is pure in a
functional programming sense as well as being simple, free from virtual
dispatch and indirection, and thus fast. It's also easy to harness for
testing, because it's side effect free and stateless.

If you wanted to extend it to perform actions on state change, I'd extend the
return values to (State, Action, error) where Action is "type Action int" as
well, and use the Action in another switch. Thus you are free to pick which
actions you care about, and test the actions separately from the state engine.

------
nickbp
Ragel supports Go output. There's been some recent fixes according to the
mailing list, so you may need to get a recent version straight from git:
[http://www.complang.org/pipermail/ragel-
users/2013-February/...](http://www.complang.org/pipermail/ragel-
users/2013-February/002939.html)

------
amourdezombi
One thing that hasn't been articulated in the recent HN articles about Go:
method sets can be defined for _any_ type, not just structs.

~~~
mseepgood
Yes, you can even attach a method to a function type.

------
ericbb
It's interesting to contrast this design with what Rob Pike presented in his
talk on the Go template lexer. Some differences:

(1) Pike's is more static: there is no Machine struct and states are
identified with handlers so that there is no table lookup in the loop.

(2) Pike chose to go with a modifiable lexer struct rather than threading a
cargo value through the machine.

(3) Pike used nil for all end states.

The main design questions are probably:

(a) Do you need to construct machines at runtime?

(b) Do you need state names available at runtime?

~~~
stevvooe
For an example of a runtime state machine, you may want to take a look at Russ
Cox's implementation of NFA simulation for regexp in go [1]. The concepts may
be a little impenetrable for a Sunday, but there are some interesting
references in the source.

[1] <http://golang.org/src/pkg/regexp/exec.go>

------
signa11
please consider adding an 'fsm-trace' or some such, which is basically an
array of values, each having time, input-msg, current-state and next-state (at
the very least), so that debugging these becomes easier...

------
chubot
Since Go has goroutines, I wonder if you can do something like this:

[http://eli.thegreenplace.net/2009/08/29/co-routines-as-an-
al...](http://eli.thegreenplace.net/2009/08/29/co-routines-as-an-alternative-
to-state-machines/)

Basically use stack state and goroutines to avoid explicit states. Unlike
Python, Go might take multiple threads to do this?

~~~
ericbb
I was going to translate the example to Go but it's such an uninspiring
example because the frame_receiver "coroutine" has no state and might as well
have been a simple procedure.

For an simple example--written in many languages including Python and Go--with
two stateful coroutines, you can check this out:
<http://rosettacode.org/wiki/Synchronous_concurrency>

~~~
chubot
Well unwrap_protocol is the one with state. It could be more complex with 2
stateful coroutines, but it still shows the idea with 1. I would be interested
in the comparison between Python and Go there, although I can basically
imagine it. More interesting is if the nondeterminism of Go has any effect
(performance or otherwise).

That is, Go doesn't provide _synchronous_ dataflow. Thanks for the link, I
will be looking at that!

edit: or maybe it does with a channel size of 1? The Rob Pike lexer talk
mentioned a few times here was doing some trick with that.

~~~
ericbb
If I were writing unwrap_protocol in Go, my first take would probably to use
an io.ByteReader interface constructed using bufio.NewReader(bytes). The
function can then use ReadByte as needed to consume the stream. Apologies if
I'm overlooking something. I only skimmed the article.

And you're on the right track regarding how channels work. They can be
synchronous if you like. Effectively, such channels have no buffering. Also,
you don't need to use OS threading at all to solve such problems in Go. It's
possible to run code with lots of cooperating goroutines in a single OS
thread.

Edit: By the way, I'm not familiar with Python so I'm just guessing at how the
send method and the yield form work.

~~~
ericbb
Edit 2: I ended up doing the translation from the Python article. ;)
<http://play.golang.org/p/ep8wK_UUGm> It's pretty straightforward I think. One
subtlety is that goroutines are not values and won't be garbage collected
until they return. In practice, that means that you usually want a way to
signal to the goroutine that it is done and can return. In my translation, I
used the mechanism of closing the "stream" channel. You can see that in the
unwrapProtocol function, it uses "b, ok := <-stream", which gets a byte in b
and true in ok if a byte was received and gets a zero byte and false if the
channel has been closed. The main function closes the stream to signal to
unwrapProtocol that it is done.

Edit 3: Notice that I forgot to close the "frames" channel. So if this were a
long-running program, that goroutine running frameReceiver would be a leaked
resource.

~~~
chubot
Yeah this is cool. Notice that the top comment in this article is about Ragel
supporting output for Go. But you don't need it in a lot of cases since you
have goroutines.

One of the most well-known uses of Ragel is for Zed Shaw's mongrel HTTP
parser. If you want an event-driven server, and you want to parse
incrementally, you need a state machine. And writing a state machine for HTTP
is a nightmare -- for C it is indeed better to generate it.

But I took a look at Go's http/request.go, and it just parses in the normal
"synchronous" style. Of course this is because the Go runtime already takes
care of the event loop and dispatching to goroutines.

I just did the same thing in Python -- I wrote a coroutine-based HTTP "framer"
in a single 20-line function -- it uses ReadUntil('\r\n\r\n'), parses Content-
Length, and ReadN(length). This is all done concurrently on a single thread of
course, thanks to the coroutines.

This seems like obviously the right way to parse network formats efficiently
and scalably. So yeah I would think twice about before using state machines in
Go.

