The end state could be a map[string]struct{}, since it is only used to test the presence of the string. The bool value is unused, and an empty struct takes no memory, so it would be an admittedly tiny optimization.
The reason I didn't do that is a function from state to state is pure in a functional programming sense as well as being simple, free from virtual dispatch and indirection, and thus fast. It's also easy to harness for testing, because it's side effect free and stateless.
If you wanted to extend it to perform actions on state change, I'd extend the return values to (State, Action, error) where Action is "type Action int" as well, and use the Action in another switch. Thus you are free to pick which actions you care about, and test the actions separately from the state engine.
For an example of a runtime state machine, you may want to take a look at Russ Cox's implementation of NFA simulation for regexp in go [1]. The concepts may be a little impenetrable for a Sunday, but there are some interesting references in the source.
please consider adding an 'fsm-trace' or some such, which is basically an array of values, each having time, input-msg, current-state and next-state (at the very least), so that debugging these becomes easier...
I was going to translate the example to Go but it's such an uninspiring example because the frame_receiver "coroutine" has no state and might as well have been a simple procedure.
Well unwrap_protocol is the one with state. It could be more complex with 2 stateful coroutines, but it still shows the idea with 1. I would be interested in the comparison between Python and Go there, although I can basically imagine it. More interesting is if the nondeterminism of Go has any effect (performance or otherwise).
That is, Go doesn't provide synchronous dataflow. Thanks for the link, I will be looking at that!
edit: or maybe it does with a channel size of 1? The Rob Pike lexer talk mentioned a few times here was doing some trick with that.
If I were writing unwrap_protocol in Go, my first take would probably to use an io.ByteReader interface constructed using bufio.NewReader(bytes). The function can then use ReadByte as needed to consume the stream. Apologies if I'm overlooking something. I only skimmed the article.
And you're on the right track regarding how channels work. They can be synchronous if you like. Effectively, such channels have no buffering. Also, you don't need to use OS threading at all to solve such problems in Go. It's possible to run code with lots of cooperating goroutines in a single OS thread.
Edit: By the way, I'm not familiar with Python so I'm just guessing at how the send method and the yield form work.
Edit 2: I ended up doing the translation from the Python article. ;) http://play.golang.org/p/ep8wK_UUGm It's pretty straightforward I think. One subtlety is that goroutines are not values and won't be garbage collected until they return. In practice, that means that you usually want a way to signal to the goroutine that it is done and can return. In my translation, I used the mechanism of closing the "stream" channel. You can see that in the unwrapProtocol function, it uses "b, ok := <-stream", which gets a byte in b and true in ok if a byte was received and gets a zero byte and false if the channel has been closed. The main function closes the stream to signal to unwrapProtocol that it is done.
Edit 3: Notice that I forgot to close the "frames" channel. So if this were a long-running program, that goroutine running frameReceiver would be a leaked resource.
Yeah this is cool. Notice that the top comment in this article is about Ragel supporting output for Go. But you don't need it in a lot of cases since you have goroutines.
One of the most well-known uses of Ragel is for Zed Shaw's mongrel HTTP parser. If you want an event-driven server, and you want to parse incrementally, you need a state machine. And writing a state machine for HTTP is a nightmare -- for C it is indeed better to generate it.
But I took a look at Go's http/request.go, and it just parses in the normal "synchronous" style. Of course this is because the Go runtime already takes care of the event loop and dispatching to goroutines.
I just did the same thing in Python -- I wrote a coroutine-based HTTP "framer" in a single 20-line function -- it uses ReadUntil('\r\n\r\n'), parses Content-Length, and ReadN(length). This is all done concurrently on a single thread of course, thanks to the coroutines.
This seems like obviously the right way to parse network formats efficiently and scalably. So yeah I would think twice about before using state machines in Go.
Related: Rob Pikes' lexical scanning talk (state machines in Go): http://blog.golang.org/2011/09/two-go-talks-lexical-scanning...