
Typestates in Rust (2018) - lwhsiao
https://yoric.github.io/post/rust-typestate/
======
cdirkx
When const generics are fully implemented (they are partially usable on
nightly rust now) this will enable an even stronger version of this pattern,
allowing you the full power of enums to represent the state.

    
    
      enum SenderState {
        ReadyToSendHello,
        HasSentHello,
        HasSentNumber,
        HasReceivedNumber
      }
    
      struct Sender<const S: SenderState> {
        ...
      }
    
      impl Sender<{SenderState::ReadyToSendHello}>{
        ...
      }
    

One can then give the individual states extra parameters:

    
    
      HasSentNumber {
        number: u32
      }
    

(Note that in this case that doesn't make much sense, the number that is sent
is more associated data than an actual type parameter. There is no real
difference, at the type level, between HasSentNumber { number: 3 } and
HasSentNumber { number: 6 }, and the compiler generating two types for this
would be unnecessary. It is only an example of the syntax.)
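
For comparison, here is a minimal sketch of what compiles on stable Rust, assuming `u8` constants stand in for the enum variants (a const generic parameter of enum type still needs nightly). The constant names, the `log` field, and the methods are hypothetical and only illustrate the shape.

```rust
// Hypothetical state tags. On stable Rust a const generic parameter must be
// an integer, `bool`, or `char`, so `u8` constants stand in for the enum.
const READY_TO_SEND_HELLO: u8 = 0;
const HAS_SENT_HELLO: u8 = 1;
const HAS_SENT_NUMBER: u8 = 2;

struct Sender<const S: u8> {
    log: Vec<String>, // stand-in for real network I/O
}

impl Sender<{ READY_TO_SEND_HELLO }> {
    fn new() -> Self {
        Sender { log: Vec::new() }
    }

    // Consumes the sender and returns it tagged with the next state.
    fn send_hello(mut self) -> Sender<{ HAS_SENT_HELLO }> {
        self.log.push("HELLO".to_string());
        Sender { log: self.log }
    }
}

impl Sender<{ HAS_SENT_HELLO }> {
    fn send_number(mut self, number: u32) -> Sender<{ HAS_SENT_NUMBER }> {
        self.log.push(number.to_string());
        Sender { log: self.log }
    }
}

impl Sender<{ HAS_SENT_NUMBER }> {
    fn into_log(self) -> Vec<String> {
        self.log
    }
}

fn main() {
    let sender = Sender::<{ READY_TO_SEND_HELLO }>::new()
        .send_hello()
        .send_number(42);
    println!("{:?}", sender.into_log());
    // Sender::<{ READY_TO_SEND_HELLO }>::new().send_number(42);
    // ^ does not compile: `send_number` is not defined for this state.
}
```

The number is passed as an ordinary argument here rather than as part of the state, in line with the note above.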

~~~
oleganza
You can do session types already in stable Rust without const generics. Each
"enum variant" can be its own separate type, and such types can only be
instantiated by a method on the type representing the previous state, which
also consumes that previous state.
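
A minimal sketch of that idea, assuming a made-up two-round protocol (the type names, fields, and arithmetic are hypothetical and are not taken from the Bulletproofs code):

```rust
// Hypothetical protocol states. None of them derives `Clone` or `Copy`,
// so a value of a state type cannot be duplicated and replayed.
struct AwaitingChallenge {
    secret: u64,
}

struct AwaitingResponse {
    secret: u64,
    challenge: u64,
}

struct Finished {
    response: u64,
}

impl AwaitingChallenge {
    fn new(secret: u64) -> Self {
        AwaitingChallenge { secret }
    }

    // Takes `self` by value: the previous state is moved into the next one.
    fn receive_challenge(self, challenge: u64) -> AwaitingResponse {
        AwaitingResponse { secret: self.secret, challenge }
    }
}

impl AwaitingResponse {
    // The only way to obtain `Finished` is through this method.
    fn respond(self) -> Finished {
        Finished { response: self.secret.wrapping_mul(self.challenge) }
    }
}

fn main() {
    let party = AwaitingChallenge::new(7);
    let party = party.receive_challenge(6);
    let done = party.respond();
    println!("response = {}", done.response);
    // party.respond(); // does not compile: `party` was moved by `respond`
}
```

Keeping the fields private in a real crate would also stop callers from constructing an intermediate state by hand.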

We've used them in our Bulletproofs MPC (multi-party computation), where a
cryptographic requirement is that the protocol must not be replayed from a
midpoint.
Since the user is supposed to perform these transitions within their
application on top of network messages, such strongly-typed API guarantees
that, if their program has compiled successfully, then:

1) Steps are performed in the correct order.

2) The protocol cannot be replayed from an intermediate state.

We wrote about it here: [https://medium.com/interstellar/bulletproofs-
pre-release-fcb...](https://medium.com/interstellar/bulletproofs-pre-release-
fcb1feb36d4b) (scroll to "strongly-typed multiparty computation").

~~~
oleganza
Here is an updated link to the original article on MPC by Cathie Yun:
[https://medium.com/@cathieyun/bulletproof-multi-party-
comput...](https://medium.com/@cathieyun/bulletproof-multi-party-computation-
in-rust-with-session-types-b3da6e928d5d)

------
_hardwaregeek
A good sign in a type system for me is that the types are accurately modeling
behavior that I find in day to day code. To the point where moving back to a
language that lacks this feature feels weirdly unsafe or not expressive
enough. Going back to C APIs, with their states that need to be read via bit
masks or through special integer return values, is just painful after using
discriminated unions. Likewise being able to have multiple owners to a value
is a little weird after using Rust.

Also what's the deal with the fi ligature in the code? It throws off the
kerning and holds no purpose.

~~~
reificator
> _Also what's the deal with the fi ligature in the code? It throws off the
> kerning and holds no purpose._

I'm not seeing a ligature there, or at least not an obvious one. What
browser/platform are you using?

~~~
_hardwaregeek
Interesting. The CSS is using `-moz-font-feature-settings: "liga" on` for some
reason. I'm using Firefox on macOS. I assume you're not?

~~~
reificator
Firefox on Windows. Do you have a screenshot? Maybe I'm just blind.

~~~
_hardwaregeek
Here you go:
[https://gfycat.com/angrycanineboa](https://gfycat.com/angrycanineboa)

Click the gear if it's too small for you.

~~~
reificator
Font-Family for me is "SFMono-Regular","Liberation Mono","Roboto
Mono",Menlo,Monaco,Consolas,"Courier New",Courier,monospace

Looks like I'm falling all the way back to Consolas. I'm pretty sure SFMono
comes with macOS, which explains the platform differences between us.

That might explain why the fi ligature has been a common complaint lately, as
more people start using San Francisco in their sites.

~~~
saagarjha
> I'm pretty sure SFMono comes with macOS, which explains the platform
> differences between us.

Not in a way that would be relevant here. I’d expect this font stack to fall
back to Menlo on macOS.

~~~
reificator
And does Menlo have this ligature? I don't have a mac to test with.

~~~
saagarjha
It does, but it only works in Safari if I add font-feature-settings: "liga"
on.

------
frio
We do something like this to encode our Redux states at compile-time (using
TypeScript, obviously). Previously, using regular JS, the Redux devtools made
tracking down incorrectly implemented reducers/state transitions reasonably
straightforward -- but you still had to trigger a bug before you knew you had
to track it down, and implement tests etc.

This kind of design pattern measurably saves us time; it reduces the volume of
unit tests we need to write/update when we make changes (we still test,
just... with less fear), and it prevents newer developers from making
mistakes. I haven't played with Rust (beyond a couple of toy projects) yet,
and articles like this remind me I'm looking forward to sinking my teeth in
over the Christmas break.

~~~
amykyta
Cool! Are you referring to the phantom types technique? I am interested in how
you implement this in typescript. Do you have a blog post about it you can
refer me to? Or even just a gist?

~~~
frio
I'll have to read up on that. We just leverage enums and TypeScript's slightly
more strict type checking to state that the store can be in only certain
states, and use the excellent typesafe-actions library. I'll write up a blog
post. There are probably flaws in our methodology but when we took some new
grads on recently, it helped a lot :).

------
dmkolobov
This is one of my favorite patterns from Haskell, which I've known as "type-
level programming". One of the most common examples is a "sized vector"[1]
which allows compile-time bounds checking. This is achieved by annotating each
vector type with a phantom type variable representing its size.

Nice to see something like this in Rust! One thing that's a bit of a bummer,
and I'm sure there are very good reasons for this, is that we HAVE to use
every type argument of a struct in its definition. If this restriction were to
be relaxed, we wouldn't need the "state" field in the struct at all, and we
could make the state type variable truly "phantom".

[1] [https://www.schoolofhaskell.com/user/konn/prove-your-
haskell...](https://www.schoolofhaskell.com/user/konn/prove-your-haskell-for-
great-safety/dependent-types-in-haskell)

EDIT: Typos.

~~~
choudanu4
Does the PhantomData type in Rust do what you want? Or are you talking about
something slightly different?

[https://doc.rust-lang.org/nomicon/phantom-data.html](https://doc.rust-
lang.org/nomicon/phantom-data.html)

[https://doc.rust-
lang.org/std/marker/struct.PhantomData.html](https://doc.rust-
lang.org/std/marker/struct.PhantomData.html)

So

    
    
      struct Sender<S> {
        /// Actual implementation of network I/O.
        inner: SenderImpl,
        /// 0-sized field, doesn't exist at runtime.
        state: S,
      }
    

I believe could become

    
    
      use std::marker::PhantomData;

      struct Sender<S> {
        /// Actual implementation of network I/O.
        inner: SenderImpl,
        /// 0-sized field, doesn't exist at runtime.
        _marker: PhantomData<S>,
      }
    

I've never used this PhantomData personally, so this might be wrong. Cheers!
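
For what it's worth, here is a runnable sketch of that second version. `SenderImpl` is a hypothetical stand-in for the article's network code, and the two marker types and the method names are assumptions made for the example.

```rust
use std::marker::PhantomData;
use std::mem::size_of;

// Hypothetical stand-in for the actual network implementation.
struct SenderImpl {
    sent: Vec<String>,
}

// Marker types for the states; no value of these types is ever created.
struct ReadyToSendHello;
struct HasSentHello;

struct Sender<S> {
    inner: SenderImpl,
    // Zero-sized: records `S` in the type without storing a value of it.
    _marker: PhantomData<S>,
}

impl Sender<ReadyToSendHello> {
    fn new() -> Self {
        Sender { inner: SenderImpl { sent: Vec::new() }, _marker: PhantomData }
    }

    // Consumes the sender and rebuilds it with a different marker type.
    fn send_hello(mut self) -> Sender<HasSentHello> {
        self.inner.sent.push("HELLO".to_string());
        Sender { inner: self.inner, _marker: PhantomData }
    }
}

impl Sender<HasSentHello> {
    fn sent(&self) -> &[String] {
        &self.inner.sent
    }
}

fn main() {
    // The marker field adds no bytes to the struct.
    assert_eq!(size_of::<Sender<HasSentHello>>(), size_of::<SenderImpl>());

    let sender = Sender::<ReadyToSendHello>::new().send_hello();
    println!("{:?}", sender.sent());
}
```

With `PhantomData<S>` the state types never need to be instantiated, whereas the `state: S` version needs an actual value of `S` for every `Sender`.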

~~~
dmkolobov
It does! But you still have to actually "use" it, meaning that it appears as a
field in the definition, and you have to pass a PhantomData value whenever you
create new instances of your data type.

In Haskell, you can omit these fields entirely, and achieve the same thing
just by annotating the function.

For example, in Haskell, we can have

    
    
      data Const a b = Const a
    

whereas in Rust, it would be:

    
    
      struct Const<A, B> {
          konst: A, 
          // does not exist at run-time
          discard: PhantomData<B>
      }

~~~
Rusky
The reason for this requirement is lifetime subtyping: [https://doc.rust-
lang.org/nomicon/subtyping.html](https://doc.rust-
lang.org/nomicon/subtyping.html)

Type parameter variance is inferred from usage (e.g. covariant for normal
fields, contravariant for function arguments) and without a usage there's no
way to infer it.

------
RcouF1uZ4gsC
Basically, they are using Rust to encode a state machine using types. This is
brilliant! The nice thing is that the transitions are determined at compile
time so there can be performance benefit as well as compile time correctness
testing.

Really neat techniques!

~~~
cies
[https://hoverbear.org/blog/rust-state-machine-
pattern/](https://hoverbear.org/blog/rust-state-machine-pattern/)

Here is another article, but with a more classical state machine implementation.

------
nitnelave
I wrote something similar in C++, completely compile time that disappears at
runtime: [https://www.fluentcpp.com/2019/09/24/expressive-code-for-
sta...](https://www.fluentcpp.com/2019/09/24/expressive-code-for-state-
machines-in-cpp/)

Although the language itself doesn't guarantee that the value is not used
again after a move, good static analyzers will provide a warning in that case,
so it can still be safely used.

~~~
pcwalton
> Although the language itself doesn't guarantee that the value is not used
> again after a move, good static analyzers will provide a warning in that
> case, so it can still be safely used.

Not soundly. Such static analysis can be trivially defeated using e.g. virtual
methods.

~~~
nitnelave
True, it's not bulletproof. However, with a bit of discipline it's possible to
use it safely. And anyway, it's already an improvement over just documentation
like "don't call this until you called that"

------
reificator
Another example, implementing IMAP with Rust's affine types.

[https://insanitybit.github.io/2016/05/30/beyond-memory-
safet...](https://insanitybit.github.io/2016/05/30/beyond-memory-safety-with-
types)

~~~
staticassertion
Hey, I wrote that article. There's now an actual, functional IMAP library that
uses these techniques here:

[https://github.com/jonhoo/rust-imap](https://github.com/jonhoo/rust-imap)

I did not contribute to that library, but it's nice to see a practical
implementation. They use the same approach, essentially, as I describe in my
blog.

------
dhash
Scala/Haskell has the same thing, and it's an amazing feature. The proper type
is that of

    
    
        trait IndexedStateT[A, B, C]
    

Which signifies a typelevel state machine moving from state A to state B
emitting a value of type C.

I can only speak for Scala, but I'm assuming Haskell has singleton and literal
types as well, meaning that code like this works great.

    
    
        object DoorOpen
        object DoorClosed
       
        class Door {
          def open: IndexedState[DoorClosed.type, DoorOpen.type, Unit]
          def close: IndexedState[DoorOpen.type, DoorClosed.type, Unit]
        }
    
       val d = Door()
       for {
          _ <- d.open() //works
          _ <- d.close() // works
          // _ <- d.close() //compile error
       }
    
    

By my understanding of the article, it uses the borrow/move state to implement
the state transition. Is this generalizable to arbitrary state machines, or
only a simple 2-state one?

~~~
Rusky
It generalizes to arbitrary state machines; check out the reference near the
end to session types.

Each state is a type, methods consume their receiver (the "move state") and
return the new state. You can't accidentally keep around a copy of the old
state.
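
As a rough sketch of a machine with more than two states, assuming a hypothetical door that can be open, closed, or locked (all names are invented for the example). A transition whose outcome depends on run-time data can return a `Result` holding either the next state or the unchanged current one:

```rust
// Hypothetical three-state door; each state is its own type.
struct Open;
struct Closed;
struct Locked {
    code: u32,
}

impl Open {
    fn close(self) -> Closed {
        Closed
    }
}

impl Closed {
    fn open(self) -> Open {
        Open
    }

    fn lock(self, code: u32) -> Locked {
        Locked { code }
    }
}

impl Locked {
    // The outcome is only known at run time, so the method hands back
    // either the next state or the current state it consumed.
    fn unlock(self, attempt: u32) -> Result<Closed, Locked> {
        if attempt == self.code {
            Ok(Closed)
        } else {
            Err(self)
        }
    }
}

fn main() {
    let door = Closed.lock(1234);
    let door = match door.unlock(1111) {
        Ok(_) => unreachable!("a wrong code should not unlock the door"),
        Err(still_locked) => still_locked,
    };
    let door = door.unlock(1234).ok().expect("the correct code unlocks");
    let door: Open = door.open();
    let _closed: Closed = door.close();
    // door.close(); // does not compile: `door` was moved by `close` above
}
```

Which methods exist on which type is the transition table, so adding a state means adding a type and the methods that lead into and out of it.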

------
unlinked_dll
This is a really useful pattern when you want to have rigidly defined/enforced
state transitions to ensure that the data is never in an invalid state for a
given operation.

It's pretty awful to deal with when you're unsure of what the state machine
should look like, or if there needs to be a lot more flexibility in how the
data is accessed. Maintainability nightmare.

An example of this I ran into is a data processing pipeline architecture where
each vertex of the processing graph had a processing function called in a loop
on its own dedicated thread. Using the type state pattern helped clearly
define the "life cycle" of each vertex and enforce it, which provided for some
powerful synchronization guarantees (e.g. we could provide _some_ elements of
memory safety even when loading things through shared libraries). If you dug
into it you could break things, but that would be more work than just
following the pattern.

------
gameswithgo
Are there other languages that can do the trick done with "close()" on the
file there? A type with a method that can make the type compile time unusable
after being called!

that is pretty neat.

~~~
nimish
Any dependently typed language worth its salt will be able to model a state
machine in its type system.

Haskell can do it with some effort, maybe for simple stuff phantom types and
GADTs are enough. With linear types the explicit "consumption" can be modeled.

------
pedrow
Maybe not everyone will know this but Typestate was one of the original
'headline features' of Rust when Graydon Hoare started it. It was based on the
Strom/Yemini paper[0] for the NIL language. There's a mention of it in this SO
answer from 2010[1] and in the LtU discussion[2]. I'm not sure why it was
de-emphasised; I think it didn't work as well as anticipated in the 'real world'.

[0]:
[http://www.cs.cmu.edu/~aldrich/papers/classic/tse12-typestat...](http://www.cs.cmu.edu/~aldrich/papers/classic/tse12-typestate.pdf)

[1]: [https://stackoverflow.com/questions/3210025/what-is-
typestat...](https://stackoverflow.com/questions/3210025/what-is-typestate)

[2]: [http://lambda-the-ultimate.org/node/4009](http://lambda-the-
ultimate.org/node/4009)

------
amedvednikov
> 2\. we have seeked in a file that was already closed.

> The second error, however, is much harder to catch. Most programming
> languages support the necessary features to make this error hard, typically
> by closing the file upon destruction or at the end of a higher-order
> function call, but the only non-academic language that I know of that can
> actually entirely prevent that error is Rust.

Why not simply add an `is_closed` flag and throw an error if it is?

~~~
msl
> Why not simply add an `is_closed` flag and throw an error if it is?

That's a way of doing it at runtime. The article describes catching the error
at compile time.
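
To make the difference concrete, here is a small sketch with two hypothetical in-memory "files" (neither does real I/O, and the names are invented for the example). The first uses a flag checked on every call; the second has a `close` that takes `self` by value.

```rust
// Run-time approach: every method has to check a flag.
struct CheckedFile {
    is_closed: bool,
    pos: u64,
}

#[derive(Debug, PartialEq)]
struct ClosedError;

impl CheckedFile {
    fn open() -> Self {
        CheckedFile { is_closed: false, pos: 0 }
    }

    fn seek(&mut self, pos: u64) -> Result<u64, ClosedError> {
        if self.is_closed {
            return Err(ClosedError);
        }
        self.pos = pos;
        Ok(self.pos)
    }

    fn close(&mut self) {
        self.is_closed = true;
    }
}

// Compile-time approach: `close` consumes the value, so no flag is needed.
struct MyFile {
    pos: u64,
}

impl MyFile {
    fn open() -> Self {
        MyFile { pos: 0 }
    }

    fn seek(&mut self, pos: u64) -> u64 {
        self.pos = pos;
        self.pos
    }

    fn close(self) {}
}

fn main() {
    let mut checked = CheckedFile::open();
    checked.close();
    // The mistake compiles and is only noticed when this line runs.
    assert_eq!(checked.seek(10), Err(ClosedError));

    let mut file = MyFile::open();
    file.seek(10);
    file.close();
    // file.seek(20); // does not compile: `file` was moved by `close`
}
```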

~~~
amedvednikov
I see, thanks.

------
justryry
This is one of my favorite features of the language, and I believe it to be
fairly unique. It can make for some slick and safe state-machine-like code.

I do wish that we had reliable RVO so that this could come at zero cost.

------
nixpulvis
[https://github.com/Munksgaard/session-
types](https://github.com/Munksgaard/session-types)

------
jbritton
This line caught my eye.

my_file.open(); // Error: this may fail.

1) If this can fail, then it should be a compile error to not test the result
code.

2) IMHO it would be nice if there was something like Python's with statement
to correctly close a file.

    
    
        with open(filename, 'r') as f:
            f.read()
        # f.close() invoked automatically here
    

This prevents trying to close a file that is not opened.

The idea of encoding a state machine into the types seems interesting.

~~~
andolanra
You should probably have read the rest of the blog post, because in Rust it
_is_ a compiler error to not test the result, and the post specifically calls
this out. Specifically, look for this section:

    
    
        let mut my_file = MyFile::open(path)?;
        // Note the `?` above. It's a simple operator that asks
        // the compiler to check whether the operation succeeded.
        // The *only* way to obtain a `MyFile` object is to
        // have a successful `MyFile::open`.
    
        // At this line, `my_file` is a `MyFile`, which means
        // that we may use it.

~~~
Thaxll
Is it though? Does the compiler not compile when you don't check / use the
error result?

~~~
kccqzy
If you don't use the result, how would you be able to get the handle that
represents the open file?

~~~
inferiorhuman
You wouldn't. In this case a more likely example would be writing to the file.
You don't have to check the return value (in C or Rust) to do something
useful, but you should. In Rust you'll get a warning (or error depending on
how your workspace is set up). In C, you won't get anything. It's also worth
noting that the Result enum can be and is used for things beyond file I/O.
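
A short sketch of the write case, using a `Vec<u8>` as an in-memory stand-in for a file (the `log_line` helper is hypothetical). The warning comes from the `unused_must_use` lint, which can be promoted to an error with `deny`:

```rust
use std::io::{self, Write};

// `Write::write_all` returns `io::Result<()>`, and `Result` is marked
// `#[must_use]`, so silently dropping it triggers the lint.
fn log_line(out: &mut impl Write, line: &str) -> io::Result<()> {
    out.write_all(line.as_bytes())?;
    out.write_all(b"\n")
}

// Promote the default warning to a hard error inside this function.
#[deny(unused_must_use)]
fn main() {
    let mut buffer: Vec<u8> = Vec::new(); // in-memory stand-in for a file

    // log_line(&mut buffer, "ignored");
    // ^ rejected under `deny`: unused `Result` that must be used

    // Handling the result, here by matching on it, satisfies the lint.
    match log_line(&mut buffer, "hello") {
        Ok(()) => println!("wrote {} bytes", buffer.len()),
        Err(e) => eprintln!("write failed: {e}"),
    }
}
```

Writing `let _ = log_line(...)` also silences the lint, which is the explicit way to say the result is being ignored on purpose.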

------
caleb-allen
Feels similar to Kotlin's sealed classes

~~~
james-mcelwain
They are similar to Kotlin's sealed classes, except there's no concept in
Kotlin of an object "consuming" itself, which is necessary for maintaining
safety in the API.

For example, in Kotlin, if a method on state A returns state B, there's no way
to invalidate all existing references to state A. Normally this invariant
would be enforced on the caller's side, or perhaps by throwing an
IllegalStateException if state A is called after producing state B.

~~~
Nycto
Adding to this, I see two other elements that make this tough to replicate in
Kotlin:

1\. Type erasure

2\. Using sealed classes requires instantiation, while the Rust version is
zero overhead.

~~~
paulddraper
> 1\. Type erasure

There is no need to access type information at runtime here.

