Hacker News new | past | comments | ask | show | jobs | submit login
Session Types for Rust (github.com)
232 points by adamnemecek 53 days ago | hide | past | web | favorite | 49 comments



I taught this concept in my course on Programming Languages at Stanford. If you’d like a more pedagogic introduction to both session types and their implementation, check out the course notes: http://cs242.stanford.edu/f18/lectures/07-2-session-types.ht...


I don’t believe that strict “session types” are a good fit for Rust: it’s shoehorning a concept, simultaneously specialised (it’s for channels) and generalised (as in “left” and “right”), into Rust in a way that works, but is not as pleasant as can be achieved with little effort. (There’s not much actual code in session-types/src/lib.rs—most of it is interfaces and macros because it made life hard for itself by being too generic.)

The underlying concept in Rust is basically a type-level enum (state machine), encoding the variant (state) as a phantom type in a generic parameter; that, you can express without the full baggage of what is called session types, and with greater freedom.

The situation is similar to the Either enum with its Left and Right variants: it was in the standard library well before Rust 1.0 because it was easy, but it was eventually abandoned in favour of good semantics: most users should use Result, and the rest should write their own two-variant enums. Similarly, if you encode your state machine in Rust types yourself, you end up with clearer and more obvious code.

The main thing this library does is get you the specific channel behaviour; that may be useful in some situations, but you can implement the underlying concept atop regular channels without much difficulty, and it gives you considerably more freedom in the API.

Here’s a small macro library that I wrote a few years ago, with an example that demonstrates this general concept: https://github.com/chris-morgan/phantom-enum

I find myself wishing more and more often that Rust enum variants could be used as types, and that type refinement could work like in TypeScript.

(I do realise that what I’m advocating is targeting a potentially different and broader sector to session-types, but I maintain my position: doing it from scratch yourself is fairly straightforward, normally easy and more powerful.)

You can achieve all of this with different types instead of one type with a phantom enum type parameter, too; see the insanitybit article linked elsewhere in these comments. The primary benefit of the phantom enum type parameter approach is when you have methods that apply in multiple, or all, states, so that you don’t need to duplicate them. Sometimes you want extra data in particular states; you can achieve that in the type parameter approach by making the enum not phantom, and rather a struct with extra data, or you can achieve that by just using distinct types. Which is most appropriate varies case by case.


Do you know of any programming language where this type of things (having functions defined only for certain enum values, or anything equivalent) would be doable properly, without hacking the typesystem ?


Look for languages with dependent typing. Idris and Liquid Haskell are the two that spring to my mind.

Other than that, TypeScript has decent support for the concept, though with standalone functions rather than methods: you’d define your “variants” as types (or interfaces), define the “enum” as the union of those types (e.g. `Foo | Bar`), and then have your functions take either the entire “enum” type, or specific “variant” types. In place of Rust’s `match`, you will then use type guards to narrow the “enum” type down to one (or more, if you like) of the “variant” types. See https://www.typescriptlang.org/docs/handbook/advanced-types.... for a bit of info.


Jula allows it, initially only defining the integer division operator for non-zero divisor. That default changed due to a (at that point) lack of syntactic sugar and type-inference to not make it a massive hassle.

Edit: It's called a partial function, and I confused Julia with Pony. The former might be able to express it, the latter can handle it in some way, but I'm not familiar with either type system.


If you're interested in this, you may want to take a look at this article as well: https://insanitybit.github.io/2016/05/30/beyond-memory-safet...

EDIT: Sorry, I remembered the site name and saw IMAP in the title, but ended up clicking the wrong article. Updated the above link from https://insanitybit.github.io/2016/06/28/implementing-an-ima...


is it different from phantom types ? more general ?


I wrote that insanitybit post. It's unrelated to Phantom types.

A session type allows you to encode a state machine into your type system. What this means is that, at compile time, you can validate that your code can't transition into an invalid state.


But state machine and communication protocols.. aren't they isomorphic ?


Sure, feel free to read 'communication protocol' where I say state machine.


oh ok; I double-misread your above comment. My assumption was that phantom types were also about encoding state machines in types and thus could work for communication protocols too just like your session type article; hence the question about the difference between the two.


Oh, yes, I see the miscommunication now. Phantom types can be leveraged in a similar way, where a type (the phantom type that you are generic over) represents that state. To answer your question - the end result of both approaches is a session type with the same guarantees.

I've seen that approach used but I can't really comment too much on the differences as I've never used it myself. I find my approach simple in practice (I've shipped Python code that takes the same approach, sans affine types) because it separates the types out.

So, essentially, I write my non-sesssiony struct with all of its methods, and then I wrap that in the session types. This lets me easily drop into the non-session type if necessary, and the sessions types act as thin wrappers. My imap client worked the same way, if I recall.

The downside is likely boilerplate.


is the python code public ?

I toyed with phantom type in kotlin recently (toy examples); I'm curious how it would be done in python.


It's not. The short answer is it's the same thing as Rust, except instead of a compile time guarantee that your state isn't reused I have an assertion and test coverage.


aight, thanks


If you're interested in this topic, I'd highly recommend watching Stephanie Balzer's sessions at the Oregon Programming Language Summer School on “Session-Typed Concurrent Programming”. The lectures are available online (https://www.cs.uoregon.edu/research/summerschool/summer18/to...).

Her recent work (sponsored, in part, by Mozilla Research) really advances session types and the ability to get termination/progress guarantees: http://www.cs.cmu.edu/~balzers/publications/manifest_deadloc...


In what sense does this work really advance session types? Prima facie it looks like yet another tiny delta on Honda's original work that cannot and does not get around session types' core problem of data-dependent communication topology. Let me quote from the paper (Page 22, Section 6):

"The typing discipline presented in the previous sections, while rich enough to account for a wide range of interesting programs, cannot type programs that spawn a statically undetermined number of shared sessions that are then to be used."

That has been the problem of this approach to typing message passing for nearly 3 decades, and like all predecessor papers in this theory tradition, an ad-hoc approach is proposed that makes core session types incredibly complicated without solving the problem. In order to get a publication out of this, a different problem is solved, a yet another Curry-Howard corresponce for some small delta of Linear Logic is given.


I am pleased to see that this was done without macros, even custom derives. There are a couple of macros in the library to make some use cases easier, but they aren't essential.

There is a tendency in the Rust community to reach for macros as soon as anything a bit complicated or verbose comes up. I think this most often ends up sacrificing true understandability on the altar of a slick surface syntax, and it doesn't help us write or understand well in the long run.


I never understood the idea of shortening variable names a few characters. Recv vs Received, especially when there’s a also a Rec type mentioned in the reader as well.

It’s not like Send, was shortened to Snd


Read (or indeed, write) a lot of code and you'll quickly realise why. It's the same reason acronyms and abbreviations exist in human languages.

Also, "recv" and "send" are both 4 characters, which is nice.


These days people type the first four letters and rely on IDE autocomplete to finish the other 42.


It's mysterious, all right. https://unix.stackexchange.com/questions/128708/why-wasnt-cr...

The case where things are given three or four letter codes is more comprehensible. Baudot code from the 19th C calls letters and figures LTRS and FIGS, though I'm not sure when exactly that got established as the normative spelling. Presumably LTRS and FIGS can be printed on a smaller keycap than the full word.

https://en.wikipedia.org/wiki/Baudot_code

So at a certain point abbreviations became both useful and fashionable:

https://en.wikipedia.org/wiki/Abbreviation

(See also http://www.thepotteries.org/mark/w/wedgwood-date.html for a 19th C example of three-letter coding.)


Probably because someone important couldn't remember if it was receive or recieve


Quite confusing indeed:

    Rec<Recv<String, Var<Z>>>


Well. Rust is about to turn into Haskell 2.0


You don’t need that powerful of a type system to do this kind of thing; Rust is still pretty far behind Haskell in terms of raw power.


I don't think it's helpful to talk about "raw power" in a comparison like this; clearly there are significant type system features Haskell lacks - so this isn't a question of a simple superset.

And in any case; what does raw power really mean?

The two languages appear to have fairly widely different aims, and both see type-system cleverness as a boon, so it should come as little surprise that they don't all have the same tricks.


There’s only one type system feature that Rust has that Haskell doesn’t, and Haskell will have it soon. Yet there are many features that Haskell has that Rust does not, and Rust is not getting that stuff any time soon. That’s what I mean by raw power; it is very close to a subset relationship.


Didn't know about https://ghc.haskell.org/trac/ghc/wiki/LinearTypes until now; I'm assuming you're referring to that?

Neat!


Indeed. Seen similar in C++


Is that good or bad? Sounds like it's good, but I've never used Haskell, so I don't know.


As someone that knows only a little about Rust and Haskell, why?


Type-level programming. It basically guarantees more correctness at the cost of steeper entry barriers. In consequence you end up with (some extension of) Haskell and a probably correct implementation that no one can start to implement ;).


The power of these is that a handful of implementations can act as enormous levers to create useful abstractions for everyone else. As an example, Rust's type system is difficult to implement and hard to prove correct - there was a problem with `std::thread::scoped`'s soundness for example - but once done, the work can be leveraged into higher (and often simpler) abstractions!

Session types in Rust are exciting to me because we might see implementations of consensus algorithms leverage the type system for correctness. Everyone else then gets to benefit from a simple API on those (complicated) types.


>Using this library, you can implement bi-directional process communication with compile-time assurance that neither party will violate the communication protocol.

No, you cannot. "Assurance of not violating communication protocol" and "process communication" together are an oxymoron. If you do this, you deserve all the possible failures that come along with it.

Please don't write infrastructure code that depends on "assurances" instead of checking and validating communications.


Consider that "compile-time memory safety" is still a reasonable concept, despite the fact that ptrace() means that anybody could technically read and write arbitrary bits in your process. Or that rowhammer could cause memory-safety-violating bit flips.

It's the same idea here. Obviously nothing can stop, say, faulty network hardware from causing protocol violations. But that doesn't stop us from meaningfully saying that, under the correct set of assumptions, protocol violations are statically prevented at compile time.


This is about validating code, not about maintaining the integrity of the system in the presence of corruption or a party that is deliberately trying to break things. It's actually helpful to be able to check that the system is sound in theory. Nobody is suggesting that you can use that as an excuse to get sloppy.


It’s about validating that your own code matches the spec. Not validating that anybody you’re communicating with does. Though if you wrote the client and the server then you know that both do.


I wish I knew how to get this point across to people. The way I think of it is that a type-checker which validates a single program at a time isn't capable of proving certain facts which hold when access to filesystems or networks are involved.


All validation is only performed up to an assumed platform.

There's usually little value in worrying too much about how a real platform won't correspond precisely to that assumed platform, since the goal of validation is to remove a class of errors, not all errors.


I think OP's point is that getting these assurances doesn't mean you can stop checking for errors that come from outside the platform.


Due to e.g. kernel or hardware bugs? Or something that can be ruled out in some scenarios, e.g. malicious processes?


I’m not sure if I’m misunderstanding the purpose of session-types, but it seem that this a lot like finagle but for rust?

Session type is basically just a finagle service, etc...

Or are they more like go channels?


Session types are more like “encode a state machine in your type system so it’s impossible to mess it up”. They don’t inherently have to do with network services, though since network services follow protocols that can be represented as state machines (think “the S in REST”), they’re a good application of session types.

That said, it looks like this library is not purely about session types, but applying them to channels. (Which are go-style, yes.)


are they used in non academic contexts ?


I'm one of the authors of the original paper/repository.

Part of the motivation behind our work was to try and apply it to the Servo project (Mozilla's experimental browser engine, of which parts have already been integrated into Firefox), which is highly parallelized. They had issues with their internal communication because of their expansive use of threads, and we wanted to see if session types could be applied to fix that. Unfortunately, Servo is a rapidly moving target, so while we did manage to convert some parts of the project to use our session types library, we never got around to finishing the work, and now I believe that they've found different solutions to their challenges.

With regards to your first question, this library is solely for in-process communication. The concept could be expanded to cover inter-process communication as well as communication over the network, and actually has been in other similar session type projects, but focusing on in-process communication allows us to make stronger guarantees.


Very nice


Session types are essentially a type-checked FSM.

So if your type checker passes, your FSM obeys the protocol requirements. (Namely, it does not pass between states in an illegal manner.)


This sounds a lot like the invariant that they're checking is that, _given a conforming counter-party_, nothing illegal will happen. But this doesn't tell you anything about what happens to you when your peer passes between states in an illegal manner. I'd expect this to have been emphasised in the paper. Especially if their threat model includes it, it would be a key selling point.




Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact

Search: