Hacker News new | past | comments | ask | show | jobs | submit login
Option Monads in Rust
83 points by mmastrac on Aug 16, 2014 | hide | past | favorite | 57 comments

As a Haskell programmer and a casual Rust observer (yet to take the dive), I am constantly impressed with the nice balance Rust seems to have struck between Haskell and C.

Currently my #1 and #2 languages are Haskell and C. I will definitely jump on the Rust bandwagon with the 1.0 release -- maybe sooner, we'll see.

I too am getting into Rust and am liking what I'm seeing so far. However, I find its type abstraction facilities limited. For example, the concept of a "monad" is not expressible using the trait system so that a bunch of related types can reuse the abstraction. Instead what we have to do is to declare a ListMonad<T>, OptionMonad<T>, with separate signatures for map/join/whatever. Ideally, something like Monad<M<T>> could be the trait specification.

edit: Not Monad<M<T>>, but just Monad<M> with the facility to use M<T> in the body of the trait. I don't have a clear idea of how references, lifetimes, etc. would pan out in this case.

What you are describing is essentially higher-kinded types (HKT). It is known that we want this, but it's probably going to happen post-1.0.

Shouldn't this happen before 1.0, so that the standard library can be designed to make use of it before it's frozen?

The standard library is not being frozen as a whole at 1.0, we are only providing strong backwards compatibility guarantees about the language itself. There are precise markers and compiler diagnostics that individual parts of the standard library will use to convey their stability properties (including graceful deprecation warnings).

> The standard library is not being frozen as a whole at 1.0, we are only providing strong backwards compatibility guarantees about the language itself.

That's unfortunate. :( Which version will have a full backwards compatibility guarantee?

I don't think it matters whether the full standard library is marked stable. What matters is whether the portion of the standard library that you're using in your particular app is stable. To that end, we have a lot of tooling to make sure that you can stay within the stable parts of the standard library if you wish. Version 1.0 will have enough of a stable base for most apps.

Freezing the entire standard library in one fell swoop would freeze design mistakes, and delaying 1.0 until we're sure about every last symbol in every last library would delay 1.0 for no reason. I'm confident in our approach.

> I don't think it matters whether the full standard library is marked stable.

For example, unless IO is stable no serious program can be written. But IO would benefit from monads and HKT.

try! is good enough for 1.0.

As a "many things except haskell ( yet to take a dive) and pure C" developper, i am also extremely eager to try Rust once the 1.0 is out.

One painful point i've seen so far is pointer semantics, but it seems pretty much like the only one, and after having read a slidedeck explaining the reason behind them i'm feeling even more excited.

What's the pain point of Rust's pointer semantics from the POV of a C programmer? I'm genuinely quite interested—especially given that Rust's pointer semantics is one of their major innovations.

Having done a (simple, but non-trivial) memcached implementation in Rust, the pointer semantic is terrible right now. The compiler tracks all the use of the values and assign them lifetimes. These lifetimes are tagged from a special form of use (value, mutable value, reference, mutable reference, and some more).

The idea of Rust memory-safety is to totally forbid two simultaneous mutable lifetime whatever the code path could be.

Whatever the code path could be. Do you begin to realize how strong this assumption is ?

Yes, the rust compiler is smart. But sometimes, it's not enough and you have to do the lifetime tracking job by yourself. It's awfully hard (even without recursion). But you know what's even harder ? When all the common code patterns you are aware of for a given task doesn't provide enough warranty for the rust compiler. You know what you want to do. It's indeed very simple (can be an inner swap in some sort of algorithm) but you can't find a way to warranty the compiler that all should be good. Because in its own sort of bizarre logic, it makes sense. Then you're trapped. And you are waiting on #rust@irc.mozilla.org an hint from the bunch of truly marvelous guys who pass their time here.

TLDR: Rust can't express 100% what others languages do (ML or C based). And in the portion where it can, you have to do a lot of gymnastic to make it compile.

And our Memcached, you'll ask ? Well. After 6 days of work, we finally make it compile, and it worked quite fine. But then we got some mysterious tasking errors when reaching 20 clients and gave up at this point.

FWIW, the code displays several signs of misunderstanding and generally fighting the system, rather than working with it (e.g. drop(&foo) always does nothing, and the single required lifetime doesn't seem too crazy: the returned reference comes from inside the memory that `self` points to, so the compiler needed to be told that they were connected to guarantee safety).

Of course, this is presumably a pedagogical problem (e.g. the language grows and changes so tutorials go out of date, including the official one, and the language is new, so there's not a huge number of examples anyway), but I think it's a little unfair to say that Rust can't express a lot of things, when it's mostly just a question of practice. That is, Rust's semantics are fairly different to a managed language like an ML or a "do whatever you like" one like C, and so it takes effort to really 'get' it, just like, for many people, it takes effort to 'get' Haskell.

> But then we got some mysterious tasking errors when reaching 20 clients and gave up at this point.

Your code has a lot of `unwrap` and `fail!`, it's generally recommended to use a more composable error handling technique (like returning Option and Result) or otherwise any error (either in the input, or a bug in your application) will lead to these "mysterious" errors, and especially avoiding `unwrap`.

I haven't done a serious Rust project, but it appeared to me that this kind of thing was exactly the "why" of Rust. The expectations I had gathered was that many obvious pointer algorithms would not (and should not?) translate to Rust.

I definitely asked for a Rust review from the C POV, but I feel like they do a good job marketing the idea that it's not supposed to express 100% of what other languages do.

Or perhaps it's just further from reasonably expressive than I had believed.

> I definitely asked for a Rust review from the C POV, but I feel like they do a good job marketing the idea that it's not supposed to express 100% of what other languages do.

You can express everything that other languages can do—we've written a compiler and a Web browser in it, remember. Of course there is a learning curve, but once you've learned the rules I've found that, by and large, people don't have too many problems, and the time saved by not having to track down segfaults in a debugger is a huge productivity booster.

To be clear, I meant "low level 100%" in the sense of outlawed bad programs. I'm quite confident that it'd still be sufficiently expressive for application level concerns—perhaps with a change of style!

> "low level 100%" in the sense of outlawed bad programs

There is unsafe in Rust. There you can go low level as your heart desires.

> and a Web browser in it

"Web browser" is an exaggeration for an experimental render engine in an early stage.

Calling it a render engine at an early stage is disingenuous. It's a prototype browser engine. By that notion WebKit or Blink are just text renderers.

We are far, far from 100%.

You can only take mutability lifetimes if you are alone. That mean for example that you can't insert in a tree without going mad. Yeah, you have to search (non-mut operation), insert (mut operation), and balance your tree (mut operation). Not possible.

How do you do it in the official B-Tree implementation ? You clone all the branch at each insertion[1] so that the insertion don't compete with the search. Oh well. Hopefully it's a B-Tree so it won't be too deep. But for a performance oriented language, it's make me cry a little.

[1]: a) https://github.com/rust-lang/rust/blob/master/src/libcollect... b) https://github.com/rust-lang/rust/blob/master/src/libcollect...

The btree is very immature. TreeMap is better. It demonstrates that these operations can be done fairly naturally with swaps: https://github.com/rust-lang/rust/blob/master/src/libcollect...

Yes, I can see that. And there is also a bunch of interesting patterns to learn from.

Please understand that by no mean I want to discredit your work. Rust is a great thing, and will be amazing. However, a random programmer alone in the jungle with only the rustc and the doc will had a hard time to express whatever he likes. It will get better, I trust you. Just don't let the hype grow too fast as Rust is not ready yet, not only in features but also in accessibility.

Do you have the source released somewhere? I'm interested in Rust but haven't really written anything large - I would find it interesting to see what a large project looks like, especially the kinds of problems that come up (like you mentioned).

Here are two Rust projects I did back at the 0.8 time frame. I tried to make it more read-friendly than clever hacks. Some of the syntax might be outdated since Rust has changed quickly and I didn't want to keep updating it until 1.0 is frozen.

A memcached client lib implementing both the binary protocol and text protocol, plus sharding to multiple servers. This is simpler in scope. https://github.com/williamw520/rustymem

A gzip implementation and deflate implementation. This is more complicate in scope. https://github.com/williamw520/rustyzip

> A memcached client lib implementing both the binary protocol and text protocol, plus sharding to multiple servers. This is simpler in scope. https://github.com/williamw520/rustymem

Oh, nice, I hadn't seen this before. That looks really elegant and idiomatic :)

Thanks. :) It's my first Rust project to learn it.

Of course, it's here: https://github.com/yaon/lottery

But even if you are familiar with the patterns we used and can guess the architecture, there will be a lot of unexplained code. I don't think you'll be able to extract a lot of knowledge of it.

Anyway, you should look for io_thread.rs (I think it's the most interesting file). Look how the scope are defined (especially the use and abuse of "{}"), the alternation of &self and &mut self and random methods put out of the implementation cause rustc would accept anything when they where inside.

You can also look at the mixed use of String and str which introduce a lot of chaos. And, last but not least, bonus points if you succeed to understand the why and the how of the explicits lifetimes we had to put.

Note that the newest version of master Rust does not require you to write the explicit lifetime in io_thread.rs (I didn't see any others).

The "Symphony of Functions" example is worth dissecting, since it's a good example of when NOT to use an Option monad.

The code computes sqrt(-1 * (log(-1 * (20 * 2)))^2).

The potentially failing functions are log and sqrt. Failure is determined by checking if the result is a "normal" floating point value. But 'double' and 'square' may also produce non-normal values, e.g. squaring a large float may produce infinity. So why doesn't double() return an Option as well?

Furthermore, checking if a value is normal is not correct. If x slightly exceeds 1, then log x is a small positive value, i.e. a denormal, and therefore log() will incorrectly report failure.

Floating point arithmetic already has a serviceable "Option" type in its non-finite values (NaN and infinities), so the best way to write this code is the naive way:

    let result: f64 = sqrt(pow(-1 * (log(-1 * (20 * 2))), 2));
    let success: bool = is_finite(result);
The Option monad only makes this code longer, slower, and buggier.

> If x slightly exceeds 1, then log x is a small positive value, i.e. a denormal, and therefore log() will incorrectly report failure.

No it won't, for the neighboring values of 1, which are 1-2^-53 and 1+2^-52, log will return approximately -2^-53 and 2^-52, which are not denormal.

It was a challenge to come up with a good example for the section without utilizing too much of a language that is unfamiliar to most of the readers. I'm debating changing it to a string parsing example. Do you think it would be more appropriate of a use? Or is there a better concept to demonstrate?

I'm really excited that PCWalton got some basic associated types support into Rust in time for the 1.0 milestone, that plus post 1.0 HKT (higher kinded traits/types) will make Rust a really compelling language for building amazing libraries :)

with those in place, some really beautiful api technology becomes possible to port to Rust. I really appreciate that the core dev team has those partly underway, and partly planned for the post 1.0 stuff.

Now do one about either.

Given Java 8 and Optional, I recently used Either to replace 80 something lines of crazy interconnected try-catch-bullstuff in a server with 4 straight either-based lines, "do this, in good case try this, in good case try that, ... and finally send a good result to the client, or log a bad result and send the error to the client".

Had to rework and introduce some methods to make this work, of course, but those methods turned into try { foo = doWork(); return Either.goodResult( foo ); } catch ( Exception e ) { return Either.badResult(e); }. It's structured, and neat, and readable. :)

Either is called Result in Rust.

That's a terrible name - they should change it to Either. We always call things "result" in code, but I've never called a variable "either". That's why I think Either is a better name - it's less likely to clash with valid names.

I agree with you on that. "Either" suggests 'either this or that', to me. "Result" feels too generic. Any value can be called a 'result'.

Though I don't know what the reasoning is on the naming of that type; for all I know it might make a lot of sense within the context of rust and be very much in line with the conventions of the rest of the standard library.

Result is essentially trying to only be an error-handling technique, its two variants are Ok and Err (less generic than Left/Right). I believe the 'Result' name actually comes from OCaml.

Rust used also to have an isomorphic type called Either, but error handling was found to be by far the most common use of these two types, so it was decided to standardise on a single one with obvious variant names (i.e. it's clear with Ok/Err which one is the error and which one is the real value, while Either requires getting the Left/Right convention right).

I remember having read that the convention was using Right when the value was right and Left for the error (the error that's left, I imagine).

Result makes sense to me: you either have an Ok(value) or an Error(reason). As for the name clashing, types and values have different namespaces, so no worries there.

Am I wrong in thinking this is using Option as a functor, and not a monad?

Edit: looks like rust's and_then() is similar to bind, so it's reasonable to call this monadic.

EDIT: This is wrong

I initially thought the same thing, due to the use of `map` as something of an equivalent of the Haskell `bind` (>>=) operator. It reminded me of just using `fmap`.

But no, they are using it as a full monad. Consider the type of fmap:

    Functor f => (a -> b) -> f a -> f b
and the type of (>>=)

    Monad m => m a -> (a -> m b) -> m b
The key difference is the function argument. In a Functor, the argument is (a -> b), but in a Monad, it's (a -> m b). The distinction is that when you `fmap` a function over a `Just` value (`Some` in Rust), you always get back a `Just` value, but when you bind a `Maybe` value into a function, you might get back a `Nothing`.

So don't let Rust's use of `map` deceive you -- it seems to be the same as (>>=).

That being said, I'd be interested to see Rust introduce something akin to do-notation syntactic sugar.


EDIT: I am wrong, the `map` function is basically equivalent to `fmap` and it seems the `and_then` function is similar to (>>=). See user tel and user wyager's responses.

Currently, it's impossible to introduce a generic `Monad` trait (the same with `Functor`) because we're missing Higher-kinded types. Once Rust has such a type system extension, it'll be possible to introduce some monadic sugar.

Note that you could have monadic sugar via macros today; it'd just be duck-typed, not strongly-typed.

But rust's map() is fmap. All the functions used with map() are a->a. It looks like and_then() is more equivalent to >>=.

Aaaah I misunderstood -- I thought they'd made the type signatures for all of their math functions to return an Option type. You are totally correct.

.map() is the functor interface and .and_then() is the monad one.

I don't understand the second example:

  fn some_on_even(val: int) -> Option<int> {
    match val {
        // Matches an even number.
        even if even % 2 == 0 => Some(even),
        // Matches anything else.
        _                     => None
Where is even coming from? Is that a typo, or that construct is actually defining a variable even from val if val % 2 == 0 ?

That seems absurdly confusing to me.

It's binding a variable `even`, and then using a pattern guard to conditionally take that arm depending on the value of `even`; it's (basically) the same as

  match val {
      even => {
          if even % 2 == 0 {
          } else {
      _ => None
which is an absurd construct (and the example pretty absurd too). It should be written as just

  if val % 2 == 0 {
  } else {
or, if you really like the formatting a match provides, just avoid creating a new name binding

  match () {
      _ if val % 2 == 0 => Some(val), 
      _ => None  
`match` and pattern guards make far more sense when not just binding directly to a single variable like that, e.g.

  let val: Option<int> = ...;

  match val {
      // even
      Some(x) if x % 2 == 0 => Some(x),
      // odd
      Some(x) => Some(x + 1),
      None => None
(Of course this may still be clearer using `.map` instead `match` but it's a better example than the original one.)

Match has the syntax of:

    match value {
        pattern(variable-binding) [if guard-condition] => matched result
In this case, the first 'even' is variable declaration binding it to val. The next 'if even % 2 == 0' is the guard expression, qualifying the range of values of the 'even' variable. The 'if' is a keyword, indicating a guard expression. The next '=> Some(even)' returns an Option type wrapping the matching 'even' values.

The wildcard _ matches whatever left, which is the non-even values of val.

To answer your question, it's the latter. That's what pattern matching is, basically.

There's a typo in the title, it should say "Option Monad in Rust".

Reposts are ok when a post hasn't had significant attention in about a year: https://news.ycombinator.com/newsfaq.html.

15 points is a fair bit of attention, but 0 comments not so much, so I think this repost is ok.

That one doesn't seem to be a year old, but the lack of comments make this bump reasonable IMO.

I fear that what I wrote wasn't clear. The rule about significant attention only applies to stories that are less than a year old. After a year or so, reposts are typically ok.

Aren't submissions folded to the last one, if the URL is the same?

I could swear I have seen that even when the prior submission was a few years old.

Does the algorithm to decide that take comment activity into account?

The algorithm doesn't currently take the time of submission or number of comments into account, but we're working on a new implementation that will handle all of this.

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact