
Zero-cost futures in Rust - steveklabnik
http://aturon.github.io/blog/2016/08/11/futures/
======
AndyKelley
This is huge.

This allows one to express concurrency in a natural way that is not prone to
the typical errors of this class of problem, with no runtime overhead, while
matching C in terms of runtime constraints.

Big kudos to Aaron Turon and Alex Crichton. You guys knocked it out of the
park.

~~~
tracker1
Agreed, and very cool... I do think that async/await syntax is very nice for
reducing a certain amount of nesting bloat. Even with chaining it doesn't look
as nice, IMHO, though it's better than deeply nested callbacks.

------
rdtsc
As mentioned in the post, given that Rust wants to operate in the same space
as C, this approach makes sense. However, at a higher level, when building
more complex concurrent systems, dealing with futures/deferreds/promises
and/or a central select/epoll/kqueue reactor loop gets daunting and doesn't
mix well with complex business rules.

The deferred-based approach has been around for many years. I experienced it
by using Twisted (a Python framework) for 5 or so years, and early on it was
great. However, when we switched to using green threads, the logic and amount
of code were greatly simplified.

So I'm wondering if Rust provides any ability to add that kind of N:M
threading approach, perhaps via an extension, macro, or some other mechanism.

Note that in C, it being C, such things can be done with some low-level
trickery. Here is a library that attempts that:

[http://libmill.org/](http://libmill.org/)

And there were a few others before, but none have taken off enough to become
mainstream.

~~~
pcwalton
> So I'm wondering if Rust provides any ability to add that kind of N:M
> threading approach, perhaps via an extension, macro, or some other mechanism.

I don't want M:N threading as Go implements it. It's a big loss of performance
for marginal benefit over futures. In particular, the libmill approach was
tried in Rust, and the results were far worse than 1:1.

However, assuming this takes off I would like to see async/await style
syntactic sugar over futures down the road to make it easier to write code
that looks blocking but actually isn't. Crucially, this sugar would maintain
the zero-cost nature of futures. With that approach, we'd have the ergonomics
of languages like Erlang and Go without the significant performance tax
associated with M:N.

~~~
sebcat
> I don't want M:N threading as Go implements it.

As someone who writes code for enterprisey businesses doing a lot of I/O-bound
stuff, Go-style M:N threading is a godsend compared to Java's standard library
and the other common platforms in that space. Being able to express your code
in a sequential manner and still gain the performance offered by implicit
fiber/goroutine/whatever scheduling is pretty awesome. With futures, there's
still some language-semantic overhead from the point of view of the developer.

In my problem space, M:N threading is simply worth it. And if I need anything
better performing, I can still switch to C for specific use cases.

~~~
pcwalton
> Being able to express your code in a sequential manner and still gain the
> performance offered by implicit fiber/goroutine/whatever scheduling is
> pretty awesome.

You don't gain as much performance as you might think. On Linux, you don't
actually gain much, if anything, over 1:1 threading. Most of the benefit of
goroutines actually comes from the small stacks, which have nothing to do with
M:N vs. 1:1 to begin with; they're a feature of GC.

As the blog post states, our end goal is to achieve the ergonomics of Go-style
M:N without sacrificing nginx levels of performance. With this futures
library, we've established the foundations. It would make no sense to give up
before even trying.

> And if I need anything better performing, I can still switch to C for
> specific use cases.

Why would you write networking code in C in 2016, when there are better
alternatives available (like this one)?

Rust's philosophy is to have _both_ performance _and_ ergonomics. It rejects
the idea that optimal performance requires a cumbersome programming model. I'm
not about to give up on that.

~~~
lightcatcher
> Why would you write networking code in C in 2016, when there are better
> alternatives available (like this one)?

Support for kernel-bypass networking libraries like ibverbs and DPDK (which
has an old, unmaintained Rust wrapper), and other kernel-bypass I/O libraries
such as SPDK and IOAT.

If Rust supported these libraries, I'd much prefer the future based Rust code
to a massive event loop in C.

~~~
benlwalker
I'm one of the authors of SPDK (which includes an NVMe driver and an IOAT
driver). If the community wants to add rust bindings to those two components
I'd be very supportive.

------
jaytaylor
TLDR;

 _I’ve claimed a few times that our futures library provides a zero-cost
abstraction, in that it compiles to something very close to the state machine
code you’d write by hand. To make that a bit more concrete:

\- None of the future combinators impose any allocation. When we do things
like chain uses of and_then, not only are we not allocating, we are in fact
building up a big enum that represents the state machine. (There is one
allocation needed per “task”, which usually works out to one per connection.)

\- When an event arrives, only one dynamic dispatch is required.

\- There are essentially no imposed synchronization costs; if you want to
associate data that lives on your event loop and access it in a single-
threaded way from futures, we give you the tools to do so._

This sounds quite badass and awesome. I'm not sure which other language
implementations take this approach, but it is clearly an extremely beautiful,
powerful, and novel (to me at least!) concept. Before reading this, I thought
Rust was great. This takes it to the next level, though.
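To make the "big enum" point concrete, here is a hedged sketch of the kind of state machine such a chain compiles down to. The types and steps are made up for illustration; this is not the actual futures-rs internals.

```rust
// Chaining "fetch, then parse" builds one enum whose variants are the
// intermediate states, with no allocation per step (illustrative only).
enum GetAndParse {
    Fetching,         // waiting for raw bytes
    Parsing(Vec<u8>), // bytes arrived, parse them next
    Done(String),     // final value
}

impl GetAndParse {
    // Advance the machine one state; a real future's poll() is similar
    // in spirit, driven by event readiness instead of direct calls.
    fn step(self) -> GetAndParse {
        match self {
            GetAndParse::Fetching => GetAndParse::Parsing(b"hi".to_vec()),
            GetAndParse::Parsing(bytes) =>
                GetAndParse::Done(String::from_utf8(bytes).unwrap()),
            done => done,
        }
    }
}

fn main() {
    let m = GetAndParse::Fetching.step().step();
    if let GetAndParse::Done(s) = m {
        println!("{}", s); // hi
    }
}
```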

~~~
merb
That sounds great, but I'd guess it will increase compile times.

~~~
alexcrichton
In Rust it's frequently the case that slow compilations are dominated by
generating and optimizing LLVM IR. This codegen step (generating LLVM IR)
often takes a while simply because we're generating so much IR.

Rust takes an approach with generic functions called monomorphization which
means that we generate a new version of each function for each set of generics
it's instantiated with. This means that a future of a String will generate
entirely different code from a future of an integer. This allows generics to
be a zero cost abstraction because code is optimized as if you had substituted
all the generics by hand.

Putting all that together, highly generic programs will generally trend
towards higher compile times. With all the generics in play, there tends to be
a lot of monomorphization which causes quite a lot of LLVM IR to get
generated.

As with many aspects of Rust, however, you have a choice! Rust supports what
we call "trait objects" which is a way to take a future and put it behind an
allocation with a vtable (virtual dispatch). This forces the compiler to
generate code immediately when a trait object is created, rather than down the
line when something is monomorphized.

Put another way, you've got control over compile times if you're using
futures. If you're taking a future generically and that takes too long to
compile, you can instead take a trait object (or quickly convert it to a trait
object). This will help cut down on the amount of code getting monomorphized.

So in general, futures shouldn't make compilation worse. You'll occasionally
have a choice between performance (no boxes) and compile times (boxing), but
that's basically already the case in Rust today.
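The trade-off can be sketched with a plain trait rather than the futures library itself (hypothetical types, modern `dyn` syntax): a generic function is monomorphized per concrete type, while a trait object funnels everything through one vtable-dispatched copy.

```rust
trait Compute {
    fn run(&self) -> i32;
}

struct AddOne;
impl Compute for AddOne {
    fn run(&self) -> i32 { 1 + 1 }
}

struct Double;
impl Compute for Double {
    fn run(&self) -> i32 { 2 * 2 }
}

// Monomorphized: the compiler emits a separate, fully optimized copy of
// this function for every concrete T it is called with — fast code, but
// more LLVM IR and longer compiles as instantiations pile up.
fn run_generic<T: Compute>(c: T) -> i32 {
    c.run()
}

// Trait object: a single copy of the code, dispatched through a vtable.
// Codegen happens once, which keeps compile times down.
fn run_boxed(c: Box<dyn Compute>) -> i32 {
    c.run()
}

fn main() {
    println!("{}", run_generic(AddOne));         // one instantiation
    println!("{}", run_generic(Double));         // another instantiation
    println!("{}", run_boxed(Box::new(AddOne))); // single vtable path
}
```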

~~~
SideburnsOfDoom
Hm. I've heard arguments that C# or Java is slow for multiple reasons, but
never because of the minuscule overhead of a virtual method dispatch when
using objects behind interfaces (kinda similar to trait objects).

It's interesting that this is seen as significant here. Are we dealing with
much shorter timescales, or just being eager to optimize everything?

~~~
gpderetta
Virtual dispatch per se is not terribly slow, as long as the branch is
predictable by the CPU. The problem is that virtual dispatch prevents the sort
of aggressive inlining and interprocedural optimizations that C++ compilers
are known for. C# and Java JITs get around that via runtime analysis and
speculative inlining, but that is done at runtime and eats away some of the
precious little time available for optimization.

Edit: spelling

~~~
SideburnsOfDoom
Put it this way:

The cost of a branch misprediction is tens of CPU cycles (1), on processors
measured in gigahertz (10^9 cycles per second), so on the order of 10^-8
seconds.

The time to turn around a web request, if you're very lucky and have done the
work, is mainly about getting a value from an in-memory cache, at multiple
milliseconds (2). That's on the order of 10^-3 seconds.

If you're not lucky, it's 10s or 100s of milliseconds to generate the
response.

So the second duration is, best case, around 10^5 times longer. I would not
sweat the first one.

1)
[http://stackoverflow.com/a/289860/5599](http://stackoverflow.com/a/289860/5599)
2) [http://synsem.com/MCD_Redis_EMS/](http://synsem.com/MCD_Redis_EMS/)

~~~
gpderetta
Contrary to popular belief, not all C++ programs (or rust FWIW) are web
servers serving HTTP requests over the Internet.

~~~
SideburnsOfDoom
Yep, that's why I'm asking about the use-cases in the grandparent comment.

~~~
gpderetta
As an example, many real-time systems are a giant ball of messy asynchronous
code and state machines. Futures can help with that, although lately I have
found that sometimes the best, cleanest way to implement a state machine is to
make it explicit.

~~~
haimez
How much do you attribute that to the benefit of creating a high barrier to
entry for modifying that code? Could this be summarized as: code that
inexperienced devs can't understand, stays performant because they can't
figure out how to change it?

~~~
gpderetta
None of the teams I've worked with had such a policy and certainly I wouldn't
work in a team like that.

------
losvedir
I dabbled with Rust in the past and was really fascinated by it, but haven't
played around lately. One thing caught my eye in the post:

    
    
        fn get_row(id: i32) -> impl Future<Item = Row>;
    

That return type looks odd to me. What does it mean to return an "impl", and
is that a new feature in rust, or just something advanced that I missed in my
exploration before?

~~~
aturon
This is an exciting upcoming feature in Rust, which you can read more about in
a couple places:

\- [http://aturon.github.io/blog/2015/09/28/impl-trait/](http://aturon.github.io/blog/2015/09/28/impl-trait/)

\- [https://github.com/rust-lang/rfcs/pull/1522](https://github.com/rust-lang/rfcs/pull/1522)

This feature allows you to return any struct that implements the trait,
without having to type the name of the struct (which can sometimes be quite
big). It also means that clients only know what traits are implemented; the
concrete type is invisible to them. The above links have a bunch more detail.

(And this feature is set to land in nightly Rust very soon! Shoutout to eddyb
:)
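A hedged sketch of what the feature buys you, using `Iterator` (which has the same shape of problem) rather than `Future`; the function name here is made up. The concrete return type involves an anonymous closure type, so without `impl Trait` it couldn't be written out at all.

```rust
// Hypothetical example: without `impl Trait`, this function's concrete
// return type would be `Filter<Range<i32>, {closure}>`, which cannot be
// named because closure types are anonymous.
fn evens_up_to(n: i32) -> impl Iterator<Item = i32> {
    (0..n).filter(|x| *x % 2 == 0)
}

fn main() {
    // Callers only know "some Iterator<Item = i32>"; the concrete type
    // stays hidden, yet calls are still statically dispatched.
    let v: Vec<i32> = evens_up_to(7).collect();
    println!("{:?}", v); // [0, 2, 4, 6]
}
```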

~~~
dukerutledge
So, existentially quantified types?

~~~
GolDDranks
Yes. However, full support for existential types isn't on the horizon – only
returning them from a function (and using them inside the caller, as type
inference allows this). This resolves some specific pain points of the current
Rust experience.

There may or may not be some extensions (like storing them in fields of
structs) in the future.

Note that Rust supported existentials before too, in the form of trait
objects. But those were only possible behind a pointer and used virtual
dispatch, so the performance story wasn't perfect. The impl Trait syntax is
supported through monomorphization.

~~~
dukerutledge
Yeah, since Rust is eager, existential types seem like a necessary evil.

------
leovonl
In my opinion - as someone with some background in CS - the name "future" is a
little too overloaded here. It is used not only for the deferred computation
of a value but also for the composition of computations. This is not wrong per
se, but calling the result a "future" alone oversimplifies what's happening
underneath and hides some properties of the combinations.

The first observation one can make - which is not mentioned anywhere in the
article - is that the composition of futures here can be understood as a
monadic composition. This by itself gives a big hint why this interface is so
powerful. Second is that this library could be understood as an implementation
of process and process combination from pi-calculus [1] - sequential
combination, joining, selection, etc - so it could be formalized using its
process algebra.

From the practical side, one example of a mature library that implements
similar concepts is the LWT [2] library for OCaml, which has the same ideas of
deferred computation, joining, and sequencing, but calls the computations
"lightweight threads". One could also argue about the naming in this case, but
it seems to better reflect the idea of independent "processes" that are
combined in the same address space.

Finally, as much as these concepts of futures and processes look similar on
the surface, they each have their own properties - so it's always good to
consider what better fits the model. By looking at the research and at other
similar solutions, one can make more informed choices and have a better idea
of what to expect from the implementation.

[1]
[http://www.cs.cmu.edu/~wing/publications/Wing02a.pdf](http://www.cs.cmu.edu/~wing/publications/Wing02a.pdf)

[2] [http://ocsigen.org/lwt/manual/](http://ocsigen.org/lwt/manual/)
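The monadic claim can be made concrete with `Option`, whose `and_then` is exactly monadic bind; a small illustrative sketch (not from the article, function made up):

```rust
// `and_then` on Option is monadic bind: sequence computations that may
// produce nothing, much like chaining futures that may not resolve.
fn half(n: i32) -> Option<i32> {
    if n % 2 == 0 { Some(n / 2) } else { None }
}

fn main() {
    let r = Some(20).and_then(half).and_then(half); // 20 -> 10 -> 5
    println!("{:?}", r); // Some(5)

    // A failure anywhere short-circuits the rest of the chain.
    let none = Some(5).and_then(half).and_then(half);
    println!("{:?}", none); // None
}
```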

------
nv-vn
Anyone else find the f.select(g)/f.join(g) syntax unintuitive/awkward? I'm
confused as to why they wouldn't go with the (IMO) more logical select(f, g)
and join(f, g) in this case (since neither Future is really the "subject" in
these cases). Not that this is a major concern (it would take only a few lines
of code to change within your own program using an alias for the functions),
just interested in knowing the rationale behind the choice.

~~~
samnardoni
You can write it both ways. t.method() is the same as T::method(t).
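For instance, sketched with a stdlib method rather than the futures API (the same rule would apply to `f.join(g)` vs. `Future::join(f, g)`):

```rust
fn main() {
    let s = "hello";
    // Method-call syntax and the fully qualified form are the same call:
    assert_eq!(s.len(), str::len(s));
    println!("{}", s.len()); // 5
}
```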

------
soulbadguy
Finally, a nice async I/O interface for Rust; I always felt it was a big
missing piece. A couple of questions for people familiar with async in other
languages:

1 - Isn't the state machine approach the same one C#/.NET async/await uses,
but with the added convenience of syntactic sugar?

2 - On "no allocation": doesn't the lambda closure need to be allocated
somewhere?

3 - I would have loved some comparison (both performance-wise and in theory)
with C++'s upcoming coroutine work; from my understanding, the C++ approach is
even more efficient in terms of context switching and has the advantage of
even fewer allocations.

~~~
steveklabnik
1\. I am not sure exactly how async/await is implemented, but I believe it is
very similar. Some people are also working on implementing similar sugar in
Rust, but it's not done yet.

2\. Closures are on the stack, not the heap, by default, in Rust. If you don't
see a Box, they're not heap allocated.

3\. I agree!
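Point 2 can be sketched as follows (hypothetical helper functions, not from the futures API): the closure itself lives inline in the generic case, and only an explicit `Box` puts it on the heap.

```rust
// Monomorphized: the closure is stored inline (effectively on the
// stack), with no heap allocation for the closure itself.
fn apply<F: Fn(i32) -> i32>(f: F, x: i32) -> i32 {
    f(x)
}

// The Box is an explicit, visible heap allocation plus virtual dispatch.
fn apply_boxed(f: Box<dyn Fn(i32) -> i32>, x: i32) -> i32 {
    f(x)
}

fn main() {
    let add = |n| n + 1;
    println!("{}", apply(add, 41));                 // 42, no allocation
    println!("{}", apply_boxed(Box::new(add), 41)); // 42, boxed
}
```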

~~~
moosingin3space
Stack-allocated closures are one of my favorite features of Rust. They make it
easy to write functional-style code that performs well.

------
cyber1
A little benchmark of futures-rs vs. lwan ([https://lwan.ws](https://lwan.ws))
on my machine (Core i5):

futures-minihttp(singlethread):

    
    
      $ wrk -c 100 -t 2 -d 20 http://127.0.0.1:8080/plaintext
      Running 20s test @ http://127.0.0.1:8080/plaintext
        2 threads and 100 connections
        Thread Stats   Avg      Stdev     Max   +/- Stdev
          Latency   823.09us  449.37us  20.98ms   98.69%
          Req/Sec    62.15k    10.51k  105.24k    48.63%
        2479035 requests in 20.10s, 340.44MB read
      Requests/sec: 123335.77
      Transfer/sec:     16.94MB
    

lwan(singlethread):

    
    
      $ wrk -c 100 -t 2 -d 20 http://127.0.0.1:8080/
      Running 20s test @ http://127.0.0.1:8080/
        2 threads and 100 connections
        Thread Stats   Avg      Stdev     Max   +/- Stdev
          Latency   596.45us  573.31us  24.46ms   99.33%
          Req/Sec    86.17k    13.15k  119.71k    76.00%
        3429720 requests in 20.01s, 624.73MB read
      Requests/sec: 171404.15
      Transfer/sec:     31.22MB
    

For lwan I used the http server example from the lwan.ws main page.

As you can see, in this example the C http server is much faster than the
simple Rust http server.

* futures-minihttp release build

* lwan -O3

~~~
killercup
On Reddit [1], Alex mentioned that the single-threaded case was not optimized
(yet). What do the numbers for multi-threading look like?

[1]:
[http://reddit.com/r/rust/comments/4x8jqt/zerocost_futures_in...](http://reddit.com/r/rust/comments/4x8jqt/zerocost_futures_in_rust/d6dm0f5)

~~~
cyber1
futures-minihttp(2 threads):

    
    
      $ wrk -c 100 -t 2 -d 20 http://127.0.0.1:8080/plaintext
      Running 20s test @ http://127.0.0.1:8080/plaintext
        2 threads and 100 connections
        Thread Stats   Avg      Stdev     Max   +/- Stdev
          Latency   633.87us  808.22us  24.66ms   98.41%
          Req/Sec    86.72k     4.84k   94.90k    85.75%
        3452383 requests in 20.01s, 424.73MB read
      Requests/sec: 172546.21
      Transfer/sec:     21.23MB
    
    

lwan(2 threads):

    
    
      $ wrk -c 100 -t 2 -d 20 http://127.0.0.1:8080/
      Running 20s test @ http://127.0.0.1:8080/
        2 threads and 100 connections
        Thread Stats   Avg      Stdev     Max   +/- Stdev
          Latency   375.26us  618.54us  20.60ms   98.21%
          Req/Sec   116.82k     5.56k  124.07k    92.25%
        4647906 requests in 20.00s, 846.62MB read
      Requests/sec: 232339.90
      Transfer/sec:     42.32MB

------
Animats
This is cute. This is clever. Whether or not it's too clever, time will tell.
A year ago, I noted that Rust was starting out at roughly the cruft level C++
took 20 years to reach. Rust is now well beyond that.

All this "futures" stuff strongly favors the main path over any other paths.
You can't loop, retry, or easily branch on an error, other than bailing out.
It's really a weird syntax for describing a limited type of state machine.

I'm not saying it's good or bad, but it seems a bit tortured.

~~~
Manishearth
How does futures count as cruft? It's a pure library. This is like saying
GObject is cruft in C. It is a library that some folks don't like (like most
libraries), but it is not part of the language and nobody is forced to use it
in their code.

Why do you think Rust has cruft anyway? It has a lot of type-system features,
yes, but this is no different from languages like Haskell. These features work
together nicely and are useful. C++ has lots of features which, for better or
for worse, have been hacked into the language (they can't be made an organic
part of it because of backcompat). This is not the case with Rust (or D, which
is to me a cruft-less, organically designed C++).

~~~
Animats
Yes, it's a pure library. But this sort of thing is becoming standard for
Rust. If your code isn't full of "foo.and_then(|x| ...)" it's uncool. This
isn't "functional"; these functions have major side effects.

The bothersome thing is that the control structures of the language are hidden
under object-specific functions. "Things really start getting interesting with
futures when you combine them. There are endless ways of doing so." I'd rather
have "there's only one way to do it", as in Python, than "endless ways of
doing so". That usually leads to code that's hard to read and debug, as in
"how did control get there?". At least in traditional code, you can see
control flow easily. Adding a user-definable level of abstraction hides that.

~~~
Manishearth
> This isn't "functional"; these functions have major side effects

uh, no, I've rarely seen adaptors like and_then being used with side effects
(folks use regular loops if they want that). Rust doesn't have a strict notion
of purity, but that doesn't mean most Rust code isn't pure.

There's nothing wrong with having lots of adaptors scattered around the code,
either. It's not less readable, it's just _different_.

> There are endless ways of doing so." I'd rather have "there's only one way
> to do it", as in Python

Uh, "there are endless ways of combining future adaptors", not "there are
endless ways of solving a problem". Each combination of adaptors solves a
different problem (mostly).

> That usually leads to code that's hard to read and debug

This is async code. This has always been hard to read and debug. Futures make
the control flow more explicit, if anything (especially if you have async and
await), because the flow is now in one place, at least. Grokking manual event
loop code is much more annoying.

Sure, the async argument doesn't apply to regular iterators and Option.
However, these "object specific functions" are not object specific. All
futures have the same adaptors. All iterators have the same adaptors. The only
special objects with their own set of such methods are Result and Option.
These share the names of the methods, and these are used often enough to
justify it. There aren't that many of them either, so this really isn't that
big a deal. You just need to know what each of this small number of methods
does. It only hides control flow if you're not aware of these methods, which
is a state of mind that goes away quickly after the first few rust programs.

Besides, because of the closures it's pretty obvious that _some_ tweaking of
control flow is happening, so it isn't hidden. You can check the docs for that
function to know exactly what the tweaking is.

I also don't know what you mean by "traditional code", this pattern is
exceedingly common in languages which aren't C or C++.

------
thomasahle
I'm confused by

    
    
        .map(|row| { json::encode(row) })
        .map(|val| some_new_value(val))
    

Over

    
    
        .map(json::encode)
        .map(some_new_value)
    

Is the explicit extra layer of lambda generally prefered in Rust over just
passing the functions?

~~~
steveklabnik
It's just a style thing, some people prefer one way, some another. I
personally prefer the latter. They compile to the exact same thing.
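The two styles can be sketched with `Iterator::map` on stdlib types (illustrative function name):

```rust
fn double(x: i32) -> i32 { x * 2 }

fn main() {
    // Style 1: an explicit closure wrapping the call.
    let a: Vec<i32> = (1..4).map(|x| double(x)).collect();
    // Style 2: passing the function item directly.
    let b: Vec<i32> = (1..4).map(double).collect();
    assert_eq!(a, b); // identical results
    println!("{:?}", a); // [2, 4, 6]
}
```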

~~~
thomasahle
That makes sense. Do you know if Rust has an established idiomatic style here
yet?

~~~
steveklabnik
I don't think that there's ever been an explicit discussion about it. I'm not
even sure how many people know the latter is possible, to be honest, it's a
bit harder to learn about.

------
bfrog
I love the direction this is going, and the performance it achieves.

Debugging promises/deferreds in other languages has given me nightmares,
compared with Erlang/Go debugging, where you get a simple stack trace.

Does this provide some nice way of debugging complex future chains? Are there
plans towards making it super easy to debug?

Cheers!

~~~
inglor
Native promises in JavaScript suffered from debugging issues, but there are
hooks that can help you with those, like unhandledRejection and the like.

Debugging promises in JavaScript is very easy at the moment.

------
Manishearth
I'm rather surprised by the benchmark; I would have expected the Go benchmark
to be faster than Java's (and the fact that it isn't may indicate some
improvements that could be made to fasthttp by learning from rapidoid or
minihttp). Then again, the difference isn't that much, so it could just be
implementation details that would require a total refactor to fix.

~~~
jerf
You may find this makes somewhat more sense if you think of it as ~5.3
microseconds per request for fasthttp vs. ~4.8 microseconds for Java vs. ~4.3
for Rust. It's 40 microseconds or so for standard-library Go. I'm just
eyeballing the graph, but this should be close enough (dominated by local CPU
variances and such). Just as some people point out that "gallons per mile" is
a more intuitively useful way of thinking than miles per gallon, I think that
at this scale "overhead per request" is a better way of thinking about it.

I'm not generally a big fan of measuring the "ping time" response for web
servers, but in this case I believe it is justified, since we really are
trying to establish that this futures library is very fast. I fear, based on
experience, that some developers will look at this and sit there trying to
choose fasthttp vs. rapidoid vs. minihttp based on this one graph, without
considering what the numbers really mean. Overhead-per-request, I think, makes
it clearer that for the vast, vast majority of purposes all of these,
including Go and Node, are "way, way faster than your code", and unless you
_know_ you're building a server that seriously needs to answer an API call at
several hundred thousand responses per second, all of these are "fast enough"
and the real criteria for choosing should be "everything else".

(One last edit... remember, it's "milli/micro/nano". Micro seems to be
forgotten since it seems like many things that we care about fit into nano- or
milli-. It's quite a challenge in the web world to get your request out in
under a millisecond usually, so .040 milliseconds added as HTTP overhead is
rarely the problem.)

~~~
aturon
Thanks for this comment! I actually totally agree with this perspective, and
wish I'd used your suggested scale in the post.

------
skybrian
Is there any special handling for Futures that complete with an error?

Also, how do you debug code that's hung or taking too long? It might be useful
to get a list of all the jobs (incomplete Futures) that are currently running,
much like running 'ps'.

~~~
aturon
Yes -- the blog post didn't go into details about this, but futures in general
have an error type as well, and all the combinators know how to propagate
errors correctly. (There's also a notion of "cancellation" for a future --
we'll get into this in later posts.)

In terms of debugging, there's no such infrastructure currently, but the kind
of thing you're talking about should be easy to add!
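The propagation behavior parallels `Result::and_then`, where an `Err` short-circuits the rest of the chain; a sketch using `Result` rather than the futures crate (made-up functions):

```rust
// Hypothetical fallible step, standing in for an async operation.
fn parse(s: &str) -> Result<i32, String> {
    s.parse::<i32>().map_err(|e| e.to_string())
}

fn main() {
    // The second step runs only if the first succeeded, just as a
    // future's and_then continuation runs only on success.
    let ok = parse("21").and_then(|n| Ok(n * 2));
    let err = parse("oops").and_then(|n| Ok(n * 2));
    println!("{:?}", ok);  // Ok(42)
    println!("{:?}", err); // Err with the parse failure message
}
```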

~~~
KirinDave
I've used async tooling extensively on several platforms now, including Go,
C#, Java, Clojure's core.async, and the new async methods in JS.

The challenge of these state machine codegen abstractions is not performance.
It's making source-facing debuggers do the right thing and handle errors in a
way that doesn't break the illusion that the code running is the code in
source.

If you can match the degree of support C# has for error handling, you'll be in
an amazing place.

------
bascule
While the benchmarks are looking a lot better than many other similar Rust
libraries in this space, I'm not sure the code is in a state where they're
actually meaningful yet:
[https://github.com/alexcrichton/futures-rs/blob/master/futur...](https://github.com/alexcrichton/futures-rs/blob/master/futures-minihttp/src/response.rs#L30-L35)

~~~
steveklabnik
I agree that good benchmarks are important. This is the Techempower benchmark,
which only requires this, and is how it's implemented in most languages for
Techempower. See
[https://github.com/TechEmpower/FrameworkBenchmarks/blob/180d...](https://github.com/TechEmpower/FrameworkBenchmarks/blob/180d44e1064abc6fe8c703b05e065c0564e6ee05/frameworks/Java/rapidoid/src/main/java/hello/SimpleHttpProtocol.java#L37-L77)
for example.

Extra tests for more complex stuff would be great.

------
plesner
This looks really impressive. I'm curious what the story is around propagating
errors through chains of futures. Traditionally, future libraries don't pay
much attention to that, which can make debugging excruciating, though it
doesn't have to be. But Rust does errors differently, so maybe it's less of an
issue there?

About the naming, though, I was a little disappointed. Of future, deferred,
and promise, "promise" is the better term. The other two imply that something
will happen later, which is misleading because it's fine to have promises
stick around long after they're fulfilled.

------
cm3
This is cool and validates Rust, but I just want to add that even the 2 KB
stacks mentioned in sibling comments are bigger than Erlang's process stacks.
In Erlang 19.0.3, even with dirty schedulers enabled, a process's default size
is 338 words.

------
lossolo
Why didn't you compare it to C or C++? If you want to compete with C/C++, it
would be natural to include them in the benchmarks. Java and Go have GC. It's
like comparing a supercar with street cars when you should compare it to other
supercars.

~~~
Manishearth
Because they're benchmarking the implementation of async I/O, and showing the
results against libraries that folks actually use. This is not a common
pattern in C/C++, so the libraries there are fewer and fewer people have used
them.

But, if you want C++, check the data from the original benchmark that was
linked:
[https://www.techempower.com/benchmarks/#section=data-r12&hw=...](https://www.techempower.com/benchmarks/#section=data-r12&hw=peak&test=plaintext)
. The Java one is slightly faster than the C++ one (more like "just as fast",
since the difference is tiny). And there are 7 more Java libraries (half of
them with recognizable names) before the next one (which I've never heard of),
which underlines my point about this being more common in Java than in C/C++.

The benchmark wasn't shown to prove that Rust is the fastest language in the
universe. It was shown to prove that the Rust futures implementation is
competitive with the others in use today, ones which people would recognize.

------
tomdale
The recent flurry of activity around async IO in Rust has been really
exciting; to me, it indicates that the core team's decision to stabilize the
language was a smart bet that is paying off in rapid ecosystem growth.

One quibble I have with this post is that it talks about futures as a
zero-cost abstraction. That might be true (or close to true) from a
performance perspective, but in my (admittedly inexperienced) opinion, it has
a significant ergonomic cost that is not accounted for.

While futures help us deal with multi-threaded coordination of data from
multiple sources, that overhead isn't necessary for situations where you're
running in a single thread dedicated to doing IO operations.

Dealing with futures in your code is non-trivial. Browsing through the futures
version of the HTTP server, I had a hard time following along:

[https://github.com/alexcrichton/futures-rs/blob/master/futur...](https://github.com/alexcrichton/futures-rs/blob/master/futures-minihttp/src/lib.rs#L92-L133)

And it requires a bunch of helper code to go with it:

[https://github.com/alexcrichton/futures-rs/blob/master/futur...](https://github.com/alexcrichton/futures-rs/blob/master/futures-minihttp/src/io2.rs)

The blog post mentions Tokio, another high-level abstraction on top of mio (by
the same author). Because it doesn't require the futures abstraction from top
to bottom, it offers similar (maybe even slightly better) performance with
what, to my eyes, is far simpler code:

[https://github.com/tokio-rs/tokio-minihttp/blob/master/src/l...](https://github.com/tokio-rs/tokio-minihttp/blob/master/src/lib.rs#L21-L44)

I'm still learning Rust and spend most of my time in JavaScript. The analogy
I'd use is: imagine if in the Node programming model, every API required you
to use JS Promises, even at the very lowest level. Even if you could reduce
the cost of creating new Promise objects, interacting with them over simple
values could make the code you write more verbose. In Rust, that problem is
exacerbated by the much stricter type system and the fact that you have to do
cross-thread coordination.

I'm a total beginner to systems programming, and a lot of this stuff is above
my pay grade. However this shakes out in the community, I'm very happy to see
Rust on the way to becoming the fastest, most productive way to write high-
performance web services.

~~~
aturon
Thanks for the thoughtful reply!

I'm a little confused about the snippet you're pointing out. It's not actually
_using_ futures at all! In fact, that code is just part of setting up threads
for the server. The reason it's more complicated than the version in Tokio is
that minihttp supports _multiple_ event loop threads (which gives some
performance benefits), and this code is handling that setup.

At a broader level, Tokio's services -- the main thing users write -- are
based on futures, in exactly the same way as minihttp. So for people writing
actual servers, the ergonomics should be the same.

Now, stepping back, it's definitely true that ergonomics are a cost to be
aware of, and it's one we've thought carefully about in the design of futures.
We've had a lot of experience and success in Rust with iterators, which share
a lot of the same API design philosophy.

That said, I do anticipate that over time we'll want to layer sugar on top of
futures. As I mention a couple times in the blog post, async/await (or
something like it) is the obvious way to do it, and there's plenty of prior
art. But I think we should walk before we run -- let's make sure we've got the
core abstraction right, and then we can sprinkle some sugar where it's needed.

------
crudbug
One thing I have not seen in the discussion is the Work vs. Worker
abstraction.

Your application's work (computation logic/business rules) should be decoupled
from the type of worker.

The worker can be blocking or non-blocking: futures, continuations, or
coroutines.

------
vvanders
Not sure if I missed this in the post, does this depend on any unstabilized
features or can we use this today on 1.10.0 stable?

Awesome stuff btw, love the iterator inspiration.

~~~
alexcrichton
You can indeed use this on stable Rust today! Right now 1.9.0 is the minimum
supported version due to the usage of `catch_panic` in a few places.

I'd recommend a beta compiler for now though to compile some of the examples.
There's a bug in the stable compiler which causes them to take up to 8x longer
to compile, but beta/nightly are both speedy!

~~~
vvanders
That's so awesome. I was hoping that was the case.

Serious Kudos, it feels like a lot of the promises of Rust are really paying
off here.

~~~
jdub
"promises" _applause_

------
saynsedit
Big downside is now you will have a dichotomy of functions that block using
futures and functions that block at the OS level and no sane way to intermix
them. Rust essentially becomes two languages. Async/await sugar doesn't fix
this.

Would be great if functions could be written in a general way for both IO
models and users could select the implementation at their convenience.

~~~
steveklabnik

      > no sane way to intermix them.
    

My understanding is, the idea is to put the blocking stuff in a threadpool
with [https://github.com/alexcrichton/futures-
rs/tree/master/futur...](https://github.com/alexcrichton/futures-
rs/tree/master/futures-cpupool)

~~~
saynsedit
Right, there may be a lot of back-and-forth marshaling between using thread
pools and not, depending on whether the library you're using is futures-based
or not.

Maybe you use one library that is futures-based and one that isn't. Maybe the
library you use is mostly non-blocking except for one use of sleep() or
another esoterically blocking call. It's just annoying and error-prone.
Most people may not even be aware of the subtly blocking nature of the code
they use in their futures-based project.

This is what I mean by having two different languages. Libraries written for
one aren't always/simply compatible with the other. You'll have a growing
community of nominally "futures-based" rust libraries too.

This is why people use Go or Erlang. It just removes the need to think
about this. Not saying they are generally better than Rust, and some Rust
people may even like having a futures-based sublanguage, but I suspect most
programmers will be loath to deal with the extra mental tax.

~~~
pcwalton
> Right, there may be a lot of back and forth marshaling between using thread
> pools and not depending on whether the library you're using is futures based
> or not.

So just like if you use cgo. You can't get away from having to deal with the
issue entirely; the most you can do is to punt it to the FFI layer. There is
the question of how much of the community is using blocking vs. nonblocking
I/O, to be sure, but Go has a version of that too: how much of the community
is using cgo vs. how much of the community is writing in pure Go.

> This is what I mean by having two different languages.

Calling them "two different languages" is a huge exaggeration. You simply
block or switch to a thread pool: it's very easy.

> I suspect most programmers will be loathe to have to deal with the extra
> mental tax.

I like having the low-level control over blocking vs. not, especially in
situations where I can't use async I/O everywhere (for example, my work on
Servo). In fact, it's essential.

Ultimately this is going to come down to "you should be willing to pay a
performance and control tax for a more ergonomic model" vs. "you shouldn't
give up performance and control for a small amount of ergonomics". Yes, there
is a tradeoff here. That's fine. Taking Go's side of the tradeoff would make
Rust unusable for my domain, and for many others (which is why M:N was the
most controversial issue ever in the Rust community, with most of the
community demanding it be removed, while in Go nobody questions it). Some
people may not want Rust's side of the tradeoff, and that's fine too.

~~~
saynsedit
I see your cgo analogy, but the issue is much less pronounced there, since
the programming interface is the same: the programmer is supposed to assume
everything will work as it should (even if it doesn't always). In this case
it's a different programming interface, and I think that underscores the
issue.

Regarding your comment on preferentially having control over blocking/async
code. I think that's right. At the same time, some C++ programmers would say
that they prefer having to think carefully about how memory is managed in
their program (say, for the benefit of skipping bounds checks). C++ draws a
line, Rust draws a line, Go draws a line, Java draws a line, and Python draws
a line. These lines are somewhat about technical superiority and somewhat
about programmer identity/preference but they are mostly about domain-specific
constraints and necessary tradeoffs. This futures-based approach will be
sufficient (if somewhat inconvenient) where Go/Erlang can't be used, e.g.
where GC pauses are absolutely intolerable.

~~~
pcwalton
It's not just about GC pauses. Rust isn't "little Go" that you reach for only
when you can't afford a GC. Many people choose Rust for the cargo package
manager, generics, pattern matching, mature optimizer that prioritizes runtime
speed over compilation speed, ability to write libraries callable by any
language, fast FFI, compiler-enforced data race prevention, memory safety in
multithreaded code, etc. etc. These benefits apply to servers too. And many of
these benefits are what lead to the futures model being more appropriate than
the M:N model for the language.

Go has its benefits too, of course! One of those benefits is that blocking I/O
is a simpler mental model. Both languages can happily coexist without one
being in the shadow of the other.

------
the_mitsuhiko
Woohoo. I was waiting for this. I hope that at a later point this will also
mean that we get some sort of syntax support for it once it's stable and
entered std.

~~~
kibwen
What sort of syntax support do you have in mind? (Personally I think any
language-level support is unlikely, given that Rust doesn't make a point of
privileging any particular approach to concurrency.)

~~~
the_mitsuhiko
See Python 3.5, ES6 and C# and their await stuff. They generally use a form of
future internally to abstract this.

~~~
kibwen
Ah yes, I wasn't thinking of async/await. I suppose there's enough precedent
from other languages to warrant that, though this is probably far-future work
(I don't even expect futures to land in the stdlib for quite a while, though I
would like them to eventually).

------
hinkley
Back when futures and promises were a new concept to most people, if someone
asked me to explain why you would want to do such a thing, my favorite example
was loading images in a web browser. You wouldn't want to load the same image
four times just because it appears in four places on a page, would you? Yada
yada promises etc etc.

Seeing articles like this makes me feel like a circle has finally been closed.

------
soulbadguy
For those who are curious about how this fares against a coroutine-based
approach: [https://www.youtube.com/watch?v=_fu0gx-
xseY](https://www.youtube.com/watch?v=_fu0gx-xseY)

------
kbenson
> a simple TCP echo server;

How convenient. I've been exploring/learning Rust, and writing a simple echo
server and comparing it to a reference version I've written in Perl is the
first semi-trivial program I wanted to tackle.

------
michaelmior
Curious if someone has tried this and Eventual[0] with any thoughts on how
they compare.

[https://github.com/carllerche/eventual](https://github.com/carllerche/eventual)

~~~
alexcrichton
The futures crate is intended to be the successor to eventual, the author of
which, Carl, helped us with some key insights in the futures crate as well.

~~~
michaelmior
Cool! Thanks for the reply :) It's nice to see some collaboration as opposed
to trying to decide between competing crates.

------
meneses
Awesome. So to cancel a future, I just drop it! Awesome.

------
ufo
Unfortunately, it seems that you still need to use callbacks and lots of
and_thens to write this async code.

Wouldn't it be possible to add coroutines to Rust instead?

------
eggnet
How are futures handled for open() and disk i/o?

~~~
steveklabnik
Nothing special yet; those are still implemented in a blocking way, and so
should be put into a threadpool.

~~~
eggnet
A threadpool with futures support, I gather. Very nice.

~~~
steveklabnik
[https://github.com/alexcrichton/futures-
rs/tree/master/futur...](https://github.com/alexcrichton/futures-
rs/tree/master/futures-cpupool)

------
matthewaveryusa
I'm genuinely interested in knowing what the problem is with an event loop
using epoll and a threadpool for IO that blocks but epoll can't poll. I've
used proprietary event loops at 2 giant companies, libuv with C, asio with
C++, and Node's async, and the async IO was never the problem in terms of
performance or complexity. What is the problem that's trying to be solved?

~~~
sanxiyn
The problem being solved is better UX (in this case, developer experience). It
_is_ an event loop using epoll and a threadpool, just in a more palatable (and
composable) interface, without much overhead.

------
ridiculous_fish
How does the zero-cost abstraction work?

Say we make a Future<Int> and then chain `.map(|x| x+1)` a dynamic number of
times (N). Presumably this requires storing at least N function pointers.

How can we store these N function pointers with zero cost? If it only takes
one allocation, where does the N-1 future store its function pointers?

~~~
dbaupp
A "zero-cost abstraction" really means that the abstraction doesn't impose a
cost over the optimal implementation of the task it is abstracting. Some
things, like chaining a dynamic number of (arbitrary) closures, fundamentally
require some sort of dynamic allocation/construction, so a zero-cost
abstraction is one that only does that dynamic behaviour when necessary.

If you don't need the dynamic behaviour, the library is a zero-cost
abstraction by statically encoding all the pieces at the type level: like
Rust's iterators, each combinator function returns a new future type that
contains all information about its construction and operations. On top of
this, a closure in Rust is not a function pointer: each one is a specialized
struct containing exactly its captures, so this all gets put into the type
information too, and everything can be inlined together into a single
pipeline.

However, if you are dynamically constructing the future you'll have to opt-in
to a uniform representation for the parts (i.e. erase the type information
about the different constructions). This does indeed require allocating and
storing pointers, but AFAICT this is required in any implementation, i.e. this
library imposes no/little extra overhead over the optimal hand-written
implementation.

Furthermore, the static and dynamic parts can work together: if you have parts
that are statically known, these can be constructed as a single static type
(with no function pointers or allocations), and then boxed up into a dynamic
future as a whole unit, which can then also form part of other static chains,
meaning allocations and dynamic calls only need to happen when absolutely
necessary.

~~~
ridiculous_fish
Thanks for your reply. I'm still trying to understand what the restrictions
look like.

Say I sometimes have an outstanding asynchronous operation, i.e. validating
some text in a document. I want to represent this by storing a Future
representing this operation:

    
    
        struct Document {
            text: String,
            validating: Option<Future<bool>>
        }
    

It seems like there's a problem: in order to have a struct of this type, we
need to have the type of the Future, but the only way to have the type of the
Future is to create the Future (and here we have None).

Is this struct possible? How would I initialize it with {"foo", None}?

~~~
steveklabnik
One solution is to put the Future into a Box.

~~~
steveklabnik
And I can't believe I missed this, but the other way is to make the struct
generic over a type implementing Future.

~~~
ridiculous_fish
Does this work though? How do you create an instance of the struct without an
instance of the Future?

~~~
steveklabnik

        trait Future {}
        
        struct DynamicFoo {
            text: String,
            validating: Option<Box<Future>>
        }
        
        struct StaticFoo<T: Future> {
            text: String,
            validating: Option<T>
        }
    

These are the two options. In the first, Box<Future> stores a "trait object",
that is, a tuple of (data ptr, vtable ptr). In the second, there will be a
struct for each type used to generate a StaticFoo.

    
    
        impl Future for i32 {}
        impl Future for f64 {}
        
        fn main() {
            let d = DynamicFoo {
                text: String::from("foo"),
                validating: Some(Box::new(5) as Box<Future>),
            };
            
            let s1 = StaticFoo {
                text: String::from("foo"),
                validating: Some(5),
            };
            
            let s2 = StaticFoo {
                text: String::from("foo"),
                validating: Some(5.0),
            };
        }
    

Works just fine.

------
natrius
What would a rough sketch of async/await syntax sugar look like implemented
with Rust macros?

~~~
steveklabnik
One vision of that:
[https://github.com/erickt/stateful/blob/master/examples/gene...](https://github.com/erickt/stateful/blob/master/examples/generator.rs)

------
shmerl
So will this become the official part of the language / standard library?

~~~
steveklabnik
That, if it happens, is still fairly far in the future. This isn't the only
work in this space in Rust (see [https://medium.com/@carllerche/announcing-
tokio-df6bb4ddb34](https://medium.com/@carllerche/announcing-tokio-
df6bb4ddb34) for example, which also uses this library), so it's still in an
early phase. Once everyone has actually used things and found it satisfactory,
then such things can be discussed.

------
hackaflocka
What's the meaning of "zero cost future" in this context? I googled the phrase
and got a bunch of irrelevant material.

~~~
steveklabnik

      > C++ implementations obey the zero-overhead principle: What 
      > you don’t use, you don’t pay for [Stroustrup, 1994]. And 
      > further: What you do use, you couldn’t hand code any better.
      > 
      > – Stroustrup
    

So, in this context, the idea is that if you hand-rolled your own state
machine, you should see no difference from using this library. And we
measured: the overhead in a benchmark comparing the two was 0.3%; that's three
tenths of one percent.

~~~
jholman
The other concept that GP needs to understand the title is
[https://en.wikipedia.org/wiki/Futures_and_promises](https://en.wikipedia.org/wiki/Futures_and_promises)

So the title/TFA is about "futures" (in the sense of my link) that have "zero
cost" (in the sense of steveklabnik's comment).

------
b34r
select is an odd term choice for what is essentially a race condition. What's
the thought behind the naming of that method?

~~~
steveklabnik
[http://man7.org/linux/man-
pages/man2/select.2.html](http://man7.org/linux/man-pages/man2/select.2.html)

------
pbarnes_1
This is awesome, but I have an off-topic rust question:

Why can't we have some syntactic sugar to get rid of .unwrap()?

~~~
whateveracct
Instead of looking for "syntactic sugar to get rid of unwrap()" I think the
better option would be to prove that the Option isn't None. In my experience
in other functional languages with Options, if you can't prove it, you're
usually making a mistake somewhere upstream in your code.

------
ben0x539
I guess it's cool that Rust is getting zero-cost futures, but they have a long
way to go to catch up to C++'s negative-overhead coroutines!

------
mike_hock
I suppose you could say, this way of programming is _the future._

~~~
infogulch
_get off my lawn_ this isn't reddit _grumble grumble_ doesn't add to the
conversation _blah blah_

