Why asynchronous Rust doesn't work (theta.eu.org)
603 points by tazjin 6 months ago | 482 comments



A bigger problem in my opinion is that Rust has chosen to follow the poll-based model (you could say it was effectively designed around epoll), while the completion-based one (e.g. io-uring and IOCP) will, with high probability, be the way async is done in the future (especially in light of Spectre and Meltdown).

Instead of carefully weighing the advantages and disadvantages of both models, the decision was effectively made on the grounds of "we want to ship async support as soon as possible" [1]. Unfortunately, because of this rush, Rust got stuck with a poll-based model and a whole bunch of problems without a clear solution in sight (async drop, anyone?). And instead of a proper solution for self-referential structs (yes, a really hard problem), we ended up with the hack-ish Pin solution, which has already caused a number of problems since stabilization and now may block enabling noalias by default [2].

Many believe that the Rust async story was unnecessarily rushed. While it may have helped increase Rust adoption in the medium term, I believe it will cause serious issues in the longer term.

[1]: https://github.com/rust-lang/rust/issues/62149#issuecomment-... [2]: https://github.com/rust-lang/rust/issues/63818


> A bigger problem in my opinion is that Rust has chosen to follow the poll-based model

This is an inaccurate simplification that, admittedly, their own literature has perpetuated. Rust uses informed polling: the resource can wake the scheduler at any time and tell it to poll. When this occurs it is virtually identical to completion-based async (sans some small implementation details).

What informed polling brings to the picture is opportunistic sync: a scheduler may choose to poll before suspending a task. This helps when e.g. there is data already in IO buffers (there often is).

There's also some fancy stuff you can do with informed polling that you can't with completion (such as stateless informed polling).

Everything else I agree with, especially Pin, but informed polling is really elegant.


Could you explain what informed and stateless informed polling are? I haven't really found anything on the web. Thanks!


I believe they mean that when you poll a future, you pass in a context. The future derives a "waker" object from this context which it can store, and use to later trigger itself to be re-polled.

By using a context with a custom "waker" implementation, you can learn which future specifically needs to be re-polled.

Normally only the executor would provide the waker implementation, so you only learn which top-level future (task) needs to be re-polled, but not what specific future within that task is ready to proceed. However, some future combinators also use a custom waker so they can be more precise about which specific future within the task should be re-polled.
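
To make that concrete, here's a minimal sketch of the "custom waker" idea using the std::task::Wake trait (the names here are made up; real combinators do something similar, just more efficiently):

    use std::sync::{Arc, Mutex};
    use std::task::{Wake, Waker};

    // Hypothetical per-child waker: records which child future woke us,
    // then forwards the wakeup to the executor-provided waker so the
    // whole task still gets rescheduled.
    struct ChildWaker {
        child_index: usize,
        woken: Arc<Mutex<Vec<usize>>>,
        parent: Waker,
    }

    impl Wake for ChildWaker {
        fn wake(self: Arc<Self>) {
            self.woken.lock().unwrap().push(self.child_index);
            self.parent.wake_by_ref();
        }
    }

    // A combinator polling child `i` can hand it a context built from this:
    fn waker_for_child(i: usize, woken: &Arc<Mutex<Vec<usize>>>, parent: &Waker) -> Waker {
        Waker::from(Arc::new(ChildWaker {
            child_index: i,
            woken: woken.clone(),
            parent: parent.clone(),
        }))
    }
On the next poll, the combinator checks `woken` and only re-polls the children whose indices show up there.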


So stateful async would be writing IO. You've passed in a buffer and the length to copy from it. In the continuation, you'd need to know which original call you were working with, so that you can correlate it with those parameters you passed through.

    var state = socket.read(buffer);
    while (!state.poll()) {}
    state.bytesRead...
Stateless async is accepting a connection. In 95% of servers, you just care that a connection was accepted; you don't have any state that persists across the continuation:

    while (!listeningSocket.poll()) {}
    var socket = listeningSocket.accept();
Stateless async skirts around many of the issues that Rust async can have (Pin etc. only have to exist because of state).


> the resource can wake the scheduler at any time and tell it to poll

Isn't that called interrupting?

The terminology seems a little off here, but perhaps that is only my perception.


No, it's not. Interrupting is when you call into the scheduler at any time, even when it's doing some other work. When it's idle, it cannot be interrupted.

Interruptions are something one really tries to restrict to the hardware and kernel layers, because when people write interrupt handlers, they almost always write them wrong.


To elaborate on what it is rather than what it is not, when implementing poll based IO with rust async, typically you have code like “select(); waker.wake()” on a worker thread. Select blocks. Waking tells the executor to poll the related future again, from the top of its tree. The waker implementation may indeed cause an executor thread to stop waiting, it depends on the implementation. It could also be the case that the executor is already awake and the future is simply added to a synchronised queue. Etc. You can implement waking however you like, and technically this could involve an interruptible scheduler, if you really wanted. But you would kinda have to write that.
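
As a bare-bones sketch of that worker-thread pattern (the channel stands in for select()/epoll_wait(); everything here is illustrative):

    use std::sync::mpsc::Receiver;
    use std::task::Waker;

    // Reactor worker: block until the OS reports a registered source as
    // ready, then wake the task that registered interest in it.
    fn reactor_loop(ready: Receiver<Waker>) {
        while let Ok(waker) = ready.recv() {
            // Tells the executor to poll the associated future again from
            // the top of its tree; whether this unparks an executor thread
            // or just pushes onto a queue is up to the waker implementation.
            waker.wake();
        }
    }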


> Instead of carefully weighing advantages and disadvantages of both models, the decision was effectively made on the ground of "we want to ship async support as soon as possible" [1].

That is not an accurate summary of that comment. withoutboats may have been complaining about someone trying to revisit the decision made in 2015-2016, but as the comment itself points out, there were good reasons for that decision.

Mainly two reasons, as far as I know.

First, Rust prefers unique ownership and acyclic data structures. You can make cyclic structures work if you use RefCell and Rc and Weak, but you're giving up the static guarantees that the borrow checker gives you in favor of a bunch of dynamic checks for 'is this in use' and 'is this still alive', which are easy to get wrong. But a completion-based model essentially requires a cyclic data structure: a parent future creates a child future (and can then cancel it), which then calls back to the parent future when it's complete. You might be able to minimize cyclicity by having the child own the parent and treating cancellation as a special case, but then you lose uniqueness if one parent has multiple children.

(Actually, even the polling model has a bit of cyclicity with Wakers, but it's kept to an absolute minimum.)

Second, a completion-based model makes it hard to avoid giving each future its own dynamic allocation, whereas Rust likes to minimize dynamic allocations. (It also requires indirect calls, which is a micro-inefficiency, although I'm not convinced that matters very much; current Rust futures have some significant micro-inefficiencies of their own.) The 2016 blog post linked in the comment goes into more detail about this.
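
To illustrate the single-allocation point, here is a minimal hand-written sketch of how polled futures nest by value (CountDown is a made-up leaf future; real combinators are more general):

    use std::future::Future;
    use std::pin::Pin;
    use std::task::{Context, Poll};

    // Made-up leaf future: ready after being polled `n` times.
    struct CountDown(u32);

    impl Future for CountDown {
        type Output = ();
        fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
            if self.0 == 0 {
                Poll::Ready(())
            } else {
                self.0 -= 1;
                cx.waker().wake_by_ref(); // ask to be polled again
                Poll::Pending
            }
        }
    }

    // The parent embeds both children by value: the whole task is one
    // struct, so spawning it costs at most one allocation, and dropping
    // the parent synchronously drops (i.e. cancels) both children.
    struct Both {
        a: CountDown,
        b: CountDown,
        a_done: bool,
        b_done: bool,
    }

    impl Future for Both {
        type Output = ();
        fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
            let this = &mut *self; // fine here: every field is Unpin
            if !this.a_done {
                this.a_done = Pin::new(&mut this.a).poll(cx).is_ready();
            }
            if !this.b_done {
                this.b_done = Pin::new(&mut this.b).poll(cx).is_ready();
            }
            if this.a_done && this.b_done { Poll::Ready(()) } else { Poll::Pending }
        }
    }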

As you might guess, I find those reasons compelling, and I think a polling-based model would still be the right choice even if Rust's async model were being redesigned from scratch today. Edit: Though to be fair, the YouTube video linked from withoutboats' comment does mention that mio decided on polling simply because that's what worked best on Linux at the time (pre-io_uring), and that had some influence on how Futures ended up. But only some.

…That said, I do agree Pin was rushed and has serious problems.


> First, Rust prefers unique ownership and acyclic data structures. You can make cyclic structures work if you use RefCell and Rc and Weak, but you're giving up the static guarantees that the borrow checker gives you in favor of a bunch of dynamic checks for 'is this in use' and 'is this still alive',

One way to get around that is, instead of the usual structureless async approach, to impose restrictions on the lifetimes of async entities. For example, if you use the ideas of the structured concurrency movement ([x]; the idea has since been picked up by Kotlin, Swift and other projects), then the parent is guaranteed to live longer than any child, solving most of the problem that way.

[x] https://vorpus.org/blog/notes-on-structured-concurrency-or-g...


The structured parallelism movement predates structured concurrency by years and needs a similar push.

It's from 2004, with the X10 parallel language, and then Habanero in Java via their "async-finish" construct.

I wonder if your quote (structured XYZ or ABC considered harmful) is inspired by

- https://conf.researchr.org/details/etaps-2019/places-2019-pa...


Well, the link I posted is older than your link from 2019, so I doubt that. The other direction may be possible.

However, neither is definitely the first to have ideas along these lines - structures and logic like this can also be found in Erlang supervisors, after all.

And as for the quote, it is quite explicitly referring to Dijkstra and structured programming constructs in nonconcurrent settings.



No doubt. In any case, the author notes his main influences in footnote 3, and those are not part of that. It seems he has a more practical than academic background in this.


The author of the talk, Vivek Sarkar, is a co-author of both the X10 paper from 2004 and the Habanero paper of 2011.

That's why those are not part of his influences; he co-wrote those papers.


I was referring to the author of the vorpus post I posted.


Ah I see.


>That is not an accurate summary of that comment.

How is that so, if he explicitly writes:

> Suggestions that we should revisit our underlying futures model are suggestions that we should revert back to the state we were in 3 or 4 years ago, and start over from that point. <..> Trying to provide answers to these questions would be off-topic for this thread; the point is that answering them, and proving the answers correct, is work. What amounts to a solid decade of labor-years between the different contributors so far would have to be redone again.

How should I read it except as "we did the work on the poll-based model, so we don't want the results to go down the drain in case the completion-based model turns out to be superior"?

I don't agree with your assertion regarding cyclic structures and the need for dynamic allocations in the completion-based model. Both models result in approximately the same cyclicity of task states - no wonder, since task states are effectively size-bounded stacks. In both models you have more or less the same finite state machines. The only difference is in how those FSMs interact with the runtime, and in the fact that in the completion-based model you usually pass ownership of part of a task's state to the runtime during task suspension. So you cannot simply drop a task if you no longer need its results; you have to explicitly request its cancellation from the runtime.


> How is to so, if he explicitly writes:

There's a difference between "we decided this 3 years ago" and "we rushed the decision". At this point, it's no longer possible to weigh the two models on a neutral scale, because changing the model would cause a huge amount of ecosystem churn. But that doesn't mean they weren't properly weighed in the first place.

Regarding cyclicity… well, consider something like a task running two sub-tasks at the same time. That works out quite naturally in a polling-based model, but in a completion-based model you have to worry about things like 'what if both completion handlers are called at the same time', or even 'what if one of the completion handlers ends up calling the other one'.

Regarding dynamic allocations… well, what kind of desugaring are you thinking of? If you have

    async fn foo(input: u32) -> String;
then a simple desugaring could be

    fn foo(input: u32, completion: Arc<dyn FnOnce(String)>);
but then the function has to be responsible for allocating its own memory.

Sure, there are alternatives. We could do...

    struct Foo { /* state */ }
    impl Foo {
        fn call(self: Arc<Self>, input: u32, completion: Arc<dyn FnOnce(String)>);
    }
Which by itself is no better; it still implies separate allocations. But then I suppose we could have an `ArcDerived<T>` which acts like `Arc<T>` but can point to a part of a larger allocation, so that `self` and `completion` could be parts of the same object.
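
For concreteness, a rough sketch of what that hypothetical `ArcDerived<T>` might look like (this type doesn't exist anywhere; note it already needs a vtable'd owner and unsafe code):

    use std::any::Any;
    use std::sync::Arc;

    // Keeps the enclosing allocation alive while pointing at one part of it.
    struct ArcDerived<T> {
        _owner: Arc<dyn Any + Send + Sync>, // keeps the whole object alive
        part: *const T,                     // points into that allocation
    }

    impl<T> ArcDerived<T> {
        fn new<O: Any + Send + Sync>(owner: Arc<O>, project: fn(&O) -> &T) -> Self {
            let part = project(&*owner) as *const T;
            ArcDerived { _owner: owner, part }
        }
    }

    impl<T> std::ops::Deref for ArcDerived<T> {
        type Target = T;
        fn deref(&self) -> &T {
            // Sound only because `_owner` keeps the allocation alive and
            // the pointer came from a shared borrow of it.
            unsafe { &*self.part }
        }
    }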

However, in that case, how do you deal with borrowed arguments? You could rewrite them to Arc, I suppose. But if you must use Arc, performance-wise, ideally you want to be moving references around rather than actually bumping reference counts. You can usually do that if there's just `self` and `completion`, but not if there are a bunch of other Arcs.

Also, what if the implementation misbehaved and called `completion` without giving up the reference to `self`? That would imply that any further async calls by the caller could not use the same memory. It's possible to work around this, but I think it would start to make the interface relatively ugly, less ergonomic to implement manually.

Also, `ArcDerived` would have to consist of two pointers and there would have to be at least one `ArcDerived` in every nested future, bloating the future object. But really you don't want to mandate one particular implementation of Arc, so you need a vtable, but that means indirect calls and more space waste.

Most of those problems could be solved by making the interface unsafe and using something with more complex correctness requirements than Arc. But the fact that current async fns desugar to a safe interface is a significant upside. (...Even if the safety must be provided with a bunch of macros, thanks to Pin not being built into the language.)


>There's a difference between "we decided this 3 years ago" and "we rushed the decision".

As far as I understand the situation, the completion-based API simply was not on the table 3 years ago. io-uring was not a thing, and there was negligible interest in properly supporting IOCP. So when a viable alternative appeared right before stabilization of the epoll-centric API that had been developed, the 3-year-old decision was not properly reviewed in light of the changed environment; instead, the team pushed forward with stabilization.

>because changing the model would cause a huge amount of ecosystem churn.

No, the discussion happened before stabilization (it's literally in the stabilization issue). Most of the ecosystem at the time was on futures 0.2.

Regarding your examples, I think you're simply looking at the problem from the wrong angle. In my opinion the compiler should not desugar async fns into ordinary functions; instead it should construct explicit FSMs out of them. So there's no need for Arcs: the String would be stored directly in the "output" FSM state generated for foo. Yes, this approach is harder for the compiler, but it opens up some optimization opportunities, e.g. around the trade-off between FSM "stack" size and the number of copies which state transition functions have to do. AFAIK right now Rust uses "dumb" enums, which can be quite sub-optimal, i.e. they always minimize the "stack" size at the expense of additional data copies, and they do not reorder fields in the enum variants to minimize copies.

In your example with two sub-tasks a generated FSM could look like this (each item is a transition function):

1) initialization [0 -> init_state]: create requests A and B

2) request A is complete [init_state -> state_a]: if request B is complete do nothing, else mark that request A is complete and request cancellation of task B, but do not change layout of a buffer used by request B.

3) cancellation of B is complete [state_a -> state_c]: process data from A, perform data processing common for branches A and B, create request C. It's safe to overwrite memory behind buffer B in this handler.

4) request B is complete [init_state -> state_b]: if request A is complete do nothing, else mark that request B is complete and request cancellation of task A, but do not change layout of a buffer used by request A.

5) cancellation of A is complete [state_b -> state_c]: process data from B, perform data processing common for branches A and B, create request C. It's safe to overwrite memory behind buffer A in this handler.

(This FSM assumes that it's legal to request cancellation of a completed task)

Note that handlers 2 and 4 cannot be called at the same time, since they are bound to the same ring and thus executed on the same thread. One completion handler simply cannot call another handler, since they are part of the same FSM and only one FSM transition function can be executed at a time. At first glance all those states and transitions look like unnecessary complexity, but I think this is how a proper select should work under the hood.
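
A rough Rust-flavored sketch of transitions 2 and 3 from the list above (the Ring trait and op ids are hypothetical, and the FSM is stripped down):

    // Hypothetical runtime handle for submitting cancellation requests.
    trait Ring {
        fn submit_cancel(&mut self, op_id: u64);
    }

    const OP_B: u64 = 2; // made-up submission id for request B

    enum State {
        Init,   // requests A and B in flight
        AWon,   // A done, cancellation of B in flight
        Common, // common processing done, request C in flight
    }

    struct Select {
        state: State,
        buf_a: Vec<u8>, // kernel may still be writing here until A settles
        buf_b: Vec<u8>, // kernel may still be writing here until B settles
    }

    impl Select {
        // 2) request A is complete [Init -> AWon]
        fn on_a_complete(&mut self, ring: &mut dyn Ring) {
            if let State::Init = self.state {
                self.state = State::AWon;
                ring.submit_cancel(OP_B); // do NOT touch buf_b yet
            } // else: B already won, nothing to do
        }

        // 3) cancellation of B is complete [AWon -> Common]
        fn on_b_cancelled(&mut self) {
            // Only now is it safe to reuse or free buf_b.
            self.buf_b.clear();
            // ...process buf_a, run the common code, submit request C...
            self.state = State::Common;
        }
    }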


> As far as I understand the situation, the completion-based API simply was not on the table 3 years ago

Completion APIs were always considered. They are just significantly harder for Rust to support.


Can you provide any public sources for that? From what I've seen, the Rust async story was always developed primarily around epoll.


Alex Crichton started with a completion based Future struct in 2015. It was even (unstable) in std in 1.0.0:

https://doc.rust-lang.org/1.0.0/std/sync/struct.Future.html

Our async IO model was based on the Linux industry standard (then and now) epoll, but that is not at all what drove the switch to a polling based model, and the polling based model presents no issues whatsoever with io-uring. You do not know what you are talking about.


>Our async IO model was based on the Linux industry standard (then and now) epoll, but that is not at all what drove the switch to a polling based model

Can you provide a link to a design document or at the very least to a discussion with motivation for this switch outside of the desire to be as compatible as possible with the "Linux industry standard"?

>the polling based model presents no issues whatsoever with io-uring

No issues with io-uring compatibility - to such an extent that you wrote a whole blog post about those issues: https://boats.gitlab.io/blog/post/io-uring/

IIUC the best solutions right now are either to copy data around (bye-bye zero-cost) or to use another Pin-like awkward hack with executor-based buffer management, instead of using simple and familiar buffers which are part of the future's state.


https://aturon.github.io/blog/2016/09/07/futures-design/

The completion based futures that Alex started with were also based on epoll. The performance issues it presented had nothing to do with any sort of impedance mismatch between a completion based future and epoll, because there is no impedance issue. You are confused.


Thank you for the link! But immediately we can see the false equivalence: a completion-based API does not imply the callback-based approach. The article critiques the latter, but not the former. Earlier in this thread I've described how I see a completion-based model built on top of FSMs generated by the compiler from async fns. In other words, the arguments presented in that article do not apply to this discussion.

>The performance issues it presented had nothing to do with any sort of impedance mismatch between a completion based future and epoll

Sorry, but what? Even aturon's article states zero-cost as one of the 3 main goals. So performance issues with strong roots in the selected model are a very big problem in my book.

>You do not know what you are talking about.

>You are confused.

Please, tone down your replies.


> Please, tone down your replies.

You cannot literally make extremely inflammatory comments about people's work, and accuse them of all sorts of things, and then get upset when they are mad about it. You've made a bunch of very serious accusations about multiple people's hard work, with no evidence, and with arguments that are shaky at best, on one of the largest and most influential forums in the world.

I mean, you can get mad about it, but I don't think it's right.


I found it highly critical but not inflammatory - though I'm not sure I'd've felt the same way had they been similarly critical of -my- code.

However, either way, responding with condescension (which is how the 'industry standard' thing came across) and outright aggression is never going to be constructive, and if that's the only response one is able to formulate then it's time to either wait a couple hours or ask somebody else to answer on your behalf instead (I have a number of people who are kind enough to do that for me when my reaction is sufficiently exothermic to make posting a really bad idea).

boats-of-a-year-ago handled a similar situation much more graciously here - https://news.ycombinator.com/item?id=22464629 - so it's entirely possible this is a lockdown fatigue issue - but responding to calmly phrased criticism with outright aggression is still pretty much never a net win, and defending that behaviour seems contrary to the tone the Rust team normally tries to set for discussions.


Of course I was more gracious to pornel - that remark was uncharacteristically flippant from a contributor who is normally thoughtful and constructive. pornel is not in the habit of posting that my work is fatally flawed because I did not pursue some totally unviable vaporware proposal.


I am not mad; it was nothing more than an attempt to urge a more civil tone from boats. If you both think that such a tone is warranted, then so be it. But it does affect my (really high) opinion of you.

I do understand the pain of having your dear work harshly criticized. I have experienced it many times in my career. But my critique was intended as tough love for a language in which I am heavily invested. If you see my comments as only "extremely inflammatory"... Well, it's a shame I guess, since it's not the first case of the Rust team unnecessarily rushing something (see the 2018 edition debacle), so I guess such an attitude only increases the rate at which Rust accumulates mistakes.


I do not doubt that you care about Rust. Civility, though, is a two-way street. Just because you phrase something in a way that has a more neutral tone does not mean that the underlying meaning cannot be inflammatory.

"Instead of carefully weighing advantages and disadvantages of both models," may be written in a way that more people would call "civil," but is in practice a direct attack on both the work, and the people doing the work. It is extremely difficult to not take this as a slightly more politely worded "fuck you," if I'm being honest. In some sense, that it is phrased as being neutral and "civil" makes it more inflammatory.

You can have whatever opinion that you want, of course. But you should understand that the stuff you've said here is exactly that. It may be politely worded, but is ultimately an extremely public direct attack.


>Earlier in this thread I've described how I see a completion-based model built on top of FSMs generated by compiler from async fns. In other words, the arguments presented in that article do not apply to this discussion.

I've been lurking on your responses, but now I'm confused. If you are not using a callback-based approach, then what are you using? Rust's FSM approach is predicated on polling; in other words, if you aren't using callbacks, then how do you know that Future A has finished? If the answer is to use Rust's current systems, then that means the FSM is "polled" periodically, and then you still have the "async Drop" problem as described in withoutboats' notorious article; furthermore, you haven't really changed Rust's design.

Edit: As I've seen you mention in other threads, you need a sound design for async Drop for this to work. I'm not sure this is possible in Rust 1.0 (as Drop isn't currently required to run in safe Rust). That said, it's unfair to call async "rushed" when your proposed design wouldn't even work in Rust 1.0. I'd be hesitant to call the design of the entire language rushed just because it didn't include linear types.


I meant the callback based approach described in the article, for example take this line from it:

>Unfortunately, this approach nevertheless forces allocation at almost every point of future composition, and often imposes dynamic dispatch, despite our best efforts to avoid such overhead.

It clearly does not apply to the model which I've described earlier.

Of course, the described FSM state transition functions can be rightfully called callbacks, which adds a certain amount of confusion.

I can agree with the argument that a proper async Drop cannot be implemented in Rust 1.0, so we have to settle for a compromise solution. Same with proper self-referential structs vs Pin. But I would like to see this argument explicitly stated, with sufficient backing for the impossibility claims.


>Of course, the described FSM state transition functions can be rightfully called callbacks, which adds a certain amount of confusion.

No, I'm not talking about the state transition functions. I'm talking about the runtime - the thing that will call the state transition function. In the current design, abstractly, the runtime polls/checks every future to see if it's in a runnable state, and if so executes it. In a completion based design the future itself tells the runtime that the value is ready (either driven by a kernel thread, another thread or some other callback). (Conceptually the difference is: in a poll based design the future calls waker.wake(), and in a completion one the future just calls the callback fn.) Aaron has already described why that is a problem.

The confusion I have is that both would have problems integrating io_uring into Rust (due to the Drop problem, as Rust has no concept of the kernel owning a buffer), but your proposed solution seems strictly worse, as it requires async Drop to be sound, which is not guaranteed by Rust; that would make it useless for programs that are being written today. As a result, I'm having trouble accepting that your criticism is actually valid - what you seem to be arguing is that async/await should never have been stabilized in Rust 1.0, which I believe is a fair criticism, but it isn't one that indicates that the current design was rushed.

Upon further thought, I think your design ultimately requires futures to be implemented as a language feature rather than a library (e.g. it is not possible, with the current trait system, for the future itself to expose multiple state transition functions without allocating), which wouldn't have worked without forking Rust during the prototype stage.


>In an completion based design the future itself tells the runtime that the value is ready

I think there is a misunderstanding. In a completion-based model (read: io-uring, but I think IOCP behaves similarly, though I am less familiar with it) it's the runtime that "notifies" tasks about completed IO requests. In io-uring you have two queues represented by ring buffers shared with the OS. You add submission queue entries (SQEs) to the first buffer, describing what you want the OS to do; the OS reads them, performs the requested job, and places completion queue events (CQEs) for completed requests into the second buffer.

So in this model a task (a Future in your terminology) registers an SQE (the registration process may be proxied via the user-space runtime) and suspends itself. Let's assume for simplicity that only one SQE was registered for the task. After the OS sends the CQE for the request, the runtime finds the correct state transition function (via meta-information embedded in the SQE, which gets mirrored into the relevant CQE) and simply executes it; the requested data (if it was a read) will already be filled into a buffer which is part of the FSM state, so there's no need for additional syscalls or interactions with the runtime to read this data!
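
A heavily simplified sketch of that flow (these structs abbreviate the real kernel ABI; the point is the user_data round trip):

    use std::collections::HashMap;

    // Abbreviated stand-ins for the real io_uring SQE/CQE layouts.
    struct Sqe { opcode: u8, fd: i32, addr: u64, len: u32, user_data: u64 }
    struct Cqe { user_data: u64, result: i32 }

    struct Task;
    impl Task {
        // The state transition function: the data is already in the
        // task's buffer when this runs, so no further syscall is needed.
        fn on_complete(&mut self, _result: i32) { /* advance the FSM */ }
    }

    const OP_READ: u8 = 22; // IORING_OP_READ (check current kernel headers)

    fn submit_read(sq: &mut Vec<Sqe>, fd: i32, buf: &mut [u8], task_id: u64) {
        sq.push(Sqe {
            opcode: OP_READ,
            fd,
            addr: buf.as_mut_ptr() as u64, // kernel owns this buffer until the CQE
            len: buf.len() as u32,
            user_data: task_id, // mirrored verbatim into the matching CQE
        });
    }

    fn dispatch(cq: &[Cqe], tasks: &mut HashMap<u64, Task>) {
        for cqe in cq {
            if let Some(task) = tasks.get_mut(&cqe.user_data) {
                task.on_complete(cqe.result);
            }
        }
    }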

If you are familiar with embedded development, this should sound quite familiar, since it's roughly how hardware interrupts work as well! You register a job (e.g. a DMA transfer), a dedicated hardware block does it, and a registered callback is notified after the job is done. Of course, it's quite an oversimplification, but the fundamental similarity is there.

>I think your design ultimately requires futures to be implemented as a language feature, rather than a library

I am not sure if this design would have had a Future type at all, but you are right, the advocated approach requires a deeper integration with the language compared to the stabilized solution. Though I disagree with the opinion that it would've been impossible to do in Rust 1.


Doesn't work because it relies on caller-managed buffers. See withoutboats' post: https://without.boats/blog/io-uring/


It does not work in the current version of Rust, but it's not a given that a backwards-compatible solution could not have been designed, e.g. via deeper integration of async tasks with the language or by adding proper linear types - hence all the discussions around a reliable async Drop. The linked blog post takes for granted that we should be able to drop futures at any point in time, which, while convenient, has a lot of implications.


What happens if you drop the task between 1 and 2? Does dropping block until the cancellation of both tasks is complete?


As I've mentioned several times, in this model you cannot simply "drop the task" without running its asynchronous Drop. Each state in the FSM would be generated with a "drop" transition function, which may include asynchronous cancellation requests (i.e. cleanup can span more than one transition function and may represent a mini sub-FSM). This would require more fundamental changes to the language (same as with proper self-referential types), be it some kind of linear-type capability or deeper integration of runtimes with the language (so you would not be able to manipulate FSM states directly like any other data structure), since right now it's safe to forget anything and destructors are not guaranteed to run. IMO such changes would have made Rust a better language in the end.
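
A sketch of what such generated "drop transitions" might look like (entirely hypothetical; nothing like this exists in Rust today):

    enum TaskState {
        Reading { buf: Vec<u8> },         // kernel may write into `buf`
        CancelRequested { buf: Vec<u8> }, // drop began; `buf` still off-limits
        Gone,                             // cancellation confirmed; `buf` freed
    }

    const OP_READ_ID: u64 = 1; // made-up submission id

    impl TaskState {
        // First "drop" transition: instead of freeing the buffer now,
        // ask the runtime to cancel the in-flight operation.
        fn begin_drop(&mut self, submit_cancel: impl FnOnce(u64)) {
            if let TaskState::Reading { buf } = std::mem::replace(self, TaskState::Gone) {
                submit_cancel(OP_READ_ID);
                *self = TaskState::CancelRequested { buf };
            }
        }

        // Second "drop" transition, run when the cancellation CQE arrives:
        // only now is it actually safe to free the buffer.
        fn finish_drop(&mut self) {
            *self = TaskState::Gone;
        }
    }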


“Rust would have been a better language by breaking its stability guarantees” is just saying “Rust would have been a better language by not being Rust.” Maybe true, but not relevant to the people whose work you’ve blanket criticized. Rust language designers have to work within the existing language and your arguments are in bad faith if you say “async could have been perfect with all this hindsight and a few breaking language changes”.


I do not think the impossibility of a reliable async Drop in Rust 1 was a proven thing (prior to the stabilization of async in its current form). Yes, it might require some unpleasant additions, such as making Futures and async fns more special than they are right now, and implementing it would most likely have required a lot of work (at least on the same scale as was invested into the poll-based model), but that does not automatically make it impossible.


I don’t agree with this analysis TBH - async drop has been revisited multiple times recently with no luck. Without a clear path there I don’t know why that would seem like an option for async/await two years ago. Do you actually think the language team should have completely exhausted that option in order to try to require an allocator for async/await?

Async drop would still not address the single-allocation-per-state-machine advantage of the current design that you’ve mostly not engaged with in this thread.


>I don’t agree with this analysis TBH

No worries, I like when someone disagrees with me and argues his or her position well, since it's a chance for me to learn.

>async drop has been revisited multiple times recently with no luck

The key word is "recently", meaning "after stabilization". That's exactly my point: this problem was not sufficiently explored, in my opinion, prior to stabilization. I would've been fine with a well-argued position of "async Drop is impossible without breaking language changes, so we will not care about it", but now we are trying to shoehorn async Drop on top of the stabilized feature.

>Async drop would still not address the single-allocation-per-state-machine advantage of the current design that you’ve mostly not engaged with in this thread.

I don't think you are correct here, please see this comment: https://news.ycombinator.com/item?id=26408524


I agree that Rust async is currently in a somewhat awkward state.

Don't get me wrong, it's usable and many projects use it to great effect.

But there are a few important features like async trait methods (blocked by HKT), async closures, async drop, and (potentially) existential types, that seem to linger. The unresolved problems around Pin are the most worrying aspect.

The ecosystem is somewhat fractured, partially due to a lack of commonly agreed abstractions, partially due to language limitations.

There also sadly seems to be a lack of leadership and drive to push things forward.

I'm ambivalent about the rushing aspect. Yes, async was pushed out the door. Partially due to heavy pressure from Google/Fuchsia and a large part of the userbase eagerly .awaiting stabilization.

Without stabilizing when they did, we very well might still not have async on stable for years to come. At some point you have to ship, and the benefits for the ecosystem can not be denied. It remains to be seen if the design is boxed into a suboptimal corner; I'm cautiously optimistic.

But what I disagree with is that polling was a mistake. It is what distinguishes Rust's implementation, and it provides significant benefits. A completion model would require a heavier, standardized runtime and associated inefficiencies like extra allocations and indirection, and would prevent efficiencies that emerge with polling. Being able to just locally poll futures without handing them off to a runtime, or to cheaply drop them, are big benefits.

Completion is the right choice for languages with a heavy runtime. But I don't see how having Rust dictate completion would make io_uring wrapping more efficient than implementing the same patterns in libraries.

UX and convenience are a different topic. Rust async will never be as easy to use as Go, or async in languages like JavaScript/C#. To me the whole point of Rust is providing abstractions that are as high-level and safe as possible without constraining the ability to achieve maximum efficiency. (How well that goal is achieved, or hindered by certain design patterns that are more or less dictated by the language design, is debatable, though.)


>A completion model would require a heavier, standardized runtime and associated inefficiencies like extra allocations and indirection, and prevent efficiencies that emerge with polling.

You are not the first person to use such arguments, but I don't see why they would be true. In my understanding both models would use approximately the same FSMs, which would just interact differently with the runtime (i.e. instead of registering a waker, you would register an operation on a buffer which is part of the task state). Maybe I am missing something, so please correct me if I am wrong in a reply to this comment: https://news.ycombinator.com/item?id=26407824


Is there a good explanation on the difference between polling model and completion model? (not Rust-specific)



I go into a lot of detail in this deck, here's a good starting slide: https://speakerdeck.com/trent/pyparallel-how-we-removed-the-...


Correct me if I'm wrong, but isn't any sort of async support non-integral to Rust? For example, in something like JavaScript you can't implement your own async. But in C, C++ or Rust you can do pretty much anything you want.

So if in the future io-uring and friends become the standard can't that just be a library you could then use?

Similar to how in C you don't need the standard library to do threads or async.


I agree. Completion-based APIs are more high-level, and not a good abstraction at the systems-language level. IOCP and io_uring use poll-based interfaces internally. In io_uring's case, the interfaces are basically the same ones available in user space. In Windows' case, IOCP uses interfaces that are private, but some projects have figured out the details well enough to implement decent epoll and kqueue compatibility libraries.

Application developers of course want much higher level interfaces. They don't want to do a series of reads; they want "fetch_url". But if "fetch_url" is the lowest-level API available, good luck implementing an efficient streaming media server. (Sometimes we end up with things like HTTP Live Streaming, a horrendously inefficient protocol designed for ease of use in programming environments, client- and server-side, that effectively only offer the equivalent of "fetch_url".)

Plus, IOCP models tend to heavily rely on callbacks and closures. And as demonstrated in the article, low-level languages suck at providing ergonomic first-class functions, especially if they lack GC. (It's a stretch to say that Rust even supports first-class functions.) If I were writing an asynchronous library in Rust, I'd do it the same way I'd do it in C--a low-level core that is non-blocking and stateful. For example, you repeatedly invoke something like "url_fetch_event", which returns a series of events (method, header, body chunk) or EAGAIN/EWOULDBLOCK. (It may not even pull from a socket directly, but rely on application to write source data into an internal buffer.) Then you can wrap that low-level core in progressively higher-level APIs, including alternative APIs suited to different async event models, as well as fully blocking interfaces. And if a high-level API isn't to some application developer's liking, they can create their own API around the low-level core API. This also permits easier cross-language integration. You can easily use such a low-level core API for bindings to Python, Lua, or even Go, including plugging into whatever event systems they offer, without losing functional utility.
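
A sketch of that layering in Rust (the names are made up; this is the "sans-IO" style: the core never blocks and never touches a socket itself):

    // Events the low-level core emits as the caller drains it.
    enum FetchEvent<'a> {
        Method(&'a str),
        Header(&'a str, &'a [u8]),
        BodyChunk(&'a [u8]),
        WouldBlock, // the EAGAIN/EWOULDBLOCK analogue
        Done,
    }

    struct UrlFetch {
        input: Vec<u8>, // the application writes source data in here
        // ...parser state...
    }

    impl UrlFetch {
        fn push_input(&mut self, bytes: &[u8]) {
            self.input.extend_from_slice(bytes);
        }

        // Repeatedly invoked by whatever wrapper sits on top: a blocking
        // API, an async API, or bindings for another language.
        fn next_event(&mut self) -> FetchEvent<'_> {
            // Real parsing elided; a fresh core has nothing to report yet.
            FetchEvent::WouldBlock
        }
    }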

It's the same principle with OS and systems language interfaces--you provide mechanisms that can be built upon. But so many Rust developers come from high-level application environments, including scripting language environments, where this composition discipline is less common and less relevant.


> IOCP models tend to heavily rely on callbacks and closures

While perhaps higher level libraries are written that way, I can’t think of a reason why the primitive components of IOCP require callbacks and closures. The “poll for io-readiness and then issue non-blocking IO” and “issue async IO and then poll for completion” models can be implemented in a reactor pattern in a similar manner. It is just a question of whether the system call happens before or after the reactor loop.

EDIT: Reading some of the other comments and thinking a bit, one annoying thing about IOCP is the cancelation model. With polling IO readiness, it is really easy to cancel IO and close a socket: just unregister from epoll and close it. With IOCP, you will have to cancel the in-flight operation and wait for the completion notification to come in before you can close a socket (if I understand correctly).

Anyways, I've been playing around with implementing some async socket APIs on top of IOCP for Windows in Rust [1]. Getting the basic stuff working is relatively easy. Figuring out a cancellation model is going to be a bit more difficult. And ultimately I think it would be cool if the threads polling the completion ports could directly execute the wakers in such a way that the future could be polled inline, but getting all the lifetimes right is making my head hurt.

[1] https://github.com/AustinWise/rust-windows-io


Yes, you can encode state machines manually, but it will be FAR less ergonomic than the async syntax. Rust started with a library-based approach, but it was... not great. Async code was littered with and_then methods, and it was really close to the infamous JS callback hell. The ergonomic improvement that async/await brings is essentially the raison d'être for incorporating this functionality into the language.
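
To show the contrast (fetch and parse here are hypothetical stand-ins; the combinator line is futures-0.1 flavored, from memory):

    // Stand-in async fns, just to make the example self-contained.
    async fn fetch(_url: &str) -> Result<Vec<u8>, ()> { Ok(Vec::new()) }
    async fn parse(_resp: Vec<u8>) -> Result<String, ()> { Ok(String::new()) }

    // Combinator style read roughly like:
    //     fetch(url).and_then(parse).and_then(store).map_err(log)
    // while the sugared version reads top to bottom:
    async fn handler(url: &str) -> Result<String, ()> {
        let resp = fetch(url).await?;
        let data = parse(resp).await?;
        Ok(data)
    }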


Interesting!

For comparison, Haskell went with the library approach but has the syntactic sugar of the equivalent of `and_then` built into the language. (I am talking about Monads and do-notation.)

It's a bit like iterating in Python: for-loops are a convenient syntactic sugar to something that can be provided by a library.


It is a little old, but for the general gist of it, http://aturon.github.io/tech/2018/04/24/async-borrowing/ is an amazing description of the problem here.

It is a PhD-level research problem to know whether monads and do notation would be able to work in Rust. The people who are most qualified to look into it (incidentally, a lot of the same crew that was working on async) believe that it may literally be impossible.


My impression is that some of them even burnt out in the slog of a supposedly rushed process to land async/await!


Yes, a number of people have suggested introducing a general do notation (or an analog of it) instead of the use-case-specific async/await syntax, but since Rust does not have proper higher-kinded types (and some Rust developers say it never will), such proposals have been deemed impractical.


It’s not just HKTs. Figuring out how to handle Fn, FnOnce, FnMut is a whole other can of worms.


I think there were some (related, I think!) proposals here: https://github.com/rust-lang/lang-team/issues/23 https://github.com/rust-lang/rfcs/issues/3000


Can't Rust come up with a new syntax (one that matches the io_uring idea better) and deprecate the old one? Or simply replace the implementation, keeping the old syntax, if it's semantically the same?


It could, although I highly doubt that any deficiencies with the current implementation of async/await are so severe as to warrant anything so dramatic.


FWIW I've done a fair bit of research with io_uring. For file operations it's fast, but over epoll the speedups are negligible. The creator is a good guy, but they're having issues with the performance numbers being skewed due to various deficiencies in the benchmark code, such as skipping error checks in the past.

Also, io_uring can certainly be used via polling. Once the shared rings are set up, no syscalls are necessary afterward.


We've briefly been playing with io_uring (in async Rust) for a network service that is CPU-bound and seems to be bottlenecked on context switches. In a very synthetic comparison, the io_uring version seemed very promising (as in "it may be worth rewriting a production service to target an experimental IO setup"). We ran out of the allocated time before we got to something closer to a real-world benchmark, but I'm fairly optimistic that even for non-file operations there are real performance gains in io_uring for us.

I'm not sure io_uring polling counts as polling, since you're really just polling for completions; you still have all the completion-based-IO properties, like in-flight operations essentially owning their buffers.


Yes, I should have specified - in theory io_uring is much faster and less resource intensive. With the right polish, it can certainly be the next iteration of I/O syscalls.

That being said, you have to restructure a lot of your application to be io_uring-ready in order to reap the most gains. In theory, you'll also have to be a bit pickier with CPU affinities, namely when using SQPOLL (submission queue polling), which creates a kernel thread. Too much contention means such facilities will actually slow you down.

The research is changing weekly and most of the exciting stuff is still on development branches, so tl;dr (for the rest of the readers) if you're on the fence, best stick to epoll for now.


This post is completely and totally wrong. At least you got to ruin my day; I hope that's a consolation prize for you.

There is NO meaningful connection between the completion vs polling futures model and the epoll vs io-uring IO models. comex's comments regarding this fact are mostly accurate. The polling model that Rust chose is the only approach that has been able to achieve single allocation state machines in Rust. It was 100% the right choice.

After designing async/await, I went on to investigate io-uring and how it would be integrated into Rust's system. I have a whole blog series about it on my website: https://without.boats/tags/io-uring/. I assure you, the problems it presents are not related to Rust's polling model AT ALL. They arise from the limits of Rust's borrow system in describing dynamic loans across the syscall boundary (i.e. it cannot describe this). A completion model would not have made it possible to pass a lifetime-bound reference into the kernel and guarantee no aliasing. But all of them have fine solutions building on work that already exists.

Pin is not a hack any more than Box is. It is the only way to fit the desired ownership expression into the language that already exists, squaring these requirements with other desirable primitives we had already committed to: shared ownership pointers, mem::swap, etc. It is simply FUD - frankly, a lie - to say that it will block "noalias"; following that link shows Niko and Ralf having a fruitful discussion about how to incorporate self-referential types into our aliasing model. We were aware of this wrinkle before we stabilized Pin - I had conversations with Ralf about it - it's just that now that we want to support self-referential types in some cases, we need to do more work to incorporate it into our memory model. None of this is unusual.

And none of this was rushed. Ignoring the long prehistory, a period of 3 and a half years stands between the development of futures 0.1 and the async/await release. The feature went through a grueling public design process that burned out everyone involved, including me. It's not finished yet, but we have an MVP that, contrary to this blog post, does work just fine, in production, at a great many companies you care about. Moreover, getting a usable async/await MVP was absolutely essential to getting Rust the escape velocity to survive the ejection from Mozilla - every other funder of the Rust Foundation finds async/await core to their adoption of Rust, as does every company that is now employing teams to work on Rust.

Async/await was, both technically and strategically, as well executed as possible under the circumstances of Rust when I took on the project in December 2017. I have no regrets about how it turned out.

Everyone who reads Hacker News should understand that the content you're consuming is usually from one of these kinds of people: a) dilettantes, who don't have a deep understanding of the technology; b) cranks, who have some axe to grind regarding the technology; c) evangelists, who are here to promote some other technology. The people who actually drive the technologies that shape our industry don't usually have the time and energy to post on these kinds of things, unless they get so angry about how their work is being discussed, as I am here.


Thank you for this post. I have been interested in Rust because of Matrix, and although I found it a bit more intimidating than Go to toy with, I was inclined to try it on a real project over Go because it felt like the closest to the hardware while not having the memory risks of C. The coroutines/async was/is the most daunting aspect of Rust, and a post with a sensational title like the grandparent's could have swayed me the other way.

As an aside, it would be great to have some sort of federated cred (meritocratic in some way) on Hacker News, instead of a flat democratic populist point system; it would lessen the potential eternal-September effect.

I would love to see a personal meta-pointing system; it could be on a wrapping site: if I downvote a "waste of hackers' daytime" article (say, a long-form article about what is life) in my "daytime" profile, I get a feed weighted by the downvotes of other users who also downvoted this item - basically using peers that vote like you as a pre-filter. I could have multiple filters: one for a quick daytime hacker scan, and one for leisure factoids. One could even meta-meta-vote and give some other hacker's handle a heavier weight...


To hopefully make your day better.

I for one, amongst many people I am sure, am deeply grateful for the work of you and your peers in getting this out!


I second this. Withoutboats has done some incredible work for Rust.


For what it's worth, I agree 100% with the premise of withoutboats' post, based on the experience of having worked a little on Zig's event loop.

My recommendation to people that don't see how ridiculous the original post was, is to write more code and look more into things.


Please, calm down. I do appreciate your work on Rust, but people do make mistakes, and I strongly believe that in the long term the async stabilization was one of them. It's debatable whether async was essential for Rust; I agree it gave Rust a noticeable boost in popularity, but personally I don't think it was worth the long-term cost. I do not intend to change your opinion, but I will keep mine and reserve the right to speak about it publicly.

In this thread [1] we are having a more technical discussion about those models; I suggest continuing there.

>I assure you, the problems it presents are not related to Rust's polling model AT ALL.

I do not agree about all problems, but my OP was indeed worded somewhat poorly, as I've admitted here [2].

>Pin is not a hack any more than Box is. It is the only way to fit the desired ownership expression into the language that already exists

I can agree that it was the easiest solution, but I strongly disagree that it was the only one. And frankly it's quite disheartening to hear such absolutist statements from a tech leader.

>It is simply FUD - frankly, a lie - to say that it will block "noalias"

Where did I say "will"? I think you will agree that, at the very least, it will cause a delay. Also, the issue shows that Pin was not properly thought out, especially in light of the other safety issues it has caused. And as you can see from the other comments, I am not the only one who thinks so.

>the content you're consuming is usually from one of these kinds of people:

Please, satisfy my curiosity. To which category do I belong in your opinion?

[1]: https://news.ycombinator.com/item?id=26410359

[2]: https://news.ycombinator.com/item?id=26407565


> Please, calm down.

By the way, that will almost certainly be taken in a bad way. It's never a good idea to start a comment with something like "chill" or "calm down", as it feels incredibly dismissive.

> I do appreciate your work on Rust, but

There's a saying that anything before a "but" is meaningless.

This is not meant to critique the rest of the comment, just point out a couple parts that don't help in defusing the tense situation.


Thank you for the advice. Yes, I should've been more careful in my previous comment.

I have noticed this comment only after engaging with him in https://news.ycombinator.com/item?id=26410565 in which he wrote about me:

> You do not know what you are talking about.

> You are confused.

So my reaction was a bit too harsh partially due to that.


So why did you not present your own solutions to the issues you criticized, or better yet fix them with an RFC, rather than declaring a working system basically a failure (per your title)? I think you wouldn't have gotten 10% of the saltiness if you didn't have such an aggressive title on your article.


I tried to raise those issues in the stabilization issue (I know, quite late in the game), but it simply got shut down by the linked comment, with a clear message that further discussion, be it in an issue or in a new RFC, would be pointless.

Also please note that the article is not mine.


You know the F-35 is a disaster of a government project just from looking at it; why not submit a better design? That isn't helpful. You might be interested in the discussion from here: https://news.ycombinator.com/item?id=26407770


Please keep going, Rust is awesome and one of the few language projects trying to push the efficient frontier and not just rolling a new permutation of the trade-off dice.


I've jumped on the Rust bandwagon as part of ZeroTier 2.0 (not rewriting its core, but rewriting some service stuff in Rust and considering the core eventually). I've used a bit of async and while it's not as easy as Go (nothing is!) it's pretty damn ingenious for language-native async in a systems programming language.

I personally would have just chickened out on language native async in Rust and told people to roll their own async with promise patterns or something.

Ownership semantics are hairy in Rust and require some forethought, but that's also true in C and C++ and in those languages if you get it wrong there you just blow your foot off. Rust instead tells you that the footgun is dangerously close to going off and more or less prohibits you from doing really dangerous things.

My opinion on Rust async is that its warts are as much the fault of libraries as they are of the language itself. Async libraries are overly clever, falling into the trap of favoring code brevity over code clarity. I would rather have them force me to write just a little more boilerplate but have a clearer idea of what's going on than rely on magic voodoo closure tricks like:

https://github.com/hyperium/hyper/issues/2446

Compare that (which was the result of hours of hacking) to their example:

https://hyper.rs/guides/server/hello-world/

WUT? I'm still not totally 100% sure why mine works and theirs works, and I don't blame Rust. I'd rather have seen this interface (in hyper) implemented with traits and interfaces. Yes it would force me to write something like a "factory," but I would have spent 30 minutes doing that instead of three hours figuring out how the fuck make_service_fn() and service_fn() are supposed to be used and how to get a f'ing Arc<> in there. It would also result in code that someone else could load up and easily understand what the hell it was doing without a half page of comments.

The rest of the Rust code in ZT 2.0 is much clearer than this. It only gets ugly when I have to interface with hyper. Tokio itself is even a lot better.

Oh, and Arc<> gets around a lot of issues in Rust. It's not as zero-cost as Rc<> and Box<> and friends but the cost is really low. While async workers are not threads, it can make things easier to treat them that way and use Arc<> with them (as long as you avoid cyclic structures). So if async ownership is really giving you headaches try chickening out and using Arc<>. It costs very very little CPU/RAM and if it saves you hours of coding it's worth it.

Oh, and to remind people: this is a systems language designed to replace C/C++, not a higher level language, and I don't expect it to ever be as simple and productive as Go or as YOLO as JavaScript. I love Go too but it's not a systems language and it imposes costs and constraints that are really problematic when trying to write (in my case) a network virtualization service that's shooting (in v2.0) for tens of gigabits performance on big machines.


I skimmed some of this, but are you asking why you need to clone in the closure? Because "async closures" don't exist at the moment, the closest you can get is a closure that returns a future, this usually has the form:

   <F, Fut> where F: Fn() -> Fut, Fut: Future
i.e. you call some closure f that returns a future that you can then await on. When writing that out, it will look like:

   || {
       // closure
       async move {
           // returned future
       }
   }
`make_service_fn` likely takes something like this and puts it in a struct, then for every request it will call the closure to create the future that processes the request. (edit: and indeed it does; its definition literally takes your closure and uses it to implement the Service trait, which you are free to do yourself if you didn't want to write it this way https://docs.rs/hyper/0.14.4/src/hyper/service/make.rs.html#...)

The reason you need to clone in the closure is that the closure is what 'closes over' the scope and is able to capture the Arc reference you need to pass to your future. Whenever make_service_fn uses the closure you pass to it, it will call the closure, which can clone your Arc references, then create a future with those references "moved" in.

It's a little deceptive, as this means the exact same thing as above, just with the first set of curly braces omitted:

   || async move {}
This is still a closure which returns a Future. Does all of that make sense? Perhaps they could use a more explicit example, but it also helps to carefully read the type signature.
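
For instance, a self-contained version of the same pattern without hyper (`shared` here is a stand-in for whatever state you need in each request future):

    use std::sync::Arc;

    fn main() {
        let shared = Arc::new(String::from("shared state"));
        // The outer closure clones the Arc on every call; each returned
        // future owns its own clone.
        let make = move || {
            let shared = shared.clone(); // fresh clone, moved into the future
            async move {
                println!("using {}", shared);
            }
        };
        let _fut_a = make(); // each call yields an independent future
        let _fut_b = make();
    }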


Wait so you're saying "|| async move {}" is equivalent to "|| move { async move {} }"? If so then mystery solved, but that is not obvious at all and should be documented somewhere more clearly.

In that case all I'm doing vs. their example is explicitly writing the function that returns the promise instead of letting it be "inferred"?


Well, no, that second one isn't valid Rust; perhaps you mean:

   move || async move {} 
But this is not equivalent to:

  || async move {}
Crucially, the closure is not going to take ownership of anything. This is kind of beside the point though; what I'm getting at is that both of the above are a closure which returns a future, i.e. you can also write them in this style:

    || {
       return async move {};
    }
Maybe that's more clear with the explicit return?

I don't understand your second question about it being "inferred"; I never used that word. make_service_fn is a convenience function for implementing the Service trait.


Ohhh.... I think I get it. The root of my confusion is that BRACES ARE OPTIONAL in Rust closures.

This is apparently valid Rust:

    let func = || println!("foo!");

I didn't know that, which is why I thought "|| async move ..." was some weird form of pseudo-async-closure instead of what it is: a function that returns an async function.

Most of the code I see always uses braces in closures for clarity, but I now see that a lot of async code does not.


> I didn't know that, which is why I thought "|| async move ..." was some weird form of pseudo-async-closure instead of what it is: a function that returns an async function.

It does not return an async function, it is a closure that returns a future. Carefully read the function signature I had posted:

    fn foo<F, Fut>(f: F) where F: Fn() -> Fut, Fut: Future
async move {} is just a future; there is no function call. || is a closure; put them both together and you have a closure that returns a future.

edit: I'm trying to think of how else to explain this. A future is just a state machine, an expression; there is no function call.

   let f = async move { };
Is a valid future; you can f.await it just fine.


Is it just me, or are you supporting your parent's point of:

> ...the decision was effectively made on the ground of "we want to ship async support as soon as possible" [1].

When you write:

> Moreover, getting a usable async/await MVP was absolutely essential to getting Rust the escape velocity to survive the ejection from Mozilla...

This whole situation saddens me. I wish Mozilla could have given you guys more breathing room to work on such critical parts. Regardless, thank you for your dedication.


That is not a correct reading of the situation. async/await was not rushed, and does not have flaws that could have been solved with more time. async/await will continue to improve in a backward compatible way, as it already has since it was released in 2019.


You are awesome. Thank you for clarifying these things.


Thank you for your tremendous work!


In all this time, maestro Andrei Alexandrescu was right when he said Rust feels like it "skipped leg day" when it comes to concurrency and metaprogramming capabilities. Tim Sweeney was complaining about similar things, saying about Rust that it is one step forward, two steps backward. These problems will become evident at a later time, when it will already be too late. I will continue experimenting with Rust, but Zig seems to have some great things going on, especially the colourless functions and the comptime thingy. Its safety story does not disappoint either, even if it is not at Rust's level of guarantees.


In case anyone else was interested in the original sources for the quotes:

> Andrei Alexandrescu was right when he said Rust feels like it "skipped leg day" when it comes to concurrency and metaprogramming capabilities.

https://archive.is/hbBte (the original answer appears to have been deleted [0])

> Tim Sweeney was complaining about similar things, saying about Rust that it is one step forward, two steps backward.

https://twitter.com/timsweeneyepic/status/121381423618448588...

(He said "Kind of one step backward and one step forward", but close enough)

[0]: https://www.quora.com/Which-language-has-the-brightest-futur...


Thanks for the references. Indeed, Tim said one step forward, one backward, my bad. He posted it a long time ago.


And Zap (scheduler for Zig) is already faster than Tokio.

Zig and other recent languages have been invented after Rust and Go, so they could learn from them, while Rust had to experiment a lot in order to combine async with borrow checking.

So, yes, the async situation in Rust is very awkward, and doing something beyond a Ping server is more complicated than it could be. But that’s what it takes to be a pioneer.


> And Zap (scheduler for Zig) is already faster than Tokio.

I'm not necessarily doubtful, tokio isn't the fastest implementation of a runtime.

But can you point to a non-trivial benchmark that shows this?

Performance claims should always come with a verifiable benchmark.


Check out @kingprotty's Twitter posts and Zig Show presentations.


D and Zig have dynamically typed generics (templates/"comptime thingy"), while Rust has statically typed generics. A lot of people mistake this for Rust having less powerful generics. It's simply a different approach: the dynamic vs. static typing distinction, at the type level instead of the value level.
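A toy illustration of the static side (this is the Rust Book's classic example; the contrast with D/Zig is noted in the comment):

    // Rust type-checks the generic body once, against the declared bound;
    // without `PartialOrd` this function does not compile at all.
    fn largest<T: PartialOrd>(items: &[T]) -> &T {
        let mut max = &items[0];
        for item in items {
            if item > max {
                max = item;
            }
        }
        max
    }
    // A D template or Zig comptime function instead checks the body per
    // instantiation, erroring only for argument types that lack an ordering.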


Since you clearly have expertise, I'm curious if you might provide some insight into what would roughly be different in an async completion-based model & why that might be at fundamental odds with the event-based one? Like, is it an incompatibility with the runtime, or does it change the actual semantics of async/await in a fundamental way, to the point where you can't just swap out the runtime & reuse existing async code?


It's certainly possible to pave over the difference between models to a certain extent, but the resulting solution will not be zero-cost.

Yes, there is a fundamental difference between those models (otherwise we would not have two separate models).

In a poll-based model interactions between task and runtime look roughly like this:

- task to runtime: I want to read data on this file descriptor.

- runtime: FD is ready, I'll wake-up the task.

- task: great, FD is ready! I will read data from FD and then will process it.

While in a completion based model it looks roughly like this:

- task to runtime: I want data to be read into this buffer which is part of my state.

- runtime: the requested buffer is filled, I'll wake-up the task.

- task: great, the requested data is in the buffer! I can process it.

As you can see, the primary difference is that in the latter model the buffer becomes "owned" by the runtime/OS while the task is suspended. It means that you cannot simply drop a task if you no longer need its results, like Rust currently assumes. You have to either wait for the data read request to complete or (possibly asynchronously) request cancellation of this request. With the current Rust async, if you want to integrate with io-uring you have to use awkward buffers managed by the runtime, instead of simple buffers which are part of the task state.
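To make the hazard concrete, here is a sketch (submit_read is hypothetical, not a real API) of why dropping a future mid-read is unsound under a completion model:

    use std::io;
    use std::os::unix::io::RawFd;

    // `submit_read` stands in for a completion-based read submission
    async fn read_packet(fd: RawFd) -> io::Result<usize> {
        let mut buf = [0u8; 4096];                // buffer is part of the future's state
        let n = submit_read(fd, &mut buf).await?; // kernel now holds a pointer into it
        Ok(n)
        // if this future is dropped before the kernel completes the read, the
        // kernel keeps writing into memory that no longer exists: use-after-free
    }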

Even outside of integration with io-uring/IOCP, we have use-cases which require async Drop, and we currently don't have a good solution for it. So I don't think that the decision to allow dropping tasks without an explicit cancellation was a good one, even despite the convenience which it brings.


FWIW, I'd bet almost anything that this problem isn't solvable in any general way without linear types, at which point I bet it would be a somewhat easy modification to what Rust has already implemented. (Most of my development for a long time now has been in C++ using co_await with I/O completion and essentially all of the issues I run into--including the things analogous to "async Drop", which I would argue is actually the same problem as being able to drop a task itself--are solvable using linear types, and any other solutions feel like they would be one-off hacks.) Now, the problem is that the Rust people seem to be against linear types (and no one else is even considering them), so I'm pretty much resigned that I'm going to have to develop my own language at some point (and see no reason to go too deep into Rust in the meantime) :/.


I did a double take seeing your username above this comment!

Thank you for your contributions to the jailbreak community, it’s what got me started down the programming / tinkering path back in middle school and has significantly shaped the opportunities I have today. Can’t believe I’m at the point where I encountered you poking around the same threads on a forum... made my day! :)


> at which point I bet it would be a somewhat easy modification to what Rust has already implemented.

You'd lose that bet: https://gankra.github.io/blah/linear-rust/


This is an article about why linear types are hard to implement... and it doesn't even claim they can't be done; regardless, I have argued this at you before :(.

https://news.ycombinator.com/item?id=23579426

I continue to believe that the strongest point in that article is actually the third footnote, which correctly admits that this is mostly about a lack of appreciation.

> The Swift devs have basically the exact same argument for move-only code, and their implicit Copy bound. Hooray!

My claim here is that, given linear types, it should be trivial to use async/await style coroutines for I/O continuation. You have given no evidence against this idea.


Ah! I misunderstood you, sorry. I thought you were saying that linear types would be easy to implement. I wasn't trying to say anything about the stuff you'd do with them if you had them.


> FWIW, I'd bet almost anything that this problem isn't solvable in any general way without linear types

I think this part of your comment is absolutely right, and it's fatal to the argument that Rust made the wrong decision about I/O models. Maybe in the context of some other language it was not the best decision, but not for Rust, because Rust just doesn't have linear types.


I'm not familiar with linear types, but as far as I can see they're pretty much the same as Rust's ownership rules. Is there something I'm missing?


Rust implements affine types, which means every value can be used at most once: you cannot use it twice, but you can discard it and not do anything with it. Linear types mean exactly once.

But I don't think you can easily move from affine types to linear types in the case of Rust; see the leakpocalypse[1]

[1]: https://cglab.ca/~abeinges/blah/everyone-poops/#leakpocalyps...
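A quick illustration of the distinction in today's Rust:

    struct Token;               // not Copy, so values of this type are affine

    fn consume(_t: Token) {}    // taking ownership is the one permitted "use"

    fn main() {
        let t = Token;
        consume(t);
        // consume(t);          // error[E0382]: use of moved value: `t`
        let _ignored = Token;   // never used at all: fine under affine rules,
                                // but a true linear type system would reject this
    }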


ATS has linear types.


How does this proposal address the problems with "blocking on drop does not work" here? https://without.boats/blog/io-uring/

The problem here is not with Rust's async design. It's that Rust has affine types and not linear types. This is not something that could have been solved with more work on the design. It is not that there was "a decision" to allow dropping tasks; it's a constraint on the design that the language requires. (Personally, I'm unsure as to whether a practical language with true linear types is possible, but it's worth experimenting with. Rust is not and never will be that language, however.)


I'm also curious about this. Boats wrote something interesting about Rust async and io-uring a while ago[1], which also points out a very clear path forward that's not actually outside the framework of Rust's Future or async implementation: using interfaces that treat the kernel as the owner of the buffers being read into/out of. That seems in line with my expectations of what should work for this.

But I haven't touched IOCP in nearly 20 years and haven't gotten into io-uring yet, so maybe I'm missing something.

Really the biggest problem might be that switching out backends is currently very difficult in Rust; even the 0.x to 1.x jump of tokio is painful. Switching from Async[Reader|Writer] to AsyncBuf[Reader|Writer] might be even harder.

[1] https://boats.gitlab.io/blog/post/io-uring/


There's a workaround, but it's unidiomatic, requires more traits, and requires inefficient copying of data if you want to adapt from one to the other.

However, I wouldn't call this a problem with a polling-based model.

At least part of the goal here must be to avoid allocations and reference counting. If you don't care about that, then the design could have been to 'just' pass around atomically-reference-counted buffers everywhere, including as the buffer arguments to AsyncRead/AsyncWrite. That would avoid the need for AsyncBufRead to be separate from AsyncRead. It wouldn't prevent some unidiomaticness from existing – you still couldn't, say, have an async function do a read into a Vec, because a Vec is not reference counted – but if the entire async ecosystem used reference counted buffers, the ergonomics would be pretty decent.

But we do care about avoiding allocations and reference counting, resulting in this problem. However, that means a completion-based model wouldn't really help, because a completion-based model essentially requires allocations and reference counting for the futures themselves.

To me, the question is whether Rust could have avoided this with a different polling-based model. It definitely could have avoided it with a model where the allocations for async functions are always managed by the system, just like the stacks used for regular functions are. But that would lose the elegance of async fns being 'just' a wrapper over a state machine. Perhaps, though, Rust could also have avoided it with just some tweaks to how Pin works [1]… but I am not sure whether this is actually viable. If it is, then that might be one motivation for eventually replacing Pin with a different construct, albeit a weak motivation by itself.

[1] https://www.reddit.com/r/rust/comments/dtfgsw/iou_rust_bindi...


> I am not sure whether this is actually viable.

Having investigated this myself, I would be very surprised to discover that it is.

The only viable solution to make AsyncRead zero-cost for io-uring would have been to require futures to be polled to completion before they are dropped. So you can give up on select and most of the necessary concurrency primitives. You really want to be able to stop running futures you don't need, after all.

If you want the kernel to own the buffer, you should just let the kernel own the buffer. Therefore, AsyncBufRead. This will require the ecosystem to shift where the buffer is owned, of course, and that's a cost of moving to io-uring. Tough, but those are the cards we were dealt.


Well, you can still have select; it "just" has to react to one of the futures becoming ready by cancelling all the other ones and waiting (asynchronously) for the cancellation to be complete. Future doesn't currently have a "cancel" method, but I guess it would just be represented as async drop. So this requires some way of enforcing that async drop is called, which is hard, but I believe it's just as hard as enforcing that futures are polled to completion: either way you're requiring that some method on the future be called, and polled on, before the memory the future refers to can be reused. For the sake of this post I'll assume it's somehow possible.

Having to wait for cancellation does sound expensive, especially if the end goal is to pervasively use APIs like io_uring where cancellation can be slow.

But then, in a typical use of select, you don't actually want to cancel the I/O operations represented by the other futures. Rather, you're running select in a loop in order to handle each completed operation as it comes.

So I think the endgame of this hypothetical world is to encourage having the actual I/O be initiated by a Future or Stream created outside the loop. Then within the loop you would poll on `&mut future` or `stream.next()`. This already exists and is already cheaper in some cases, but it would be significantly cheaper when the backend is io_uring.
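In sketch form (using the tokio and tokio_stream crates; the stream and shutdown wiring are assumed), that endgame looks like:

    use tokio_stream::{Stream, StreamExt};

    // the stream owns the in-flight I/O across iterations; select! only ever
    // drops the short-lived `next()` future, never the underlying operation
    async fn run<S>(mut stream: S, mut shutdown: tokio::sync::oneshot::Receiver<()>)
    where
        S: Stream<Item = u32> + Unpin,
    {
        loop {
            tokio::select! {
                Some(item) = stream.next() => {
                    println!("handling {}", item);
                }
                _ = &mut shutdown => break,
            }
        }
    }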


> But then, in a typical use of select, you don't actually want to cancel the I/O operations represented by the other futures. Rather, you're running select in a loop in order to handle each completed operation as it comes.

You often do want to cancel them in some branches of the code that handles the result (for example, if they error). It indeed may be prohibitively expensive to wait until cancellation is complete, because io-uring cancellation requires a full round trip through the interface: the IORING_OP_ASYNC_CANCEL op is just a hint to the kernel to cancel any blocking work; you still have to wait to get a completion back before you know the kernel will not touch the buffer passed in.

And this doesn't even get into the much better buffer management strategies io-uring has baked into it, like registered buffers and buffer pre-allocation. I'm really skeptical of making those work with AsyncRead (now you need to define buffer types that deref to slices that are tracking these things independent of the IO object), but since AsyncBufRead lets the IO object own the buffer, it is trivial.

Moving the ecosystem that cares about io-uring to AsyncBufRead (a trait that already exists) and letting the low-level IO code handle the buffer is a strictly better solution than requiring futures to run until they're fully, truly cancelled. Protocol libraries should already expose the ability to parse the protocol from an arbitrary stream of buffers, instead of directly owning an IO handle. I'm sure some libraries don't, but that's a mistake that this will course-correct.
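For reference, the shape of the buffered-reader style with tokio's existing traits is roughly this (a sketch, not io-uring-backed):

    use tokio::io::{AsyncBufReadExt, BufReader};
    use tokio::net::TcpStream;

    async fn handle(stream: TcpStream) -> std::io::Result<()> {
        // the reader owns the buffer; the task never lends out its own memory
        let mut reader = BufReader::new(stream);
        let consumed = {
            let buf = reader.fill_buf().await?; // borrow the freshly filled bytes
            // ... parse from `buf` here ...
            buf.len()
        };
        reader.consume(consumed); // tell the reader those bytes are done
        Ok(())
    }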


> Well, you can still have select; it "just" has to react to one of the futures becoming ready by cancelling all the other ones and waiting (asynchronously) for the cancellation to be complete.

Right. Which is more or less what the structured concurrency primitives in Kotlin, Trio, and soon Swift are doing.


Wouldn't a more 'correct' implementation be moving the buffer into the thing that initiates the future (and thus, abstractly, into the future), rather than refcounting? At least with IOCP you aren't really supposed to even touch the memory region given to the completion port until it's signaled completion iirc.

I.e., to me, an implementation of read() that would work for a completion model could be basically:

    async fn read<T: IntoSystemBufferSomehow>(&self, buf: T) -> Result<T, Error>
I recognize this doesn't resolve the early-drop issues outlined, and it obviously does require copying to adapt it to the existing AsyncRead trait, or if you want to, like, update a buffer in an already allocated object. It's just what I would expect an API working against IOCP to look like, and I feel like it avoids many of the issues you're talking about.


I'm not a Rust expert, so I'm not sure how close this proposal is to Composita:

http://concurrency.ch/Content/publications/Blaeser_Component...

Essentially each component has a buffered interface (an interface message queue), which static analysis sizes at compile time. This buffer can act as a daemon, ref counter, offline dropbox, cache, cancellation check, and can probably help with cycle checking.

Is this the sort of model which would be useful here?


The poll model has the advantage that you have control over when async work starts, and is therefore the more predictable model.

I guess that way it fits the Rust philosophy better.


Could Rust switch? More importantly, would a completion based model alleviate the problems mentioned?


Without introducing Rust 2? Highly unlikely.

I should have worded my message more carefully. The completion-based model is not a silver bullet which would magically solve all problems (though I think it would help a bit with the async Drop problem). The problem is that Rust async was rushed without careful deliberation, which causes a number of problems without a clear solution in sight.


> The problem is that Rust async was rushed without careful deliberation

As someone who observed the process, this couldn't be further from the truth. Just because one disagrees with the conclusion does not mean that the conclusion was made in haste or in ignorance.

> Without introducing Rust 2? Highly unlikely.

This is incorrect. async/await is a leaf node on the feature tree; it is supported by other language features, but does not support any others. Deprecating or removing it in favor of a replacement would not be traumatic for the language itself (less so for the third-party async/await ecosystem, of course). But this scenario is overly dramatic: the benefits of a completion-based model are not so clear-cut as to warrant such actions.


>Just because one disagrees with the conclusion does not mean that the conclusion was made in haste or in ignorance.

Believe me, I do understand the motivation behind the decision to push async stabilization in the developed form (at least I think I do). And I do not intend to argue in bad faith. My point is that in my opinion the Rust team has chosen to get the mid-term boost of Rust popularity at the expense of the long-term Rust health.

Yes, you are correct that in theory it's possible to deprecate the current version of async. But as you note yourself, it's highly unlikely to happen since the current solution is "good enough".


I and many others would disagree that they made the decision "at the expense of the long-term Rust health". You aren't arguing in good faith if you put words in their mouth. There is no data to suggest the long-term health of Rust is at stake because of the years-long path they took in stabilizing async today. There are merits to both models, but nothing is as clear-cut as you make it out to be: completion-based futures are not definitively better than poll-based ones and would have a lot of trade-offs. To phrase this as "Completion based is totally better and the only reason it wasn't done was because it would take too long and Rust needed popularity soon" is ridiculous.


I do not put words in their mouth; or have you missed the "in my opinion" part?

The issues with Pin, the problems around noalias, the inability to design a proper async Drop solution, the less-than-great compatibility with io-uring and IOCP: in my eyes these are indicators that Rust's health in the async field has suffered.

>Completion based is totally better and the only reason it wasn't done was because it would take too long and Rust needed popularity soon

And who is putting words into others' mouths now? Please see this comment: https://news.ycombinator.com/item?id=26407565


I find your statements so strange. I honestly don't care about noalias, and very few people really should. Same with 'async drop'. Same with io-uring, which seems to be totally fine in Rust so far.

Despite your repeated statements that async has harmed Rust, I don't have any problem whatsoever, day to day, writing tens of thousands of lines of async code with regard to what you've brought up.


Isn't it very likely that going the other route would also result in a different but equally long list of issues?


Yes, it's a real possibility. But the problem is that the other route was not properly explored, so we cannot compare advantages and disadvantages. Instead, Rust went all-in on a bet which was made 3 years ago.


> My point is that in my opinion the Rust team has chosen to get the mid-term boost of Rust popularity at the expense of the long-term Rust health.

I don't think a conscious decision of that sort was made? My impression is that at the time the road taken was understood to be the correct solution and not a compromise. Is that wrong?


The decision was made 3 years ago and at the time it was indeed a good one, but the situation has changed and the old decision was not (in my opinion) properly reviewed. See this comment: https://news.ycombinator.com/item?id=26408524


Does anything about the implementation of async prevent a future completion-based async feature? Say it's called bsync/bwait.


Yes, it's possible, but it would be a second way of doing async, which would split the ecosystem even further. So without a REALLY good motivation it simply will not happen. Unfortunately, the poll-based solution is "good enough"... I guess some may say that "perfect is the enemy of good" applies here, but I disagree.


I was going to say... even as a casual observer I remember the finalization of async took a loooong time.


> The problem is that Rust async was rushed without careful deliberation, which causes a number of problems without a clear solution in sight.

Are we talking about the same Rust? I remember the debate and consideration over async was enormous and involved. It was practically the polar opposite of “without careful deliberation”.


There actually exists a proposal for adding completion-based futures at [1], which is compatible with what exists now and certainly doesn't require a Rust 2. It will, however, certainly increase the language's surface area.

[1] https://rust-lang.zulipchat.com/#narrow/stream/187312-wg-asy...


I think there are 2 separate findings in it:

First of all, yes, Rust futures use a poll model, where any state changes from different tasks don't directly call completions, but instead just schedule the original task to wake up again. I still think this is a good fit, and makes a lot of sense. It avoids a lot of errors from having a variety of state on the call stack before calling the continuation, which then gets invalidated. The model by itself also doesn't automatically make using completion-based IO impossible.

However, the polling model in Rust is combined with the model of always being able to drop a Future in order to cancel a task. This doesn't allow using lower-level libraries which require explicit cancellation, at least not without applying additional workarounds.

However, that part of Rust's model could be enhanced if there is enough interest in it; e.g. [1] discusses a proposal for it.

[1] https://rust-lang.zulipchat.com/#narrow/stream/187312-wg-asy...


Why did polling have to be baked into the language? Seems bizarre for a supposedly portable language to assume the functionality of an OS feature which could change in the future.

Meanwhile C and C++ can easily adopt any async system call style because it made no assumptions in the standards about how that would be done.

Rust also didn't solve the colored functions problem. Most people think that's an impossible problem to solve without a VM/runtime (like Java Loom), but people also thought garbage collection was impossible in a systems language until Rust solved it. It could have been a great opportunity for them.


> people also thought garbage collection was impossible in a systems language until Rust solved it

No, they didn't. Linear typing for systems languages had already been done in ats, cyclone, and clean, the latter two of which were a major inspiration for rust.

Venturing further into gc territory: long before rust was even a twinkle in graydon hoare's eye, smart pointers were happening in c++, and apple was experimenting with objective c for drivers.


Perhaps more accurate to say "safe reclamation of dynamic allocations without GC was not known to be possible in a practical programming language, before Rust".

The problem with languages like ATS and Cyclone is that you need heavy usage in real-world applications to prove that your approach is actually usable by developers at scale. Rust achieved that first.


Cyclone was a c derivative (I believe it was even backwards compatible), and ats a blend of ml and c. Ml and c are both certainly proven.

Cyclone was, and ats is, a research project; not necessarily intended to achieve widespread use. And again, obj-c was being used by apple in drivers, which is certainly a real-world application.

> without GC

I don't know what you mean by this. GC is a memory management policy in which the programmer does not need to manually end the lifetimes of objects. Rust is a garbage collected language. How many manual calls to 'drop' or 'free' does the average rust program have?


Cyclone wasn't backwards-compatible with C.

ATS is not just "a blend of ML and C", it has a powerful proof system on top.

You can't just say "well, these languages were derived from C in part, THEREFORE they must be easy to adopt at scale", that doesn't follow at all.

Yes, Cyclone and ATS were research projects, that's why they were never able to accumulate the real-world experience needed to demonstrate that their ideas work at scale.

Objective-C isn't memory safe.

By "GC" here I meant memory reclamation schemes that require runtime support and object layout changes ... which is the way most people use it. If you use the term "garbage collection" in a more expansive way, so that you say Rust "is a garbage collected language", then most people are going to misunderstand you.


> Objective-C isn't memory safe.

No, but it is garbage collected.

> By "GC" here I meant memory reclamation schemes that require [...] object layout changes

Changes with respect to what?

One example of a popular GC is the boehm GC. It provides a drop-in replacement for malloc, usable in c for existing c structures without any ABI changes.

Perhaps you are thinking specifically of compacting GCs, which usually need objects to have a header with a forwarding pointer?

> require runtime support

‘malloc’ and ‘free’ are a memory reclamation scheme that is part of the c runtime. I don't think there's any argument to be made that they are garbage collection. What's the difference between them and some other runtime support?

  -----------------------------------------
Broadly, you are referring to mechanisms which can be used to implement garbage collection, but those are not what's interesting here. What's interesting is a memory management policy which supports garbage collection and is usable for a systems programming language.

  -----------------------------------------
> If you use the term "garbage collection" in a more expansive way, so that you say Rust "is a garbage collected language", then most people are going to misunderstand you.

‘Garbage collection’ is a technical term with a specific, precise meaning. This meaning is generally understood and accepted throughout the literature. It's also the thing that's specifically interesting here: manually managing object lifetimes is error-prone and tends to lead to bugs, and bugs in systems software tend to be far-reaching, so a way to eliminate those bugs categorically is considered valuable.

> You can't just say "well, these languages were derived from C in part, THEREFORE they must be easy to adopt at scale", that doesn't follow at all.

That's fair as such, but I think the situation is a bit more nuanced than that. The semantics of ats and cyclone are largely designed to augment c directly. Ats's proof semantics in particular map very well to the semantics of c programs as written. Which, true, doesn't prove anything, but shows that there is much less to be proved: the existing paradigm can still be used.

> Yes, Cyclone and ATS were research projects, that's why they were never able to accumulate the real-world experience needed to demonstrate that their ideas work at scale.

Are the several multi-100-kloc ats compilers out there not real-world enough? If not then, on the topic of proof languages, ada/spark and isabelle/hol had proven themselves long before rust.


I have connections with the academic GC community. I gave an invited talk at ISMM 2012. I guarantee that they will not agree "Rust is a garbage-collected language".

FWIW Wikipedia describes "garbage collection" as "a form of automatic memory management" and goes on to say "Other similar techniques include stack allocation, region inference, memory ownership ..." so whoever wrote that doesn't agree that all forms of automatic reclamation are garbage collection.

I prefer to avoid arguing about the meaning of words but it's not good to sow confusion.

> Are the several multi-100-kloc ats compilers out there not real-world enough?

Yes, projects written by the creators of the language are not enough.

> If not then, on the topic of proof languages, ada/spark and isabelle/hol had proven themselves long before rust.

Before Rust, Ada/SPARK didn't support dynamic deallocation. See https://www.adacore.com/uploads/techPapers/Safe-Dynamic-Memo..., which cites Rust.

seL4 required 200K lines of Isabelle/HOL proofs to verify 7.5K lines of C; that approach simply doesn't scale.


I have always thought Pascal solved that in practice with automated reference counting on arrays

A good optimizer could then have removed the counting on non-escaping local variables


If you squint, this is sorta what Swift is.


Apple wasn't experimenting with Objective-C for drivers; NeXTSTEP drivers were written in Objective-C.

macOS's IO Kit replacement, DriverKit, is an homage to the name of NeXTSTEP's Driver Kit, the Objective-C framework.


> Meanwhile C and C++ can easily adopt any async system call style because it made no assumptions in the standards about how that would be done.

This is comparing apples to oranges; Rust's general, no-assumptions-baked-in coroutine feature is called "generators", and it is not yet stable. It is this feature that is internally used to implement async/await. https://github.com/rust-lang/rust/issues/43122


> but people also thought garbage collection was impossible in a systems language until Rust solved it?

What?

Several OSes have proven their value while written in GC-enabled systems programming languages.

They aren't as mainstream as they should due to UNIX cargo cult and anti-GC Luddites.

Rust only proved that affine types can be easier to use than Cyclone and ATS.


>Why did polling have to be baked into the language?

See this comment: https://news.ycombinator.com/item?id=26407440

>Meanwhile C and C++ can easily adopt any async system call style because it made no assumptions in the standards about how that would be done.

Do you know about co_await in C++20? AFAIK (I only have very cursory knowledge of it, so I may be wrong) it also makes some trade-offs, e.g. it requires allocations, while in Rust async tasks can live on the stack or in statically allocated regions of memory.

Also do not forget that Rust has to ensure memory safety at compile time, while C++ can be much more relaxed about it.


C++20 coroutines are not async in the standard. They are just coroutines. Actually, they have no implementation: the user has to write classes to implement the promise type and the awaitable type. You could just as easily write a coroutine library wrapping epoll as you could io_uring. The only thing it does behind your back (other than compile to stackless coroutines) is allocate memory, which also goes for a lot of other things.


Is this not also true of Rust? Are you saying Rust in some sense hardcodes an implementation of await in a way C++ doesn't? (I am not a Rust programmer, but I am very, very curious about this and would appreciate any insight; I do program in C++ with co_await daily, with my own promise/task classes.)


Rust's async/await support is not intended as a general replacement of coroutines. In fact, async/await is built on top of coroutines (what Rust calls "generators"), but these are not yet stable. https://github.com/rust-lang/rust/issues/43122


Ouch... thanks; I didn't realize the Rust situation was this bad :(. FWIW, I do not look at generators as being what I would want as my interface for working with coroutines, and am very much on board there with the comments from tommythorn. I guess I just have too many decades of experience working with coroutines in various systems I have used :(.

https://github.com/rust-lang/rust/issues/43122#issuecomment-...

https://github.com/rust-lang/rust/issues/43122#issuecomment-...


You may want to watch/read my talk: https://www.infoq.com/presentations/rust-2019/

I also did a follow up, walking through how you would implement all of the bits: https://www.infoq.com/presentations/rust-async-await/

TL;DR: Rust makes you bring some sort of executor along. You can write your own, or you can use someone else's. I have not done enough of a deep dive into what made it into the standard to give you a great line-by-line comparison.


Which makes them quite powerful, as they allow for other kinds of patterns.


It requires allocation if the coroutine outlives the scope that created it.

Otherwise, compilers are free to implement heap allocation elision (which is done in Clang).

Now compared to Rust, assuming you have a series of coroutines to process a deferred event, Rust will allocate once for the whole series while C++ would allocate once per coroutine to store them in the reactor/proactor.


Rust never implicitly allocates, even with async/await. I have written Rust programs on a microcontroller with no heap, using async/await for it.


I don't think I've implied that allocation in Rust was implicit but that's a fair point.


You said

> Rust will allocate once for the whole series

which, it will not.

It is true that some executors will do a single allocation for the whole series, but that is not done by Rust, nor is it required. That's all!


> people also thought garbage collection was impossible in a systems language until Rust solved it

Only if you understand "garbage collection" in a narrow sense of memory safety, no explicit free() calls, a relatively readable syntax for passing objects around, and an acceptable amount of unused memory. This comes with a non-negligible amount of fine print for Rust when compared to garbage-collected languages.


I'm not totally sure what the author is asking for, apart from refcounting and heap allocations that happen behind your back. In my experience async Rust is heavily characterised by tasks (Futures) which own their data. They have to - when you spawn it, you're offloading ownership of its state to an executor that will keep it alive for some period of time that is out of the spawning code's control. That means all data/state brought in at spawn time must be enclosed by move closures and/or shared references backed by an owned type (Arc) rather than & or &mut.

If you want to, nothing is stopping you from emulating a higher-level language: wrap all your data in Arc<_> or Arc<Mutex<_>> and store all your functions as trait objects like Box<dyn Fn(...)>. You pay for extra heap allocations and indirection, but avoid specifying generics that spread up your type hierarchy, and no longer need to play by the borrow checker's rules.

What Rust gives us is the option to _not_ pay all the costs I mentioned in the last paragraph, which is pretty cool if you're prepared to code for it.
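For illustration, a small sketch of that "emulate a higher-level language" style (the State type and callback are made up; assumes tokio):

    use std::sync::{Arc, Mutex};

    #[derive(Default)]
    struct State { hits: u64 }

    // dynamically dispatched callback, heap-allocated as in a GC'd language
    type Callback = Arc<dyn Fn(&str) + Send + Sync>;

    #[tokio::main]
    async fn main() {
        let state = Arc::new(Mutex::new(State::default()));
        let on_event: Callback = Arc::new(|name| println!("event: {}", name));

        let (state2, cb) = (Arc::clone(&state), Arc::clone(&on_event));
        tokio::spawn(async move {
            state2.lock().unwrap().hits += 1; // shared mutable state via Arc<Mutex>
            cb("request");
        })
        .await
        .unwrap();
    }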


I'm a giant Rust fanboy and have been since about 2016. So, for context, this was literally before Futures existed in Rust.

But, I only work on Rust code sporadically, so I definitely feel the pros and cons of it when I switch back and forth to/from Rust and other languages.

The problem, IMO, isn't about allocations or ownership.

In fact, I think that a lot of the complaints about async Rust aren't even about async Rust or Futures.

The article brings up the legitimate awkwardness of passing functions/closures around in Rust. But it's perfectly fair to say that idiomatic Rust is not a functional language, and passing functions around is just not the first tool to grab from your toolbelt.

I think the actual complaint is not about "async", but actually about traits. Traits are paradoxically one of Rust's best features and also a super leaky and incomplete abstraction.

Let's say you know a bit of Rust and you're kind of working through some problem. You write a couple of async functions with the fancy `async fn foo(x: &Bar) -> Foo` syntax. Now you want to abstract the implementation by wrapping those functions in a trait. So you try just copy+pasting the signature into the trait. The compiler complains that async trait methods aren't allowed. So now you try to desugar the signature into `fn foo(x: &Bar) -> impl Future<Output = Foo>` (did you forget Send or Unpin? How do you know if you need or want those bounds?). That doesn't work either, because now you find out that `impl Trait` syntax isn't supported in traits. So now you might try an associated type, which is what you usually do for a trait with "generic" return values. That works okay, except that now your implementation has to wrap its return value in Box::pin, which is extra overhead that wasn't there when you just had the standalone functions with no abstraction. You could theoretically let the compiler bitch at you until it prints the true return value and copy+paste that into the trait implementation's associated type, but realistically, that's probably a mistake because you'd have to redo that every time you tweak the function for any reason.

IMO, most of the pain isn't really caused by async/await. It's actually caused by traits.
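To make the walkthrough above concrete, the associated-type workaround ends up looking something like this sketch (the trait and types are hypothetical):

    use std::future::Future;
    use std::pin::Pin;

    struct Foo;
    struct Client;

    trait GetFoo {
        type Fut: Future<Output = Foo>;
        fn get_foo(&self) -> Self::Fut;
    }

    impl GetFoo for Client {
        // forced to name the future type, so: box it and pin it
        type Fut = Pin<Box<dyn Future<Output = Foo> + Send>>;
        fn get_foo(&self) -> Self::Fut {
            Box::pin(async { Foo }) // heap allocation the free function never needed
        }
    }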


also a long-time rust user, and I buy this. one of the things it took me longest to realize when writing rust is to reach for traits carefully/reluctantly. they can be amazing (e.g. serde), but I've wasted tons of time trying to make some elegant trait system work when I could have solved the problem much more quickly otherwise.


Exactly. Which is unfortunate, because the fact that Rust has true type classes is absolutely awesome.

But when dealing with traits, you have to remember the orphan rules and the implicit object-safety rules, which sucks because you might not have planned on using trait objects when you first defined the trait, but only tried to do so later.

Async definitely makes it even more painful.


I wonder if most of the pain is actually caused by Rust async being an MVP, so things like async trait functions (which would be very nice) don't exist... yet.

I don't know if anybody has shown that they can't ever exist, it's just that they weren't considered necessary to get the initial async features out of the door. Rather like how you can't use impl Trait in trait method signatures either (there's definitely some generics-implications-complexity going on with that one).


This macro goes a very long way toward solving the problem: https://github.com/dtolnay/async-trait
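Usage looks like this (the trait here is just illustrative):

    use async_trait::async_trait;

    #[async_trait]
    trait Storage {
        // the macro desugars this into a method returning a boxed, pinned future
        async fn load(&self, key: &str) -> Option<Vec<u8>>;
    }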


> The article brings up the legitimate awkwardness of passing functions/closures around in Rust.

That's a hard problem when you have linear/affine types! Closures don't work so neatly as in Haskell; currying has to be different.


^this

IMO, Rust already provides a decent number of ways to simplify and skip things: reference counting, async, proc_macro, etc.

In my experience, programming stuff in a higher-level language where things are heavily abstracted, like (cough) NodeJS, is easy and simple up to a certain point, until I have to do certain low-level things fast (e.g. file/byte patching) or make a system call which is not provided by the runtime API.

Oftentimes I have to resort to making a native module or a helper exe just to mitigate this. That feels like reinventing the wheel, because the actual wheel I need is deep under an impenetrable layer of abstraction.


This is where Scala (and JVM-based languages) would shine in theory: the JVM has a well-defined memory model, provides great low-level tools, etc. (But JVM-based software is always very bulky to deploy, both in terms of memory and size, so this shine is rarely seen in practice.)


Memory usage has been my main issue with the JVM; running an instance of it is very costly compared to languages that compile down close to the machine (or at least have minimal runtime code, like Go).

Anyway, on abstraction: it's just hard, because everyone has a different concept of what abstraction is. For some it's just combining a couple of function calls into one, for some it is providing defaults, for some it is providing composable functions with controllable continuations.


I like to joke that the best way to encounter the ugliest parts of Rust is to implement an HTTP router. Hours and days of boxing and pinning, Futures transformations, no async fn in traits, closures not being real first class citizens, T: Send + Sync + 'static, etc.

I call this The Dispatch Tax. Because any time you want more flexibility than the preferred static dispatch via generics can give you - oh, so you just want to store all these async fn(HttpRequest) -> Result<HttpResponse>, right? - you immediately start feeling the inconvenience and bulkiness of dynamic dispatch. It's like Rust is punishing you for using it. And async/await takes this to a new level altogether, because you are immediately forced to understand how async funcs are transformed into ones that return Futures, how Futures are transformed into anonymous state-machine structs, and how closures are also transformed into anonymous structs. It's like there's no type system anymore, only structs.

That's one of the reasons, I think, why Go has won the Control Plane. Sure, projects like K8s, Docker, the whole HashiCorp suite are old news. But it's interesting and telling that even solid Rust shops like PingCAP are using Go for their control plane. It seems to me that there's some fundamental connection between flexibility of convenient dynamic dispatch and control plane tasks. And of course having the default runtime and building blocks like net and http in the standard library is a huge win.

That said, after almost three months of daily Rust it does get better. To the point where you can actually feel that some intuition and genuine understanding is there, and you can finally work on your problems instead of fighting with the language. I just wish that the initial learning curve wasn't so high.


Definitely dynamic dispatch + async brings out a lot of pain points.

But I only agree that the async part of that is unfortunate. Making dynamic dispatch have a little extra friction is a feature, not a bug, so to speak. Rust's raison d'être is "zero-cost abstraction" and to be a systems language that should be viable in the same spaces as C++. Heap allocation needs to be explicit, just like in C and C++.

But, I agree that async is really unergonomic once you go beyond the most trivial examples (some of which the article doesn't even cover).

Some of it is the choices made around the async/await design (The Futures, themselves, and the "async model" is fine, IMO).

But the async syntax falls REALLY flat when you want an async trait method (because of a combination-and-overlap of no HKTs, no GATs, and no `impl Trait` syntax for trait methods) or an async destructor (which isn't a huge deal; I think you can just use future::executor::block_on() and/or use something like the defer-drop crate for expensive drops).

Then it's compounded by the fact that Rust has these "implicit" traits that are usually implemented automatically, like Send, Sync, Unpin. It's great until you write a bunch of code that compiles just fine in the module, but you go to plug it into some other code and realize that you actually needed it to be Send and it's not. Crap: gotta go back and massage it until it's Send or Unpin or whatever.

Some of these things will improve (GATs are coming), but I think that Rust kind of did itself a disservice by stabilizing the async/await stuff, because now they'll never be able to break it, and the Pin/Unpin FUD makes me nervous. I also think that Rust should have embraced HKTs/monads, even though it's a big can of worms and invites Rust devs to turn into Scala/Haskell weenies (said with love, because I'm one of them).


Oh yeah, I can totally relate to the Send-Sync-Unpin massaging, plus the 'static bound for me. It's so weird that individually each of them kinda makes sense, but often you need to combine them, and all of a sudden the understanding of the combinations just does not... combine. After a minute or two of trying to figure out what should actually go into that bound I give up, remove all of them and start adding them back one by one until the compiler is happy.


Yep. Same. I've been doing Rust for years at this point (not full time, and with long gaps- granted), and it's exactly like you said: individually these things are simple, but then you're trying to figure out where you accidentally let a reference cross an await boundary that killed your automatic Unpin that you didn't realize you needed. Suddenly it feels like you don't understand it like you thought you did.

The 'static lifetime bound is annoying, too! I guess it crops up if you take a future and compose it with another one? Both future implementations have to be 'static types to guarantee they live long enough once passed into the new future.


Is there any chance of fixing Pin/Unpin via a later Rust edition?


I don't know. But the Rust team is VERY against introducing any breaking changes. So we're stuck with the semantics we have today.


In a systems context, where performance and memory ostensibly matter, why wouldn’t you want to be made aware of those inefficiencies?

Sure, Go hides all that, but as a result it’s also possible to have memory leaks and spend extra time/memory on dynamic dispatch without being (fully) aware of it.


I think Rust is also able to hide certain things. Without async things are fine:

    type Handler = fn(Request<Body>) -> Result<Response<Body>, Error>; 
    let mut map: HashMap<&str, Handler> = HashMap::new(); 
    map.insert("/", |req| { Ok(Response::new("hello".into())) }); 
    map.insert("/about", |req| { Ok(Response::new("about".into())) });
Sure, using the function pointer `fn` instead of one of the Fn traits is a bit of cheating, but realistically you wouldn't want a handler to be a capturing closure anyway.

But of course you want to use async and hyper and tokio and your favorite async db connection pool. And the moment you add `async` to the Handler type definition - well, welcome to what the author was describing in the original blog post. You'll end up with something like this

    type Handler = Box<dyn Fn(Request) -> BoxFuture + Send + Sync>; 
    type BoxFuture = Pin<Box<dyn Future<Output = Result> + Send>>;
plus type params with trait bounds infecting every method you want to pass your handler to; think get, post, put, patch, etc.

    pub fn add<H, F>(&mut self, path: &str, handler: H)
    where
        H: Fn(Request) -> F + Send + Sync + 'static,
        F: Future<Output = Result> + Send + 'static,
And for what reason? I mean, look at the definitions

    fn(Request<Body>) -> Result<Response<Body>, Error>;
    async fn(Request<Body>) -> Result<Response<Body>, Error>;
It would be reasonable to suggest that if the first one is flexible enough to be stored in a container without any fuss, then the second one should be as well. As a user of the language, especially in the beginning, I do not want to know of, and be penalized by, all the crazy transformations that the compiler is doing behind the scenes.

And for the record, you can have memory leaks in Rust too. But that's beside the point.
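One mitigation, sketched here reusing the Handler/BoxFuture aliases defined above, is to pay the boxing cost in a single constructor so the generic bounds don't spread to every call site:

    // Handler and BoxFuture are the type aliases from the snippet above
    fn into_handler<H, F>(h: H) -> Handler
    where
        H: Fn(Request) -> F + Send + Sync + 'static,
        F: Future<Output = Result> + Send + 'static,
    {
        Box::new(move |req| -> BoxFuture { Box::pin(h(req)) })
    }

    // call sites never see the generics again:
    let hello: Handler = into_handler(|_req| async move { Ok(Response::new("hello".into())) });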


>It would be reasonable to suggest that if the first one is flexible enough to be stored in a container without any fuss, then the second one should be as well

I don't think this is reasonable in Rust (or in C/C++). I think 90% of the pain of futures in Rust is that most users don't want to care about memory allocation and want Rust to work like JS/Scala/C#.

When using a container containing a function, you only have to think about allocating memory for the function pointer, which is almost always statically allocated. However, for an async function, there's not only the function, but the future as well. As a user, the language now poses a problem to you: where does the memory for the future live?

1. You could statically allocate the future (e.g. type Handler = fn(Request<Body>) -> ResponseFuture, where ResponseFuture is a struct that implements Future).

But this isn't very flexible and you'd have to hand roll your own Future type. It's not as ergonomic as async fn, but I've done it before in environments where I needed to avoid allocating memory.

2. You decide to box everything (what you posted).

If Rust were to hide everything from you, then the language could only offer you (2), but then the C++ users would complain that the futures framework isn't "zero-cost". However, most people don't care about "zero-cost", and come from languages where the solution is that the runtime just boxes everything for you.
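A sketch of option (1), hand-rolling the future so no allocation is needed (Request/Response are stand-ins; a real impl would also register the waker):

    use std::future::Future;
    use std::pin::Pin;
    use std::task::{Context, Poll};

    struct Request;
    struct Response;

    // a hand-written future: just a state machine with a poll method
    struct ResponseFuture {
        response: Option<Response>, // filled in when the work is done
    }

    impl Future for ResponseFuture {
        type Output = Response;
        fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Response> {
            match self.get_mut().response.take() {
                Some(resp) => Poll::Ready(resp),
                None => Poll::Pending, // real code would store _cx.waker() here
            }
        }
    }

    // the handler type can stay a plain, statically allocated function pointer
    type Handler = fn(Request) -> ResponseFuture;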


Thanks for the suggestion. I didn't think of (1), although it's a pity that it's not as ergonomic as async fn.

I kinda feel like there's a false dichotomy here: either hide everything and be like Java/Go, or be as explicit as possible about the costs like C/C++. Is there maybe a third option, where I as a developer am aware of the allocation and dispatch costs, but the compiler does all the boilerplate for me? Something like `async dyn fn(Request) -> Result<Response>`? :)


In this example Rust doesn't just make me aware of the tradeoffs. It almost feels like the language is actively standing in the way of making the trade-offs I want to make, at least as the language is today. I think a bunch of upcoming features like unsized rvalues and async fns in traits will help.


> In a systems context, where performance and memory ostensibly matter, why wouldn’t you want to be made aware of those inefficiencies?

Perhaps, but a bigger problem is that lots of folks are using Rust in a non-systems context (see HN frontpage on any random day).


I've been working on a reasonably complicated project that is using Rust, and I think about de-asyncing my code somewhat often, because there are some nasty side effects you can see when you try to connect async & non-async code. E.g., tokio makes it very difficult to correctly & safely (in a way the compiler, not the runtime, catches) launch an async task from a blocking function (with no runtime) that itself may be inside of an async routine. It makes using libraries kind of tough, and I think you end up with a model where you have a thread per library, so the library knows it has a valid runtime, which is totally weird.

All that said, the author's article reads as a bit daft. I think anyone who has tried building something complicated in C++ / Go will look at those examples and marvel at how awesome Rust's ability to understand lifetimes is (i.e. better than your own) and how it keeps you from using resources in an unintended way. E.g., you want to keep some data alive for a closure and locally? Arc. Both need to be writable? Arc<Mutex>. You are a genius who can guarantee this Fn will never leak, it's safe to have it not capture something by value that is used later in the program, and you really need the performance of not using Arc? Cast it to a ptr and read it in an unsafe block in the closure. Rust doesn't stop you from doing whatever you want in this regard; it just makes you explicitly ask for what you want, rather than doing something stupid automatically and leaving you with a hard-to-find bug later down the line.


> The thing I really want to try and get across here is that *Rust is not a language where first-class functions are ergonomic.*

So... don’t use first-class functions so much? It’s a systems language, not a functional language for describing algorithms in CS whitepapers. Or use `move` (the article does mention this).

There are easy paths in most programming languages, and harder paths. Rust is no exception. Passing around async closures with captured variables, while verifying nothing gets dropped prematurely and without resorting to runtime GC, is bleeding-edge technology, so it should not be surprising that it has some caveats and isn't always easy to do. The same could be said of trying to do reflection in Go, or garbage collection in C. These aren't really the main use case for the tool.


> So... don’t use first-class functions so much? It’s a systems language, not a functional language for describing algorithms in CS whitepapers.

Then maybe it was a mistake to adopt an async paradigm from functional languages that relies heavily on the idea that first-class functions are cheap and easy?

(FWIW I think Rust was right to pick Scala-style async; it's really the only nice way of working with async that I've seen, in any language. I think the mistake was not realising the importance of first-class functions and prioritising them higher)


> maybe it was a mistake to adopt an async paradigm from functional languages

> I think Rust was right to pick Scala-style async

I'm confused by this assertion. I'm more aware of the procedural-language origins of syntactic async/await than functional ones. The Scala proposal in 2016 for async/await even cites C#'s design (which came in C# 5.0 in 2012) as an inspiration[1].

From there, it appears Python and TypeScript added their equivalents in 2015 [2].

If anything, async-await feels like an extremely non-functional thing to begin with, in the sense that in a functional language it should generally be easier to treat the execution model itself as abstracted away.

[1] https://docs.scala-lang.org/sips/async.html

[2] https://en.wikipedia.org/wiki/Async/await


> If anything, async-await feels like an extremely non-functional thing to begin with

Futures/promises (they mean different things in different languages), like many other things, form monads. In fact async-await is a specialization of various monad syntactic sugars that try to eliminate long callback chains that commonly affect many different sorts of monads.

Hence things like Haskell's do-notation are direct precursors to async-await (some libraries such as Scala's monadless https://github.com/monadless/monadless make it even more explicit, there lift and unlift are exactly generalized versions of async and await).

To see how async-await might be generalized, one could turn to various other specializations of the same syntax, e.g. an async that denotes a random variable and an await that draws once from the random variable.

To see the correspondence with a flatMap method (which is the main component of a monad), it's enough to look at the equivalent callback-heavy code and see that it looks something like

  Future(5)
    .flatMap(x ->
      doSomethingFutureyWithX(x)
        .flatMap(lookItsAnotherCallback)
    )
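For comparison, a rough Rust analogue of the same flatMap shape, using `Result::and_then` (the helper functions here are made up):

    // Hypothetical fallible steps standing in for the futurey ones:
    fn parse(s: &str) -> Result<i32, String> {
        s.parse().map_err(|_| format!("bad int: {}", s))
    }

    fn halve(n: i32) -> Result<i32, String> {
        if n % 2 == 0 { Ok(n / 2) } else { Err(format!("{} is odd", n)) }
    }

    fn main() {
        // Each callback runs only if the previous step succeeded --
        // the same flatMap chaining as above:
        let result = parse("40").and_then(|n| halve(n).and_then(halve));
        println!("{:?}", result); // Ok(10)
    }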


I'm not clear on if this is supposed to be disagreement or elaboration or education.

The fact that in a language like Haskell, you can perform something like async-await with futures (which are absolutely a kind of monad) in a natural way is precisely what I had in mind with what you quoted.

Regardless, the specific heritage of async-await syntax seems rooted in procedural languages (that do borrow much else as well from functional languages, yet are still not functional in any meaningful sense) like C# and Python. They are absolutely an attempt to bring some of the power of something like monadic application (including do-notation) into a procedural environment, as an alternative to threads (green or otherwise), which hide the execution state machine completely.


> the specific heritage of async-await syntax seems rooted in procedural languages

I don't think so. Async-await in both syntax and semantics is pretty firmly rooted in the FP tradition.

For semantics, the original implementation in C#, and as far as I know most of its successors, is to take imperative-looking code and transform it into CPS-ed code. That's a classic FP transformation (indeed it's one of the paradigmatic examples of first-class functions/higher-order functions) and one of the most popular methods of desugaring imperative-looking code in a functional context.

For syntax, the idea of hiding all that CPS behind an imperative-looking syntax sugar is the whole reason why do-notation and its descendants exist. (Indeed, there's an even deeper connection there specifically around CPS and monads: https://www.schoolofhaskell.com/school/to-infinity-and-beyon...)

But my point was simply that there's a pretty straight line from CPS, monads, and do-notation to async-await and so I think it's pretty fair to say that async-await is rooted in the FP tradition.


You don't need async/await to do monadic comprehension in Scala; it's built into the language from the very beginning with `for`.

This was inspired by do notation, which came about ~1998.


> I'm more aware of procedural language origins of syntactic async/await than functional? The scala proposal in 2016 for async/await even cites C#'s design (which came in C# 5.0 in 2012) as an inspiration[1].

The C# version comes from F# (2007) which was in turn inspired by the "Poor Man's Concurrency Monad" implementation for Haskell (1999) (in turn inspired by Concurrent Haskell and, ultimately, Concurrent ML). It's very much a functional lineage.


I wasn't aware of the F# heritage; that's interesting. I'm curious why the Scala proposal wouldn't cite it. Especially surprising since Scala's at least superficially looks more like F#'s than it does like C#'s.

I don't dispute (as I have had to say repeatedly in other branches of this) that the roots of futures as a concept are in functional programming, but the path I'm saying I see here is effectively:

    - Haskell/monads
    |- do-notation
    |- monadic futures
    |- (this is new to me) F# appears to have added 'types' of do blocks, including a specifically async one?
    \-> async-await as a sort of re-integration of quasi-monadic evaluation into procedural languages like C#, Python, and eventually JavaScript and Rust.
So what's weird to me is drawing a direct line between Scala and Rust here when the relevant precedent seems to be procedural languages distilling a thing from functional languages into a less flexible syntactic-sugar mechanism we now usually call async-await. Scala seems like a footnote here, where if you want to claim its descent from functional languages you would go farther back.


I don't see async/await (at least when built on top of futures/promises) as a procedural thing - the parts of C# where it's used are the least procedural parts of C#, and Python has always been multi-paradigm. I'd say it's mainly a way of doing monadic futures in languages that don't have monads (mainly because of lacking HKT) - hence why F# adopted it first, and then it made its way into functional-friendly languages that either didn't have HKT, or in Scala's case found it to be a useful shortcut anyway.

"Functional" is a spectrum rather than a binary, but I don't think it's right to see async/await as being imperative, any more than having map/reduce/filter in C#/Python/Javascript makes them an imperative thing. (I would agree that Haskell and Scala, with true monads, are more functional than C#/Python/Javascript - but I'd say that having async/await means C#/Python/Javascript are closer to Haskell/Scala than similar languages that don't have async/await; async/await is making them more functional, not less.)

As for why I mentioned Scala specifically, I understand that Rust's Futures are directly based on Scala's. I had assumed this would apply to async/await as well, but it sounds like apparently not? In any case there's not a huge difference between the Scala/C# versions of the concept AFAICS.


Sure. What I'm getting at is that I see the specific syntax of "async-await" as a synthesis of concepts from functional and procedural heritage. I agree about it being a spectrum and all that, so I think we're mostly just talking past each other about the specifics of how it came to be and using different ways of describing that process.


What procedural heritage do they have? I don't think there's anything procedural about them (unless you consider "doesn't have HKT" to be the same as "procedural").


> Rust was right to pick Scala-style async

Huh, I actually find Rust and Scala to do async quite differently. The only thing in common to me is the monadic nature of Future itself. Otherwise I find there to be a big fundamental difference in how Scala's async is threadpool/execution-context based, while Rust's async is polling-based.

Then there are the syntactic differences around Rust having async/await and Scala... not.


It's always possible there's a better way. If you know of one, maybe you can write a proposal? There is always Rust v2. Rust lacks a BDFL and so it's almost like the language grows itself. Chances are, the async model that was used was picked because it was arrived upon via consensus.


I don't think there is a better approach; I think the right thing would have been to lay proper functional foundations to make async practical. I did speak against NLL at the time, and I keep arguing that HKT should be a higher priority, but the consensus favoured the quick hack.


I agree that Rust should have embraced HKTs instead of resisting them every step of the way. It's been my observation that the Rust lang team is pretty much against HKTs as a general feature, which makes me sad.

I'm also happy to meet the only other person who had reservations about NLL! :p I have mixed feelings about it, still. It really is super ergonomic and convenient, but I really value language simplicity and NLL makes the language more complex.

I also don't think there's much Rust can do differently in regards to "functional foundations". With Rust's borrow/lifetime model, what would you even do differently with respect to `fn`, `Fn`, `FnMut`, and `FnOnce`?


HKTs were considered, and do not actually solve the problem, in Rust. Rust is not Haskell. The differences matter.

Nobody is against HKTs on some sort of conceptual level. They just literally do not work to solve this problem in Rust.


Which problem are we referring to? I was only making a general statement that I think Rust would have benefited from HKTs instead of doing umpteen ad-hoc implementations of specific higher-kinded types. I'm far from an expert, so please correct me if I'm wrong:

Wouldn't HKTs help us abstract over function/types more easily, including closures?

Aren't GATs a special case of HKTs?

If Rust somehow had HKTs and "real" monads, we could have a do-notation instead of the Try trait+operator and Future trait+async+await, right?

I'm not saying that HKTs would fix (most of) the issues mentioned in the article around tricky closure semantics and whatnot.


> Wouldn't HKTs help us abstract over function/types more easily, including closures?

In the sense that HKTs are a higher level abstraction, sure. More later.

> Aren't GATs a special case of HKTs?

My understanding is that GATs can get similar things done to HKTs for some stuff that Rust cares about, but that doesn't give them a subtyping relationship. Haskell has both higher kinded types and type families. That being said, my copy of Pierce is gathering dust.

> If Rust somehow had HKTs and "real" monads, we could have a do-notation instead of the Try trait+operator and Future trait+async+await, right?

It depends on what you mean by "somehow." That is, even if Rust had a monad trait, that does not mean that Try and Future could both implement it. This is because, in Haskell, these things have the same signatures. In Rust, they do not have the same signature. For reference:

  pub trait Iterator {
      type Item;
      fn next(&mut self) -> Option<Self::Item>;
  }

  pub trait Future {
      type Output;
      fn poll(
          self: Pin<&mut Self>,
          cx: &mut Context<'_>,
      ) -> Poll<Self::Output>;
  }
While they are both traits, both with something returning their associated type:

1. Iterator returns Option, while Future returns something like it. These do have the same shape in the end though, so maybe this is surmountable. (Though then you have backwards compat issues)

2. poll takes a Pin'd mutable reference to self, whereas iterator does not

3. Future takes an extra argument

These are real, practical problems that would need to be sorted, and it's not clear how, or even if it's possible to, sort them. Yes, if you handwave "they're sort of the same thing at a high enough level of abstraction!", sure, in theory, this could be done. But it is very unclear how, or if it is even possible to, get there.


> That is, even if Rust had a monad trait, that does not mean that Try and Future could both implement it. This is because, in Haskell, these things have the same signatures. In Rust, they do not have the same signature. For reference:

This is nonsense; Try and Future are not the same thing in Haskell: there are plenty of functions you can only call with one or the other.

The point of the monad abstraction is to abstract over the part that is the same, mostly the function which Rust calls and_then:

    fn and_then<F, B>(self, f: F) -> AndThen<Self, B, F>
    where
        F: FnOnce(Self::Item) -> B,
        B: IntoFuture<Error = Self::Error>,
        Self: Sized,

    pub fn and_then<U, F>(self, op: F) -> Result<U, E>
    where
        F: FnOnce(T) -> Result<U, E>,
Obviously these signatures aren't quite identical, but they're actually even more similar than I thought; AndThen<Self, B, F> is a subtype of `impl Future<B, Self::Error>`, and the fact that there's no IntoResult seems like plumbing rather than anything fundamental. So if we could write an interface like:

    trait Monad<M<*>> {
        fn and_then<A, B, F>(self: M<A>, f: F) -> impl M<B>
        where
            F: FnOnce(A) -> M<B>;
    }
then these both conform to that - in the first case with M=Future<Error=Self::Error> and A=Self::Item, in the second case with M=Result<Err=E> and A=T.

Yes, there are low-level things you might want to do with Future or Result that you can't do via the monad interface - just as with any other high-level interface. But having the high-level interface available makes the simple, common cases a lot easier. I don't know what these "real, practical problems are", but they're certainly not at the syntactic/interface level.


Sorry, you're right! Ironically, I remembered the high level problem, but filled in the wrong concrete details. I also made the mistake of talking about Try/Future, and then switched to Iterator/Future.

> is a subtype of

Rust does not have subtyping. (except for lifetimes)

The point is, we don't actually have an existence proof that this is possible, and a lot of evidence to the contrary. It may be possible, but it's not a simple "just do x." There is a lot of handwaving in this post that would need to be hammered out, and details that would need to be fixed.

(Another point here is that "impl Trait in traits" isn't currently implemented in Rust either; this one is more feasible, though I am less sure about it when parameterized.)


> Rust does not have subtyping. (except for lifetimes)

This is frequently claimed but AFAICS it's no longer true post-impl trait. (I guess the counterargument is that impl trait is not a first-class type?). In any case, "Liskov-substitutable for" instead of "a subtype of" carries my point.

> The point is, we don't actually have an existence proof that this is possible, and a lot of evidence to the contrary. It may be possible, but it's not a simple "just do x."

I mean, you could say the same for any feature not currently present in Rust - I appreciate that there's more to it, but what I've followed of the discussions really didn't feel like there were clear blockers so much as a lack of interest. If there's genuinely a question mark about the possibility, would making a working implementation (one that makes a lot of arbitrary choices about syntax, efficiency, etc.) advance the conversation?


impl Trait is an existential type, not a subtype relationship. And it doesn't really mean "Liskov substitutable" either. After all, it names a single type, just un-named.

> would making an working implementation (that makes a lot of arbitrary choices about syntax, efficiency etc.) advance the conversation?

I don't know. It might, but it's also possible that the team has other objections I'm not aware of.


> impl Trait is an existential type, not a subtype relationship. And it doesn't really mean "Liskov substitutable" either. After all, it names a single type, just un-named.

Well, "impl Iterator<Item=i32>" is clearly not the same thing as "Vec<i32>", but "Vec<i32>" is Liskov-substitutable for it. AFAICS it meets every definition of a subtype unless you take the position that "impl Iterator<Item=i32>" isn't a type at all, which begs the question of what "impl Iterator<Item=i32>" is - certainly it's something more than syntax sugar for "Vec<i32>", and it looks like a type and largely quacks like a type AFAICS.


`Vec<i32>` or `vec::IntoIter<i32>` being a subtype of `impl Iterator<Item=i32>` would imply that you could have a vec of type `Vec<impl Iterator<Item=i32>>` with many structs of different types stuffed in, all of which implement `Iterator<Item=i32>`. You can't do that; ergo, it's not a subtype.
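For contrast, a sketch of what does work today: erase the concrete types behind `dyn` and you can store a mixture (illustrative only):

    // `Vec<impl Iterator<Item = i32>>` is rejected, but `dyn` erases the
    // concrete types, so differently-typed iterators can share one Vec:
    fn main() {
        let many: Vec<Box<dyn Iterator<Item = i32>>> = vec![
            Box::new(vec![1, 2].into_iter()),
            Box::new(0..3),
        ];
        for it in many {
            for x in it {
                println!("{}", x);
            }
        }
    }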


> maybe it was a mistake to adopt an async paradigm from functional languages

I always thought of async as higher language sugar for coroutine patterns that I'd seen in use in assembler from the 1980's.


> It’s a systems language, not a functional language for describing algorithms in CS whitepapers.

This kind of toxicity is why I left programming behind as a career.


How is saying Rust is one thing but not another thing toxic? I never said it’s the author’s fault Rust is broken or anything like that. It just has some goals, and being a functional programming language isn’t one of them (as far as I know).


One way to read your comment, which maybe you didn't intend, is "this is a language for Real Work, not one for those silly academics." I don't personally think you went that far, but I imagine that is how the parent read it.

I think it's the "for" to the end of the sentence that does it.


That would make sense. On the contrary, though, I quite admire whitepapers’ use of FP and would like to learn it someday. But my understanding is that there are already quite a few languages devoted to that, and Rust’s focus is something different. After all, if you have the same goals as another language, you just may end up re-creating the same language with different syntax. That said, it would be nice if someday Rust could be as convenient to write FP in as say Lisp or Haskell.


Yeah, a struct holding four different closures is not a pattern that makes much sense in Rust.

That would look a lot better as a Trait with four methods.
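Something like this hypothetical sketch - shared state lives in one place and is borrow-checked once, instead of being captured separately by four closures:

    // The struct-of-closures shape being discouraged:
    struct CallbacksAsClosures {
        on_open: Box<dyn Fn()>,
        on_message: Box<dyn Fn(&str)>,
        on_error: Box<dyn Fn(&str)>,
        on_close: Box<dyn Fn()>,
    }

    // The trait shape: one implementing type, four methods.
    trait Callbacks {
        fn on_open(&mut self);
        fn on_message(&mut self, msg: &str);
        fn on_error(&mut self, err: &str);
        fn on_close(&mut self);
    }

    struct Logger { lines: Vec<String> }

    impl Callbacks for Logger {
        fn on_open(&mut self) { self.lines.push("open".into()) }
        fn on_message(&mut self, msg: &str) { self.lines.push(msg.into()) }
        fn on_error(&mut self, err: &str) { self.lines.push(format!("error: {}", err)) }
        fn on_close(&mut self) { self.lines.push("close".into()) }
    }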


I think your implication that a systems programming language can't have nice stuff is based on systems languages of old. Unless a language absolutely must have manual memory allocation, it can have first-class functions and closures.


I never implied it can’t be nice, just that it currently isn’t and that a solution to this blog post is, “don’t do that, at least for now.” I mean, the title is that async Rust doesn’t work, yet I use several successful codebases that use it, so it actually does work, just not the way the author wants it to. Another solution could be to submit a proposal to improve async support in the desired way.


Wanted to write something similar, just not as elaborate as you did (native German speaker here).

Another good example is Java, and languages running on top of the JVM.


As I read through the database example, I saw that the compiler just caught a multi-threading bug for the author, and instead of being thankful, he’s complaining that Rust is bad.

I think he should use a higher level framework, or wait a few years for them to mature, and use a garbage collected language until then.


He's just pointing out that it isn't very ergonomic, not that Rust is bad (in fact he states the opposite multiple times).

Pointing out weaknesses and things that might be done better is what helps something mature, not just praising it.


"Ergonomic" is such a nebulous word as to be nearly useless honestly.

I don't see how it is [unduly] inefficient or uncomfortable for a language at Rust's level to ensure that the programmer actually thinks through the execution of the code they are writing. If one doesn't want to think about such things - and there's absolutely nothing wrong with that! - there are plenty of higher level languages that can do that legwork for them with more than good enough performance.

To me it feels a bit like pointing out that the process of baking a cake from scratch isn't very ergonomic, what with all the careful choosing and measuring of ingredients and combining them in the right order using different techniques (whisking, folding, creaming, etc). That is simply what goes into creating baked goods. If that process doesn't work for you, you can buy cake mix instead and save yourself quite a bit of time and energy - but that doesn't necessarily mean there's anything to be done better about the process of making a cake from scratch.


I actually find ergonomic to be a very clear and intuitive term as it refers to programming language design decisions.

And to use your cake analogy, the fact that a process is complex or laborious is orthogonal to the degree to which it is ergonomic. You could imagine baking a cake in a well organized kitchen with well designed, comfortable tools, or you could imagine having to stand on your toes and reach to the back of a slightly-too-high cupboard every time you need to fetch an ingredient, and having to use a mixer with way too many settings you can never remember, and none that does exactly what you want. I think this is a better analogy for ergonomics in programming languages.


> I think this is a better analogy for ergonomics in programming languages.

While it may be a better analogy, it doesn't really reflect the way the term is used. In my (admittedly anecdotal) experience, most people's complaints about ergonomics are more about expecting difficult things to be easier than they inherently are than about the actual quality of the tools they are presented with. People think they are complaining about a mixer with too many settings, when in reality what they are complaining about is the fact that there are several variables that go into mixing batter and dough. They buy a mixer that's targeted at a baker or chef who wants complete control over how their batter comes out, or who even needs a mixer versatile enough to also mill grain or roll pasta, then are predictably lost because it's not as immediately intuitive to use as a simple hand mixer. I don't think that makes the mixer not ergonomic, I think it just makes it not the right tool for that particular person.


Or maybe it's a poorly designed mixer, or it's trying to be a mixer and a blender at the same time and not doing a good job at either task.

Sometimes complexity is a necessary consequence of the domain, and sometimes it's simply the result of poor design.


The whole point of async is entirely ergonomics. Anything you can do in async you can do in continuation-passing style.

I believe the point the author is making is that they added a feature for ergonomics' sake that ended up not being ergonomic; async-style programming usually coincides with first-class functions and closures, and those are painful in Rust.
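A tiny sketch of that equivalence, with invented names - the explicit continuations that async/await hides:

    // Continuation-passing style: each step takes "what to do next".
    fn fetch_user(id: u32, k: impl FnOnce(String)) {
        k(format!("user-{}", id))
    }

    fn fetch_posts(user: String, k: impl FnOnce(Vec<String>)) {
        k(vec![format!("{}'s first post", user)])
    }

    fn main() {
        // The nesting that async/await flattens back into straight-line code:
        fetch_user(7, |user| {
            fetch_posts(user, |posts| {
                println!("{:?}", posts);
            })
        });
    }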


> async-style programming usually coincides with first-class functions and closures, and those are painful in Rust.

Are they, though?

The example the author gives in the article tries to write a multithreaded program as if it is a singlethreaded one. Of course that is going to be painful, whether you're trying to use futures/promises or you're using callbacks. If you want to use multiple threads safely (i.e. at all) you need to use the right structures for the job - smart pointers, mutexes, etc. This is independent of language really, even dynamically typed languages like Ruby still have mutexes for when you want to use multiple threads (though they do take control away from the programmer in the form of things like global interpreter locks).

If you don't actually want to deal with the complexities of multithreading, then don't use it. For example there are a good number of languages/runtimes that only expose a single-threaded event loop to the programmer (even though they may use thread pools in the background). But I don't think "ergonomic" should necessarily mean "abstract away every single detail" or "things should Just Work even though one is straight-up not approaching the problem correctly".


It is my understanding that the entire point of the async/await sugar in any language is to explicitly take away the appearance of multithreading/concurrency/parallelism in code. I often see the argument for async/await syntax made in the following way: it is not natural or feasible to reason about concurrent and multithreaded code; programmers need to be able to reason about all code as though it were a linear list of instructions, no matter what is going on with the implementation; therefore, we need the async/await syntax to enable a linear, straight-line code appearance for the programmer's benefit. While I do not agree that doing so is at all a good thing, it is the most prevalent argument I have seen in justifying async/await-style sugar. If I am right and this is the primary motivation for adding this to a language, your comment is in direct opposition to the entire point of such constructs. Using this reasoning, I think the author was doing as suggested and writing multithreaded code as though it were just a non-threaded imperative program.

Note: I don’t really disagree with your view, but it seems that view is in the minority.


Nothing about Rust async depends on closures or functions being first-class. It's literally just a state machine transformation.

But now, state data that previously lived on a nice last-in-first-out stack has a different lifetime. Since Rust fastidiously encodes lifetimes in the type system, using async can result in having more complicated types to talk about.

To me it's just the price that must be paid to use a language that doesn't constantly spill garbage everywhere, necessitating collection.
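A small sketch of what that means in practice (the helper is invented): anything held across an `.await` becomes part of the compiler-generated state machine rather than a stack slot, lifetimes included:

    // `tail` is borrowed across the await, so it lives in the generated
    // state machine (and its lifetime shows up in the future's type)
    // instead of in a short-lived stack frame.
    async fn sum_around_await(data: &[u8]) -> u32 {
        let first = data[0];     // plain copy, no borrow held
        let tail = &data[1..];   // borrow that must survive suspension
        yield_point().await;     // the stack frame is gone while suspended
        first as u32 + tail.iter().map(|&b| b as u32).sum::<u32>()
    }

    async fn yield_point() {}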


You are bound to find fancy runtime-only errors in async functions. Their existence outweighs any of the compiler's other benefits.


> caught a multi-threading bug

The compiler complained: “error[E0308]: mismatched types”


That’s why it’s amazing: type-safe, efficient multi-threading without dynamic memory allocation, with a nice syntax... it’s the holy grail of server programming.


It would be nice if it was explicitly a borrowing-related message. That's the kind of thing I can see happening in a future rustc actually.


Yeah, the good thing is that the devs are always happy to take feedback on improving the error messages (like explaining differences between Fn/FnMut/fn here)


Why would I want to write a server in a language that requires such awkward approaches to basics like coroutines and dynamic dispatch when I could use Kotlin on the JVM and get pauseless GC, ultra-fast edit/compile/run cycles, efficient and powerful coroutines, and eventually Loom, which will eliminate the whole problem of coloured functions completely and let me just forget about coroutines? Multi-threading bugs are not common enough in real Java/Kotlin programs to justify this epic set of costs, in my view.


Have you ever tried writing the same application on both a JVM language and in Rust, and then measuring the latency & throughput differentials? Blew my mind the first time.

Of course, if speed isn’t a concern for you, then please carry on.


GC always forces you into a tradeoff between pause time, throughput, and memory overhead.


Good luck getting Google's ad backend, with its 20ms latency budget and billions of dollars of revenue, approved with the JVM.

There are many tasks where what you suggest is just too slow and unpredictable.

Don't get me wrong, I think the JVM is great; it's just not systems-level programming.


Modern JVM GCs have pause times below 1 ms. That's a new capability, though, so most people aren't yet aware of it.

And Google's ads backend is hardly the definition of a server. Their ads front-end, for example, always used to be in Java. Not sure what it is these days.


Or even better: Go


Yeah, I don’t get it either. We’re supposed to be upset that it did its job? Lol?


I believe haskellers would love (and maybe did) to encode commutativity and thread-safety in the type system :)


Everything is "mismatched types" in Rust, literally it doesn't do any automatic type conversion (casting), so it's not the right language for most people.


What is the bug? I don't see it.


The bug is that the closure is mutating the same data structure (a Vec in this case) in a different thread - i.e., a data race.

In this specific case, only the spawned thread is mutating the Vec, but the Rust compiler is usually conservative, so it marked this as a bug.

The actual bug is one or both of the following:

1. One of the threads could cause the underlying Vec to reallocate/resize while other threads are accessing it.

2. One of the threads could drop (free) the Vec while other threads are using it.

In Rust, only one thread can “own” a data structure. This is enforced through the Send trait (edit: this is probably wrong, will defer to a Rust expert here).

In addition, you cannot share a mutable reference (pointer) to the same data across threads without synchronization. This is enforced through the Sync trait.

There are two common solutions here:

1. Clone the Vec and pass that to the thread. In other words, each thread gets its own copy of the data.

2. Wrap the Vec in a Mutex and an Arc - your type becomes an Arc<Mutex<Vec<String>>>. You can then clone() the Arc and pass the clone to the new thread. Under the hood, this maps to an atomic increment instead of a deep clone of the underlying data like in (1).

The Mutex implements the Sync trait, which allows multiple threads to mutate the data. The Arc (atomic ref count) allows the compiler to guarantee that the Vec is dropped (freed) exactly once.
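A minimal sketch of solution (2), assuming the data being shared with the spawned thread is a Vec<String>:

    use std::sync::{Arc, Mutex};
    use std::thread;

    fn main() {
        // One atomically reference-counted, mutex-guarded Vec shared by both threads.
        let data: Arc<Mutex<Vec<String>>> = Arc::new(Mutex::new(vec![]));

        let for_thread = Arc::clone(&data); // atomic increment, no deep copy
        let handle = thread::spawn(move || {
            for_thread.lock().unwrap().push("from spawned thread".to_string());
        });

        handle.join().unwrap();
        data.lock().unwrap().push("from main".to_string());
        println!("{:?}", data.lock().unwrap());
        // The Vec is freed exactly once, when the last Arc is dropped.
    }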


There wasn't any multi-threading in that example I think.


Yes, there was: the function was calling the passed-in closure from a different thread.

It is just one thread that’s actually appending to the Vec, but it’s still a multi-threaded example.


Are we talking about the same example?

  struct Database {
      data: Vec<i32>,
  }

  impl Database {
      fn store(&mut self, data: i32) {
          self.data.push(data);
      }
  }

  fn main() {
      let mut db = Database { data: vec![] };
      do_work_and_then(|meaning_of_life| {
          println!("oh man, I found it: {}", meaning_of_life);
          db.store(meaning_of_life);
      });
      // I'd read from `db` here if I really were making a web server.
      // But that's beside the point, so I'm not going to.
      // (also `db` would have to be wrapped in an `Arc<Mutex<T>>`)
      thread::sleep_ms(2000);
  }
No threads are spawned.


    fn do_work_and_then(func: fn(i32)) {
        thread::spawn(move || {
            // Figuring out the meaning of life...
            thread::sleep_ms(1000); // gee, this takes time to do...
            // ah, that's it!
            let result: i32 = 42;
            // let's call the `func` and tell it the good news...
            func(result)
        });
    }

You missed a snippet; a thread is spawned.

edit: formatting


If for some magic reason thread::sleep_ms(1000) takes longer than 2000ms, the main function would reach its end and deallocate the closure that is about to get called. Basically a use-after-free.


And that's possible, because the sleep call is a suggestion, not a guarantee. If the CPU is blocked for 2s, then the order of waking the threads is undetermined, and the race will occur.


That is technically a race (sure), but at that point the program is getting killed anyway. Agree on the overall point though. I am a big fan of moving complete ownership / lifetimes. It just makes things easier to reason about. In Chromium, weak pointers are used to solve this use-after-free problem.


The `do_work_and_then` function does spawn a thread; it's one of the first things established in the article.


Ah, I thought it just named a block, my mistake.


You created a DB on one thread and accessed / modified it on another.


Async isn't really the problem - the same issue pops up with error handling, with resource management, with anything where you want to pass functions around. The real problem is that Rust's ownership semantics and limited abstractions mean it doesn't really have first-class functions: there are three different function types and the language lacks the power to abstract over them, so you can't generally take an expression and turn it into a function. NLL was actually a major step backwards that has made this worse in the medium term: it papers over a bunch of ownership problems if the code is written inline, but as soon as you try to turn that inlined code into a closure all those ownership problems come back.
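To make the NLL point concrete, a small sketch: the same read compiles inline but conflicts with mutation once hoisted into a closure:

    fn main() {
        let mut v = vec![1, 2, 3];

        // Inline, NLL ends the borrow right after its last use...
        let first = &v[0];
        println!("{}", first);
        v.push(4); // ...so this mutation compiles.

        // Hoist the same read into a closure and the borrow now lasts
        // as long as the closure does:
        let print_first = || println!("{}", v[0]);
        print_first();
        // v.push(5); // error[E0502]: cannot borrow `v` as mutable
        print_first();
    }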

IMO the only viable way out is through: Rust needs to get the ability to properly abstract over lifetimes and functions that it currently lacks (HKT is a necessary part of doing this cleanly). A good litmus test would be reimplementing all the language control flow keywords as plain old functions; done right this would obviate NLL.

But yeah that's going to take a while, and in the meantime for 99% of things you should just write OCaml. Everyone thinks they need a non-GC language because "muh performance" but those justifications rarely if ever hold up.


> mean it doesn't really have first-class functions: there are three different function types

Rust absolutely does have first-class functions, though. Their type is `fn(T)->U`. The "three different function types" that you refer to are traits for closures. And note that closures with no state (lambdas) coerce into function types.

    fn main() {
        higher_order(is_zero);
        higher_order(|n| n == 0);
    }

    fn higher_order(f: fn(i32) -> bool) {
        f(42);
    }

    fn is_zero(n: i32) -> bool {
        n == 0
    }
It's true that when you have a closure with state then Rust forces you to reason about the ownership of that state, but that's par for the course in Rust.


If function literals don't have access to the usual idioms of the language - which, in the case of Rust, means state - then functions are not first-class.

> It's true that when you have a closure with state then Rust forces you to reason about the ownership of that state, but that's par for the course in Rust.

The problem isn't that you have to reason about the state ownership, it's that you can't abstract over it properly.


You can't abstract over it in the same way that you can in Haskell, because you have to manage ownership. You can't abstract away ownership as easily, because the language is designed to make you care about ownership.

I write Rust code for my day job, and I frequently use map/reduce/filter. IMO if I can write all my collection-processing code using primitives like that, it's got first-class functions.


It's not about "abstracting away" ownership, it's about being polymorphic over it. I want to keep the distinction between Fn, FnOnce, and FnMut. But I want to be able to write a `compose` function that works on all three, returning the correct type in each case. That's not taking away from my ability to manage ownership, it's letting me abstract over it.
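For illustration, here's `compose` in today's Rust, written against `Fn` only; covering `FnMut` and `FnOnce` means writing two more near-identical copies, which is exactly the duplication at issue:

    // `compose` written against `Fn` only; `FnMut`/`FnOnce` need their
    // own near-identical copies.
    fn compose<A, B, C>(
        f: impl Fn(A) -> B,
        g: impl Fn(B) -> C,
    ) -> impl Fn(A) -> C {
        move |a| g(f(a))
    }

    fn main() {
        let add_one_then_double = compose(|x: i32| x + 1, |y| y * 2);
        assert_eq!(add_one_then_double(3), 8);
    }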


You could do the polymorphic return with an enumeration. But those are distinct types with very different semantics: you can’t just say “it returns a function that you can call once or maybe a function you can call more than once. Shrug”.


I believe they're asking for "this function returns a function of the same type (Fn/FnMut/FnOnce) as its first argument, but with different arguments." A generic way to write something like `fn bind<T, F: Fn(T, ...)>(f: F, arg: T) -> Fn(...)`


> But those are distinct types with very different semantics: you can’t just say “it returns a function that you can call once or maybe a function you can call more than once. Shrug”.

Right, that's why I want parametricity i.e. HKT. String and Int are different types with very different semantics, you can't just say "it returns a String or maybe an Int, shrug", but it's very useful to be able to write generic code and datastructures (e.g. collections) that work for String and Int.


Nit: you can already abstract over lifetimes. But I really agree with the HKT comment, in principle. In practice it's not so bad. Abstracting over async, mutability, and borrows would be more important for smoothing over a few of these issues (there are ways around all of these, but it's not always obvious).

Most of these issues arise for library authors. I think a lot of folks in the ecosystem today are writing libraries instead of applications, which is why we get these blog posts lamenting the lack of HKTs or difficult syntax for doing difficult things that other languages hide from you. I don't think users have the same experience when building things with the libraries.

That said, there's a bit of leakage of complexity. Sometimes it's really hard to write a convenient API (like abstracting over closures) and that leads to gnarly errors for users of those APIs. Closures are particularly bad about this (which some, including myself, would say is good and also easy to work around), and there's always room for improvement. I don't think a lack of HKTs is really holding anything essential back right now.


> In practice it's not so bad. Abstracting over async, mutability, and borrows would be more important to smooth over a few of these issues (there are ways around both, but it's not always obvious).

My point is that absent some horrible specific hacks, HKT is a necessary part of doing that. A type that abstracts over Fn, FnMut, or FnOnce is a higher-kinded type.


> Async isn't really the problem - the same issue pops up with error handling, with resource management

There are ways to handle that, though. They're called algebraic effects, and they are nicely implemented in the Unison language [1] (although there they're called abilities [2]). It's a very interesting language; I recommend reading [3] for a good overview.

[1] https://www.unisonweb.org/

[2] https://www.unisonweb.org/docs/abilities

[3] https://jaredforsyth.com/posts/whats-cool-about-unison/


Rust programs can theoretically be fast, but most of the ones I've used are slow. I tried two high-profile implementations of the same type of software, one in Rust and one in Java. The Java one was faster and used less memory.

Rust programmers tend to do all kinds of little hacks here and there to make the borrow checker happy. It can add up. The borrow checker is perfectly happy when you copy everything.

Rust is becoming one giant antipattern.

(I'm sure highly experienced Rust programmers can get it to work, but there are probably fewer than 1000 people on this planet who can write good Rust that outperforms C++, so does it really count?)


Interesting, could you elaborate on this software and what it does? At work, we also offer two backends for the same kind of functionality. One is the industry-standard Java implementation, one is a homegrown Rust implementation. We find that for this use case (essentially text processing, string manipulation, and statistical algorithms), the Rust version is much faster while using less memory. However, it is not as feature-rich.


Not the OP, but my guess is that Rust will do well for the sort of code that would be fast in C anyway: linear operations on large, contiguous arrays. For pointer-chasing code, or code with lots of small objects, the little tricks you use in C aren't available (or at least, not as accessible), and the lack of GC and flexibility w.r.t. references harms you.


Code with lots of small short-lived objects is often faster by default in something like Rust, because stack allocation is very efficient (even compared to the bump allocation in a generational GC, which is much more efficient than heap allocation but worse than the stack). If you have a lot of objects which can exist on the stack in your program, Java will tend to do poorly in comparison.

(And in general Java will trade off memory for speed: modern GCs can be very efficient in execution time but will use 2x-3x as much memory in return. If you want low memory usage, your code is going to run slower.)


Does the borrow checker interfere with implementing memory and object pools?

Because if you require heap allocation for many short-lived objects, I expect this would be one of Java's strengths unless you use an object pool.


"interfere" is a funny word, but you could do this in Rust. It's not super common, though the related technique of "arenas" can be, depending on domain.


> Rust programmers tend to do all kinds of little hacks here and there to make the borrow checker happy. It can add up. The borrow checker is perfectly happy when you copy everything.

Do you mind sharing an example? If you're talking about using to_owned or clone without reason, then it's fully on the developer. Some more pitfalls to avoid: https://llogiq.github.io/2017/06/01/perf-pitfalls.html

What you say definitely doesn't match my experience. I would say Java code and Rust code are roughly in the same order of processing speed but the JVM "wastes" some memory. You also have garbage collection complicating performance in some scenarios.

I'm pretty sure you can get in the same processing speed ballpark with careful programming in both languages.

Outperforming C++ is definitely harder but Java should be doable.


> If you're talking about using to_owned or clone without reason, then it's fully on the developer.

Well, if we could just make developers smarter then everything would be easy, but we can't. It's reasonable to ask whether real-world developers working in Rust end up doing enough extra copying to outweigh the overhead of a JVM-like garbage collector that would let them avoid ever manually copying.


Until Java finally supports value types, like .NET, D, Nim, Eiffel, and plenty of other GC-enabled languages do.


I can't reduce JVM memory usage below 100MB. It simply is impossible. Meanwhile, with Rust I can easily write applications that are 20 times more memory-efficient.


Try Graal native image. It precompiles all code ahead of time and drops all metadata that you don't explicitly say you need. The results start as fast as a C program would and use 5-10x less memory. The trade-off is lower runtime performance: HotSpot is using that memory to make your app run faster at peak.


I will agree that Rust programs can be surprisingly slow. That said, in most cases where I have experienced this, it came down to easy-to-detect situations that are usually resolved by enclosing some large things in smart pointers.

I consider this a plus because that's a pretty simple change.


> anything where you want to pass functions around.

You should read the article! The author goes into this in fascinating detail.


I did read the article; it's overly focused on async (especially the headline). The body quite correctly analyses the problem with functions, but misses that this is much more general than async; just adopting a different async model isn't a solution.


Well, the article’s focus is on asynchrony. Naturally then, it talks about closures within the context of asynchrony and to the extent that it is relevant.

The author could have made the topic about closures, but that’s not what they wanted to talk about.


The author's focus on async leads them to the wrong conclusion. Not adopting async would not have solved the fundamental underlying problem; the perennial issues with error handling in Rust are another manifestation of the same problem, and as soon as people start trying to do things like database transaction management they'll hit the same problem again. Ripping async out of Rust isn't a solution; this problem will come up again and again unless and until Rust implements proper first-class functions.

