
Why async fn in traits are hard - azhenley
http://smallcultfollowing.com/babysteps/blog/2019/10/26/async-fn-in-traits-are-hard/
======
tuukkah
This article shows how it's hard to _design_ and _implement_ a novel and
powerful language like Rust. You have to understand "Haskell concepts" like
GADTs and then try and find a way to hide them from the _users_ of the
language.

~~~
pjmlp
Erik Meijer and his team did wonders bringing Haskell concepts to VB and C#.

"Confessions Of A Used Programming Language Salesman, Getting The Masses
Hooked On Haskell"

[https://www.researchgate.net/publication/237445028_Confessio...](https://www.researchgate.net/publication/237445028_Confessions_of_a_Used_Programming_Language_Salesman_Getting_the_Masses_Hooked_on_Haskell)

~~~
fluffything
> Erik Meijer and his team did wonders bringing Haskell concepts to VB and C#.

If it took wonders to make Haskell concepts mainstream in languages with a GC,
I can only describe what Rust achieves as true miracles.

~~~
pjmlp
Indeed. However, having a GC doesn't preclude having linear types as well, thus
having your cake and eating it too.

In any case, Rust has already managed to get other language designers to look
into adopting such type-system ideas; that in itself is a big victory for the
Rust community.

------
fluffything
> The only real drawback here is that there is some performance hit from
> boxing the futures – but I suspect it is negligible in almost all
> applications. I don’t think this would be true if we boxed the results of
> all async fns; there are many cases where async fns are used to create small
> combinators, and there the boxing costs might start to add up. But only
> boxing async fns that go through trait boundaries is very different.

In a situation like this:

    
    
        trait Foo { async fn foo(&self) -> Bar; }
        trait Baz: Foo { async fn baz(&self) -> Meow { /* ...await self.foo() into a Meow... */ } }
    

you'll end up with boxes of boxes of boxes. That effectively makes the feature
something for the outermost abstraction layer only.
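For concreteness, here's a minimal sketch of what a single boxed level looks like (the trait name, `Imp`, and the `u32` output are illustrative, not from the post; `#[async_trait]` expands to something along these lines):

```rust
use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

// What an `async fn` in a trait desugars to once boxed: every call
// returns a freshly heap-allocated, type-erased future.
trait Foo {
    fn foo(&self) -> Pin<Box<dyn Future<Output = u32> + Send + '_>>;
}

struct Imp;

impl Foo for Imp {
    fn foo(&self) -> Pin<Box<dyn Future<Output = u32> + Send + '_>> {
        // One allocation per call; a `Baz`-style wrapper that awaits
        // this and boxes its own result stacks another box on top.
        Box::pin(async { 42 })
    }
}

// Minimal no-op waker so we can poll without pulling in an executor.
fn noop_waker() -> Waker {
    fn clone(p: *const ()) -> RawWaker { RawWaker::new(p, &VTABLE) }
    fn noop(_: *const ()) {}
    static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, noop, noop, noop);
    unsafe { Waker::from_raw(RawWaker::new(std::ptr::null(), &VTABLE)) }
}

fn main() {
    let waker = noop_waker();
    let mut cx = Context::from_waker(&waker);
    let imp = Imp;
    let mut fut = imp.foo();
    // The future is immediately ready, so a single poll resolves it.
    assert_eq!(fut.as_mut().poll(&mut cx), Poll::Ready(42));
}
```

Each trait boundary crossed this way adds another allocation and another level of pointer indirection, which is the "boxes of boxes" concern above.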

> And of course it’s worth highlighting that most languages box all their
> futures, all of the time. =)

It's also worth highlighting that most of those languages (1) have a GC that
makes boxing a much cheaper operation than in Rust, and (2) those languages
aren't advertised as "low-level" languages and that justifies implicit boxing
as a trade-off.

---

I feel that async shouldn't really be special but rather just build on other
features that stand on its own. GATs and impl Trait in Traits seem reasonable.
But being able to do dynamic dispatch on trait methods that return a type that
isn't `Sized` seems like a tough problem to solve, and the blog post didn't
manage to convince me that implicit boxing is the right approach here.
AFAICT, we need either some kind of boxing, or better support for unsized
rvalues or similar. I think I would be more comfortable with "sugar" for
boxing if the caller of the trait method were in control of where exactly the
result is allocated (not necessarily a Box), but that calls for some kind of
placement syntax.

~~~
likeliv
> most of those languages (1) have a GC that makes boxing a much cheaper
> operation than in Rust

Why is that? I would intuitively think it's the other way around. (Is a
malloc/free pair not cheaper than an allocation on the GC heap plus collecting
its garbage?)

~~~
steveklabnik
As always, it depends on a _lot_ of details. A generational garbage collector
can be made to allocate extremely quickly; IIRC for the JVM it's like, seven
instructions? For short lived allocations, it sort of acts like an arena,
which is very high performance. malloc/free need to be quite general.

It's always about details though. If a GC is faster than malloc/free, but your
language doesn't tend to allocate much to begin with, the whole system can be
faster even if malloc is slower. It always depends.
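The "acts like an arena" point can be sketched with a toy bump allocator (illustrative only; a real GC nursery also evacuates survivors before resetting): allocation is a bounds check plus a pointer bump, and a collection with no survivors is just a pointer reset.

```rust
// Toy bump allocator: the fast path a generational GC nursery uses.
struct Bump {
    buf: Vec<u8>,
    next: usize,
}

impl Bump {
    fn new(cap: usize) -> Self {
        Bump { buf: vec![0u8; cap], next: 0 }
    }

    // Allocation: one overflow-checked add, one compare, one store.
    // This is why nursery allocation can be a handful of instructions.
    fn alloc(&mut self, size: usize) -> Option<&mut [u8]> {
        let start = self.next;
        let end = start.checked_add(size)?;
        if end > self.buf.len() {
            return None; // would trigger a minor collection in a real GC
        }
        self.next = end;
        Some(&mut self.buf[start..end])
    }

    // A minor collection that finds no live objects amounts to this:
    // all the short-lived allocations are reclaimed wholesale.
    fn reset(&mut self) {
        self.next = 0;
    }
}

fn main() {
    let mut nursery = Bump::new(1024);
    assert!(nursery.alloc(800).is_some());
    assert!(nursery.alloc(800).is_none()); // nursery exhausted
    nursery.reset(); // "collect": everything was garbage
    assert!(nursery.alloc(800).is_some()); // space available again
}
```

A general-purpose malloc/free can't assume this bulk-death pattern, which is where the per-allocation cost difference comes from.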

~~~
geogriffin
Doesn't something like jemalloc basically give you this, but without pauses?
Thread-local freelists for quick recycling of small allocations without
synchronization.. funnily enough, jemalloc even uses some garbage collection
mechanisms internally.

~~~
steveklabnik
I don't know a ton about jemalloc internals, but it is true that a lot of
modern mallocs use some mechanisms similar to GCs. There's some pretty major
constraint differences though.

------
The_rationalist
Does anyone know how Kotlin coroutines compare to async/await in Rust? And
why did they choose async instead of coroutines?

~~~
steveklabnik
I don't really know Kotlin, but it was discussed while async/await was being
designed.
[https://www.reddit.com/r/rust/comments/83tak7/implementing_k...](https://www.reddit.com/r/rust/comments/83tak7/implementing_kotlinstyle_continuations_on_top_of/)
is one thread I remember.

There's also
[https://www.reddit.com/r/rust/comments/6zy8hl/kotlins_corout...](https://www.reddit.com/r/rust/comments/6zy8hl/kotlins_coroutines_and_a_comparison_with_rusts/)

One tough part about digging through history here is that designs change over
time...

------
skybrian
What's a "semver hazard?" Doing a Google search finds very few references.

~~~
comex
It's just a fancy way to say "risk of breaking backwards compatibility".
Semver is Semantic Versioning, a versioning scheme that all Rust crates are
expected to use.

~~~
lugg
Why is semver expected? Isn't it well known to be a flawed practice yet?

~~~
jsjohnst
> Isn't it well known to be a flawed practice yet?

Care to explain? Why would using semantic versioning, something near
universally accepted as a best practice, be flawed in your eyes?

~~~
zzzcpan
It's a convention that cannot really be enforced by the compiler or any
tooling; it ultimately relies on humans following it, hence it will always be
broken in some way for some users.

~~~
fluffything
Actually, you are wrong that semver cannot be enforced by tooling.

Rust has tooling to automatically enforce semver. For example, this tool:
[https://github.com/rust-dev-tools/rust-semverver](https://github.com/rust-dev-tools/rust-semverver)

Once you modify a Rust library, it downloads the last released version,
compiles it and extracts its AST, and compares it with the AST of the current
version.

The diff of the two ASTs tells you what the changes are, and there is a Rust
book that documents which changes are semver breaking and which aren't. So if
you only add a new function to your crate, the tool says a minor version bump
is enough, but if you change the name of a public API, the tool requires you
to make a new major semver release.

Setting this tool in CI is dead easy, and will make your CI fail if a PR
changes the API of your crate without properly updating its semver version.
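To illustrate the kinds of diffs such a tool classifies (the API here is made up; the classification follows the usual Rust API-evolution rules): adding a public item is a minor bump, renaming one is a major bump.

```rust
// v1.0.0 of a hypothetical crate's public API.
pub mod v1_0 {
    pub fn connect(addr: &str) -> String {
        format!("connected to {}", addr)
    }
}

// v1.1.0: a new public function was added. Existing callers still
// compile unchanged, so a minor bump suffices.
pub mod v1_1 {
    pub fn connect(addr: &str) -> String {
        format!("connected to {}", addr)
    }
    pub fn disconnect(addr: &str) -> String {
        format!("disconnected from {}", addr)
    }
}

// v2.0.0: `connect` was renamed to `open`. Every caller of `connect`
// now fails to compile, so the tool would demand a major bump.
pub mod v2_0 {
    pub fn open(addr: &str) -> String {
        format!("connected to {}", addr)
    }
}

fn main() {
    assert_eq!(v1_0::connect("db"), "connected to db");
    assert_eq!(v1_1::disconnect("db"), "disconnected from db");
    assert_eq!(v2_0::open("db"), "connected to db");
}
```

The tool derives exactly this kind of judgment by diffing the public items of the old and new versions.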

~~~
lugg
That sounds super interesting; I didn't know Rust crates were managed like
this.

The question in my mind is where does semver come into this?

Why do you need three arbitrary and meaningless numbers when a commit hash,
branch, or tag would suffice?

------
noncoml
I grew up when OOP was the shit. Everyone wanted to add OOP to every language.
C++, ObjC then Java.

I absolutely hated it, not because OOP is bad, but because they never really
managed to blend it just right with static typing. But I was the odd one.
Everybody else seemed to love it and use it for everything.

I was lucky enough to outlive this OOP craziness.

Now I am seeing the same craziness with “async” and continuations, and I
wonder if I will be lucky enough to outlive it too.

    
    
        fn get_user(&self) -> Pin<Box<dyn Future<Output = User> + Send + '_>>;
    

This is plain craziness.

~~~
dtech
This is a silly argument; of course de-sugaring a language feature leads to a
less nice type signature. C functions compile down to assembly, and C++ or
Java hide vtables and such when you do method calls.

If you go lower in the abstraction level you get more complicated mechanics,
and that says nothing about the validity of the high-level semantics.

~~~
fauigerzigerk
_> C functions compiles down to assembly and C++ or Java hide vtables and such
when you do method calls._

I don't think it makes sense to compare the output of Rust macros to assembly
code generated by a C compiler or to vtables.

Macros are part of the Rust language, as is their output. Understanding Rust
means understanding both input and output. So this abstraction boundary is
intentionally leaky to some degree.

Generated assembly and vtables on the other hand are compiler implementation
details subject to change without notice. Any abstraction leakage is
unintended and undesirable (even if developers sometimes benefit from
understanding that output).

Your line of reasoning makes all debates about language complexity completely
pointless.

~~~
likeliv
I disagree that the user of the macro needs to understand its output. The
output of a macro is an implementation detail, and the documentation of the
macro should be enough to use the macro without even looking at its output.

For example, there's no need to understand the magic behind the `quote!` or
#[async_trait] macros to use them.

~~~
fauigerzigerk
Not every user has to understand every macro. But the output of a Rust macro
is valid Rust code whereas the output of a C compiler is not valid C code.

As a consequence, criticising the complexity of whatever a C compiler
generates cannot possibly be valid criticism of C's complexity on a _semantic_
level whereas criticising the output of a Rust macro can be valid criticism of
Rust on a semantic level.

~~~
dtech
Nearly all C compilers allow inline assembly. Macros are similar to inline
assembly in that they step outside the normal bounds/use case of the language
and are a complicated but valid and useful tool.

Most C programmers won't have to write or understand inline assembly often, if
ever. Of course you can encounter it in a production problem or something, so
you could make the argument that all C programmers need to understand "C with
inline assembly", which is the argument you are making for Rust macros.

As long as you just use Rust macros and don't write your own, you are solidly
in "C _without_ inline assembly" territory.

~~~
fauigerzigerk
_> Nearly all C compilers allow inline assembly. Macros are similar to inline
assembly in that they step outside the normal bounds/use case of the language
and are a complicated but valid and useful tool._

I couldn't disagree more. Macros are not similar to inline assembly at all
precisely because they do _not_ step outside the bounds of the language.

Whatever similarities you may find, it's simply not helpful to deny the
fundamental distinction between language A generating code in language A and
language A invoking/generating code in language B.

It's futile to debate the properties of a particular language if you can't
make a distinction between that language and anything it can generate or embed
in some opaque way.

~~~
dtech
What I fail to understand is why it's so important to you, for the complexity
of language A, whether pre-processing/compilation/de-sugaring of language A
results in a valid snippet of language A or of another language.

Take Objective-C automatic reference counting [1], implemented as a
transformation of the original code to valid code of the same language
(similar to a Lisp/Rust/Scala style macro) by automatically adding the
appropriate statements.

If I understand your argument correctly, according to you this increases the
complexity of "Objective-C with ARC", but it would not have done so if the
compiler had implemented it as a direct transformation to its compilation
target instead.

To me, that is an implementation detail which does not matter. "Objective-C
with ARC" is exactly as complex in both cases. I'd argue it's even a little
bit less complex with the "macro" implementation since you don't need to know
assembly to know what ARC is doing.

Similar to ARC, Rust implements some things with macros, which first "compile"
something to valid Rust. To me this is no more difficult for users than it
would be if the compiler directly generated LLVM IR without this intermediate
step.

The inclusion of macros in a language does make the language more complex, of
course! And creating macros is notoriously difficult, since you're basically
implementing a small compiler step! But for the user, using something is not
suddenly more difficult because it's implemented using a macro.

[1]
[https://en.wikipedia.org/wiki/Automatic_Reference_Counting](https://en.wikipedia.org/wiki/Automatic_Reference_Counting)

~~~
fauigerzigerk
_> What I fail to understand is why it's so important to you, for the
complexity of language A, whether pre-processing/compilation/de-sugaring of
language A results in a valid snippet of language A or of another language._

It's important because any and all code in language A is fair game when it
comes to criticising semantic properties of language A. Code in other
languages isn't.

noncoml criticised Rust based on a piece of Rust code. My point is simply that
this criticism is potentially legitimate in ways that criticising C based on a
piece of assembly code could never be.

I think our disagreement arises because you are asking a completely different
question. What you're saying is that for devs who invoke some code it may not
matter one bit whether that code was implemented in language A or language B
or language A generated by language A or B. Those distinctions do not
necessarily affect the semantic complexity for users of that code.

I completely agree with that. I also agree that the code snippet noncoml
posted does not mean using async code in Rust has to be overly complicated.

But when I see a piece of Rust source code, I can criticise Rust based on it
regardless of where that code came from or what purpose it serves.

Someone had to think in terms of Rust in order to write that code, and it's
always worth asking whether it shouldn't be possible to express the same thing
in a simpler way or whether that would have been possible in another language.

The fact that this code does not have to be understood by its users is
completely irrelevant for this particular question.

