I respectfully disagree; I don't think concurrency has to be that much more fundamentally complicated. It's likely that Rust's other design decisions are what made concurrency so difficult in Rust.
Pony does fearless concurrency better IMO, and Forty2 shows how we can expand on Pony to be faster and more flexible.
There are other approaches that have emerged recently too. For example, one can apply Loom's memory techniques to most memory management approaches to eliminate the coloring problem, to decouple functions from concurrency concerns.
There are also languages which separate threads' memory from each other which allows them to do non-atomic refcounting, relying on copying for any messages crossing thread boundaries (though that's often optimized away, and could be even less than Rust's clone()ing elsewhere).
One could also apply that technique to a language using generational references, if they want something without RC or tracing GC.
Sometimes I wish Rust waited just a few more years before going all-in on async/await. Alas!
Pony is garbage collected. Most of the reasons why Rust's async/await is the way it is boil down to the fact that Rust is memory safe without using GC.
> Forty2 shows how we can expand on Pony to be faster and more flexible
I can't tell from a glance, but that also looks garbage collected.
> For example, one can apply Loom's memory techniques to most memory management approaches to eliminate the coloring problem
Assuming you're referring to the JVM Project Loom, that's just M:N threading. This was tried in Rust almost a decade ago. Nobody used it because the performance was not appreciably better than 1:1 threading.
> There are also languages which separate threads' memory from each other which allows them to do non-atomic refcounting
You mean like Rust? Like, that's exactly why Rust can have both Rc and Arc and still be safe.
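To make the Rc/Arc point concrete, here's a minimal sketch using only the standard library. The function names (`sum_across_threads`, `rc_strong_count_after_clones`) are made up for illustration. `Rc` uses a plain, non-atomic counter and is not `Send`, so the compiler refuses to let it cross a thread boundary; `Arc` pays for atomic counting and is allowed to.

```rust
use std::rc::Rc;
use std::sync::Arc;
use std::thread;

/// Hypothetical helper: every worker thread sums the same shared vector.
/// Sharing across threads needs `Arc`, whose refcount is atomic.
fn sum_across_threads(data: Vec<u64>, workers: usize) -> u64 {
    let shared = Arc::new(data);
    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let local = Arc::clone(&shared); // atomic increment
            thread::spawn(move || local.iter().sum::<u64>())
        })
        .collect();
    handles
        .into_iter()
        .map(|h| h.join().expect("worker thread panicked"))
        .sum()
}

/// Hypothetical helper: inside one thread, `Rc` is enough and its
/// refcount is a plain (non-atomic) integer.
fn rc_strong_count_after_clones(n: usize) -> usize {
    let original = Rc::new(0u8);
    // Keep the clones alive so they are still counted below.
    let _clones: Vec<Rc<u8>> = (0..n).map(|_| Rc::clone(&original)).collect();
    Rc::strong_count(&original)
}

fn main() {
    println!("{}", sum_across_threads(vec![1, 2, 3], 4));
    println!("{}", rc_strong_count_after_clones(3));

    // This would NOT compile, because `Rc<T>` is not `Send`:
    // let rc = Rc::new(1);
    // thread::spawn(move || println!("{}", rc));
}
```

The safety comes from the type system rather than from a runtime check: the commented-out lines are rejected at compile time.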
> relying on copying for any messages crossing thread boundaries (though that's often optimized away, and could be even less than Rust's clone()ing elsewhere).
Ancient Rust did this, but it was removed because with the current immutability and borrow checking rules there is no need for copying anymore. Why would you want copying if you don't need it?
I'm also not going to just accept that clone() could be faster. I mean, I'm sure the clone codegen could be improved by better register allocation or whatever, but I don't think that's what you mean.
> One could also apply that technique to a language using generational references, if they want something without RC or tracing GC.
Why would you want to copy if you don't have to?
> Sometimes I wish Rust waited just a few more years before going all-in on async/await. Alas!
I haven't seen anything here that is better than Rust's async/await, and a lot that's either worse or doesn't fit with the rest of Rust's design.
I'd push back on "concurrency so difficult in Rust" -- because async isn't the only, or even best, way to do concurrency in Rust. I prefer using threads when I can, and Rust makes working with threads quite joyful[1]. I'd cautiously agree that it's possible async wasn't the best model to go "all in" on, though Rust is quite happily multi-paradigm so if something better comes along and has a notably different set of optimal use-cases than threads or async, I wouldn't be surprised to see Rust adopt it as well.
I'm personally sort of skeptical about "color free async" because the models for sync/blocking IO and async IO are so different -- you can paper over the syntax differences, but you're going to be in a world of hurt when the semantic differences arise[2]. I'll admit I haven't tried a color-free async implementation myself though, so it's just speculation / sour grapes :-)
> There are also languages which separate threads' memory from each other which allows them to do non-atomic refcounting, relying on copying for any messages crossing thread boundaries (though that's often optimized away, and could be even less than Rust's clone()ing elsewhere).
Curious what you mean by this -- my understanding is that Rust also does this (i.e., you can `move` a value into a thread's closure and that thread owns it now, and then the thread can hand it back through its `JoinHandle`). That sort of sharing doesn't require an Arc or Mutex, since there's only one owner at a time. Is this something different?
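A rough sketch of that hand-off, assuming nothing beyond `std::thread` (the function name `append_in_thread` is just for illustration): the vector is moved into the spawned thread, mutated there, and moved back out through `join()`, with no `Arc`, no `Mutex`, and no copy of the heap buffer.

```rust
use std::thread;

/// Hypothetical helper: give `buf` to a worker thread, let it push one
/// element, and take ownership back through the `JoinHandle`.
fn append_in_thread(mut buf: Vec<i32>, extra: i32) -> Vec<i32> {
    let handle = thread::spawn(move || {
        // The closure owns `buf` now; the spawning thread cannot touch it.
        buf.push(extra);
        buf // returning it moves ownership into the join result
    });
    handle.join().expect("worker thread panicked")
}

fn main() {
    let result = append_in_thread(vec![1, 2], 3);
    println!("{:?}", result);
}
```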
[1]: The other day I turned something reading in files from the filesystem sequentially into a custom threadpool passing blocks of parsed JSON over a MPSC channel that exposed the whole thing as a sequential iterator and it worked first try. I almost didn't believe it until I wrote the tests.
[2]: E.g., "I wrote this and tested it with blocking IO but this syscall isn't supported by io_uring so in async mode it goes to a threadpool and passes some huge object in a message which kills perf with a huge memcpy", or some similar jank. Just spitballing on the type of thing I would fear happening, not a specific example.
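For anyone curious what the shape of [1] looks like, here's a much-simplified sketch, not the actual code. `parse` is a hypothetical stand-in for "read a file and parse its JSON" (it just counts words), `parsed_stream` is a made-up name, and unlike a real version this one makes no attempt to preserve input order.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical stand-in for reading and parsing one file.
fn parse(input: &str) -> usize {
    input.split_whitespace().count()
}

/// Spread `inputs` over `workers` threads and expose the results as a
/// plain iterator. Results arrive in whatever order the workers finish.
fn parsed_stream(inputs: Vec<String>, workers: usize) -> impl Iterator<Item = usize> {
    let workers = workers.max(1);
    let (tx, rx) = mpsc::channel();

    // Deal the inputs out round-robin, one chunk per worker.
    let mut chunks: Vec<Vec<String>> = vec![Vec::new(); workers];
    for (i, input) in inputs.into_iter().enumerate() {
        chunks[i % workers].push(input);
    }

    for chunk in chunks {
        let tx = tx.clone();
        thread::spawn(move || {
            for input in chunk {
                // Stop early if the consumer dropped the iterator.
                if tx.send(parse(&input)).is_err() {
                    break;
                }
            }
        });
    }

    // Drop the original sender so the iterator ends once all workers finish.
    drop(tx);
    rx.into_iter()
}

fn main() {
    let inputs = vec!["a b".to_string(), "c".to_string(), "d e f".to_string()];
    for count in parsed_stream(inputs, 2) {
        println!("{}", count);
    }
}
```

The caller only ever sees an `Iterator`, so the threading stays an implementation detail.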
The biggest problems with "colorless async" arise with FFI. You really can't abstract over the differences between a real OS mutex and a language mutex when you're interfacing with system libraries that expect locks to actually behave like locks. Otherwise it's a recipe for deadlocks.