
Using a Tokio mutex is even more of an antipattern :) come to my RustConf talk about async cancellation next week to find out why!




Most people on this forum are not attending RustConf. It might be helpful to post at least the abstract of your idea.

The big thing is that futures are passive, so any future can be cancelled at any await point by dropping it or not polling it any more. So if you have a situation like this, which in my experience is a very common way to use mutexes:

  let mut guard = mutex.lock().await; // must be mut to modify guard.data below
  // guard.data is Option<T>, Some to begin with
  let data = guard.data.take(); // guard.data is now None

  let new_data = process_data(data).await;
  guard.data = Some(new_data); // guard.data is Some again
Then you could cancel the future at the await point in between while the lock is held, and as a result guard.data will not be restored to Some.
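
To make that concrete, here is a minimal sketch that uses only the standard library. It polls the future by hand and drops it at the await point. The names `YieldOnce`, `update`, and `run_update` are made up for illustration, `YieldOnce` stands in for `process_data`, and a std `Mutex` stands in for the Tokio one (that only works here because nothing requires the future to be `Send`).

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};

/// Waker that does nothing; enough for polling a future by hand.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Hypothetical stand-in for `process_data`: pending on the first poll,
/// ready on the second.
struct YieldOnce(bool);
impl Future for YieldOnce {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            Poll::Ready(())
        } else {
            self.0 = true;
            Poll::Pending
        }
    }
}

/// Same shape as the snippet above: take the value out, await, put it back.
async fn update(slot: Arc<Mutex<Option<u32>>>) {
    let mut guard = slot.lock().unwrap();
    let data = guard.take(); // slot is now None
    YieldOnce(false).await; // cancellation can happen here
    *guard = data.map(|n| n + 1); // slot is Some again
}

/// Polls `update` at most `polls` times, drops it, and reports what is
/// left in the slot afterwards.
fn run_update(polls: usize) -> Option<u32> {
    let slot = Arc::new(Mutex::new(Some(1)));
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(update(Arc::clone(&slot)));
    for _ in 0..polls {
        if fut.as_mut().poll(&mut cx).is_ready() {
            break;
        }
    }
    drop(fut); // dropping the future is the cancellation
    let left = *slot.lock().unwrap();
    left
}
```

With one poll the future is dropped while suspended at the await, so `run_update(1)` comes back `None`. With two polls it completes and `run_update(2)` comes back `Some(2)`. The guard is released cleanly in both cases, so nothing signals that the value went missing.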

I'm not sure this introduces any new failure though:

    // assuming a std::sync::Mutex<Option<T>>, where lock() returns a Result
    let data = mutex.lock().unwrap().take();
    let new_data = process_data(data).await;
    *mutex.lock().unwrap() = Some(new_data);
Here you are using a traditional lock, and a cancellation at process_data leaves the lock in the same undesired state you're worried about. It's a general footgun of cancellation and asynchronous tasks: at every await boundary your data has to be in some kind of valid, internally consistent state, because the await may never return. To fix this more robustly you'd need the async drop language feature.

True! This is the case with std mutexes as well. But holding a std MutexGuard across an await point makes the future not Send, and therefore not typically spawnable on a Tokio runtime [1]. This isn't really an intended design decision, though, as far as I can tell -- just one that worked out to avoid this footgun.
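
A rough sketch of the Send point, again std only. `std::thread::spawn` has a `Send` bound on what it runs, much like `tokio::spawn` does, so it works as a stand-in here. `relock`, `hold_guard`, `block_on`, and `run_on_other_thread` are hypothetical names, and `block_on` is a toy busy-polling executor, not how a real runtime works.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};

struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Hypothetical await point: pending once, then ready.
struct YieldOnce(bool);
impl Future for YieldOnce {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            Poll::Ready(())
        } else {
            self.0 = true;
            Poll::Pending
        }
    }
}

/// Releases the std lock before the await and takes it again afterwards.
/// No guard lives across the await, so this future is `Send`.
async fn relock(slot: Arc<Mutex<Option<u32>>>) {
    let data = slot.lock().unwrap().take();
    YieldOnce(false).await;
    *slot.lock().unwrap() = data.map(|n| n + 1);
}

/// Holds the std guard across the await. `MutexGuard` is not `Send`, so
/// neither is this future; passing it to `thread::spawn` below would be
/// rejected at compile time.
#[allow(dead_code)]
async fn hold_guard(slot: Arc<Mutex<Option<u32>>>) {
    let mut guard = slot.lock().unwrap();
    let data = guard.take();
    YieldOnce(false).await;
    *guard = data.map(|n| n + 1);
}

/// Toy executor: polls in a loop until the future is ready.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

/// Moves the `relock` future to another thread, which requires `Send`.
fn run_on_other_thread() -> Option<u32> {
    let slot = Arc::new(Mutex::new(Some(1)));
    let fut = relock(Arc::clone(&slot));
    std::thread::spawn(move || block_on(fut)).join().unwrap();
    let left = *slot.lock().unwrap();
    left
}
```

Swapping `relock` for `hold_guard` in `run_on_other_thread` should fail to compile, which is the accidental protection described above.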

Tokio MutexGuards are Send, unfortunately, so they are really prone to cancellation bugs.

(There's a related discussion about panic-based cancellations and mutex poisoning, which std's mutex has but Tokio's doesn't.)
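
For anyone who hasn't seen poisoning, a small std-only sketch: a thread panics while holding the guard, and the mutex remembers that. `poisoned_after_panic` is a hypothetical name.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// A worker panics in the middle of an update while holding the guard.
/// Returns whether the std mutex was marked as poisoned afterwards.
fn poisoned_after_panic() -> bool {
    let slot = Arc::new(Mutex::new(Some(1u32)));
    let worker_slot = Arc::clone(&slot);
    let result = thread::spawn(move || {
        let mut guard = worker_slot.lock().unwrap();
        let _data = guard.take(); // slot is now None
        panic!("simulated failure mid-update");
    })
    .join();
    assert!(result.is_err()); // the worker thread panicked
    slot.is_poisoned()
}
```

Later calls to `lock()` on the poisoned mutex return `Err`, so callers at least find out that the data may be in a half-updated state.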

[1] spawn_local does exist, though I guess most people don't use it.


Your argument then is that the requirement to acquire the lock multiple times makes it more likely you'll think about cancellation and keeping the data in a valid interim state? Otherwise I'm not sure how MutexGuards being Send really makes it any more or less prone to cancellation bugs.

Right, in general a lot of uses of mutexes are to temporarily violate invariants that are otherwise upheld while the mutex is released, something that can usually be reasoned out locally. Reasoning about cancellation at an await point is inherently non-local and so is much harder to do. (And Rust is all about scaling up local reasoning to global correctness, so async cancellation feels like a knife in the back to many practitioners.)

The generally recommended alternative is message passing/channels/"actor model", where there's a single owner of the data which ensures cancellation doesn't occur -- or at least that, if cancellation happens, the corresponding invalid state is torn down as well. But that has its own pitfalls, such as starvation.
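
A minimal sketch of the single-owner idea, using a plain thread and `std::sync::mpsc` as stand-ins for an async task and an async channel. `Msg`, `spawn_owner`, and `update_then_read` are hypothetical names.

```rust
use std::sync::mpsc;
use std::thread;

/// Requests the owner understands.
enum Msg {
    /// Apply a function to the owned value.
    Update(fn(u32) -> u32),
    /// Send the current value back on the given channel.
    Get(mpsc::Sender<Option<u32>>),
}

/// Starts the owner. It is the only code that ever touches `data`, so the
/// take/put-back step always runs to completion on the owner's side.
fn spawn_owner(initial: u32) -> mpsc::Sender<Msg> {
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        let mut data = Some(initial);
        // The loop ends once every sender has been dropped.
        for msg in rx {
            match msg {
                Msg::Update(f) => {
                    let old = data.take();
                    data = old.map(f);
                }
                Msg::Get(reply) => {
                    // The requester may be gone already; ignore that.
                    let _ = reply.send(data);
                }
            }
        }
    });
    tx
}

/// Sends one update, then asks for the value.
fn update_then_read() -> Option<u32> {
    let owner = spawn_owner(1);
    owner.send(Msg::Update(|n: u32| n + 1)).unwrap();
    // If the requester went away here, the owner would still finish
    // the update it was sent.
    let (reply_tx, reply_rx) = mpsc::channel();
    owner.send(Msg::Get(reply_tx)).unwrap();
    reply_rx.recv().unwrap()
}
```

Once a request is sent, the requester being cancelled no longer affects whether the update completes. The sketch does nothing about the starvation pitfall mentioned above.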

This is all very unsatisfying, unfortunately.


I realize now that you were proposing unlocking and relocking the mutex several times. Generally, if someone unlocks a mutex in the middle, they're already thinking about races with other callers (even if the code isn't cancelled in between).

You can definitely argue that developers should think about await points the same way they think about letting go of the mutex entirely, in case cancellation happens. Are mutexes conducive to that kind of thinking? Practically, I've found this to be very easy to get wrong.


The same would be true for any resource that needs to be cleaned up, right? Referring to stop-polling-future as canceling is probably not good nomenclature. Typically, canceling some work requires cleanup, if only to be graceful, let alone to properly release resources.

Yes, this is true of any resource. But Tokio mutexes, being shared mutable state, are inherently likely to run into bugs in production.

In the Rust community, cancellation is pretty well-established nomenclature for this.

Hopefully the video of my talk will be up soon after RustConf, and I'll make a text version of it as well for people that prefer reading to watching.


Thank you, I look forward to watching your presentation.


