It's very cool that mutexes on Linux don't require an allocation anymore! A very common pattern is to use an Arc<Mutex<T>> for sharing ownership across threads and as far as I can tell this will now only require one allocation instead of two.
It's only lexically scoped though, so they're not of much use in thread pools and, importantly, async runtimes.
Pre-1.0, we used to have JoinHandle<'a>, which let you use the borrow checker to its full potential with multiple threads. This is a big practical hurdle today, where mutexes are required even for simple cases that don't suffer data races in practice.
Of course you can't really use it for everything (i.e. whatever you put inside `Mutex::new()` needs to be static as well), but it's still a nice option.
Yes. The reason you currently need to wrap Mutexes in Arcs is not thread-safety per se. Without std::thread::scope, the Rust compiler has no way of knowing whether the spawned threads will outlive the Mutex (which is borrowed by the closures). So it's more about memory-safety than thread-safety in this case. std::thread::scope guarantees that the threads will be joined before the scope ends, thus providing enough information to the compiler (via lifetime annotations) for it to decide that they won't outlive the Mutex, so borrowing is safe. That sorted, Mutex has always been Sync and Send, so everything is in place to use it from multiple threads.
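For reference, a minimal sketch of what this looks like with std::thread::scope (stabilized in 1.63, shortly after this release): the Mutex lives on the stack and is merely borrowed by every thread, no Arc required.

```rust
use std::sync::Mutex;
use std::thread;

fn main() {
    let counter = Mutex::new(0); // no Arc needed
    thread::scope(|s| {
        for _ in 0..4 {
            // Each closure only borrows `counter`; Mutex is Sync, so the
            // shared borrow can cross threads.
            s.spawn(|| {
                *counter.lock().unwrap() += 1;
            });
        }
    }); // every spawned thread is joined here, ending the borrows
    assert_eq!(*counter.lock().unwrap(), 4);
}
```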
These are synthetic benchmarks, but the difference is quite significant in them.
From a different tweet:
> It's the total time for 32 threads each doing 10'000 lock+unlocks (on a 64C/128T threadripper). So, the numbers you quoted correspond to a lock+unlock operation going from 8.75ns to 2.45ns, under low contention.
> The numbers can vary a lot in different situations/hardware though.
I think the focus is on the performance differences driven by the synchronization and implementation choices (https://twitter.com/m_ou_se/status/1526211117651050497), which are not super easy to characterize but come from much more than just removing an allocation.
> you're often going to be better off eliminating the Arc/Mutex anyway
Not always. Mutexes can be really fast (10-20ns), especially since they often optimistically spin, and Arc in Rust is (often) relatively low cost since you can hand out "free" refs without touching the atomic.
If removing the Arc/Mutex would require allocations the Arc/Mutex could easily be faster.
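A small sketch of the "free refs" point: handing out &T from an Arc<T> never touches the reference count; only clone() and drop() do.

```rust
use std::sync::Arc;

fn total_len(v: &[u8]) -> usize {
    // Operates on a plain borrow; no refcount traffic at all.
    v.len()
}

fn main() {
    let shared = Arc::new(vec![0u8; 1024]);

    // Deref coercion hands out borrows for free.
    let _ = total_len(&shared);
    let _ = total_len(&shared);

    // Only an actual clone touches the atomic counter.
    let second_owner = Arc::clone(&shared);
    drop(second_owner);
}
```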
Notably, still worse than 0 ns. Ditto for Arc's refcounting and additional allocation. I'm not saying go on a crusade against Arc+Mutex here, but the easiest way to make effective use of modern multicore CPUs is to go to shared-nothing, independent data-per-thread designs (obviating Arc+Mutex). And if you aren't using Arc+Mutex, it's harder to accidentally share mutable state between threads.
I just think people seriously overestimate the cost of a mutex when implemented efficiently. Unlocking a mutex can be ~10-20x faster than fetching a value from main memory, or just a bit slower than a few integer operations. The way people talk about mutex operations you'd think that it's akin to hitting disk when it's actually a few orders of magnitude closer to hitting your L2 cache.
It gets a lot more expensive if you’re actually contending the mutex between threads; and if you’re not, why use a mutex? I agree the uncontended case is fast — it’s just not very useful.
There are a lot of scenarios where you're rarely contended but cannot rule contention out, so for correctness you should use mutual exclusion, even though your measured real-world performance essentially never depends on the contended case.
Modern fast mutexes are perfect for that, because their uncontended case is so good. This also inculcates the right instinct in the programmer: you should prefer to write code that is less often contended, not fight hard for better contended performance at the cost of worse uncontended performance. Contention is bad even if your mutual exclusion primitive performs well.
But Mara measured across simulated workloads with varying contention and this fix improves them all to different extents.
Because it's an incredibly efficient, safe option for doing so. Lots of shared state is rarely contended. For example, imagine you have a 'Config' that gets updated periodically in the background, readers of that config only check for updates every 1 second, and you have 7 parallel readers (and 1 writer for an 8 core system).
A Mutex is a trivial way to solve that problem that will be extremely efficient.
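A minimal sketch of that pattern; the Config type, its field, and the timings here are made up for illustration:

```rust
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::Duration;

// Hypothetical config type, just for illustration.
#[derive(Clone, Default, Debug)]
struct Config {
    log_level: String,
}

fn main() {
    let config = Arc::new(Mutex::new(Config::default()));

    // Background writer: updates the config a few times.
    let writer = {
        let config = Arc::clone(&config);
        thread::spawn(move || {
            for i in 0..3 {
                config.lock().unwrap().log_level = format!("level-{i}");
                thread::sleep(Duration::from_millis(100));
            }
        })
    };

    // Readers: check for updates occasionally; the lock is held only
    // briefly, so in practice it's almost never contended.
    let readers: Vec<_> = (0..7)
        .map(|_| {
            let config = Arc::clone(&config);
            thread::spawn(move || {
                for _ in 0..3 {
                    let snapshot = config.lock().unwrap().clone();
                    let _ = snapshot; // use the snapshot outside the lock
                    thread::sleep(Duration::from_millis(100));
                }
            })
        })
        .collect();

    writer.join().unwrap();
    for r in readers {
        r.join().unwrap();
    }
}
```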
Don't atomic operations trigger cache synchronisation in CPUs? Doesn't that affect performance negatively? That would mean even a non-contended mutex would affect performance negatively. I suspect it depends a lot on the specific workload (and maybe even what addresses data is stored at in memory), so I'd measure the specific case, but that's my a priori gut feeling.
The overhead of atomics is almost (if not entirely?) exclusively with regards to managing the caches in the CPU. Otherwise they're just normal bytes. Your CPU already has to do some cache management with regular bytes, so an atomic is only worse if there's contention (because that forces a flush).
The worst case for an atomic write is two additional cache line flushes, iirc.
I used to feel like I was fluent in C++ but over time I realized that I was using just like 40-50% of the language because the rest of it was so complicated that it made it difficult to work with others who may not have had as deep of a knowledge.
With Rust I've found that even though it sometimes feels harder to write code, it's far more likely to be correct once the compiler is happy with it. I find myself worrying much less about move semantics/correctness/pointers and all that stuff than I did with C++, which has allowed me to move faster and spend less mental time making sure I understand whether what I am doing is ACTUALLY safe.
I am using C++ as an example, because the other language I use regularly is Python and there are still pieces of Python code that make me scratch my head with "how does THAT work?!"
Absolutely agree with you. Rust lets me offload all the boring, inscrutable memory management checking to the compiler. Sure, I can’t get away with undefined behavior, but I spend my Rust programming time thinking about the business logic and letting the compiler tell me when the books are off.
> I am finding it far more likely that it is correct when the compiler is happy with it,
it's pretty interesting though - surely you would have flagged the C++ equivalent of what is discussed in https://github.com/rust-lang/rust/issues/93740 (if my understanding is correct, something akin to std::unique_ptr<pthread_mutex_t>) as obviously wrong? It's a pattern that was apparently commonplace in Rust until now, yet I would definitely not let my first-year comp. sci. interns get away with something like this during code review.
Rust definitely has a different feeling of mastery than other languages, due to the fact that most if not all bugs are caught during the compilation phase.
In other languages I like, I reach a point where I’m generally confident my code will compile before I ask it to. In Rust, for all but the most trivial logic, the compiler will usually have several things to say. However, I usually feel like the compiler is guiding me towards a simpler/more idiomatic/better way of doing things, and the compiler is more strict simply because Rust is so well-equipped for static compile-time analysis.
So fret not, the borrow checker and all the other checks that happen when writing Rust exist precisely because these things are so painful and complex to reason about as a human. The compiler is your colleague, not your boss!
I find Rust much easier to hold in my head as well. It’s reduced the need for mid-session, “just-in-case” compilations to the point that building sometimes feels like a formality.
Unfortunately logic errors are not caught - that would require some sophisticated AI in the compiler. But other problems should be caught.
Of course Rust will not prevent you from placing backdoors and other more sophisticated vulnerabilities in code. The compiler is great, but you still have to think.
Your comment was dead for some reason. I vouched for it, because I think it's interesting to discuss.
Although obviously no compiler of a Turing-complete language is going to eliminate all logic errors, the user of a language like Rust or Haskell may use the type system to prevent certain classes of logical errors (not just problems with the shape of data, or incorrect memory handling). The way you do it is with Abstract Data Types. One example of such a type in Rust is &str. If you don't use unsafe code, it should preserve the invariant that the slice holds valid UTF-8 data. Containing invalid UTF-8 data would be a logical error, not a memory error or data shape error. Similar things may be achieved in C++ and Java with the use of access-modifiers (public vs private class fields and methods). The idea is well-explained in the famous Parse, don't validate[1] article.
The flipside is that with too much of it, code becomes so complicated that it's very hard to work with: you're falling into a Turing tarpit[2]. It becomes easier to just write simple code without bugs, without using all that type system wizardry. But a judicious use of this pattern, where it's appropriate, can be very beneficial.
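A tiny illustration of the pattern with a hypothetical newtype: the field is private, so the only way to obtain a value is through the parsing constructor, and the invariant holds everywhere afterwards without rechecking.

```rust
// "Parse, don't validate" in miniature: NonEmpty is a made-up type whose
// private field guarantees the non-emptiness invariant after construction.
pub struct NonEmpty(String);

impl NonEmpty {
    // The only way to construct one: parse untrusted input once, up front.
    pub fn parse(s: String) -> Result<NonEmpty, &'static str> {
        if s.is_empty() {
            Err("empty string")
        } else {
            Ok(NonEmpty(s))
        }
    }

    // Every holder of a &NonEmpty can rely on the invariant from here on.
    pub fn as_str(&self) -> &str {
        &self.0
    }
}
```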
Very true, my phrasing was a bit of hyperbole in retrospect. But the bugs I end up finding at runtime when writing Rust are usually deeper issues with my own reasoning about the higher-level problem, rather than my reasoning about the inscrutable low-level machinations of a memory allocator or garbage collector.
If it helps, I was very discouraged during my first two attempts to learn Rust (both in early/mid 2018). I thought I had also reached my limit.
If you haven't tried again in a while, you should give it another attempt -- I found it much easier after the 2018 edition was finalized, in part because of significant improvements to the borrow checker[1]. In terms of programming ergonomics, the pre-2018 and post-2018 versions of the language feel very different, even if they happen to share most of the same syntax.
I've seen people find Rust difficult for three reasons:
* &mut is an "infectious leaky abstraction". It places restrictions on the caller, specifically that nobody else can hold a reference to that data at the same time. This disqualifies many useful patterns such as backreferences, observers, dependency references, graphs, delegates, and certain kinds of RAII.
* Rust tends to lean very heavily on the type system to surface as much detail as possible into e.g. function signatures, but this can conflict with signatures that are set in stone, such as when implementing a trait or exposing a public function. It also does this with non-type-system denizens, such as async/await.
* A reluctance to fall back to Rc and RefCell. Programs often have inherent shared mutability, and the alternatives are often more complex.
These restrictions make the borrow checker incompatible with code we'd naturally write in any other language. Luckily, with enough practice, it can "click" and one can get used to the architectures that are compatible with its restrictions.
The tradeoff isn't ideal for some use cases. For others, it can be a great fit.
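On the Rc/RefCell point above, a minimal single-threaded sketch of that fallback: two owners of the same data, both able to mutate it, with the borrow rules enforced at runtime instead of compile time.

```rust
use std::cell::RefCell;
use std::rc::Rc;

fn main() {
    let shared = Rc::new(RefCell::new(Vec::<i32>::new()));
    let observer = Rc::clone(&shared);

    shared.borrow_mut().push(1);
    observer.borrow_mut().push(2);

    // Overlapping borrow_mut() calls would panic at runtime rather than
    // fail to compile; that's the tradeoff being made here.
    assert_eq!(*shared.borrow(), vec![1, 2]);
}
```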
I can understand Rust when I'm writing my own code. But when it comes to understanding other people's code, it's essentially gibberish.
A lot of library devs seem to have read "strong type system" and taken it as a challenge, meaning that half of their code is actually declarative to the compiler rather than readable in the source file. As a native Java dev this is very annoying.
Are all the other languages fairly mainstream ones? Most procedural/OOP languages are similar enough that if you’ve learnt one you’ve basically learnt them all. Rust is a just a bit different and requires actually learning something new.
For 90% of people I think the main hurdle is realising this.
My advice is to avoid optimization like the plague while learning Rust. Don't worry about extra copying or optimizing lifetimes or any of it. Make all the "extra" structure and String copies.
Doing it the "basic" way is sooooooo much easier, which is important while learning. Nine times out of ten it will be fast/efficient enough anyway.
After you've got a handle on it with a few non-trivial programs under your belt, then start tightening up once you have the necessary context and familiarity.
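A toy illustration of the "just clone it while learning" advice: functions that take owned Strings sidestep lifetime annotations entirely, and a borrowed &str version can come later.

```rust
// Taking an owned String means no lifetime questions for the caller.
fn shout(s: String) -> String {
    s.to_uppercase()
}

fn main() {
    let name = String::from("rust");
    // The clone here is the "extra" copy: cheap to write, fast enough
    // nine times out of ten, and it keeps `name` usable afterwards.
    let loud = shout(name.clone());
    println!("{name} -> {loud}");
}
```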
This is fairly good advice, but I've run into a few cases where codebases don't "optimize lifetimes" and you're left with static lifetimes everywhere. That's OK for some use cases, but lifetimes are a core language concept, and one that is confusing unless you work with them some.
While the Rust syntax around lifetimes can be confusing, The Rust Book has a decent guide to what lifetimes are. Reading up on modern C++ practices and language constructs actually helps with Rust too.
The rustdoc book doesn't have a lot of examples of the full markdown support for documenting your code unfortunately. Browsing through std helps.
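For example, doc comments are full markdown, and fenced code blocks inside them double as doctests (the crate name below is hypothetical):

````rust
/// Adds two numbers.
///
/// Headings, lists, and fenced code blocks all work in rustdoc comments.
///
/// # Examples
///
/// ```
/// // `my_crate` is a made-up name, just for illustration.
/// assert_eq!(my_crate::add(2, 2), 4);
/// ```
pub fn add(a: i32, b: i32) -> i32 {
    a + b
}
````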
Cargo rocks, and it's easy to spin up a new project. One thing I did was create a few new ones just for concepts that were difficult to me at first.
I found the O'Reilly book, "Programming Rust" 2nd edition, to be the best book. It is long but thorough. "The book" might be a better starting point, but is less thorough I thought, and while it left me feeling more like "I got this", I didn't think it prepared me for "real world" Rust quite as well as the O'Reilly book.
I think this (education) is a lacking area in Rust.
You can read the reference book, and even a few others, and still not have a solid grasp of the language. I couldn't find good material past the beginner stage (I consider Rust for Rustaceans advanced).
My personal advice is to find some project you'd like to implement, and practice a lot, or contribute to small projects you like, without worrying about best practices.
There have definitely been many moments where I wanted to close the Rust chapter, and that's why I think it's more important than with other languages to keep the motivation high, which I think is best achieved by working on projects.
I learned C++ and Rust around the same time. Even though C++ was more difficult in every single aspect, I have used it over Rust in every project since. Everything about Rust was great: pattern matching, enums, result types; I could go on forever. But I just can't get in "the zone" when I develop stuff in Rust. I think the mental overhead is just too much for my brain to process. So I'm right there with you, Rust is too smart for me.
It's a feature-rich language with a more complex syntax than many languages, and a community that is still discovering its best practices, critical libraries, and more.
It's a difficult language to learn, full stop. And it's a language that can be used to produce valuable software, but probably should not be the first choice for many applications at most organizations.
Rust can feel VERY intimidating at first, but honestly once you get over the steep but abruptly ending learning hump, writing Rust is no harder than any other language you are familiar with. I only started writing Rust last October, but felt very comfortable within a few months and now feel fairly proficient. As in all things, YMMV
I have the same problem. I was able to pick up a lot of languages to fix programs and/or add features to an existing codebase, but Rust was really frustrating. I've programmed with it for around 3 months and I still need 10 times as long to implement some things (and the async thing is really, really awkward).
Rust has a sort of meta aspect, in the sense of "programming the programming language", that other programming languages don't have. Each problem requires two solutions - the logical one and the programming language one.
It takes time and effort to adjust... a lot of time and effort :)
In my personal experience, there are two major hurdles.
The access model is the first (I believe it's what is commonly referred to as "borrow checking"). One day it just happened that I understood the access model and wasn't screaming in terror anymore :) The same may happen to you, so don't feel incapable in the meanwhile.
Lifetimes are the second hurdle, but even before fully understanding them, you'll be able to work on complex programs.
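If it helps, the access model boils down to one rule: any number of shared borrows, or exactly one mutable borrow, never both at once. A minimal example of where it bites:

```rust
fn main() {
    let mut v = vec![1, 2, 3];

    let first = &v[0]; // shared borrow begins
    // v.push(4);      // rejected while `first` is live:
    //                 // "cannot borrow `v` as mutable because it is
    //                 // also borrowed as immutable"
    println!("{first}");

    v.push(4); // fine: the shared borrow ended at its last use
}
```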
The syntax is really off-putting, but I suspect I'm not alone in that feeling, so someone will make a pretty Rust front-end that transpiles to Rust. I'll wait.
While others are pointing you towards library docs I highly suggest going to the rust book[1] instead. A lot of languages you are fine learning breadth first by going through some documentation and playing around building things. This is how I learn best myself, but it doesn't work well with rust. It has enough unique concepts and concepts you might not have come across unless you have experience with a similar language. It is one language that is best served by a depth first approach. I usually suggest reading chapters 1-11 before trying to dive in and build something.
Because it’s not a programming language but a syntax error. It passes the duck test for a syntax error - it looks like a syntax error, it acts like a syntax error, and it prints a syntax error, so it is a syntax error. Avoid it!
`cargo add` is such a small thing, but I'm so excited about it. It removes many tiny instances of friction: having to open your browser and manually google a crate to find the latest version number so you can copy and paste it into your Cargo.toml.
I see multiple posts about Tauri 1.0 when I search, and I saw the posts about Tauri here when it hit 1.0. Maybe it was taken down afterwards; I'm not sure what transpired, but perhaps it was a misunderstanding on dang's part, where he may be more familiar with Rust's release cadence.
I'm not sure why jumping to the worst possible conclusion and aggressiveness is warranted.
Rather than rely on me to narrate the facts a third time, I encourage you to look at the linked Tauri thread and check the number and vintage of other threads about Tauri.
I did. Rather than have me reiterate my reply once again (see how everyone can be as flippant as you?) you can also see that I mentioned I saw multiple posts about Tauri hitting 1.0, including the post that hit the front page on the day it hit 1.0.
Edit: Rust 1.62 uses futexes directly; parking_lot is a thing, but it's not what backs Mutex on Linux. Thanks to ibraheemdev for the correction.
I'm working to understand how Rust 1.62 avoids allocations for futex-based locks. I think the summary is this:
1. A mutex is a u8. The mutex fast path is locking via atomic ops to set a bit.
2. Contention is resolved by adding the blocked thread into a wait queue. The queue is allocated/discovered via a global hash table, keyed by the address of the mutex. The mutex keeps a stable address while locked, as the lock holds shared ownership and so the mutex cannot be moved for the duration.
I ported a POSIX semaphore wrapper from C++ to Rust and needed an extra allocation for the same reason: to get an int with a stable address. Unfortunately this technique won't work for semaphores, because they need to be async-signal safe. So I'm still stuck with `Pin<Box<UnsafeCell<sem_t>>>`, where in C++ it's just `sem_t` and a deleted move constructor.
Anyways this is a lot of hoops to jump through to avoid non-movable types, hope it's worth it!
> 2. Contention is resolved by adding the blocked thread into a wait queue. The queue is allocated/discovered via a global hash table, keyed by the address of the mutex. The mutex keeps a stable address while locked, as the lock holds shared ownership and so the mutex cannot be moved for the duration.
Wouldn't this break any platform-specific priority/importance donation mechanisms, or does Linux have a way to work around this?
Actually, upon reflection, why wasn't this just trivial for futexes? A Linux futex requires only user-space initialization, and it has a stable address while locked for the same reason, so you get the same zero-allocation behavior without the use of parking lot.
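For the curious, a rough sketch of such a futex-based lock, along the lines of Ulrich Drepper's "Futexes Are Tricky" (not the actual std implementation; Linux-only, using the libc crate). The whole mutex is one atomic u32, and the kernel keys its wait queue on that address:

```rust
use std::sync::atomic::{AtomicU32, Ordering};

// States: 0 = unlocked, 1 = locked, 2 = locked with (possible) waiters.
pub struct RawMutex {
    state: AtomicU32,
}

impl RawMutex {
    pub const fn new() -> Self {
        Self { state: AtomicU32::new(0) }
    }

    fn futex(&self, op: libc::c_int, val: u32) {
        let addr = &self.state as *const AtomicU32 as *mut u32;
        unsafe {
            libc::syscall(libc::SYS_futex, addr, op, val,
                          std::ptr::null::<libc::timespec>());
        }
    }

    pub fn lock(&self) {
        // Fast path: a single uncontended CAS, no syscall, no allocation.
        if self
            .state
            .compare_exchange(0, 1, Ordering::Acquire, Ordering::Relaxed)
            .is_ok()
        {
            return;
        }
        // Slow path: mark contended and sleep until the holder wakes us.
        while self.state.swap(2, Ordering::Acquire) != 0 {
            // The kernel sleeps us only if the state is still 2.
            self.futex(libc::FUTEX_WAIT, 2);
        }
    }

    pub fn unlock(&self) {
        // If anyone may be sleeping, wake exactly one waiter.
        if self.state.swap(0, Ordering::Release) == 2 {
            self.futex(libc::FUTEX_WAKE, 1);
        }
    }
}
```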
Neat. Although the blog post doesn't make this explicit, the new x86_64-unknown-none target disables the red zone, which makes it truly useful in kernels, where an incoming interrupt would otherwise corrupt your stack.
- No need to look up the version (though some IDEs do it for you)
- Auto-completion
- It shows you what features the crate has and whether they are activated, making it easier to discover features you need to enable or what you can remove to improve compile times
- Make it easier to document how to add a set of dependencies needed for a project (e.g. "Run `cargo add serde serde_json -F serde/derive`")
It is impressive the amount of work it took to get this ready. I took over the effort almost a year ago and at times was working full-time on it (thanks to my employer). Just my part included:
- a near rewrite of the format-preserving toml parser (toml_edit)
- a major revamp of the UI
- a major revamp of testing
- a near rewrite to make it compatible with cargo's code base
Generic associated types were supposed to be stabilized in 1.62, but there has been a recent backlash in the PR about what exactly to stabilize [0]. I posted it as its own separate post here [1].
Yes, but note that it's not an implementation of the Ord trait, it's just a convenient opt-in comparison method. You would need to provide it as part of a custom sorting predicate, as shown in the documentation.
Yeah, the difference is that total_cmp can handle NaNs, whereas with partial_cmp you either had to add your own handling, or get panics on NaN if you blindly call partial_cmp.unwrap().
I think in most cases, splitting NaNs across the start and end of the list isn't desired behavior, so I'll probably keep using partial_cmp with my own handling.
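For concreteness, a small example of the two approaches (the vector contents are made up):

```rust
fn main() {
    let mut v = vec![3.0_f64, f64::NAN, 1.0, -f64::NAN, 2.0];

    // total_cmp imposes a total order: -NaN < -inf < ... < +inf < +NaN.
    v.sort_by(|a, b| a.total_cmp(b));
    println!("{v:?}"); // [NaN, 1.0, 2.0, 3.0, NaN]; the negative NaN sorts first

    // The old one-liner panics as soon as a NaN is compared:
    // v.sort_by(|a, b| a.partial_cmp(b).unwrap());
}
```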
I think I'd be happier to see total_cmp anywhere that it's clear we want some consistent order and it's not important what the order actually is.
The hand-rolled solution risks inconsistency. Given the opportunity to pick the order, people are going to pick different orders, sometimes by mistake, and providing total_cmp fixes that.
For types that decided they should be Ord but internally contain an f32, and so need to implement the comparison themselves, total_cmp seems like it's almost always going to be the right choice, unless partial_cmp with unwrap() is definitely correct and they're sure of it.
This is based on the IEEE 754 totalOrder predicate. If you look at how this function works on binary floats, it turns out that it sorts numbers as 0xffff, 0xfffe, 0xfffd, ..., 0x8001, 0x8000, 0x0000, 0x0001, 0x0002, ..., 0x7ffd, 0x7ffe, 0x7fff [1]. This is not a bad sort order, and the reason the NaNs are split up is the existence of negative NaNs.
When it comes to floating point, it's probably better to bow to what IEEE 754 specifies (if it specifies something) than to do what you come up with yourself, even if it's less useful. x != x holding true for NaNs is another example here.
[1] Actually, the sort order of NaN payloads is implementation-defined, but given that the nice bit representation mentioned above is easy to implement, you're probably unlikely to see any other implementation.
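That bit representation is exactly why totalOrder is cheap to implement: treat the bits as a sign-magnitude integer and convert to two's-complement order by flipping all the bits except the sign bit for negative values, then compare as signed integers. A sketch for f32 (std's total_cmp does essentially this):

```rust
use std::cmp::Ordering;

fn total_cmp_sketch(a: f32, b: f32) -> Ordering {
    let mut l = a.to_bits() as i32;
    let mut r = b.to_bits() as i32;
    // For negative values (sign bit set), the XOR mask below is 0x7fff_ffff
    // and flips everything but the sign bit; for non-negative values it's 0.
    l ^= (((l >> 31) as u32) >> 1) as i32;
    r ^= (((r >> 31) as u32) >> 1) as i32;
    l.cmp(&r)
}

fn main() {
    assert_eq!(total_cmp_sketch(-0.0, 0.0), Ordering::Less);
    assert_eq!(total_cmp_sketch(f32::NAN, f32::INFINITY), Ordering::Greater);
}
```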