Author of the original WTF::ParkingLot here (what rust’s parking_lot is based on...

scottlamb · 2025-11-24T17:41:36 1764006096

> The main benefit of parking lot is that it makes locks very small, which then encourages the use of fine grained locking. For example, in JavaScriptCore (ParkingLot’s first customer), we stuff a 2-bit lock into every object header - so if there is ever a need to do some locking for internal VM reasons on any object we can do that without increasing the size of the object

IMHO that's a very cool feature which is essentially wasted when using it as a `Mutex<InnerBlah>` because the mutex's size will get rounded up to the alignment of `InnerBlah`. And even when not doing that, afaict `parking_lot` doesn't expose a way to use the remaining six bits in `parking_lot::RawMutex`. I think the new std mutexes made the right choice to use a different design.
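To make the rounding concrete, here is a minimal sketch that models the layout with stand-in types rather than the real `parking_lot::Mutex`. `ByteLock`, `Guarded`, `lock_overhead`, and the fields of `InnerBlah` are hypothetical names; the only assumption is a one-byte lock stored next to the data it guards.

```rust
use std::mem::{align_of, size_of};
use std::sync::atomic::AtomicU8;

/// Hypothetical stand-in for a one-byte raw mutex (two state bits, six unused).
#[allow(dead_code)]
struct ByteLock(AtomicU8);

/// Hypothetical payload; the `u64` fields give it a multi-byte alignment.
#[allow(dead_code)]
struct InnerBlah {
    a: u64,
    b: u64,
}

/// Hypothetical stand-in for a mutex that keeps its lock byte next to the data.
#[allow(dead_code)]
struct Guarded<T> {
    lock: ByteLock,
    data: T,
}

/// Bytes added by the lock byte plus the padding it drags in.
fn lock_overhead<T>() -> usize {
    size_of::<Guarded<T>>() - size_of::<T>()
}

fn main() {
    println!(
        "InnerBlah: size {}, align {}",
        size_of::<InnerBlah>(),
        align_of::<InnerBlah>()
    );
    println!("Guarded<InnerBlah>: size {}", size_of::<Guarded<InnerBlah>>());
    println!("overhead: {} bytes", lock_overhead::<InnerBlah>());
}
```

On targets where `u64` is 8-byte aligned (x86-64, for example), the overhead comes out as 8 bytes: 1 for the lock and 7 of padding.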

> I’m surprised that this only compared to std on one platform (Linux).

Can't speak for the author, but I suspect a lot of people really only care about performance under Linux. I write software that I often develop from a Mac but almost entirely deploy on Linux. (But speaking of Macs: `std::sync::Mutex` doesn't yet use futexes on macOS. Might happen soon. https://github.com/rust-lang/rust/pull/122408)

Rusky · 2025-11-25T03:00:17 1764039617

Hypothetically Rust could make `Mutex<InnerBlah>` work with just two bits in the same way it makes `Option<&T>` the same size as `&T`. Annotate `InnerBlah` with the information about which bits are available and let `Mutex` use them.

scottlamb · 2025-11-25T04:06:27 1764043587

There was talk of Rust allowing stride != alignment. [1] I think this would mean if say `InnerBlah` has size 15 and alignment 8, `parking_lot::Mutex<InnerBlah>` can be size 16 rather than the current 24. Same would be true for an `OuterBlah` the mutex is one field of. But I don't think it'll happen.
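For reference, this is roughly what the layout looks like today, where a type's size is always rounded up to a multiple of its alignment. It is a sketch with hypothetical stand-in types (`InnerBlah` as 15 bytes of data forced to alignment 8, `Guarded` as a one-byte lock next to the data), not the real `parking_lot` types.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical payload: 15 bytes of data with a forced alignment of 8.
#[allow(dead_code)]
#[repr(align(8))]
struct InnerBlah([u8; 15]);

/// Hypothetical stand-in for a mutex with a one-byte lock next to the data.
#[allow(dead_code)]
struct Guarded<T> {
    lock: u8,
    data: T,
}

fn main() {
    // Size is rounded up to a multiple of the alignment, so the 15 data
    // bytes occupy 16 and the trailing padding byte belongs to InnerBlah.
    println!(
        "InnerBlah: size {}, align {}",
        size_of::<InnerBlah>(),
        align_of::<InnerBlah>()
    );
    // The lock cannot reuse that padding byte, so it costs another 8 bytes.
    println!("Guarded<InnerBlah>: size {}", size_of::<Guarded<InnerBlah>>());
}
```

With stride distinct from size, the lock byte could in principle live in that trailing padding byte, which is where the 16-versus-24 difference comes from.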

[1] e.g. https://internals.rust-lang.org/t/pre-rfc-allow-array-stride...

timClicks · 2025-11-25T03:37:48 1764041868

References only have a single value available as a niche (the null pointer), which Option makes use of for null pointer optimization (https://doc.rust-lang.org/std/option/index.html#representati...).

In principle, Rust could create something like std::num::NonZero and its corresponding sealed trait ZeroablePrimitive to mark that two bits are unused. But that doesn't exist yet as far as I know.
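The niche behaviour that exists today can be checked directly with `size_of`. This small sketch uses only standard library types; it shows what the compiler already does for references and `NonZeroU8`, not the hypothetical "two unused bits" marker.

```rust
use std::mem::size_of;
use std::num::NonZeroU8;

fn main() {
    // The null pointer is never a valid reference, so `None` can use it.
    assert_eq!(size_of::<Option<&u32>>(), size_of::<&u32>());
    // Zero is never a valid NonZeroU8, so `None` can use it.
    assert_eq!(size_of::<Option<NonZeroU8>>(), size_of::<NonZeroU8>());
    // Every bit pattern of a plain u8 is valid, so Option needs a tag byte.
    assert_eq!(size_of::<Option<u8>>(), 2);
    println!("all niche checks passed");
}
```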

Rusky · 2025-11-25T04:32:18 1764045138

There are also currently the unstable rustc_layout_scalar_valid_range_start and rustc_layout_scalar_valid_range_end attributes (which are used in the definition of NonNull, etc.) which could be used for some bit patterns.

Also aspirations to use pattern types for this sort of thing: https://github.com/rust-lang/rust/issues/135996

torginus · 2025-11-25T14:55:10 1764082510

I think he meant 1 byte on the heap for the shared state, on the stack it's larger.

Which is fine since in Rust we almost always have the mutex in function scope as long as we're using it.

pizlonator · 2025-11-24T19:17:08 1764011828

> I suspect a lot of people really only care about performance under Linux

Yeah this is true

skitter · 2025-11-25T02:09:05 1764036545

I do the same in my toy JVM (to implement the reentrant mutex+condition variable that every Java object has), except I've got a rare deadlock somewhere because, as it turns out, writing complicated low level concurrency primitives is kinda hard :p

nextaccountic · 2025-11-24T17:31:53 1764005513

How can a parking_lot lock be less than 1 byte? Does this use unsafe?

Rust in general doesn't support bit-level objects unless you cast things to [u8] and do some shifts and masking manually (that is, like C), which of course is wildly unsafe for data structures with safety invariants

pizlonator · 2025-11-24T19:20:41 1764012041

Original post: https://webkit.org/blog/6161/locking-in-webkit/

Post that mentions the two bit lock: https://webkit.org/blog/7122/introducing-riptide-webkits-ret...

I don’t know the details of the Rust port but I don’t imagine the part that involves the two bits to require unsafe, other than in the ways that any locking algorithm dances with unsafety in Rust (ownership relies on locking algorithms being correct)

writebetterc · 2025-11-24T19:49:58 1764013798

This is very similar to how Java's object monitors are implemented. In OpenJDK, the markWord uses two bits to describe the state of an Object's monitor (see markWord.hpp:55). On contention, the monitor is said to become inflated, which basically means revving up a heavier lock and knowing how to find it.

I'm a bit disappointed though, I assumed that you had a way of only using 2 bits of an object's memory somehow, but it seems like the lock takes a full byte?

pizlonator · 2025-11-25T03:49:02 1764042542

The lock takes two bits.

It’s just that if you use the WTF::Lock class, then you get a full byte simply because the smallest possible size of a class instance in C++ is one byte.

But there’s a template mixin thing you can use to get it to be two bits (you tell the mixin which byte to steal the two bits from and which two bits).
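As a rough Rust illustration of stealing lock bits from a byte that also holds other data, here is a sketch of only the uncontended fast path. `try_lock_bit` and `unlock_bit` are hypothetical helpers, not part of WTF or `parking_lot`, and a full lock would also need a second bit to record parked waiters plus a slow path that actually parks them.

```rust
use std::sync::atomic::{AtomicU8, Ordering};

/// Try to take the lock stored in bit `bit` of `byte`; other bits are untouched.
fn try_lock_bit(byte: &AtomicU8, bit: u32) -> bool {
    let mask = 1u8 << bit;
    // fetch_or returns the previous value: we won only if the bit was clear.
    byte.fetch_or(mask, Ordering::Acquire) & mask == 0
}

/// Release the lock bit, again leaving the other bits alone.
fn unlock_bit(byte: &AtomicU8, bit: u32) {
    byte.fetch_and(!(1u8 << bit), Ordering::Release);
}

fn main() {
    // Hypothetical header byte: lock in bit 0, unrelated data in the high bits.
    let header = AtomicU8::new(0b1010_0000);
    assert!(try_lock_bit(&header, 0));
    assert!(!try_lock_bit(&header, 0)); // already held
    unlock_bit(&header, 0);
    assert_eq!(header.load(Ordering::Relaxed), 0b1010_0000);
    println!("header bits preserved");
}
```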

I suspect the same situation holds in the Rust port.

I am very familiar with how Java does locks. This is different. Look at the ParkingLot/parking_lot API. It lets you do much more than just locks, and there’s no direct equivalent of what Java VMs call the inflated or fat lock. The closest thing is the on demand created queue keyed by address.

writebetterc · 2025-11-29T10:35:44 1764412544

> I am very familiar with how Java does locks. This is different. Look at the ParkingLot/parking_lot API. It lets you do much more than just locks, and there’s no direct equivalent of what Java VMs call the inflated or fat lock. The closest thing is the on demand created queue keyed by address.

Are you familiar with the new LightweightSynchronizer approach with an indirection via a table, instead of overwriting the markWord? I'd say that this has pushed the ParkingLot approach and Java's (Hotspot's, really) approach closer to each other than before. I think the table approach in Java could be encoded trivially into the ParkingLot API, and maybe the opposite too. Obviously the latter would be a lot more hamfisted.

zozbot234 · 2025-11-24T19:56:11 1764014171

The idea is that six bits in the byte are free to use as you wish. Of course you'll need to implement operations on those six bits as CAS loops (which nonetheless allow for any arbitrary RMW operation) to avoid interfering with the mutex state.
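A minimal sketch of such a CAS loop, assuming the lock state sits in the low two bits and the caller's data in the upper six. `LOCK_MASK`, `PAYLOAD_SHIFT`, `store_payload`, and `load_payload` are hypothetical names; `AtomicU8::fetch_update` from the standard library runs the compare-and-swap retry loop.

```rust
use std::sync::atomic::{AtomicU8, Ordering};

/// Assumed layout: low two bits hold the lock state, upper six are free.
const LOCK_MASK: u8 = 0b0000_0011;
const PAYLOAD_SHIFT: u32 = 2;

/// Replace the six payload bits while preserving whatever the lock bits are.
fn store_payload(byte: &AtomicU8, payload: u8) {
    debug_assert!(payload < 64, "payload must fit in six bits");
    // fetch_update retries the closure until the compare-and-swap succeeds,
    // so a concurrent change to the lock bits is never overwritten.
    let _ = byte.fetch_update(Ordering::AcqRel, Ordering::Acquire, |old| {
        Some((old & LOCK_MASK) | (payload << PAYLOAD_SHIFT))
    });
}

/// Read the six payload bits, ignoring the lock bits.
fn load_payload(byte: &AtomicU8) -> u8 {
    byte.load(Ordering::Acquire) >> PAYLOAD_SHIFT
}

fn main() {
    // Start with the lock bit set, as if some thread currently holds the lock.
    let byte = AtomicU8::new(0b01);
    store_payload(&byte, 42);
    assert_eq!(load_payload(&byte), 42);
    assert_eq!(byte.load(Ordering::Relaxed) & LOCK_MASK, 0b01);
    println!("payload stored without touching the lock bits");
}
```

The lock side has to cooperate in the same way: it may only change its own two bits with masked atomic operations, never with a plain store of the whole byte.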

bobbylarrybobby · 2025-11-24T22:37:42 1764023862

The lock uses two bits but still takes up a whole (atomic) byte

Conscat · 2025-11-24T17:44:31 1764006271

This article elaborates how it works.

scottlamb · 2025-11-24T20:33:53 1764016433

Unhelpful response. This cuongle.dev article does not answer nextaccountic's question, and neither do the webkit.org articles that describe the parking lot concept but not this Rust implementation. The correct answer appears to be that it's impossible: `parking_lot::RawMutex` has private storage that owns the entire byte and does not provide any accessor for the unused six bits.

https://docs.rs/parking_lot/0.12.5/parking_lot/struct.RawMut...

(unless there's somewhere else in the crate that provides an accessor for this but that'd be a weird interface)

(or you just use transmute to "know" that it's one byte and which bits within the byte it actually cares about, but really don't do that)

(slightly more realistically, you could probably use the `parking_lot_core::park` portion of the implementation and build your own equivalent of `parking_lot::RawMutex` on top of it)

(or you send the `parking_lot` folks a PR to extend `parking_lot::RawMutex` with the interface you want; it is open source after all)

pizlonator · 2025-11-25T03:52:09 1764042729

> and neither do the webkit.org articles that describe the parking lot concept but not this Rust implementation

The WebKit post explicitly talks about how you just need two bits to describe the lock state.

> The correct answer appears to be that it's impossible: `parking_lot::RawMutex` has private storage that owns the entire byte and does not provide any accessor for the unused six bits.

Not impossible. One way to do this is to just use parking_lot directly.

In WebKit there’s a template mixin that lets you steal two bits for locking however you like. JavaScriptCore uses this to steal two bits from the indexing type byte (if I remember right)

scottlamb · 2025-11-25T16:59:34 1764089974

> The WebKit post explicitly talks about how you just need two bits to describe the lock state.

It describes the algorithm but not how a caller of the Rust `parking_lot` crate could take advantage of this.

> Not impossible. One way to do this is to just use parking_lot directly.

By "just use parking_lot directly", I think you're talking about reimplementing the parking lot algorithm or using the C++ `WTF::ParkingLot` implementation? But not actually using the existing Rust crate called `parking_lot` described in the cuongle.dev article? That's confusingly put, and nextaccountic's question is certainly Rust-specific and likely expecting an answer relating to this particular crate. At the least, "does this use unsafe" would certainly be true with an implementation from scratch or when using FFI into C++.

I hear that this algorithm and the C++ implementation are your invention, and all due respect for that. I'm also hearing that you are not familiar with this Rust implementation. It does not offer the main benefit you're describing. `parking_lot::RawMutex` is a one-byte type; that six bits within it are unused is true but something callers cannot take advantage of. Worse, `parking_lot::Mutex<InnerFoo>` in practice is often a full word larger than `InnerFoo` due to alignment padding. As such, there's little benefit over a simpler futex-based approach.

pizlonator · 2025-11-26T05:52:02 1764136322

> It describes the algorithm but not how a caller of the Rust `parking_lot` crate could take advantage of this.

Read the WebKit post.

> By "just use parking_lot directly", I think you're talking about reimplementing the parking lot algorithm or using the C++ `WTF::ParkingLot` implementation? But not actually using the existing Rust crate called `parking_lot` described in the cuongle.dev article?

See https://docs.rs/parking_lot_core/latest/parking_lot_core/

That's my ParkingLot API. You can use it to implement many kinds of locks, including:

- Efficient ones that use a tristate, like the glibc lowlevellock, or what I call the Cascade lock. So, this doesn't even need two bits.

- The lock algorithm I prefer, which uses two bits.

- Lots of other algorithms. You can do very efficient condition variables, rwlocks, counting locks, etc.
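To give a feel for the first kind of lock in that list, here is a std-only sketch of a tristate lock built on an address-keyed wait queue. It does not use `parking_lot_core`: the `park` and `unpark_one` below are simplified, hypothetical stand-ins (the real crate's functions have richer signatures), and `Waiter`, `lot`, `TriLock`, and `run_demo` are made-up names. The state machine is the familiar futex-style one: unlocked, locked, and locked with possible waiters.

```rust
use std::collections::{HashMap, VecDeque};
use std::sync::atomic::{AtomicU8, AtomicUsize, Ordering};
use std::sync::{Arc, Condvar, Mutex, OnceLock};

/// One sleeping thread: a flag plus a condition variable to wake it.
struct Waiter {
    woken: Mutex<bool>,
    cv: Condvar,
}

type Queues = HashMap<usize, VecDeque<Arc<Waiter>>>;

/// Global table of wait queues, created on demand and keyed by address.
fn lot() -> &'static Mutex<Queues> {
    static LOT: OnceLock<Mutex<Queues>> = OnceLock::new();
    LOT.get_or_init(|| Mutex::new(HashMap::new()))
}

/// Sleep on `key`, but only if `validate` still holds with the table locked.
fn park(key: usize, validate: impl FnOnce() -> bool) {
    let me = Arc::new(Waiter { woken: Mutex::new(false), cv: Condvar::new() });
    {
        let mut queues = lot().lock().unwrap();
        if !validate() { return; }
        queues.entry(key).or_default().push_back(Arc::clone(&me));
    }
    let mut woken = me.woken.lock().unwrap();
    while !*woken {
        woken = me.cv.wait(woken).unwrap();
    }
}

/// Wake the longest-waiting thread parked on `key`, if there is one.
fn unpark_one(key: usize) {
    let waiter = {
        let mut queues = lot().lock().unwrap();
        let Some(queue) = queues.get_mut(&key) else { return };
        let waiter = queue.pop_front();
        if queue.is_empty() { queues.remove(&key); }
        waiter
    };
    if let Some(w) = waiter {
        *w.woken.lock().unwrap() = true;
        w.cv.notify_one();
    }
}

const UNLOCKED: u8 = 0;
const LOCKED: u8 = 1;
const CONTENDED: u8 = 2; // locked, and someone may be parked

/// Tristate lock: the state lives in one byte, the queue lives in the lot.
struct TriLock(AtomicU8);

impl TriLock {
    fn key(&self) -> usize {
        &self.0 as *const AtomicU8 as usize
    }
    fn lock(&self) {
        if self.0.compare_exchange(UNLOCKED, LOCKED, Ordering::Acquire, Ordering::Relaxed).is_ok() {
            return;
        }
        // Slow path: advertise contention, then sleep while it stays that way.
        while self.0.swap(CONTENDED, Ordering::Acquire) != UNLOCKED {
            park(self.key(), || self.0.load(Ordering::Relaxed) == CONTENDED);
        }
    }
    fn unlock(&self) {
        if self.0.swap(UNLOCKED, Ordering::Release) == CONTENDED {
            unpark_one(self.key());
        }
    }
}

/// Four threads do a deliberately non-atomic increment under the lock.
fn run_demo() -> usize {
    let lock = TriLock(AtomicU8::new(UNLOCKED));
    let counter = AtomicUsize::new(0);
    std::thread::scope(|s| {
        for _ in 0..4 {
            s.spawn(|| {
                for _ in 0..1000 {
                    lock.lock();
                    let n = counter.load(Ordering::Relaxed);
                    counter.store(n + 1, Ordering::Relaxed);
                    lock.unlock();
                }
            });
        }
    });
    counter.load(Ordering::Relaxed)
}

fn main() {
    println!("final count: {}", run_demo());
}
```

The point of the design is that the lock itself never stores a queue: waiters are found through the lock's address, so the per-lock cost stays at the few bits needed for the state.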

You can do a lot of useful algorithms with fewer than 8 bits. You don't have to use the C++ ParkingLot. You don't have to implement parking_lot.

What you do have to do is RTFM

scottlamb · 2025-11-26T16:14:59 1764173699

> Read the WebKit post.

Clearly I have already.

> See https://docs.rs/parking_lot_core/latest/parking_lot_core/ ... That's my ParkingLot API.

"just use parking_lot directly" is a weird way to say "use the `parking_lot_core` crate instead of the `parking_lot` crate".

...and note that I mentioned this in my earlier comment: (slightly more realistically, you could probably use the `parking_lot_core::park` portion of the implementation and build your own equivalent of `parking_lot::RawMutex` on top of it)

I'm not trying to be disagreeable here, but really you could save a lot of trouble if you were a bit more careful in your communication.

loeg · 2025-11-24T21:40:01 1764020401

The two bit lock was specifically referring to the C++ WTF::ParkingLot (and the comment mentioning it explicitly said that). nextaccountic is confused.

scottlamb · 2025-11-24T21:42:17 1764020537

No. nextaccountic's comment and the cuongle.dev article are both talking about Rust. The Rust `parking_lot` implementation only uses two bits within a byte, but it doesn't provide a way for anything else to use the remaining six.

pizlonator's comments mention both the (C++) WTF::ParkingLot and the Rust `parking_lot`, and they don't answer nextaccountic's question about the latter.

> nextaccountic is confused.

nextaccountic asked how this idea could be applied to this Rust implementation. That's a perfectly reasonable question. pizlonator didn't know the answer. That's perfectly reasonable too. Conscat suggested the article would be helpful; that was wrong.

loeg · 2025-11-24T21:43:12 1764020592

nextaccountic replied to this original comment: https://news.ycombinator.com/item?id=46035698

Yes, nextaccountic's reply is confused about Rust vs C++ implementations. But the original mention was not talking about Rust.