Isn't manually locking and unlocking (as seen in the examples) frowned upon anyway?
In any development I do, and with every developer I've ever worked with on an internal project, the use of std::lock_guard<> (which is a scoped lock) was required when working with mutexes. This code looks like that classic "C with arrays" type of C++ code that is prone to errors that have long been solved.
As a rule of thumb, when prototyping new multithreaded code where I have a class that needs to be thread safe, I add a scoped lock in every member function that is not const. Problem solved, your class is now thread safe, although not very performant...
These analysis passes also work just fine with scoped locks, using the SCOPED_CAPABILITY annotation (see the included mutex.h example).
In any case, scoped locks by themselves cannot statically rule out the same kinds of errors that this analysis can. An obvious (but trivial) example, even if you only use scoped locks, is holding a lock and then attempting to take it again on some control path (perhaps through a recursive or deep call stack, or, in your case, by calling another member function which also takes the lock unconditionally). This is a trivial case for this analysis to detect, but std::lock_guard<> alone will not save you.
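A rough sketch of that case, assuming annotation macros in the style of the mutex.h example from the Clang docs. The `TSA` helper macro and the `Counter` class are made up for illustration, and the macros expand to nothing on other compilers. The public methods are marked EXCLUDES, so calling one of them while the lock is already held should be reported when building with Clang and -Wthread-safety.

```cpp
#include <cassert>
#include <mutex>

// Annotation macros modelled on the Clang docs; no-ops on other compilers.
#if defined(__clang__)
#define TSA(x) __attribute__((x))
#else
#define TSA(x)
#endif
#define CAPABILITY(x) TSA(capability(x))
#define SCOPED_CAPABILITY TSA(scoped_lockable)
#define GUARDED_BY(x) TSA(guarded_by(x))
#define ACQUIRE(...) TSA(acquire_capability(__VA_ARGS__))
#define RELEASE(...) TSA(release_capability(__VA_ARGS__))
#define REQUIRES(...) TSA(requires_capability(__VA_ARGS__))
#define EXCLUDES(...) TSA(locks_excluded(__VA_ARGS__))

// Thin annotated wrapper around std::mutex.
class CAPABILITY("mutex") Mutex {
 public:
  void Lock() ACQUIRE() { mu_.lock(); }
  void Unlock() RELEASE() { mu_.unlock(); }

 private:
  std::mutex mu_;
};

// Scoped lock: acquires in the constructor, releases in the destructor.
class SCOPED_CAPABILITY MutexLocker {
 public:
  explicit MutexLocker(Mutex* mu) ACQUIRE(mu) : mu_(mu) { mu_->Lock(); }
  ~MutexLocker() RELEASE() { mu_->Unlock(); }

 private:
  Mutex* mu_;
};

class Counter {
 private:
  Mutex mu_;
  int value_ GUARDED_BY(mu_) = 0;

  // Helper that may only be called with mu_ already held.
  void IncrementLocked() REQUIRES(mu_) { ++value_; }

 public:
  // Public entry points lock for themselves, so callers must not hold mu_.
  void Increment() EXCLUDES(mu_) {
    MutexLocker lock(&mu_);
    IncrementLocked();
  }

  void IncrementTwice() EXCLUDES(mu_) {
    MutexLocker lock(&mu_);
    // Calling Increment() here would lock mu_ a second time. Because
    // Increment() is marked EXCLUDES(mu_), the analysis should flag it.
    IncrementLocked();
    IncrementLocked();
  }

  int Get() EXCLUDES(mu_) {
    MutexLocker lock(&mu_);
    return value_;
  }
};

// Small usage check.
inline void CheckCounter() {
  Counter c;
  c.IncrementTwice();
  assert(c.Get() == 2);
}
```

The same split (public methods that lock, private `...Locked` helpers that require the lock) is also what makes the annotations read as documentation.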
The annotations also serve as useful commentary and 'guiding tools' for refactoring as well, in my experience. If you annotate a function as yielding or taking a lock, and later change this -- the compiler will warn you that your annotation is out of sync with the implementation. In practice I find this quite useful to keep code clean, but it also obviously helps tell you if your change is correct, too.
I normally just turn this analysis on in Clang, because IME it has very little negative effect and is very useful in practice.
> when you have a lock held but then attempt to take a lock again in some control path (perhaps through a recursive or deep call stack, or, in your case, calling another member function
The standard library provides std::recursive_mutex which supports being locked multiple times by the same thread. Unfortunately, it doesn't provide an std::recursive_shared_timed_mutex for the general case where you might want to support multiple simultaneous read operations. The SaferCPlusPlus library does implement such a mutex, and, like I mentioned in another comment, provides data types that automatically control access to shared objects. I think for most cases, it's the safest and easiest way to share objects asynchronously (in C++). (The data types are kind of analogous to Arc<> and Arc< Mutex<> > for those familiar with Rust.)
Btw, the implementation is self-contained, so you can use the safe sharing data types without having to include the rest of the library. Or you can just copy the source code and make your own safe sharing data types.
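To make the std::recursive_mutex point concrete, here is a minimal sketch (the `Registry` class is just an illustration): one member function takes the lock and then calls another that takes it again on the same thread.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <vector>

class Registry {
 public:
  void Add(int id) {
    std::lock_guard<std::recursive_mutex> lock(mu_);
    ids_.push_back(id);
  }

  void AddPair(int a, int b) {
    std::lock_guard<std::recursive_mutex> lock(mu_);
    // Each Add() locks mu_ again on the same thread. That is allowed for
    // std::recursive_mutex; with a plain std::mutex it would be undefined
    // behavior (in practice, usually a deadlock).
    Add(a);
    Add(b);
  }

  std::size_t Size() {
    std::lock_guard<std::recursive_mutex> lock(mu_);
    return ids_.size();
  }

 private:
  std::recursive_mutex mu_;
  std::vector<int> ids_;
};

// Small usage check.
inline void CheckRegistry() {
  Registry r;
  r.AddPair(1, 2);
  assert(r.Size() == 2);
}
```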
There are many people who would largely consider recursive mutexes a complete mistake, for the most part. I'm one of those people. :)
But I admit I sort of willfully skipped over this point precisely because I dislike recursive mutexes. But they are there for this case, you are 100% right.
The very fact that atomics do lock means that you cannot properly write lock-free data structures with them.
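For reference, whether a given std::atomic<T> is backed by a lock is implementation-defined in C++, and it can be queried with is_lock_free(); only std::atomic_flag is guaranteed to be lock-free. A small sketch (the function names are made up for illustration) that does the query and shows the usual compare-and-swap retry loop:

```cpp
#include <atomic>
#include <cassert>

// Reports whether this platform implements std::atomic<int> without a lock.
// The answer is implementation-defined, so it is not asserted anywhere here.
inline bool IntAtomicIsLockFree() {
  std::atomic<int> probe{0};
  return probe.is_lock_free();
}

// Adds `delta` to `target` using a compare-and-swap retry loop and returns
// the value that was stored.
inline int AddWithCas(std::atomic<int>& target, int delta) {
  int seen = target.load();
  while (!target.compare_exchange_weak(seen, seen + delta)) {
    // On failure, `seen` has been refreshed with the current value; retry.
  }
  return seen + delta;
}

// Small usage check.
inline void CheckAddWithCas() {
  std::atomic<int> counter{5};
  assert(AddWithCas(counter, 3) == 8);
  assert(counter.load() == 8);
}
```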
"Lock free" typically refers to code that can prevent blocking operations; atomics do not do that, though some people think of atomics as a simple lock-free method, it really isn't. Lock-free is about algorithm design that can prevent locking in the first place, and atomics still lock, even if they hide that from you. If you mentioned atomics as a lock-free tool in an interview, you'd likely be chided.
Looks interesting, especially as a way to introduce some extra thread safety checking into legacy codebases. Or improve the quality of new ones if C++ is the target (yes, I know that Rust can do similar things without extra tools).
If anyone is familiar with it, I have a question about it: an often-used pattern besides locking is that data is owned by a particular thread, and only that thread is allowed to mutate the data. E.g. in GUIs only the main thread can mutate it. I guess that can also be expressed with the tool, but examples would be welcome. I guess the owning thread could have the capability, but a challenge could be that the start and end of the owning thread might be encapsulated by a library (e.g. Qt or the Windows main loop), and be slightly hidden from the implementation of components. The goal should be that if I do
field = value;
inside an arbitrary thread I get a warning. And if I do
owningThreadWithEventLoop.post([](){ field = value; });
I get no warning.
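One way this might be expressed, as a sketch only: treat "running on the owning thread" as a capability. `ThreadRole`, `ui_thread`, and the helper functions below are hypothetical names, not part of Qt, Windows, or Clang; only the attributes themselves come from the Clang thread safety docs. As far as I know the analysis works per function and does not follow a closure into the event loop, so the posted task asserts the role at runtime at its start, which also tells the analysis the role is held from there on.

```cpp
#include <cassert>
#include <thread>

#if defined(__clang__)
#define TSA(x) __attribute__((x))
#else
#define TSA(x)
#endif

// Hypothetical capability meaning "we are running on the owning thread".
class TSA(capability("role")) ThreadRole {
 public:
  // Call once on the owning thread, e.g. right before its event loop starts.
  void BindToCurrentThread() { owner_ = std::this_thread::get_id(); }

  // Runtime check; the attribute tells the analysis that the role is held
  // after this call returns.
  void AssertOnOwner() TSA(assert_capability(this)) {
    assert(std::this_thread::get_id() == owner_);
  }

 private:
  std::thread::id owner_;
};

ThreadRole ui_thread;
int field TSA(guarded_by(ui_thread)) = 0;

// Writing `field = value;` in an unannotated function should produce a
// -Wthread-safety warning; this function instead demands the role.
void SetField(int value) TSA(requires_capability(ui_thread)) { field = value; }

// Body of a task as it would be posted to the owning thread's loop.
void PostedSetField(int value) {
  ui_thread.AssertOnOwner();
  SetField(value);
}

// Reads also go through the role.
int ReadFieldOnOwner() {
  ui_thread.AssertOnOwner();
  return field;
}
```

The lambda from the question would then start with `ui_thread.AssertOnOwner();` before touching `field`.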
If a method requires that it's called while a mutex is held, I make it explicit by declaring the function as `void f(std::unique_lock<std::mutex>&)` [requiring a reference ensures that you have to have a constructed object before function entry]. Granted, this approach doesn't associate the mutex with its data.
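A minimal sketch of that pattern (the `Account` class is made up for illustration). The runtime assert is an extra I added to partly cover the "not associated with its data" gap; it only fires in debug builds.

```cpp
#include <cassert>
#include <mutex>

class Account {
 public:
  void Deposit(int amount) {
    std::unique_lock<std::mutex> lock(mu_);
    AddLocked(lock, amount);
  }

  int Balance() {
    std::unique_lock<std::mutex> lock(mu_);
    return balance_;
  }

 private:
  // Taking the lock by reference forces callers to have constructed one.
  void AddLocked(std::unique_lock<std::mutex>& lock, int amount) {
    // The type system cannot tell which mutex the lock belongs to, or
    // whether it is currently locked, so check both at runtime.
    assert(lock.owns_lock() && lock.mutex() == &mu_);
    balance_ += amount;
  }

  std::mutex mu_;
  int balance_ = 0;
};
```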
Having recently done some coding in Swift to implement a little toy Hypervisor, I sorely missed these annotations. By and large the errors they catch are ones I'd be embarrassed to have pointed out in a code review; I don't mind nearly so much when the compiler catches me without my mutex locked.
When data races are the concern, SaferCPlusPlus provides nice data types for safely sharing objects asynchronously[1]. Basically, glorified shared_ptrs that automatically handle all the necessary locking (and blocking).
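To give a feel for the idea, here is a rough sketch of a "shared_ptr that locks for you". This is not the SaferCPlusPlus API; `Shared` and `With` are hypothetical names, and the real library's types differ.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <utility>

// Copyable handle to a value that can only be reached while its mutex is
// held. Copies of the handle share the same value and the same mutex.
template <typename T>
class Shared {
 public:
  explicit Shared(T initial)
      : state_(std::make_shared<State>(std::move(initial))) {}

  // Runs `fn` with exclusive access to the value and returns its result.
  template <typename Fn>
  auto With(Fn&& fn) const -> decltype(fn(std::declval<T&>())) {
    std::lock_guard<std::mutex> lock(state_->mu);
    return fn(state_->value);
  }

 private:
  struct State {
    explicit State(T v) : value(std::move(v)) {}
    std::mutex mu;
    T value;
  };

  std::shared_ptr<State> state_;
};

// Small usage check.
inline void CheckShared() {
  Shared<int> a(1);
  Shared<int> b = a;  // second handle to the same value
  b.With([](int& v) { v += 1; });
  assert(a.With([](int& v) { return v; }) == 2);
}
```

Nothing stops a callback from leaking a reference to the value out of `With`, so this is weaker than what Rust's Arc<Mutex<>> enforces.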
The Checker Framework is also nice for helping with thread safety: https://checkerframework.org/. It's an extended type system for the Java language using annotated variables and methods. It's really nice and helped catch quite a few subtle bugs at low cost.
Rust is lots of things, not the least of which includes static guarantees of safety, including safe concurrent memory usage.
If you asked them, I'd bet the C++ design team would admit lots of faults in the language design. Some of those are due to a need to preserve compatibility with C and older C++ code. Some are just lessons learned as the language grows and finds new uses. Others are because computers used for software development and targets on which binaries would execute were very different from computers today.
I think there's a case for looking back and asking "if we were to invent a language like C or C++ today, knowing what we know now, what would it look like?" I don't think that Rust's origin quite spawned from that, but it certainly looks like it fits that bill.
This particular feature is yet-another-static-analysis-addendum-to-C++. It's a great complement to TSan. But IMO if there were ever something in this vein that could be considered "a Rust killer" it would have been TSan. No one likes going back and annotating their code. People can barely be bothered to make a new build target with "-fsanitize=thread".
> So why do have to manually mutex or semaphore your threads? This is not a type system with concurrency capabilities is meant for.
You don't have to use a mutex! The Rust type system has two traits to ensure thread safety, Send and Sync, and you can safely use anything which implements those traits. Mutex is one easy way to get these capabilities, but it's not the only one. You can build your own thread-safe data structure if you want, or even use third-party ones like crossbeam[1]. The compiler will then ensure that you never use non-thread-safe objects across threads.
> At least parrot and pony do away without that nonsense.
What is parrot? I've read about pony but not parrot, and Google isn't helping.
> Oh my, those rusters, still downvoting what they have no idea about.
Please don't include these sorts of things in comments. They're worse than unnecessary: they take us further away from the kinds of thoughtful, civil discussions we're after.
What specifically do people not have an idea about?
> So why do have to manually mutex or semaphore your threads?
You don't _have_ to, but it is an option. Rust, as a low-level language, provides a number of options for dealing with concurrency. Frankly, I wrote that chapter above and in a hurry; it's my least favorite out of the whole book.
Rust can prove a lack of data-races at compile time, which is a big deal.
This is interesting, but I am curious why the effort has gone into this C++ tool when Google's own multi-threading-first language, Go, is based on Tony Hoare's work from decades ago, except that it omits the most interesting feature of CSP, the specialized calculus theorem that can prove the correctness of all your multithreaded logic. It would seem that those interesting ideas were ignored in favor of providing tooling for C++.
For the record, Go has no static thread safety guarantee. It's easy to write a data race in Go (and it occurs frequently). Fortunately the Go compiler has a tool that helps detect data races[1]. The only two languages I know that offer data-race-free multi-threading are Pony[2] and Rust.
Additionally, Go isn't really a C++ competitor: you'll probably never see Google Chrome using Go instead of C++. Go is a good language for back-end micro-services and cli tools, not for building performance-sensitive native desktop applications.
> The only two languages I know that offer data-race-free multi-threading are Pony[2] and Rust.
I believe Erlang also does for the most part, though some of its built-in constructs are racy (the process registry if you query it, shared ETS tables, and dirty Mnesia operations).
Correct, the point of my post is that Go does not have these guarantees even though it is one of the few (only?) languages based on a methodology that actually has a proven theorem for statically guaranteeing thread safety. That part was never implemented in Go, even though all the other aspects of the methodology were, which I just find interesting.
> The only two languages I know that offer data-race-free multi-threading are Pony[2] and Rust.
I should have added «with shared memory»; otherwise any language that relies purely on message passing is also free from data races (even JavaScript is), but it lacks this important feature.