Threads Cannot Be Implemented as a Library (2005) [pdf] (nyu.edu)
93 points by pmoriarty 61 days ago | 21 comments

(2005). The quoted "effort to revise the C++ language standard to better accommodate threads" was eventually achieved in C++11, which uses a separate type for atomic variables (std::atomic) rather than relying on volatile, and has a formal memory model with a variety of memory orderings of different strengths (relaxed, acquire/release, sequentially consistent).
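
For readers who have not used the C++11 facilities mentioned above, here is a minimal sketch of an acquire/release handoff through std::atomic. The names `payload`, `ready`, and `handoff` are made up for illustration and do not come from the paper.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical names for illustration; none of them come from the paper.
int payload = 0;                 // plain, non-atomic data
std::atomic<bool> ready{false};  // flag that publishes the data

// Hands one value from the calling thread to a consumer thread and
// returns what the consumer observed.
int handoff() {
    payload = 0;
    ready.store(false, std::memory_order_relaxed);

    int seen = -1;
    std::thread consumer([&seen] {
        // Acquire load: once it observes true, the write to payload that
        // preceded the matching release store is guaranteed to be visible.
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        seen = payload;
    });

    payload = 42;
    // Release store: the earlier write to payload may not be reordered
    // after this store, by the compiler or by the hardware.
    ready.store(true, std::memory_order_release);

    consumer.join();  // join makes the consumer's write to seen visible here
    assert(seen != -1);
    return seen;
}
```

With a plain `bool` (or a `volatile bool`) in place of `std::atomic<bool>`, the same code would contain a data race under the C++11 memory model, which is the kind of gap the standardization effort closed.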

Yes. Bohem was one of those leading the effort.

This reminds me of ... pthread api implemented as a standalone library!


> The thread scheduling itself is done in a cooperative way, i.e., the threads are managed and dispatched by a priority- and event-driven non-preemptive scheduler.

They directly address that library in the paper.

Threads can be implemented as a library if the compiler and hardware don't perform any reordering of memory instructions.

It is hard to define what "ordering of memory operations" actually means if there is no memory model that takes multi-threading into account. Most styles of formal semantics will allow operations to be reordered as long as the final state is the same.

Atomicity and visibility are also an issue.

You still need synchronization instructions, but you can handle them through library function calls (i.e., you don't need to change the language specification for them).
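
A rough sketch of the "synchronization through library calls" style described above, using the POSIX threads API from C++ with no language-level atomics. The names `counter`, `counter_lock`, `add_many`, and `count_with_pthreads` are hypothetical, and the fixed limit of 16 threads is an arbitrary assumption to keep the example short.

```cpp
#include <cassert>
#include <pthread.h>

// Hypothetical names for illustration only.
long counter = 0;  // plain variable, protected only by the mutex below
pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;

// Thread entry point: increments the shared counter *arg times.
void* add_many(void* arg) {
    long n = *static_cast<long*>(arg);
    for (long i = 0; i < n; ++i) {
        pthread_mutex_lock(&counter_lock);    // opaque library call
        ++counter;                            // ordinary, non-atomic update
        pthread_mutex_unlock(&counter_lock);  // opaque library call
    }
    return nullptr;
}

// Starts num_threads workers and returns the final counter value.
long count_with_pthreads(int num_threads, long per_thread) {
    assert(num_threads > 0 && num_threads <= 16);  // arbitrary cap for the sketch
    counter = 0;
    pthread_t threads[16];
    for (int i = 0; i < num_threads; ++i) {
        int rc = pthread_create(&threads[i], nullptr, add_many, &per_thread);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < num_threads; ++i) {
        pthread_join(threads[i], nullptr);
    }
    return counter;
}
```

This works in practice because compilers treat the lock and unlock calls as opaque functions that might touch `counter`. That is a property of the implementation rather than something a pre-C++11 language standard promised, which is roughly where the paper's "almost works" comes in.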

That's what C++ thought too :). I believe the whole point of this paper is to argue your point (minus the whole "let's simplify the compiler so that it can't do a ton of optimizations" part, since nobody would agree to that).

Well, in my book correctness comes first. So if you want to do multi-threading without changing the language, then you'll have to fix the compiler (to make it preserve correctness).

C++ did change the language: its specification of correctness. The definition of what counts as correct, well-defined behavior is part of the language.

Yes, the article is about not changing the language, though.

Yes, (2005). Hans Boehm is the guy who pioneered conservative garbage collection.

Yet the bumblebee flies.

Threads were implemented as libraries for a long time. Things worked. Yes, C++ and other languages having formal memory models makes things better. But you can, in fact, do useful work with threads when the threading library is a library.

Don't tell me that things that actually work are impossible.

The paper is exactly about those libraries: “We first review why the approach almost works, and then examine some of the surprising behavior it may entail.”

Bohems probably wrote a few of those thread libraries himself.

BTW, his name is not Bohem but Boehm.

Embarrassing, I managed to get it wrong multiple times. Thanks for the correction!

Were they real threads that would take advantage of multiple cores or green threads?

I remember the days of the green threads lib on Linux. It abused fork to use multiple cores, so it was really just pushing the problem back to the kernel's process scheduler. It also made 'ps' output ugly as hell on machines running multithreaded code. This was the late 1990s.

I wouldn't go so far as to say threads can't be implemented as a library, but to implement real threads efficiently in a library would require some way for the kernel to expose the scheduler. It would probably be possible for the kernel to provide extremely basic low-level scheduler syscalls and push everything else down into user-space in a more microkernel-ish system. These would be things like "get core count," "get current core," "start code on core X," etc. Given the higher overhead of syscalls vs. user-mode-only code, this might perform worse than threads in the kernel due to a larger number of kernel/user context switches.

I don't think Linux ever had green threads (as in purely userspace threads).

LinuxThreads, as you described, implemented threads as processes sharing the whole address space.

The NGPT project attempted to add a full M:N hybrid scheduler just around the time M:N threading was going out of fashion.

In the end NPTL (basically extending fork to cover the differences between POSIX and Linux semantics and adding futexes for fast signaling) won.

How do you think threads work on Linux today? Libc creates threads with clone, a variant of fork.
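
To make the clone point concrete, here is a stripped-down, Linux-only sketch that starts a task sharing the parent's address space. It assumes glibc's `clone` wrapper and a downward-growing stack, and it passes only `CLONE_VM`, far fewer flags than a real pthread implementation would. The names `shared_value`, `child_fn`, and `run_clone_demo` are made up for illustration.

```cpp
#include <sched.h>     // clone, CLONE_VM (declared when _GNU_SOURCE is set; g++ sets it)
#include <signal.h>    // SIGCHLD
#include <sys/wait.h>  // waitpid
#include <cassert>
#include <cstdlib>

// Hypothetical names for illustration only.
int shared_value = 0;  // lives in memory shared by parent and child

// Entry point of the cloned task: writes into the shared address space.
int child_fn(void* arg) {
    shared_value = *static_cast<int*>(arg);
    return 0;  // glibc's clone wrapper turns this into the task's exit status
}

// Runs child_fn in a task that shares our memory; returns what it wrote,
// or -1 if the task could not be started or waited for.
int run_clone_demo(int value) {
    const std::size_t stack_size = 1024 * 1024;
    char* stack = static_cast<char*>(std::malloc(stack_size));
    if (stack == nullptr) return -1;

    shared_value = 0;
    // The stack grows downward on common Linux targets, so pass its top.
    // CLONE_VM shares the address space; SIGCHLD lets waitpid reap the child.
    pid_t pid = clone(child_fn, stack + stack_size, CLONE_VM | SIGCHLD, &value);
    if (pid == -1) {
        std::free(stack);
        return -1;
    }

    int status = 0;
    pid_t waited = waitpid(pid, &status, 0);
    std::free(stack);
    if (waited != pid) return -1;
    return shared_value;  // written by the child, visible because memory is shared
}
```

Without `CLONE_VM` this behaves like fork: the child gets its own copy of memory and the parent would still read 0. Each such task also has its own PID, which matches the cluttered `ps` output described for the LinuxThreads era.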
