
Linus Torvalds on semaphores (1999) - rainbowgarden
http://yarchive.net/comp/linux/semaphores.html
======
Animats
Here's Dijkstra's original paper on P and V (in Dutch), from about 1963.

[http://www.cs.utexas.edu/users/EWD/transcriptions/EWD00xx/EW...](http://www.cs.utexas.edu/users/EWD/transcriptions/EWD00xx/EWD35.html)

Here is an implementation of P and V, the original counted semaphore
primitives, from 1972.

[http://www.fourmilab.ch/documents/univac/fang/](http://www.fourmilab.ch/documents/univac/fang/)

This is UNIVAC 1108 assembly code. Along with P and V is the code for bounded
buffers, with the operations "PUT" and "GET". Bounded buffers are what Go
calls "channels". Note how simple they are if you have P and V. That code even
works on multiprocessors. There's one semaphore for "queue full" and one for
"queue empty". PUT does a P on "queue full", puts on an item, and does a V on
"queue empty". GET does a P on "queue empty", takes off an item, and does a V
on "queue full". It's very simple. That's the real use case for P and V.
Linus' note indicates that in 1999 he didn't know this.
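
The PUT/GET scheme described above can be sketched in a few lines. This is only an illustration in Python, not a translation of the UNIVAC code: the class name `BoundedBuffer`, the helper `demo`, and the extra lock around the queue are all assumptions added here (the lock is there so the sketch stays safe with several producers or consumers).

```python
import threading
from collections import deque


class BoundedBuffer:
    """Bounded buffer built from two counting semaphores (hypothetical sketch)."""

    def __init__(self, capacity):
        self._items = deque()
        self._lock = threading.Lock()                      # assumption: guards the deque itself
        self._free_slots = threading.Semaphore(capacity)   # the "queue full" semaphore
        self._used_slots = threading.Semaphore(0)          # the "queue empty" semaphore

    def put(self, item):
        self._free_slots.acquire()      # P on "queue full": blocks while no slot is free
        with self._lock:
            self._items.append(item)
        self._used_slots.release()      # V on "queue empty": one more item is available

    def get(self):
        self._used_slots.acquire()      # P on "queue empty": blocks while no item is queued
        with self._lock:
            item = self._items.popleft()
        self._free_slots.release()      # V on "queue full": one more slot is free
        return item


def demo(n_items=100, capacity=4):
    """Run one producer and one consumer; return the items in arrival order."""
    buf = BoundedBuffer(capacity)
    received = []

    def producer():
        for i in range(n_items):
            buf.put(i)

    def consumer():
        for _ in range(n_items):
            received.append(buf.get())

    threads = [threading.Thread(target=producer), threading.Thread(target=consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return received
```

With a single producer and a single consumer the items come out in the order they went in, even though the buffer never holds more than `capacity` of them at once.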

(I didn't write those primitives, but I've used that code, and once ported it
to a Pascal compiler I adapted to handle concurrency.)

This stuff was all well understood four decades ago. Much of it was forgotten
outside the mainframe world, because threads and multiprocessors didn't make
it to microprocessors for several more decades. UNIX, for a long time, had
very primitive synchronization primitives. Early UNIX didn't have threads, and
even after it got threads, it took years before the locking primitives settled
down. The DOS/Windows world didn't get them until Windows NT, circa 1993.

It's been amusing to me to see bounded buffers resurface in Go. They're quite
useful, and I've been using them in concurrent programs for many years.

~~~
roel_v
For those wondering, P comes from 'Passering', roughly translated 'pass' (as a
noun), and V from 'Vrijgave' ('release'). Apparently somehow this terminology
comes from train systems but there's not a lot of context on that etymology.

As an aside, this paper (it's actually the transcription of a lecture) has
some great metaphors that explain problems with concurrency and issues with
synchronisation. At the risk of losing much of the nuances, he essentially
illustrates the synchronisation using the example of a teacher who needs to
find a pupil in a class. When pupils are free to choose a seat, she needs to
scan all seats when she is looking for a particular pupil; but when at the
same time pupils are free to change seats as she's scanning, there is no
guarantee that she will ever find the pupil she's looking for as he might run
from the back of the class to the front as soon as she's done scanning the
front.

~~~
sgt
From what I've heard, the P comes from "Probeer" (to "try"), and the V from
"verhoog" (increase).

This makes more sense to me.

~~~
pdw
That's Dijkstra's later terminology. In EWD35 he introduced the operations as
"passering" and "vrijgave". Then in EWD51 he uses "verhogen" and the neologism
"prolagen" ("probeer te verlagen"). In EWD74 he switches to "proberen".

These documents are available from the Dijkstra archive:
[http://www.cs.utexas.edu/users/EWD/welcome.html](http://www.cs.utexas.edu/users/EWD/welcome.html)
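
Read as operations on a counter, the later names describe the behaviour fairly directly: P tries to decrease the count and waits if it cannot, V increases it. A minimal sketch, assuming a counter guarded by a condition variable (the class name `CountingSemaphore` is made up for illustration, and this is not how any particular kernel implements it):

```python
import threading


class CountingSemaphore:
    """Counting semaphore built on a condition variable (illustrative only)."""

    def __init__(self, initial=0):
        if initial < 0:
            raise ValueError("initial count must be non-negative")
        self._count = initial
        self._cond = threading.Condition()

    def P(self):
        """'Probeer te verlagen': try to decrease, waiting while the count is zero."""
        with self._cond:
            while self._count == 0:
                self._cond.wait()
            self._count -= 1

    def V(self):
        """'Verhogen': increase the count and wake one waiter, if any."""
        with self._cond:
            self._count += 1
            self._cond.notify()
```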

------
robert_tweed
_" Dijkstra was probably a bit heavy on drugs or something (I think the
official explanation is that P and V are the first letters in some Dutch
words, but I personally find the drug overdose story much more believable)."_

Quotes like this are why I always read what Linus has to say, regardless of
whether the subject is relevant to my life in the slightest.

Edit: Yes people, those Dutch words exist! I get it! I'm sure Linus was well
aware of that when he made this remark. However, the point Linus was making
(in a humorous way) was that like most computer scientists / mathematicians,
Dijkstra was overly fond of obscure, single-letter names, which nobody ever
intuitively understands. Whereas the names "up" and "down" (see the rest of
Linus' quote) are far more intuitive. Personally, I would argue that "hold"
and "release" convey the intent in a clearer, more abstract way.

~~~
outworlder
> "Dijkstra was probably a bit heavy on drugs or something (I think the
> official explanation is that P and V are the first letters in some Dutch
> words, but I personally find the drug overdose story much more believable)."

I personally have made remarks that were WAY more potentially offensive than
that, verbally, with friends. But, accusing Dijkstra of being a drug user, on
the Linux kernel mailing list, from a @transmeta address?

That's grounds for termination in many companies, even if you don't get
lawyers involved. I think he's only able to get away with it because he's
Linus.

~~~
ANTSANTS
Technically, you're right. Today, any junior employee that made a joke like
that would be crucified and practically blacklisted from the industry. What
you should be thinking, however, is not "why does Linus get away with saying
things like that?" but rather "why do I consider it normal for someone's
livelihood to be permanently destroyed over a stupid joke?"

~~~
GFK_of_xmaspast
What do you think an appropriate response is?

~~~
jpitz
How about public shaming instead of the destruction of a career?

~~~
CamperBob2
That's the idea. Any developer or researcher who uses obscure or meaningless
names like "P" and "V" needs to be publicly shamed for it, no matter how well-
respected they are. Clarity is vital in this business.

Or were you talking about Linus?

------
WallWextra
Semaphores are basically unused in the kernel these days, abandoned in favor
of mutexes. I really recommend reading kernel/locking/mutex.c. You can learn a
lot about the details of low-level synchronization, and it's also just a
really impressive piece of ruthlessly-optimised code. Tricks like reading the
lock's 'owner' field without any sort of synchronization, to decide whether to
spin on the lock or to sleep.

~~~
julenx
For the sake of completeness:
[https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux....](https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/kernel/locking/mutex.c)

------
wazari972
It's surprising that Linus answers peacefully and pedagogically! That makes
it a nice read to refresh the definitions of semaphores, spinlocks and
mutexes.

Maybe you can edit the title and add a little [1999] :-)

~~~
Perceptes
He did start out by saying Peter Samuelson's CS education was bad, so there
was certainly some of the infamous Linus in there.

~~~
rainbowgarden
Can you blame Linus for saying that? I can't speak with precision about 1999,
but someone assumed to have a CS education should know the difference between
semaphore and spinlock.

~~~
zo1
Whether they did or didn't know about the definitions, they should have at
least looked it up to confirm before posting to the Linux Kernel Dev list or
taking on Linus on the matter.

------
PhantomGremlin
Link is to a collection of Linus's comments, from between 1999 and 2008, about
how to efficiently implement mutexes and semaphores in the Linux kernel.

Linus is the BDFL of the Linux kernel, and he obviously needs to think about
the big picture. And yet in these comments he gets into nitty-gritty assembly
language.

I love it when a big picture guy also sweats the details.

------
PinguTS
Nice read, but still some misconceptions and misunderstandings.

If I am in a multiprocessor design, where each processor runs completely
independent from each other and there is no central organizational unit like a
central OS, like in many embedded designs, there is no such thing as a
spinlock. A semaphore is also not used for sending processes to sleep.

Linus's explanation makes sense when speaking only about a single OS, which
may run across multiple processors. But it makes no sense at all for real
multi-processor designs. I don't blame him for his narrow view; I blame his
professors more, since he never experienced real embedded design.

In the past I have worked with multi-processor designs where one processor
was running an OS, like OS9, and the other processor had no OS at all.
Both processors were exchanging data via a dual-ported RAM, so basically both
processes were competing for the same resource. But even then, using a
semaphore does not prevent race conditions, and it is very tricky to use
them, because dual-ported RAM allows truly concurrent access to the same
resource.

In summary, the semaphore definition is the broader definition, where 2 (or
more) processes are competing for a single resource. The naming convention
within Linux is more specific to these requirements. The "invented" spinlock
is just a renamed version of a fast semaphore, AFAIK. Even a mutex is just a
special case of a semaphore.

~~~
caretcaret
You should realize that the post was written in 1999.

~~~
PinguTS
And I have done my work on this back around 1997/1998 as part of my diploma
thesis.

------
herf
The "benchmarks game" has a thread ring benchmark which can compare
conditions/mutexes/semaphores, because for some applications they're
interchangeable:

4-thread x64:

mutex (480s):
[http://benchmarksgame.alioth.debian.org/u64q/program.php?tes...](http://benchmarksgame.alioth.debian.org/u64q/program.php?test=threadring&lang=gcc&id=1)

sem_wait (476s):
[http://benchmarksgame.alioth.debian.org/u64q/program.php?tes...](http://benchmarksgame.alioth.debian.org/u64q/program.php?test=threadring&lang=gcc&id=2)

cond_wait (271s - runs single-threaded?)
[http://benchmarksgame.alioth.debian.org/u64q/program.php?tes...](http://benchmarksgame.alioth.debian.org/u64q/program.php?test=threadring&lang=gcc&id=3)

1-thread x86:

mutex (136s):
[http://benchmarksgame.alioth.debian.org/u32/program.php?test...](http://benchmarksgame.alioth.debian.org/u32/program.php?test=threadring&lang=gcc&id=1)

sem_wait (131s):
[http://benchmarksgame.alioth.debian.org/u32/program.php?test...](http://benchmarksgame.alioth.debian.org/u32/program.php?test=threadring&lang=gcc&id=2)

cond_wait (156s):
[http://benchmarksgame.alioth.debian.org/u32/program.php?test...](http://benchmarksgame.alioth.debian.org/u32/program.php?test=threadring&lang=gcc&id=3)
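
For readers who haven't seen it, the thread ring passes a token around a circle of threads, each blocked until its predecessor hands the token over. Here is a rough sketch of the semaphore flavour in Python; it is not the benchmark's C code, and the function name `thread_ring` and the shutdown handling are assumptions made for this example.

```python
import threading


def thread_ring(n_threads, n_passes):
    """Pass a token around a ring of threads, decrementing it on every hop.

    Returns the 1-based id of the thread that receives the token when it
    has reached zero.
    """
    sems = [threading.Semaphore(0) for _ in range(n_threads)]  # one per thread
    token = [n_passes]   # only touched by the thread that was just signalled
    winner = []

    def worker(i):
        nxt = (i + 1) % n_threads
        while True:
            sems[i].acquire()            # sleep until the predecessor signals us
            if winner:                   # ring already finished: pass the shutdown on
                sems[nxt].release()
                return
            if token[0] == 0:            # token exhausted: this thread is the answer
                winner.append(i + 1)
                sems[nxt].release()      # start the shutdown wave
                return
            token[0] -= 1
            sems[nxt].release()          # hand the token to the next thread

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_threads)]
    for t in threads:
        t.start()
    sems[0].release()                    # give the token to thread 1
    for t in threads:
        t.join()
    return winner[0]
```

Because exactly one thread is runnable at any time, the semaphores here act purely as a hand-off signal, which is why a mutex or a condition variable can be swapped in for this workload.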

------
gsg
If you haven't already, follow the 'index' link (to
[http://yarchive.net/comp/index.html](http://yarchive.net/comp/index.html))
and bookmark that. Some very worthwhile reading there.

~~~
trentnelson
That whole site is fascinating, I just lost like 2 hours of my life. Late 90s
threads about TLB strategies from @sgi.com e-mail addresses? F&$% me, I could
read that sort of stuff all day. (I started
[http://www.snakebite.org](http://www.snakebite.org), for reference.)

------
caf
The possible future improvement that Linus mentions here:

    
    
      For example, the per-VM memory management semaphore could very usefully
      be a blocking read-write lock, but without heavy thread contention a
      mutex semaphore is basically equivalent.
    

has actually come to pass: the mm_struct is now protected by struct
rw_semaphore mmap_sem.

------
fleitz
This is why I _LOVE_ reading what Linus has to say, it's all very practical,
and pretty much ignores theory, because he has actual evidence to support his
claims.

------
jgrahamc
For those that care why Dijkstra used P() and V(): P for Passering; V for
Vrijgave (release)

~~~
barrystaes
Thanks. Fun quote FTA:

"They originally had operations called "P()" and "V()", but nobody ever
remembers whether P() was down() or up(), so nobody uses those names any more.
Dijkstra was probably a bit heavy on drugs or something (I think the official
explanation is that P and V are the first letters in some Dutch words, but I
personally find the drug overdose story much more believable)."

------
xyzzyz
Ah, semaphores. I recall in my operating systems course we had to implement
some simple concurrency patterns using semaphores as an exercise -- for
instance, synchronize a group of processes so that idle processes gather into
groups of N, and when a group is ready, N-1 threads of the group do TaskX()
and 1 thread does TaskY(), and when they're done, they go back to the pool.
Then we implemented the same using the usual mutexes and condition variables.
I encourage everyone to do that, to see how one of the approaches is much more
complicated than the other. Linus says that almost all practical uses of
semaphores are ones where they are just used as mutexes, and rightly so --
implementing concurrency patterns based only on semaphores is a total pain.

~~~
exDM69
> Linus says that almost all practical uses of semaphores is when they are
> just used as mutexes, and rightly so -- implementing concurrency patterns
> based only on semaphores is a total pain.

Semaphores are not very good in practical programming but they're still
valuable as a mental model and for reasoning about algorithms.

It is a lot easier to prove by induction that an algorithm implemented with
semaphores works than it is to deal with mutexes and conditions in a formal
proof.

So semaphores are very useful in theoretical work, mutexes and conditions are
more useful in practice.

------
michaelsbradley
There's a free book available on the subject of semaphores:

 _The Little Book of Semaphores_ by Allen Downey

[http://greenteapress.com/semaphores/](http://greenteapress.com/semaphores/)

------
asimpletune
Semaphores make a lot more sense if you know the literal translation of the
Spanish word "semaforo", which is "signal". It's also used in Spanish to
describe a traffic light.

~~~
darylteo
I thought it got its name from the Semaphore Flag Signalling System.

[http://www.anbg.gov.au/flags/semaphore.html](http://www.anbg.gov.au/flags/semaphore.html)

------
ww520
Semaphores and spinlocks sure are different. People rarely use spinlocks in
user-mode programs, since there are better synchronization mechanisms in user
mode than busy-waiting on a spinlock. However, in kernel-mode code the
spinlock is invaluable, since it's the only synchronization mechanism that
works at any interrupt level.

------
darylteo
Wow I remember some of this class from my OS course 10 years after this was
written.

We were just told about mutexes, but I never knew that it stood for Mutual
Exclusion. (I did correctly intuit that a mutex was simply a specific use case
of semaphore though so I'm happy and managed to pass the course in the end :D)
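
The "mutex as a special case of a semaphore" intuition can be shown with a semaphore whose initial count is 1. This is a small sketch with made-up names (`count_with_binary_semaphore`), not a claim about how any real mutex is implemented; real mutexes typically also track an owner, which a plain semaphore does not.

```python
import threading


def count_with_binary_semaphore(n_threads=4, increments=10_000):
    """Protect a shared counter with a semaphore initialised to 1."""
    mutex = threading.Semaphore(1)   # count 1: at most one thread in the critical section
    counter = [0]

    def worker():
        for _ in range(increments):
            mutex.acquire()          # P / down(): enter the critical section
            counter[0] += 1
            mutex.release()          # V / up(): leave the critical section

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter[0]
```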

------
knutsonbradacnl
Coming from embedded systems and MCU RTOS development, mutexes and semaphores
are the name of the game.

------
esaym
Man every time I read linux kernel stuff I want to join the dev team. Kernel
development sounds awesome/fun. But I can never seem to make time.

------
pwelch
That was a pretty interesting read.

Random question, does anyone know what some more active newsgroups lists are
today or has it all slowed down?

------
CmonDev
Love vintage concurrency techniques :).

~~~
sz4kerto
There's nothing vintage about them. Understanding these is crucial to
understand how and why modern concurrency tools work. As I noted above in
another comment, the issues that need mutexes or semaphores have never gone
away, only you might not realize that they're there.

And please, don't come up with async/await, callback/continuation, node.js'
async stuff. They are not a replacement for mutual exclusion, etc.

~~~
icebraining
Vintage doesn't mean obsolete.

~~~
twic
Vintage means "from a particularly good year, the like of which we have not
seen for some time". Seems about right for an invention of EWD's.

------
haileyr
nice

