
Why Threads are a Bad Idea (for most purposes) (1995) [pdf] - erwan
https://web.stanford.edu/~ouster/cgi-bin/papers/threads.pdf
======
skybrian
The morning paper had a nice set of blog posts about this:

Why they're equivalent (duals):
[https://blog.acolyer.org/2014/12/08/on-the-duality-of-operat...](https://blog.acolyer.org/2014/12/08/on-the-duality-of-operating-system-structures/)

Why threads are a bad idea:
[https://blog.acolyer.org/2014/12/09/why-threads-are-a-bad-id...](https://blog.acolyer.org/2014/12/09/why-threads-are-a-bad-idea/)

Why events are a bad idea:
[https://blog.acolyer.org/2014/12/10/why-events-are-a-bad-ide...](https://blog.acolyer.org/2014/12/10/why-events-are-a-bad-idea/)

Unifying events and threads (in Haskell):
[https://blog.acolyer.org/2014/12/11/a-language-based-approac...](https://blog.acolyer.org/2014/12/11/a-language-based-approach-to-unifying-events-and-threads/)

Unifying events and threads (in Scala):
[https://blog.acolyer.org/2014/12/12/scala-actors-unifying-th...](https://blog.acolyer.org/2014/12/12/scala-actors-unifying-thread-based-and-event-based-programming/)

~~~
MichaelMoser123
He doesn't seem to have mentioned cooperative threads/green threads/non-preemptive
threads in this series
[http://wiki.c2.com/?CooperativeThreading](http://wiki.c2.com/?CooperativeThreading)

They are easier to program than events. They still have problems: it's easy
to run out of a very limited stack, and you have to yield your cooperative
thread often to other tasks. But they are easier to maintain than a state
machine (which is often written as a very big switch statement).
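
To make the contrast with a hand-written state machine concrete, here is a minimal sketch of cooperative threading using Python generators. The `worker` and `run_round_robin` names are made up for illustration; a real green-thread library would also deal with I/O and stack limits.

```python
from collections import deque


def worker(name, steps, log):
    """A cooperative task: does a little work, then yields voluntarily."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield  # explicit yield point; nothing preempts the task between yields


def run_round_robin(tasks):
    """Hypothetical minimal scheduler: resume each task in turn until all finish."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)          # run the task up to its next yield
            ready.append(task)  # still alive, so queue it again
        except StopIteration:
            pass                # task finished; drop it


log = []
run_round_robin([worker("a", 2, log), worker("b", 3, log)])
print(log)
```

Each task keeps its position in ordinary local variables and loop state, which is exactly what a state machine would have to encode by hand.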

~~~
veli_joza
Cooperative multitasking is mentioned in "Why events are a bad idea" as
counter-argument #3. It is considered under the threads category. It's
unfortunate that only some languages support it.

~~~
MichaelMoser123
> It's unfortunate that only some languages support them.

Yes, you need runtime support for this feature. If it is impossible to change
the runtime, then it can't be done for the language. The runtime may have
trouble working with a very limited stack, for example, or the runtime is big
and difficult to change (like the JVM), or it has long-running library functions
that cannot yield to other cooperative threads.

However, C/C++ and other languages that do not sit on top of a big runtime don't
have that problem.

------
atemerev
Yes, might have worked in 1995. Now, however, when even your lowly phone has 4
(or more) processor cores and a full-fledged GPU...

Learn concurrent programming techniques — or perish. Threads and sync
primitives are low-level, but important, and you have to understand them to
figure out what compromises and biases were taken in higher-level models.

And, frankly, it isn't that bad (debugging existing code is bad, but playing
with monitors and semaphores and critical sections is easy, as long as the code
is small and isolated).

~~~
kbutler
Yes, and no. Concurrency and parallelism are important, but threads
(lightweight shared memory processes) are not the only solution.

Successful, popular concurrent platforms like Erlang/OTP, Nginx, and node.js
eschew threads in favor of a single-threaded, async/non-blocking code model.
These platforms simplify application-level code by avoiding the issues of
thread synchronization and contention in a shared memory space, and instead
provide/require isolated processes to exploit CPU-level parallelism.

The threading "isn't that bad" viewpoint generally comes from a limited
understanding of the things that can go wrong - for instance, add "memory
barriers" to the list of things to understand.

There's a good explanation of the problems with a common Java idiom "double-
checked locking" at
[https://www.cs.umd.edu/~pugh/java/memoryModel/DoubleCheckedL...](https://www.cs.umd.edu/~pugh/java/memoryModel/DoubleCheckedLocking.html)

Of course, when things get complicated with lots of interactions between
processes, or significant amounts of computation in a single-threaded process,
some platforms require the user to manually yield, etc., basically trying to
re-create preemptive threading provided by the OS.

~~~
slackingoff2017
Nginx and Erlang single threaded? Uhhhhh, that's just not true.

Nginx and Erlang most certainly use multiple threads by default. They both use
thread pools. I'm not sure about Erlang but Nginx has a single master thread
that does async select on the main socket. Once it receives a connection it
hands off to worker threads. This is a traditional multithreading setup with
control/worker threads.

Node doesn't support threads because JavaScript doesn't. Performance would
certainly be better for many workloads if it followed a traditional
multithreading approach like Nginx. Async != Single threaded. They're not
related concepts. You get the best performance using both non-blocking calls
and multithreading, an approach used by Nginx and Netty. They both use
nonblocking IO on the main thread and thread pools.

Node requires you to manually yield by calling Response.end() because it
doesn't support multithreading. This basically duplicates OS level pre-emption
manually and is my single biggest criticism of node. The worker/master thread
model allows the runtime to manage worker resources or just defer to the OS in
a flexible way.

~~~
kbutler
You are confusing threads with processes.

Processes are isolated and do not share memory. Threads within a process share
memory. For example, see
[https://msdn.microsoft.com/en-us/library/windows/desktop/ms6...](https://msdn.microsoft.com/en-us/library/windows/desktop/ms684841\(v=vs.85\).aspx)

Nginx is single-threaded, multiple process: "Each worker process is
single-threaded and runs independently"
[https://www.nginx.com/blog/inside-nginx-how-we-designed-for-...](https://www.nginx.com/blog/inside-nginx-how-we-designed-for-performance-scale/)

Erlang code is single threaded. Each Erlang "process" is an isolated, share-
nothing single sequence of instructions, and you don't use semaphores, locks,
or critical sections in writing Erlang code. The BEAM VM is implemented as a
multithreaded process, but this is not exposed to the Erlang code, and Erlang
code can be transparently distributed across multiple instances of the BEAM VM
(processes) even on separate hosts.

You're correct on node.js, and "basically trying to re-create preemptive
threading provided by the OS."

~~~
slackingoff2017
Nginx recently implemented thread pools for blocking operations. The
distinction between threads and processes depends on the OS.

The "nothing shared" model used in Linux forking actually shares the
underlying memory until the processes diverge (copy on write). The difference
between this and threads is that threads expose the shared memory by default.

IMO the difference between threads and processes is somewhat academic.

I don't know much about BEAM

~~~
robotresearcher
> IMO the difference between threads and processes is somewhat academic.

Shared memory vs. isolated memory is a huge difference. You use completely
different sync/protection/IPC approaches with each, so programs look very
different. And the copy-on-write optimization of logically isolated memory
doesn't affect that. The beauty of it is that it speeds up forking and saves
memory without changing your IPC-based programming model.
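
A small sketch of that point, assuming a POSIX system where `os.fork` is available: after the fork the child mutates its own logical copy of `data`, the parent never sees the change, and the only way to learn about it is explicit IPC (a pipe here). Copy-on-write is invisible at this level.

```python
import os

data = [1, 2, 3]
read_fd, write_fd = os.pipe()

pid = os.fork()
if pid == 0:
    # Child process: this append touches the child's private copy only.
    os.close(read_fd)
    data.append(4)
    os.write(write_fd, str(len(data)).encode())
    os.close(write_fd)
    os._exit(0)  # leave immediately, skipping the parent's cleanup handlers

# Parent process: learn about the child's state through the pipe.
os.close(write_fd)
child_len = int(os.read(read_fd, 16).decode())
os.close(read_fd)
os.waitpid(pid, 0)

parent_len = len(data)  # unchanged by the child's append
print(child_len, parent_len)
```

With threads, the same `append` would have been visible to every thread at once, which is why the synchronization story looks so different.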

------
notacoward
Note that it's from 1995. Back then, many people still thought that machines
with multiple processors were exotic things of little interest even in the
enterprise. That list most notably included one Linus Torvalds. It also
included OP author John Ousterhout, who should also have known better - even
more so, since Stanford was one of the places where such things were not so
exotic. He even says, right up front, that threads still have their uses when
you need true CPU concurrency. Now that's a common case. Generalizing from
this presentation to the current day is probably a worse idea than threads
ever were.

~~~
greglindahl
Today, there are many places that have big clusters of multi-processor
machines, and none of their code is threaded. They get cpu concurrency by
message passing between heavy-weight processes and with other nodes.

~~~
tgamblin
If you're talking about MPI and its use in HPC, you might've been mostly
correct 5 years ago. Now, HPC applications at the largest sites are hybrid
MPI+OpenMP (or some other threading model, or they use GPUs). On Xeon Phi
systems like the latest Crays, you pretty much _need_ OpenMP to take advantage
of all the cores. The PPC A2 processors on the (formerly #1, now #4) Blue
Gene/Q system at LLNL won't get full instruction issue per cycle unless you
use multiple hardware threads (the threads are exposed to the hardware and
each core will only pull instructions from a _single_ process at a time, so
this is somewhat unlike hyperthreading on Intel chips).

There can be overheads to running very large numbers of MPI processes,
depending on how scalable the MPI implementation is, and there are overheads
to having too many threads contend for memory on a single node. At least in
the current state of things in HPC, there is a delicate balance to choosing
the right process-thread ratio, and it differs from application to application
and depends on the node count for the job. People struggle to find the sweet
spot, but we're definitely not running all MPI anymore.

~~~
greglindahl
I wasn't talking about MPI, but if you want to talk about MPI, note that
OmniPath gets faster as you have more cores talking to it, and it takes full
advantage of delivering messages to the memory near the receiving process, so
you'd better not move threads or processes very far.

If you look in your HPC history book, you'll notice that I was the system
architect for InfiniPath.

------
thesz
John Ousterhout is the creator of Tcl, which embraces an event-driven model for
programming. I think it gives more perspective into his opinion.

That said, Tcl was one of the first scripting languages which got a very nice
thread model - see AOLServer [1].

[1]
[https://en.wikipedia.org/wiki/AOLserver](https://en.wikipedia.org/wiki/AOLserver)

We used Tcl threads in one of our programs to control various hardware things
while the main UI responded to events sent from the threads. Everything worked
very well, especially for a program in a scripting language like Tcl.

------
milesvp
Twenty years later this seems to still be good advice. Martin Thompson talks
about this in his Mechanical Sympathy talk.

He says the first thing he does, as a performance consultant, is turn off
threading. Claims that's often all he needs to achieve the desired
improvements...

It's a good talk, I highly recommend it.

[https://www.infoq.com/presentations/mechanical-sympathy](https://www.infoq.com/presentations/mechanical-sympathy)

~~~
ttoinou
Who creates threads just for fun? I only use them when I really need them (in
C++), so there's really no alternative for me

~~~
YZF
Yep. If all you have is a blocking sys call and you want performance you gotta
have threads. If you want to use more than a single core at a time for
computation, you gotta have threads. [And I use the word threads loosely; you
could have multiple processes as well, or whatever OS primitive lets you get
concurrency.]

A 20 core CPU can do some things 20 times faster than a single core (multiply
matrices e.g.). If you're trying to do those things and you limit yourself to
a single thread - good luck!

------
Animats
The bad idea is taking a threaded language and retrofitting events, which
Python is doing. This results in an even worse mess. Python now has two kinds
of blocking, one for threads and one for events. If an async event blocks on a
thread lock, the program stalls.

Or taking an event-driven language and retrofitting concurrency, which
JavaScript is doing. That results in something that looks like
intercommunicating processes with message passing. That's fine, but has
scaling problems, because there's so much per-process state.

Languages which started with support for both threads and events, such as Go,
do this better. If a goroutine, which is mostly an event construct, blocks on
a mutex, the underlying thread gets redispatched. There's only one kind of
blocking.
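
The Python stall mentioned above can be sketched as follows. This is an illustration under assumptions, not a benchmark: it needs Python 3.9+ for `asyncio.to_thread`, and `count_ticks` and `wait_for_lock` are hypothetical names. A coroutine that calls a blocking `threading.Lock.acquire()` directly freezes every other task on the loop, while handing the wait to a worker thread keeps the loop responsive.

```python
import asyncio
import threading


async def count_ticks(counter):
    # Stands in for "everything else" the event loop should keep serving.
    while True:
        counter[0] += 1
        await asyncio.sleep(0.01)


async def wait_for_lock(blocking):
    lock = threading.Lock()
    lock.acquire()                              # pretend another thread holds it
    threading.Timer(0.3, lock.release).start()  # released after about 0.3 s

    counter = [0]
    ticker = asyncio.create_task(count_ticks(counter))
    await asyncio.sleep(0)                      # let the ticker run once

    if blocking:
        lock.acquire()                          # blocks the event loop thread itself
    else:
        await asyncio.to_thread(lock.acquire)   # waits in a worker thread instead
    lock.release()

    ticker.cancel()
    return counter[0]


stalled = asyncio.run(wait_for_lock(blocking=True))
responsive = asyncio.run(wait_for_lock(blocking=False))
print("ticks while blocked directly:", stalled)
print("ticks while waiting in a worker thread:", responsive)
```

In the first run the ticker barely advances, because nothing on the loop can be scheduled while the loop's own thread sits in `acquire()`.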

~~~
srean
Almost no one likes Tcl, but I think it used (uses) a nice model for
concurrency. Much better than Python's.

Interpreter as a library, all state encapsulated in a reentrant interpreter
object, different interpreters running in different threads that can send
messages to each other, no need for expensive serialization/deserialization
(like in Python) because the interpreters are in the same process, ... so much
nicer.

------
outworlder
From a Linux perspective, threads and processes are essentially the same
construct. The major difference is the set of flags that are passed when the
process is created. Oversimplifying, if shared memory is requested, then it's
a thread. Otherwise, it's a process. Meaning forking servers are multi-
threaded.

On other operating systems, specifically those starting with the letter W,
there's a major distinction. There are other constructs as well, such as
"fibers".

Now, today's world is different from what it was in 1995. We used to have a
single core, so threads and multiple processes were only a logical construct.
Now, we have multiple cores, so we should, at a bare minimum, spawn multiple
processes/threads. What's running inside them can then be debated as if it
were 1995.

------
gens
Obligatory CSP[0] reference.

There are only a handful of examples that I can think of where threading
(multiprocessing, concurrency, and other names for it) is useful.

[0] [http://www.usingcsp.com/](http://www.usingcsp.com/)

------
vyodaiken
The key point is that unstructured shared data is a source of errors in a
threaded program.

------
cjensen
If I had a dollar for every time I heard "thread programming is hard."

I've programmed using threads for 23 years. I've never had a non-trivial debug
issue caused by trouble using semaphores, mutexes, and shared data. It's no
harder than writing a hash table or balancing a tree.

~~~
julian_1
Do you have proofs for your code that it can't deadlock or livelock? How
strong are your claims - would you trust it in a life support system, or
aircraft auto-pilot, or similar role?

~~~
cjensen
Nope. Nor do I have any proofs that my balanced binary tree implementation is
correct, or that my hash implementation is correct.

Given the number of stories about equator issues and negative altitude issues
in autopilots, I'm unconvinced there are proofs involved in those either.

~~~
julian_1
I would trust my own multi-threaded code too, but it requires much stronger
knowledge of expected behaviors and code paths. I have had the misfortune to
try and debug others' multithreaded code. Having a debugger connected to a
production app while waiting a week for it to deadlock is tedious.

------
dgreensp
I'd rather have real threads available in my language, and use shared state
sparingly. Your N single-threaded processes have to talk to each other anyway,
or at least to a master, and might even share memory through memory-mapping.
Threads are just a tool, and they give you options. You can use them in a
share-nothing way if you want.

As someone who grew up on the Java VM and started my start-up/web career on
it, I've always felt like Java programmers have a different relationship with
threads than C programmers. Java gives you cross-platform, usable, debuggable
native threads; it basically makes them free if you want them. In C/C++, on
the other hand, threads are a library, and using them is a grungy affair. If
you grew up on Rails, meanwhile, threads don't exist ("when you say worker
thread, do you mean worker process? I'm confused").

Node.js was created by C programmers and launched with a lot of anti-thread
propaganda, much like the link. They equated threads with shared state, and
also said threads were too memory inefficient to be practical for a high-
performance server (they meant that holding a native thread for every open
connection would require too much stack space, which is true, but that's not
what they said).

~~~
valarauca1
>As someone who grew up on the Java VM and started my start-up/web career on
it, I've always felt like Java programmers have a different relationship with
threads than C programmers. Java gives you cross-platform, usable, debuggable
native threads; it basically makes them free if you want them. In C/C++, on
the other hand, threads are a library, and using them is a grungy affair.

The difference is Java threads work within the constraints of the JVM memory
model. Which is very well specified and understood. It is implemented in the
JVM and the JVM guarantees your platform will follow it.

While C/C++ do whatever the hardware dictates.

[https://dzone.com/articles/java-memory-model-programer%E2%80...](https://dzone.com/articles/java-memory-model-programer%E2%80%99s)

~~~
dgreensp
That's part of it, for sure.

------
woliveirajr
> Where threads needed, isolate usage in threaded application kernel: keep
> most of code single-threaded.

This is the point where performance tops: each CPU is filled with operations,
and operations that don't need to wait for the result of other threads.

------
sbov
I'm sure many people don't think this applies to them because they don't use
threads. However, in the modern day, replace "thread" with "process" and
"memory" with "database", and many web applications have very similar
problems. They just never actually manifest because of the small number of
requests per second.

~~~
gnaritas
No, the problems of threads don't carry over to processes. What you just said
is exactly wrong and misunderstands the issues involved. Databases have
transactions and processes don't share memory; those solve the problem with
threading, which is shared non-transactional memory.

~~~
zzzeek
Transactions lock and conflict with each other, so the metaphor holds fairly
closely, though it's not completely equivalent of course.

~~~
gnaritas
Yes but these are recoverable states whereas threads suffer from partial reads
in the middle of updates resulting in undefined unpredictable behavior. It's
not a good metaphor to compare threads to processes as they suffer from vastly
different failure modes and problems and are quite normally contrasted to each
other as different solutions to concurrency problems.

~~~
zzzeek
the reason people don't want to use threads is because they're afraid of
having to understand race conditions. At that level, "you will have to
understand race conditions no matter what" is the wisdom I derive from this
particular metaphor.

------
EGreg
I agree. The actor model where actors can be scheduled on any thread is the
best (Erlang, Goroutines). Second best is the node.js model of single threaded
evented programming.

~~~
chrisseaton
In terms of safety I think fork-join is better than the actor model. The actor
model is all about mutable state - the state of all the actors. Fork-join
removes state entirely - it's just about passing values.

~~~
dragonwriter
> The actor model is all about mutable state

One of the more popular actor model languages (Erlang) also generally avoids
mutable state (it's available in the form of the process dictionary, but
recognized best practice is to avoid using it.)

~~~
chrisseaton
That's not quite what I mean.

I mean an Erlang actor is a state machine isn't it? It accepts messages, and
each message causes the actor to either return some data based on its state,
or to enter a new state.

State machine - so it's holding state isn't it? And it's mutable because you
can cause it to enter a new state by sending messages.

And you have lots of these mutable state machines in your Erlang process -
lots of mutable state.

See what I mean? I know they don't have shared memory, but each process is
effectively a little piece of global shared mutable state. And that's bad!

In other parallelism models like fork-join, there is no mutable state. Tasks
get input parameters, and produce a result. They don't have any state in any
meaningful sense because you can't observe their state externally and you
can't ask them to change state externally.
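
As a rough illustration of the fork-join shape described here (a sketch only; `sum_of_squares` and `fork_join_sum_of_squares` are hypothetical names, and a thread pool stands in for whatever executor a real fork-join framework would use): each task receives its input as a parameter and returns a value, and nothing outside the task can observe or change it while it runs.

```python
from concurrent.futures import ThreadPoolExecutor


def sum_of_squares(chunk):
    # A pure task: the result depends only on the input, no shared state.
    return sum(x * x for x in chunk)


def fork_join_sum_of_squares(values, parts=4):
    # Fork: split the input into at most `parts` independent chunks.
    size = max(1, (len(values) + parts - 1) // parts)
    chunks = [values[i:i + size] for i in range(0, len(values), size)]
    with ThreadPoolExecutor(max_workers=parts) as pool:
        partials = list(pool.map(sum_of_squares, chunks))
    # Join: combine the partial results into the final value.
    return sum(partials)


total = fork_join_sum_of_squares(list(range(10)))
print(total)
```

An actor, by contrast, would keep a running total between messages, which is the per-actor mutable state the comment objects to.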

------
joosters
IMO, the main problem with using threads is that they are such an 'all or
nothing' approach to sharing data.

If you want to make use of multiprocessing, the traditional choice is either
to use two separate processes (sharing nothing), or to use threads (and share
everything). But for most tasks, these opposite ends of the spectrum are not
what you need. There's plenty of data and state in most programs that doesn't
need to be shared, and a huge source of threading bugs is through mistakenly
altering some data that another thread was using.

The problem is that sharing partial state between processes is painful and
many languages and OSs make it difficult to do. You have to play around with
mmap() or other shared memory tools, and then pay great attention to not mix
pointers or other incompatible data between the processes.

------
geezerjay
There's a link to a related discussion on HN entitled "Why Events Are a Bad
Idea (for high-concurrency servers) (2003)"

[https://news.ycombinator.com/item?id=14548487](https://news.ycombinator.com/item?id=14548487)

------
ilaksh
Threads have been useful for me in my latest experiment. It's a C++ application
that runs mame_libretro dll (copies) each in their own thread, while a BASIC
interpreter runs in another thread, and the main game engine (Irrlicht) runs in
the main thread. Irrlicht isn't multithreaded so I just put commands from the
BASIC thread into an outgoing deque which I lock and unlock as I access it.
Then there are mutexes for the video data.

I think that threads are definitely a bit tricky though, since it is easy to
mess up locking/unlocking or to miss a lock that is needed, and if you do then
you have debugging headaches. So when not needed they should be avoided, I think.
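
A rough Python analogue of the locked command queue described above. The original application is C++, so this is only a sketch of the pattern, and `script_thread`, `drain`, and the command strings are hypothetical: one thread appends commands under a lock, and the main thread, standing in for the single-threaded engine, drains them under the same lock.

```python
import threading
from collections import deque

commands = deque()
commands_lock = threading.Lock()


def script_thread(count):
    # Producer: plays the role of the interpreter thread issuing engine commands.
    for i in range(count):
        with commands_lock:
            commands.append(f"cmd {i}")


def drain():
    # Consumer: take everything queued so far while holding the lock briefly.
    with commands_lock:
        batch = list(commands)
        commands.clear()
    return batch


producer = threading.Thread(target=script_thread, args=(5,))
producer.start()

executed = []
# Main loop: keep draining until the producer is done and the queue is empty.
while producer.is_alive() or commands:
    executed.extend(drain())
producer.join()
print(executed)
```

Holding the lock only long enough to copy and clear the queue keeps the non-thread-safe engine code outside the critical section.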

------
faragon
I cannot agree more. Most people should not write threaded code at all.

------
kulu2002
Whatever... But multithreaded systems are fun to design, develop and, most
importantly, troubleshoot... the harder the 'real time'-ness, the more the fun :)

------
dreamdu5t
Threads are a bad idea for the same reason manual handling of memory space is
a bad idea. Languages should provide primitives that only allow for safe
construction of expressions that are run concurrently by the runtime.

~~~
pyre
> _Languages should provide primitives that only allow for safe construction
> of expressions that are run concurrently by the runtime._

Then you run into issues like that post recently where you can't have
different threads running in different context, because the _runtime_ is the
one deciding/controlling when new threads are spawned.

------
ythn
What if I have a blocking function call (e.g. listening on a socket)? Seems
like there is no choice but to put the blocking call in its own thread...

~~~
littlestymaar
Why should you use a blocking mechanism for listening on a socket? Non-blocking
I/O[1] (epoll on Linux) has been a thing for a long time now.

[1] :
[https://en.wikipedia.org/wiki/Asynchronous_I/O](https://en.wikipedia.org/wiki/Asynchronous_I/O)
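
A minimal sketch of readiness-based socket I/O in a single thread, using Python's `selectors` module (whose `DefaultSelector` uses epoll on Linux). A `socket.socketpair()` stands in for a real network connection so the example is self-contained.

```python
import selectors
import socket

sel = selectors.DefaultSelector()
sender, receiver = socket.socketpair()
sender.setblocking(False)
receiver.setblocking(False)

# Ask to be told when `receiver` has data, instead of blocking in recv().
sel.register(receiver, selectors.EVENT_READ)

# Nothing has been sent yet, so a zero-timeout poll reports no ready sockets.
idle = sel.select(timeout=0)

sender.send(b"ping")

# Now the selector reports the socket as readable and recv() will not block.
ready = sel.select(timeout=1)
received = [key.fileobj.recv(1024) for key, _events in ready]
print(idle, received)

sel.unregister(receiver)
sender.close()
receiver.close()
sel.close()
```

The same loop can watch many sockets at once, which is how one thread serves many connections without a blocking call per connection.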

~~~
kbwt
Except when the non-blocking file I/O API[1] is actually a blocking one. Yes,
I realize in this case it's the API that needs fixing, but it isn't looking
like that will happen anytime soon.

[1] :
[https://lwn.net/Articles/723752/#724198](https://lwn.net/Articles/723752/#724198)

------
Kenji
It's the same with every tool: The more powerful it is, the worse the
consequences are when it's abused. That applies to programming in particular.

~~~
marcosdumay
With great powers come great needs of formal verification.

~~~
zzz95
Need is there, but tools are not.

~~~
Kenji
It's not a tooling issue (see my comment above)

------
in9
Loved the programmer comparison slide. Is today's Python programmer 1995's
Visual Basic programmer? :D

~~~
milesvp
You've never worked with "wordpress programmers" have you?

~~~
sethrin
It's possible to be a "wordpress programmer" in any language, but man, I have
read a lifetime's worth of bad PHP code. WordPress is of course the worst of
all possible PHP worlds since it was a fork of an even more ancient project
and thus had to maintain backwards compatibility with a purely procedural
codebase.

The phrase "wordpress programmer" conjures up the person for whom that is
their sole form of programming expression. Some part of me acknowledges that
people like that must exist: the rest recoils in horror.

~~~
frozenport
"drupal programmer"

