
Why Threads Are a Bad Idea (1995) [pdf] - smartmic
https://www.cc.gatech.edu/classes/AY2010/cs4210_fall/papers/ousterhout-threads.pdf
======
oldgeezr
Oh gee I guess I'm a wizard.

Lots of systems/embedded programmers roll their eyes at this kind of talk.
Threads aren't really that hard.

Event queues do have benefits in certain situations. They pair nicely with
state machines. You can easily end up in callback hell though, and it is often
difficult to integrate some long-running, atomic tasks into your event loop.
You end up doing things like having a thread pool, at which point you have to
wonder why you stopped using threads in the first place. Oftentimes a threaded
approach is a cleaner approach. Just get the locking granularity right - it's
not that difficult.

~~~
nostrademons
Systems/embedded programmers roll their eyes at this kind of talk because they
usually control (or at least have visibility into) all of the code that goes
into their stack. Threads aren't that hard under these conditions.

The main problem with threads is that they're non-composable: the set of locks
that a thread holds is basically an implicit dynamically-scoped global
variable that can affect the correctness of the program. If you call into an
opaque third-party library, you have no idea what locks it may take. If it
then invokes a callback into your own code, and you then call back into the
library, there is a good chance that your callback will block on some lock
that a framework thread holds, that framework thread will block on a lock you
hold, and then the code that releases that lock will never execute. Deadlock.

If you control all of the code in your project, this does not affect you:
define an order in which locks must be acquired and released and stick to it.
If all of your dependencies have no shared data and never acquire locks
themselves, this does not affect you (and indeed, this is recommended best
practice for reusable libraries). If you never call back into third-party
libraries from callbacks, this does not affect you, but it severely limits the
set of programs you can write. If all of your dependencies thoroughly document
the locks they take and in which order, this affects you but you can at least
work around the problem areas and avoid surprise deadlocks.
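A common way to enforce such an order is to acquire every lock through a single helper that sorts them by a stable key, so no code path can ever take them in the reverse order. A sketch of that convention in Python (the helper name is made up):

```python
import threading
from contextlib import ExitStack

def ordered(*locks):
    """Acquire locks in one global order (by object id), release on exit."""
    stack = ExitStack()
    for lock in sorted(locks, key=id):
        stack.enter_context(lock)   # acquire; registered for release
    return stack

a, b = threading.Lock(), threading.Lock()
counter = {"n": 0}

def worker(first, second):
    # Callers may name the locks in either order; ordered() normalizes it,
    # ruling out the A-then-B vs B-then-A inversion that causes deadlock.
    for _ in range(1000):
        with ordered(first, second):
            counter["n"] += 1

t1 = threading.Thread(target=worker, args=(a, b))
t2 = threading.Thread(target=worker, args=(b, a))
t1.start(); t2.start(); t1.join(); t2.join()
print(counter["n"])  # 2000
```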

Most application developers do not work under conditions where _any_ of these
are true, let alone all of them. Application development today largely
consists of cobbling together third-party libraries and frameworks, many of
which are undocumented, many of which are thread-unsafe, and many of which
spawn their own threads and invoke callbacks on an arbitrary thread.

~~~
marshray
> the set of locks that a thread holds is basically an implicit dynamically-
> scoped global variable that can affect the correctness of the program

One technique to get a handle on this situation is making the mutexes actual
explicit global variables.

"But global variables are bad" they will say. Yeah. And it reflects the
reality.

"But I need a separate mutex for each object instance like they recommended in
1995
[https://docs.oracle.com/javase/tutorial/essential/concurrenc...](https://docs.oracle.com/javase/tutorial/essential/concurrency/syncmeth.html)
" they will say. Have fun with that.

Python uses, and early Linux kernels used, a single global mutex for access to
all shared mutable state. In my experience, this is an entirely reasonable
design decision for the huge majority of applications.
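A sketch of that design in Python: one explicit global mutex, and every access to shared state goes through it. With only one lock in the program, lock-ordering deadlocks cannot arise:

```python
import threading

STATE_LOCK = threading.Lock()    # the one explicit global mutex
state = {"count": 0, "log": []}  # all shared mutable state lives behind it

def record(event):
    # Every touch of shared state goes through the single lock; there is
    # no second lock anywhere, so no acquisition order to get wrong.
    with STATE_LOCK:
        state["count"] += 1
        state["log"].append(event)

threads = [threading.Thread(target=record, args=(i,)) for i in range(100)]
for t in threads: t.start()
for t in threads: t.join()
print(state["count"])  # 100
```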

------
jeffreyrogers
There is a response to this: Why Events Are A Bad Idea (for high-concurrency
servers)[0]

[0]: [https://people.eecs.berkeley.edu/~brewer/papers/threads-hotos-2003.pdf](https://people.eecs.berkeley.edu/~brewer/papers/threads-hotos-2003.pdf)

------
bunderbunder
Oh, to return to what life was like 23 years ago, when GUI applications were
so simple that you could get away with fitting all their work onto a single
thread.

Nowadays, a great many GUI apps have a lot of data crunching to do in the
background. You've really got two options for how to handle that:

  1. Be intermittently unresponsive, like iTunes.

  2. Do work on a background thread, like decent software.
(Intentionally omitting the option of breaking your work into a bunch of tiny
bits that can be handled on a single thread's event queue like some sort of
deranged node.js app from hell, on the grounds that please no I can't even.)
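Option 2 in miniature, sketched in Python with a thread-safe queue standing in for whatever completion mechanism the GUI toolkit provides (`ui_tick` is a hypothetical timer callback, not any real toolkit API):

```python
import threading, queue

results = queue.Queue()

def crunch_in_background(data):
    # The heavy work runs off the UI thread; the result comes back through
    # a thread-safe queue that the UI thread drains on its normal tick.
    def work():
        results.put(sum(data))   # stand-in for the real data crunching
    threading.Thread(target=work, daemon=True).start()

def ui_tick():
    # Hypothetical periodic callback driven by the GUI toolkit's timer.
    try:
        return results.get_nowait()
    except queue.Empty:
        return None   # nothing ready yet; the UI stays responsive

crunch_in_background(range(1_000_000))
```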

~~~
pjmlp
3. Work on a task handled by a thread pool, like scalable software

4. Work on another process, like safe software

------
Roboprog
The source of the article, Sun, is interesting.

I guess the author knew what was about to be foisted upon the world.

Kudos for trying to warn us.

(I remember reading Novell and OS/2 documentation in the late 80s / early 90s
about threads and recoiling in horror. Of course, all real men must use
threads, cuz they’re faster, even if stupefyingly dangerous)

~~~
J-Kuhn
John Ousterhout is the inventor of the Tcl language.

~~~
yellowapple
Which includes Tk, notable for being a relatively-easy-to-use GUI toolkit that
embraces events as described in these slides.

(You probably already know this; just elaborating for those who might not be
familiar with Tcl/Tk)

~~~
Roboprog
I’ve not used tcl/tk since around 2000, but it was a nice thing to quickly
code up some dialog boxes for operations oriented scripts back in the day.

------
cautionarytale
A serious practical problem with threads mirrors the same problem with C++,
which is that many programmers reach for it _first_ when they should be
reaching for it _last_. Both of these technologies are like swallowing glass,
and the wise programmer will avoid them if at all possible.

~~~
geezerjay
> that many programmers reach for it first when they should be reaching for it
> last.

What's the go-to solution for getting a UI not to block while running a
computationally expensive task that takes a long time to finish?

~~~
dottrap
I don't claim these are "go-to" solutions, but only that there are multiple
solutions to pick from.

One solution is processes (mentioned in the post). Fork a process which does
your computationally expensive thing and then get the result when you are
done. For the security minded, we've seen this make a bit of a comeback
because separate processes can be run with more restrictions and can crash
without corrupting the caller. We see this in things like Chrome where the
browser, renderers, and plugins are split up into separate processes. And many
of Apple's frameworks have been refactored under the hood to use separate
processes to try to further fortify the OS against exploits.

Another solution is to break up the work and process it in increments. For
example, rather than trying to load a data file in one shot, read a fraction
of the bytes, then on the next event loop iteration, read some more. Repeat
until done. This can work with async (like in JavaScript) or with a poll model.
Additionally, if you have coroutines (like in Lua), they are great for this
because each coroutine has its own encapsulated state so you don't have to
manually track how far along you are in your execution state.
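The incremental approach can be sketched with a Python generator, which, like a coroutine, keeps its own position between resumptions so the caller needs no progress bookkeeping (function names are illustrative):

```python
def load_in_chunks(path, chunk_size=64 * 1024):
    # A generator keeps its own file position between resumptions, so no
    # manual "how far along am I" state tracking is needed.
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                return
            yield chunk   # hand control back to the event loop here

def tick(loader, sink):
    # One event-loop iteration: consume a single chunk, then yield the CPU.
    try:
        sink.extend(next(loader))
        return True    # more work remains
    except StopIteration:
        return False   # file fully read
```

The event loop calls `tick` once per pass, interleaving the load with everything else on the queue.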

~~~
geezerjay
> One solution is processes

More expensive to start than threads, and far more expensive and complex and
restrictive to move data around. Sounds like with the exception of some
specific corner cases, threads are a better solution.

> Another solution is break up the work and processing in increments

Either the tasks are broken into ridiculously fine-grained bits that are hard
to make sense of or keep track of, or you still get a blocking UI. Furthermore,
the solution is computationally more expensive.

~~~
cautionarytale
Fork/exec time for extra processes is usually unimportant. If data transfer is
truly a bottleneck, shared memory is as fast as threading.

These costs, though, are generally trivial compared to the lifecycle costs of
dealing with multithreaded code. Isolation in processes greatly enhances
debuggability, and it's almost impossible to produce a truly bug-free threaded
program. Even a heavily tested threaded program will often break mysteriously
when compiled with a different compiler/libraries, or even when seemingly
irrelevant code changes are made. It's a tar pit.

------
bitcharmer
I have seen that argument in the past, but somehow I still can't see how an
event-based approach saves us from the headaches of concurrency.

If you need to deal with parallel processing (which is relatively often in the
real world) you WILL have to face the problems of consistency, visibility and
program order.

Many languages don't even require programmers to have much exposure to
threading mechanics. It's an OS responsibility, and that's not necessarily a
bad thing.

~~~
Spooky23
1995 was a different era.

~~~
pjmlp
Yep, processes were too heavy for 1995 hardware and threads were seen as the
solution for everything.

------
twtw
> Only use threads where true CPU concurrency is needed.

This is the case much more often now than it was in 1995.

~~~
Roboprog
Somebody better tell the Node.js cluster guys :-)

They manage without threads pretty well, I think. Shared state is deliberate,
outside of individual processes, rather than accidental in-process. As it
should be unless you are doing some serious systems level programming.

~~~
oldmanhorton
To be fair, node.js just merged the --experimental-workers module for actual
webworker-style threading.

~~~
amelius
Does it allow shared (immutable) data structures?

I.e. does it allow passing large parts of data structures without copying?

------
outworlder
Under Linux, you don't _need_ threads. Threads and processes are basically the
same thing. You just provide different flags which tell the kernel how it
should view that process, which will impact things like memory isolation, copy
on write, etc.

Under Windows, it is a different story. Threads and processes are wildly
different constructs, and threads are more lightweight. Sometimes, still not
lightweight enough, so they came up with fibers.

~~~
pjmlp
Linux is the exception to how most UNIXes implement threads.

------
devxpy
Threads are not hard. In fact, threads are extremely easy to implement.

However, real threading code is incredibly difficult to reason about just by
looking at it. This makes it easy to introduce race conditions without even
knowing they are there!

There is also the fact that locks don't lock anything! A lock is just a flag
that any code may choose to ignore.

They are not an enforcing tool, just a cooperative one.

(More here:
[https://www.youtube.com/watch?v=9zinZmE3Ogk](https://www.youtube.com/watch?v=9zinZmE3Ogk))

P.S. I created a library that makes it easier to write safer multiprocessing
code:

[https://github.com/pycampers/zproc](https://github.com/pycampers/zproc)

------
pjungwir
I've built things with pthreads a few times, and also used threading in Java,
Rust, Python, and Ruby. (Edit: C# and Perl too IIRC. :-) The best book I've
read about using threading safely was the O'Reilly _Java Threads_ book. It's
been about 16 years, but I remember it being a great "teaching the concepts"
book, taking you through lots of pitfalls and showing how many ways you can
mess up. It taught me way more than just Java. Like oldgeezr I kind of roll my
eyes at the "you must be this tall to use threading" stuff, but I think I
largely owe to that book both my confidence and my wariness. I bet it is still
worth reading today.

~~~
MrBuddyCasino
Java Concurrency in Practice served the same purpose for me. A very good book
indeed.

------
bartread
Also, I have bad news for you if you are a web developer: unless all you're
serving up is HTML (either static or generated on the server[1]) you are
developing a multi-threaded app - it just happens to be the case that the
threads are running on different machines. Race conditions between client and
server code are a thing, so you'd do well to understand the concepts of multi-
threading.

[1] And even in this case you're probably still multi-threaded, although in
most cases it won't feel like it because your server side threads won't share
state.

------
dgreensp
Threads were overprescribed in the 90s. I’m pretty sure the original example
code for drawing an image in Java involved typing “new Thread” so that the
image could be loaded over the network in the background. Truly the way to
write apps in the Internet age!

At the same time, Java’s threads are so easy to use — without the portability
or debuggability issues of native threads — that threads don’t
seem that bad to Java programmers. Yeah, shared state can be a foot gun, but
so can global variables. You just keep things as pure and easy to reason about
as possible. And Java has had concurrency primitives that keep you from having
to deal directly with threads and locks for over a decade.

I don’t think “events” and threads solve the same problem. If your program
would work just as well doing all its work in a single thread then yeah, you
don’t really need threads. If we’re comparing “events and callbacks” async
style to async/await style where you write your code as if it were running in
a thread (even if it isn’t), I think the latter wins.

------
jdonaldson
Why Ideas are Bad Ideas (1995-2018)

------
zaarn
As with almost any feature of a computer or programming language, it helps to
understand threads, be aware of the risks, and know the alternative
approaches.

Threads can be a bad idea, but if you keep in mind what variables you use and
guard shared memory, it's fine. Sometimes you might prefer a process instead
for security/resilience.

------
jlv2
As mentioned in other comments, Ousterhout is the inventor of Tcl/Tk (among
other things). At around the time this was published, Tcl was my favorite
"play with" language, and it naturally lacked any sort of built-in threads
abstraction. Also at the same time, Tcl/Tk had just become a project at
Sunlabs. One of their early projects was to take the event-loop that was the
underpinning of Tk and add it to Tcl.

I started a new project back around then to build a system for deep caching of
web sites to give time-consistent access offline. I implemented it as a web proxy
with an online/offline button. As you browsed web sites, it would crawl
recursively following a set of rules. The intent was to precache content near
what you already explicitly accessed, to make it available offline later on
(we called this the "detachable web").

While not the primary purpose of our project, I put together a demo to
optimize the Alta Vista search results page, which at the bottom only had a
"Next" button (unlike the "1 2 3 4 5..." you see at places like Google today).
When you clicked "Next", it took Alta Vista a few seconds (4-5) to return the
next page of search results. My system would prefetch the 10 pages of results
by POSTing the "Next" for you, basically while you were still reading the
first page results. The result was that "Next" became instantaneous. Again, this is
not why we built this system; this was just one novel approach I used it for.

I mentioned all this because the entire project was implemented in Tcl. Being
influenced by the lack of thread support in Tcl and by the paper mentioned in
the OP, my project utilized an event-driven model for everything, since every
inbound user request could fire off dozens of background fetches, all of which
needed to be done in parallel. Events (and continuations) worked well for
this. I have a paper up from the 5th Tcl/Tk workshop:

[https://www.usenix.org/conference/5th-annual-tcltk-workshop-1997/presentation/caubweb-detaching-web-tcl](https://www.usenix.org/conference/5th-annual-tcltk-workshop-1997/presentation/caubweb-detaching-web-tcl)

I had used Tcl for the project because it let me support all three prevalent
platforms of the time: UNIX, Windows 95, and MacOS 9. Day-to-day work was done
on FreeBSD.

I think I have some commentary in the paper on the effects of the event-driven
approach. What's funny is that I was taken off the project for v2, which the
team then decided would be written in Java using threads, because, well, Tcl
wasn't mainstream enough. In 1997, Java was the rage. The downside is that
they could never get v2 working reliably enough because of the explosion in
memory and processing power it required to accomplish the same work. In Tcl,
having 60 traversals active when it was just 60 continuations (events) just
worked. In contrast, the Java implementation needed 2-3 threads per traversal,
and it just couldn't scale up to that.

~~~
dgsb
Nice story. Interesting notes in your paper on the lack of a standard library.
In the end, I think this is what killed the language.

------
girzel
Just FMI: the "events" approach that's recommended in the article over
threads, that's how Python libraries like tornado and twisted work, right? And
to what extent does the new asyncio Python library assume that functionality?

~~~
nine_k
Events don't share mutable state (at least implicitly), they carry a copy of
data. This eliminates a huge class of thread-related errors.

The canonical thing that works this way is Erlang (and its modern cousin,
Elixir). See also "actor model" (e.g. Akka). It is approximately how Windows
and Mac GUI used to work (back in the day; did not look at these APIs for ~20
years).

Python async is coroutines, a different kind of concurrency. In it, the event
loop is hidden, and coroutines just yield control, implicitly or explicitly,
to allow other coroutines to proceed. In Python, a CPU-bound task can only run on
a single thread, due to the Global Interpreter Lock preventing concurrent
modification of data. Coroutines are still useful both for IO and as a general
way to describe intertwined, mutually dependent computations. (The earliest,
limited Python coroutines were generators.)
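An actor-style sketch in Python: the actor thread owns its state outright, and other threads interact only by sending messages through a queue, never by touching the state directly:

```python
import queue, threading

def actor(inbox, outbox):
    # The actor is the sole owner of its mutable state; no locks needed,
    # because no other thread can reach it.
    total = 0
    while True:
        msg = inbox.get()
        if msg is None:       # poison pill: shut down and report
            outbox.put(total)
            return
        total += msg

inbox, outbox = queue.Queue(), queue.Queue()
t = threading.Thread(target=actor, args=(inbox, outbox))
t.start()
for n in (1, 2, 3):
    inbox.put(n)              # messages are values, not shared references
inbox.put(None)
t.join()
total = outbox.get()
print(total)  # 6
```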

~~~
AnimalMuppet
> Events don't share mutable state (at least implicitly), they carry a copy of
> data. This eliminates a huge class of thread-related errors.

I'm not sure that's true, at least not in my problem space. Copying data
leaves you the possibility of operating on stale data, which will result in
the computation returning the wrong answer. To avoid that, you have to let the
event handler know somehow when the data has changed. How are you going to do
that?

------
rother
I was not coding in 95, and therefore don’t understand the perspective of the
author back then, but it seems clear from the presentation that the culprit
was “shared mutable state”, not “threads”. Wasn’t functional programming a
thing back then?

~~~
insulanus
Yes, but threads (the popular APIs, e.g. POSIX threads, Win32 threads) imply
the availability of shared mutable state, concurrency, and custom-written
locks. And when something is available, it will be used. Heck, you could even
pass a pointer to an address in some other thread's stack with ease.

The amount of experience you needed to program with threads, while also
structuring the rest of your program, could be large, especially if you were
adding threads to an existing program.

Functional programming was cumbersome to pull off in the systems programming
languages available at the time (C).

~~~
rurban
Threads are good, shared state is good if hidden behind a proper protocol,
just locks are evil. Windows and POSIX are to blame.

Nowadays nobody should use locks anyway, as there are much better, faster and
safer variants for concurrency with native threads, based on actor
capabilities and ownership, and avoid blocking IO like hell. No, not Rust.
Rust did it wrong.

Those who do it right are so far Pony, Midori/Singularity, and parrot with
native kernel threads. With simple green threads there are some more, but they
are only usable for fast IO, not fast CPU tasks.

------
urda
Is this still true today? Even back when I was a young coder threads really
weren't that difficult for me to understand and develop. With many of the
modern languages, threading is even easier than before.

------
phkahler
Oh look, an ad for Visual Basic from 1995! Event driven code sucks and is not
better. I learned this writing VB code. Others learned the hazards of event
driven code in the Therac 25.

~~~
Jtsummers
Being event driven was not the problem with the Therac-25. The problems with
that system were no QA, no real testing, and the elimination of the hardware
interlocks that would have entirely prevented the accidents in the first place.

------
ertucetin
Use Clojure, it has built-in Software Transactional Memory support.

------
mesozoic
Must be real wizards these days boys. Today we deal with hundreds of instances
of an application each running many threads across multiple cpu cores.

~~~
pjmlp
And then spend days trying to understand why there is a deadlock when there is
an Eclipse.

------
mailslot
To me, serious work in an event loop feels like a more convoluted form of
cooperative multitasking.

------
jeffrallen
Still are. But goroutines are o.k. :)

~~~
krylon
From the programmer's point of view, goroutines are pretty much the same as
threads.

And they allow you to make the same mistakes you can make with threads.

Don't get me wrong, I love Go. But it does not free you from having to think
about what you are doing.

------
fsnarskiy
OK, threads aren't "BAD" or "GOOD"; threads are a tool to be used correctly.

Threads are like a data super-highway, and all the incorrect uses of them
arise from using them for way too little data. Akin to building a 5-lane
highway for 5 cars to pass.

A thread has some amazing properties: it can switch execution very fast
(built into things at the CPU level) and has memory caching/storing
advantages. A thread is meant for a compute-heavy task like rendering
something, or running a decode in the background, mainly doing heavy math.
Threads provide great things, but at a cost. Just like a highway, they cost a
lot (a lot of memory in your RAM) and require some maintenance and management
(locking mechanisms).

The problems with threads arise when people think it's OK to use them
everywhere, for all tasks, parallel or async.

Example: Apache used to start a thread for each connection to the server,
which at the time cost 40 MB and 0.5 seconds, and this allowed a myriad of
attacks, one of them being Slowloris.

In JavaScript, if you start a new web worker thread, that's actually a new V8
instance and again costs you a lot in memory and startup time.

This "start a thread for everything" was definitely the prevalent thinking in
the first decade of 2000, and people were not really thinking about hidden
costs.

Along comes Ryan Dahl with node.js in 2009 and "OMG everyone forgot there are
such things as event loops"

An event loop is basically a much cheaper, single-threaded, async way of
processing events in an event queue. The big idea here was that in most other
languages, threads waited on any time-consuming I/O to the network or hard
disk while other threads ran in the meantime.

Ryan combined the async nature of event loops with async I/O... rightfully a
very clever move. (Also, blocking on I/O is what often causes thread lockups
in multi-threaded environments.)

This allowed the single-threaded event loop to never really lock up on any
time-consuming but non-CPU-bound task, freeing the CPU to constantly process
the event queue, in a way emulating multi-threading on a single thread.
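That shape is easy to sketch with Python's asyncio (the sleep stands in for network I/O): a single event-loop thread services a thousand "connections" concurrently, with no thread apiece:

```python
import asyncio

async def handle(conn_id, done):
    # Simulated I/O wait: while this handler sleeps, the single event-loop
    # thread services every other connection.
    await asyncio.sleep(0.01)
    done.append(conn_id)

async def main(n):
    done = []
    await asyncio.gather(*(handle(i, done) for i in range(n)))
    return done

done = asyncio.run(main(1000))
print(len(done))  # 1000
```

All thousand handlers finish in roughly the time of one simulated wait, because they overlap on the same thread.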

Going back to the highway metaphor, this would be more like an elevated city
bike path: it can't take heavy trucks (heavy CPU loads), but it can take a
huge number of light processing requests and never lock up, freeing up your
city streets from bikers and leaving them more free for the heavy trucks.

This is how node.js can handle 600k concurrent connections:
[https://blog.jayway.com/2015/04/13/600k-concurrent-websocket-connections-on-aws-using-node-js/](https://blog.jayway.com/2015/04/13/600k-concurrent-websocket-connections-on-aws-using-node-js/)

something you would never be able to achieve if you started a thread for each
one.

Basically this is akin to building 1 dense bike path for 600k bikers, versus
building 600k 5-lane highways down each of which only 1 biker would go.

Where node.js falls short is if you give it heavy math tasks: the event loop
will lock up.

So in my analytics processing server I had a node.js main loop with a bunch of
V8 web-worker thread pools to do the heavy math and statistics, while the main
thread just routed requests and served cached data.

Another consideration, however, is memory leaks: threaded environments tend to
clean up well after themselves, because if there is a leak in a thread it gets
wiped when the thread dies. But node.js is very susceptible to memory leaks.

All these things are just tools, you have to learn when to use the right tool
for the right job.

But I think there are many more pitfalls in building threaded environments
than there are in using event loops. I got the node.js concepts within a week
or two; however, I still struggle with some thread-lock concepts even after
taking classes, and they are way harder to debug properly too. It's that
highly abstract level of thinking that I have a hard time visualizing in my
head, and I am never sure that I thought EVERY scenario through.

~~~
imtringued
> Ryan combined the async nature of event loops with async I/O... rightfully a
> very clever move.

How is it clever? You cannot have async IO without an event loop. Async IO was
a pretty mature technology long before nodejs came out. Netty did this back in
2004. The only special thing about nodejs is that its culture is to be async
by default.

------
pjmlp
Yep, they might be a good solution on resource-constrained hardware, but we
learned the hard way how bad they are from a security and stability point of
view.

In that regard, processes are a much better solution.

~~~
jstewartmobile
I suspect today's downvotes will be tomorrow's i-told-you-so points. The async
cult's days are numbered.

~~~
squirrelicus
What does multiprocessed model have to do with the "async cult"? And who
exactly is in the async cult? Because "async" to me means "async io" like
epoll, kqueue etc, which are pretty much necessary to go from ~100 concurrent
connections to ~10000 in a performant manner. That will never go away while
we're on x86-based architecture.

~~~
jstewartmobile
Async cult (mostly node people) proselytizing that promises/callbacks are _the
one true way_ while glossing over many many scenarios where that execution
pattern is less than desirable.

Wasn't projecting async all the way down to IO primitives.

Commented because most of the debate seems to be threads vs callbacks--with
processes being unwisely overlooked.

~~~
squirrelicus
Interesting. Fundamentally, in order to achieve async io, a continuation is
necessary. Syntax may hide this (e.g. async/await), or make it apparent as in
callbacks/promises. I can't blame JavaScript for not having syntax that makes
async pretty.

That being said, if we're talking not about IO, but about cpu/memory bound
problems... Well I'd be lying if I said it was uncommon in my career to come
across people who assumed, to the detriment of simplicity, quality, and
performance, that a calculation (e.g. process a list mapping op with
AsParallel/parallelStream) would be aided by parallelism. That is just
ignorance by Dunning-Kruger devs who don't apply a critical eye to their own
experiences.

~~~
jstewartmobile
You're focusing on performance. Performance is not the world. Parallelism is
not the only reason to use cooperating processes. There are other
considerations--like task fairness, and isolation.

We _desperately_ need more isolation in software.

~~~
squirrelicus
Absolutely. I agree.

