
Must Be This Tall to Write Multi-Threaded Code - mnemonik
http://bholley.net/blog/2015/must-be-this-tall-to-write-multi-threaded-code.html
======
jandrewrogers
The problems with traditional multi-threaded concurrency go beyond just
complexity and safety. They also offer relatively poor performance on modern
hardware due to the necessarily poor locality of shared structures and context
switching, which causes unnecessary data motion down in the silicon. Whether
or not "message passing" avoids this is dependent on the implementation.

Ironically, the fastest model today on typical multi-core silicon looks a lot
like old school single-process, single-core event-driven models that you used
to see when servers actually had a single core and no threads. One process per
physical core, locked to the core, that has complete ownership of its
resources. Other processes/cores on the same machine are logically treated
little differently than if they were on another server. As a bonus, it is very
easy to distribute software designed this way.

People used to design software this way back before multithreading took off,
and in the high-performance computing world they still do because it has higher
throughput and better scalability than either lock-based concurrency or lock-
free structures by a substantial margin. It has been interesting to see it
make a comeback as a model for high concurrency server software, albeit with
some distributed systems flavor that was not there in the first go around.

~~~
pcwalton
Shared-nothing is great when you can do it. But sometimes the cost of copying
is too high, and that's what shared memory is for.

Take, for example, a simple texturing fragment shader in GLSL. You're not
going to copy the entire texture to every single GPU unit; it might be a
4096x4096 texture you're rendering only a dozen pixels of. Rather, you take
advantage of the caching behavior of the memory hierarchy to have each shading
unit only cache the part it needs. This is what shared memory can do for you:
it enables you to use the hardware to _dynamically_ distribute the data around
to maximize locality.

~~~
jandrewrogers
I did not mean to imply that you are giving every process a copy of all the
data. The main trick is decomposition of the application, data model, and
operations such that every process may have a thousand discrete and disjoint
shards of "stuff" it shares with no other process. The large number of shards
per process means that average load across shards will be relatively balanced.
The "one shard per server/core" model is popular but poor architecture
precisely because it is expensive to keep balanced.

However, in these models you rarely move data between cores because it is
expensive, both due to NUMA and cache effects. Instead, you move the
operations to the data, just like you would in a big distributed system. This
is the part most software engineers are not used to -- you move the operations
to the threads that own the data rather than moving the data to the threads
(traditional multithreading) that own the operations. Moving operations is
almost always much cheaper than moving data, even within a single server, and
operations are not updatable shared state.

This turns out to be a very effective architecture for highly concurrent,
write heavy software like database engines. It is much faster than, for
example, the currently trendy lock-free architectures. Most of the performance
benefit comes from much better locality and fewer stalls or context switches, but it
has the added benefit of implementation simplicity since your main execution
path is not sharing anything.

~~~
arielby
Don't you lose all of your performance gains in RPC overhead? How do you avoid
latency in the data thread (do you have one thread per lockable object? won't
that be more than 1 thread per core?) - these are the reasons lock-free is so
popular.

~~~
simoncion
> Don't you lose all of your performance gains in RPC overhead?

If one did, then why would anyone who knew what they were talking about (or
even just knew how to write and use a decent performance test) advocate this
method? :)

------
bsder
Sigh. What's wrong with using lock-free data structures?

Go study java.util.concurrent. It's one of the absolute best libraries ever
written by some of the smartest programmers I have ever seen.

The primary question is "Do I really need to _wait_ or do I just need to be
_consistent_?" 90% of the time the answer is that _consistent_ is good enough.

Lock-free data structures are not a panacea. They don't always do as well as
locks in the face of contention. However, if you have that much contention,
congratulations, you have an actual spot you really need to optimize.

By default, though, lock-free data structures protect you from so much fail
it's ridiculous. I don't dread concurrent programming if I have a good lock-
free data structure library.

That having been said, if you _really_ have to wait (normally for hardware
access), then you _MUST_ do certain things. Your "lock" _MUST_ be as small as
possible--if it isn't "lock with timeout", "single small action that always
goes to completion even if error occurs", "unlock" -- _YOU HAVE FAILED. START
OVER_. Also, note the "timeout" portion of the lock. "Timeout" _MUST_ be
handled and _IS NOT NECESSARILY AN ERROR_.

Now, these don't get all the situations. People who need "transactions" have
hard problems. People who have high contention have hard problems.

However, I can count the number of times I genuinely needed to deal with
contention or transactions on one hand and still have two fingers left over.

Whereas, I have lost count of the number of times that I cleared out all
manner of bugs simply by switching to a lock-free data structure.

~~~
kabdib
I'm terrified of lock-free stuff because it typically depends on nasty things
like memory order, cache behavior and other subtle things, mostly nonportable
and simply awful to get right on a new platform.

Also, I keep finding bugs in lock free structures. That's annoying.

~~~
raverbashing
This is the kind of thing you abstract on a library

~~~
kabdib
Except that libraries are written by people, who have assumptions and probably
limited budgets, and the sheer inventiveness of hardware engineers at making
things like memory ordering different, or even buggy [game consoles are famous
for this] . . . well, libraries aren't magic. You have to test them somewhere,
and the failures are often subtle, happen sporadically, are hard to reproduce
and are very difficult to debug.

I've used them. But in limited areas, where the win is great and they are
explicitly called out as portability hazards, with alternate implementations
of the areas in case the lock-free stuff goes pear-shaped. And from time to
time we look at these things and wonder if the performance gain is (a) real,
and (b) worth the headache.

------
smegel
I've written a lot of multi-threaded code, but I don't think I've written ANY
multi-threaded code that doesn't involve queues passing objects (hopefully
safely) between threads. As in, no locks, no semaphores, no accessing shared
state between threads (OK, apart from a global flag structure that just set
various error conditions encountered and was only interacted with by atomic
get/set operations and where order of access was never important). Adding a
lock to a program is like a huge red flag - stop everything and really think
about what you are doing.

~~~
JoeAltmaier
Some ivory tower nut proved in the 70s that semaphore-and-thread was logically
equivalent to queue-and-message, i.e. the same results could be obtained by
each.

But queue-and-message is superior in very many ways. Done carefully, you have
only the queue as a shared structure. If message processors are non-blocking
then the entire system is non-blocking. Deadlock can be statically determined
by examination of the message vectors.

And last but certainly not least, in the debugger, all important state is in
the message or local variables in a message processor. No enormous stacks to
dive through, trying to find who did what to whom. Simple single-threaded
message processors have straightforward logic. And a message-aware operating
system can make the work queues transparent.

------
nickpsecurity
Sounds like they're just using the wrong tools. Eiffel's SCOOP model[1] made
things a lot easier back around '95. They've been improving on it since, with
a lot of great work recently [2]. Various modifications proved absence of
deadlocks, absence of livelocks, or guarantee of progress. I believe a version
was ported to Java. A few modern variants have performance along lines of C++
w/ TBB and Go.

What are the odds that it could be ported to a restricted use of C++ language,
I wonder?

Note: Ada's concurrency strategy also prevented quite a few types of errors.
They're described in this article [3] on ParaSail, a language designed for
easy concurrency that goes much further.

[1] [http://cme.ethz.ch/scoop/](http://cme.ethz.ch/scoop/)

[2] [http://cme.ethz.ch/publications/](http://cme.ethz.ch/publications/)

[3] [http://www.embedded.com/design/programming-languages-and-too...](http://www.embedded.com/design/programming-languages-and-tools/4375616/ParaSail--Less-is-more-with-multicore)

------
tormeh
People: There are solutions to this shit. If you're building a distributed
system or need to deal with loss of data regardless, use actor systems (Erlang
or Akka). If you need something that's not quite as probabilistic and are
willing to handle deadlocks use CSP (Go or Rust). If you need absolute
determinism and you're willing to pay for it in performance use SRP (Esterel
or possibly maybe at your own risk Céu).

If you need shitstains in your underwear use locks and semaphores.

~~~
nicerobot
Agreed. Nowadays, you should not even think about threads. If you are, you're
doing it wrong. You should only be concerning yourself with immutability or
not sharing state. Hardly rocket science. Here's how easy it is in Go:
[https://github.com/Spatially/go-workgroup](https://github.com/Spatially/go-workgroup) just write a function and send it data. Done. Granted, to do that
distributed requires some thought for the remoting but still not a concern for
the developer.

~~~
mike_hearn
I get the feeling a lot of developers have only ever written one or two very
specific kinds of software and think that their experiences generalise to all
kinds.

Threading with locks isn't going to go away any time soon no matter how
religiously one states their opposition to it. Take the case of mobile or
indeed desktop app programming:

• Memory usage is important

• Real-time responsiveness is important

• Avoiding slow operations on the main thread is important

One consequence of these constraints is that if you have a simple actor-like
design with a GUI thread (frontend) and a backend thread (network, other
expensive operations) you can easily write crappy software. If the GUI needs a
bit of data and needs it _fast_ because the user just navigated to a new
screen, sending a message to the backend actor and waiting whilst it finishes
off whatever task it's doing isn't going to cut it. You need fine grained
access to a subset of the data being managed by the backend, and you need it
now, without yielding to some other thread that might not get scheduled
quickly. And you need to avoid delays due to duplicating a large object graph
then garbage collecting that immutable copy or (worse) running out of RAM and
having your app be killed by the OS.

In some kinds of web server I've worked on, you're serving a large mostly-but-
not-quite immutable data store. That data set must be held in RAM and you
cannot have one copy for each thread because that wouldn't fit into the servers
you use. And the data set must be hot-updateable whilst the server is running.
You cannot just code around these requirements, they're fundamental to the
product.

You can sometimes accept doubling your memory usage so the serving copy can be
immutable whilst a new copy is created and updated, but sometimes that just
makes you more expensive than your competitors.

There are lots of types of programming where for performance, cost or other
reasons, you simply cannot say "shared state is hard so we will never do it".
You just have to bite the bullet and do it.

~~~
nicerobot
Actors/messaging doesn't imply a remote-backend architecture. It only implies
messaging and mutually-exclusive _writable_ state.

GUIs are a poor example as they are inherently single-threaded frontends, so
performing simultaneous actions is already often implemented utilizing
message-passing to backend threads which report back to a single-threaded
event-loop. That architecture can be local, backend "threads" or remote
processes, as it doesn't much matter to the GUI.

With regard to in-memory persistence, again, you almost certainly would not
copy data per thread. For a situation like you're describing, a Redis-like
architecture is all you need with a few atomic primitives. But, again,
incredibly easy to implement in Go and is certainly _not_ rocket science.

Of course, someone is going to still be using locks, but it's a diminishing
number of developers that need to code at that level since there are better
higher-level techniques that serve many purposes and protect against many
types of errors.

------
Animats
The main problem with multithreaded programming is that most languages are
clueless about threads and locks. They were an afterthought in C, and were
viewed as an operating system primitive, not a language primitive. The
language has no clue what data is locked by which lock. There's no syntax to
even talk about that. Of course concurrency in C and C++ gets messed up.

Rust has a rational approach to shared data protection. Shared data is owned
by a mutex, and you must borrow that mutex to access the data. You can have N
read-only borrowers, or one read-write borrower. The borrow checker checks
that at compile time. This gives us an enforceable way to think about who can
access what.

~~~
gopalv
> Shared data is owned by a mutex, and you must borrow that mutex to access
> the data.

This is how sane locking code works in C or C++.

The critical issue is that we have a lot of code already written which doesn't
respect this simple rule.

90% of the time, that C/C++ code needs to be re-architected to make the locks
either more comprehensive (to fix races) or narrower (for performance).

Rewriting in Rust generally achieves that, because it is a re-architecting and
refactoring step with concurrency in mind.

And if done neatly, this leaves no room for the next guy to come in and undo
parts of that design, because violating it takes more work than
keeping it - the wrong approach is suddenly the harder one to get to, unlike
C/C++.

~~~
Gankro
The fundamental problem with this pattern in C++ is that you can always just
keep around a pointer into it and then the "mutex guard" thing falls apart.
There's no way to communicate to the language that this pointer can't hang
around any longer than the mutex guard (because that's the only thing
preventing the pointer from being racy).

You basically can't write safe abstractions in C++ in the face of pointers.

~~~
robotresearcher
> You basically can't write safe abstractions in C++ in the face of pointers.

Sure you can, with accessor methods (get/set). Don't expose the address of the
underlying data structure. Don't expose the lock object either.

~~~
jacques_chester
I think you're talking past each other.

In both cases you need discipline to correctly interact with shared data.

The point is: some languages enforce discipline with compilers. Some rely on
engineers to enforce it themselves.

Of the two, compilers appear so far to be more consistent.

~~~
robotresearcher
> compilers appear so far to be more consistent.

That's true. Which is why object membership is private by default in C++. You
can't get the address of an object member unless the author expressly allows
you. That's why I think the GP's complaints about pointers making safe shared
data nigh-impossible in C++ were maybe overstated. The language/compiler has
things to help you. Much more than C, anyway.

------
SCHiM
I got quite frustrated when I read this article. That's because this article,
and the many others like this, confuse the real issue.

This article, and those like it, all state that the problem with multi-
threading and synchronization is inherent to the programming
paradigm/language/architecture you're using:

> "Buggy multi-threaded code creates race conditions, which are the most
> dangerous and time-consuming class of bugs in software"

> "because the traditional synchronization primitives are inadequate for
> large-scale systems."

Ok. Fair enough, now tell us why that is so.

I get quite annoyed when the author then proceeds to turn it all around by
saying this:

> "Locks don’t lend themselves to these sorts of elegant principles. The
> programmer needs to scope the lock just right so as to protect the data from
> races, while simultaneously avoiding (a) the deadlocks that arise from
> overlapping locks and (b) the erasure of parallelism that arise from
> megalocks. The resulting invariants end up being documented in comments:"

> "And so on. When that code is undergoing frequent changes by multiple
> people, the chances of it being correct and the comments being up to date
> are slim."

Implying that the real problem with locks/threading/synchronization is
actually communication, proper documentation discipline, programmer skill
(soft and hard).

Of-course I'm not saying that the process of using primitive synchronization
methods can't be abstracted over to make it easier to write _proper_ multi
threaded code. It's just that this really feels like subjective politicking
very much like the aversion to (proper use of) goto in C/C++ code.

~~~
Demiurge
This made me think. I can't actually agree that it's a matter of language
primitives. It's very much about the logical complexity of sharing anything.
It seems to be a fundamental, recurring theme in CS, data and logic. In a
single thread, in imperative language, there is a sequential list of logical
steps and manipulation of data, or state. With two or more threads, you have
multiple sequence of statements, but they can try to manipulate same data,
it's effectively mashing two sequences together in random order! If that is
what you want to do, I don't see anything you can really do to make it not
rocket science. I'm sure it can be made easier by using some safer primitives
for data access, but I don't see how the high-level logical races can be
eliminated. As in, a language can eliminate an object being deleted when it's
going to be still accessed, but it can't eliminate a student getting an F at
midnight while his homework program is still being tested.

------
SEMW
> In this approach, threads own their data, and communicate with message-
> passing. This is easier said than done, because the language constructs,
> primitives, and design patterns for building system software this way are
> still in their infancy

"Still in their infancy"? That's basically a description of Erlang's
concurrency model, almost three decades old now.

Is there a concurrency equivalent of Spencer's law -- something along the
lines of " _Those who do not understand Erlang are doomed to reinvent it_ "?

~~~
vezzy-fnord
occam, as well.

They're not entirely wrong, though. Actor model and CSP are in their infancy
w.r.t. mainstream usage, even though they seem to be working really well.

Then Erlang/OTP gives you all sorts of other perks besides concurrency.

~~~
MichaelGG
OTP will probably really take off once there's a solid, maintained, port for a
nicer and faster language.

~~~
shepardrtc
The Erlang team is working on a JIT for the language, and the first release
should be out soon. That should help with the "faster" part. "Nicer" is rather
subjective. I happen to enjoy how Erlang looks and works, though it took me a
couple of tries before I really liked it. I'm not sure why people don't jump
on the idea of a bulletproof VM.

~~~
MichaelGG
That it doesn't seem as concise as an ML is one of my factors in "nice".

I guess I don't care about a bulletproof VM because I've not had issues with
the JVM, CLR or other systems. And BEAM reeks of instability.

Furthermore the resulting apps seem to bear no relation to Erlang's
robustness. Look at RabbitMQ. Lots of stability problems... So what's Erlang
saving us from?

~~~
tormeh
Your sense of nice conflicts with Erlang's goal of supporting reliable
systems. For example, Erlang is not designed using the paradigm du jour. It
may seem like Erlang borrows from functional programming, but Erlang's
designers invented all that independently because it supported writing
reliable systems. Concise and maintainable are antagonistic qualities, and so
Erlang's designers sacrificed all the conciseness they could get away with.

Erlang is not designed to be nice, elegant or have any other property people
normally advertise their language as having. Erlang is designed so that
programmers at Ericsson could write better code for phone switches, with
constant feedback from said programmers. That's Erlang's mission statement,
originally.

If you want concise actors, Scala with Akka allows you to write pretty
unreadable code if you want to.

~~~
SEMW
> Your sense of nice conflicts with Erlang's goal of supporting reliable
> systems.

Obvious counterexample: Elixir. Plenty of modern niceties that Erlang lacks
(rubyesque syntactic sugar, scheme-style hygienic macros, the pipe operator,
mix, etc), but shares most of Erlang's semantics and ultimately compiles down
to the same bytecode, so AFAICS doesn't sacrifice any of the things that make
Erlang good for building reliable systems.

~~~
tormeh
Syntax is crucial when developing for reliability. Brainfuck trivially proves
this. Both semantics and syntax need to support reliability.

~~~
SEMW
I'm not sure if you're implying that Elixir's syntax makes it less good at
developing for reliability than Erlang. If so, I'd appreciate if you could
explain your thinking further -- if the existence of brainfuck proves that
Erlang's syntax is better than Elixir's at developing for reliability, I'm
afraid the nature of that proof is eluding me.

------
rayiner
It's much easier to use threads than to use them properly. Arguably, the
strictures of Rust's type system make it harder to use threads. But they make
it almost impossible to use threads improperly. Both are probably good things.

I have seen some real doozies writing multithreaded code. We had a relatively
simple data analysis project that took in spectrum measurements from a piece
of hardware, logged them, did some basic visualizations, and allowed for
controlling the hardware. Each of these functions ran in one or more threads.
Imagine my surprise when I saw lots of uses of CreateThread but nary a call to
WaitForSingleObject or even EnterCriticalSection. I think there may have been
a Boolean flag to "coordinate" a pair of producer/consumer threads.

~~~
ArkyBeagle
The visualization end should have been in mostly only one direction and pretty
much non-blocking ( except for when you're out of data ).

The other side - configuring the hardware - is my bread and butter, and there
are a few simple abstractions that are pretty old now ( I have been using them
since the late '80s ) that will help greatly.

Sadly, for inexplicable economic reason(s), these are less well known year by
year.

Also - for extra points, think about how you'd do that in _one thread_. Betcha
can... although multiprocessor can be pretty cool. Now write it to where it
doesn't matter if it's one or multiple threads...

~~~
rayiner
My first inclination, before putting locks everywhere, was to rewrite it as an
event-driven application using WaitForMultipleObjects. I can't remember now
why that didn't work.

------
nbardy
It seems like the key to writing concurrent code is to abandon the idea of
understanding how to construct a concurrent architecture and to figure out how
to adopt a concurrent pattern which provides certain guarantees. Often it is
something baked into the language, but frequently it is a library. This is one
of the reasons I'm so thrilled with Clojure.

1) Because it has STM baked in and there is a core library for CSP.

2) Because it is a lisp so adding foreign syntax is as simple as a library and
doesn't need to be a language extension.

------
chipsy
The article's style got me into a ranting mood. I don't want "allusion links"
that surface vapid text like "new superpowers" or "never ending stream of
goodness". You are forcing me to click on them to know WTF you mean.

~~~
woah
It's amusing that an article's style can get you into a "ranting mood".

------
steven2012
I hate blog posts like this.

You get one guy, who is seemingly very smart, and he says basically "Don't do
multithreading, it's very hard. Only an elite few, such as me, can do this
right, so most of you out there DON'T DO IT!"

It's bullshit. Mainly because it's no harder than anything else, and has just
as many pitfalls as every other type of programming. Yes, to a certain degree
multithreading is hard, but it's not rocket science. But PROGRAMMING is hard.
Not just multithreaded programming. There's nothing very special about
multithreaded programming that should scare off people from trying it. Sure,
you might fuck up, but that's true of any kind of programming.

For example, our entire company was almost completely brought down a few
months ago by our "architect" implementing a feature so poorly that it caused
massive system instability. What was this feature? It essentially boiled down
to a 1 or a 2. Customer accounts were tagged with either a 1 or a 2, and it's
supposed to take a different code path for each, but he made it so fucking
complicated and didn't do his due diligence that the entire weight of his code
caused significant downtime, and a customer that accounts for 60% of our
revenues almost walked. And none of this is rocket science.

Of course, I worked at another company where one engineer thought "oh,
asynchronous APIs are faster than synchronous APIs" so they implemented the
entire API asynchronously. Of course, that required mutexes on the server
side. And then more mutexes. And it got to the point where the performance was
hell because of the unintended consequences of trying to make things faster.
You would write a new API and the server would barf saying "You took the locks
in the wrong order" but there was no indication of you ever doing anything
wrong. It was a mess. So I get what the OP is saying, but it's not specific to
just multithreadedness. I bet the same programmer would have made a mess of a
single-threaded app as well. They are just shitty or careless programmers.

If you're careful, multithreaded programming is helpful and you can see some
significant performance boosts from it. But like _every other paradigm in
programming_ , don't overuse it. A judicious use of simple multithreaded
programming might help a lot, but there are few apps that benefit from an
extremely complex system with hundreds of threads, massive amounts of mutexes,
etc.

~~~
hamburglar
The reason I bristle about blog posts like this is that there are two wholly
disparate types of multi-threaded programming:

First, there's the type that's _hard_ and should probably be avoided except by
supergeniuses. This involves big hairy lock graphs where locks are held across
complex operations that may involve other locks, and swirling dependencies of
doom. This shit is nasty, and I completely agree that you must be "this tall"
to be trusted with it.

The _other_ kind of multi-threaded programming is the simple kind, where you
need to have a threadsafe interface to a module, the locking is dead simple if
you know what you're doing, and your lock graph has about three states, all of
which clear in constant time and have no dependencies. In this case, there is
no excuse for not having this basic competency, and the attitude that we
should all just throw up our hands and never hope to write multi-threaded code
again is massively counterproductive. This shit is _not_ brain-bendingly hard,
it just takes a small amount of practice and discipline.

Let's stop pretending all multi-threaded programming is wizardry.

~~~
bholley
I think you're both misinterpreting the title.

The sign is near the ceiling. It's not a question of some people being taller
than others. Nobody is that tall - not even dbaron (pictured in the photo),
who is one of Mozilla's three Distinguished Engineers.

Carefully balancing swirling dependencies of doom doesn't make you a great
programmer, at least not in the world of large-scale systems. Choosing the
right design to avoid or eliminate those swirling dependencies is much more
important.

~~~
hamburglar
That's fine, and I even agree -- perhaps nobody should venture into the
swirling dependencies of doom territory with multi-threaded programming.
However, my point is that this type of argument makes it seem as though
_nobody_ should attempt the more mundane, turn-the-crank, not-very-hard multi-
threaded programming either, which is a bad attitude to perpetuate.

------
mannykannot
"The resulting invariants end up being documented in comments."

There's your problem. If you are going to use locks, you need a wider view of
the system than you get at the source-code level. It is doable, but there is a
big impedance mismatch between this approach to software development and agile
methods.

------
tsotha
It's really not that hard to write multi-threaded code. I just laugh when I
read articles like this - I've been doing it for more than fifteen years now.
By taking a tool like that away from your team you're stunting their growth
and your product.

~~~
angry_octet
You really didn't read the article did you?

He makes it quite clear that they (i.e. mozilla) have tried to multithread
things _many times_ and the resulting complexity has led to bit rot and
increased bugginess. In that context your boasting seems inflated and inane.

~~~
tsotha
To what should I give more credence - an article in which someone says
something is impossible, or large, multi-threaded applications at my workplace
grinding through data around the clock?

I feel like a blacksmith who's been told horseshoes are impossible to make.

------
jondubois
Thread-based concurrency has no future. It's complex and it doesn't scale
beyond a certain point. Process-based concurrency is relatively simple
(especially if your programming language has good async support) and it can
scale indefinitely.

The one advantage of threads is that the overhead is lower when operating at
low concurrency. But it's like algorithmic complexity, people only care about
growth in complexity not about the initial offset.

~~~
incepted
> Thread-based concurrency has no future. It's complex and it doesn't scale
> beyond a certain point.

Beyond what point? Every single company that handles petabytes of data,
starting with Google, seems to be scaling just fine with thread-based
concurrency.

And why shouldn't they? Thread-based concurrency has decades of study behind
it, it's very well understood, the tooling is terrific (IDE's even tell you
ahead of time when deadlocks can happen and when they do, they can tell you
exactly why) and the performance is unmatched.

I'd say thread-based concurrency is going to be around for a while, as opposed
to the many fads that come and go trying to replace it, starting with actor-
based concurrency and transactional memory.

~~~
jondubois
We haven't yet seen the limits of thread-based concurrency because CPUs have
only just started to scale out (by adding more cores). You're not going to
experience any issues if you just have a small number of threads spread out
over a small number of cores.

If you had 100+ cores (just guessing) and several of them tried to read or
write a specific shared memory location, they would spend a lot of time
waiting on the lock for each other to finish (assuming you're using mutexes).

Maybe using semaphores could work but your code will end up looking like a
mess.

With process-based concurrency, each process can only rely on its own memory
pool, so that does use up more memory, but processes are fully independent of
each other (fully parallel), so no time is wasted waiting on anything.

See
[https://en.wikipedia.org/wiki/Amdahl%27s_law](https://en.wikipedia.org/wiki/Amdahl%27s_law)
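The contention argument above can be sketched in Go (a toy illustration with invented function names, not a benchmark): both versions compute the same total, but the sharded one never takes a lock, because each goroutine exclusively owns its own slot.

```go
package main

import (
	"fmt"
	"sync"
)

// sharedCounter: every goroutine hammers one mutex-guarded counter,
// so all cores contend on the same cache line and the same lock.
func sharedCounter(workers, iters int) int {
	var mu sync.Mutex
	var total int
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < iters; i++ {
				mu.Lock()
				total++
				mu.Unlock()
			}
		}()
	}
	wg.Wait()
	return total
}

// shardedCounter: each goroutine owns its own slot and results are
// merged only once at the end -- no lock needed during the hot loop.
func shardedCounter(workers, iters int) int {
	counts := make([]int, workers)
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			for i := 0; i < iters; i++ {
				counts[id]++ // single owner of this index: no data race
			}
		}(w)
	}
	wg.Wait()
	total := 0
	for _, c := range counts {
		total += c
	}
	return total
}

func main() {
	fmt.Println(sharedCounter(8, 10000), shardedCounter(8, 10000))
}
```

(The sharded version still suffers false sharing on adjacent slice elements; per-process memory, as the comment suggests, avoids even that.)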

------
MCRed
Erlang and Elixir solved this problem. I only write multi-threaded code in
very limited cases when it makes sense to split processing out of UI on mobile
devices.

Everywhere else I use Elixir, and I write multi-process code and I don't think
twice about it.

And I never run into problems.

I'm really feeling like people keep choosing tools that haven't solved the
problem, or even tried to, and then thinking that the problem is perennial.

It was solved a long time ago by erlang.

~~~
iso8859-1
Can I make a 3D shooter using Erlang?

~~~
Profan
Certainly there's nothing stopping you in particular (except maybe the
current lack of library support for it, but that could be fixed), though
Erlang isn't an especially good platform for the task.

So while some people claim Erlang/Elixir may have solved the problem, they
shouldn't lose sight of the fact that there are other domains than their own.
That said, there has long been support for message-passing systems of various
kinds even in languages like C and C++ and their siblings, just not integrated
into the language.

(I love Erlang/LFE personally, but wouldn't write a 3D shooter in it)

------
zzzcpan
> However, these programmers aren’t fleeing concurrency itself - they’re
> fleeing concurrent access to the same data.

He's not wrong.

Modern real world example: the Golang authors designed the net library in
such a way that everyone who uses it has to think about concurrent access to
shared mutable state. Which is hard and unnecessary. Event loops never had
this problem, but for some reason got labeled "non-idiomatic" by Golang folks.
So I had to implement an event loop myself.

------
atsaloli
Sigh. No mention of logic verification that there are no race conditions. The
problem has been solved by Dr. Holzmann at JPL:
[http://www.verticalsysadmin.com/making_robust_software/](http://www.verticalsysadmin.com/making_robust_software/)

------
zubirus
>> In this approach, threads own their data, and communicate with message-
passing.

This is the same paradigm as MPI, the Message Passing Interface. Using it,
you also get for free the ability to deploy your "threaded" code on
distributed-memory architectures. But anyone with even a bit of experience
with this standard can tell you how tedious it is to develop parallel code
with it. Maybe this is a product of the paradigm, or just the verbosity of
the API (see for example:
[http://www.mpich.org/static/docs/v3.1/www3/MPI_Alltoallv.htm...](http://www.mpich.org/static/docs/v3.1/www3/MPI_Alltoallv.html)).
I wish there were some sort of OpenMP or Intel TBB equivalent for MPI to ease
the pain.

------
aidenn0
My dad said he had to read Hoare when he got his M.S. in the early '80s, and
that half the people who read it didn't understand it, and half the people who
understood it ignored it. It's 30 years later and people are still using
crappy synchronization primitives.

------
ddmills
There is a relatively new actor-like language called paninij[1] which uses the
idea of 'capsules'. I have been developing a Java-annotation-based version of
it called `@PaniniJ`. Capsule-oriented programming enforces modular reasoning,
which in turn allows the code to be transformed automatically into
multithreaded goodness.

[1] [http://paninij.org/](http://paninij.org/) [2]
[https://github.com/hridesh/panini](https://github.com/hridesh/panini)

~~~
ArkyBeagle
Event-driven with formally designated actors and protocols between actors is
pretty old. '90s at least. An old tool, ObjecTime, deferred decisions about
which actors went on what threads until the very last thing. The default was
round-robin, run-to-completion.

It took some measure of work, but people ran ObjecTime code on bare metal.

Panini looks very nice.

------
rpcope1
I think you really ought to have to read The Little Book of Semaphores before
you're allowed to touch multi-threaded code. [1]

[1] -
[http://www.greenteapress.com/semaphores/downey05semaphores.p...](http://www.greenteapress.com/semaphores/downey05semaphores.pdf)

------
ArkyBeagle
I am utterly ignorant of what Gecko looks like, but in larger realtime
embedded work, things always seem to end up in more formal design
methodologies using transactional models such as message sequences
(frequently expressed in charts).

------
kabdib
Coroutines are great stuff. Being able to yield for an async result and then
wake up later, without having to do expensive and buggy lock rendezvous
nonsense, is manageable and scalable.

------
hyperpallium
Shared-nothing message passing is an answer, as used in Erlang, but I seem to
recall reading that races and deadlocks can still occur, just at a higher
level.

------
opnitro
Just so you know, this is broken on iOS Safari. I love the title though.

------
zobzu
Love this sign; too bad I heard it's gone.

------
bronz
What is racing?

~~~
AnimalMuppet
Race conditions.

For example, thread 1 produces a value, and writes it into a member variable
of a class. Thread 2 uses that value, but does not synchronize with thread 1
to make sure that it's been produced. But that's fine, because thread 1
finishes before thread 2 uses the value. That is, thread 1 _almost always_
finishes first. But if it doesn't (if it loses the race), then chaos happens -
chaos that is very hard to reproduce or debug.

~~~
bronz
Thank you.

------
batou
Run this in your browser's console window to make it actually scrollable
without playing find the sodding scrollbar:

    document.getElementById("contentpane").style.width = "100%"

~~~
urda
It was scrollable for me using standard keyboard and mouse controls. I did
not have to play find-the-scrollbar. Sounds like a local client issue on your
end.

Edit: yeah no, still working just fine. Downvoting should not be used to
signal whether a problem exists or not. What's the deal today, Hacker News?

~~~
Nadya
Move your mouse outside of the white blog area and try to scroll. That's the
problem being addressed by making the white area take up the full width rather
than 70% of it.

