Threads Are a Bad Idea for Most Purposes (1995) [pdf] (stanford.edu)
248 points by ptx on Jan 28, 2020 | 234 comments

I started doing massively parallel programming on SGI systems around the time this paper was published. SGI machines at the time could have 64 CPUs in a single system image, which was very novel. Sun was working on its early multi-core workstations, and companies like Cray were pushing different models of distributed computation.

This paper came at a time when threads were really painful to work with. POSIX threads were still new and mostly unsupported, so you were stuck with whatever your OS exposed. On IRIX, you would do threads yourself by forking and setting up a shared memory pool; on Solaris, you had the best early pthread support; in Java, you used the native Thread classes, which only really worked on Solaris at the time. It was a mess!

This mess is now solved. Pthreads are everywhere, C++ has std::thread, Java threads work everywhere, and we've had many new languages come out which handle parallelism beautifully - for example, the consumer/producer model built into channels in Go is very elegant. The odd one out is Win32, but it's close enough to pthreads for the same concepts to apply.

Event-driven programming has also become threaded; this is how the whole reactive family of frameworks, node.js, etc., handle parallelism.

As someone who's been doing this for a long time, what I find confusing is higher level constructs that try to hide the notion of asynchronous operations, such as futures and promises. As a concept, they're fine, but they're difficult to debug because most developer tools seem not to care about making debugging threads easier.

> This mess is now solved.

If you say so. I can't count the dollars I've made fixing other people's poorly written multithreaded code in my entire career, including in the last 3 years.

Thread support is [more or less] standardised in all major OS-es now, sure. Doesn't change the fact that it's an extremely bad fit for the human brain to think about parallelism.

Stuff like actors with a message inbox (Erlang/Elixir's preemptive green threads) or parallel iterators transparently multiplexing work on all CPU cores (Rust's `rayon` comes to mind) are much better abstractions for us poor humans to think in terms of. Golang's goroutines are... okay. Far from amazing. Still a big improvement over multithreaded code as you pointed out though, I fully agree with that.
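
To make the "parallel iterator" idea concrete, here is a rough Python analogue (not rayon itself). The `parallel_map` and `square` names are made up for illustration. Note that in standard CPython the GIL keeps threads from speeding up CPU-bound work, so this only shows the shape of the abstraction: the caller sees an ordinary map and never touches a thread.

```python
from concurrent.futures import ThreadPoolExecutor


def parallel_map(func, items, workers=4):
    """Apply func to every item on a pool of worker threads.

    Results come back in input order, so the threads stay invisible
    to the caller. (Hypothetical helper, for illustration only.)
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(func, items))


def square(n):
    return n * n


squares = parallel_map(square, range(10))
```

There is no shared mutable state in sight: each call to `square` works on its own argument and hands back its own result.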

I might be projecting here and please accept my sincere apologies if so, but it seems to me you are a bit elitist in your comment. Multithreaded programming is still one of the most problematic activities even for senior programmers, to this day. Multithreading bugs get written and fixed every day.

At this point I believe we should just move to hardware-enabled preemptive actors where message passing is optimised on the hardware level, and just end all parallelism disputes forever since they are an eternal distraction. (The overhead of message passing today is of course unacceptable in no small number of projects. Hence the hardware suggestion.)

> Doesn't change the fact that it's an extremely bad fit for the human brain to think about parallelism.

I'm curious why you feel this way? Certainly anything with multiple (more than one) thread is a lot harder than single threaded. Given.

But I think the human experience is rich with real world counterparts that can make multithreaded programming "natural" (not necessarily easy, there's a difference). Basically any process in real life you do that involves collaborating with multiple people is a collaborative multi-threaded process. When forced to grapple with multi-threaded solutions, I often ask myself, "if I had a room full of people working on this problem, what would have to be in place to make the operation flow smoothly?" For me, this anthropomorphisation of the process makes it a "good fit" for how my brain is used to solving lots of real world problems.

On the flip side, I haven't had a lot of luck finding real world experiences that I can model coroutines/async/etc on. So they may be ultimately easier, but if our rubric for "fit for the human brain to think about" is what the wealth of human experience has evolved our brain to handle well, I'm less convinced.

But I like learning new things. Maybe I just haven't seen the lightbulb yet. Help me see your point of view?

> On the flip side, I haven't had a lot of luck finding real world experiences that I can model coroutines/async/etc on.

I feel our perceptions of the world might be too incompatible, but I'll try.

Every communication (human or otherwise) is exactly actors with message inboxes. We the humans can't react to 3 incoming conversations at the same time. Our brain queues up what it hears or reads and then responds serially, one by one (not necessarily in order of arrival). Basically an actor with a message inbox, like in Erlang/Elixir.
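
A minimal sketch of that idea in Python, assuming a made-up `CounterActor` class and message names ("increment", "get", "stop"); it is not Erlang's process model, just the same shape built from a thread and a queue. The actor's state is touched only by its own thread, and messages are handled strictly one at a time.

```python
import queue
import threading


class CounterActor:
    """Hypothetical actor: private state, one inbox, serial message handling."""

    def __init__(self):
        self._inbox = queue.Queue()
        self._count = 0  # only the actor's own thread ever touches this
        self._thread = threading.Thread(target=self._run)
        self._thread.start()

    def send(self, message, reply_to=None):
        # Senders never see the state; they only drop messages in the inbox.
        self._inbox.put((message, reply_to))

    def _run(self):
        while True:
            message, reply_to = self._inbox.get()
            if message == "stop":
                break
            if message == "increment":
                self._count += 1
            elif message == "get":
                reply_to.put(self._count)

    def join(self):
        self._thread.join()


def send_many(actor, n):
    for _ in range(n):
        actor.send("increment")


actor = CounterActor()
senders = [threading.Thread(target=send_many, args=(actor, 100)) for _ in range(3)]
for t in senders:
    t.start()
for t in senders:
    t.join()

reply = queue.Queue()
actor.send("get", reply)
total = reply.get()  # every increment was queued before "get", so this is 300
actor.send("stop")
actor.join()
```

Three threads "talk" to the actor at once, yet no lock appears in user code: the inbox does the queuing and the actor answers serially.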

The way you assert multithreading mimics the real world is not convincing to me. One possible example in favour of your point of view I could think of is probably a table full of food and 20 people reaching and grabbing from it at the same time. But even then, two people cannot successfully get the same piece of meat. So it's not true multithreading in the sense of N readers/writers competing for the same resource; it's more like a shared memory area where parallel writes are okay as long as all writers reference different parts of it.

Sure, a lot of stuff collides in real time without synchronisation out there, that much is true. But you likely noticed we the people don't cope with the Universe's chaos very well and rarely manage to grasp it properly. So is it really a good way to model deterministic and non-chaotic systems with programming? To me it's not.

I am curious why you even think multithreading mimics the real world at all. Elaborate, if you are willing? (We might think of different definitions of multithreading in this instance.)

> Basically any process in real life you do that involves collaborating with multiple people is a collaborative multi-threaded process.

That would be actors or processes communicating with each other, not a collaborative multi-threaded process.

Real world has pretty much no natural notion of threads, only of actors.

Have you ever seen those assembly lines where a bunch of people sit around the conveyor belt and each one grabs a product, makes an addition, and sets it back down? That's fairly close.

Each person on an assembly line operates independent of the rest, there’s no “coordination” as such. A person picks up an incoming object, transforms it and sets it down. For that person, for all practical purposes, the rest of the people working on conveyor belt don’t even exist. I suppose that (not having to coordinate) played a key role in the popularity of conveyor systems. It’s simple enough to debug, address bottlenecks etc.

The closest s/w equivalent I could think of is Unix pipes. When I do “cat foo.csv | sort | uniq -c”, cat/sort/uniq aren’t coordinating with each other; each takes its input from stdin, transforms it, and places the result on stdout.
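
As a rough illustration, the same pipeline can be mimicked with Python generators. The function names below just echo the commands and the input data is invented. Unlike real pipes, which run as separate processes at the same time, these stages run lazily in a single thread; what carries over is that each stage knows only its input and its output.

```python
from itertools import groupby


def cat(lines):
    # Emit each line without its trailing newline.
    for line in lines:
        yield line.rstrip("\n")


def sort(lines):
    # Like sort(1), this stage has to see all of its input before emitting.
    yield from sorted(lines)


def uniq_c(lines):
    # Like `uniq -c`: collapse runs of equal adjacent lines into (count, line).
    for value, run in groupby(lines):
        yield (sum(1 for _ in run), value)


data = ["b\n", "a\n", "b\n", "c\n", "a\n", "b\n"]  # stand-in for foo.csv
result = list(uniq_c(sort(cat(data))))
# result is [(2, 'a'), (3, 'b'), (1, 'c')]
```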

That's what optimum parallelism is. They are collaborating to process N items at the same time in the most efficient way. They are also coordinating to prevent someone from grabbing the same item or putting two items in the same place. The conveyor belt itself is a ring buffer.

The assembly line principle can also be applied to the concept of threads. Workers collaborate by handing items off to one or more workers (threads) down the line. Thread pools are just micromanagers constantly reassigning the poor workers, but at least they make a lot of friends collaborating.

> "I'm curious why you feel this way? Certainly anything with multiple (more than one) thread is a lot harder than single threaded. Given."

Not the OP, but in my experience most devs are bad enough dealing with state and flow in single-threaded code already. Including myself. But I've at least accepted that and try to mitigate it by avoiding things that make it worse (like threads). Meanwhile, I watch people around me doing code by spaghetti (throwing it on a wall and seeing if it sticks). The better devs don't apply that to their code, but still end up applying it to their "reasoning" about a problem domain, especially a business one.

E.g.: "Screw it, if this variable that I need is null I'll just return a False in this function I'm writing" ... without ever thinking "why" this variable might be null and how that explanation impacts the problem they're solving. Nevermind the fact that they're fully OK with propagating such a broken state further into the program with their False response. Now some poor other dev has to figure out why this function is returning a false result. Odds are they'll do the same thing and now we've doubled the amount of bad states.

Adding in even more complexity in terms of being able to "predict" code state by adding threads is a nightmare. I would most certainly protest and heavily advise against anyone using threads unless it's really really the thing that's needed to solve the problem. Otherwise, it's just adding a lot more complexity for very little benefit, if any at all.

> Basically any process in real life you do that involves collaborating with multiple people is a collaborative multi-threaded process.

I would argue that the code equivalent of people collaborating in the real world is multiple processes with IPC (or even further afield, multiple nodes in a distributed system), not threads. You don’t share memory with other human beings. Imagine how productive you could be if you did!

If we could read other people’s minds (say, a manager’s, who is handing out work) and just acted on that, we would be in the exact mess we are in with shared mutable state.

Imagine reading someone else’s mind, only to discover that they are reading yours... Mind-stack overflow?

For certain classes of problems, sure. But how about “embarrassingly parallel” jobs (e.g. weeding a garden) or even “idempotent OLTP” jobs (e.g. studying several subjects at once)?

(Technically these aren’t best expressed as SHM but rather as a tuplespace or greedy worker-pool abstraction, but they can be achieved starting off with “just” an SHM threading primitive, so they count, I think.)

> You don’t share memory with other human beings.

This really depends on where you draw the boundaries. In a computer everything is "memory" of one kind or another, but that doesn't necessarily correspond to human memory. The analogy works if you think of the "actor" as merely the thread's context and dedicated working space (stack), with everything else being part of the surrounding environment. In the real world we don't have anything akin to process separation, except by convention. Even the body itself can be directly manipulated by others. The convention we use for communication between individuals looks more like a message-passing system, but physical manipulation of matter is more like operating on data in shared memory. There is no physical law preventing someone else from coming along and messing with an object I happen to be holding any more than the rules of a computer prevent one thread from messing with a data object held by another thread.

Collaborating humans have nothing like the problems that require memory barriers. We rarely have anything that behaves like shared memory - shared physical objects might come close, but a physical object is inherently de facto protected by a mutex.

Async, actor-like processes are everywhere in the real world - "something comes into my inbox, I do my piece of work on it, then send it to their inbox". Or "I'll send this off, do something else for now, and then go back to working on that when I get their reply".

When you write a sequential program, you are essentially specifying one order of instructions that will produce the correct output. With a parallel program, you need to consider every possible order and add the correct synchronization to prevent unwanted orders from happening. That's the difficult part.
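
A small Python sketch of what "preventing unwanted orders" looks like in practice (the `Counter` class and `worker` function are invented for the example). `self.value += 1` is a read, an add and a write; if two threads interleave between the read and the write, one update is lost. The lock rules out those interleavings, and it is the programmer's job to notice that it is needed.

```python
import threading


class Counter:
    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # The read-modify-write below must happen as one indivisible step.
        with self._lock:
            self.value += 1


def worker(counter, n):
    for _ in range(n):
        counter.increment()


counter = Counter()
threads = [threading.Thread(target=worker, args=(counter, 10_000)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# With the lock, counter.value is always 40000.
# Without it, the total may come up short, and not on every run.
```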

As a programming model, threads don't really help you to achieve this. When there's a bug, your only option is pretty much to think very hard and try to figure out what happened. (Or to use gdb with scheduler locking, which is about as bad.)

There is more than one way to parallelize things as well. For example you could have an assembly line methodology where different threads are working on different stages at the same time. Using a common shared threadpool, as one stage slows down, it receives more workers to speed it up. Using producer/consumer queues to designate when data is being passed from one thread's control to the other eliminates a lot of multi-threading bugs if this architecture is appropriate.
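
A minimal sketch of such an assembly line in Python, under simplifying assumptions: one thread per stage rather than a shared pool that rebalances workers, and a made-up `DONE` sentinel to mark the end of the stream. The queues are the only hand-off points, so an item belongs to exactly one stage at any moment.

```python
import queue
import threading

DONE = object()  # sentinel marking the end of the stream (illustrative)


def stage(func, inbox, outbox):
    """Take items from inbox, transform them, hand them to the next stage."""
    while True:
        item = inbox.get()
        if item is DONE:
            outbox.put(DONE)  # pass the shutdown signal down the line
            return
        outbox.put(func(item))


def times_ten(n):
    return n * 10


raw, parsed, scaled = queue.Queue(), queue.Queue(), queue.Queue()
workers = [
    threading.Thread(target=stage, args=(int, raw, parsed)),
    threading.Thread(target=stage, args=(times_ten, parsed, scaled)),
]
for w in workers:
    w.start()

for text in ["1", "2", "3"]:
    raw.put(text)
raw.put(DONE)

results = []
while True:
    item = scaled.get()
    if item is DONE:
        break
    results.append(item)

for w in workers:
    w.join()
```

With a single thread per stage the output order matches the input order; adding more workers to a slow stage would give up that guarantee unless items are tagged and reordered.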

I think it's a way of thinking. I was an OS hack (Unix device drivers, etc.) early in my career. I could think about interrupt races and the like, but it was hard. I took a detour into designing chips for a decade, where EVERYTHING is parallel, and by the time I got back to kernel work (Linux was around by then) I found that stuff that had been hard was now obvious.

In short if it's not a good fit for your brain, change your brain!

H/W components don't share state, which is why H/W parallelism is easier. Where they do interact with each other, we have to either design fully async logic or worry about metastability. In contrast, kernels are basically all about sharing state. Even in user code, and even when a language such as Go provides channels, a lot of people seem to use shared state (and mutexes and locks).

I am curious as to how your h/w experience makes it easier hacking on the linux kernel....

There's lots of shared state, every flop is shared state, and mostly here I'm talking about synchronous parallelism in pipelines - yes we do have to deal with metastability too, and that's a whole other level of evil, so people avoid it where possible and spend lots of brainpower mitigating it (metastability is so evil you can't completely make it go away, just make it extremely unlikely).

Go is sort of like saying "we're not going to deal with parallelism directly, just throw everything into fifos to talk to other places" mostly you can't afford to waste gates like that in real hardware (there are places where it makes sense - I've built a lot of display controllers).

I think the big difference is the way you end up thinking about time and timing holes in code (dealing with interrupts). After building gates for a while, I think you tend to see the timing holes more easily.

Take a decade to learn threads? Then expect other devs to do the same? I think that proves the point that threads aren't a good fit, even if we are technically capable of it.

No, it didn't take a decade, that was just how long I did it for (before it got boring) - I just mean that you have to learn to think about parallelism in a different way, we train hardware engineers to do this but not software ones

So it eventually got boring for you. So... threads are a bad mental model for our brains? :P

Not sure what you mean by your statement that hardware engineers are trained to think differently about it. I am not sure there's much overlap between HW and SW engineers in terms of parallelism modelling. Do you have a few examples?

No, chip making got boring - essentially it's a month a year of doing fun creative stuff and 11 months a year making sure that it makes timing and is perfect before you tape out. When you're coding on the other hand you can get something new working every day. I chose to go back to software.

However having said that I find myself designing silicon again, change is good for the brain I guess

Hardware people tend to think in terms of netlists and pipelining, so they have to worry about parallelism on every clock.

The greatest programmers in the world cannot write bug free C level threading code. It is a task beyond human capabilities.

I'm not one of the greatest programmers in the world, but I don't find C threads to be particularly hard to deal with in a robust way.

Threads in C are hard if you have a sloppy codebase with global variables everywhere and a general lack of structure and abstraction. Unfortunately, this describes many many C codebases.

I don’t follow - are you asserting that, by contrast, the greatest programmers in the world can write bug-free single-threaded code?

I would argue that a moderately talented programmer can write code that does exactly what they think it is supposed to do (whether you call that "bug-free" is a question of semantics) if they have a language and type system (or equivalent) that allows them to express their requirements, and they don't step outside what they can express that way (i.e. they don't write code when they can't express what its semantics should be). Languages/compilers that can do that for multithreaded code are extremely niche.

For all intents and purposes, Knuth has. He's certainly more careful than most but I don't think he's unique among all of humanity here.

I think the implication is to stop using C.

I'd interpret it as: stop using the pthreads model of parallel coding, in general. Because the pthreads model exists and is used in many programming languages.

But giving up C is a good start (for whoever has the choice)!

Maybe the functional programming/managed code fans should step up to the plate and rewrite the operating systems, desktop environments, and hundreds of thousands of command line tools etc that have all been implemented in C or C++.

Not as toy proofs of concept. Fully fledged replacements for all that stuff that can be used in our daily work. Build distros of the stuff so that we don't have to pick and choose this stuff and replace the bits of our systems in a piecemeal fashion.

They could show us how it's done instead of talking endlessly about it in online forums. So let's face it, it's never going to happen.

Hm, where did I say anything about FP?

As for managed code, it's being used with huge success in a lot of places (not in OS-es or drivers, yes) but that's quite the huge topic by itself.

I never implied you did, they are merely one of the two major groups of programmers I see consistently deriding C based infrastructure that presumably enables their paychecks to a large extent.

I'm not defending C, I'm merely sneering at its noisy detractors who spend more time complaining about it than supplanting it.

Well, people are working on it. For example, what's been happening in Firefox. C/C++ has a head start in the millions of person-hours.

One could argue as a counterpoint that the Rust community (and all the other language communities) have millions of person-hours (and lines of C/C++ open source code) to reference as they RIIR, a luxury the C/C++ community didn't really have; much of what they've built since Linux appeared was developed from scratch. Outside of 386BSD, there wasn't a lot of unencumbered UNIX source code available to them.

Perhaps as the Rust community grows in size and momentum this will happen more. Right now it seems like a bit of a manpower problem.

@nineteen999: that all happened a long time ago. Study your history: https://en.wikipedia.org/wiki/Lisp_machine

I mean to replace the ones that we use today. Not the ones we used 30 years ago. General purpose operating systems. Perhaps with a distribution we can download for free and install on commodity computing hardware, and actually use today for our primary areas of work/interest.

Note the section 1.7 in your own link ("End of the Lisp machines"). The Lisp software ecosystem was never as rich, broad and varied as the C/C++ software ecosystem is today, before it died.

BTW, I only stumbled across your comment by accident, since you replied to the parent poster instead of me. Parenthesis mismatch?

HN thread depth reply limit!

I respectfully disagree. I have an embedded multi-threaded 'C' program running in over 11k retail stores in the USA right now. It's been handling multiple client requests to a Sqlite DB since 2007 without any issues. This product has made my company a lot of revenue. The secret to using threads is all in the design. Don't share resources between threads (I only had one shared resource for 50+ threads guarded by a semaphore).
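
The actual program is C and its design isn't shown here, so the following is only a loose Python analogue of the stated rule: many threads, exactly one shared resource, one semaphore around it. The table name, `handle_request` and the in-memory database are all invented for the sketch.

```python
import sqlite3
import threading

# The single shared resource: one SQLite connection used by every thread.
# check_same_thread=False allows that; the semaphore makes it safe.
db = sqlite3.connect(":memory:", check_same_thread=False)
db.execute("CREATE TABLE requests (client INTEGER)")
db_guard = threading.Semaphore(1)  # at most one thread in the database at a time


def handle_request(client_id):
    # Everything outside this block is thread-local; only the database is shared.
    with db_guard:
        db.execute("INSERT INTO requests VALUES (?)", (client_id,))
        db.commit()


threads = [threading.Thread(target=handle_request, args=(i,)) for i in range(50)]
for t in threads:
    t.start()
for t in threads:
    t.join()

row_count = db.execute("SELECT COUNT(*) FROM requests").fetchone()[0]
```

With a single choke point there is only one place where interleavings have to be reasoned about, which is presumably what made the design hold up.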


That is a decent anecdote. Well, with respect, allow me to revise and qualify my read of @baggy_trough's comment:

It's perfectly fine to continue using an existing C codebase for a program, not exposed to the public internet, that's maintained by a focused group of maintainers. But on the other side of this spectrum, for large exposed projects like OpenSSL, Chromium, or even Linux, C/C++ has become risky.

Yet sometimes a simpler abstraction can't do the job well enough. Sometimes you have to do the hard thing.

IMO nobody is arguing about the sometimes part. Many programmers out there can't use Erlang/Elixir or Rust, or any other tech that makes writing parallel code much more deterministic and safe. I am aware of that and I have a deep respect for the programmers in the trenches who fight to produce bug-less parallel code in C.

My argument at least is that, given the choice, there exist much more productive ways to make your employer money than to try and learn nuclear phys... I mean multithreaded programming. :-)

Citation needed. I've yet to see a real business problem that couldn't be solved with a simpler, safer concurrency model and a bit of ingenuity.

I'm not sure I follow your terse comment. Are you implying that because skilled programmers have bugs in code with threads, threads are bad? Isn't that a bit of a non sequitur? Skilled programmers have bugs in code with no threads! Should we therefore infer that if only they added threads, they would be bug free?

I am not arguing in favor of (or against) threads. I am not contesting whether they lead to more bugs. What I was contesting was the assertion that "threads are not a natural fit." My point is that procedural programming (which nearly all programming is, at least at a small level, minus languages like Prolog) is something human beings have been doing for thousands of years. Look at a "recipe" for doing something, and you have a procedural program. Look at any recipe for lots of people working together to do something, and you have a pool of cooperating procedures. It's something we get schooled in our whole lives.

When I rewrote applications from Java to Erlang I was amused by how different it was. Not only codewise but in how you attack the problem.

FP makes the programmer rethink the way that the problem is solved. I think that adding threads on non FP code makes it hard from the get go.

Don't hide it in hardware; rethink the approach instead.

I mean, many people have the right idea but for now we have no choice but to emulate the right parallelism approaches on top of the wrong hardware architecture. Maybe this will change one day.

As for FP... I remember a guy saying "imperative/OOP shows you how it does stuff, while FP shows you what it is doing". It was a very eye-opening statement for me.

I'm not saying you are wrong btw, there are multiple efforts out there to create truly parallel processors. But it's not until recently that the techniques have existed to make them good enough. Good times ahead I would say. I personally know someone who is part of such an effort.

I clearly remember the feeling of just getting it as well. It was a very binary moment :)

> Multithreaded programming is still one of the most problematic activities even for senior programmers

I agree wholeheartedly. FWIW I've found higher level abstractions built into the language itself, such as atoms and refs in Clojure, go a long way toward mitigating the problem once and for all. This gives you an actual harpoon to hunt Amdahl's* Great White Whale of Parallelism. Instead of a toothpick (C++) or a pocket knife (Java) to do it.

The trouble is we need to stop hunting whales if we're going to move the needle in a permanent and significant way...

* https://en.wikipedia.org/wiki/Amdahl%27s_law
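
For reference, Amdahl's law says that if a fraction p of the work can be parallelised over N workers, the best possible speedup is 1 / ((1 - p) + p / N). A tiny Python helper (the function name is just for illustration) makes the whale's size obvious:

```python
def amdahl_speedup(parallel_fraction, workers):
    """Upper bound on speedup when only part of the work can run in parallel."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


half_parallel_two_workers = amdahl_speedup(0.5, 2)    # about 1.33
mostly_parallel_many = amdahl_speedup(0.95, 10**9)    # just under 20
```

Even with 95% of the program parallelised, no number of cores gets you past 20x, because the serial 5% dominates.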

> Multithreaded programming is still one of the most problematic activities even for senior programmers

My second favorite class in college was the elective for distributed computing (which I took instead of compilers, but I wish I'd taken both). It felt good to me and I came out into the job market ready to concurrent all the things.

By the time I'd worked with a disparate group of coworkers, I found out that 1) my enthusiasm was a rare thing, 2) that this was with good reason (many, many burnt fingers), and 3) that just because I love something doesn't mean that it should have a central role in a team project, even if I am the lead.

In a similar way that I 'got dumber' by learning to think like a user (one of the downsides of studying UX), I sort of learned to reach for concurrency only after I'd tried several other options.

... but here I am, years and years later, debugging other people's async/await code for them. Nobody likes to have to have someone else fix their mistake, but it goes a lot better when I start by telling them this shit is hard. Which it really is. Even when (or maybe especially when) I'm confident my stuff is right, I find a few boneheaded mistakes that I have expressly warned others to avoid in stuff I wrote uncomfortably recently.

I find Go's approach to be in the Goldilocks zone. It's not intrusive enough to slow me down (or to guarantee correctness, for that matter...), but it's enough that I no longer need to save the multi-threading of something I know needs to be multi-threaded for a second pass, and I very rarely have multi-threading issues or trouble debugging.

That being said, developers often have a bit of learning to get there, but that can greatly be accelerated by good team/mentor code review and discussion.

Goroutines are definitely an improvement, no argument from me.

It's just that there exist ways to shoot yourself in the foot quite easily.

Example: I'd find it much more intuitive if writing to a closed channel returned an error value and didn't issue a panic.

But I'm guessing that the Go programmers get used to the assumptions that must be made when working with the language so such things are likely quite fine with them and cause them no grief.

Such things bite everyone. But writing to a closed channel is regarded as a programming error: you potentially lose values. So the logic should be watertight, which is hard with concurrency. Taking the simplest approach to locking helps.

Better with panic than silent errors and flawed logic.

Yeah. I'm not opposed to such idioms. As I mentioned before, you get used to them.

I agree that multithreading bugs (e.g. race conditions, invalid access, etc.) are common. Thing is, a lot of problems emerge from poor programming practices.

Threads are definitely a valuable tool for performance-oriented work, and things like worker pool patterns aren't always a great fit and you end up with more overhead.

There's a lot more lower-hanging fruit to be picked before you have to ditch actors with message inboxes for classic multithreading in order to gain some more oomph. And it has been my experience so far that the overhead of the other parallel coding techniques is peanuts compared to the I/O bottlenecks of 99% of the apps out there.

I know there exist a group of programmers that really don't have a choice and have to use `pthreads`. But if you do have the choice then IMO sticking to multithreading programming is very backwards and invites a lot of suffering in your future (or that of your future colleagues).

Just use actors with message inboxes. Measure, profile, tweak. If you exhausted all other possibilities and you absolutely positively can't upgrade your server ever then sure, ditch actors and write multithreading code.

Meh, I mean, shared mutable state, to me, is the ultimate culprit of threading complexity. If you don't have that, very often many of your problems go away.

Futures and Promises go a long way in making that better. That's not to say that you can always rely on them, but it is far easier to get them right than it is to get threading with shared mutable state right.

Actors, Channels, Futures, promises, reactive programming, etc. They all have one thing in common. They kill off shared state in favor of message passing.
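
A short Python sketch of that trade, with made-up documents and a hypothetical `count_words` task: each task works only on its own input and hands its result back through a future, and the single merge step happens in the calling thread. Nothing is mutated from two threads at once, so no lock is needed.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def count_words(text):
    # Touches nothing but its own argument and its own return value.
    return Counter(text.split())


documents = ["a b a", "b c", "a c c"]  # illustrative input

with ThreadPoolExecutor(max_workers=3) as pool:
    futures = [pool.submit(count_words, doc) for doc in documents]
    total = Counter()
    for future in futures:
        # result() is the "message": the worker's answer arrives here,
        # and only this thread ever updates `total`.
        total += future.result()
```

The shared-state version would have every worker updating one dictionary under a lock; here the only synchronisation is waiting for each result.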

> Meh, I mean, shared mutable state, to me, is the ultimate culprit of threading complexity. If you don't have that, very often many of your problems go away.

Yep, exactly.

Everybody can screw up syncing and message passing as well, given enough lack of experience, or schedule pressure, or simply not getting it. But it's much harder to screw that up, while conversely, it's extremely easy to screw up multithreaded code with shared state.

> Thread support is [more or less] standardised in all major OS-es now

More-or-less, but no barriers on OSX (at least I had to hack around that for a user 6 months ago)

"I can't count the dollars I've made fixing other people's poorly written multithreaded code in my entire career, including in the last 3 years."

"Multithreaded programming is still one of the most problematic activities even for senior programmers, to this day. Multithreading bugs get written and fixed every day."

Strongly disagree. I am going to claim that threads / locking are not hard to write if one has at least a bit of common sense and discipline. Problem is that we have hobbyists without any trace of knowledge on how computers work producing commercial software.

Goes like this: I do not want to deal with memory management - it is too hard; I do not want to think about types - my poor brain can not comprehend those; pointers - OMG what are those beasts; threads and locking - gonna commit Seppuku. I will advise those people to come to a logical conclusion: programming is frigging hard when one wants to produce a decent product rather than a buggy POS. No matter how many concepts the language hides / converts to something else, there will always be something else they will not be able to wrap their mind around. So how about trying gardening instead?

A lot of things are possible for us humans. But some things your brain grasps more easily and works better with, while others are a struggle even if it gets them right most of the time.

You don't have to make this about implying that others don't have "at least a bit of common sense and discipline".

I'll go the other way around. Some people are capable of comprehending an awful lot of things/concepts and using those productively. And some are not. It is definitely not the fault of the concept's inventors. Nothing to be ashamed of as we are all different. I am not shedding tears because I can not fathom quantum physics while others can.

Your comment about threading in win32 being lackluster is surprising. It was my understanding that while the OS doesn't support fork(), that it /does/ support pthreads, and that as soon as you go outside of the process itself, Windows has "better for developers" system calls. I'm speaking specifically about I/O Completion ports.

Working with epoll or select/poll is really nasty when dealing with "some data" that's coming on a socket, whereas IOCP allows you to tell the kernel "as soon as you see some data destined for my thread, just populate it in the memory here, tell me when something is there and then tell me how much is there"; for high performance C++ programmers this is basically invaluable and as far as I understand there's no direct competitor in UNIX and unix-like land. (Although, according to the C++ devs where I work, "kqueues are less braindead than epoll".)

I never said it was lackluster, I said that it was equivalent. The differences lie around how asynchronous thread signals are delivered and handled. Win32 is less flexible in that regard, and when you're writing any kind of cross platform threaded software which includes Win32, you must make accommodations for this difference. Pthreads can't be fully emulated on Win32, so you have to have your pthread path, and your win32 path.

And yes, you correctly point out that on *nix and Win32, interaction with the kernel is often different. Win32 does a lot right, I won't ever speak negatively of it, it is the desktop OS of choice by a huge margin. It's just that *nix and Win32 are so different from each other, that you can't generally write a single implementation of your low level thread interaction for both classes of platforms.

If you were using signals on Win32, that's the problem.

I'm not a win32 programmer, but I think that for a long time windows didn't have OS-provided condition variables (events are not always an adequate replacement and had nasty footguns around signaling); turns out that it is hard to implement a cond var with posix semantics, and if you managed to get a third party implementation it was often buggy. I think they were eventually added in Windows Vista.

The thing that is missing (or was missing - not 100% sure of the current state of things right now) in *nix land is the generalized concept of "waitForSomething" where "something" could include any event/data that should/could wake a thread. kqueues help a bit, but I think it is still harder than it should be on *nix (even Linux) to write code whose semantics are "put this thread to sleep until anything happens".

If the thread is waiting on a specific set of objects (e.g. file descriptors, kqueues), then this can be done easily, but it's still not quite as simple as it has been under Windows (and by the way, IOCP is built on top of the underlying mechanism that makes this possible on Windows; they are not the mechanism itself)

Modern kernels support signalfd, timerfd and eventfd. With those things, it is possible to wait on almost anything.
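
A rough sketch of the "everything is a file descriptor, so wait on all of them at once" idea, in Python for brevity. The socket pairs below are only portable stand-ins for signalfd/timerfd/eventfd (which are Linux-specific), and `wait_for_any` and the labels are made-up names for illustration. Python's `selectors` module picks epoll, kqueue or select underneath.

```python
import selectors
import socket


def wait_for_any(sources, timeout=1.0):
    """Sleep until at least one source is readable; return the labels that are ready."""
    sel = selectors.DefaultSelector()
    try:
        for label, sock in sources.items():
            sel.register(sock, selectors.EVENT_READ, data=label)
        return sorted(key.data for key, _ in sel.select(timeout))
    finally:
        sel.close()


def demo():
    # Two unrelated event sources, each modelled as one end of a socket pair.
    timer_rx, timer_tx = socket.socketpair()
    signal_rx, signal_tx = socket.socketpair()
    try:
        signal_tx.send(b"x")  # only the "signal" source fires
        return wait_for_any({"timer": timer_rx, "signal": signal_rx})
    finally:
        for s in (timer_rx, timer_tx, signal_rx, signal_tx):
            s.close()
```

The thread sleeps in one place and wakes for whichever source fires first, which is the semantics being asked for, as long as every source can be expressed as a descriptor.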

With io_uring this can be done on Linux as well.

Not yet, read the LWN article and look at the todo list. There are a number of things still missing, and of course there are things like semaphores and generic eventing that still aren't part of io_uring.

io_uring is "different" and definitely has some advantages due to its shared ring buffer. but NT has a core/generic set of async completion/wait/etc functions that can be applied to basically the entire API surface.

I do not think you can wait on a critical section or a condition variable with WaitForMultipleObjects in windows. The same is true in linux. You can use select/poll/epoll on signalfd though.

Does IORING_OP_POLL_ADD work with signalfd's (I don't really know without messing with it, how about IORING_OP_EPOLL_CTL?)?

I wouldn't assume it does until I've verified and read the code. I know last year at plumbers there was a fuss about the fact that a lot of operations which should have been working weren't, frequently in odd ways; for example, direct read worked but not direct write IIRC.

My understanding is that as of Solaris 10 && AIX ...6.1? there are IOCP implementations available for use, although they might not be enabled in the OS at install.

You might find this deck interesting: https://speakerdeck.com/trent/pyparallel-how-we-removed-the-...

There is no equivalent in UNIX land to the thread-agnostic asynchronous I/O primitives available on NT. When paired with a robust threadpool API (Vista+), it is an unbeatable platform.

> Sun was working on its early multi core workstations

This article is from 1995, and they were out a bit earlier than that. Sun had been shipping 4-core workstations with thread support since 1993.

The SPARCstation 10 came out in 1992 and had two MBus slots. Each slot could take a CPU card with 1 or 2 CPUs on it.

Although SunOS 4.x had very limited multi-CPU support, SunOS 5.x had support for multiple CPUs. SunOS 5.1 (Solaris 2.1) came out in 1992 and supported SMP, and SunOS 5.2 (Solaris 2.2) came out in 1993 and introduced a thread API.

By 1995, they had also introduced the SPARCstation 20 and the 64-bit Ultra 2.

They also had some server systems (like the SPARCcenter 2000) that had a backplane with several boards that could themselves have CPU cards. I don't know the maximum number of CPUs, probably something like 20.

Info from my memory and:




> Java threads work everywhere

That depends how you define "work". Java has a decent concurrency library such that if you have a limited-scope project implemented by 1 or 2 highly-skilled programmers, you stand a good chance of avoiding thread safety issues. But as the project grows, the probability of thread safety issues approaches 1.

> Event driven programming has also become threaded, this is how the whole reactive family of frameworks, node.js, etc, handle parallelism.

Event-driven programming is way less painful than Java threads. With threads, in every place you mutate anything you have to ask yourself, "what happens if my thread is preempted here?" With event-driven programming you only do that sort of thinking in places where you see the `await` keyword.
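
To make the "only at `await`" point concrete, here is a small sketch using Python's asyncio (the function names are invented for the example). The same read-modify-write is safe when there is no `await` in the middle, and loses updates as soon as there is one.

```python
import asyncio


async def bump_without_await(state):
    # No await between the read and the write, so no other task can run in between.
    value = state["count"]
    state["count"] = value + 1


async def bump_with_await(state):
    value = state["count"]
    await asyncio.sleep(0)      # yields to the event loop; other tasks may run here
    state["count"] = value + 1  # writes back a possibly stale value


async def run_many(bump, n=100):
    state = {"count": 0}
    await asyncio.gather(*(bump(state) for _ in range(n)))
    return state["count"]


def demo():
    safe = asyncio.run(run_many(bump_without_await))
    racy = asyncio.run(run_many(bump_with_await))
    return safe, racy
```

This only holds for a single-threaded event loop; it says nothing about runtimes where awaited work runs on other threads.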

> Event-driven programming is way less painful than Java threads. With threads, in every place you mutate anything you have to ask yourself, "what happens if my thread is preempted here?"

Event-driven programming is a different pain in the ass but still a pain in the ass.

The current status of the Web speaks for itself: it's event-driven based JavaScript, and it's full of state/transitions bugs everywhere.

In event-driven code you trade concurrent r/w access issues for lost execution context and out-of-order issues. The result is often a mess of spaghetti callbacks that is close to unreadable.

The least bad of all worlds might simply be a proper coroutine system, which seems to be coming to almost every programming language after 30 years of existence.

You can write horrible event driven code. For example, do your events need to talk to a database? Right there, you've got a synchronization point.

Thinking about asynchronous behavior is always tricky, and code designed to run on threaded systems isn't generally arbitrary code with locks thrown into the mix. You strive to write the largest possible reentrant sections, and only lock where you have to, and as infrequently as possible. With reentrant code, it doesn't matter where you get preempted.
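
One way to picture "lock only where you have to", as a minimal Python sketch (`parallel_sum` is a made-up name): each worker does its real work on data it alone owns, and the lock is held only for the one line that touches shared state.

```python
import threading


def parallel_sum(chunks):
    """Sum several lists in worker threads, locking only to publish each partial result."""
    total = 0
    lock = threading.Lock()

    def worker(chunk):
        nonlocal total
        partial = sum(chunk)  # touches only local data: no lock needed
        with lock:            # the single point where shared state is mutated
            total += partial

    threads = [threading.Thread(target=worker, args=(c,)) for c in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total
```

A worker can be preempted anywhere in the `sum` call without consequence, because nothing it touches there is visible to the others.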

> For example, do your events need to talk to a database? Right there, you've got a synchronization point.

Not if you do it right. Interacting with sync-only systems from async systems is a problem, but it's not fair to blame that on async - if you make the whole system work async, you don't have that problem.

> That depends how you define "work".

I don't disagree with what you've said, I just wanted to chime in that in the context of the conversation to this point the definition is something in the ballpark of "are available cross-platform without jumping through hoops".

I was replying in the context of "This paper came at a time when threads were really painful to work with," which seemed to imply that they're no longer really painful to work with.

That may be true in some languages, but it is not generally true. The awaited task can run in parallel, and the two only re-sync when you use the result. Thus anything can also be mutated in parallel with the await/async style of programming.

>Event-driven programming is way less painful than Java threads. With threads, in every place you mutate anything you have to ask yourself, "what happens if my thread is preempted here?"

in 23 years programming Java (and in all the places where i've worked having been recognized as the local expert on threads and synchronization, incl. in the current very large C++ platform project) i have never asked myself that question in the context of the mutation of data.

Well you either did ask that question and answered it by using appropriate locking and concurrent data structures (which is basically implicitly asking and addressing that question..) or you wrote a lot of crash prone shitty software. Which one was it?

Locking isn't about preemption. You can have a system without preemption and still have to use the locks and other concurrent primitives.

Not sure why the downvotes. If a system has enough CPUs, threads might never be preempted, and yet we must still use locks or concurrent primitives if we are sharing resources across multiple threads.

I don't think I've actually thought about preemption explicitly since the Nintendo64 days when thread priorities figured into our locking strategy. But then we'd also re-implemented the scheduler and, at times, turned off the official OS to go do something we needed to do without interruption. So, like I said, preemption not high on my radar these days, as it falls under the general category of "concurrency". Even an iPhone has 4 CPUs.

The question shouldn't have been about preëmption, but rather, "what if some other thread mutates my data here?"

The effect is much the same, whether the race is caused by task scheduling, or just because another core got there first.

Or even if some random function you called fired some callback that just mutated some state you were in the middle of changing.
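
The callback case needs no threads at all to go wrong. A small single-threaded sketch in Python (`Emitter` is a hypothetical class, not from any library): a listener that unregisters itself during dispatch silently causes the next listener to be skipped, unless dispatch iterates over a snapshot.

```python
class Emitter:
    """Tiny single-threaded event emitter, for illustration only."""

    def __init__(self):
        self.listeners = []

    def on(self, fn):
        self.listeners.append(fn)

    def off(self, fn):
        self.listeners.remove(fn)

    def emit_unsafe(self, event):
        called = 0
        for fn in self.listeners:        # iterates the live list
            fn(event)
            called += 1
        return called

    def emit_safe(self, event):
        called = 0
        for fn in list(self.listeners):  # iterates a snapshot taken before any callback runs
            fn(event)
            called += 1
        return called


def make_emitter():
    em = Emitter()

    def once(event):
        em.off(once)                     # the callback mutates the emitter mid-dispatch

    em.on(once)
    em.on(lambda event: None)
    return em
```

With two listeners registered, `emit_unsafe` calls only one of them and `emit_safe` calls both.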

Locking and preemption are different but sometimes related concepts. As you mention.

Modern kernel people still have to worry about locking as well as preemption. If you look at the NT native APIs you will see dispatch levels, which control whether kernel threads can be preempted for higher priority activities; linux is similar with the _irqsave() and preempt_ calls.

This is, for example, important when syncing with an interrupt handler (no preempt/rt config), which is why there are both spin_lock() and spin_lock_irqsave(), the latter of which implicitly blocks irq preemption. AKA if you grab a lock that is needed by, say, an interrupt handler, and then you take said interrupt, the machine will deadlock because the scheduler won't deschedule the interrupt handler.

What purpose would locks and concurrent primitives have in a system where running code is not preempted?

In addition to the systems that are parallel in all senses/levels mentioned by the other commenter, another example would be truly single threaded hardware with cooperative or even sequential multithreading, where you'd use locking/etc. to safeguard your memory model invariants against, for example, compiler, JIT and hardware optimization shenanigans.

To guarantee two different threads don't try to simultaneously mutate (or in some cases access) shared resources.

> But as the project grows, the probability of thread safety issues approaches 1.

This is exactly why I run unit tests concurrently (other than the speed boost), and try to write the tests (and the code) so that it won't break when run concurrently (like inserting something into a DB and then assuming success only if the count has gone up by exactly 1).

Are you saying that concurrency issues can reliably be found via testing?

Well, they can't be deterministically found since process scheduling is not deterministic (notable exception: Haskell, I believe, has a way to do deterministic concurrent scheduling!), but I have definitely seen fails that only crop up when the suite is run concurrently, and it usually turns out that the code failing in those circumstances has made assumptions about application state that do not hold true in a concurrent context
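
For what it's worth, a stress harness of the kind described might look roughly like this in Python (`Counter` and `hammer` are names made up for the sketch). A barrier releases all threads at once to maximize overlap, and the test then checks an invariant. A failure here is real evidence of a bug; a pass is not proof of its absence.

```python
import threading


class Counter:
    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def increment(self):
        with self._lock:
            self._value += 1

    @property
    def value(self):
        with self._lock:
            return self._value


def hammer(fn, threads=8, calls_per_thread=1000):
    """Call fn from many threads at once; the barrier makes them start together."""
    barrier = threading.Barrier(threads)

    def worker():
        barrier.wait()
        for _ in range(calls_per_thread):
            fn()

    workers = [threading.Thread(target=worker) for _ in range(threads)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
```

Usage would be something like `c = Counter(); hammer(c.increment)` followed by checking that `c.value` equals threads times calls.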

OK, intermittently your test suite fails nondeterministically 0.01% of the time. Do you imagine that this will be sufficient to find which of the last thousand commits introduced the bug?

I would recommend using specialized tools for the task like Intel Inspector or Valgrind's Helgrind.

Or one could simply try very hard not to mutate any state that is available to another process and corral all such code that requires doing so (which most of the time ends up being only a small portion of the code; the rest can be designed functionally, just passing values in and back out with no other side-effects) into a thin I/O layer, which is (I believe) the Hexagonal Architecture and is explained quite well by Gary Bernhardt here: https://www.destroyallsoftware.com/talks/boundaries
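
A minimal sketch of that split, in Python, with invented names (`apply_discount`, `OrderStore`): the logic is a pure function that takes values and returns new values, and the only code that mutates anything another thread could see is a thin layer around it.

```python
import threading


def apply_discount(order, percent):
    """Pure core: returns a new dict and never touches shared state."""
    return {**order, "total": round(order["total"] * (100 - percent) / 100, 2)}


class OrderStore:
    """Thin shell: the only place that mutates state visible to other threads."""

    def __init__(self):
        self._orders = {}
        self._lock = threading.Lock()

    def save(self, order):
        with self._lock:
            self._orders[order["id"]] = order

    def load(self, order_id):
        with self._lock:
            return self._orders[order_id]
```

Everything interesting can be tested without threads at all, and the part that needs care around concurrency is a few lines long.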

This is not an either or, it is an and.

The one being a good idea doesn't preclude the other from being a good idea as well. Do them both.

(And also avoid concurrency wherever you can. And of the available forms of concurrency, threads are one of the worse ones.)


The big problem with native threads is its API. You have to make a callback aka function pointer. All of a sudden you have to keep track of parallel execution of these function pointers. This level of abstraction has no representation in code, only existing in the developer's mind. This doesn't scale.

There is an impedance mismatch between how code is written (sequential, top-to-bottom, left-to-right text) and how parallel code behaves. That's why people use idioms like actors or coroutines. I think to actually solve this you need a new format for writing code, like a graphical function call graph instead of a text editor.

Your win32 comment is odd. I was developing on NT during the 3.1 betas in ~1992 and we chose to go all in on multithreading. Our product launched in 1993 with the release of NT. I don't ever remember hitting an OS level thread/etc bug (fair share of footguns though).

More than a year later we were trying to port much of the codebase to solaris and it was a nightmare. Pretty much nothing worked right so we ended up bolting on the fork/mmap abstraction on top of much of it and building our own lock wrapper. While we sorta got it working, the solaris port died for internal political reasons and the NT version took off and we never looked back.

NT was designed out of the gate for heavily threaded/async workloads, that stuff wasn't bolted on like pretty much every unix clone in existence.

I worked for a major day trading software company back in 1999 or so and we built everything around NT. I had come from a Unix background, but to be honest NT was the perfect choice at the time for building the back end of a large-scale retail trading platform. And IO completion ports were amazing compared to what was available on other platforms.

> futures, promises. As a concept, they're fine, but they're difficult to debug

Might as well forget debugging in promise-land. I think this is partly because they are conceptually problematic as well, apparent once beyond the simplest use-cases, given the number of bloggers that still seek to explain it - and then are forced to edit their post because of errors. And the need to list common mistakes on MDN:


I don't think it is that bad. Once you learn those mistakes (maybe a linter could help too?) and, more importantly, bed down the mental model, I think promises are a fine way to do things up to a point.

I don't do intense concurrency in the browser, but I'll do non trivial stuff, like collating responses from a server then sending a request once I have all that information. I think promises are fine for this and it is possible for most programmers to write clean, mostly mistake-free promise based code.
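
The "collect several responses, then send one request" pattern, sketched with Python's asyncio rather than JS promises. `fetch` is a hypothetical stand-in for a network call, and `asyncio.gather` plays roughly the role `Promise.all` does in JS.

```python
import asyncio


async def fetch(name, delay):
    # Hypothetical stand-in for a network request.
    await asyncio.sleep(delay)
    return f"{name}-data"


async def collate_then_send():
    # Start both requests at once and wait for all of them.
    profile, orders = await asyncio.gather(fetch("profile", 0.02), fetch("orders", 0.01))
    # Only now, with every response in hand, build the follow-up request.
    return {"payload": [profile, orders]}
```

Results come back in the order the calls were passed in, not the order they finished, which removes one of the usual sources of confusion.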

As for difficult to debug? I've never had that issue, either in JS (browser/node) or in C# with the similar but different Task<>.

For example in JS, I can put breakpoints at any point of the promise chain and see what is going on. If they are network requests I'll also look at the network tab. If everything is happening real fast I might use console.log statements, but that is rare.

I think promises are OK for a lot of situations that most of us will encounter developing business software - this might be a reflection on the simplicity of the problems I end up solving. Obviously if you are creating a multi-threaded high-frequency market making trading thing then this might not apply.

It's all about tooling. There's no reason why a debugger can't make sense of "spaghetti" stacks that result from future-based async code, and some debuggers do just that (e.g. VS does it for C#).

Where did futures/promises come from? They just seemed to appear out of nowhere. I'm not particularly a fan of them either. C++'s current version of std::future is particularly useless.

Reading on Wikipedia, it sounds like this stuff came out around the 1980s, with support in a particular Lisp released in 1980.

The promise pipelining technique (using futures to overcome latency) was invented by Barbara Liskov and Liuba Shrira in 1988.

Yes B. Liskov is the L in "SOLID", if you are wondering!

Happy 40th birthday promises! :-)

I wouldn't be surprised if mathematicians were thinking of this before transistors were invented though.

I think it takes time for things to trickle down. Maybe it takes someone from an academic background (or a curious paper reader) to be forced to use C++ or Javascript for something, but along the lines of "progress relies on unreasonable men/women" they get annoyed and write their own library so they can use the nice Lisp/Haskell/etc. feature they are used to, but often in a more limited way but better than nothing. Then from there it might get hyped to the shithouse, die, or just get used by insiders.

AFAIK the E language http://erights.org/index.html influenced several other people and projects like Twisted Python and Midori, which influenced the now-popular deployments like Javascript. (I followed E in the 90s but not so much the other projects.) There's a sketch of E's history at http://erights.org/history/index.html but it's mostly stubs there. They apparently invented promises independently of Liskov, while working on Xanadu: http://erights.org/elib/distrib/pipeline.html

Right, and MarkM is on the ECMAScript committee. Futures I think are about 20 years older, maybe from MIT.

P.S. sorry so late answering your mail!

https://en.wikipedia.org/wiki/Futures_and_promises is quite detailed and looks good. I'm sure MarkM & company knew of futures, so I should've mentioned them.

(I owe you mail too.)

Aha, Baker and Hewitt, 1977. I didn't realize Friedman had defined "promises" in 1976; I wonder if they're the same as E promises? (Which are almost precisely the same as ECMAScript promises.)

I was terribly remiss not to mention Dojo, a popular JS toolkit which got its promises from Twisted, which of course got them from E, though Twisted modified them a bit. I don't know how it slipped my mind.

I haven't read the Friedman/Wise paper. Most likely those were futures? They were doing FP work around then, like https://help.luddy.indiana.edu/techreports/TRNNN.cgi?trnum=T... (I seem to remember reading another paper of theirs which included racing suspensions till one of them completed, which would be more like futures than the streams of this paper. But if so I'm forgetting where.)

I get the impression E promises had a nicer design than JS's for handling errors -- but that's also a vague memory and I never really learned JS's.

They're the lesser of two evils: before came callback hell.

Async/await has also mostly replaced bare Promises. The result almost gives you something as usable as threads.

Fwiw I find it amusing the async is promoted as the pinnacle of elegance in concurrent programming when in reality it exists because the JS interpreter wasn't thread safe.

There were thread-safe JS interpreters, in early versions of Opera for example. But without locks it was going to be impossible to write thread-safe JS. The arguments of Ousterhout and Miller may or may not have been an influence in 2000 but certainly they were in 2010. You might be interested in Miller's dissertation.

I came along a bit after you did, arriving with Pthreads books already in bookstores, but on Linux we were just getting LinuxThreads and not yet even NPTL, so things were still pretty flaky. I remember trying to write threaded programs that used SVGAlib, and breaking my console multiple times a day because LinuxThreads used SIGUSR[12] internally and so did SVGAlib. That year magic-sysrq became my best friend.

My impression at the time was that developers had a flawed impression of threading complexity and bugginess because they were bolting threads onto existing single-process programs, and their existing hygiene was the problem, nothing inherent to threading.

If you look at the single-process C programs of the era, global variables and global state in general were extremely common. When you start adding threads to such programs to try to take advantage of SMP, trying to wrap locks around heaps of unnecessarily shared, poorly encapsulated state, of course you produce a lot of buggy programs. Then everyone starts saying "multi-threading is too hard, not worth it", instead of admitting their programs are a complete mess.

Using modern standards of hygiene, even with threads-naive languages like C and good old Pthreads, I don't find it particularly challenging at all to write threaded programs.

Like you mentioned, I also struggle with the higher level abstractions attempting to make threading easier. Pthreads makes sense to me, I find threads, mutexes, condition variables, rwlocks, all very intuitive and ergonomic to use, but it's probably because I spent a lot of time using that API at a young age.

As I started taking swe jobs in silicon valley in the early-mid 2000s, it surprised me how few people had experience with Pthreads. It blew my mind, four different startups doing C programming with experienced C programmers and nobody was ever familiar enough with Pthreads to quickly review my code without having to rtfm. It's like there was such a stigma surrounding threads being "too hard" a lot of people never even attempted it. But my being a kid learning linux and just excited about new features in my unix, when LinuxThreads arrived and then NPTL, I spent years playing with Pthreads in C and getting my hands on SMP systems just to program them with Pthreads. Those years of playing paid off unexpectedly well in silicon valley, SMP was everywhere and C was still heavily in use on Linux.

I still get a bit of happy nostalgia when the opportunity arises to write some code like:

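  pthread_mutex_lock(&foo->lock);
  while (!foo->ready)     /* loop: re-check after every wakeup */
          pthread_cond_wait(&foo->cond, &foo->lock);

  /* consume from foo */
  pthread_mutex_unlock(&foo->lock);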
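
For anyone who hasn't met the idiom, roughly the same wait loop sketched in Python with `threading.Condition` (the `Slot` class is invented for the example). The predicate is re-checked in a loop because a wakeup alone doesn't guarantee the condition holds.

```python
import threading


class Slot:
    """One-item handoff guarded by a condition variable."""

    def __init__(self):
        self.cond = threading.Condition()  # owns its own lock
        self.ready = False
        self.item = None

    def put(self, item):
        with self.cond:
            self.item = item
            self.ready = True
            self.cond.notify()

    def take(self):
        with self.cond:
            while not self.ready:          # re-check the predicate after every wakeup
                self.cond.wait()           # releases the lock while sleeping
            self.ready = False
            return self.item


def demo():
    slot = Slot()
    result = []
    consumer = threading.Thread(target=lambda: result.append(slot.take()))
    consumer.start()
    slot.put("hello")
    consumer.join()
    return result
```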


Personally, my issues with threads and the various locks is that you have to manually manage resource ownership. It's kind of like programming without types, with manual memory management, or with null values. It's easy to program with 'null' if you do it every day, but it wears on you and there's always the chance that you make a dumb mistake. Moving that functionality into the type system is a relief because you can lean on the compiler to tell you when you're doing it wrong.

I like promises/futures because they tell me when resources will take a while to become available without blocking. I generally don't want to need to know that I need to lock the foo queue before consuming from the queue. I would rather have the queue handle that for me and encode the behavior I want in the type signature:

    next() -> T           sync, blocking
    poll() -> Optional<T> sync, non-blocking
    next() -> Future<T>   async
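
As a rough illustration, Python's standard queues already offer these three shapes, with the locking hidden inside the queue. The wrapper names below just mirror the signatures above and are not part of any library; `poll` returns `None` where the signature says `Optional<T>`.

```python
import asyncio
import queue


def next_blocking(q):
    """sync, blocking: sleeps until an item is available."""
    return q.get()


def poll(q):
    """sync, non-blocking: returns None instead of waiting."""
    try:
        return q.get_nowait()
    except queue.Empty:
        return None


async def next_async(aq):
    """async: suspends only this task until an item is available."""
    return await aq.get()
```

The first two take a `queue.Queue`, the third an `asyncio.Queue`; in none of them does the caller see a lock.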

> The odd one out is Win32, but it's close enough to pthreads for the same concepts to apply.

One thing that i find very convenient in Win32 that is lacking in other platforms (at least using pthreads) is that every thread has its own message queue and you can have threads communicating with each other simply via PostMessage/GetMessage (which also handles sleeping). You can implement something similar over pthreads, but it is nice that the OS provides that out of the box.
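
A sketch of what "something similar" can look like when you build it yourself, here in Python rather than over raw pthreads. `Mailbox` is a hypothetical class; `post` only loosely mirrors PostThreadMessage in that it enqueues and returns immediately, and the blocking `get` plays the part of GetMessage.

```python
import queue
import threading


class Mailbox(threading.Thread):
    """A thread that owns its own message queue."""

    def __init__(self, handler):
        super().__init__()
        self.inbox = queue.Queue()
        self.handler = handler

    def post(self, message, reply_to=None):
        self.inbox.put((message, reply_to))       # returns immediately

    def run(self):
        while True:
            message, reply_to = self.inbox.get()  # sleeps until a message arrives
            if message is None:                   # sentinel: shut down
                break
            result = self.handler(message)
            if reply_to is not None:
                reply_to.put(result)


def demo():
    worker = Mailbox(handler=str.upper)
    worker.start()
    replies = queue.Queue()
    worker.post("scan assets", reply_to=replies)
    answer = replies.get(timeout=2)
    worker.post(None)
    worker.join()
    return answer
```

The sender never shares data with the worker except through the queues, which is most of why this style is easy to reason about.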

The message queue is a user-space abstraction relevant to the GUI subsystem and not something that is inherent to the Win32 threading model. And in fact its implicit existence is the source of a class of hard-to-debug threading bugs completely unique to Win32.

Edit: on the other hand Win32 has one really nice feature: you can WaitForMultipleObjects() on essentially anything that has kernel handle, which includes most of IPC primitives. On the other hand this causes the native Win32 IPC to have significant overhead and is the reason why game developers often resort to userspace spin-locks and why Windows 10 had introduced NPTL/Linux-style lightweight futex-based mutexes...

It may not be inherent but IMO that is just hair-splitting, the important part is that it is there (also FWIW i had PostThreadMessage in mind, not PostMessage) and is very convenient to use. I've used it a bunch of times whenever i worked on Win32 tools that i wanted threading by sending commands to other threads to do stuff (e.g. enumerating the directory structure for an asset browser in a game editor at the background to avoid stopping the UI) and having them reply back with messages about the results.

And honestly i never had any bugs with that, if anything i've found it the easiest approach to understand when it comes to inter-thread communication.

Win32 message queues are not only relevant to the GUI subsystem. Note how PostThreadMessage doesn't even deal with HWNDs anywhere. And COM, for example, uses those same message queues for cross-apartment calls - even if it's one GUI-less service calling another one, there's a message pump in there somewhere.

I somewhat believe that the sole reason for existence of PostThreadMessage and friends is to make COM cross-apartment calls work for threads that do not own any windows. You can probably devise other uses for that, but such uses still somewhat boil down to implementing your own COM-like bidirectional IPC mechanism (GIMP's libwire comes to mind, which does essentially the same thing on top of “anything that behaves like SOCK_STREAM socket” and shared temporary directory)

Yeah, as someone using JavaScript for the first time (and not being an experienced developer either), I gave a brief look at promises, recognized that I hardly understood any of it, and quickly decided to use the bleeding edge async instead ! (And haven't regretted it.)

Curious what kind of work you were doing? I’ve always been fascinated by the golden days of SGI.

All sorts of real time graphics stuff; flight simulators, oil and gas visualization of voxel maps, general industrial visualization. These were the early days of hardware accelerated 3D and everyone thought that visualization would change the world to a greater extent than it really did.

It was great fun to write code which drove 8 displays using 8 graphics pipes, with roughly 8 cores working in concert with each pipe. All this work for something that runs faster on the latest iphone using a single thread...

I went on to work at SGI for a few years, and it was still my favorite job ever. It was pure R&D, graphics and realtime systems for their own sake. Today, this doesn't exist. 3D graphics are an applied technology that's part of an app, but not a product research area of its own.

> It was pure R&D, graphics and realtime systems for their own sake. Today, this doesn't exist. 3D graphics are an applied technology that's part of an app, but not a product research area of its own.

I'm quite certain there are teams here at Microsoft that do just this, and their counterparts in the GPU industry (AMD/Nvidia/Intel - although Nvidia seems to be the dominant one in research)

>and everyone thought that visualization would change the world to a greater extent than it really did.

c.f.: Data Scientists

On IRIX, you would do threads yourself by forking and setting up a shared memory pool

IRIX also had that weird 'sproc' API that I think was semi-lifted from something in Sequent's DYNIX.

By the way this is how threads work on Linux to this day. LinuxThreads were exactly this and NPTL is mostly about doing magic with RT signals and shadowing libc symbols such that stuff that changes global process state works reliably. Along the way the kernel got some awareness of userspace playing such tricks and got futex(), but still it is mostly implemented as a bunch of userspace magic that makes essentially unrelated processes look like threads of the same process. One nice consequence of this is that all the PTHREAD_WHATEVER_PSHARED stuff simply works (in contrast to BSD derivatives, where you have to choose whether you want to use an IPC primitive across threads or across processes, and when you want both you get -ENOSYS, -EINVAL or somewhat hilariously -ENOMEM with the manpage containing a rationale along the lines of “POSIX says that this has to be possible and that there can be global limit on number of such objects. It does not say that the limit should be larger than zero, so this always fails with -ENOMEM”).

I'm not exactly sure if events are easier to debug. I use tornado (event-based Python webserver library) extensively at work - when something goes wrong you don't get a nice stack trace, you get some random sampling of callback spaghetti. Also the default state of matters is that everything is serialized in a single thread and everything waits for their predecessor, even when they are totally unrelated, though that probably tells more about the particular framework than event-based programming in general.

I'd rather use a real multithread-based framework, honestly, though I concede that it also opens up different ways of making developers' lives miserable.

The programming paradigm provided by a threaded model (which is also there in async/await syntactic sugar) is a lot easier to follow and work with than the asynchronous code paradigm IMO. I think writing code asynchronously is solving a problem all over your code base which can be solved much better by async/await or lightweight threads like golang has with goroutines and like the JVM is getting with Project Loom - all of those solve the problem in the right place - allowing you to write code synchronously but execute it asynchronously.

Once we de-couple these two things - how we write code and how we run code - the discussions about this become clearer and easier to have.

The state of the art is improving. Trio for Python for example is much better in terms of stack traces.

> Also the default state of matters is that everything is serialized in a single thread and everything waits for their predecessor, even when they are totally unrelated...

I think this is a preferable default since bugs caused by pre-emptive multithreading are much harder to debug.

It's time to migrate to async/await. Check out asyncio.Protocols for TCP servers and aiohttp for HTTP servers. We get beautiful async stack traces delivered to Sentry. Debugging anything is a joy.

I’m always a +1 for aiohttp (and aiopg)’s ease of use

The author, in case anyone didn't recognize the name:


Specifically, he created the Tcl programming language, which had a nice event loop way back when.

It's still there! IIRC you could also have arbitrarily many nested event loops, for better or worse. And some threading support is also available [1], although it seems to be culturally disapproved of.

1. https://wiki.tcl-lang.org/page/thread

I solved some tough IPC problems using the TCL event loop. A product that runs on several thousand machines across the US was designed around a set of distinct C++ processes communicating over sockets. Using the TCL event loop greatly simplified the design, improved performance and eliminated concurrency bugs present in the bespoke IPC code it replaced.

It should have been written that way on day one but some programmers default to pounding out a bunch of spaghetti instead of learning about the tools at their disposal.

I just meant to say that it was already in place... uh...holy crap... 25 years ago.

Technically it was part of Tk. It got grafted into mainline Tcl 20 years ago.

A shame that no one has mentioned cache invalidation as a further reason threaded programming is hard. One of my biggest takeaways from Martin Thompson’s talk on mechanical sympathy is that the first thing he tries when brought in as a performance consultant is to turn off threading. He mentions locking as a performance problem but says that these days cache locality can be the key to speeding up slow applications.

Yeah, it was hard to realize in 1995, but nowadays pretty much everyone who tried experienced performance problems with threads, or rather with shared memory multithreading concurrency model. It doesn't actually scale if you idiomatically synchronize shared memory access with locks or atomics, you need some way to batch things and amortize the cost of synchronization between cores while also preserving locality, which ultimately implies an asynchronous model where threads are just a low level implementation detail.

I've been hearing rumors that AMD's current offerings have been starting to avoid even shared cache between processors. It boggles my mind that any CPU designer would think a shared L2 cache is a good idea. Makes me wonder where my model of memory starts to break down. I always just think of L2 as being slower, less expensive memory than L1. I'm wondering if there are any benefits that actually outweigh the cache eviction penalty of multiple processors accessing it...

It's surely an engineering tradeoff resulting from weighing the different pros/cons. If you had dedicated cache per core at 1/Nth the size, much of it would be wasted when you're running less than full tilt -- eg, if you're using 2/4 cores then half of the cache is artificially unavailable instead of doubling the cache available to those 2 cores.

L2 hasn't been shared for a very long time.

For me this is the only relevant reason why you don't want to do multithreading or alternatively why you want to structure your multithreaded application as mostly independent “processes” that communicate by means of message queues. I don't view “Concurrency is hard to get right” as a valid argument.

Yes but the cache problem also exists with asynchronous programming.

True. I wonder how much event systems by their very nature simply destroy cache locality. Still, it's likely much easier to reason about cache hits, and build event handlers such that they remain local for the duration, as opposed to threading, where it's all but impossible to predict what the cache will look like.

What talk is that?

> A thread from 2017

Didn't you read the article?!

Did you click on the links in the comment above yours? ;-)


The perniciousness of winky face is on full display here

You have two choices when it comes to utilizing multiple cores: threads and processes. Threads are hard, but sharing data between processes is hardly any easier.

Maybe they're a bad idea, but these days you have no choice but to learn how to use threads. AMD's run-of-the-mill processors have 16 cores, Intel's 8. Servers have lots more. Heck even your iPhone has 6.

The clock rate on CPUs isn't getting any better. It's just more cores from here on.

It being hard to share memory is a feature, not a bug.

Threads make any single thing on your program mutable without your direct control. Processes keep the mutability scoped into a few hard to extend areas.

> Threads make any single thing on your program mutable without your direct control.

No they don't. Threads don't mutate random variables by themselves, you need actual code that does the mutation (whether it's running on a separate thread or not).

I mean, how is that statement different from "calling other functions makes any single thing on your program mutable"?

> how is that statement different from "calling other functions makes any single thing on your program mutable"?

In those languages where functions mutate things, that's basically true. But it's much more common that functions can only mutate global variables, and people keep those in low numbers, exactly for that reason. Actually, replace "functions" with "methods" and you will get into one of the largest flaws of OOP.

But anyway, mutability is much less of a problem outside of concurrent code.

'const' all the things.

He says at the end that if you want true concurrency, use threads. But the point is that many people use threads where concurrency is not a requirement, or even desirable.

> But the point is that many people use threads where concurrency is not a requirement, or even desirable.

Why would you write a program using threads if you don't require concurrency? The only purpose of threads is to achieve concurrency. Can you give some examples?

Not the person you were asking the question to, but I can give you tons of examples. This is particularly true in the embedded space or among people who write C or C++, which don’t have any standard event library. Basically, threads are a lowest-common-denominator way to do blocking operations without stalling everything. I’ve seen a lot of code where there’s various threads running at different intervals (low priority background thread that sleeps for 1 second then wakes up to do stuff, another thread that does nothing but blink an LED, etc.). But in reality, in most non-CPU-bound workloads, using an event loop would make writing this kind of code much easier and you wouldn’t ever have to worry about mutexes, semaphores, deadlocks, etc. As a C++ programmer, using Qt (even for non-UI stuff) with its event loop is just so much easier than spinning up threads just to call a network endpoint.

That being said, threads definitely have their place and are the only real way to take advantage of all the power offered by modern multi-core CPUs (well that and multi-process but that’s not really any easier to get right). But for the basic stuff I think running 80 threads when your app is only using 5% cpu is insane (something I see a lot in the C++ code I’m exposed to).
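For illustration, the "several threads that each sleep and wake up" pattern can be sketched on a single-threaded event loop like this. This uses Python's `asyncio` rather than Qt, and the `every` helper and the "blink"/"housekeeping" jobs are hypothetical stand-ins for the LED and background threads mentioned above.

```python
import asyncio

async def every(interval, action, times):
    """Run `action` every `interval` seconds, `times` times."""
    for _ in range(times):
        await asyncio.sleep(interval)
        action()

async def main(log):
    # Two periodic jobs share one thread. They only switch at `await`,
    # so appending to `log` needs no mutex.
    await asyncio.gather(
        every(0.01, lambda: log.append("blink"), times=4),
        every(0.02, lambda: log.append("housekeeping"), times=2),
    )

def run_demo():
    log = []
    asyncio.run(main(log))
    return log
```

The trade-off is the usual one: this only works while each handler is short and non-blocking. A handler that burns CPU or calls a blocking library stalls every other job on the loop.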

Aren't all your examples still examples of concurrency?

> way to do blocking operations without stalling everything

Switching between multiple tasks doing IO - that's concurrency isn't it?

> various threads running at different intervals (low priority background thread that sleeps for 1 second then wakes up to do stuff

Different tasks ready to run and switching between them as needed - that's concurrency again isn't it?

Not really sure what everyone else in this thread is seeing that I'm not.

OK, sure those are actually examples of concurrency. I was thinking of concurrency more along the lines of multiple threads of execution doing useful work at the same time, which threads are actually good at doing but these examples are not that.

The confusion might be due to everyone using concurrency to mean concurrency and/or parallelism.

Think of an object oriented system. You can have a thread per object at the extreme where each object has its own thread/queue to handle messages. For most cases with synchronous calls you’re not really getting any concurrency.

Maybe it’s hard to imagine now, but in the 80s and 90s there were people that pushed this sort of architecture with a straight face. Even if not taken to this extreme, the idea of using threads for componentization rather than for concurrency (which was possibly a side benefit) was very much a thing (think COM/CORBA).

Hence why many articles like this and Ousterhout from the 90s, etc saying it was idiotic.

I worked on an embedded system with 1 thread per object and it was a very good ratio of code-to-expressivity (both when being used, and the implementation of the system). Each logical hardware port had its own thread, and was interacted with solely through message passing.

This is pretty much exactly the kind of robust system architecture that Erlang advocates employ.
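A rough sketch of the one-thread-per-object, message-passing design described above, in Python. `PortActor` and its methods are hypothetical names, and a real embedded system would use its RTOS queues rather than `queue.Queue`; the point is only that the object's state is touched by exactly one thread, and everyone else talks to it through a mailbox.

```python
import queue
import threading

class PortActor:
    """One thread per object; the mailbox is the only way in."""

    def __init__(self, name):
        self.name = name
        self._mailbox = queue.Queue()
        self._bytes_sent = 0   # touched only by the actor's own thread
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        while True:
            payload, reply = self._mailbox.get()
            if payload is None:             # shutdown sentinel
                reply.put(self._bytes_sent)
                return
            self._bytes_sent += len(payload)

    def send(self, payload):
        """Asynchronous message: enqueue and return immediately."""
        self._mailbox.put((payload, None))

    def stop(self):
        """Ask the actor to finish and report how many bytes it handled."""
        reply = queue.Queue(maxsize=1)
        self._mailbox.put((None, reply))
        total = reply.get()
        self._thread.join()
        return total
```

Usage would look like `p = PortActor("uart0"); p.send(b"abc"); p.stop()`. Because messages are processed in mailbox order, `stop()` only returns after every earlier `send` has been handled.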

In embedded systems, our kernels and threads are lightweight enough that we can go very fine-grained without paying a steep context-switching penalty. I'm not convinced that the penalty in Linux is all that high, either. It's only when you're going after the C10K (or C1M?) problem that you start to notice.

> In embedded systems, our kernels and threads are lightweight enough that we can go very fine-grained without paying a steep context-switching penalty.

Right, this system would run through the entire runlist at several kHz when idle and 0.5-1kHz under load on a PowerPC 405 that ran at about 200MIPS. Our shortest deadline was 10ms so it was plenty fast enough. Context switch was swapping out 12 machine words.

>why would you write a program to use threads if you don't need concurrency?

... exactly. There is no good reason. That's what I'm saying the point of the article is. However, this doesn't mean that people don't do it. He is saying that the tasks that people typically use threads to solve can actually be solved with an event loop and handlers, thus eliminating all of the messy issues with shared state, race conditions, etc. that true concurrency introduces.

I think that is referring to the case of "select() is too confusing, I'll use threads instead so I won't have to think about it."

I think I'd still call that concurrency - you want multiple continuations to preserve state between IO calls.

I think that programmers are quick to jump to concurrency when they want to make something performant as opposed to using other structures, like a cache, or researching other possible solutions, like the HN favorite bloom filter, or other algorithm.

That said, threads don't give you quite the boost you might be hoping for. Even with things that can parallelize well, it seems at around 8 or 12 threads you start to hit a wall.

How so? How many cores on the machine? Maybe you have some false-sharing, maybe thermal throttling is kicking in.

Some of what I heard is memory access bottlenecks and cache coherency bottlenecks.

It's good but not quite as good. 16 core is the high end of consumer CPUs, mid-range is probably more like 6 or 8 cores.

I think he's referring to threads as a programming model, not the physical implementation.

> Threads should be used only when true CPU concurrency is needed

> Scalable performance on multiple CPUs

the exceptions to 'when to use threads' in 1995 sound like SOP these days

Threads have some very practical advantages:

* Standard, easy way to get a backtrace

* Standard, easy way to get a list of active things going on

* In many cases, threads make it easy to follow control flow

Yet threads make backtraces and control flow much less useful, since they miss out important context from concurrent threads.

I wouldn't personally say that threads make control flow easier to follow: we might gain a little by disentangling separate activities into threads, but we lose a lot when these get interleaved in arbitrary, non-deterministic ways.

> Yet threads make backtraces and control flow much less useful, since they miss out important context from concurrent threads.

Threads' contexts should be independent from each other. If reading your stack trace relies on the state of other threads, you've got a very brittle design.

What is lost is the history of the state the current thread is working with, but the state should be fully encapsulated within the thread, except in the case of large shared read-only input buffers that are being processed in parallel. But those latter buffers aren't hidden from the current thread nor its debugging. Debug information can also be logged on state objects to show its provenance.

> but the state should be fully encapsulated within the thread, except in the case of large shared read-only input buffers

Sure. The trick is that the thread design does not ENFORCE that, threads as an abstraction involve shared memory.

> If reading your stack trace relies on the state of other threads, you've got a very brittle design.

Or a bug. And a bug or a brittle design is exactly when you need a debugger the most, right?

> Yet threads make backtraces and control flow much less useful, since they miss out important context from concurrent threads.

I mean, when a program panics, you get a stacktrace from (a consistent snapshot of) all the threads, so what's the problem?

If a highly concurrent service gets a query'o'death, it's hard to tell which threads were working on it and which were merely within the blast radius. Frameworks tend to roll their own notion of "request context" without it being strongly typed or pervasive across the language and libraries.

Now I’m curious what backtraces from Postgres look like when parallel scans are enabled. IIRC, you’ve got one isolated fork(2)ed master for the connection, which then has threads to divide work. Not too bad to debug.

A snapshot doesn't show the shared memory changes that may have caused the panic. There needs to be a synchronous log of all changes to shared memory to debug an issue.

Similar to having a snapshot of network traffic vs a recording of network traffic.

Also a way to make use of more than a tiny bit of silicon in that expensive CPU you just bought.

It's not that threads are necessarily a bad idea (though they can be for performance reasons), but that programming with most synchronization primitives is a bad idea. If you program with message passing, then it's not much different from the "events" model except that in the event-driven model you're trying to hide the underlying abstraction more (you still have concurrency, it's just baked into the I/O library).

I honestly think this presentation is confused. "Concurrency is fundamentally hard; avoid whenever possible" seems to go against their own argument. Event-driven models (which rely on message passing) are still doing concurrency, except instead of using locks and semaphores to synchronize things, you're using mailboxes and channels.

Even CPU interrupts are a form of concurrency that is similar to event-driven models. Just because you're not spawning a thread and acquiring a lock, doesn't mean you're not doing concurrency.

Event driven concurrency is not really easier than threads with primitive blocking synchronisation. Races are still possible, instead of deadlocks you can have livelocks, resource control and backpressure are non-trivial etc.

That's true, but at least you can handle backpressure at the runtime level and just choose a predetermined strategy for dealing with it (i.e. start dropping messages, exponential backoff, etc)
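As a sketch of what "choose a predetermined strategy" can look like, here is a bounded mailbox in Python that drops new messages once it is full. `DroppingMailbox` is a hypothetical name, and "drop the newest message" is just one of the strategies mentioned; blocking the producer or backing off would be alternatives.

```python
import queue

class DroppingMailbox:
    """Bounded mailbox that rejects new messages when full (drop-newest)."""

    def __init__(self, capacity):
        self._q = queue.Queue(maxsize=capacity)
        # Plain counter: assumes a single producer. With several producers
        # this would need its own lock or an atomic counter.
        self.dropped = 0

    def offer(self, msg):
        """Try to enqueue without blocking; return False if the message was dropped."""
        try:
            self._q.put_nowait(msg)
            return True
        except queue.Full:
            self.dropped += 1
            return False

    def drain(self):
        """Remove and return everything currently queued, in FIFO order."""
        items = []
        while True:
            try:
                items.append(self._q.get_nowait())
            except queue.Empty:
                return items
```

The useful property is that the policy lives in one place: producers call `offer` and never need to know how full the consumer is.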

IMHO I think one large part of it is synchronization. If you're having to synchronize things all the time, you're probably misusing threads and should be using a different execution model.

Another way to look at it is that only infrastructure should be locking stuff, and infrastructure should be a very tiny part of the codebase. That infrastructure should probably be responsible for layering a different concurrency paradigm on top of threads...

Most platforms these days provide such things as part of the language, or in the standard library, or as a freely-available package. Writing one's own concurrency infrastructure is usually unnecessary, but when it is needed, it needs to be kept as small and as easily-auditable as possible.

A bit like `unsafe` in Rust, in fact.

Locking all over the place generally indicates that someone's trying to shotgun-debug concurrency bugs. I've had to use libraries which did that, and wished horrible things upon those responsible.

> Threads should be used only when true CPU concurrency is needed.

Which is basically any program running on any modern processor.

Not at all. There are multiple cores, sure, but why does my program have to use them? If it performs adequately using only one core, and if the nature of the problem doesn't require threads, why should I make it multithreaded just because the processor has multiple cores?

Raw threads are really hard to program correctly, however sinking parallel code into an executor or queuing framework tends to really reduce complexity and in a lot of cases get all the cores working.
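A small sketch of the "sink it into an executor" approach, using Python's standard `concurrent.futures`. The `word_count` and `count_all` functions are made-up examples. One caveat: in CPython, threads will not speed up pure-Python CPU work because of the GIL, so for real CPU-bound work you would likely swap in `ProcessPoolExecutor`; the shape of the calling code stays the same.

```python
from concurrent.futures import ThreadPoolExecutor

def word_count(text):
    """Independent unit of work: no shared state, just input -> output."""
    return len(text.split())

def count_all(documents, max_workers=4):
    # The executor owns thread creation, scheduling, and shutdown.
    # The caller only sees inputs going in and results coming out, in order.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(word_count, documents))
```

There are no locks in user code here: the only shared structure is the executor's internal work queue, which is the kind of small, audited infrastructure being argued for elsewhere in this thread.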

In my life I must have written at least a few million lines of code. Probably more. How many of those lines have been explicitly multithreaded? A few thousand lines, at most.

I think this was truly the most amazing thing about learning Rust. After having experienced the pain (the issues brought up in the linked slide deck) of threads in C, then C++ and Java, it was wild to work with a language that provided some significant safety rails for working with data across threads.

Now async/await gives us even better options on top of that, but it’s truly what made me enjoy the language so much. This article is what resonated with me and got me to invest so much spare time over the last 5 years in working with Rust: https://blog.rust-lang.org/2015/04/10/Fearless-Concurrency.h...

Rust does a great job here, but so do languages that make multithreading safer at a higher level -- Erlang/OTP and Haskell are two great examples. Immutable data may not solve every concurrency problem, but it sure goes a long way.

I also have a lot of respect for Ada's task-based concurrency approach (independent actors, communicating by rendezvous). You don't get the flexibility to roll your own concurrency strategy in Ada, but the language's support for its chosen mechanism is truly excellent. Even if you'll never use Ada, this part of the language is worth studying just as an example of great engineering design.

There's a nice Kevlin Henney talk where he lays out this diagram:

       non-shared mutable state  |  shared mutable state
      non-shared immutable state | shared immutable state 
Each quadrant is safe, except the one in the top right: shared mutable state.

Functional languages are great at the bottom two, since they strongly encourage immutable state. Message-passing based concurrency strongly encourages non-shared state, the safe two on the left.

Rust is the only language I've seen which encourages all three safe quadrants, while making the fourth a compile-time error.

> Each quadrant is safe, except the one in the top right: shared mutable state.

No. Your database is one giant shared mutable state.

How is using a database safe then? Transactions.

Haskell has had software transactional memory for fifteen years now. Microsoft tried to copy it in .NET but it's nearly impossible to do right in a language without clear separation of pure and impure code.

"> Each quadrant is safe, except the one in the top right: shared mutable state.

No. Your database is one giant shared mutable state.

How is using a database safe then? Transactions."

Transactions are less an occasion where shared mutable state is made 'safe' during concurrency and more a situation where concurrent processes are forced to temporarily interact and operate in a sequential, non-concurrent, manner. They use locks under the covers. Databases manage to be parallel because different processes operate on different sets of data at the same time; they lock when two or more processes attempt to access the same row on a table. Transactional state access is still vulnerable to deadlocks and other difficulties of concurrent programming. This means that transactional state is not 'safe' in the same way that immutable and non-shared state are safe. It's just a much easier way of managing the kind of difficulties you have with shared mutable state than say, raw locks.

No. Please read about MVCC: https://en.m.wikipedia.org/wiki/Multiversion_concurrency_con...

The tl;dr is that

> Locks are known to create contention especially between long read transactions and update transactions. MVCC aims at solving the problem by keeping multiple copies of each data item. In this way, each user connected to the database sees a snapshot of the database at a particular instant in time. Any changes made by a writer will not be seen by other users of the database until the changes have been completed (or, in database terms: until the transaction has been committed.)

In other words, the system uses immutable state under the hood plus some atomics/locking to present the abstraction of safe shared mutable state.
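A toy Python sketch of that idea: immutable versions under the hood, plus a lock on the write path. This is only an illustration and far simpler than real MVCC, which tracks per-row versions and transaction IDs. `VersionedStore` is a hypothetical name, and the sketch assumes that rebinding an attribute is atomic, which holds in CPython.

```python
import threading
from types import MappingProxyType

class VersionedStore:
    """Readers see immutable snapshots; writers publish whole new versions."""

    def __init__(self):
        self._current = MappingProxyType({})   # read-only view of a private dict
        self._write_lock = threading.Lock()

    def snapshot(self):
        # Readers take no lock: they just grab a reference to the
        # current version, which is never modified afterwards.
        return self._current

    def commit(self, changes):
        with self._write_lock:                  # writers are serialized
            new = dict(self._current)           # copy the current version
            new.update(changes)
            # Publish the new version in one step by rebinding the reference.
            self._current = MappingProxyType(new)
```

A reader that took a snapshot before a commit keeps seeing the old data, which is the "each user sees a snapshot at a particular instant" behaviour from the quote.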

> No. Your database is one giant shared mutable state.

That doesn't refute anything from the parent post. In fact you prove the parent's point in the very next sentence:

> How is using a database safe then? Transactions.

Which is to say, you can have shared mutable state, but you need some mechanism to protect access to that state. It is not safe to just access it however you like, which is why database have transactions, and why we have primitives like mutexes for dealing with OS-level threads.

> Functional languages are great at the bottom two, since they strongly encourage immutable state.

Haskell actually has safe & ergonomic shared mutable state via STM

And it has proven-to-be-not-shared mutable state via ST.

Speaking of Shared mutable state, unlearning OOP is going to be a big task for the industry to tackle. Feels like it's on its way out on the backend at least already. It's definitely been my biggest stumbling block in grokking parallel compute.

For the record, Erlang also has an internal transactional in-memory database called Mnesia, which is essentially large-scale shared mutable state.

The trick is using the three safe quadrants without bottlenecking on memory bandwidth copying the data over and over again.

If immutability is a language level guarantee and not just a convention, you can use structure sharing to address this problem pretty effectively. Just about any functional language with immutable datastructures does it.
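Structure sharing in its simplest form is an immutable linked list, sketched here in Python (`Node`, `push`, and `to_list` are illustrative names). "Adding" an element allocates one new node and reuses the entire existing list as its tail, so nothing is copied. Python only enforces the immutability by convention plus `NamedTuple`, whereas the languages mentioned above guarantee it.

```python
from typing import NamedTuple, Optional

class Node(NamedTuple):
    head: object
    tail: Optional["Node"]

def push(lst, value):
    """Return a new list with `value` in front; `lst` itself is untouched."""
    return Node(value, lst)

def to_list(lst):
    """Flatten to a regular Python list, front to back."""
    out = []
    while lst is not None:
        out.append(lst.head)
        lst = lst.tail
    return out
```

For example, with `base = push(push(None, 1), 2)`, the lists `push(base, 3)` and `push(base, 4)` are two distinct versions that both point at the same `base` nodes. Since nobody can mutate `base`, handing it to another thread is safe without copying.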

I think finding ways of working with threads (and ownership) with more safety while minimizing performance impact is a healthy side effect of Rust.

I think though that rust is one approach of many and that they aren't orthogonal.

I personally think libraries for multi threading on top of rust or C++ are necessary to really nail the problem down. Even async and await are too granular and difficult to get right in my opinion.

I think regardless of the language, queues of data chunks (not just tasks) combined with solid concurrent data structures will go a very long way. Rust may help get those libraries and data structures to be more correct with less effort, but ultimately getting threading right at the lower level just isn't practical for most programmers most of the time, and I say this as someone confident in their parallel programming skills.

Yeah, there are some pretty solid crates coming up. Obviously Tokio as a runtime, Rayon for parallel iterators, and the whole Async Std crate. Rayon is a good example of a crate that helps add concurrency without much headache for embarrassingly parallel operations.

Even still, even in languages with built-in runtimes and GC like Go, it is very easy to create deadlocks/livelocks. Programmers' intuitions of concurrency are generally wrong, and the correct intuitions are complex. I don't know how we make concurrency easy. I do like Futures however, they provide a nice abstraction over threads.

I'm gonna be That Guy and point out that rayon addresses parallelism, not concurrency (which is crossbeam's domain). My favorite explanation of the difference (by the inimitable yosefk): https://yosefk.com/blog/parallelism-and-concurrency-need-dif....

Was an interesting read, thanks! Excited for the future where now I can become That Guy™!

I still feel that, while all these abstractions are very fine and handy, there is only so much you can do without actually understanding how threading and locking works. If you grow up as a developer without actually touching it I wonder how you can actually understand this.

Part of me thinks the success of JavaScript is due precisely to the fact that it prevents/discourages multithreaded programming.

I've found C# to be one of the easier languages to do multithreading with. I think this is owing to Visual Studio capabilities like the parallel stack viewer.

Also, anyone working with an OOP language should really read Java Concurrency in Practice. That really helps in terms of learning how to think about multiple threads in an OOP world.

Not sure how events by themselves can solve the threading issues as events can be multi-threaded too. I've seen people write far worse event driven code than multi-threaded code. If people want to use events heavily, I think it's better to use a well known design pattern so others can understand what you are trying to do.

I haven't directly interacted with threads in a very long time, but I do use Microsoft's Task Parallel Library on a daily basis. I feel like once you understand TPL/async/await and the nuances with execution context and how to handle CPU vs IO bound operations, things come together really nicely. I do not really worry about things like cache coherency or the low-level synchronization primitives involved anymore. I can seamlessly throw in some locking with my TPL usage (typically ReaderWriterLockSlim) without much concern for strange behavior across the various tasks. It really does "just work" once you buy-in 100% (i.e. exclusively use TPL abstractions throughout).

I would say that 99% of the time I am dealing with IO (simply awaiting some asynchronous database/network operation), with the other 1% of cases being things that I actually want to explicitly spread across multiple parallel execution units - I.e. Task.Run() or Parallel.ForEach(). In either case, I am working with the sugar-coated TPL experience, and all the horrific threading code is handled automagically. If I still had to work with threads directly, I would probably have found a different career path by this point.

What problems of threads is this actually isolating you from? It seems like the usual problems of correctly protected shared mutable state and avoiding deadlock are still there if you're using low-level primitives like ReaderWriterLockSlim.

I simply cite RWLS as an example of how you can combine other primitives into the threading model afforded by TPL without much frustration. I make no claims that it somehow eliminates fundamental concerns like shared state. In practice, I use RWLS extremely rarely because I prefer to avoid shared mutable state in the first place. I will spend an entire weekend reworking architecture in order to get a lock out of a hot path if I need to. 99% of my task-related code is boring stuff like:

  var session = await _connection.QueryFirstAsync<Session>(GetSessionSql, some session token);
The biggest concern for me is simply the thread lifecycle and # of threads involved. TPL handles a thread pool for you and automatically schedules Tasks to run on these threads as appropriate. This is a non-trivial affair which would be painful to re-implement consistently and reliably in each application. I, for one, would be far too tempted to waste entire days screwing with thread pool parameters relative to environmental factors. With TPL hiding these things from me, I can focus on a level of abstraction that actually gets business features shipped. I still have not run into a scenario where I would have rather gotten my hands dirty and implemented the raw threads myself. Microsoft did a pretty damn good job. Everything "just works" and it scales very well.

In C++ it has become way easier to manage some of the issues mentioned since 1995, namely by using atomic, future or even locks.

Events are not only an alternative to threads; they are also a complement. If your language environment has a complex memory model with good concurrency support and stable non-blocking IO, you can use threads! But for them to be good you need to make sure that the input and output is compatible, as a general guide:

1) Make sure your hardware interface is capable of being accessed by the kernel from multiple threads. Network cards generally allow this, while graphics cards still don't, at least without a lot of overhead.

2) Make sure your application profits from parallelism, and specifically joint parallelism, which is my term for computing that allows many threads to work on the same memory.

Bottom line is in my case only the Java server for my MMO will use threads in a "joint parallel" way. In everything else I will avoid them like a jerrycan full of gas tries to avoid fire.

So I'm transitioning to C with arrays!

Years ago, an OpenBSD developer told me, "threads are for idiots" in response to a bug I had submitted. At the time, I was a bit offended (it was a good, reproducible bug report), but today, I think he was right. They're just too complicated and 90% of the time they are not needed.

> Years ago, an OpenBSD developer told me, "threads are for idiots" in response to a bug I had submitted. At the time, I was a bit offended (it was a good, reproducible bug report), but today, I think he was right.

Whether he was right or not about threads, it's offensive to insult the person just because you don't like their idea. Likewise for this to be the response to a bug report - if you don't want to support threads, then don't support them rather than lashing out at people who notice your support is buggy.

I think in 2020 he's mostly wrong anyway. Sure, there have been many problems with threads, but...

* this presentation's "Threads should be used only when true CPU concurrency is needed" maybe meant "rarely" in "1995" but means "commonly" in 2020 when single-core performance has been mostly stalled for a while and core counts have risen dramatically.

* There are safer/easier alternate concurrency primitives than mutexes (channels) and at least partial solutions to major problems with threading. For example, in safe Rust there are no data races (even when synchronizing via mutexes). Other problems (deadlocks, contention, other types of race conditions) still exist of course.

* "Threads" vs "events (event loops + callbacks)" as described in this 1995 presentation isn't the whole world, especially today. What about communicating sequential processes with no shared mutable state (such as Erlang's actors)? So to some extent I disagree with the framing of the problem altogether.

* callbacks have their own problems beyond what's described in this presentation. Some that were widespread even in 1995: a string of operations written as a string of callbacks is a lot harder to understand than ones written with the "sequential composition operator" (;) and loops. (Callbacks are basically abandoning structured programming in favor of goto at the macro level.) Likewise harder to debug: you can't just get a stack trace and understand its current state. And some that have become more common since then. These days, event loops are usually multithreaded, so for any cross-request state you have the threading problems as well as the event loop problems. Today I'd say callbacks are the advanced, use them if you need the performance but be wary of the dangers option.
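To illustrate the callbacks-versus-sequential-composition point, here is the same two-step operation written both ways in Python with `asyncio`. All function names are hypothetical, and the "asynchronous" steps are faked by deferring to the next loop iteration.

```python
import asyncio

# --- Callback style: each step has to name what happens next. ---

def fetch_user_cb(loop, user_id, on_done):
    # Pretend-async step: completes on a later loop iteration.
    loop.call_soon(on_done, f"user{user_id}")

def fetch_greeting_cb(loop, name, on_done):
    loop.call_soon(on_done, f"hello, {name}")

def greet_with_callbacks(user_id):
    loop = asyncio.new_event_loop()
    result = []

    def got_greeting(text):
        result.append(text)
        loop.stop()

    def got_user(name):
        fetch_greeting_cb(loop, name, got_greeting)

    fetch_user_cb(loop, user_id, got_user)
    loop.run_forever()
    loop.close()
    return result[0]

# --- Sequential style: same steps, composed with ordinary statements. ---

async def fetch_user(user_id):
    await asyncio.sleep(0)
    return f"user{user_id}"

async def fetch_greeting(name):
    await asyncio.sleep(0)
    return f"hello, {name}"

async def greet_sequential(user_id):
    name = await fetch_user(user_id)
    return await fetch_greeting(name)
```

Both produce the same result, but in the callback version the control flow is spread over three functions and a stack trace inside `got_greeting` shows only the event loop, not the chain of steps that led there.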

libuv (http://docs.libuv.org/en/v1.x/design.html) is pretty much this in action. Event loops are not necessarily easy to debug though and message passing between event loops will force the use of concurrency primitives such as mutexes and/or memory barriers (lock-free) anyway. Also, working with event-driven architecture requires a more functional approach since the handlers are all short-lived. Reminds me of Erlang. I think the deck downplays the complexity of building such a system.

The author of the deck is Dr. John Ousterhout, who wrote Tcl, which had an event loop way back in the day.

Sure, only use threads when you actually need them. The cases where you actually need them are quite numerous, though. E.g., you have a library that connects to something over the network but the library is blocking. Or you need to respond to events quickly but for some events you need expensive calculations. If you can get away with copying data to the threads so shared state is minimized this will help. The general principle is to make things as easy as possible instead of as difficult as possible.

It is amazing, and sad, how many people still refer to this document today when complaining about how "threads are bad" and "events are the bee's knees" or when justifying some architectural decision to avoid threads. No one prototypes anymore. No one does their own real world testing and benchmarking. They simply Google for some docs which support their already-made decision and call it a day.

Modern languages with high-level and verified async features and explicit mutability make threads much more convenient.

That paper is finally becoming dated.

Async is superior. I have done processes with locks in shared memory. I have done threads. But I predict Async will slowly start to take over. Processes are not suitable for working on shared data. Threads frequently yield race conditions and deadlocks even for experienced coders. But Async doesn't have any of these issues. So why isn't it more popular? For two reasons:

1) it completely breaks the functional programming model that we all learned as toddlers (instead of call A and then, after that's done, call B, Async is call A which just installs B as a callback, returns immediately and then an "event loop" calls B). Note that promises and tasks and futures are just "syntactic sugar". Personally I'm not a fan. I don't use any of that. I just use callbacks.

2) Even though Async is great for concurrency, it's not great for parallelism. Everything runs with one thread. So if you want parallel processing you need workers.
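The inversion described in point 1 can be sketched with plain callbacks on Python's asyncio loop. The function names a and b and the 10 ms delay are illustrative assumptions, not anything from the comment.

```python
import asyncio


def run_callback_style():
    """Call A, which only installs B as a callback and returns at once."""
    loop = asyncio.new_event_loop()
    log = []

    def b(value):
        log.append(f"B got {value}")
        loop.stop()  # nothing left to do, so end the event loop

    def a():
        log.append("A started")
        # Instead of calling b() directly, ask the loop to call it later.
        loop.call_later(0.01, b, 42)
        log.append("A returned")

    loop.call_soon(a)
    loop.run_forever()  # the event loop is what eventually calls b
    loop.close()
    return log
```

The returned log is ["A started", "A returned", "B got 42"]: A finishes before B ever runs, which is the part that feels backwards compared with calling A and then B in sequence.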

But I would argue that issue 1 can be overcome. In fact, I find Async to be quite elegant. I think in the long term people are going to realize that maybe we've had it backwards all along.

Issue 2 is actually not that big of a deal for most things. It's actually somewhat unusual that you need to have some CPU intensive operation running in the background. Maybe image processing, data modelling, etc. But most blocking operations are just I/O operations which are not using CPU that much. If I needed to write some kind of network server, I would look at using libuv as a portable runtime.
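For the occasional CPU-heavy job, a minimal sketch of handing work to a worker while the event loop keeps running. The expensive function is a placeholder. A thread pool keeps the example self-contained; in CPython, swapping in concurrent.futures.ProcessPoolExecutor is what would give real parallelism.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor


def expensive(n):
    """Placeholder for CPU-heavy work such as image processing."""
    return sum(i * i for i in range(n))


async def main():
    loop = asyncio.get_running_loop()
    ticks = 0
    with ThreadPoolExecutor(max_workers=1) as pool:
        # Hand the job to a worker; the loop itself is not blocked.
        job = loop.run_in_executor(pool, expensive, 200_000)
        while not job.done():
            ticks += 1                  # the loop is still free to do this
            await asyncio.sleep(0.001)
        result = await job
    return result, ticks


def run():
    return asyncio.run(main())
```

run() returns the computed sum together with the number of times the loop got to tick while the worker was busy.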

I mean, threads are OK, but you need to essentially and artificially (because they're not processes) isolate them as much as possible (by minimizing shared state), and you want to do async/evented I/O so that you have as many threads as CPUs.

Hey, I get to tell my own two 1990s-era threads stories.

First: in Windows 3.1, you got exactly one thread. My former company (BBN Software Products, home of the RS/1 statistical program) managed to get a version of RS/1 on Windows by splitting it into two pieces, each of which ran a single thread. One piece (RS/Client) was the UI; it talked to the "server" using TCP/IP (or a shared memory channel if the client and server were on the same machine).

Second: I also got to help port a networking program over to an SGI box. At the time, the SGI GCC-based compiler could either support threads or support exceptions, but not both. (And by "unsupported" I mean "generated code that would crash even if no exception was ever actually generated".) I couldn't convince the company to keep the threads and dump the exceptions, so instead I had to convert the program to spawn new processes with shared memory (!) to emulate the threads.

TL;DR: actually programming with threads at the time was decidedly unsupported.

True, but Windows NT 3.1 came with threads and used them throughout the kernel. They were supported by nonstandard functions in the Microsoft C runtime as well as through the Windows API. Windows included its own structured exception handling facility that also worked with them.

This is mildly off-topic, but how does one end up working on problems that are complex enough for things like this to even be an issue? It sounds incredibly interesting to me, but most of the software I've worked on has been at least somewhat web-based.

I know there's so much more out there, and I'm just not sure how to find relevant problems to solve...it feels like a serious case of "I don't know what I don't know."

I guess I should probably just pick some non-web concept I find interesting and start making something.

Please do that as an Ask HN, I'd be interested in the responses you hopefully get.

Edit: Ah, I see that you did, an hour ago. Very good!

Concurrency isn’t a “nice layer over pthreads” - the most important thing is isolation - anything that mucks up isolation is a mistake.

— Joe Armstrong

I'm all for getting rid of threads, but what are you going to replace them with? Traditional functional languages may be the most obvious solution, but they're also among the most impractical of solutions. Is there anything else out there that can replace threading needs, without throwing out the book on programming? It seems like what we need hasn't been invented yet.
