- "Context Switching"
You can find 4 cores in even a modest computer these days (8 hardware threads with hyperthreading). This means you can run 8 processes without any context switching at all, and it also means that if you only run a single process, you can reach only about 10-20% of the capacity of a multi-core machine.
- "Per-process overhead"
It's true there is overhead, but you don't have to create thousands of processes; we have lightweight concurrency patterns (event loops, coroutines, pools) to use within each thread/process.
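To make the "light concurrency within one process" point concrete, here's a minimal sketch using Python's stdlib asyncio. The task names and delays are made up for illustration; the point is that a thousand concurrent tasks run inside a single OS process, interleaved by the event loop rather than by OS context switches:

```python
# Sketch: "light" concurrency inside a single process using asyncio.
# A thousand concurrent tasks, one OS process, no thread context switches.
import asyncio

async def fetch(task_id: int) -> int:
    # Simulate I/O-bound work (network call, disk read, ...).
    await asyncio.sleep(0.001)
    return task_id * 2

async def main() -> list:
    # 1000 concurrent "requests"; the event loop interleaves them
    # cooperatively instead of the kernel preempting threads.
    return await asyncio.gather(*(fetch(i) for i in range(1000)))

results = asyncio.run(main())
print(len(results), results[:3])   # 1000 [0, 2, 4]
```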
But even then, you don't need to run more than one primary process per core. You have multiple cores, and they can't cooperate on running a single sequential process.
- "We built for a single-core world" and "We lack the tools"
Those are outright PEBKAC errors, not architectural problems. We're way past the stage when multi-process architectures were a poorly understood, confusing problem. We have the right tools, for those who look for them.
Cache misses and overly eager context switching are bad, and you can read a lot about this from Martin Thompson on his Mechanical Sympathy blog: http://mechanical-sympathy.blogspot.com/ There are ways to design multi-process systems that take best advantage of all your cores and your machine's caches.
But even Thompson's Disruptor architecture is based around message passing and multiple processes. Because, again, in a multi-core world, suggesting anything else is laughable.
Plus, forget computers, multitasking is a fact of life. Animals do it, humans do it, so do computers. We have to check email and talk on the phone at the same time sometimes. We need to walk and chew gum. We constantly argue and write about how much we should multitask and how much we should focus on a single task, because people have the same context switching overhead and so on. Well, for both people and computers, there's a balance, and either extreme is counterproductive. It's as simple as that.
No, you can use threads.
> Plus, forget computers, multitasking is a fact of life. Animals do it, humans do it, so do computers. [...]
That's a really bad comparison, and isn't applicable here at all. Computers are very different from humans or animals, both in how they function and in how they're "programmed". Well, at least it wasn't a car analogy.
I thought that was assumed in the sentence I wrote, as threads are even harder to deal with properly due to their shared memory space. But they still context switch, still have cache misses, still require (far more complex) synchronization, and so on.
If you can get away with separate processes, you're way better off. Threads are a last resort and come with a "use with caution" label.
> That's a really bad comparison, and isn't applicable here at all. Computers are very different from humans or animals, both in how they function and in how they're "programmed". Well, at least it wasn't a car analogy.
No, I didn't just use a random analogy. How much do you know about how our brain works, anyway?
Did you know our memory types are layered like CPU caches, RAM, and disk, going from fast short-term to slow long-term memory? Interesting, right? We literally have "cache misses" when context switching, just like our computer brethren.
And as computer designs keep evolving, they creep closer and closer to how our brain works, and the tradeoffs are startlingly similar. I was very deliberate in my comparison. Think of it like dolphins adopting much the same body design and locomotion as fish, despite their completely different origins.
Next step is the always-around-the-corner clockless CPU design. It'll come, eventually (either that, or more likely we'll have hundreds of tiny, simple, specialized, locally clocked cores with some memory on-board).
And saying that "there's a balance to be struck" in human multitasking manages to be both trite and wrong; the evidence is not that some people multitask too much and some people multitask too little. Multitasking makes you more stressed and less effective; the overwhelming majority of people should be trying to reduce or eliminate it. Which says very little about how much multitasking our CPUs should be doing.
Um, you do realize that the human brain has ten to a hundred billion neurons that all run in parallel, right?
And yet here you sit, typing and breathing and seeing and thinking about what you're having for dinner tonight while your digestive system continues to work on lunch and your heart continues to beat.
You might have assumed it, but your post didn't mention or even imply it.
> Computers keep creeping closer and closer to how our brain works and the tradeoffs are startlingly similar.
This definitely needs an explanation. In what way are computers and brains "creeping closer and closer"?
> Think about it like dolphins adopting a quite similar body design and locomotion as fish, despite their completely different origins.
This is another bad analogy. CPUs and human brains do not live in the same environment, and CPUs are not subjected to evolution (they could be considered instances of Intelligent Design, though ;-) ).
The story changes if those processes have to communicate.
Cache misses and overly eager context switching are bad,...There are ways to design multi-process systems to take best advantage of all cores, and your machine caches.
Yeah, but frankly, the situation sucks! It's still quite difficult to detect such problems with certainty, and the solutions often look like hacks.
Plus, forget computers, multitasking is a fact of life...We have to check email and talk on the phone at the same time sometimes
This kind of misses the point. We can do efficient multitasking of largely independent processes, and have been doing that for a long while. It's using multiple cores on heavily interrelated and interconnected processes where things get quite hairy. This is where current architectures suck.
In a typical user machine there can be many useful independent single-threaded processes running, all serving the user's needs (and thus using the machine's computational capacity). For servers that have to do only one thing, on the other hand -- yes, it's a problem that requires actual engineering.
There are still plenty of tablets and phones out there which don't. (And the author is targeting those devices with the software he's writing.)
"It's true it has overhead, but you don't have to create thousands of processes,"
Ah, but that might be a natural consequence of your design. I think part of the problem is that we're currently in the middle of a really awkward transition period in computer architectures.
As you point out, almost no computers are single core nowadays. So having a single-process application is obviously not taking advantage of all the computer power you have at your disposal. It's blatantly sub-optimal.
However, if you're going to design a multi-process application framework, you probably want it to kick off new processes whenever you're about to do something non-trivial that could introduce significant latency. But depending on what the user/application is actually doing, that might end up starting dozens, hundreds, or even thousands of processes.
And our computer architecture is not yet at the point where we have "enough" cores in our CPUs that we can just do that and have it work. We're getting there, but it's likely a decade or maybe two away.
So, we want to create a multi-process architecture to stop wasting the computing power that exists; but we have to be careful and write extra code to manage creating these processes, because we can't yet afford to create them "on a whim".
It seems similar in some ways to segmented 16-bit memory models. In an early flat 16-bit memory model, you were very constrained (64k) but at least the environment you were working in was simple. The move to a segmented 16-bit model was in some ways a lot less constrained, but taking advantage of it meant dealing with a bunch of extra complexity. (boo!) It was the next step to flat 32-bit systems which made working with memory painless and simple again, while further lessening the constraints.
When low-end phones and tablets have >=256 cores, then we'll be able to take advantage of multi-process frameworks properly.
Having seen Alan Kay's talk a few weeks ago, I was fascinated by the idea of "spending money to get ahead of Moore's law": putting together a computer that would allow you to write the sort of software that takes advantage of computers a decade from now. I couldn't figure out what that would mean if you wanted to do it today, but it occurs to me that putting together a >=256 core system might be a good start.
Furthermore, if one observes the hardware evolution of the PC, one notices that it took the direction of heterogeneous multicore architecture (CPU, GPU, etc.) rather than a homogeneous one: there are more "cores" of different types in your PC outside your i7 than inside. The same goes for tablets and phones, which typically feature an ARM design with CPU, GPU, and DSP integrated into one chip for footprint and power consumption reasons. Architectures seem to be evolving into modular hardware, featuring a base CPU backbone to which specialized chips are added. There are a gazillion ARM-based designs, depending on which set of functions you need.
This makes sense for the software and consumer electronics industry. Switching to completely different solutions like many-core chips (Mill CPU, GreenArray), would have a huge cost. Picking a common, general purpose "few-cores" CPU and adding in specialized chips as needed is much more affordable.
And 640k will be enough for everyone.
Hey, you might even be right about not needing them, but that doesn't mean those devices won't get them. You don't really need a 32-bit CPU to run your microwave or washing machine, but very few of the ones you buy today are running 8-bit microcontrollers with hand-coded assembler.
Though clock speeds stopped increasing some time ago, Moore's law marches on, and we're still getting more and more transistors per buck. Sure, some of them will go to more L1/L2 cache, but at some point the bandwidth to flush them to main memory becomes a bottleneck, so I think we're going to see more and more cores/chip. As that marches on, I think it's going to trickle down to even the low end of CPUs. For instance, you might even be able to buy a bunch of 64-core chips which have a bunch of cores disabled really cheap, because the disabled cores failed factory testing and were switched off. It can't be sold for full price any more, but that doesn't mean it's useless.
And once your $5 CPUs have 32 cores, well, providing the tooling is there, you might as well use a framework that makes use of them.
"many of them just run a browser and Excel."
Bad choice of examples there, I think. Browsers can be pretty multi-process heavy these days, with one main process, plus one per tab, plus extra processes for e.g. decoding streaming video in the tab, or running your (spit) EME plugins (spit) in a sandbox.
Similarly, with Excel, sure most of the time it doesn't do much. But, if you've got a spreadsheet with a bunch of dependent cells/formulas, then if someone updates the right cell, being massively parallel could really speed up value recalculations and propagation throughout the sheet. Some spreadsheets translate really well to map/reduce, and using all the cores could really help there.
If a natural consequence of your design is a need to do something that fits the target environment so poorly, then it's a bad design & should be rethought. Software isn't a Platonic ideal, it needs to account for these things if it's to run well.
This is exactly the problem process/thread pools solve. Most if not all modern programming languages include such thread pools either in their stdlib or as battle-tested external libraries. Do you have a specific complaint here?
The really hard part of multi-(thread/machine)process architectures is synchronization, not resource allocation.
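The classic illustration of why synchronization is the hard part: two threads doing a read-modify-write on shared state. Without the lock, increments can be lost; with it, the result is deterministic. (A toy sketch in Python; the counts are arbitrary.)

```python
# Sketch: synchronizing a shared counter. "counter += 1" is a
# read-modify-write; the lock serializes it so no increment is lost.
import threading

counter = 0
lock = threading.Lock()

def add(n: int) -> None:
    global counter
    for _ in range(n):
        with lock:               # remove this and the total may come up short
            counter += 1

threads = [threading.Thread(target=add, args=(100_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)   # 400000
```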