Ask HN: How important is concurrency/parallelism today and in the future? - diehunde
======
ddingus
Until we can make bigger leaps in terms of sequential computation, both of
those things will continue to grow in overall importance.

One recent example is geometry kernel computing. For decades, these kernels
have been largely sequential bodies of software. In CAD systems there are some
opportunities for concurrent approaches to modeling geometry, and those
opportunities hinge on the dependencies that arise from how models are
defined, constructed, changed, combined, and assembled.

But there are limits, and those limits center right on sequential compute.
Roughly 5 GHz is available these days, and that tends to place an upper bound
on how fast a chain of dependent operations can be resolved.
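One way to make that ceiling concrete is Amdahl's law: when some fraction of the work is inherently sequential, the speedup from adding parallel workers is capped no matter how many you add. A minimal sketch (the function name is mine, for illustration):

```python
# Amdahl's law: overall speedup from n parallel workers when a
# fraction p of the work can be parallelized (1 - p stays sequential).
def amdahl_speedup(p: float, n: int) -> float:
    return 1.0 / ((1.0 - p) + p / n)

# Even at 95% parallelizable, the sequential 5% caps the gain:
for n in (2, 8, 64, 1024):
    print(n, round(amdahl_speedup(0.95, n), 2))
```

This is why the dependency structure of a geometry model matters so much: the longer the sequential dependency chain, the smaller the fraction that parallel hardware can help with.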

Recently, Dyndrite has built a GPU-centric geometry kernel from first
principles. It can do things in the blink of an eye that are either
impractical or very time consuming with the sequential compute kernels.

That capability is going to have profound effects on how we manufacture
things. Additive is a current target, but the idea of fast and flexible
geometry is going to have ripple effects all over the place.

I would argue this kind of innovation would have seen less investment and less
overall demand had we been able to keep ramping up sequential compute
capability.

Concurrency may mean a return to more custom hardware too, for the same basic
reason: sequential compute is no longer seeing the massive gains it did early
on.

When we can package tasks up into hardware, those tasks become very efficient
and the software can be simplified. Having those tasks happen at the same time
may not require as much kernel-level software managing things like interrupts.

It's almost like we are coming around full circle!

Take a look back at early computing on 8-bit machines in the 80s: the first
ones didn't have much in the way of custom hardware.

The original Apple 2 was made from discrete logic and a CPU. Software drove
pretty much everything except the built-in graphics system, and all that was
was a frame buffer. No assists of any kind.

The CPU pretty much did everything, even reading and writing the disk drives,
which had only a simple hardware assist in the form of a state machine, and
that's it.

Despite a 1 MHz clock, having so much of the computer driven by software meant
it could be optimized over time. Those machines were sold and used from the
late 70s through the early 90s.

Along came custom hardware, and often the same CPU at a similar clock.

Atari and C64 machines, for example, had graphics and sound devices, and the
Atari had a serial I/O system that looks a little like today's USB. Those
machines were able to do more and featured some basic capability not driven by
the CPU itself. Graphics, sound, and some I/O could all happen concurrently,
and that made more things possible.

The IBM PC looked a lot like an Apple 2: lots of discrete logic, but no built
in graphics system. Add-on cards, MDA, CGA, EGA, VGA and more, offered more
and more capability, and like the Apple 2, additional cards could deliver more
features and some concurrency where needed.

Test and measurement was one use case. Music was another, where multiple
devices could be added and perform concurrently. Automation and control was
yet another, where lots of I/O could be added to the system reasonably,
coupled with local compute resources for concurrency.

The same pattern played out in the 16-bit era. The Atari ST and Apple Mac
paired a simple CPU with a dumb graphics system; the Amiga featured a similar
CPU but coupled it with custom chips that made things like video editing and
production possible.

Today, we've got computers with lots of little subsystems doing many things,
and they all use similar CPUs too.

We've topped out on CPUs, much like those older parts and computers did.

8-bit machines ran from under 1 MHz to a few MHz, maybe 10 tops.

16-bit machines ran from a few MHz to tens of MHz.

32-bit and 64-bit machines today top out around 5 GHz.

Until the last decade or so, this was all single-core, sequential compute too.
Concurrency happened via add-on dedicated subsystems, or via interrupts and a
complex scheduling system able to get things done concurrently.

The only places to go are multi-core CPUs, many-core GPUs, and custom task- or
process-based hardware.

Again, that is given we do not somehow find a way to very significantly
improve single thread performance.

Software will continue to evolve. More and more things will be done in ways
that can take advantage of simultaneous execution and/or dedicated hardware.
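As a sketch of what that looks like at the software level, here is independent work split across workers with Python's standard concurrent.futures module (the names and toy task are mine; threads are used for brevity, though CPU-bound Python code would typically use ProcessPoolExecutor because of the GIL):

```python
from concurrent.futures import ThreadPoolExecutor

def partial_sum(chunk):
    # Sum of squares over one independent chunk of the input.
    return sum(x * x for x in chunk)

def parallel_sum_of_squares(n, workers=4):
    # Strided split of [0, n): the chunks share no data, so there are
    # no dependencies and each one can run at the same time as the others.
    chunks = [range(i, n, workers) for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(partial_sum, chunks))

result = parallel_sum_of_squares(10)  # 0^2 + 1^2 + ... + 9^2 = 285
```

The interesting part is the restructuring, not the executor: the work has to be carved into pieces with no dependencies between them before simultaneous execution buys anything.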

And it can play out in a lot of ways!

Cell phones are super interesting. In terms of general compute performance,
some of them approach laptop speeds. But they are super efficient and have
custom hardware assists for a lot of things.

Getting things done in real time, with enough fidelity not to degrade the task
in a meaningful way, on an increasingly small power budget means doing more on
a smaller battery, potentially not using a battery at all, or making devices
smaller and more transparent to their users.

Desktop computers are fading fast, except for those brute-force, high-demand
tasks that require tons of RAM, CPU cores, GPUs, many I/O channels, etc.

Laptops continue to get more lean and mean, while remaining performant for
many tasks.

More is possible on a cell phone every cycle.

Wearables are now a thing.

Wow, I just rambled on a lot.

Really, I think your question is better expressed in terms of:

Multi-core CPU

Role of the GPU

Custom, dedicated-to-task hardware.

Those things may contribute in a parallel sense, a concurrent one, or be mixed
mode, depending.

Mix in the power budget, and that's going to speak more to what is important
for the masses than anything else.

Your question is different in some niches where it really is all about peak
sequential compute limits and the slow move of software over to multi-core
computing vs increasingly complex custom hardware.

And on that last note, think about instructions. Just adding one or two that
package up whole loops, or groups of instructions, is a powerful way to get
more sequential compute power these days.
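The kind of instruction meant here is a SIMD/vector one: a single instruction applies an operation to a whole group of elements at once. A conceptual pure-Python sketch of that lane idea (illustrative only, with names of my choosing; real SIMD comes from the CPU's vector units via compiler intrinsics or libraries like NumPy):

```python
# Scalar loop: one add per step, like a plain sequential instruction stream.
def add_scalar(a, b):
    return [x + y for x, y in zip(a, b)]

# "Vector" version: process LANES elements per step, mimicking how a single
# SIMD instruction (SSE/AVX/NEON) applies one operation to a whole lane group.
LANES = 4
def add_simd_style(a, b):
    out = []
    for i in range(0, len(a), LANES):
        # One conceptual "instruction" covering a group of LANES elements.
        out.extend(x + y for x, y in zip(a[i:i + LANES], b[i:i + LANES]))
    return out

a = list(range(8))
b = [10] * 8
# Both produce [10, 11, 12, 13, 14, 15, 16, 17]; the second one
# issues a quarter as many conceptual "instructions".
```

Same result, fewer steps through the instruction stream, which is exactly the kind of sequential-compute win the loop-packaging instructions deliver.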

