Software engineers are still obsessed with squeezing every last drop of performance from a single core, adding multicore or distributed load support as an afterthought.
Sorry, it doesn't work that way anymore. There will be no more significant single-core performance increases; physics (power density, the end of Dennard scaling) forbids it. Instead we will see more and more cores in common CPUs (64, then 256, then 1024), eventually merged with GPGPUs and FPGAs and their stream-processing approach.
Learn distributed programming or perish.
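The core-count argument here is essentially Amdahl's law: as cores multiply, whatever serial fraction a program has caps its speedup, so single-core-optimised software stops benefiting. A quick back-of-the-envelope in Python (the function name and numbers are illustrative, not from the article):

```python
# Amdahl's law: with serial fraction s, speedup on N cores is
#   1 / (s + (1 - s) / N)
def amdahl_speedup(serial_fraction, cores):
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / cores)

# Even on 1024 cores, a program that is just 5% serial tops out near 20x:
print(round(amdahl_speedup(0.05, 1024), 1))   # → 19.6
print(round(amdahl_speedup(0.05, 10**9), 1))  # → 20.0  (asymptotic limit 1/s)
```

Which is why "more cores" only helps if the parallel/distributed structure of the software is there to use them.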
It is $CURRENT_YEAR, yes.
> Software engineers are still obsessed with squeezing every last drop of performance from a single core, adding multicore or distributed load support as an afterthought.
Normally I'd agree, but if you had bothered to read the abstract you'd know the performance losses were not negligible, and that the kernel scheduler was responsible for them. This has nothing to do with application programmers not understanding "distributed programming", as you put it.
From the article:
> As a central part of resource management, the OS thread scheduler must maintain the following, simple, invariant: make sure that ready threads are scheduled on available cores.
> As simple as it may seem, we found that this invariant is often broken in Linux. Cores may stay idle for seconds while ready threads are waiting in runqueues.
> In our experiments, these performance bugs caused many-fold performance degradation for synchronization-heavy scientific applications, 13% higher latency for kernel make, and a 14-23% decrease in TPC-H throughput for a widely used commercial database.
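The invariant the paper describes is easy to state in code. Here's a toy model of per-core runqueues (invented names and data structures, nothing like actual kernel code) that flags the broken state: a core sitting idle while ready threads wait elsewhere.

```python
# Toy model of per-core runqueues. The paper's invariant: no core should
# be idle while ready threads are waiting in some other core's runqueue.
# This is a simplified sketch for illustration, not Linux scheduler code.

def invariant_violations(runqueues):
    """Return indices of idle cores when some other core has threads waiting.

    runqueues: list of lists; runqueues[i] holds the threads queued on
    core i (the first entry, if any, is the one currently running).
    """
    idle = [i for i, q in enumerate(runqueues) if len(q) == 0]
    overloaded = [i for i, q in enumerate(runqueues) if len(q) > 1]
    # The invariant is broken when both sets are non-empty at once:
    # a thread waiting on an overloaded core could run on an idle one.
    return idle if overloaded else []

# Core 0 runs A with B and C waiting; cores 2 and 3 are idle → violation.
rqs = [["A", "B", "C"], ["D"], [], []]
print(invariant_violations(rqs))  # → [2, 3]
```

The paper's point is that load balancing is supposed to restore this invariant quickly, and the bugs they found let the violated state persist for seconds.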
Or, put another way: if we want more raw processing power we need more cores, but more cores don't help us because software is optimised for a single core.
This is just plain wrong. Look at any Intel CPU generation and you will find that the new one is faster than the old one clock-for-clock: IPC still improves, even if clock speeds don't.
Kinda looking forward to lowRISC with minion cores, though - if we ever ditch x86 for games.
You're in luck:
Intel's purchase of Altera is sure to lead to all kinds of innovation in this area. This is hopefully only the start.
To me it just seemed silly that we'd continually upgrade to get more dedicated circuitry for things like low-power video decoding. I know programming an FPGA still wouldn't be as efficient as dedicated silicon, but I think in the future it might be nice to have an FPGA available for adding things like H.265 (HEVC) hardware decoding a year after purchase.
What I'd REALLY love to see is an FPGA in a Chromebook for students. School will be so awesome in the next decade.
Multicore ARM is still inferior to multicore x86.
Assuming which software load?