
The year is 2016.

Software engineers are still obsessed with squeezing every last drop of performance from a single core, adding multicore or distributed load support as an afterthought.

Sorry, it doesn't work this way anymore. There will be no more single core performance increases — laws of physics forbid it. Instead, we will see more and more cores in common CPUs (64, then 256, then 1024), and eventually CPUs will merge with GPGPUs and FPGAs and their stream-processing approach.

Learn distributed programming or perish.




> The year is 2016.

It is $CURRENT_YEAR, yes.

> Software engineers are still obsessed with squeezing every last drop of performance from a single core, adding multicore or distributed load support as an afterthought.

Normally I'd agree, but if you had bothered reading the abstract, you'd know the performance losses were not negligible and that the kernel scheduler was responsible for them. This has nothing to do with application programmers not understanding "distributed programming", as you put it.

From the article:

> As a central part of resource management, the OS thread scheduler must maintain the following, simple, invariant: make sure that ready threads are scheduled on available cores.

> As simple as it may seem, we found that this invariant is often broken in Linux. Cores may stay idle for seconds while ready threads are waiting in runqueues.

> In our experiments, these performance bugs caused many-fold performance degradation for synchronization-heavy scientific applications, 13% higher latency for kernel make, and a 14-23% decrease in TPC-H throughput for a widely used commercial database.
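
For what it's worth, until the scheduler bugs are fixed you can route around them by pinning threads to cores yourself, so a ready thread can never be stranded on the wrong runqueue. A minimal sketch, assuming Linux and glibc's pthread_setaffinity_np (the actual work is omitted):

    /* pin.c - pin one worker thread per core; build with: gcc -pthread pin.c */
    #define _GNU_SOURCE
    #include <pthread.h>
    #include <sched.h>
    #include <stdio.h>
    #include <string.h>

    static void *worker(void *arg) {
        int core = *(int *)arg;
        cpu_set_t set;
        CPU_ZERO(&set);
        CPU_SET(core, &set);               /* restrict this thread to one core */
        int rc = pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
        if (rc != 0)
            fprintf(stderr, "setaffinity: %s\n", strerror(rc));
        /* ... do the actual work here ... */
        return NULL;
    }

    int main(void) {
        int cores[4] = {0, 1, 2, 3};       /* assumes at least 4 cores */
        pthread_t t[4];
        for (int i = 0; i < 4; i++)
            pthread_create(&t[i], NULL, worker, &cores[i]);
        for (int i = 0; i < 4; i++)
            pthread_join(t[i], NULL);
        return 0;
    }

Of course that only papers over the problem for one application; the paper's point is that the kernel should uphold the invariant for everyone.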


I don't think this is a good way to look at it. In my opinion, having a "perfect" single core is much more valuable than an octo-core CPU. Take a look at the iPhone 6s. Two cores, and it runs faster than some octo-core Androids. I may be a bit of a noob in this area, but something tells me that is significant.


I'm also a noob, but according to the parent, the reason the iPhone runs faster is that all optimisation is done for one core, which leaves most of the eight cores doing nothing, despite their having much, much more raw power available than the iPhone's two cores.

Or put in another way: If we want more raw processing power, we need more cores, but we don't want more cores because software is optimised for a single core.


Not true. Even in multithreaded scenarios the A9 outperforms the multicore Android phones, or at worst matches them, if I remember correctly. In single-threaded scenarios it simply destroys all the other mobile chips and is comparable with some low-power Intel chips.


That is the point. If software can't use multiple threads, then strong single-threaded performance wins.


Yes, because most applications are still written as if for a single core. This is the vicious cycle I am talking about.
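
And the parallel version usually isn't even hard. A toy sketch of the kind of loop people hand-tune for one core instead of just splitting it, assuming pthreads (the array contents and thread count are made up):

    /* sum.c - split a reduction across N threads; build with: gcc -pthread sum.c */
    #include <pthread.h>
    #include <stdio.h>

    #define N 4                            /* pretend we have 4 cores */
    #define LEN (1 << 22)

    static double data[LEN];
    static double partial[N];              /* one slot per thread, no sharing */

    static void *sum_range(void *arg) {
        long id = (long)arg;
        long chunk = LEN / N;
        double s = 0.0;
        for (long i = id * chunk; i < (id + 1) * chunk; i++)
            s += data[i];
        partial[id] = s;
        return NULL;
    }

    int main(void) {
        pthread_t t[N];
        for (long i = 0; i < LEN; i++)
            data[i] = 1.0;
        for (long i = 0; i < N; i++)
            pthread_create(&t[i], NULL, sum_range, (void *)i);
        double total = 0.0;
        for (long i = 0; i < N; i++) {
            pthread_join(t[i], NULL);
            total += partial[i];
        }
        printf("total = %.0f\n", total);   /* expect 4194304 */
        return 0;
    }

The hard part is the real code that isn't embarrassingly parallel, which is exactly where people give up and fall back to one fast core.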


Still, your earlier comment completely misses the mark. Distributed algorithms rely more heavily on a well-functioning scheduler than single-threaded solutions do. I fail to see how your comment about "squeezing every last drop of performance from a single core" matches the article being discussed.


> There will be no more single core performance increases — laws of physics forbid it.

This is just plain wrong. Look at any Intel CPU generation and you will find that the new one is faster than the old one clock-for-clock.


Most recent Intel cores actually appear to be slightly slower at integer performance. http://imgur.com/a/2fiLF
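
If anyone wants to check that chart for themselves, the crude version of such a test is just a fixed integer workload timed on each chip and normalised by clock speed. A rough sketch, assuming POSIX clock_gettime (the constants are arbitrary LCG multipliers, nothing more):

    /* ilp.c - time 1e9 dependent integer mul-adds; build with: gcc -O2 ilp.c */
    #include <stdio.h>
    #include <stdint.h>
    #include <time.h>

    int main(void) {
        struct timespec a, b;
        volatile uint64_t x = 1;           /* volatile keeps the loop from folding away */
        clock_gettime(CLOCK_MONOTONIC, &a);
        for (uint64_t i = 0; i < 1000000000ULL; i++)
            x = x * 6364136223846793005ULL + 1442695040888963407ULL;
        clock_gettime(CLOCK_MONOTONIC, &b);
        double s = (b.tv_sec - a.tv_sec) + (b.tv_nsec - a.tv_nsec) / 1e9;
        printf("1e9 mul-adds in %.3f s (%.2f per ns)\n", s, 1.0 / s);
        return 0;
    }

Divide the per-ns figure by the clock rate in GHz on each machine and you get a very rough clock-for-clock comparison; a real methodology needs perf counters and a pinned frequency.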


I would fucking love to have an FPGA included with the newest Intel procs or something. :D

Kinda looking forward to lowRISC with minion cores, though - if we ever ditch x86 for games.


> I would fucking love to have an FPGA included with the newest Intel procs or something. :D

You're in luck:

http://www.theregister.co.uk/2016/03/14/intel_xeon_fpga/

Intel's purchase of Altera is sure to lead to all kinds of innovation in this area. This is hopefully only the start.


That's awesome! :D

To me it just seemed silly that we'd continually upgrade to get more dedicated circuitry for things like low-power video decoding. I know programming an FPGA still wouldn't be as efficient as dedicated silicon, but I think in the future it might be nice to have an FPGA available for adding things like H.265 (HEVC) hardware decoding a year after purchase.

What I'd REALLY love to see is an FPGA in a Chromebook for students. School will be so awesome in the next decade.


We are stuck with x86 for good. Intel is doing seemingly impossible things in both engineering and business execution, and it's really hard for anyone to compete with them.

Multicore ARM is still inferior to multicore x86.


for which purpose?

assuming which software load?



