

Ask HN: about multi-core computers and OSs - zxcvvcxz

What are the limits and problems associated with multi-core computers and the software that controls them?<p>Currently the top-of-the-line consumer products seem to be quad core computers, and I've read about octo-core and up computers existing (Intel's products: http://en.wikipedia.org/wiki/List_of_Intel_Xeon_microprocessors#.22Westmere-EX.22_.2832_nm.29)<p>I suppose one big problem on the software side is that most operating systems wouldn't be set up in a way to optimize performance gains from these CPUs, making them an expensive waste of money. I'm also curious as to what some of the hardware limitations are (floor planning and signal timing constraints perhaps?).<p>I'm thinking of pursuing research opportunities in systems programming for multi-core computers, so I wanted to try and get some insightful info and input if that's going to be (or already is) an important field.<p>Cheers.
======
andyzweb
Concurrency is the biggest area of study and contention here.

I still think an exciting problem that hasn't received much attention is
thread/process affinity.

When scheduling threads, the operating system picks a logical execution unit to
do the actual processing. Logical execution units share resources with one
another at several levels. Let's say we had a machine with two Intel Core
i7-2600K CPUs in it. That machine would have 16 logical execution units
(2 CPUs x 4 cores x 2 threads per core). All 16 units would share the same
RAM, the 8 logical units on the same CPU would share that CPU's L3 cache, and
the 2 logical units on the same core would share that core's L1 and L2 caches.

Now say I want to apply a fast Fourier transform to a stereo audio signal and
process each channel in its own thread. When the operating system schedules
those threads, it might be best to pick two logical execution units on the
same CPU core and reschedule the threads there every time they execute. This
would maximize hits on the instruction and data caches, with the added benefit
of not polluting caches the threads don't need. The opposite would be running
the threads on logical execution units on different CPUs and never
rescheduling them to the same logical execution unit twice in a row. That
would drastically reduce the likelihood of cache hits and could pollute the
caches of every logical execution unit.

I've lost my train of thought here. In any event, processor affinity and
scheduling are two things ripe for improvement at the operating-system level.
That said, most modern operating systems handle SMP pretty well, and there is
far more improvement to be made in userland.

