

Parallel Programming: Understanding the impact of Critical Sections - suprgeek
http://www.futurechips.org/tips-for-power-coders/parallel-programming-understanding-impact-critical-sections.html

======
JoeAltmaier
For update critical sections, it's important to execute the non-contention case
at lightning speed. This reduces the chance of collision (and the associated
blocking, thread switching, and kernel trips) and moves the 'saturation point'
(where there are so many threads that somebody is always blocking) up by an
order of magnitude in thread count.

In my measurement, OS X and Linux both simply implement a "critical section" as
a semaphore, with two full kernel trips to take and release it. Since the code
being protected is usually fewer than 10 instructions, the ratio of overhead to
protected code is around 1,000. The saturation point can be a few dozen
threads.

In Windows (yes! Windows!) they use special opcodes that lock the bus to
provide atomic updates of in-memory variables. The overhead ratio is about 1 or
2, and the saturation point is a couple of orders of magnitude more threads
before thrashing occurs.

Windows is optimized for the hardware, yes. An unfair advantage. But there it
is.

I sincerely wish somebody in Linux library design would add the same unfair
special case. It's killing me in my large software project, dominating design
decisions, and it's completely unnecessary to be suffering like this.

