

Scaling Existing Lock-based Applications with Lock Elision  - __Joker
http://queue.acm.org/detail.cfm?id=2579227

======
YZF
How is this different from your standard critical section that uses an atomic
(locked) test and set to determine if there is contention before trying to
acquire the lock? Is it just that you avoid the latency of the test and set
(which is still much, much smaller than that of the OS synchronization
objects)?

It sounds like the new hardware support allows you to speculatively continue
execution and fail later? Seems like the wrapping approach described in this
paper doesn't take advantage of that?

EDIT:
[http://en.wikipedia.org/wiki/Transactional_Synchronization_E...](http://en.wikipedia.org/wiki/Transactional_Synchronization_Extensions)

~~~
cwzwarich
With a traditional spin lock and critical section, only one core can actually
be executing code in the critical section at a time, and ownership of the
cache line containing the lock bounces around between cores.
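
For reference, a minimal sketch of that kind of traditional spin lock, in portable C++. The name `TtasSpinLock` and the self-check function are made up for illustration; this is not code from the article.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical test-and-test-and-set spin lock, for illustration only.
class TtasSpinLock {
public:
    void lock() {
        for (;;) {
            // Atomic test-and-set: exchange returns the previous value, so
            // `false` means this thread just took the lock. This write is
            // what pulls the lock's cache line over to the acquiring core.
            if (!held_.exchange(true, std::memory_order_acquire)) return;
            // Contended: spin on plain loads until the lock looks free,
            // then go back and retry the exchange.
            while (held_.load(std::memory_order_relaxed)) { }
        }
    }

    bool try_lock() {
        // Cheap read first, atomic exchange only if the lock looks free.
        return !held_.load(std::memory_order_relaxed) &&
               !held_.exchange(true, std::memory_order_acquire);
    }

    void unlock() { held_.store(false, std::memory_order_release); }

private:
    std::atomic<bool> held_{false};
};

// Single-threaded sanity check of the lock's state transitions.
inline bool spin_lock_self_check() {
    TtasSpinLock l;
    l.lock();
    bool blocked_while_held = !l.try_lock();  // second owner is refused
    l.unlock();
    bool free_after_unlock = l.try_lock();    // free again after unlock
    l.unlock();
    assert(blocked_while_held && free_after_unlock);
    return blocked_while_held && free_after_unlock;
}
```

Every successful `lock()` writes to `held_`, so even threads touching unrelated data inside the critical section serialize on that one cache line.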

With lock elision, two cores can both optimistically execute code within the
critical section and only abort when a dynamic data race (or some other abort
condition, like running out of space in a hardware buffer) occurs. In that
case, one core may take the lock and force the other to wait, just like the
traditional lock.
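
A rough sketch of that elision pattern, assuming Intel's RTM intrinsics from `<immintrin.h>` (`_xbegin`, `_xend`, `_xabort`) and a build with RTM enabled (e.g. `-mrtm`) for a CPU that actually supports it. `ElidedLock`, `kMaxRetries`, and `elided_count` are hypothetical names, the retry policy is an arbitrary choice, and without RTM at compile time the code simply behaves like a plain spin lock.

```cpp
#include <atomic>
#if defined(__RTM__)
#include <immintrin.h>  // _xbegin, _xend, _xabort, _mm_pause
#endif

// Hypothetical lock wrapper: try to run the critical section as a hardware
// transaction without taking the lock; fall back to really taking it.
class ElidedLock {
public:
    void lock() {
#if defined(__RTM__)
        for (int attempt = 0; attempt < kMaxRetries; ++attempt) {
            unsigned status = _xbegin();
            if (status == _XBEGIN_STARTED) {
                // Reading the lock puts it in the transaction's read set, so
                // a later real acquisition by another core aborts us.
                if (!held_.load(std::memory_order_relaxed)) return;  // speculating
                _xabort(0xff);  // lock is really held: abort explicitly
            }
            // Execution resumes here after any abort, with the cause in status.
            if (!(status & (_XABORT_RETRY | _XABORT_EXPLICIT))) break;  // e.g. capacity
            while (held_.load(std::memory_order_relaxed)) _mm_pause();
        }
#endif
        // Fallback path: an ordinary test-and-set spin lock.
        while (held_.exchange(true, std::memory_order_acquire)) {
            while (held_.load(std::memory_order_relaxed)) { }
        }
    }

    void unlock() {
#if defined(__RTM__)
        // If the lock word is still free, we never wrote it, so we must be
        // inside a transaction: commit it.
        if (!held_.load(std::memory_order_relaxed)) { _xend(); return; }
#endif
        held_.store(false, std::memory_order_release);
    }

private:
#if defined(__RTM__)
    static constexpr int kMaxRetries = 3;  // arbitrary, for illustration
#endif
    std::atomic<bool> held_{false};
};

// Usage: increment a counter n times under the (possibly elided) lock.
inline long elided_count(int n) {
    ElidedLock lock;
    long counter = 0;
    for (int i = 0; i < n; ++i) {
        lock.lock();
        ++counter;  // critical section
        lock.unlock();
    }
    return counter;
}
```

On the speculative path the lock word is only read, never written, so two cores whose critical sections touch different data can both commit without the lock's cache line bouncing between them.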

