I explained that in the original version, but the part disappeared in the LWN edits somehow.
Lock elision is a general concept that can be implemented in different ways.
XACQUIRE/XRELEASE (HLE) is just one possible way to do it using hardware.
The example code in the article is another, more explicit way.
Explicit transactions with RTM are generally more flexible and more powerful, and glibc doesn't need (and in fact cannot use) the compatibility advantages of HLE.
The lock-appears-unlocked thing is a relatively obscure minor drawback for pthreads.
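For readers who want to see the explicit RTM approach spelled out, here is a minimal sketch. It is not the glibc code: the elided_lock type and the tx_begin/tx_end/tx_abort wrappers are names made up for illustration, the fallback is a bare spinlock, and there is no retry or adaptation logic. The RTM intrinsics are only used when the compiler defines __RTM__ (for example with -mrtm), and the sketch then assumes the CPU really supports RTM; otherwise every transaction simply "fails" and the lock is taken for real.

```c
#include <stdatomic.h>
#include <stdbool.h>

#if defined(__RTM__)
#include <immintrin.h>          /* _xbegin, _xend, _xabort */
/* Assumption: a build with RTM enabled runs on a CPU that supports it. */
static inline bool tx_begin(void) { return _xbegin() == _XBEGIN_STARTED; }
static inline void tx_end(void)   { _xend(); }
static inline void tx_abort(void) { _xabort(0xff); }
#else
/* No RTM at build time: a transaction never starts, so we always lock. */
static inline bool tx_begin(void) { return false; }
static inline void tx_end(void)   { }
static inline void tx_abort(void) { }
#endif

/* Hypothetical lock type for this sketch: 0 = free, 1 = held. */
typedef struct { atomic_int locked; } elided_lock;

void elided_lock_acquire(elided_lock *l)
{
    if (tx_begin()) {
        /* Reading the lock word adds it to the transaction's read set,
         * so a real acquisition by another thread aborts us. */
        if (atomic_load_explicit(&l->locked, memory_order_relaxed) == 0)
            return;             /* run the critical section transactionally */
        tx_abort();             /* lock is held: roll back to tx_begin() */
    }
    /* Fallback path: really take the lock (simple spinlock). */
    int expected = 0;
    while (!atomic_compare_exchange_weak_explicit(&l->locked, &expected, 1,
                                                  memory_order_acquire,
                                                  memory_order_relaxed))
        expected = 0;
}

void elided_lock_release(elided_lock *l)
{
    /* Inside an elided section the lock still reads as free; that is the
     * "lock appears unlocked" effect, and here it tells us to commit. */
    if (atomic_load_explicit(&l->locked, memory_order_relaxed) == 0)
        tx_end();
    else
        atomic_store_explicit(&l->locked, 0, memory_order_release);
}
```

Note that nothing on the elided path writes to the lock word. As in the article's example, releasing a lock that was never acquired is undefined here: with RTM enabled it would execute _xend outside a transaction.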
Posted Jan 31, 2013 19:38 UTC (Thu) by luto (subscriber, #39314)
Now I'm curious. Why can't glibc use XACQUIRE/XRELEASE?
Lock elision in the GNU C library
Posted Jan 31, 2013 21:35 UTC (Thu) by andikleen2 (guest, #52506)
The main advantage of HLE is compatibility: you don't need a separate code path for elision. But successful elision requires that there be no extra writes to the lock's cache line. glibc always writes to the lock's owner field in the default code path, so you would need a separate code path anyway to disable that.
With a separate code path (lock type) it's possible to use HLE. I implemented that, but later removed the support because it was redundant with RTM, which is the better fit here because it's more flexible.
Posted Jan 31, 2013 23:08 UTC (Thu) by hpa (subscriber, #48575)
TSX exports two interfaces. XACQUIRE/XRELEASE (Hardware Lock Elision, HLE) is actually the less powerful of the two. What Andi describes here uses the other interface, Restricted Transactional Memory (RTM), which is far more powerful.
The one advantage of HLE over RTM is that you can add HLE to existing locking code, so that code running on both TSX-enabled and non-TSX-enabled hardware can still use a single code path.
So the question isn't "why can't glibc use XACQUIRE/XRELEASE", but rather "why doesn't it have to", and the answer, of course, is that glibc already supports multiple kinds of locks.
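To make the single-code-path point concrete, here is a rough sketch of a spinlock with HLE hints, modeled on the GCC atomic builtins. The HLE_ACQ/HLE_REL macros and the hle_lock/hle_unlock names are made up for this illustration; on compilers that do not define the GCC x86 macros __ATOMIC_HLE_ACQUIRE and __ATOMIC_HLE_RELEASE, the hints are assumed to collapse to nothing, leaving an ordinary spinlock.

```c
/* Use the GCC x86 HLE hint bits when available, otherwise no hint. */
#if defined(__ATOMIC_HLE_ACQUIRE) && defined(__ATOMIC_HLE_RELEASE)
#define HLE_ACQ __ATOMIC_HLE_ACQUIRE
#define HLE_REL __ATOMIC_HLE_RELEASE
#else
#define HLE_ACQ 0
#define HLE_REL 0
#endif

/* Same code on every CPU: hardware without HLE ignores the XACQUIRE
 * prefix and this is a plain atomic exchange. */
void hle_lock(int *lock)
{
    while (__atomic_exchange_n(lock, 1, __ATOMIC_ACQUIRE | HLE_ACQ) != 0)
        ;                       /* spin until the lock is free */
}

/* The store must restore the value the lock had before hle_lock()
 * (0 here) for the elided region to commit. */
void hle_unlock(int *lock)
{
    __atomic_store_n(lock, 0, __ATOMIC_RELEASE | HLE_REL);
}
```

The locking code has the same shape it would have without elision, which is the compatibility advantage; the cost is that any additional write to the lock's cache line inside the critical section, such as an owner field stored next to the lock word, defeats the elision.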
Posted Jan 31, 2013 23:11 UTC (Thu) by luto (subscriber, #39314)
Except that, if I remember correctly, the HLE interface maintains the illusion locally that you've locked the lock, which would allow recursion detection to work without adding a second cache line to the lock.