I explained that in the original version, but the part disappeared in the LWN edits somehow.
Lock elision is a general concept that can be implemented in different ways.
XACQUIRE/XRELEASE (HLE) is just one possible way to do it using hardware.
The example code in the article is another, more explicit way.
Explicit transactions with RTM are generally more flexible and more powerful, and glibc doesn't need (and in fact cannot use) the compatibility advantages of HLE.
The lock-appears-unlocked issue is a relatively obscure, minor drawback for pthreads.
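
For anyone who wants to see the explicit variant spelled out, here is a rough sketch of RTM-based elision around a simple spinlock. It is not the glibc code and not the code from the article. The names elided_lock, elided_acquire, elided_release and cpu_has_rtm are made up for illustration, and it assumes GCC or Clang (for <cpuid.h>, <immintrin.h> and the target attribute). A real implementation would add retry and adaptation logic; this one tries a single transaction and then falls back to taking the lock.

```c
#include <stdatomic.h>
#include <stdbool.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>      /* __get_cpuid_count (GCC/Clang) */
#include <immintrin.h>  /* _xbegin, _xend, _xabort */
#define HAVE_RTM_BUILD 1
#define RTM_FN __attribute__((target("rtm")))  /* enable RTM without -mrtm */
#else
#define HAVE_RTM_BUILD 0                       /* other CPUs: plain lock only */
#define RTM_FN
#endif

/* Hypothetical lock type for this sketch: 0 = free, 1 = held. */
typedef struct { atomic_int locked; } elided_lock;

/* Runtime check: CPUID leaf 7, subleaf 0, EBX bit 11 reports RTM. */
static bool cpu_has_rtm(void)
{
#if HAVE_RTM_BUILD
    static atomic_int cached = -1;  /* -1 = not probed yet */
    int v = atomic_load_explicit(&cached, memory_order_relaxed);
    if (v < 0) {
        unsigned eax, ebx, ecx, edx;
        v = __get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx)
            && (ebx & (1u << 11));
        atomic_store_explicit(&cached, v, memory_order_relaxed);
    }
    return v != 0;
#else
    return false;
#endif
}

static RTM_FN void elided_acquire(elided_lock *l)
{
#if HAVE_RTM_BUILD
    if (cpu_has_rtm() && _xbegin() == _XBEGIN_STARTED) {
        /* Reading the lock word puts it in the transaction's read set,
         * so a real acquisition by another thread aborts us. */
        if (atomic_load_explicit(&l->locked, memory_order_relaxed) == 0)
            return;      /* elided: the lock word is never written */
        _xabort(0xff);   /* lock is really held: abort the transaction */
    }
    /* An abort resumes at _xbegin() with a status other than
     * _XBEGIN_STARTED, so control falls through to the fallback. */
#endif
    int expected = 0;
    while (!atomic_compare_exchange_weak_explicit(&l->locked, &expected, 1,
               memory_order_acquire, memory_order_relaxed))
        expected = 0;    /* spin until the lock is actually taken */
}

static RTM_FN void elided_release(elided_lock *l)
{
#if HAVE_RTM_BUILD
    /* A free lock word here means we are inside an elided section.
     * This is also the lock-appears-unlocked effect: the holder itself
     * sees 0. Releasing a lock that was never acquired would fault. */
    if (atomic_load_explicit(&l->locked, memory_order_relaxed) == 0) {
        _xend();         /* commit the transaction */
        return;
    }
#endif
    atomic_store_explicit(&l->locked, 0, memory_order_release);
}
```

Usage is the same as for a normal lock: call elided_acquire, run the critical section, call elided_release. If the transaction aborts partway through the critical section, the hardware rolls everything back to the _xbegin and the section runs again under the real lock.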