>6.4.2 Atomicity Optimizations
>
>Adding a relatively expensive operation like a conditional jump (expensive in case the branch
prediction fails) seems to be counterproductive.
The Linux kernel uses a smarter strategy: it patches the 'lock' prefix out of its own code at
runtime, replacing it with NOPs, so that the code runs through neither a jump nor an
unconditional 'lock' ( http://lwn.net/Articles/164121/ ). This does not work reliably in
userspace because of W^X and similar restrictions, but at least it works for the kernel.
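
For comparison, the jump-based variant being discussed can be sketched in C roughly as follows. This is only an illustration, not code from the article or the kernel: the flag `multiple_threads` and the helper `counter_add` are hypothetical names, and the `__atomic_add_fetch` builtin assumes GCC or Clang.

```c
#include <stdbool.h>

/* Hypothetical flag: set to true once a second thread has been created. */
static bool multiple_threads = false;

/* Add delta to *counter and return the new value.
 * The atomic (locked) path is taken only when more than one thread exists;
 * otherwise a plain add is used. The price is the conditional branch. */
static long counter_add(long *counter, long delta)
{
    if (multiple_threads) {
        /* GCC/Clang builtin; on x86 this compiles to a lock-prefixed instruction. */
        return __atomic_add_fetch(counter, delta, __ATOMIC_SEQ_CST);
    }
    *counter += delta;  /* no lock prefix on the single-threaded path */
    return *counter;
}
```

The branch is tested on every call, which is the cost complained about above; the kernel's patching approach removes both the test and the prefix from the instruction stream.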