Hazard pointers

Posted Sep 13, 2024 19:59 UTC (Fri) by pbonzini (subscriber, #60935)
Parent article: The RCU API, 2024 edition

What is the intended use case for hazard pointers? Do you envision them used by "normal" code just like RCU, or only under special circumstances?

Hazard pointers

Posted Sep 14, 2024 13:28 UTC (Sat) by PaulMcKenney (✭ supporter ✭, #9624) [Link] (6 responses)

The thing that made it clear that hazard pointers are needed was the recent reworking of reference-count acquisition to better handle high contention for a case in which a reference needed to be acquired within an RCU read-side critical section.

Of course, we have other ways of dealing with this, for example per-CPU refcounts. Which work very well and thus see a lot of use. But in this case, their per-object memory overhead was too much for traffic to bear. A similar effect may often be achieved using SRCU, but only when it is OK for long readers to block updates. All in all, we have taken RCU quite a bit farther than I would have guessed possible 25 years ago, but we are starting to find areas where more is needed.

Hazard pointers is a good match for this situation. A reference may be acquired without contention, and no per-object storage is required.

To be sure, hazard pointers is limited in its own way: (1) Heavy barriers are needed for acquiring a reference, or, alternatively, IPIs or RCU readers. (2) Unlike RCU, hazard pointers can say "no" to a reference acquisition. But that was already the case in the aforementioned situation. (3) Like RCU, hazard pointers involves deferred reclamation, but is often more able to do emergency reclamations. (4) Hazard pointers is new code and will require some work to stabilize, just like RCU did, and sometimes still does.

So, I expect hazard pointers to initially be used only under special circumstances. Longer term, who knows?

Hazard pointers

Posted Oct 1, 2024 19:48 UTC (Tue) by jseigh (guest, #173778) [Link] (5 responses)

The implementation of an asymmetric memory barrier for hazard pointers, as well as other things, is going to have a vcpu preemption problem also.

Asymmetric memory barriers, that are used to eliminate the need for the expensive memory barrier in a hazard pointer read, look for events on the other processors which would constitute a memory barrier on the other processors. Examples would be context switches, cpu in a wait state, and ipi interrupt handling. The former 2 being used as quiescent states by RCU (if I'm guessing correctly), hence the use of RCU in an asymmetric memory barrier implementation. A vcpu wait state would also constitute a memory barrier, so if they were observable, they to could be used to avoid vcpu preemption issues in a asymmetric memory barrier implementation.

vcpu wait state detection should also be used for IPI implementations since the IPI's are also affected by vcpu preemption.

Hazard pointers continued and vcpu preemption

Posted Oct 2, 2024 14:27 UTC (Wed) by jseigh (guest, #173778) [Link]

A few additional comments.

The hazard pointer implementation in the kernel. Is there an API for that? Also if it's new are they looking for rust or c?

For the vm ballooning problem, some vm's are using pause loop detection for spin locks (although they apparently are not that good at figuring out the right target vcpu to switch to. If it was a global spin lock eventually most of the CPUs would trigger it so the vm would find the target by process of elimination. For RCU if a processor looked unresponsive you could do a spin loop with pauses on the vcpu's quiescent state until it changed or just long enough to trigger the PLE event. You might have to do this for all vcpus via IPI to get the vm to find the right target vcpu.

Hazard pointers

Posted Oct 2, 2024 19:43 UTC (Wed) by PaulMcKenney (✭ supporter ✭, #9624) [Link] (3 responses)

Here is the most recent patch series: https://lore.kernel.org/all/20241002010205.1341915-1-math...

There are earlier ones from Boqun Feng and Neeraj Upadhyay.

Mathieu is also working on a userspace hazard-pointer implementation, which I would expect to use sys_membarrier().

If I understand correctly, for hazard pointers and vCPU preemption, this is a performance problem rather than a correctness problem. If a vCPU is preempted, this will delay the RCU grace period, the IPI response, or the sys_membarrier() response, as the case may be. There have been a number of prototyped solutions to vCPU preemption, for example, based on priority boosting. So, should vCPU preemption become a problem in practice, there are a number of starting points for a solution.

But what is your preferred starting point?

Hazard pointers

Posted Oct 2, 2024 22:39 UTC (Wed) by jseigh (guest, #173778) [Link] (2 responses)

Re hazard pointer API. I was mainly curious since the c++26 hazard pointer API seems to be more complex than I would expect. If I did anything it would be in user space although I have a hazard pointer based proxy collector implementation that suites most of my needs. I don't use deferred reclamation with data structures requiring high rates of modifications, e.g. a linked list lock-free queue because
1) deferred reclamation would kill throughput
2) I already have a lock-free queue which is ABA proof, i.e. doesn't require deferred reclamation to work correctly.

Re vcpu preemption. I don't really have any preferred starting point. Just curious about that also. I did propose a new machine instruction for restartable sequences about 20 years ago. If we had something like that any deferred reclamation schemes based on that, it would have automatically worked in vm's much the same way LL/SC works in vm's without any special handling.

Hazard pointers

Posted Oct 3, 2024 12:01 UTC (Thu) by PaulMcKenney (✭ supporter ✭, #9624) [Link] (1 responses)

The C++ Hazard Pointers API was informed by some years of production experience with the implementation in the Folly library. As you might guess, Maged Michael was the force behind this implementation, the experience with it, and of course with the adjustments undertaken in response to that experience. On the other hand, and to your point, it is not unusual for such experience to result in increased complexity. Linux-kernel RCU is most certainly another example of this tendency, after all. ;-)

That said, I completely agree that if you can achieve your goals simply, then embrace simplicity. In particular, if a simple lock-free queue does the job for you, then by all means use that queue!

At one point, I had hoped that the various hardware transactional memory mechanisms would be useful for many things, including vCPU preemption. And perhaps someday they will. But at present, they seem to have fallen victim to the same "complexity surprise" called out above.

Hazard pointers

Posted Oct 17, 2024 12:32 UTC (Thu) by jseigh (guest, #173778) [Link]

I gave up on the idea of a hazard pointer implementation as soon as I realized it would involve writing a tracing garbage collector of sorts as part of that.

I did remove the hazard pointer logic from smrproxy. Absent the load of the address of the reader lock object, a lock action is 2 machine instructions, a load register followed by a store register. No loop so it is formally wait-free. Unlock is same as before, a store of 0 to memory. I'm starting on porting it to c++ which also involves figuring out what a c++ deferred reclamation API looks like.