Sleepable RCU

Posted Mar 9, 2009 13:42 UTC (Mon) by dvyukov (subscriber, #57055)
Parent article: Sleepable RCU

Hi Paul,

I am curious as to why you allow only a single outstanding SRCU callback per thread. The problem with RCU is that it allows a basically unbounded number of outstanding callbacks, so why not just bound the number of outstanding callbacks in SRCU? Memory blocks are frequently quite small, so a subsystem can tolerate up to, let's say, 1000 pending memory blocks. The restriction to a single pending callback looks quite severe (it may cause unnecessary blocking), so why not provide:
int init_srcu_struct(struct srcu_struct *sp, int limit_of_pending_callbacks);
While the limit is not reached, synchronize_srcu() is non-blocking; otherwise it waits for a grace period. I think in many situations this would make synchronize_srcu() practically non-blocking.
Or is it just not worth doing (because of the additional implementation complexity)?
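
A minimal userspace sketch of the bounded scheme proposed above, under stated assumptions: `struct bounded_retire`, `bounded_retire()`, `bounded_retire_flush()` and `stub_synchronize_srcu()` are hypothetical names invented for illustration and are not part of the kernel's SRCU API. The stub only counts grace-period waits instead of performing one. Unlike a real callback mechanism, this sketch frees nothing until the limit is hit or a flush is requested, so it shows only the "non-blocking until the limit is reached" behaviour.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kernel's synchronize_srcu(): it only counts
 * how many grace-period waits would have been needed. */
static int grace_period_waits;
static void stub_synchronize_srcu(void) { grace_period_waits++; }

#define MAX_PENDING 1000

/* Hypothetical per-subsystem state; not a real SRCU structure. */
struct bounded_retire {
    void *pending[MAX_PENDING]; /* blocks unlinked but not yet freed */
    int   npending;
    int   limit;                /* plays the role of limit_of_pending_callbacks */
};

static void bounded_retire_init(struct bounded_retire *br, int limit)
{
    assert(limit > 0 && limit <= MAX_PENDING);
    br->npending = 0;
    br->limit = limit;
}

/* Wait for one grace period, then free every block queued before it. */
static void bounded_retire_flush(struct bounded_retire *br)
{
    if (br->npending == 0)
        return;
    stub_synchronize_srcu();
    for (int i = 0; i < br->npending; i++)
        free(br->pending[i]);
    br->npending = 0;
}

/* Queue a block that has already been unlinked from the shared structure.
 * Returns immediately while below the limit; waits only when the limit
 * has been reached. */
static void bounded_retire(struct bounded_retire *br, void *block)
{
    if (br->npending == br->limit)
        bounded_retire_flush(br);
    br->pending[br->npending++] = block;
}
```

With a limit of 1000, a caller retiring blocks one at a time would pay for a grace-period wait at most once per 1000 retirements in this sketch, at the cost of holding up to 1000 blocks of memory.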

One more question (it does not directly relate to SRCU, but I remember you provided some calculations somewhere regarding the required number of grace periods; I hope I am not mixing up your reasoning now).
You are removing all memory fences from the reader side, including the release fence in read_unlock(). To compensate for this, you wait for additional grace periods before executing callbacks. But on some architectures (IA-32, Intel 64, SPARC TSO) a release fence is implied by every store, so isn't it possible to reduce the number of required grace periods before executing callbacks on these architectures?
I.e. something like:
#ifdef ACQUIRE_RELEASE_FENCES_ARE_IMPLIED_ON_ARCH // defined for x86 etc
Have you considered such a variant? Is it worth doing?

Thank you.

Dmitriy V'jukov

Sleepable RCU

Posted Jun 17, 2009 17:37 UTC (Wed) by PaulMcKenney (subscriber, #9624)

Hello, Dmitriy!

Sorry for the delay, but I don't receive email notification of new top-level comments, and so I just now saw your comment.

Your suggestion is quite reasonable, and perhaps someday something similar will be implemented. However, none of the people using or thinking of using SRCU (at least none that I am aware of) need more than one callback per thread. One way that they can get the effect with the current API is to remove a number of items in one pass, use one synchronize_srcu() to wait, then free up the items.
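
The batching pattern described above can be sketched in userspace C as follows. This is an illustration under stated assumptions, not kernel code: `fake_synchronize_srcu()` is a hypothetical stub that only counts calls, `struct item`, `push_item()` and `reap_dead_items()` are made-up names, and the plain pointer stores stand in for the RCU list primitives and updater-side locking that real code would need.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for synchronize_srcu(); counts calls only. */
static int srcu_waits;
static void fake_synchronize_srcu(void) { srcu_waits++; }

struct item {
    struct item *next;      /* link seen by readers; left intact on removal */
    struct item *reap_next; /* private link used only by the updater */
    int key;
    int dead;               /* nonzero if the item should be removed */
};

/* Test/usage helper: prepend a new item and return the new head. */
static struct item *push_item(struct item *head, int key, int dead)
{
    struct item *it = malloc(sizeof(*it));
    assert(it != NULL);
    it->next = head;
    it->reap_next = NULL;
    it->key = key;
    it->dead = dead;
    return it;
}

/* Remove all dead items in one pass, wait for a single grace period,
 * then free them. Returns the number of items freed. */
static int reap_dead_items(struct item **head)
{
    struct item *doomed = NULL;
    struct item **pp = head;
    int nfreed = 0;

    /* Pass 1: unlink every dead item and collect it on a private list.
     * The item's own next pointer is not modified, so a reader still
     * standing on a removed item can continue its traversal. */
    while (*pp) {
        struct item *it = *pp;
        if (it->dead) {
            *pp = it->next;
            it->reap_next = doomed;
            doomed = it;
        } else {
            pp = &it->next;
        }
    }
    if (!doomed)
        return 0;

    /* One wait covers every item unlinked above. */
    fake_synchronize_srcu();

    /* Pass 2: no reader can still hold a reference, so free them all. */
    while (doomed) {
        struct item *it = doomed;
        doomed = it->reap_next;
        free(it);
        nfreed++;
    }
    return nfreed;
}
```

The separate `reap_next` field is a design choice of this sketch: reusing `next` for the private list would redirect readers that are still traversing a removed item.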

I did submit a patch in late 2006 that reduced the number of synchronize_sched() calls in synchronize_srcu() at the expense of barriers on the read side, but the speedup was insufficient, so Oleg Nesterov did QRCU instead. :-)

I would expect that SRCU might get more attention as it becomes more heavily used and people have additional things that they need it to do. RCU has always been very usage-driven.

Thanx, Paul

Copyright © 2017, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds