Your statements seem reasonable but then I read the part about the reference counting in sysfs which I have actually worked with in detail and I stare in disbelief.
Placing the extra bit of information in s_flags does not work because that breaks the atomic nature of the current test.
Furthermore you have missed one of the most interesting bits about the reference counting in sysfs. Holding just about any lock when removing a sysfs file and by extention a kobject or struct device is a fun way to dead lock the kernel without lockdep having a clue.
Posted Jun 9, 2009 12:15 UTC (Tue) by neilbrown (subscriber, #359)
[Link]
Hi.
Thanks for your comments!
Your "furthermore" observation about locks is probably quite true. However I cannot see exactly how it is relevant. I'm not advocating adding any locks in there.
Your statement that placing the information in s_flags "does not work" is one that I don't agree with. Seeing I didn't include this among the exercises, I will give the reason here.
Let us hypothesise a flag SYS_FLAG_ACTIVE which is set at the same time
that s_active is set to 0. This flag indicates that the entry is 'active'
and it implies a counted reference on the sysfs_dirent. So instead
of initialising s_active to 0, we initialise it to 1 (which is more in
keeping with the kref pattern anyway).
In place of the (rather complex) loop in sysfs_get_active, we have the
code
if (test_bit(SYS_FLAG_ACTIVE, &sd->s_flags)
&& atomic_inc_not_zero(&sd->s_active))
return sd;
else
return NULL;
Note that clearing the ACTIVE bit requires that we drop the reference
that it implies. If that was the last reference we don't need to
wait for any completion.
Finally sysfs_put_active becomes:
if (atomic_dec_and_test(&sd->s_active)) {
struct completion *cmpl = (void*)sd->s_sibling;
complete(cmpl);
}
A few moments thought shows that this now allows us to make sysfs_deactivate even
simpler.
This calls "complete" and then "wait_for_completion" in the same
thread, which is a bit unusual, but should work perfectly.
Providing support for atomic_inc_not_zero were added to kref (which
I have already advocated), this would allow s_active to become
a kref.
I believe the net result of these changes would be that the code is easier
to understand, and hence harder to break at a later date. They do not
change the functionality at all.