LWN: Comments on "Linux kernel design patterns - part 1" http://lwn.net/Articles/336224/ This is a special feed containing comments posted to the individual LWN article titled "Linux kernel design patterns - part 1". hourly 2 It's a comma operator. http://lwn.net/Articles/492559/rss 2012-04-15T23:47:57+00:00 alison <div class="FormattedComment"> Posted Jun 11, 2009 0:56 UTC (Thu) by xoddam notes<br> "the infamous <a href="http://en.wikipedia.org/wiki/Comma_operator">http://en.wikipedia.org/wiki/Comma_operator</a>".<br> <p> Humorously, I just had a look at the Wikipedia article. "Huh, I thought I understood the comma operator!" Copied the code in question out, compiled and tested it, and sure enough the latest revision to the article was erroneous. So yeah, I guess the Comma Operator causes some confusion!<br> <p> -- Alison<br> </div> One use for reference-count biases... http://lwn.net/Articles/473452/rss 2011-12-23T05:30:26+00:00 atiqure <div class="FormattedComment"> hii.. i dnt know who r u but i need help... please tell me something about how to develop a kernel without using any source i just want to use programming language like c,c++,java and assembly language can i do that with these languages ..if yes then please tell me from where to start and how ...pls help it will be christmas gift for me from you ... <br> </div> Linux Kernel Design Patterns - Part 1 http://lwn.net/Articles/337248/rss 2009-06-12T12:39:09+00:00 smitty_one_each <div class="FormattedComment"> Luke Palmer, Dana:<br> <a href="http://lukepalmer.wordpress.com/2009/06/12/the-role-of-a-core-calculus/">http://lukepalmer.wordpress.com/2009/06/12/the-role-of-a-...</a><br> </div> Linux Kernel Design Patterns - Part 1 http://lwn.net/Articles/337244/rss 2009-06-12T12:05:50+00:00 intgr <div class="FormattedComment"> <font class="QuotedText">&gt; Sometimes I dream of a kernel written in a high-level programming language...</font><br> There are many such projects, including Inferno from Bell Labs, Singularity from Microsoft (!) and a whole lot of others.<br> It's an interesting research area because it completely throws away decades-old CPU memory protection, because typesafe virtual machines guarantee that you can't access anything you don't have a reference to.<br> <p> Obviously none of these is usable as a desktop OS though.<br> <p> </div> Linux Kernel Design Patterns - Part 1 http://lwn.net/Articles/337077/rss 2009-06-11T14:41:48+00:00 marcH <div class="FormattedComment"> &lt;troll&gt;<br> Ha! Now I know why kernel development is an order of magnitude harder than developing user level applications. This is not just the intrinsic problem of most bugs crashing the whole system. This is also the problem that advanced software engineering concepts like code reuse are only starting to reach these shores!<br> &lt;/troll&gt;<br> <p> Just for fun: <a href="http://lwn.net/Articles/306843/">http://lwn.net/Articles/306843/</a><br> <p> Sometimes I dream of a kernel written in a high-level programming language...<br> <p> </div> It's a comma operator. http://lwn.net/Articles/336966/rss 2009-06-11T00:56:15+00:00 xoddam <div class="FormattedComment"> This is C. In C, we have many ways of camouflaging incorrect code, including the infamous <a href="http://en.wikipedia.org/wiki/Comma_operator">http://en.wikipedia.org/wiki/Comma_operator</a>.<br> </div> Typo http://lwn.net/Articles/336924/rss 2009-06-10T20:32:50+00:00 intgr <pre> 3 if (atomic_dec_and_lock(&amp;obj->refcnt), &amp;subsystem_lock) { </pre> Either we've got variable argument if statements now, or someone put the closing parenthese in the wrong place. kref_get_not_zero() http://lwn.net/Articles/336796/rss 2009-06-09T20:42:15+00:00 neilbrown <p> Thanks for that reference. <p> I cannot comment on the code under the microscope in that thread as I am not familiar with it at all. However your observation that kref_get_not_zero() can be safe when used under a lock that excludes the object from being freed is one that I completely agree with. It is therefore tempting to call the function kref_get_locked() to match the inode_get_locked that already exists. <p> However it can be valid to use atomic_inc_not_zero() when not holding a lock, as in the example in a previous thread. In that case it is safe not because a lock is held, but because a secondary reference is held (s_count). <p> Either way, the complete pattern description for kref would need to spell out how and when to use kref_get_not_zero() (or whatever it gets called). <p> I must say that Greg's 'ick' in the reply to your linked post surprised me. kref_get_not_zero is (in my view) much less 'ick' than kref_set which is part of the current kref API (exercise: find all 3 places that use kref_set and replace them with calls to other parts of the api). Unfortunate wording http://lwn.net/Articles/336762/rss 2009-06-09T14:44:28+00:00 jake <div class="FormattedComment"> <font class="QuotedText">&gt; Is there a missing "not"?</font><br> <p> oops ... fixed now, thanks!<br> <p> jake<br> </div> Unfortunate wording http://lwn.net/Articles/336756/rss 2009-06-09T14:21:34+00:00 jreiser <i>Sadly, the kref package does make use of this primitive either.</i> [in section <b>The "kref" style</b> at the end of paragraph three.] <p>Please correct this confusing usage. Is there a missing "not"? kref_get_not_zero() http://lwn.net/Articles/336753/rss 2009-06-09T14:11:19+00:00 abatters <div class="FormattedComment"> <font class="QuotedText">&gt; Sadly, the kref package does make use of this primitive either.</font><br> <p> Heh. I suggested introducing kref_get_not_zero() (based on atomic_inc_not_zero()) back in January, but got a lot of resistance. Better luck to the next guy.<br> <p> <a href="http://marc.info/?l=linux-scsi&amp;m=123196509202149&amp;w=4">http://marc.info/?l=linux-scsi&amp;m=123196509202149&amp;w=4</a><br> </div> One use for reference-count biases... http://lwn.net/Articles/336752/rss 2009-06-09T13:50:50+00:00 PaulMcKenney Hmmmm... I was thinking more in terms of something like the following:<P> <code> preempt_disable();<br> if (per_cpu_counter > 0)<br> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;per_cpu_counter++;<br> else<br> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;do some costly global-lock-and-variable thing<br> preempt_enable();<br> </code> <p>But it has been a good ten years since I messed with this, so I should take another look at it. <p>In any case, I very much agree with your overall premise that higher-level primitives are a very great improvement over continually re-inventing the wheel, most especially for those wheels with a strong history of being re-invented badly!!! Linux Kernel Design Patterns - Part 1 http://lwn.net/Articles/336747/rss 2009-06-09T12:15:07+00:00 neilbrown <p> Hi. <p> Thanks for your comments! <p> Your "furthermore" observation about locks is probably quite true. However I cannot see exactly how it is relevant. I'm not advocating adding any locks in there. <p> Your statement that placing the information in s_flags "does not work" is one that I don't agree with. Seeing I didn't include this among the exercises, I will give the reason here. <p> Let us hypothesise a flag SYS_FLAG_ACTIVE which is set at the same time that s_active is set to 0. This flag indicates that the entry is 'active' and it implies a counted reference on the sysfs_dirent. So instead of initialising s_active to 0, we initialise it to 1 (which is more in keeping with the kref pattern anyway). <p> In place of the (rather complex) loop in sysfs_get_active, we have the code <pre> if (test_bit(SYS_FLAG_ACTIVE, &amp;sd->s_flags) &amp;&amp; atomic_inc_not_zero(&amp;sd->s_active)) return sd; else return NULL; </pre> For me, that is easier to understand. <p> sysfs_deactivate then becomes <pre> sd->s_sibling = (void *)&amp;wait; clear_bit(SYS_FLAG_ACTIVE, &amp;sd->s_flags); if (! atomic_dec_and_test(&amp;sd->s_active)) wait_for_completion(&amp;wait); sd->s_sibling = NULL; </pre> Note that clearing the ACTIVE bit requires that we drop the reference that it implies. If that was the last reference we don't need to wait for any completion. <p> Finally sysfs_put_active becomes: <pre> if (atomic_dec_and_test(&amp;sd->s_active)) { struct completion *cmpl = (void*)sd->s_sibling; complete(cmpl); } </pre> A few moments thought shows that this now allows us to make sysfs_deactivate even simpler. <pre> sd->s_sibling = (void *)&amp;wait; clear_bit(SYS_FLAG_ACTIVE, &amp;sd->s_flags); sysfs_put_active(sd); wait_for_completion(&amp;wait); </pre> This calls "complete" and then "wait_for_completion" in the same thread, which is a bit unusual, but should work perfectly. <p> Providing support for atomic_inc_not_zero were added to kref (which I have already advocated), this would allow s_active to become a kref. <p> I believe the net result of these changes would be that the code is easier to understand, and hence harder to break at a later date. They do not change the functionality at all. Linux Kernel Design Patterns - Part 1 http://lwn.net/Articles/336744/rss 2009-06-09T10:24:37+00:00 ebiederm <div class="FormattedComment"> Your statements seem reasonable but then I read the part about the reference counting in sysfs which I have actually worked with in detail and I stare in disbelief.<br> <p> Placing the extra bit of information in s_flags does not work because that breaks the atomic nature of the current test.<br> <p> Furthermore you have missed one of the most interesting bits about the reference counting in sysfs. Holding just about any lock when removing a sysfs file and by extention a kobject or struct device is a fun way to dead lock the kernel without lockdep having a clue.<br> <p> </div> One use for reference-count biases... http://lwn.net/Articles/336720/rss 2009-06-09T05:39:48+00:00 neilbrown <p> Yes, I wouldn't be surprised if reference counts that are spread across multiple per-cpu variables would have very different trade-offs and hence different Patterns to the more traditional kinds! <p> I'm having trouble picturing exactly how the counters you describe would work (and particularly exactly what happens when the per-cpu counter hits zero) but it seems possible that the "one bit of information" that I claim a bias stores would, in this case, be the bit "Someone cares about a precise total value" and that one bit could conceivably be stored in a read-mostly cache line that could be easily shared among multiple processors. <p> So the code might look like: <pre> if (dec_and_test(per_cpu_counter)) if (__test_bit(flag_name, &amp;flag_variable)) do some costly cross-cpu thing; </pre> <p> Is there any chance that would achieve the same result? <p> It may well be a case where you don't want to pay the price of an extra bit though. One use for reference-count biases... http://lwn.net/Articles/336719/rss 2009-06-09T04:36:44+00:00 PaulMcKenney <div class="FormattedComment"> One use for reference-count biases is where the reference count is modified often enough that it must be represented as a per-CPU variable, as can happen when the reference count represents the number of outstanding I/Os. However, unless the reference count tends to be much larger than the number of CPUs, you get no benefit from the per-CPU nature of the variable. Adding in a large bias, on the other hand, allows the counter to operate on a completely per-CPU basis under normal operating conditions.<br> <p> This trick is useful in cases where it is only necessary to actually test the counter for zero-or-not in special situations, such as when attempting to remove or reconfigure the I/O device. When such a situation arises, one subtracts out the bias, and then runs inefficiently until the I/Os drain and the reference count goes to zero, at which point the device may be safely removed or reconfigured. The bias may be added back in to return to normal operating conditions.<br> <p> A very specialized use of a reference count, perhaps, but one that has actually appeared in real code in past lives.<br> </div>