LWN.net

Why it's harder than it looks

Posted Aug 1, 2003 1:24 UTC (Fri) by Peter (guest, #1127)
In reply to: A different approach to module races by cpeterso
Parent article: A different approach to module races

"I don't understand why module ref-counting is so difficult."

Try it some time. Write your own patch for module ref-counting. But, so as not to repeat the mistakes of the past, keep in mind a few points:

  • A request for module removal can come at any time. Not just at "convenient" times. The module may be running an IRQ handler on a different CPU, for example, and in such a case the module reference count had better not be zero. You are not allowed to decree that "module removal will only be attempted when the user knows the system isn't using that device". Likewise, you have to assume an SMP system; if you ignore SMP concurrency, you aren't solving the problem.
  • A module cannot, in general, manipulate its own reference count. If a module decrements its own count and it goes to zero, the module could disappear - *poof* - while it is still executing, right in the middle of its own code path. Likewise, it rarely makes sense for a module to increment its own reference count: if the count was zero before the increment, the module could disappear - *poof* - before the increment happens. Incrementing one's own ref count is therefore, in general, no protection at all.
  • Any time you put yourself on a wait queue or similar, you need a reference. But keep in mind that, for reasons discussed above, it doesn't make much sense to take the reference yourself. Someone outside the module generally must do it for you.
  • Any time a module's data structures are registered in an external list (such as the list of filesystems), the module needs a reference. That reference should be dropped when the structure is unregistered - keeping in mind, again, that the module itself must not decrement its own count from 1 to 0.
  • Try to minimize your overhead. Taking and releasing a module reference on every interrupt is a Bad Idea.
  • Whatever solution you come up with, make sure it is easy to update and verify all device drivers for correct operation. That's actually the hardest part. There are hundreds of drivers in the kernel tree, and a lot of them are not module-removal-safe. Your scheme must not be overly complex, such that it's impossible to fix all these drivers. Ideally, you change things in the kernel core such that drivers become safe automatically.
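
Two of the rules above (the reference is taken by a caller outside the module, and removal can be requested at any moment) can be sketched in user space. This is only an illustration under stated assumptions, not the kernel's implementation: `struct toy_module` and every `toy_*` function are hypothetical names invented here, and C11 atomics stand in for the kernel's own machinery. The real kernel interfaces that play these roles are try_module_get() and module_put().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a module; NOT the kernel's struct module. */
struct toy_module {
    /* >= 0: module is live and the value is its reference count.
     * TOY_UNLOADING: the unloader has claimed the module. */
    atomic_int state;
};

#define TOY_UNLOADING (-1)

static void toy_module_init(struct toy_module *m)
{
    atomic_init(&m->state, 0);
}

/* Called by code OUTSIDE the module, before it jumps into module code.
 * Fails instead of resurrecting a module that is already going away. */
static bool toy_try_get(struct toy_module *m)
{
    int cur = atomic_load(&m->state);

    while (cur != TOY_UNLOADING) {
        /* On failure, cur is reloaded with the current value and we retry. */
        if (atomic_compare_exchange_weak(&m->state, &cur, cur + 1))
            return true;
    }
    return false;
}

/* Also called from outside the module, after module code has returned. */
static void toy_put(struct toy_module *m)
{
    int prev = atomic_fetch_sub(&m->state, 1);

    assert(prev > 0); /* a put without a matching get is a bug */
    (void)prev;
}

/* May be called at any time. Succeeds only if nobody holds a reference,
 * and from then on every toy_try_get() fails. */
static bool toy_try_unload(struct toy_module *m)
{
    int expected = 0;

    return atomic_compare_exchange_strong(&m->state, &expected,
                                          TOY_UNLOADING);
}
```

Because the "count is zero" test and the "mark as unloading" step are a single compare-and-swap, there is no window in which a new reference can slip in between them. The sketch says nothing about the hard part named in the last bullet: making hundreds of existing drivers go through such a gate.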

If you think you can solve this in a race-free manner, without imposing any interesting burdens on 99% of the modules out there (both in kernel and out of it), please post your patch to linux-kernel. I'm sure Rusty and Dave Miller will be happy to look at it.

(Note that Rusty, Dave and others have already solved all of the above problems, but they are left with a compliance issue in that most modules do not follow the rules they have set and thus are not "safe".)


Why it's harder than it looks

Posted Aug 1, 2003 16:48 UTC (Fri) by giraffedata (subscriber, #1954)

Here's one more that rarely gets attention:

- Any time you create a new thread executing the module code, you need a reference. The reference must exist from before the thread gets created to after the thread is dead.
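
The thread rule can be sketched in user space as follows. This is an illustration only: POSIX threads stand in for kernel threads, and `struct toy_module`, `struct toy_worker`, `toy_worker_start()`, `toy_worker_reap()` and `toy_double()` are all hypothetical names made up for the example. The point it shows is that the spawner takes the reference before the thread exists and drops it only after the thread has been joined; the thread itself never touches the count.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a module; NOT the kernel's struct module. */
struct toy_module {
    atomic_int refs;
};

struct toy_worker {
    struct toy_module *owner; /* module whose code the thread runs */
    pthread_t tid;
    int (*fn)(void *);        /* the "module code" */
    void *arg;
    int result;
};

/* Example "module code": doubles the int it is pointed at. */
static int toy_double(void *arg)
{
    return 2 * *(int *)arg;
}

static void *toy_trampoline(void *p)
{
    struct toy_worker *w = p;

    w->result = w->fn(w->arg);
    return NULL;
}

/* Runs outside the module: take the reference BEFORE the thread exists. */
static bool toy_worker_start(struct toy_worker *w, struct toy_module *m,
                             int (*fn)(void *), void *arg)
{
    w->owner = m;
    w->fn = fn;
    w->arg = arg;
    w->result = 0;

    atomic_fetch_add(&m->refs, 1);
    if (pthread_create(&w->tid, NULL, toy_trampoline, w) != 0) {
        atomic_fetch_sub(&m->refs, 1); /* no thread, so no reference */
        return false;
    }
    return true;
}

/* Runs outside the module: drop the reference only AFTER the thread is dead. */
static int toy_worker_reap(struct toy_worker *w)
{
    pthread_join(w->tid, NULL); /* the thread has fully exited once this returns */

    int prev = atomic_fetch_sub(&w->owner->refs, 1);
    assert(prev > 0);
    (void)prev;
    return w->result;
}
```

If the thread dropped the reference itself as its last act, it would still have a few instructions of module code left to run after the count hit zero, which is exactly the self-decrement problem from the parent comment.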

By the way, there are two kinds of references involved. One type prevents the module unloader from attempting an unload. The other tells the module itself whether it's safe to be unloaded. The difference is that the module is capable of undoing some references or waiting for them to go away, whereas the module unloader isn't. E.g. while the module has a major number registered, it has a reference and cannot unload. But you wouldn't want rmmod to fail just because that reference exists. The module cleanup function can undo that reference by unregistering the major number.
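
A small sketch of that distinction, under loud assumptions: everything here (`struct toy_module`, `toy_register_major()`, `toy_cleanup()`, `toy_rmmod()`) is a hypothetical user-space model, not kernel code, and it is deliberately single-threaded, so it illustrates only the two kinds of reference and ignores the races discussed in the parent comment.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a module; NOT the kernel's struct module. */
struct toy_module {
    int users;         /* kind 1: references the unloader must respect */
    int registrations; /* kind 2: references the module's cleanup can undo */
    bool loaded;
    void (*cleanup)(struct toy_module *);
};

/* Stand-ins for registering and unregistering a major number. */
static void toy_register_major(struct toy_module *m)
{
    m->registrations++;
}

static void toy_unregister_major(struct toy_module *m)
{
    assert(m->registrations > 0);
    m->registrations--;
}

/* The module's own cleanup function: it knows how to undo kind-2 references. */
static void toy_cleanup(struct toy_module *m)
{
    toy_unregister_major(m);
}

/* The unloader: refuses only on kind-1 references, which it cannot undo. */
static int toy_rmmod(struct toy_module *m)
{
    if (m->users > 0)
        return -EBUSY;

    /* A registered major number alone must not make rmmod fail. */
    if (m->cleanup != NULL)
        m->cleanup(m);

    assert(m->registrations == 0); /* cleanup must leave nothing behind */
    m->loaded = false;
    return 0;
}
```

With one registration and one user, `toy_rmmod()` returns `-EBUSY` and leaves the registration in place; once the user is gone, the same call runs the cleanup, which removes the registration, and the unload succeeds.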

Copyright © 2008, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds