Fun with modules

So... The feature freeze is in effect, the 2.5 kernel appears to be relatively stable (for this stage of development), and all seems well with the world. Then Rusty Russell's new module loader patch goes in, and all hell breaks loose. What's going on?

The inclusion of the module patch is consistent with the policy Linus laid out toward the end of October: the freeze date would be considered the deadline for submission to him. Linus would, when it seemed appropriate, merge new features after the deadline. He has done very little of that sort of merging, but the new module code was one of the exceptions.

There are a few problems with the new module subsystem, most of which stem from two facts: the job is not complete (i.e. features are missing), and many of the changes had not been seriously tested or reviewed before being merged. The work is not complete because Rusty never knew whether the patch would go in or not, and was busy enough just keeping it up to date with kernel releases. The lack of testing and review is explained by Rusty in this way:

Think back: who in their right mind would compile and test patches to a rapidly-changing kernel, when those changes required userspace tool changes and you didn't know if it was going to go in or not? If you care about modules in 2.5, you're probably a developer who needs modules to do their job, so why rock the boat?

In other words, the nature of the patch was such that the people who most needed to test it were disinclined to do so. Many of those people are the ones who are upset by the current state of affairs.

The initial module patch did, indeed, lack some features: little things like module parameters, device table support (needed for hotplug support), unloading of modules, a working modprobe, modversions, etc. In other words, when the module patch first went in, loadable modules stopped working for almost everybody. Broken features are not that unusual for a development kernel, but this is a much-used feature in a kernel that was supposed to be in a feature freeze, so people complained.

The situation was not helped by the fact that the first module patches were merged just as Rusty got on a plane to the other side of the world. Even so, he has been working frantically to fix up his patches and get them off to Linus. By the time 2.5.48 came out (the first actual kernel release with the new code), some of the worst omissions had been taken care of, and the rest are being addressed quickly. The level of complaints over missing features has dropped significantly.

Other sorts of complaints remain, however, as people try to actually make things work with the new scheme. The biggest controversy has related to Rusty's attempts to eliminate some of the race conditions that tend to crop up during module loading and unloading. A common bug found in module initialization routines is to make resources (e.g. a /proc file or a registered device) available to the kernel, then to fail module loading later on. If some other process has accessed that resource in the meantime, it could find itself trying to execute within a module that was never fully loaded.

Rusty's solution is to add a "live" flag to each module. Any code which calls into a module must first increase that module's reference count with the new try_module_get() function. This function will return a failure status if the live flag is not set. The flag remains cleared until the module initialization function has finished its work. This mechanism guarantees that a module's code will not be called until the module is ready and it is clear that the module load process will succeed. (It is also used to unload modules safely; see Rusty's FAQ for more information on how this all works.)
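
The live-flag idea can be sketched in ordinary userspace C. What follows is only a toy model under stated assumptions: the toy_module structure and every toy_* function are hypothetical names invented for illustration, the code is single-threaded, and it makes no attempt to reproduce the locking or the real data structures behind try_module_get() in the kernel.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of a module; NOT the kernel's struct module. */
struct toy_module {
    bool live;      /* set only once initialization has finished */
    int  refcount;  /* references held by callers into the module */
};

/* Modeled on the behavior described for try_module_get(): take a
 * reference only if the module is live, and report failure otherwise. */
static bool toy_try_module_get(struct toy_module *m)
{
    if (!m->live)
        return false;
    m->refcount++;
    return true;
}

/* Drop a reference taken with toy_try_module_get(). */
static void toy_module_put(struct toy_module *m)
{
    assert(m->refcount > 0);
    m->refcount--;
}

/* Load a module: run its init function with the live flag cleared,
 * and set the flag only if initialization succeeded. */
static bool toy_load(struct toy_module *m, int (*init)(void))
{
    m->live = false;
    m->refcount = 0;
    if (init() != 0)
        return false;   /* failed load: the module never becomes callable */
    m->live = true;
    return true;
}

/* Two stand-in init functions (assumptions for the demonstration). */
static int toy_init_ok(void)   { return 0; }
static int toy_init_fail(void) { return -1; }
```

In this model, toy_try_module_get() refuses any caller until toy_load() has returned success, and a module whose init function fails never becomes callable at all.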

The problem is that, sometimes, there are legitimate reasons for wanting to call into a module before that module has finished initialization. For example, when a disk driver registers a disk, the upper layers immediately want to have a look at the partition table. Under the new scheme, that look would fail (since the module was not yet marked as being alive) and the drive's partitions would not be registered. Thus, a patch which was intended to fix theoretical problems (very few people have actually been bitten by module load race conditions) ended up creating real problems with drivers that, previously, had been working just fine. That did not go over particularly well.

This problem has been fixed by marking a module as being alive while its initialization function runs. In other words, initialization is, once again, unprotected, and driver authors need to be very careful to not export any interface to the rest of the kernel until they are ready for that interface to be used. Which makes basic sense.
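
The disk-driver case and its fix can be illustrated with another small userspace model. Everything here is an assumption made for the sketch: the toy_* names are hypothetical, the "partition scan" is just a counter set to an arbitrary value, and the live_during_init switch exists only to compare the two policies side by side.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a disk driver module; not kernel code. */
struct toy_module {
    bool live;
    int  refcount;
    int  partitions_found;  /* result of the upper layer's scan */
};

static bool toy_try_module_get(struct toy_module *m)
{
    if (!m->live)
        return false;
    m->refcount++;
    return true;
}

static void toy_module_put(struct toy_module *m)
{
    assert(m->refcount > 0);
    m->refcount--;
}

/* Stand-in for the upper layers: as soon as a disk is registered,
 * they call back into the driver to read the partition table. */
static void toy_register_disk(struct toy_module *owner)
{
    if (!toy_try_module_get(owner))
        return;                     /* driver not callable: no partitions */
    owner->partitions_found = 4;    /* arbitrary value for the sketch */
    toy_module_put(owner);
}

/* The driver's init function registers its disk before returning. */
static int toy_disk_init(struct toy_module *self)
{
    toy_register_disk(self);
    return 0;
}

/* live_during_init == false models the original scheme;
 * live_during_init == true models the fix described above. */
static int toy_load(struct toy_module *m, bool live_during_init)
{
    int err;

    m->refcount = 0;
    m->partitions_found = 0;
    m->live = live_during_init;
    err = toy_disk_init(m);
    m->live = (err == 0);
    return err;
}
```

With live_during_init false, the scan is refused and the model ends up with no partitions even though the load itself succeeds. With it true, the scan goes through, at the price that nothing stops a caller from entering the module before its init function has finished.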

Driver code also needs, in many cases, to be more fault tolerant. Rusty asked a related question: how does one register two /proc files? If the registration of the second file fails, there is no way to safely unregister the first one and fail the module load. Linus's answer makes basic sense once you look at it: the module simply cannot fail to load at that point. Once the module has exported an interface, it must be there to handle uses of that interface. It is better to simply do without the failed /proc file than to fail the whole load and risk race conditions. The complexity required to allow failing at any time is not justified by the benefits.
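
As a rough illustration of that rule, here is a userspace sketch of an init function that registers two files. The names are all hypothetical: toy_proc_table and toy_proc_create() stand in for whatever registration interface a driver uses and are not the kernel's /proc API, and the fixed capacity exists only so that a registration failure can be provoked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical registry with limited room, so registration can fail. */
struct toy_proc_table {
    int capacity;
    int used;
};

/* Hypothetical driver state: which files were actually created. */
struct toy_driver {
    bool have_status;
    bool have_stats;
};

/* Stand-in for a registration call; fails when the table is full. */
static bool toy_proc_create(struct toy_proc_table *t, const char *name)
{
    if (t->used >= t->capacity) {
        fprintf(stderr, "toy: could not create %s\n", name);
        return false;
    }
    t->used++;
    return true;
}

static int toy_driver_init(struct toy_proc_table *t, struct toy_driver *d)
{
    assert(t != NULL && d != NULL);
    d->have_status = false;
    d->have_stats = false;

    /* Nothing has been exported yet, so failing the load is still safe. */
    if (!toy_proc_create(t, "toy_status"))
        return -1;
    d->have_status = true;

    /* The first file is now visible to other users. From here on the
     * load must not fail: a failed second file is simply done without. */
    d->have_stats = toy_proc_create(t, "toy_stats");

    return 0;
}
```

The only failure the init function reports is the one that happens before anything has been exported. A table with room for a single entry still yields a successful load, just without the second file.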

Various other problems (such as the requirement that every module have an initialization function, or explicitly include a no_module_init line) are being worked out. Before too long, with luck, modules will just work again (better than before), and the kernel developers will be arguing about something else.

Copyright © 2002, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds