Fun with modules
[Posted November 20, 2002 by corbet]
So... The feature freeze is in effect, the 2.5 kernel appears to be
relatively stable (for this stage of development), and all seems well with
the world. Then Rusty Russell's new module loader patch goes in, and all
hell breaks loose. What's going on?
The inclusion of the module patch is consistent with the policy Linus laid
out toward the end of October: the freeze date would be considered the
deadline for submission to him. Linus would, when it seemed appropriate,
merge new features after the deadline. He has done very little of that
sort of merging, but the new module code was one of the exceptions.
There are a few problems with the new module subsystem, most of which come
down to two facts: the job is not complete (i.e. features are missing), and
many of the changes had not been seriously tested or reviewed prior to
being merged. The work is not complete because Rusty
never knew whether the patch would go in or not, and was busy enough just
keeping it up to date with kernel releases. The lack of testing and review
is explained by Rusty in this way:
Think back: who in their right mind would compile and test patches
to a rapidly-changing kernel, when those changes required userspace
tool changes and you didn't know if it was going to go in or not?
If you care about modules in 2.5, you're probably a developer who
needs modules to do their job, so why rock the boat?
In other words, the nature of the patch was such that the people who most
needed to test it were disinclined to do so. Many of those people are
the ones who are upset by the current state of affairs.
The initial module patch did, indeed, lack some features. Little things
like module parameters, device table support (needed for hotplug support),
unloading of modules, a working modprobe, modversions, etc. In
other words, when the module patch first went in, loadable modules stopped
working for almost everybody. Broken features are not that unusual for a
development kernel, but this is a much-used feature in a kernel that was
supposed to be in a feature freeze, so people complained.
The situation was not helped by the fact that the first module patches were
merged just as Rusty got on a plane to the other side of the world. Even
so, he has been working frantically to fix up his patches and get them off
to Linus. By the time 2.5.48 came out (the first actual kernel release
with the new code), some of the worst omissions had been taken care of, and
the rest are being addressed quickly. The level of complaints over missing
features has dropped significantly.
Other sorts of complaints remain, however, as people try to
actually make things work with the new scheme. The biggest controversy has
related to Rusty's attempts to eliminate some of the race conditions that
tend to crop up during module loading and unloading. A common bug found in
module initialization routines is to make resources (e.g. a /proc
file or a registered device) available to the kernel, then to fail module
loading later on. If some other process has accessed that resource in the
meantime, it could find itself trying to execute within a module that was
never fully loaded.
Rusty's solution is to add a "live" flag to each module. Any code
which calls into a module must first increase that module's reference count
with the new try_module_get() function. This function will return
a failure status if the live flag is not set. This flag remains
cleared until the module initialization function has finished its work.
This mechanism guarantees that a module's code will not be called until the
module is ready and it is clear that the module load process will succeed.
(It is also used to unload modules safely; see Rusty's FAQ for more information on how this
all works.)
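The live-flag check can be illustrated with a small userspace model. This is only a sketch: struct toy_module and the toy_* functions are hypothetical stand-ins invented for illustration, not the kernel's actual data structures, and the model is single-threaded, so it ignores the locking the real code needs.

```c
#include <stdbool.h>

/* Hypothetical userspace model of a module; not the kernel's struct module. */
struct toy_module {
    const char *name;
    bool live;      /* set once callers may enter the module */
    int refcount;   /* references currently held on the module */
};

/* Model of try_module_get(): take a reference only if the module is live.
 * Returns true on success, false if the module may not be entered. */
static bool toy_try_module_get(struct toy_module *mod)
{
    if (!mod->live)
        return false;
    mod->refcount++;
    return true;
}

/* Drop a reference obtained with toy_try_module_get(). */
static void toy_module_put(struct toy_module *mod)
{
    mod->refcount--;
}

/* Example initialization routine that does nothing and succeeds. */
static int toy_init_ok(struct toy_module *mod)
{
    (void)mod;
    return 0;
}

/* Load sequence as first proposed: the live flag stays clear while the
 * initialization function runs and is set only after it succeeds. */
static int toy_load_module(struct toy_module *mod,
                           int (*init)(struct toy_module *))
{
    int err;

    mod->live = false;
    mod->refcount = 0;
    err = init(mod);
    if (err)
        return err;     /* failed load: the module never became live */
    mod->live = true;
    return 0;
}
```

In this model, a call to toy_try_module_get() made before toy_load_module() has returned successfully is refused, which is the guarantee described above.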
The problem is that, sometimes, there are legitimate reasons for wanting to
call into a module before that module has finished initialization. For
example, when a disk driver registers a disk, the upper layers immediately
want to have a look at the partition table. Under the new scheme, that
look would fail (since the module was not yet marked as being alive) and
the drive's partitions would not be registered. Thus, a patch which was
intended to fix theoretical problems (very few people have actually been
bitten by module load race conditions) ended up creating real problems with
drivers that, previously, had been working just fine. That did not go over
particularly well.
This problem has been fixed by marking a module as being alive while its
initialization function runs. In other words, initialization is, once
again, unprotected, and driver authors need to be very careful to not
export any interface to the rest of the kernel until they are ready for
that interface to be used. Which makes basic sense.
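The disk driver case, and the effect of the fix, can be sketched in the same userspace style. Everything here is an assumption made for illustration: the toy_* names are hypothetical, the partition scan is faked, and the live_during_init parameter exists only to compare the two policies side by side.

```c
#include <stdbool.h>

/* Hypothetical model of a disk driver module. */
struct toy_module {
    bool live;              /* may callers enter the module? */
    int partitions_found;   /* result of the upper layer's scan */
};

/* Simplified model of try_module_get(): succeeds only if the module is live. */
static bool toy_try_module_get(const struct toy_module *mod)
{
    return mod->live;
}

/* Upper layer: when a disk is registered, call straight back into the
 * driver to read the partition table. */
static void toy_register_disk(struct toy_module *mod)
{
    if (!toy_try_module_get(mod))
        return;                 /* driver not live: the scan is skipped */
    mod->partitions_found = 2;  /* pretend the scan found two partitions */
}

/* Driver initialization: registers its disk before returning. */
static int toy_disk_init(struct toy_module *mod)
{
    toy_register_disk(mod);
    return 0;
}

/* live_during_init == false models the original scheme;
 * live_during_init == true models the fix described above. */
static int toy_load_module(struct toy_module *mod, bool live_during_init)
{
    int err;

    mod->partitions_found = 0;
    mod->live = live_during_init;
    err = toy_disk_init(mod);
    mod->live = (err == 0);     /* after init, live reflects success */
    return err;
}
```

With live_during_init false, the load succeeds but no partitions are found; with it true, the scan goes through, at the price of leaving the initialization function unprotected.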
Driver code also needs, in many cases, to be more fault tolerant. Rusty asked a related question: how does one register
two /proc files? If the registration of the second file fails,
there is no way to safely unregister the first one and fail the module
load. Linus's answer makes basic sense once
you look at it: the module simply cannot fail to load at that point. Once
the module has exported an interface, it must be there to handle uses of
that interface. It is better to simply do without the failed
/proc file than to fail the whole load and risk race conditions. The
complexity required to allow failing at any time is not justified by the
benefits.
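One way to picture that advice is an initialization routine which may fail only before it has exported anything. The following is a userspace sketch with hypothetical names (toy_proc_entry, toy_register_proc(), toy_driver_init()); it does not use the kernel's /proc interface, and the simulate_failure arguments exist only so that both outcomes can be exercised.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a /proc file; not the kernel's proc API. */
struct toy_proc_entry {
    const char *name;
    bool registered;
};

/* Pretend to register a /proc file. Returns true on success. */
static bool toy_register_proc(struct toy_proc_entry *e, bool simulate_failure)
{
    e->registered = !simulate_failure;
    return e->registered;
}

/* A driver that wants two /proc files. */
struct toy_driver {
    struct toy_proc_entry status;
    struct toy_proc_entry stats;
};

/* Returns 0 on success, -1 if the load must fail. */
static int toy_driver_init(struct toy_driver *d,
                           bool fail_first, bool fail_second)
{
    /* Nothing has been exported yet, so failing here is safe. */
    if (!toy_register_proc(&d->status, fail_first))
        return -1;

    /* The first file is now visible to the rest of the system, so the
     * load may no longer fail. If the second registration does not
     * work, simply do without that file. */
    (void)toy_register_proc(&d->stats, fail_second);
    return 0;
}
```

A caller can check d->stats.registered afterward to see whether the optional file is present; the load itself succeeds either way.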
Various other problems (such as the requirement that every module have an
initialization function, or explicitly include a no_module_init
line) are being worked out. Before too long, with luck, modules will just
work again (better than before), and the kernel developers will be arguing
about something else.