Kernel policy issues: compatibility and configuration
[Posted May 20, 2003 by corbet]
When the kernel is deep into a feature freeze and there are not a whole lot
of new developments to worry about, it must be time for some policy
debates. A couple of issues that have come up over the last week or so -
both involving the FUTEX subsystem - cast an interesting light on how
policy decisions are made, and how the kernel project interacts with its user
community.
A "FUTEX" is, of course, a fast user-space mutual exclusion primitive.
FUTEXes are similar to SYSV semaphores in terms of the functionality they
provide, though no attempt has been made to be compatible with the SYSV
semaphore interface. A FUTEX is also fast: if there is no contention for a
particular lock (which should be the case most of the time) there is no
need to go into the kernel at all. An actual system call is only made when
a process must wait. FUTEXes are used by the blindingly fast 2.5
threading implementation; other applications will certainly be found for
them as they become more widely available.
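The fast path described above can be illustrated with a small user-space lock. This is only a sketch under stated assumptions: the names sketch_mutex, sketch_lock(), and sketch_unlock() are hypothetical, the three-state scheme (free, held, held with possible waiters) is one common way to build a lock on top of futex() rather than the glibc implementation, and the kernel_entries counter exists purely to show when the kernel is entered.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/futex.h>
#include <stdatomic.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical lock type. state: 0 = free, 1 = held, 2 = held, maybe waiters. */
typedef struct {
    _Atomic uint32_t state;
    atomic_ulong kernel_entries;   /* illustration only: counts futex() calls */
} sketch_mutex;

/* Enter the kernel; this is the slow path the fast path tries to avoid. */
static void sketch_futex(sketch_mutex *m, int op, uint32_t val)
{
    atomic_fetch_add(&m->kernel_entries, 1);
    syscall(SYS_futex, &m->state, op, val, NULL, NULL, 0);
}

static void sketch_lock(sketch_mutex *m)
{
    uint32_t seen = 0;

    /* Fast path: an uncontended lock is taken entirely in user space. */
    if (atomic_compare_exchange_strong(&m->state, &seen, 1))
        return;

    /* Slow path: mark the lock as contended, then sleep until it is free. */
    if (seen != 2)
        seen = atomic_exchange(&m->state, 2);
    while (seen != 0) {
        /* Sleeps only if state is still 2; otherwise returns immediately. */
        sketch_futex(m, FUTEX_WAIT, 2);
        seen = atomic_exchange(&m->state, 2);
    }
}

static void sketch_unlock(sketch_mutex *m)
{
    uint32_t prev = atomic_exchange(&m->state, 0);

    assert(prev != 0);             /* unlocking a free lock is a caller bug */
    if (prev == 2)                 /* someone may be asleep: wake one waiter */
        sketch_futex(m, FUTEX_WAKE, 1);
}
```

A single thread that locks and then unlocks a zero-initialized sketch_mutex never reaches sketch_futex(), so kernel_entries stays at zero; the system call is made only when another thread already holds the lock.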
Ingo Molnar recently sent out a series of patches to the FUTEX subsystem;
one of them adds a new "requeueing"
feature. This feature addresses a performance problem in glibc resulting
from a double-lock implementation there; with requeueing, a process which
waits on a condition variable can be automatically requeued on a different
lock when the condition becomes true. Requeueing avoids the "thundering
herd" problem (when many processes are awakened only to contend with each
other and go back to sleep) which would otherwise arise in this situation.
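As a rough illustration of the requeueing idea, the sketch below wakes one waiter on a condition word and moves any remaining waiters onto a lock word, using the FUTEX_REQUEUE operation as it appears in the futex() interface of later mainline kernels. The helper name sketch_broadcast() and the two word parameters are hypothetical, and this is not the glibc condition variable code.

```c
#define _GNU_SOURCE
#include <limits.h>
#include <linux/futex.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper: wake at most one thread sleeping on cond_word and
 * requeue the others onto lock_word, so that they are woken one at a time
 * as the lock is released instead of all at once.
 *
 * Note: current documentation recommends FUTEX_CMP_REQUEUE, which also
 * checks the value of cond_word, to close a race in plain FUTEX_REQUEUE.
 */
static long sketch_broadcast(uint32_t *cond_word, uint32_t *lock_word)
{
    return syscall(SYS_futex, cond_word, FUTEX_REQUEUE,
                   1,                       /* wake at most one waiter */
                   (unsigned long)INT_MAX,  /* requeue up to this many others */
                   lock_word, 0);
}
```

The requeued threads never run just to find the lock taken; they stay asleep in the kernel until whoever holds the lock wakes them.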
The patch drew complaints about how the new feature is implemented. The
FUTEX subsystem provides a single system call (futex()) with a
command argument. All FUTEX operations are multiplexed through this single
call. This style of system call has been deprecated within the kernel for
a while now; it is difficult to get a handle on what multiplexor calls are
really doing. So it was suggested that, rather than adding yet another
command to futex(), Ingo should really tear out the old system
call and create a set of new, single-function calls.
Ingo did, in fact, send out a patch
implementing the futex_wait(), futex_wake(), and
futex_requeue() system calls. But he left the old
futex() call in as well. And that is the core of the real
disagreement: certain developers feel that,
since no stable kernel was ever released with the old system call, it
should be simply removed before 2.6.0.
The problem, of course, is that stable kernels have been released
with that system call. In particular, Red Hat Linux 9 contains a
version of the 2.4.20 kernel with Native PThread Library and FUTEX support
patched in. Removing the futex() system call would break glibc on
those systems. So the question becomes: should a feature which has,
officially, only been present in development kernels be removed, thus
breaking a widely-deployed distribution? Or does a certain amount of
compatibility cruft have to remain in the 2.6.0 kernel in order to avoid
that breakage?
In this case, the issue has been resolved by a
decree from Linus: compatibility will be preserved.
"Something like 'it's only been in the development kernels' is
simply not an issue. The only thing that matters is whether it is
used by various binaries or not."
In a separate posting, Linus states:
"...the goodness of an operating system is not in how pretty it is,
but in how well it supports the user." And that attitude, of
course, has a lot to do with why Linux is as successful as it is.
The other FUTEX-related issue has to do with configuration options.
Christopher Hoover recently submitted a
patch which makes the FUTEX subsystem optional; those who don't want
FUTEXes would be able to configure them out of the kernel entirely. Linus,
however, doesn't like the idea:
"I will strongly argue against making futexes conditional, simply
because I _want_ people to be able to depend on them in modern
kernels. I do not want developers to fall back on SysV semaphores
just because it's too painful for them to use the faster
alternatives."
Similar issues have come up, for example, with regard to making the
epoll() system call or parts of sysfs optional. Increasingly,
there is an interest in defining a minimal functionality that all Linux
kernels will have. Without that, it can be hard to get developers to use
some of the advanced features offered by the kernel.
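To make the cost of optional features concrete, here is a sketch of the kind of run-time probe an application would need if FUTEX support could be configured out. It assumes that a kernel built without the feature fails the call with ENOSYS; the function name sketch_have_futex() is hypothetical.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/futex.h>
#include <stdint.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical probe: returns 1 if the running kernel accepts futex(),
 * 0 if the call is not implemented. Waking waiters on a private word
 * that nobody sleeps on is harmless and has no side effects.
 */
static int sketch_have_futex(void)
{
    static uint32_t probe_word;
    long rc;

    errno = 0;
    rc = syscall(SYS_futex, &probe_word, FUTEX_WAKE, 1, NULL, NULL, 0);
    return !(rc == -1 && errno == ENOSYS);
}
```

An application that cannot count on the probe succeeding must also carry and test a second locking implementation, which is the sort of fallback burden that a guaranteed minimal feature set is meant to remove.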
On the other hand, developers creating kernels for embedded systems often
want to jettison everything that is not absolutely needed. These people,
of course, argue for the ability to configure every feature in the kernel.
And, as Alan Cox pointed out, making
features configurable forces developers to make the implementation of those
features properly modular.
The likely resolution is that configuration options will be provided
for "core" features, but they will be hard to find. Such options may be
buried under a menu titled "remove core functions for embedded systems," or
hidden from the higher-level configuration interfaces altogether (requiring
the use of a text editor on the .config file to change them).
Different users have very different needs, and the Linux kernel tries to
address as many of those needs as it can.