A look at a Linux kernel rejection
[Posted November 6, 2002 by corbet]
The Halloween deadline for submission of new features for the 2.5 kernel
has passed. Linus has not made final decisions on everything on
the wishlist, but most of the new features
which will be in the next stable kernel series are in the development
kernel now. And some developments clearly are not going to be in that next
stable kernel. Negative results are often the most interesting - they
expose useful information about how the system works. So we'll look at
why a seemingly sensible feature did not get into the 2.5 kernel.
The project in question is the Linux
Kernel Crash Dump (LKCD) subsystem. LKCD comes into play if a Linux
kernel panics; it uses the swap area to create an image of the dying
system. That dump can then be used to figure out just what went wrong.
Commercial operating systems have had crash dump capabilities for decades;
crash dumps make life much easier for vendors facing angry customers who
want their systems fixed immediately. Given the increasing interest in
"enterprise" deployments and high-quality support, one would think that a
crash dump capability would be a high priority for inclusion. One would
also think that it would not be controversial, since crash dump support
does not slow down or adversely affect users in any way. So why did it
fail to go in?
Certainly, there were some technical concerns about LKCD. A kernel which
is crashing is, by definition, not functioning properly; do you
really want that kernel to write massive amounts of data to disk as
its last act? Some developers fear that LKCD has not taken sufficient care
to avoid overwriting files as it saves its dump to disk. There is no real
history of people having their systems trashed by LKCD, but the worry
remains.
LKCD has not played the kernel political game all that well. In some
cases, it is enough to write code and ask that it be merged. But, as a
general rule, you have to convince Linus that the development really
belongs in his kernel. In practice, that means turning one or more of the
top-tier developers into an advocate for your work. The LKCD developers
have not done that; instead, they have tried putting pressure on Linus
directly. Linus responded by digging in his heels and stating: "...right now I won't touch LKCD
with a ten-foot pole, if only because I've been mail-bombed by people who
argue for it when I have better things to do than to explain myself over
and over again."
But neither of those reasons is the real reason why LKCD got left out in
the cold. As Linus has been saying for a few years, his real job these days
is saying "no" to people. He says "no" to anything that, in his opinion,
does not really have to be in his kernel. It is a hard job; it requires
enough backbone (and ego) to stand up against great pressure at times. But
it is also a crucial role that must be played well if the kernel code is to
remain maintainable over the long term.
Linus said "no" to LKCD because he did not see any real advantage to having it
in his kernel. LKCD, says Linus, is a "vendor-driven" development. Since
LKCD is vendor driven, the vendors that are interested can merge it into
their trees. That is what free software is all about, of course.
This attitude may seem a little harsh, but it makes sense when you consider
a couple of points:
- Vendors, with very rare exception, do not ship Linus's kernels as
he distributes them. Most vendor kernels are heavily patched, with
dozens (or even hundreds) of changes and added features. The spec file for the 2.4.18 kernel shipped
with Red Hat Linux 8.0 lists a full 200 patches; Red Hat has
added User-mode Linux, TUX, the O(1) scheduler, the low-latency patch,
NAPI, netdump (a network-based crash dumper), etc. LKCD would
be a small addition to the list of patches already applied by
distributors. The fact that few vendors have included LKCD suggests
that they, who are the main market for such a feature, are not yet
interested in it.
- It is hard to imagine any vendor being interested in a crash dump
that comes from anything other than one of their own stock
kernels. Linux empowers any user to obtain and build any kernel they
want, but those users cannot, in general, expect their vendors to
chase bugs in "roll your own" kernels.
So, by suggesting that interested vendors patch in LKCD themselves, Linus
is getting that code to the places where it is useful without having to put
it into his tree. A certain amount of kernel source bloat is avoided, the
way is left open for other potential crash dump implementations, and LKCD
is still easily deployed in the situations where it is needed. All told,
it is not an entirely unreasonable decision. The
kernel process is often hard on developers, but it is important that Linus
continues to say "no" if we want to have a kernel which does not eventually
collapse under its own weight.
(See also: Linus's explanation of why LKCD
didn't go in, and of how to get patches into the kernel in general, and
this week's Kernel page, which looks at the
next steps for the (non-merged) EVMS project).