Brief items
The current development kernel remains 2.6.37-rc1; no new prepatches
have been released over the last week. The merge rate has also been low,
with only 148 non-merge changesets merged since 2.6.37-rc1 as of this
writing.
Stable updates: there have been no stable updates released in the
last week.
And please also don't top-post. Being the antisocial egomaniacs we
are, people on lkml prefer to dissect the messages we're replying
to, insert insulting comments right where they would be most
effective and remove the passages which can't yield effective
insults.
-- Tejun Heo
You've done it. After hours of gdb and caffeine, you've finally
got a shell on your target's server. Maybe next time they will
think twice about running MyFirstCompSciProjectFTPD on a production
machine. As you take another sip of Mountain Dew and pick some of
the cheetos out of your beard, you begin to plan your next move -
it's time to tackle the kernel.
What should be your goal? Privilege escalation? That's
impossible, there's no such thing as a privilege escalation
vulnerability on Linux. Denial of service? What are you, some
kind of script kiddie? No, the answer is obvious. You must read
the uninitialized bytes of the kernel stack, since these bytes
contain all the secrets of the universe and the meaning of life.
-- Dan Rosenberg
As a result of discussions held at two recent embedded Linux summits (and
reported back to the recent Kernel Summit), the community has decided to identify specific kernel versions as "flag versions" to try to reduce "version fragmentation". On the linux-embedded mailing list, Tim Bird (architecture group chair for the CE Linux Forum) has announced that 2.6.35 will be the first embedded flag version, and it will be supported by (at least) Sony, Google, MeeGo, and Linaro.
"First, it should be explained what having a flag version means. It means that
suppliers and vendors throughout the embedded industry will be
encouraged to use a particular version of the kernel for software
development, integration and testing. Also, industry and community
developers agree to work together to maintain a long-term stable branch
of the flag version of the kernel (until the next flag version is
declared), in an effort to share costs and improve stability and quality."
The Free Software Foundation Latin America has released a version of the
2.6.36 kernel with offending firmware (or drivers that need that firmware)
stripped out. They are also trying to tap into the ongoing discussion of
"open core" business models.
"Sad to say, Linux fits the definition of Free Bait or Open Core. Many
believe that Linux is Free Software or Open Source, but it isn't.
Indeed, the Linux-2.6.36 distribution published by Mr. Torvalds
contains sourceless code under such restrictive licensing terms as
'This material is licensed to you strictly for use in conjunction with
the use of COPS LocalTalk adapters', presented as a list of numbers in
the corresponding driver, and 'This firmware may not be modified and
may only be used with Keyspan hardware' and 'Derived from proprietary
unpublished source code, Copyright Broadcom' in the firmware
subdirectory, just to name a few examples."
The Linaro project has announced the release of Linaro 10.11.
"10.11 is the first public release that brings together the huge amount
of engineering effort that has occurred within Linaro over the past 6
months. In addition to officially supporting the TI OMAP3 (Beagle
Board and Beagle Board XM) and ARM Versatile Express platforms, the
images have been tested and verified on a total of 7 different platforms
including TI OMAP4 Panda Board, IGEPv2, Freescale iMX51 and ST-E
U8500."
By Jonathan Corbet
November 10, 2010
A kernel oops produces a fair amount of data which can be useful in
tracking down the source of whatever went wrong. But that data is only
useful if it can be captured and examined by somebody who knows how to
interpret it. Capturing oops output can be hard; it typically will not
make it to any logfiles in persistent storage. That's why we still see
oops output posted in the form of a photograph taken of the monitor. Using
cameras as a debugging tool can work for a desktop system, but it certainly
does not scale to a data center containing thousands of systems. Google is
thought to operate a site or two meeting that description, so it's not
surprising to see an interest in better management of oops information
there.
Google has had its own oops collection tool running internally for years;
that has recently been posted for merging as netoops. Essentially, netoops
is a simple driver which will, in response to a kernel oops, collect the
most recent kernel logs and deliver them to a server across the net. The
functionality seems useful, but the first version of the patch was
questioned: netoops looks somewhat similar to the existing netconsole
system, so it wasn't clear that a need for it exists. Why not just add any
missing features to netconsole?
Mike Waychison, who posted the patch, responded with a number of reasons
which have since found their way into the changelog. Netoops only sends
data on an oops, so it is less hard on network bandwidth. The data is
packaged in a more structured manner which is easier for machines and
people to parse; that has enabled the creation of a vast internal "oops
database" at Google. Netoops can cut off output after the first oops, once again
saving bandwidth. And so on. There are enough differences that netconsole
maintainer Matt Mackall agreed that it made
sense for netoops to go in as a separate feature.
That said, there is clear scope for sharing some code between the two and,
perhaps, improving netconsole in the process. The current version of the
netoops patch includes new work to bring about that sharing. There seems
to be no further opposition, but it's worth noting that Mike, in the patch
changelog, says that he's not entirely happy with either the user-space
ABI or the data format. So this might be a good time for others interested
in this sort of functionality to have a look and offer their suggestions
and/or patches.
Kernel development news
By Jonathan Corbet
November 9, 2010
At the recent
Kernel Summit checkpoint/restart
discussion, developer Oren Laadan was asked to submit a trimmed-down
version of the patch which would just show the modifications to existing
core kernel code. Oren duly responded with a "naked patch" which, as one might have
expected, kicked off a new round of discussion. What many observers may
not have expected was the appearance of an alternative approach to the
problem which has seemingly been under development for years. Now we have
two clearly different ways of solving this problem but no apparent increase
in clarity; the checkpoint/restart problem, it seems, is simply
complicated.
The responses to Oren's patch will not have been surprising to anybody who
has been following the discussion. Kernel developers are nervous about the
broad range of core code which is changed by this patch. They don't like
the idea of spreading serialization hooks around the kernel which, the
authors' claims to the contrary notwithstanding, look like they could be a
significant maintenance burden over time. It is clear that kernel
checkpoint/restart can never handle all processes; kernel developers wonder
where the real-world limits are and how useful the capability will be in
the end. The idea of moving checkpointed processes between kernel versions
by rewriting the checkpoint image with a user-space tool causes kernel
hackers to shiver. And so on; none of these worries are new.
Tejun Heo raised all these issues and
more. He also called out an interesting alternative checkpoint/restart
implementation called DMTCP,
which solves the problem entirely in user space. With DMTCP in mind, Tejun
concluded:
I think in-kernel checkpointing is in awkward place in terms of
tradeoff between its benefits and the added complexities to
implement it. If you give up coverage slightly, userland
checkpointing is there. If you need reliable coverage, proper
virtualization isn't too far away. As such, FWIW, I fail to see
enough justification for the added complexity.
As one might imagine, this post was followed by an extended conversation
between the in-kernel checkpoint/restart developers and the DMTCP
developers, who had previously not put in an appearance on the kernel
mailing lists. It seems that the two projects were each surprised to learn
of the other's existence.
The idea behind DMTCP is to checkpoint a distributed set of processes
without any special support from the kernel. Doing so requires support
from the processes themselves; a checkpointing tool is injected into their
address spaces using the LD_PRELOAD mechanism. DMTCP is able to
checkpoint (and, importantly, restart) a wide variety of programs,
including those running in the Python or Perl interpreters and those using
GNU Screen. DMTCP is also used to support the universal reversible debugger
project. It is, in other words, a capable tool with real-world uses.
Kernel developers naturally like the idea of eliminating a bunch of
in-kernel complexity and solving a problem in user space, where things are
always simpler. The only problem is that, in this case, it's not
necessarily simpler. There is a surprising amount that DMTCP can do with
the available interfaces, but there are also some real obstacles. Quite a
bit of information about a process's history is not readily available from
user space, but that history is often needed for checkpoint/restart;
consider tracking whether two file descriptors are shared as the result of
a fork() call or not. To keep the requisite information around,
DMTCP must place wrappers around a number of system calls. Those wrappers
interpose significant new functionality and may change semantics in
unpredictable ways.
Pipes are hard for DMTCP to handle, so the pipe() wrapper has to
turn them into full Unix-domain sockets. There is also an interesting
dance required to get those sockets into the proper state at restart time.
The handling of signals - not always straightforward even in the simplest
of applications - is made more complicated by DMTCP, which also must
reserve one signal (SIGUSR2 by default) for its own uses. The
system call wrappers try to hide that signal handler from the application;
there is also the little problem that signals which are pending at checkpoint time may be lost.
Checkpointing will interrupt system calls, leading to unexpected
EINTR returns; the wrappers try to compensate by automatically
redoing the call when this happens. A second VDSO page must be introduced
into a restarted process because it's not possible to control where the
kernel places that page. There's a "virtual PID" layer which
tries to fool restarted processes into thinking that they are still running
with the same process ID they had when they were checkpointed.
There is an interesting plan for restarting programs which have a
connection to an X server: they will wrap Xlib (not a small interface) and
use those wrappers to obtain the state of the window(s) maintained by the
application. That state can then be recreated at restart time before
reconnecting the application with the server. Meanwhile, applications
talking to an xterm are forced to reinitialize themselves at restart time
by sending two SIGWINCH signals to them. And so on.
Given all of that, it is not surprising that the kernel checkpoint/restart
developers see their approach as being a simpler, more robust, and more
general solution to the problem. To them, DMTCP looks like a shaky attempt
to reimplement a great deal of kernel functionality in user space. Matt
Helsley summarized it this way:
Frankly it sounds like we're being asked to pin our hopes on a
house of cards -- weird userspace hacks involving extra processes,
hodge-podge combinations of ptrace, LD_PRELOAD, signal hijacking,
brk hacks, scanning passes in /proc (possibly at numerous times
which begs for races), etc....
In contrast, kernel-based cr is rather straight forward when you
bother to read the patches. It doesn't require using combinations
of obscure userspace interfaces to intercept and emulate those very
same interfaces. It doesn't add a scattered set of new ABIs.
Seasoned LWN readers will be shocked to learn that few minds appear to have
been changed by this discussion. Most developers seem to agree that some
sort of checkpoint/restart functionality would be a useful addition to
Linux, but they differ on how it should be done. Some see a kernel-side
implementation as the only way to get even close to a full solution to the
problem and as the simplest and most maintainable option. Others think
that the user-space approach makes more sense, and that, if necessary, a
small number of system calls can be added to simplify the implementation.
It has the look of the sort of standoff that can keep a project like this
out of the kernel indefinitely.
That said, something interesting may happen here. One thing that became
reasonably clear in the discussion is that a complete, performant, and
robust checkpoint/restart implementation will almost certainly require
components in both kernel and user space. And it seems that the developers
behind the two implementations will be getting
together to talk about the problem in a less public setting. With
luck, determination, and enough beer, they might just figure out a way to
solve the problem using the best parts of both approaches. That would be a
worthy outcome by any measure.
By Jake Edge
November 10, 2010
Device trees are a fairly hot topic in the embedded Linux world as a means
to more easily support multiple system-on-chip (SoC) devices with a single
kernel image. Much of the work implementing device trees for the PowerPC
architecture, as well as making that code more generic so that others could
use it, has been done by Grant Likely. He spoke at the recent Embedded
Linux Conference Europe (ELCE) to explain what device trees are, what
they can do, and to update the attendees on efforts to allow the ARM
architecture to use them.
All of the work that is going into adding device tree support for various
architectures is not being done for an immediate benefit to users, Likely said. It
is, instead, being done to make it easier to manage embedded Linux
distributions, while simplifying the boot process. It will also make
it easier to port devices (i.e. components and "IP blocks") to different
SoCs. But it is "not going to make your Android phone faster".
A device tree is just a data structure that came from OpenFirmware. It
represents the devices that are part of a particular system, so that it can
be passed to the kernel at boot time, and the kernel can initialize and use
those devices. For architectures that don't use device trees, C code must
be written to add all of the different devices that are present in the hardware.
Unlike desktop and server systems, many embedded SoCs do not provide a way
to enumerate their devices at boot time. That means developers have to
hardcode the devices, their addresses, interrupts, and so on, into the kernel.
The requirement to put all of the device definitions into C code is hard to
manage, Likely said. Each different SoC variant has to have its own,
slightly tweaked kernel version. In addition, the full configuration of
the device is scattered over multiple C files, rather than kept in a single
place. Device trees can change all of that.
A device tree consists of a set of nodes with properties, which are simple key-value pairs. The nodes are organized into a tree
structure, unsurprisingly, and the property values can store arbitrary data
types. In addition, there are some standard usage conventions for
properties so that they can be reused in various ways. The most important
of these is the compatible property that uniquely defines devices, but
there are also conventions for specifying address ranges, IRQs, GPIOs, and
so forth.
Likely used a simplified
example from
devicetree.org to show what these trees look like. They are
defined with an essentially C-like syntax:
/ {
    compatible = "acme,coyotes-revenge";
    cpus {
        cpu@0 {
            compatible = "arm,cortex-a9";
        };
        cpu@1 {
            compatible = "arm,cortex-a9";
        };
    };
    serial@101F0000 {
        compatible = "arm,pl011";
    };
    ...
    external-bus {
        ethernet@0,0 {
            compatible = "smc,smc91c111";
        };
        i2c@1,0 {
            compatible = "acme,a1234-i2c-bus";
            rtc@58 {
                compatible = "maxim,ds1338";
            };
        };
        ...
The
compatible tags allow companies to define their own namespace
("acme", "arm", "smc", and "maxim" in the example) that they can manage
however they like.
The kernel already knows how to attach an ethernet device to a local bus or
a temperature sensor to an i2c bus, so why redo it in C for every
different SoC, he asked. By parsing the
device tree (or the binary "flattened" device tree), the kernel can set up the
device bindings that it finds in the tree.
One of the questions that he often gets asked is: "why bother
changing what we already have?" That is a "hard question to
answer" in some ways, because for a lot of situations, what we have
in the kernel currently does work. But in order to support large numbers
of SoCs with a single kernel (or perhaps a small set of kernels), something
like device tree is required. Both Google (for Android) and Canonical (for
Linaro) are very interested in seeing device tree support for ARM.
Beyond that, "going data-driven to describe our platforms is the
right thing to do". There is proof that it works in the x86 world
as "that's how it's been done for a long time". PowerPC
converted to device trees five years ago or so and it works well. There
may be architectures that won't need to support multiple devices
with a single kernel, and device trees may not be the right choice for
those, but for most of the architectures that Linux supports, Likely
clearly thinks that device trees are the right solution.
He next looked at what device trees aren't. They don't replace
board-specific code, and developers will "still have to write drivers
for weird stuff". Instead, device trees simplify the common case.
Device tree is also not a boot architecture; it's "just a data
structure". Ideally, the firmware will pass a device tree to the
kernel at boot time, but it doesn't have to be done that way. The device
tree could be included in the kernel image. There are plenty of devices
with firmware that doesn't know about device trees, Likely said, and they
won't have to.
There is currently a push to get ARM devices into servers, as they can
provide lots of cores at low power usage. In order to facilitate that,
there needs to be one CD that can boot any of those servers, as there is in
the x86 world. Device trees are what will be used to make that happen,
Likely said.
Firmware that does support device trees will obtain a .dtb
(i.e. flattened device tree binary) file from somewhere in memory, and
either pass it verbatim to the kernel or modify it before passing. Another
option would be for the firmware to create the .dtb on-the-fly,
which is what OpenFirmware does, but that is a "dangerous"
option. It is much easier to change the kernel than the firmware, so any
bugs in the firmware's .dtb creation code will inevitably be
worked around in the
kernel. In any case, the kernel doesn't care how the .dtb is created.
For ARM, the plan is to pass a device tree, rather than the existing,
rather inflexible ARM device configuration known as ATAGs. The kernel
will set up the memory for the processor and unflatten the .dtb
into memory. It will unpack it into a "live tree" that can
then be directly dereferenced and used by the kernel to register devices.
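The relationship between the flattened form and the "live tree" can be illustrated with a toy example. The sketch below does not use the real .dtb layout (which is a binary format with a header and a string table); the token names and the dictionary layout are invented here purely to show how a flat stream can be unpacked into a tree that can be walked directly.

```python
def flatten(node):
    """Serialize a nested node into a flat list of tokens (toy format)."""
    tokens = [("BEGIN", node["name"])]
    for key, value in node["props"].items():
        tokens.append(("PROP", key, value))
    for child in node["children"]:
        tokens.extend(flatten(child))
    tokens.append(("END",))
    return tokens

def unflatten(tokens):
    """Rebuild the nested "live tree" from the flat token list."""
    stack = []   # nodes that have been opened but not yet closed
    root = None
    for token in tokens:
        kind = token[0]
        if kind == "BEGIN":
            node = {"name": token[1], "props": {}, "children": []}
            if stack:
                stack[-1]["children"].append(node)
            else:
                root = node
            stack.append(node)
        elif kind == "PROP":
            stack[-1]["props"][token[1]] = token[2]
        elif kind == "END":
            stack.pop()
    return root

# A two-node excerpt of the example tree shown earlier.
tree = {"name": "/", "props": {"compatible": "acme,coyotes-revenge"}, "children": [
    {"name": "serial@101F0000", "props": {"compatible": "arm,pl011"}, "children": []},
]}
flat = flatten(tree)
live = unflatten(flat)
```

The flat form is convenient to hand across the firmware/kernel boundary as a single blob, while the unpacked form is the one that code can follow node by node.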
The Linux device model is also tree-based, and there is some congruence
between device tree and the device model, but there is not a direct 1-to-1
mapping between them. That was done "quite deliberately" as
the design goal was "not to describe what Linux wants";
instead, it was meant to describe the hardware. Over time, the Linux device
model will change, so hardcoding Linux-specific values into the device tree
has been avoided. The device tree is meant to be used as support data, and
the devices it describes get registered using the Linux device model.
Device drivers will match compatible property values with device nodes in
a device tree. It is the driver that will determine how to
configure the device based on its description in a device tree. None of
that configuration code lives in the device tree handling, it is part of
the drivers which can then be built as loadable kernel modules.
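The matching step can be sketched as a table lookup. The Python below is a simplified model and not kernel code: the tree is an abbreviated copy of the example above, the driver names are hypothetical placeholders, and each node carries a single compatible string (an assumption made here for brevity).

```python
# Abbreviated model of the example tree; "compatible" is None where the
# example does not show one.
TREE = {
    "name": "/",
    "compatible": "acme,coyotes-revenge",
    "children": [
        {"name": "serial@101F0000", "compatible": "arm,pl011", "children": []},
        {"name": "external-bus", "compatible": None, "children": [
            {"name": "ethernet@0,0", "compatible": "smc,smc91c111", "children": []},
            {"name": "i2c@1,0", "compatible": "acme,a1234-i2c-bus", "children": [
                {"name": "rtc@58", "compatible": "maxim,ds1338", "children": []},
            ]},
        ]},
    ],
}

# Hypothetical driver table: compatible string -> placeholder driver name.
DRIVERS = {
    "arm,pl011": "uart_driver",
    "smc,smc91c111": "ethernet_driver",
    "maxim,ds1338": "rtc_driver",
}

def bind_drivers(node, drivers, path=""):
    """Walk the tree depth-first and return (node path, driver) pairs."""
    name = node["name"]
    full = name if name == "/" else path.rstrip("/") + "/" + name
    bound = []
    driver = drivers.get(node.get("compatible"))
    if driver is not None:
        bound.append((full, driver))
    for child in node["children"]:
        bound.extend(bind_drivers(child, drivers, full))
    return bound

bindings = bind_drivers(TREE, DRIVERS)
```

Nodes whose compatible value has no entry in the table (the i2c bus here) are simply left unbound; the tree handling itself contains no device-specific knowledge, which mirrors the division of labor described above.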
Over the last year, Likely has spent a lot of time making the device tree
support generic. Previously, there were three separate copies of much
of the support code (for Microblaze, SPARC, and PowerPC). He has removed
any endian dependencies so that any architecture can use device trees.
Most of that work is now done and in the mainline. There is some minimal
board support that has not yet been mainlined. The MIPS architecture has
added device tree support as of 2.6.37-rc1 and x86 was close to getting it
for 2.6.37, but some last minute changes caused the x86 device tree support to
be held back until 2.6.38.
The ARM architecture still doesn't have device tree support and ARM
maintainer Russell King is "nervous about merging an unmaintainable
mess". King is taking a wait-and-see approach until a real ARM
board has device tree
support. Likely agreed with that approach and ELCE provided an opportunity
for him and King to sit down and discuss the issue. In the next
six months or so (2.6.39 or 2.6.40), Likely expects that the board support
will be completed and he seems confident that ARM device tree support in
the mainline won't be far behind.
There are other tasks to complete in addition to the board support, of
course, with documentation being high on that list. There is a need for
documentation on how to use device trees, and on the property conventions
that are being used. The devicetree.org wiki is a
gathering point for much of that work.
There were several audience questions that Likely addressed, including the
suitability of device tree for Video4Linux (very suitable and the
compatible property gives each device manufacturer its own
namespace), the
performance impact (no complaints, though he hasn't profiled it —
device trees are typically 4-8K in size, which should minimize their
impact), and licensing or patent issues (none known so far, the code is
under a BSD license so it can be used by proprietary vendors — IBM's
lawyers don't seem concerned). Overall, both Likely and the audience
seemed very optimistic about the future for device trees in general and
specifically for their future application in the ARM architecture.
By Jake Edge
November 10, 2010
The number of kernel regressions over time is one measure of the overall
quality of the kernel. Over the last few years, Rafael
Wysocki has taken on the task of tracking those regressions and regularly
reporting on them to the linux-kernel mailing list. In addition, he has
presented a "regressions report" at the last few Kernel Summits [2010, 2009, and 2008]. As part of
his preparation for this year's talk, Wysocki wrote a paper, Tracking of Linux Kernel
Regressions [PDF], that digs
in deeply and explains the process of Linux regression tracking, along with
various
trends in regressions over time. This article is an attempt to summarize
that work.
A regression is a user-visible change in the behavior of the kernel between
two releases. A program that was working on one kernel version and then
suddenly stops working on a newer version has detected a kernel
regression. Regressions are probably the most annoying kind of bug that
crops up in the
kernel development process, as well as one of the most visible. In
addition, Linus Torvalds has decreed that regressions may not be
intentionally introduced—to fix a perceived kernel shortcoming for
example—and that fixing inadvertent regressions should be a high
priority for the kernel developers.
There is another good reason to concentrate on fixing any regressions: if you
don't, you really have no assurance that the overall quality of the code is
increasing, or at least staying the same. If things that are currently
working continue to work in the future, there is a level of comfort that
the bug situation is, at least, not getting worse.
Regression tracking process
To that end, various efforts have been made to track kernel regressions,
starting with Adrian Bunk in 2007 (around 2.6.20), through Michał
Piotrowski, and then to Wysocki during the 2.6.23 development cycle. For
several years, Wysocki handled the regression tracking himself, but it is
now a three-person operation, with Maciej Rutecki turning email regression reports
into kernel bugzilla entries, and Florian Mickler maintaining the
regression entries: marking those that have been fixed, working with the
reporters to determine which have been fixed, and so on.
The kernel bugzilla is used to track the regression meta-information as
well as the individual bugs. Each kernel release has a bugzilla entry that
tracks all of the individual regressions that apply to it. So, bug #16444
tracks the regressions reported against the 2.6.35 kernel release. Each
individual regression is listed in the "Depends on" field in the meta-bug,
so that a quick look will show all of the bugs, and which have been
closed.
There is another meta-bug, bug #15790,
that tracks all of the release-specific meta-bugs. So, that bug depends on
#16444 for 2.6.35, as well as #21782 for 2.6.36, #15310 for
2.6.33, and so on. Those bugs are used by the scripts that Wysocki runs to
generate the "list of known regressions" which gets posted to linux-kernel
after each -rc release.
Regressions are added to bugzilla one week after they are reported by
email, if they
haven't been fixed in the interim. That's a change from earlier practices to
save Rutecki's time as well as to reduce unhelpful noise. Bugzilla entries
are linked to fixes as they become available. The bug state is
changed to "resolved" once a patch is available and "closed" once Torvalds
merges the fix into the mainline.
Regressions for a particular kernel release are tracked through the
following two development cycles. For example, when 2.6.36 was released,
the tracking of 2.6.34 regressions ended. When 2.6.37-rc1 was released,
that began the tracking for 2.6.36, and once 2.6.37 is released in early
2011, tracking of 2.6.35 regressions will cease. That doesn't mean that
any remaining regressions have magically been fixed, of course, and they
can still be tracked using the meta-bug associated with a release.
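The two-cycle rule is simple enough to express as arithmetic on the release number. The helpers below are a sketch written for this summary, not part of Wysocki's scripts; the function names are made up, and the code assumes 2.6.N-style version strings like those used in the text.

```python
def tracking_ends_at(release):
    """Return the release whose appearance ends tracking for `release`."""
    prefix, minor = release.rsplit(".", 1)
    return f"{prefix}.{int(minor) + 2}"

def tracking_ended(release, latest_release):
    """True once the release two cycles after `release` is out."""
    n = int(release.rsplit(".", 1)[1])
    latest = int(latest_release.rsplit(".", 1)[1])
    return latest >= n + 2

# The examples from the text: 2.6.36 ended tracking for 2.6.34, and
# 2.6.37 will end it for 2.6.35.
print(tracking_ends_at("2.6.34"), tracking_ends_at("2.6.35"))
```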
Regression statistics
To look at the historical regression data, Wysocki compiled a table that
listed the number of regressions reported for each of the last ten kernel
releases as well as the number that are still pending (i.e. have not been
closed). For the table, he has removed invalid and
duplicate reports from those listed in bugzilla. It should also be noted
that after 2.6.32, the methodology for adding new regressions changed such
that those that were fixed in the first week after being reported were not
added to bugzilla. That at least partially explains the drop in reports
after 2.6.32.
| Kernel | # reports | # pending |
| 2.6.26 | 180 | 1 |
| 2.6.27 | 144 | 4 |
| 2.6.28 | 160 | 10 |
| 2.6.29 | 136 | 12 |
| 2.6.30 | 177 | 21 |
| 2.6.31 | 146 | 20 |
| 2.6.32 | 133 | 28 |
| 2.6.33 | 116 | 18 |
| 2.6.34 | 119 | 15 |
| 2.6.35 | 63 | 28 |
| Total | 1374 | 157 |
| Reported and pending regressions |
The number of "pending" regressions takes into account every fix made
since the release, not just those made during the
two-development-cycle
tracking period. In order to look more closely at what
happens during the tracking period, Wysocki provides another table. That
table separates the
two most important events during the tracking period, which are the releases of
the subsequent kernel versions (i.e. for 2.6.N, the releases of N+1 and N+2).
For example, once the 2.6.35 kernel was
released, that ended the period where the development focus was on fixing
regressions in 2.6.34. At that point, the merge window for 2.6.36 opened
and developers switched their focus to adding new features for the next
release. Furthermore, once 2.6.36
was released, regressions were no longer tracked at all for 2.6.34. That is
reflected in the following table where the first "reports" and "pending"
columns correspond to the N+1 kernel release, and the second to the N+2 release.
| Kernel | # reports (N+1) | # pending (N+1) | # reports (N+2) | # pending (N+2) |
| 2.6.30 | 122 | 36 | 170 | 45 |
| 2.6.31 | 89 | 31 | 145 | 42 |
| 2.6.32 | 101 | 36 | 131 | 45 |
| 2.6.33 | 74 | 33 | 114 | 27 |
| 2.6.34 | 87 | 31 | 119 | 21 |
| 2.6.35 | 61 | 28 | | |
| Reported and pending regressions (separated by release) |
The table shows that the number of reported regressions still goes up fairly
substantially after the release of the next (N+1) kernel. This indicates that
the -rc kernels may not be getting as much testing as the released kernel
does. In addition, the numbers of pending regressions are substantially higher at
the N+2 kernel release, at least in the 2.6.30-32 timeframe. Had that
trend continued, it could be argued that the kernel developers were paying
less attention to regressions in a particular release once the next
release was out. But the 2.6.33-34 numbers are fairly
substantially down after the N+2 release, and Wysocki says that there are
indications that 2.6.35
is continuing that trend.
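Both observations can be read directly off the second table with a little subtraction. The snippet below simply redoes that arithmetic on the completed rows (2.6.35 is left out because its N+2 columns are empty); the variable and function names are illustrative only.

```python
# (reports at N+1, pending at N+1, reports at N+2, pending at N+2),
# copied from the table above.
TRACKING = {
    "2.6.30": (122, 36, 170, 45),
    "2.6.31": (89, 31, 145, 42),
    "2.6.32": (101, 36, 131, 45),
    "2.6.33": (74, 33, 114, 27),
    "2.6.34": (87, 31, 119, 21),
}

def late_reports(row):
    """Regressions reported only after the N+1 release."""
    return row[2] - row[0]

def pending_change(row):
    """Change in the pending count between the N+1 and N+2 releases."""
    return row[3] - row[1]

summary = {k: (late_reports(r), pending_change(r)) for k, r in TRACKING.items()}
for kernel, (late, change) in summary.items():
    print(f"{kernel}: {late} late reports, pending change {change:+d}")
```

Every release picks up at least 30 reports after its successor ships, and the pending count grows by 9 to 11 for 2.6.30-32 but shrinks by 6 and 10 for 2.6.33 and 2.6.34.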
Reporting and fixing regressions
We can look at the number of outstanding regressions over time in one of
the graphs from Wysocki's paper. For each kernel release, there are
generally two peaks that indicate where the number of open regressions is
highest. These roughly correspond with the end of the merge window and the
release date for the next kernel version. Once past those maximums, the
graphs tend to level out.
There are abrupt jumps in the number of regressions that are probably an
artifact of how the reporting is done. Email reports are generally batched
up, with multiple reports being added at roughly the same time.
Maintenance on the bugs can happen in much the same way, which results in
multiple regressions closed in a short period of time. That leads to a
much more jagged graph, with sharper peaks.
In the paper, Wysocki did some curve fitting for the 2.6.33-34 releases
that corresponded reasonably well with the observed data. He noted that
the incomplete 2.6.35 curve was anomalous in that it didn't have a sharp
maximum and seemed to plateau, rather than drop off. He attributes that to
the shortened merge window for 2.6.37 along with the Kernel Summit and Linux
Plumbers Conference impacting the testing and debugging of the current
development kernels. Nevertheless, he used the same curve fitting
equations on the 2.6.35 data to derive a "prediction" that it would end up
with slightly more regressions than .33 and .34, but still less than 30.
It will be interesting to see if that is borne out in practice.
Regression lifetime
The lifetime of regressions is another area that Wysocki addresses. One of
his graphs is reproduced above and shows the cumulative number of
regressions whose lifetime is less than the number of days on the x-axis.
He separates the regressions into two sets, those from kernel 2.6.26-30 and
from 2.6.30-35. In both cases, the curves follow that of radioactive
decay, which allows for the derivation of the half-life for a set of kernel
regressions: roughly 17 days.
The graph for 2.6.30-35 is obviously lower than that of the earlier
kernels, which Wysocki attributes to the change in methodology that occurred
in the 2.6.32 timeframe. Because fewer short-lived regressions (i.e. those
lasting less than a week) are now tracked, the average regression lifetime
comes out higher. The average for the earlier kernels is 24.4 days,
while the later kernels have an average of 32.3 days. Wysocki posits
that the average really hasn't changed and that 24.4 days is a reasonable
number to use as an average lifetime for regressions over the past two
years or so.
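The effect of under-counting short-lived regressions can be illustrated with an exponential lifetime model. The sketch below is hypothetical: it supposes that every regression living less than a cutoff goes untracked, which is more extreme than what the article describes, and the function names are made up for illustration. Because an exponential distribution is memoryless, the mean of the surviving lifetimes is simply the original mean plus the cutoff.

```python
import math


def truncated_mean(mean_days, cutoff_days):
    """Mean lifetime of regressions that survive past `cutoff_days`.

    Assumes exponentially distributed lifetimes, which are memoryless:
    E[T | T > c] = c + E[T].
    """
    return cutoff_days + mean_days


def tracked_fraction(mean_days, cutoff_days):
    """Fraction of regressions that live longer than the cutoff: exp(-c / mean)."""
    return math.exp(-cutoff_days / mean_days)


if __name__ == "__main__":
    mean = 24.4    # average for the earlier kernels, from the article
    cutoff = 7.0   # "less than a week"
    print(f"observed mean if all sub-week regressions are missed: "
          f"{truncated_mean(mean, cutoff):.1f} days")
    print(f"fraction of regressions still tracked: "
          f"{tracked_fraction(mean, cutoff):.1%}")
```

With these inputs the observed mean comes out at roughly 31 days, in the same neighborhood as the 32.3-day average for the later kernels, so an unchanged underlying lifetime is at least plausible under this simplified model.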
Regressions by subsystem
Certain kernel subsystems have been more prone to regressions than others
over the last few releases, as is shown in a pair of tables from Wysocki's
paper. He cautions that
it is somewhat difficult to accurately place regressions into a particular
category, as they may be incorrectly assigned in bugzilla. There are also
murky boundaries between some of the categories, with power management (PM)
being used as an example. Bugs that clearly fall into the PM core, or
those that are PM-related but the root cause is unknown, get assigned to
the PM category, while bugs in a driver's suspend/resume code get assigned
to the category of the driver. Wysocki notes that these numbers should be
used as a rough guide to where regressions are being found, rather than as
an absolute and completely accurate measure.
| Category | 2.6.32 | 2.6.33 | 2.6.34 | 2.6.35 | Total |
| DRI (Intel) | 20 | 7 | 10 | 12 | 49 |
| x86 | 9 | 13 | 21 | 6 | 49 |
| Filesystems | 7 | 12 | 8 | 8 | 35 |
| DRI (other) | 10 | 7 | 10 | 5 | 32 |
| Network | 12 | 8 | 6 | 4 | 30 |
| Wireless | 6 | 6 | 11 | 4 | 27 |
| Sound | 8 | 9 | 4 | 2 | 23 |
| ACPI | 7 | 9 | 3 | 2 | 21 |
| SCSI & ATA | 4 | 2 | 2 | 2 | 10 |
| MM | 2 | 3 | 4 | 0 | 9 |
| PCI | 3 | 4 | 1 | 1 | 9 |
| Block | 2 | 1 | 3 | 2 | 8 |
| USB | 3 | 0 | 0 | 3 | 6 |
| PM | 4 | 2 | 0 | 0 | 6 |
| Video4Linux | 1 | 3 | 1 | 0 | 5 |
| Other | 35 | 30 | 35 | 12 | 112 |
| Reported regressions by category |
The Intel DRI driver and x86 categories are by far the largest sources of
regressions, but there are a number of possible reasons for that. The
Intel PC ecosystem is both complex, with many different variations of
hardware, and well-tested because there are so many of those systems in
use. Other architectures may not be getting the same level of testing,
especially during the -rc phase.
It is also clear from the table that those subsystems
that are "closer" to the hardware tend to have more regressions. The eight
rows with 20 or more total regressions—excepting filesystems and
networking to some extent—are all closely tied to hardware. Those
kinds of regressions tend to be easier to spot because they cause the
hardware to fail, unlike regressions in the scheduler or memory management
code, for example, which are often more subtle.
| Category | 2.6.32 | 2.6.33 | 2.6.34 | 2.6.35 | Total |
| DRI (Intel) | 1 | 2 | 2 | 5 | 10 |
| x86 | 2 | 2 | 3 | 2 | 9 |
| DRI (other) | 1 | 3 | 2 | 3 | 9 |
| Sound | 5 | 2 | 0 | 1 | 8 |
| Network | 2 | 2 | 1 | 2 | 7 |
| Wireless | 1 | 1 | 1 | 2 | 5 |
| PM | 4 | 1 | 0 | 0 | 5 |
| Filesystems | 0 | 0 | 0 | 5 | 5 |
| Video4Linux | 1 | 3 | 0 | 0 | 4 |
| SCSI + SATA | 2 | 0 | 1 | 0 | 3 |
| MM | 1 | 0 | 1 | 0 | 2 |
| Other | 8 | 2 | 4 | 8 | 22 |
| Pending regressions by category |
It is also instructive to look at the remaining pending regressions by
category. In the table above, we can see that most of the regressions
identified have been fixed, with only relatively few persisting. Those are
likely to be bugs that are difficult to reproduce, and thus to track down.
Some categories, like ACPI, fall completely out of the table, which
indicates that those developers have been very good at finding and fixing
regressions in that subsystem.
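The comparison between the two tables can be made explicit by computing, per category, the share of reported regressions that are no longer pending. The sketch below copies the "Total" columns from both tables. It assumes that a category missing from the pending table has nothing pending (as the article does for ACPI) and that "SCSI & ATA" and "SCSI + SATA" name the same category; the function name is illustrative.

```python
# Totals copied from the two tables above (reported, pending) for 2.6.32-2.6.35.
# Assumption: "SCSI & ATA" and "SCSI + SATA" are the same category.
REPORTED = {
    "DRI (Intel)": 49, "x86": 49, "Filesystems": 35, "DRI (other)": 32,
    "Network": 30, "Wireless": 27, "Sound": 23, "ACPI": 21,
    "SCSI & ATA": 10, "MM": 9, "PCI": 9, "Block": 8, "USB": 6, "PM": 6,
    "Video4Linux": 5, "Other": 112,
}
PENDING = {
    "DRI (Intel)": 10, "x86": 9, "DRI (other)": 9, "Sound": 8, "Network": 7,
    "Wireless": 5, "PM": 5, "Filesystems": 5, "Video4Linux": 4,
    "SCSI & ATA": 3, "MM": 2, "Other": 22,
}


def fixed_fraction(category):
    """Share of reported regressions no longer pending.

    Assumption: categories absent from PENDING have zero pending regressions.
    """
    reported = REPORTED[category]
    return (reported - PENDING.get(category, 0)) / reported


if __name__ == "__main__":
    # List categories from most to least completely fixed.
    for name in sorted(REPORTED, key=fixed_fraction, reverse=True):
        print(f"{name:12s} {fixed_fraction(name):6.1%}")
    total_reported = sum(REPORTED.values())
    total_pending = sum(PENDING.values())
    print(f"overall: {1 - total_pending / total_reported:.1%} fixed")
```

On these numbers, roughly four in five reported regressions are no longer pending overall, while a small category such as PM (5 of 6 still pending) stands out. With counts this small, the percentages should be read loosely.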
Conclusion
Regression tracking is important so that kernel developers are able to
focus their bug fixing efforts during each development cycle. But looking
at the bigger picture—how the number and types of regressions change—is also needed. Given the nature of kernel development, it
is impossible to draw any conclusions from the data collected for any single
release. By aggregating data over multiple development cycles, any oddities
specific to a particular cycle are smoothed out, which allows for trends to
be spotted.
Since regressions are a key indicator of kernel quality, and easier to
track than many others, they play an important role in keeping Torvalds and
other kernel developers aware of kernel quality issues. As the developers
get more familiar with the "normal" regression patterns, it will become
more obvious when a given release is falling outside of those patterns,
which may mean that it needs more attention—or that something has
changed in the development process. In any case, there is clearly value in
the statistics, and that value is likely to grow over time.
Page editor: Jonathan Corbet