
Kernel development

Brief items

Kernel release status

The current development kernel remains 2.6.37-rc1; no new prepatches have been released over the last week. The merge rate has also been low, with only 148 non-merge changesets merged since 2.6.37-rc1 as of this writing.

Stable updates: there have been no stable updates released in the last week.


Quotes of the week

And please also don't top-post. Being the antisocial egomaniacs we are, people on lkml prefer to dissect the messages we're replying to, insert insulting comments right where they would be most effective and remove the passages which can't yield effective insults.
-- Tejun Heo

You've done it. After hours of gdb and caffeine, you've finally got a shell on your target's server. Maybe next time they will think twice about running MyFirstCompSciProjectFTPD on a production machine. As you take another sip of Mountain Dew and pick some of the cheetos out of your beard, you begin to plan your next move - it's time to tackle the kernel.

What should be your goal? Privilege escalation? That's impossible, there's no such thing as a privilege escalation vulnerability on Linux. Denial of service? What are you, some kind of script kiddie? No, the answer is obvious. You must read the uninitialized bytes of the kernel stack, since these bytes contain all the secrets of the universe and the meaning of life.

-- Dan Rosenberg


Embedded Linux Flag Version

As a result of discussions held at two recent embedded Linux summits (and reported back to the recent Kernel Summit), the community has decided to identify specific kernel versions as "flag versions" to try to reduce "version fragmentation". On the linux-embedded mailing list, Tim Bird (architecture group chair for the CE Linux Forum) has announced that 2.6.35 will be the first embedded flag version, and it will be supported by (at least) Sony, Google, MeeGo, and Linaro. "First, it should be explained what having a flag version means. It means that suppliers and vendors throughout the embedded industry will be encouraged to use a particular version of the kernel for software development, integration and testing. Also, industry and community developers agree to work together to maintain a long-term stable branch of the flag version of the kernel (until the next flag version is declared), in an effort to share costs and improve stability and quality."


FSFLA: Linux kernel is "open core"

The Free Software Foundation Latin America has released a version of the 2.6.36 kernel with offending firmware (or drivers that need that firmware) stripped out. They also are trying to tap into the ongoing discussion of "open core" business models. "Sad to say, Linux fits the definition of Free Bait or Open Core. Many believe that Linux is Free Software or Open Source, but it isn't. Indeed, the Linux-2.6.36 distribution published by Mr. Torvalds contains sourceless code under such restrictive licensing terms as 'This material is licensed to you strictly for use in conjunction with the use of COPS LocalTalk adapters', presented as a list of numbers in the corresponding driver, and 'This firmware may not be modified and may only be used with Keyspan hardware' and 'Derived from proprietary unpublished source code, Copyright Broadcom' in the firmware subdirectory, just to name a few examples."


Linaro 10.11 released

The Linaro project has announced the release of Linaro 10.11. "10.11 is the first public release that brings together the huge amount of engineering effort that has occurred within Linaro over the past 6 months. In addition to officially supporting the TI OMAP3 (Beagle Board and Beagle Board XM) and ARM Versatile Express platforms, the images have been tested and verified on a total of 7 different platforms including TI OMAP4 Panda Board, IGEPv2, Freescale iMX51 and ST-E U8500."


Netoops

By Jonathan Corbet
November 10, 2010
A kernel oops produces a fair amount of data which can be useful in tracking down the source of whatever went wrong. But that data is only useful if it can be captured and examined by somebody who knows how to interpret it. Capturing oops output can be hard; it typically will not make it to any logfiles in persistent storage. That's why we still see oops output posted in the form of a photograph taken of the monitor. Using cameras as a debugging tool can work for a desktop system, but it certainly does not scale to a data center containing thousands of systems. Google is thought to operate a site or two meeting that description, so it's not surprising to see an interest in better management of oops information there.

Google has had its own oops collection tool running internally for years; that has recently been posted for merging as netoops. Essentially, netoops is a simple driver which will, in response to a kernel oops, collect the most recent kernel logs and deliver them to a server across the net. The functionality seems useful, but the first version of the patch was questioned: netoops looks somewhat similar to the existing netconsole system, so it wasn't clear that a need for it exists. Why not just add any missing features to netconsole?

Mike Waychison, who posted the patch, responded with a number of reasons which have since found their way into the changelog. Netoops only sends data on an oops, so it is less hard on network bandwidth. The data is packaged in a more structured manner which is easier for machines and people to parse; that has enabled the creation of a vast internal "oops database" at Google. Netoops can cut off output after the first oops, once again saving bandwidth. And so on. There are enough differences that netconsole maintainer Matt Mackall agreed that it made sense for netoops to go in as a separate feature.

That said, there is clear scope for sharing some code between the two and, perhaps, improving netconsole in the process. The current version of the netoops patch includes new work to bring about that sharing. There seems to be no further opposition, but it's worth noting that Mike, in the patch changelog, notes that he's not entirely happy with either the user-space ABI or the data format. So this might be a good time for others interested in this sort of functionality to have a look and offer their suggestions and/or patches.


Kernel development news

Checkpoint/restart: it's complicated

By Jonathan Corbet
November 9, 2010
At the recent Kernel Summit checkpoint/restart discussion, developer Oren Laadan was asked to submit a trimmed-down version of the patch which would just show the modifications to existing core kernel code. Oren duly responded with a "naked patch" which, as one might have expected, kicked off a new round of discussion. What many observers may not have expected was the appearance of an alternative approach to the problem which has seemingly been under development for years. Now we have two clearly different ways of solving this problem but no apparent increase in clarity; the checkpoint/restart problem, it seems, is simply complicated.

The responses to Oren's patch will not have been surprising to anybody who has been following the discussion. Kernel developers are nervous about the broad range of core code which is changed by this patch. They don't like the idea of spreading serialization hooks around the kernel which, the authors' claims to the contrary notwithstanding, look like they could be a significant maintenance burden over time. It is clear that kernel checkpoint/restart can never handle all processes; kernel developers wonder where the real-world limits are and how useful the capability will be in the end. The idea of moving checkpointed processes between kernel versions by rewriting the checkpoint image with a user-space tool causes kernel hackers to shiver. And so on; none of these worries are new.

Tejun Heo raised all these issues and more. He also called out an interesting alternative checkpoint/restart implementation called DMTCP, which solves the problem entirely in user space. With DMTCP in mind, Tejun concluded:

I think in-kernel checkpointing is in awkward place in terms of tradeoff between its benefits and the added complexities to implement it. If you give up coverage slightly, userland checkpointing is there. If you need reliable coverage, proper virtualization isn't too far away. As such, FWIW, I fail to see enough justification for the added complexity.

As one might imagine, this post was followed by an extended conversation between the in-kernel checkpoint/restart developers and the DMTCP developers, who had previously not put in an appearance on the kernel mailing lists. It seems that the two projects were each surprised to learn of the other's existence.

The idea behind DMTCP is to checkpoint a distributed set of processes without any special support from the kernel. Doing so requires support from the processes themselves; a checkpointing tool is injected into their address spaces using the LD_PRELOAD mechanism. DMTCP is able to checkpoint (and, importantly, restart) a wide variety of programs, including those running in the Python or Perl interpreters and those using GNU Screen. DMTCP is also used to support the universal reversible debugger project. It is, in other words, a capable tool with real-world uses.

Kernel developers naturally like the idea of eliminating a bunch of in-kernel complexity and solving a problem in user space, where things are always simpler. The only problem is that, in this case, it's not necessarily simpler. There is a surprising amount that DMTCP can do with the available interfaces, but there are also some real obstacles. Quite a bit of information about a process's history is not readily available from user space, but that history is often needed for checkpoint/restart; consider tracking whether two file descriptors are shared as the result of a fork() call or not. To keep the requisite information around, DMTCP must place wrappers around a number of system calls. Those wrappers interpose significant new functionality and may change semantics in unpredictable ways.
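The need for wrappers can be illustrated with a small sketch. This is not DMTCP's code; the `FdTracker` class and its method names are hypothetical, and only `os.dup()`, `os.pipe()`, and `os.close()` are real calls. The point is simply that the history of a descriptor has to be recorded at the moment it is created, because it cannot easily be recovered afterward:

```python
import os


class FdTracker:
    """Hypothetical wrapper that remembers which fds were duplicated from which."""

    def __init__(self):
        # Maps each tracked fd to a group id; fds in the same group are
        # known to share one open file description.
        self._group = {}
        self._next_group = 0

    def track(self, fd):
        """Start tracking an fd that was opened independently."""
        self._group[fd] = self._next_group
        self._next_group += 1
        return fd

    def dup(self, fd):
        """Wrap os.dup() and record that the new fd shares state with the old one."""
        group = self._group[fd]  # raises KeyError for untracked fds
        new_fd = os.dup(fd)
        self._group[new_fd] = group
        return new_fd

    def close(self, fd):
        """Wrap os.close() and forget the descriptor."""
        os.close(fd)
        self._group.pop(fd, None)

    def shared(self, a, b):
        """Return True if both fds are tracked and belong to the same group."""
        return a in self._group and self._group.get(a) == self._group.get(b)
```

A real tool would also have to follow descriptors across fork() and exec(), which is where much of the complexity described here comes from.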

Pipes are hard for DMTCP to handle, so the pipe() wrapper has to turn them into full Unix-domain sockets. There is also an interesting dance required to get those sockets into the proper state at restart time. The handling of signals - not always straightforward even in the simplest of applications - is made more complicated by DMTCP, which also must reserve one signal (SIGUSR2 by default) for its own uses. The system call wrappers try to hide that signal handler from the application; there is also the little problem that signals which are pending at checkpoint time may be lost. Checkpointing will interrupt system calls, leading to unexpected EINTR returns; the wrappers try to compensate by automatically redoing the call when this happens. A second VDSO page must be introduced into a restarted process because it's not possible to control where the kernel places that page. There's a "virtual PID" layer which tries to fool restarted processes into thinking that they are still running with the same process ID they had when they were checkpointed.
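The "redo the call" approach to unexpected EINTR returns can be sketched as follows. This is a minimal illustration of the pattern rather than DMTCP's implementation; `retry_on_eintr` is a made-up helper name. In Python, an EINTR failure surfaces as `InterruptedError`:

```python
def retry_on_eintr(call, *args, **kwargs):
    """Repeat `call` until it completes without being interrupted by a signal."""
    while True:
        try:
            return call(*args, **kwargs)
        except InterruptedError:
            # errno EINTR: the call was interrupted before it finished,
            # so simply issue it again.
            continue
```

Python itself has retried most interrupted system calls automatically since version 3.5, so the wrapper is shown only to make the idea concrete. Blindly restarting is also not always correct; a call with a timeout, for example, would need to account for the time already spent.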

There is an interesting plan for restarting programs which have a connection to an X server: they will wrap Xlib (not a small interface) and use those wrappers to obtain the state of the window(s) maintained by the application. That state can then be recreated at restart time before reconnecting the application with the server. Meanwhile, applications talking to an xterm are forced to reinitialize themselves at restart time by sending two SIGWINCH signals to them. And so on.

Given all of that, it is not surprising that the kernel checkpoint/restart developers see their approach as being a simpler, more robust, and more general solution to the problem. To them, DMTCP looks like a shaky attempt to reimplement a great deal of kernel functionality in user space. Matt Helsley summarized it this way:

Frankly it sounds like we're being asked to pin our hopes on a house of cards -- weird userspace hacks involving extra processes, hodge-podge combinations of ptrace, LD_PRELOAD, signal hijacking, brk hacks, scanning passes in /proc (possibly at numerous times which begs for races), etc....

In contrast, kernel-based cr is rather straight forward when you bother to read the patches. It doesn't require using combinations of obscure userspace interfaces to intercept and emulate those very same interfaces. It doesn't add a scattered set of new ABIs.

Seasoned LWN readers will be shocked to learn that few minds appear to have been changed by this discussion. Most developers seem to agree that some sort of checkpoint/restart functionality would be a useful addition to Linux, but they differ on how it should be done. Some see a kernel-side implementation as the only way to get even close to a full solution to the problem and as the simplest and most maintainable option. Others think that the user-space approach makes more sense, and that, if necessary, a small number of system calls can be added to simplify the implementation. It has the look of the sort of standoff that can keep a project like this out of the kernel indefinitely.

That said, something interesting may happen here. One thing that became reasonably clear in the discussion is that a complete, performant, and robust checkpoint/restart implementation will almost certainly require components in both kernel and user space. And it seems that the developers behind the two implementations will be getting together to talk about the problem in a less public setting. With luck, determination, and enough beer, they might just figure out a way to solve the problem using the best parts of both approaches. That would be a worthy outcome by any measure.


ELCE: Grant Likely on device trees

By Jake Edge
November 10, 2010

Device trees are a fairly hot topic in the embedded Linux world as a means to more easily support multiple system-on-chip (SoC) devices with a single kernel image. Much of the work implementing device trees for the PowerPC architecture, as well as making that code more generic so that others could use it, has been done by Grant Likely. He spoke at the recent Embedded Linux Conference Europe (ELCE) to explain what device trees are, what they can do, and to update the attendees on efforts to allow the ARM architecture to use them.


All of the work that is going into adding device tree support for various architectures is not being done for an immediate benefit to users, Likely said. It is, instead, being done to make it easier to manage embedded Linux distributions, while simplifying the boot process. It will also make it easier to port devices (i.e. components and "IP blocks") to different SoCs. But it is "not going to make your Android phone faster".

A device tree is just a data structure that came from OpenFirmware. It represents the devices that are part of a particular system, such that it can be passed to the kernel at boot time, and the kernel can initialize and use those devices. For architectures that don't use device trees, C code must be written to add all of the different devices that are present in the hardware. Unlike desktop and server systems, many embedded SoCs do not provide a way to enumerate their devices at boot time. That means developers have to hardcode the devices, their addresses, interrupts, and so on, into the kernel.

The requirement to put all of the device definitions into C code is hard to manage, Likely said. Each different SoC variant has to have its own, slightly tweaked kernel version. In addition, the full configuration of the device is scattered over multiple C files, rather than kept in a single place. Device trees can change all of that.

A device tree consists of a set of nodes with properties, which are simple key-value pairs. The nodes are organized into a tree structure, unsurprisingly, and the property values can store arbitrary data types. In addition, there are some standard usage conventions for properties so that they can be reused in various ways. The most important of these is the compatible property that uniquely defines devices, but there are also conventions for specifying address ranges, IRQs, GPIOs, and so forth.

Likely used a simplified example from devicetree.org to show what these trees look like. They are defined with an essentially C-like syntax:

    / {
        compatible = "acme,coyotes-revenge";

        cpus {
            cpu@0 {
                compatible = "arm,cortex-a9";
            };
            cpu@1 {
                compatible = "arm,cortex-a9";
            };
        };

        serial@101F0000 {
            compatible = "arm,pl011";
        };
        ...
        external-bus {
            ethernet@0,0 {
                compatible = "smc,smc91c111";
            };

            i2c@1,0 {
                compatible = "acme,a1234-i2c-bus";
                rtc@58 {
                    compatible = "maxim,ds1338";
                };
            };
            ...

The compatible tags allow companies to define their own namespace ("acme", "arm", "smc", and "maxim" in the example) that they can manage however they like. The kernel already knows how to attach an ethernet device to a local bus or a temperature sensor to an i2c bus, so why redo it in C for every different SoC, he asked. By parsing the device tree (or the binary "flattened" device tree), the kernel can set up the device bindings that it finds in the tree.
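To make the "nodes with properties" structure concrete, here is a small sketch that rebuilds the visible parts of the example as nested dictionaries and walks them. The dictionary layout and the `node()`, `walk()`, and `vendors()` helpers are assumptions made for illustration; they are not how the kernel represents a device tree, and the portions elided with "..." in the example are simply left out:

```python
def node(compatible=None, children=None):
    """Build a node; this dict layout is an assumption for illustration."""
    properties = {} if compatible is None else {"compatible": compatible}
    return {"properties": properties, "children": children or {}}


# Only the parts of the example tree that are actually shown in the listing.
TREE = node("acme,coyotes-revenge", {
    "cpus": node(children={
        "cpu@0": node("arm,cortex-a9"),
        "cpu@1": node("arm,cortex-a9"),
    }),
    "serial@101F0000": node("arm,pl011"),
    "external-bus": node(children={
        "ethernet@0,0": node("smc,smc91c111"),
        "i2c@1,0": node("acme,a1234-i2c-bus", {
            "rtc@58": node("maxim,ds1338"),
        }),
    }),
})


def walk(tree, path="/"):
    """Yield (path, compatible) for every node with a compatible property."""
    compatible = tree["properties"].get("compatible")
    if compatible is not None:
        yield path, compatible
    for name, child in tree["children"].items():
        yield from walk(child, path.rstrip("/") + "/" + name)


def vendors(tree):
    """Return the sorted vendor prefixes used in compatible values."""
    return sorted({value.split(",", 1)[0] for _, value in walk(tree)})
```

Calling `vendors(TREE)` yields the four namespaces mentioned above: acme, arm, maxim, and smc.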

One of the questions that he often gets asked is: "why bother changing what we already have?" That is a "hard question to answer" in some ways, because for a lot of situations, what we have in the kernel currently does work. But in order to support large numbers of SoCs with a single kernel (or perhaps a small set of kernels), something like device tree is required. Both Google (for Android) and Canonical (for Linaro) are very interested in seeing device tree support for ARM.

Beyond that, "going data-driven to describe our platforms is the right thing to do". There is proof that it works in the x86 world as "that's how it's been done for a long time". PowerPC converted to device trees five years ago or so and it works well. There may be architectures that won't need to support multiple devices with a single kernel, and device trees may not be the right choice for those, but for most of the architectures that Linux supports, Likely clearly thinks that device trees are the right solution.

He next looked at what device trees aren't. They don't replace board-specific code, and developers will "still have to write drivers for weird stuff". Instead, device trees simplify the common case. Device tree is also not a boot architecture; it is "just a data structure". Ideally, the firmware will pass a device tree to the kernel at boot time, but it doesn't have to be done that way. The device tree could instead be built into the kernel image. There are plenty of devices with firmware that doesn't know about device trees, Likely said, and they won't have to.

There is currently a push to get ARM devices into servers, as they can provide lots of cores at low power usage. In order to facilitate that, there needs to be one CD that can boot any of those servers, as is the case in the x86 world. Device trees are what will be used to make that happen, Likely said.

Firmware that does support device trees will obtain a .dtb (i.e. flattened device tree binary) file from somewhere in memory, and either pass it verbatim to the kernel or modify it before passing. Another option would be for the firmware to create the .dtb on-the-fly, which is what OpenFirmware does, but that is a "dangerous" option. It is much easier to change the kernel than the firmware, so any bugs in the firmware's .dtb creation code will inevitably be worked around in the kernel. In any case, the kernel doesn't care how the .dtb is created.

For ARM, the plan is to pass a device tree, rather than the existing, rather inflexible ARM device configuration known as ATAGs. The kernel will set up the memory for the processor and unflatten the .dtb into memory. It will unpack it into a "live tree" that can then be directly dereferenced and used by the kernel to register devices.

The Linux device model is also tree-based, and there is some congruence between device tree and the device model, but there is not a direct 1-to-1 mapping between them. That was done "quite deliberately" as the design goal was "not to describe what Linux wants"; instead, it was meant to describe the hardware. Over time, the Linux device model will change, so hardcoding Linux-specific values into the device tree has been avoided. The device tree is meant to be used as support data, and the devices it describes get registered using the Linux device model.

Device drivers will match compatible property values with device nodes in a device tree. It is the driver that will determine how to configure the device based on its description in a device tree. None of that configuration code lives in the device tree handling; it is part of the drivers, which can then be built as loadable kernel modules.
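The matching step can be sketched as a lookup from compatible strings to driver probe functions. Everything here is hypothetical (the `driver` decorator, the probe functions, and the flat `NODES` list, which reuses paths from the example tree); the kernel does this in C with its own match tables, so this is an analogy and nothing more:

```python
# (path, compatible) pairs taken from the example tree shown earlier.
NODES = [
    ("/serial@101F0000", "arm,pl011"),
    ("/external-bus/ethernet@0,0", "smc,smc91c111"),
    ("/external-bus/i2c@1,0/rtc@58", "maxim,ds1338"),
]

DRIVERS = {}  # compatible string -> probe function


def driver(*compatible_strings):
    """Register a probe function for one or more compatible strings."""
    def register(probe):
        for value in compatible_strings:
            DRIVERS[value] = probe
        return probe
    return register


@driver("arm,pl011")
def probe_uart(path):
    # The driver, not the tree-handling code, decides how to set up the device.
    return f"uart bound to {path}"


@driver("maxim,ds1338")
def probe_rtc(path):
    return f"rtc bound to {path}"


def bind_all(nodes):
    """Probe every node that has a registered driver; report the rest."""
    bound, unbound = {}, []
    for path, compatible in nodes:
        probe = DRIVERS.get(compatible)
        if probe is None:
            unbound.append(path)  # e.g. the driver module is not loaded
        else:
            bound[path] = probe(path)
    return bound, unbound
```

In this sketch the Ethernet node stays unbound because no driver has registered for "smc,smc91c111", much as a device would sit idle until its module is loaded.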

Over the last year, Likely has spent a lot of time making the device tree support generic. Previously, there were three separate copies of much of the support code (for Microblaze, SPARC, and PowerPC). He has removed any endian dependencies so that any architecture can use device trees. Most of that work is now done and in the mainline. There is some minimal board support that has not yet been mainlined. The MIPS architecture has added device tree support as of 2.6.37-rc1 and x86 was close to getting it for 2.6.37, but some last minute changes caused the x86 device tree support to be held back until 2.6.38.

The ARM architecture still doesn't have device tree support and ARM maintainer Russell King is "nervous about merging an unmaintainable mess". King is taking a wait-and-see approach until a real ARM board has device tree support. Likely agreed with that approach and ELCE provided an opportunity for him and King to sit down and discuss the issue. In the next six months or so (2.6.39 or 2.6.40), Likely expects that the board support will be completed and he seems confident that ARM device tree support in the mainline won't be far behind.

There are other tasks to complete in addition to the board support, of course, with documentation being high on that list. There is a need for documentation on how to use device trees, and on the property conventions that are being used. The devicetree.org wiki is a gathering point for much of that work.

There were several audience questions that Likely addressed, including the suitability of device tree for Video4Linux (very suitable and the compatible property gives each device manufacturer its own namespace), the performance impact (no complaints, though he hasn't profiled it — device trees are typically 4-8K in size, which should minimize their impact), and licensing or patent issues (none known so far, the code is under a BSD license so it can be used by proprietary vendors — IBM's lawyers don't seem concerned). Overall, both Likely and the audience seemed very optimistic about the future for device trees in general and specifically for their future application in the ARM architecture.


A more detailed look at kernel regressions

By Jake Edge
November 10, 2010

The number of kernel regressions over time is one measure of the overall quality of the kernel. Over the last few years, Rafael Wysocki has taken on the task of tracking those regressions and regularly reporting on them to the linux-kernel mailing list. In addition, he has presented a "regressions report" at the last few Kernel Summits [2010, 2009, and 2008]. As part of his preparation for this year's talk, Wysocki wrote a paper, Tracking of Linux Kernel Regressions [PDF], that digs in deeply and explains the process of Linux regression tracking, along with various trends in regressions over time. This article is an attempt to summarize that work.

A regression is a user-visible change in the behavior of the kernel between two releases. A program that was working on one kernel version and then suddenly stops working on a newer version has detected a kernel regression. Regressions are probably the most annoying kind of bug that crops up in the kernel development process, as well as one of the most visible. In addition, Linus Torvalds has decreed that regressions may not be intentionally introduced (to fix a perceived kernel shortcoming, for example) and that fixing inadvertent regressions should be a high priority for the kernel developers.

There is another good reason to concentrate on fixing any regressions: if you don't, you really have no assurance that the overall quality of the code is increasing, or at least staying the same. If things that are currently working continue to work in the future, there is a level of comfort that the bug situation is, at least, not getting worse.

Regression tracking process

To that end, various efforts have been made to track kernel regressions, starting with Adrian Bunk in 2007 (around 2.6.20), through Michał Piotrowski, and then to Wysocki during the 2.6.23 development cycle. For several years, Wysocki handled the regression tracking himself, but it is now a three-person operation, with Maciej Rutecki turning email regression reports into kernel bugzilla entries, and Florian Mickler maintaining the regression entries: marking those that have been fixed, working with the reporters to determine which have been fixed, and so on.

The kernel bugzilla is used to track the regression meta-information as well as the individual bugs. Each kernel release has a bugzilla entry that tracks all of the individual regressions that apply to it. So, bug #16444 tracks the regressions reported against the 2.6.35 kernel release. Each individual regression is listed in the "Depends on" field in the meta-bug, so that a quick look will show all of the bugs, and which have been closed.

There is another meta-bug, bug #15790, that tracks all of the release-specific meta-bugs. So, that bug depends on #16444 for 2.6.35, as well as #21782 for 2.6.36, #15310 for 2.6.33, and so on. Those bugs are used by the scripts that Wysocki runs to generate the "list of known regressions" which gets posted to linux-kernel after each -rc release.
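The meta-bug arrangement amounts to a two-level dependency graph, which a script can walk to produce a per-release list. The sketch below is only a guess at the general shape of such a script, not Wysocki's tooling. Bug numbers 15790, 16444, 21782, and 15310 come from the text above; the individual regression numbers (90001 and up) and their states are invented for illustration:

```python
# Top-level meta-bug -> per-release meta-bugs -> individual regressions.
DEPENDS_ON = {
    15790: [16444, 21782, 15310],
    16444: [90001, 90002, 90003],  # hypothetical regression ids
    21782: [90004],                # hypothetical regression id
    15310: [],
}

RELEASE = {16444: "2.6.35", 21782: "2.6.36", 15310: "2.6.33"}

# Hypothetical states for the made-up regressions.
STATE = {90001: "closed", 90002: "resolved", 90003: "new", 90004: "new"}


def known_regressions(top=15790):
    """Map each release to the regressions that have not been closed."""
    report = {}
    for meta in DEPENDS_ON[top]:
        report[RELEASE[meta]] = [
            bug for bug in DEPENDS_ON.get(meta, [])
            if STATE.get(bug) != "closed"
        ]
    return report
```

Only "closed" bugs are dropped here; that choice treats a bug with a patch available but not yet merged as still outstanding.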

Regressions are added to bugzilla one week after they are reported by email, if they haven't been fixed in the interim. That's a change from earlier practices to save Rutecki's time as well as to reduce unhelpful noise. Bugzilla entries are linked to fixes as they become available. The bug state is changed to "resolved" once a patch is available and "closed" once Torvalds merges the fix into the mainline.

Regressions for a particular kernel release are tracked through the following two development cycles. For example, when 2.6.36 was released, the tracking of 2.6.34 regressions ended. When 2.6.37-rc1 was released, that began the tracking for 2.6.36, and once 2.6.37 is released in early 2011, tracking of 2.6.35 regressions will cease. That doesn't mean that any remaining regressions have magically been fixed, of course, and they can still be tracked using the meta-bug associated with a release.
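The tracking rule can be written down in a couple of lines. The helper below is a hypothetical illustration which assumes that release names keep the 2.6.N form used in the examples above:

```python
def tracking_window(release):
    """Return (start, end): the -rc1 that opens tracking and the release that ends it."""
    prefix, minor = release.rsplit(".", 1)
    n = int(minor)
    # Tracking for 2.6.N starts with 2.6.(N+1)-rc1 and stops when 2.6.(N+2) is released.
    return f"{prefix}.{n + 1}-rc1", f"{prefix}.{n + 2}"
```

For 2.6.35 this gives a window running from 2.6.36-rc1 until the release of 2.6.37, which agrees with the example in the text.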

Regression statistics

To look at the historical regression data, Wysocki compiled a table that listed the number of regressions reported for each of the last ten kernel releases as well as the number that are still pending (i.e. have not been closed). For the table, he has removed invalid and duplicate reports from those listed in bugzilla. It should also be noted that after 2.6.32, the methodology for adding new regressions changed such that those that were fixed in the first week after being reported were not added to bugzilla. That at least partially explains the drop in reports after 2.6.32.

Kernel    # reports    # pending
2.6.26          180            1
2.6.27          144            4
2.6.28          160           10
2.6.29          136           12
2.6.30          177           21
2.6.31          146           20
2.6.32          133           28
2.6.33          116           18
2.6.34          119           15
2.6.35           63           28
Total          1374          157
Reported and pending regressions
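As a quick check on the table, the following sketch recomputes the totals and the share of reports still pending for a given release. The numbers are copied from the table; the `pending_share()` helper is just a convenience added here:

```python
# kernel -> (reports, pending), copied from the table above
REGRESSIONS = {
    "2.6.26": (180, 1),
    "2.6.27": (144, 4),
    "2.6.28": (160, 10),
    "2.6.29": (136, 12),
    "2.6.30": (177, 21),
    "2.6.31": (146, 20),
    "2.6.32": (133, 28),
    "2.6.33": (116, 18),
    "2.6.34": (119, 15),
    "2.6.35": (63, 28),
}

total_reports = sum(reports for reports, _ in REGRESSIONS.values())
total_pending = sum(pending for _, pending in REGRESSIONS.values())


def pending_share(kernel):
    """Percentage of a release's reported regressions that are still pending."""
    reports, pending = REGRESSIONS[kernel]
    return 100.0 * pending / reports
```

The sums come out to 1374 and 157, matching the "Total" row. Roughly 44% of the 2.6.35 reports are still pending, which is unsurprising for a release that is still being tracked.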

The number of "pending" regressions reflects the bugs that have been fixed since the release, not just those that were fixed during the two-development-cycle tracking period. In order to look more closely at what happens during the tracking period, Wysocki provides another table. That table separates the two most important events during the tracking period, which are the releases of the subsequent kernel versions (i.e. for 2.6.N, the releases of N+1 and N+2).

For example, once the 2.6.35 kernel was released, that ended the period where the development focus was on fixing regressions in 2.6.34. At that point, the merge window for 2.6.36 opened and developers switched their focus to adding new features for the next release. Furthermore, once 2.6.36 was released, regressions were no longer tracked at all for 2.6.34. That is reflected in the following table where the first "reports" and "pending" columns correspond to the N+1 kernel release, and the second to the N+2 release.

Kernel    # reports (N+1)    # pending (N+1)    # reports (N+2)    # pending (N+2)
2.6.30                122                 36                170                 45
2.6.31                 89                 31                145                 42
2.6.32                101                 36                131                 45
2.6.33                 74                 33                114                 27
2.6.34                 87                 31                119                 21
2.6.35                 61                 28
Reported and pending regressions (separated by release)

The table shows that the number of regressions still goes up fairly substantially after the release of the next (N+1) kernel. This indicates that the -rc kernels may not be getting as much testing as the released kernel does. In addition, the pending regression numbers are substantially higher for the N+2 kernel release, at least in the 2.6.30-32 timeframe. Had that trend continued, it could be argued that the kernel developers were paying less attention to regressions in a particular release once the next release was out. But the 2.6.33-34 numbers are fairly substantially down after the N+2 release, and Wysocki says that there are indications that 2.6.35 is continuing that trend.

Reporting and fixing regressions

[Open regressions graph]

We can look at the number of outstanding regressions over time in one of the graphs from Wysocki's paper. For each kernel release, there are generally two peaks that indicate where the number of open regressions is highest. These roughly correspond with the end of the merge window and the release date for the next kernel version. Once past those maximums, the graphs tend to level out.

There are abrupt jumps in the number of regressions that are probably an artifact of how the reporting is done. Email reports are generally batched up, with multiple reports being added at roughly the same time. Maintenance on the bugs can happen in much the same way, which results in multiple regressions closed in a short period of time. That leads to a much more jagged graph, with sharper peaks.

In the paper, Wysocki did some curve fitting for the 2.6.33-34 releases that corresponded reasonably well with the observed data. He noted that the incomplete 2.6.35 curve was anomalous in that it didn't have a sharp maximum and seemed to plateau, rather than drop off. He attributes that to the shortened merge window for 2.6.37 along with the Kernel Summit and Linux Plumbers Conference impacting the testing and debugging of the current development kernels. Nevertheless, he used the same curve fitting equations on the 2.6.35 data to derive a "prediction" that it would end up with slightly more regressions than .33 and .34, but still less than 30. It will be interesting to see if that is borne out in practice.

Regression lifetime

[Lifetime graph]

The lifetime of regressions is another area that Wysocki addresses. One of his graphs is reproduced above and shows the cumulative number of regressions whose lifetime is less than the number of days on the x-axis. He separates the regressions into two sets, those from kernel 2.6.26-30 and from 2.6.30-35. In both cases, the curves follow that of radioactive decay, which allows for the derivation of the half-life for a set of kernel regressions: roughly 17 days.

The graph for 2.6.30-35 is obviously lower than that of the earlier kernels, which Wysocki attributes to the change in methodology that occurred in the 2.6.32 timeframe. Because fewer short-lived (i.e. less than a week) regressions are now tracked, the average regression lifetime comes out higher. The average for the earlier kernels is 24.4 days, while the later kernels have an average of 32.3 days. Wysocki posits that the average really hasn't changed and that 24.5 days is a reasonable number to use as an average lifetime for regressions over the past two years or so.
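The 17-day half-life and the 24.5-day average can be related with a short calculation. The sketch below assumes a pure exponential decay model, which is what the radioactive-decay comparison suggests; the function names are illustrative and are not taken from Wysocki's paper.

```python
import math

HALF_LIFE_DAYS = 17.0  # half-life quoted in the article


def mean_lifetime(half_life):
    """Mean lifetime of an exponential decay: half_life / ln(2)."""
    return half_life / math.log(2)


def fraction_closed(days, half_life=HALF_LIFE_DAYS):
    """Fraction of regressions expected to be closed within `days`,
    assuming (hypothetically) a pure exponential decay."""
    return 1.0 - 0.5 ** (days / half_life)


if __name__ == "__main__":
    print(f"mean lifetime: {mean_lifetime(HALF_LIFE_DAYS):.1f} days")
    for d in (7, 17, 34, 60):
        print(f"closed within {d:2d} days: {fraction_closed(d):.0%}")
```

Under that model, a 17-day half-life corresponds to a mean lifetime of about 24.5 days, which is consistent with the average Wysocki settles on.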

Regressions by subsystem

Certain kernel subsystems have been more prone to regressions than others over the last few releases, as is shown in a pair of tables from Wysocki's paper. He cautions that it is somewhat difficult to accurately place regressions into a particular category, as they may be incorrectly assigned in bugzilla. There are also murky boundaries between some of the categories, with power management (PM) being used as an example. Bugs that clearly fall into the PM core, or those that are PM-related but whose root cause is unknown, get assigned to the PM category, while bugs in a driver's suspend/resume code get assigned to the category of the driver. Wysocki notes that these numbers should be used as a rough guide to where regressions are being found, rather than as an absolute and completely accurate measure.

Category 2.6.32 2.6.33 2.6.34 2.6.35 Total
DRI (Intel) 20 7 10 12 49
x86 9 13 21 6 49
Filesystems 7 12 8 8 35
DRI (other) 10 7 10 5 32
Network 12 8 6 4 30
Wireless 6 6 11 4 27
Sound 8 9 4 2 23
ACPI 7 9 3 2 21
SCSI & ATA 4 2 2 2 10
MM 2 3 4 0 9
PCI 3 4 1 1 9
Block 2 1 3 2 8
USB 3 0 0 3 6
PM 4 2 0 0 6
Video4Linux 1 3 1 0 5
Other 35 30 35 12 112
Reported regressions by category

The Intel DRI driver and x86 categories are by far the largest sources of regressions, but there are a number of possible reasons for that. The Intel PC ecosystem is both complex, with many different variations of hardware, and well-tested because there are so many of those systems in use. Other architectures may not be getting the same level of testing, especially during the -rc phase.

It is also clear from the table that those subsystems that are "closer" to the hardware tend to have more regressions. The eight rows with 20 or more total regressions—excepting filesystems and networking to some extent—are all closely tied to hardware. Those kinds of regressions tend to be easier to spot because they cause the hardware to fail, unlike regressions in the scheduler or memory management code, for example, which are often more subtle.

Category 2.6.32 2.6.33 2.6.34 2.6.35 Total
DRI (Intel) 1 2 2 5 10
x86 2 2 3 2 9
DRI (other) 1 3 2 3 9
Sound 5 2 0 1 8
Network 2 2 1 2 7
Wireless 1 1 1 2 5
PM 4 1 0 0 5
Filesystems 0 0 0 5 5
Video4Linux 1 3 0 0 4
SCSI + SATA 2 0 1 0 3
MM 1 0 1 0 2
Other 8 2 4 8 22
Pending regressions by category

It is also instructive to look at the remaining pending regressions by category. In the table above, we can see that most of the regressions identified have been fixed, with relatively few persisting. Those are likely to be bugs that are difficult to reproduce, and thus difficult to track down. Some categories, like ACPI, fall completely out of the table, which indicates that those developers have been very good at finding and fixing regressions in that subsystem.
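To put numbers on "most of the regressions identified have been fixed", the totals from the two tables can be combined into a per-category share of closed regressions. This is only a rough sketch under two assumptions that the article does not state outright: a category missing from the pending table is taken to have no pending regressions (rather than being folded into "Other"), and "SCSI + SATA" in the pending table is treated as the same category as "SCSI & ATA" in the reported table. The names used are illustrative.

```python
# "Total" columns transcribed from the two tables.
REPORTED_TOTAL = {
    "DRI (Intel)": 49, "x86": 49, "Filesystems": 35, "DRI (other)": 32,
    "Network": 30, "Wireless": 27, "Sound": 23, "ACPI": 21,
    "SCSI & ATA": 10, "MM": 9, "PCI": 9, "Block": 8, "USB": 6, "PM": 6,
    "Video4Linux": 5, "Other": 112,
}
PENDING_TOTAL = {
    "DRI (Intel)": 10, "x86": 9, "DRI (other)": 9, "Sound": 8,
    "Network": 7, "Wireless": 5, "PM": 5, "Filesystems": 5,
    "Video4Linux": 4,
    "SCSI & ATA": 3,  # listed as "SCSI + SATA" in the pending table
    "MM": 2, "Other": 22,
}


def closed_share(reported, pending):
    """Return {category: fraction of reported regressions no longer pending}."""
    shares = {}
    for name, total in reported.items():
        # Assumption: a category absent from the pending table has none open.
        still_open = pending.get(name, 0)
        shares[name] = (total - still_open) / total
    return shares


if __name__ == "__main__":
    shares = closed_share(REPORTED_TOTAL, PENDING_TOTAL)
    for name, share in sorted(shares.items(), key=lambda item: item[1]):
        print(f"{name:12s} {share:.0%} closed")
    overall = 1 - sum(PENDING_TOTAL.values()) / sum(REPORTED_TOTAL.values())
    print(f"overall      {overall:.0%} closed")
```

With those assumptions, 89 of the 431 reported regressions remain pending, so roughly four out of five have been closed. Small categories such as PM and Video4Linux show a much lower closed share, though with so few reports the percentages there are not very meaningful.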

Conclusion

Regression tracking is important so that kernel developers are able to focus their bug fixing efforts during each development cycle. But looking at the bigger picture—how the number and types of regressions change—is also needed. Given the nature of kernel development, it is impossible to draw any conclusions from the data collected for any single release. By aggregating data over multiple development cycles, any oddities specific to a particular cycle are smoothed out, which allows for trends to be spotted.

Since regressions are a key indicator of kernel quality, and easier to track than many others, they play an important role in keeping Torvalds and other kernel developers aware of kernel quality issues. As the developers get more familiar with the "normal" regression patterns, it will become more obvious when a given release is falling outside of those patterns, which may mean that it needs more attention, or that something has changed in the development process. In any case, there is clearly value in the statistics, and that value is likely to grow over time.

Page editor: Jonathan Corbet

Copyright © 2010, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds