Kernel development

Kernel release status

The current 2.6 development kernel is 2.6.27-rc7, released on September 21. "All the changes are small - the biggest individual ones are literally things like a few m68k defconfig changes and the trivial cleanups in the MAINTAINERS file." Details can be found in the long-format changelog.

Several dozen fixes have been merged into the mainline git repository since the 2.6.27-rc7 release.

The linux-next tree has abruptly taken a break; it is expected to return around October 13.

Kernel development news

Quotes of the week

And exactly as in the theory of relativity, two people on different CPU's can actually validly _disagree_ about the ordering of the same event. There are things that act as "light-cones" and are borders for what everybody can agree on, but basically, in the absence of explicit locks, it is very possible that no such thing as "ordering" may even exist.
-- Linus Torvalds

-	early_printk("Kernel really alive\n");
+	early_printk("Kernel really alive! It's alive! IT'S ALIIIIIIIIIVE!\n");
-- Bill Nottingham

Still experimental, not for inclusion, but given that I am now finding more bugs in the rest of Linux than in this code, I suspect that it is getting close.
-- Paul McKenney (Thanks to Steven Rostedt)

e1000e and the joy of development kernels

By Jonathan Corbet
September 24, 2008
The 2.6.27-rc regression list posted on September 21 contains - deep within the list - an entry reading "e1000e: 2.6.27-rc1 corrupts EEPROM/NVM". One might be forgiven for missing it; the list of regressions is still (unfortunately) long, and there is nothing there to indicate that it is a notable problem. But it is: this particular bug goes beyond breaking networking; when it bites, it corrupts the EEPROM on the device, causing it to cease to function forevermore (or, at least, until the user can manage to flash the EEPROM with working code). This is a problem which is worth fixing.

As of this writing, though, nobody seems to know what the problem is. There was some confusion resulting from the fact that the related e1000 driver also suffered from an EEPROM corruption problem - but that turns out to have been an entirely different bug. The e1000 problem was fixed by putting a lock around accesses to the EEPROM, preventing corruption caused by concurrent access. But something else is going on with the e1000e.
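The kind of fix described for e1000, serializing every EEPROM access behind one lock, can be illustrated with a small user-space sketch. This is not the driver's code: the `Eeprom` class, `hammer()`, and `run_demo()` are hypothetical names invented for the illustration, and Python threads stand in for concurrent kernel code paths.

```python
import threading


class Eeprom:
    """Toy stand-in for a device EEPROM (hypothetical, not the e1000 data structure)."""

    def __init__(self, size):
        self._words = [0] * size
        self._lock = threading.Lock()  # one lock guards every access

    def read(self, offset):
        with self._lock:
            return self._words[offset]

    def update(self, offset, func):
        # The read-modify-write happens inside a single critical section,
        # so two threads can never interleave their read and write steps.
        with self._lock:
            self._words[offset] = func(self._words[offset])


def hammer(eeprom, offset, rounds):
    """Increment one EEPROM word repeatedly."""
    for _ in range(rounds):
        eeprom.update(offset, lambda word: word + 1)


def run_demo(threads=4, rounds=10000):
    """Run several writers against the same word and return its final value."""
    eeprom = Eeprom(size=8)
    workers = [
        threading.Thread(target=hammer, args=(eeprom, 0, rounds))
        for _ in range(threads)
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return eeprom.read(0)
```

With the lock in place, `run_demo()` always returns `threads * rounds`; without it, interleaved updates could lose writes, which is the user-space analogue of the corruption that concurrent EEPROM access can cause.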

Figuring out what that "something else" is appears to be a challenge. The problem is not readily reproducible, and there is this little problem that triggering the bug more than once requires the replacement of the affected hardware. It's not even clear which kernel versions are affected, though it appears that only the 2.6.27 development series shows the bug. There is some correlation between e1000e corruptions and graphics driver crashes, leading David Miller to pursue a hypothesis that the real culprit is changes to the X server, but that idea has not yet been proven. Other developers suspect a concurrency-related problem similar to the e1000 bug.

As of this writing, the bulk of what is known can be found in this advisory from Mandriva. Kernel developers are adding information to the kernel bugzilla entry as they find it.

It has been suggested that anybody running 2.6.27 on a potentially affected system might want to save a copy of the current EEPROM contents with a command like:

    ethtool -e eth0 > eth0.eeprom

(That assumes, of course, that the relevant device is eth0 on your system.) With the saved data, it should be possible to recover the device if the worst happens; without it, chances are that victims will have to return their systems to the vendor.
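A saved dump is only useful if it can be compared against the device's current contents later on. The sketch below is one way to do that, assuming the text layout that `ethtool -e` normally prints (a header, a separator line, then lines of the form `0x0000:` followed by hex bytes); the function names `parse_eeprom_dump()` and `first_difference()` are hypothetical helpers, not part of ethtool.

```python
def parse_eeprom_dump(text):
    """Return the EEPROM contents as bytes from `ethtool -e` style text.

    Assumes each data line looks like "0x0010:  ff ff 00 ..."; any line
    without an "0x...:" offset prefix (header, separator) is skipped.
    """
    data = bytearray()
    for line in text.splitlines():
        offset, sep, values = line.partition(":")
        if not sep or not offset.strip().lower().startswith("0x"):
            continue
        data.extend(int(token, 16) for token in values.split())
    return bytes(data)


def first_difference(saved, current):
    """Return the offset of the first differing byte, or None if identical."""
    for index, (old, new) in enumerate(zip(saved, current)):
        if old != new:
            return index
    if len(saved) != len(current):
        # One dump is a strict prefix of the other.
        return min(len(saved), len(current))
    return None
```

Reading `eth0.eeprom` and a fresh dump through `parse_eeprom_dump()` and passing both to `first_difference()` would show whether, and where, the contents have changed since the backup was taken.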

In one sense, this bug demonstrates that the system works. It was caught while the kernel was still in the stabilization phase; one can be certain that it will be obliterated somehow before any stable 2.6.27 release comes out. On the other hand, the first report of this problem hit the net on August 8; the problem was known for over a month before distributors started responding to it and the all-out hunt for the cause began. That is a long time for any regression to persist, but it is especially long when one is dealing with a regression which has the ability to regress hardware back to a stone-age state.

The distributors have now responded; most of them have withdrawn kernels with the affected drivers. So far, nobody has posted tools to help affected users recover their hardware (suggestions to use ibautil should be ignored and forgotten about as soon as possible). Such a tool is forthcoming, but it would be hard to blame the relevant engineers for focusing on fixing the problem first. With any luck at all, the root cause will have been isolated by the time you read this.

There is one thing that will not have changed, though. Testers of unstable software - especially the kernel - have often been warned that said software can do all kinds of terrible things to their systems. It is easy to ignore those warnings; even -rc1 kernels actually work for most people, most of the time. But, as we have seen in this case, the potential for catastrophic bugs is real. Development code can brick your network adapter, scramble your filesystems, open up severe security holes, or save your documents as OOXML. When experimenting with unstable code - even if it has been neatly packaged by your distributor - it is always prudent to have good backups and an even better sense of humor.

LPC: The future of Linux graphics

By Jonathan Corbet
September 24, 2008
On the final day of the Linux Plumbers Conference, Keith Packard ran a microconference dedicated to future displays. A number of topics were discussed there, but the key session had to do with the near-term future of Linux video drivers. Longtime LWN readers will be more than familiar with the story: Linux has multiple subsystems charged with managing graphics hardware, the user-space driver model adopted by XFree86 leads to all kinds of problems, support for 3D graphics is not what it should be, etc. That whole story was recounted here, but with a notable difference: solutions are in the final stabilization stages, and these problems will soon be history.

There are two major components to the work which is being done: graphics memory management and kernel-based mode setting. A contemporary graphics processor (GPU) is really a CPU in all respects, including the possession of a sophisticated memory management unit. Managing the sharing of memory between user space, the kernel, and the GPU is fundamental to the implementation of correct, high-performance graphics. One year ago, the TTM subsystem looked like the solution to the memory management problem, but TTM grew increasingly unworkable as the understanding of the problem improved. So now the Graphics Execution Manager (GEM) code looks like the way forward; it is currently being prepared for merging into the mainline kernel.

Kernel-based mode setting, instead, is meant to get user-space code out of the business of messing around directly with the hardware. Putting the kernel in charge of the configuration of the video adapter has a long list of advantages. Suspend and resume have a much better chance of working, for example. Once the X server stops accessing hardware directly, it no longer needs to run as root; having that much untrusted code running with full privileges has made people nervous for many years. In the current scheme, the kernel cannot change the graphics mode if it needs to; that means that, for example, if the system panics, a graphical user will never see the message. With kernel-based mode setting, the kernel can switch to a different mode and allow the user to frantically try to read the message before it scrolls off the screen. Kernel-based mode setting will also make fast user switching work much better, without the need to use a separate virtual terminal for each user session.

One of the first topics of discussion was: how does the kernel decide when to switch to the panic screen to show the user an important message? There are quite a few different paths by which the kernel can indicate distress; should a kernel message be presented every time a WARN_ON() condition is encountered? There would appear to be a need to unify the error paths in the kernel to help simplify this kind of decision. Jesse Barnes suggested that the kernel could simply switch on every message emitted with printk(), on the theory that such a policy would lead to a rapid and welcome reduction in kernel verbosity.

The real debate in this session, though, had to do with development process. As has been discussed previously on LWN, much of the video driver work is done outside of the mainline kernel tree. We are now seeing a big chunk of that work being prepared for a merge. But the new mode setting interface is a big API change which will require adjustments from user space; a new kernel expecting to handle mode setting may not give the best results when run with an older user space X server. So there will be a big flag day of sorts when everything changes and all of the new code gets run for the first time.

Linus is not pleased with the notion of a video graphics flag day; he made a long appeal for a more incremental approach to fixing the video driver work. In his opinion, the flag day will lead to a whole bunch of untested code being made active all at once; there will certainly be design mistakes which show up, and the whole thing will fail to work properly. At which point another flag day will be required. Linus was not impressed by the claim that Fedora users have selflessly been testing this code for everybody; in his view, the kernel developers are not doing this testing. He sees the whole thing as a recipe for disaster.

The real problem - and the reason for the out-of-tree development - is that all of this work requires the creation of a number of new, complex user-space ABIs. That is true for both mode setting and memory management, and the two cannot be easily separated from each other. Until the combination as a whole is seen to work, the video driver developers simply cannot commit themselves to a stable user-space interface - and that means that their code cannot be merged.

As an example, TTM was cited. Had that code been pushed when it looked like the right solution, there would now be even bigger problems to solve.

In summary, the graphics developers believe that the approach they are taking is as incremental as they can make it. Whether they convinced Linus of that fact is unclear, but he eventually seemed to accept the plan. He did ask for them to push the mode setting code upstream first, but that code cannot work without memory management support. So GEM will go into the mainline ahead of kernel-based mode setting. Once everything is in the kernel, it will be possible to boot a system with either kernel-based or user-space mode setting, so both new and old distributions will be supported. Someday, in the distant future, support for mode setting in user space can be removed. Much sooner than that, though, we should all be running much-improved graphics code and will have long since forgotten how things used to be.

Newer kernels and older SELinux policies

By Jake Edge
September 24, 2008

A subtle change in 2.6.25 recently left Andrew Morton with a less than completely functioning system, but it also demonstrated a user-space interface that may sometimes be overlooked: SELinux. The problem stemmed from a change to facilitate containers by making /proc/net into a symbolic link, which tripped up SELinux policies that had been written for earlier kernels. Putting policy into user space is a guiding principle of kernel development, but that can sometimes lead to an unexpected need to keep those policies synchronized with the kernel.

The change itself was fairly minor: /proc/net became a symbolic link to /proc/self/net so that containers would only see their own network devices, rather than those of the enclosing system. But when Morton ran a recent kernel on his Fedora Core 5 and 6 systems, he got:

    sony:/home/akpm> ifconfig -a
    Warning: cannot open /proc/net/dev (Permission denied). Limited output.

Further investigation found that even ls got permission errors when looking at /proc/net. As is usual with mysterious "permission denied" errors, SELinux was the underlying cause.

When the change was made, back in March, it was reviewed by the SELinux developers, but no one noticed that it would cause an additional permission check—on the symbolic link itself. So, when resolving things like /proc/net/dev or other entries in that directory, the "labels" on the symbolic link were checked. Of course, /proc is a synthetic filesystem, so the labels are generated from SELinux code rather than retrieved from extended attributes (xattrs).
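To see which components of a path are symbolic links, and therefore where an extra per-link check of this kind could apply, a short user-space sketch can walk the path prefix by prefix. The helper name `symlinks_on_path()` is hypothetical; it only inspects the literal prefixes of the path it is given and does not chase links found inside link targets.

```python
import os


def symlinks_on_path(path):
    """Return (prefix, target) pairs for each prefix of path that is a symlink."""
    links = []
    current = os.sep if os.path.isabs(path) else ""
    for part in [p for p in path.split(os.sep) if p]:
        current = os.path.join(current, part)
        # islink() examines the final component itself without following it,
        # while any earlier symlinks in the prefix are resolved as usual.
        if os.path.islink(current):
            links.append((current, os.readlink(current)))
    return links
```

On a kernel with the change described above, calling `symlinks_on_path("/proc/net/dev")` should report `/proc/net` as a link, while on an older kernel, where /proc/net is an ordinary directory, the list should be empty.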

Distributions have updated their policies to allow access to the symbolic link—probably by noticing the SELinux denial in log messages—so most folks never saw the problem. As Morton found out, though, existing distribution policy files (those shipped with FC5 and FC6 for example) would still disallow the access. Morton regularly runs newer kernels with older distributions to try to catch exactly this kind of error; he is probably one of very few, perhaps the only one, doing that.

Because the distribution-supplied kernel was being changed, some argued that requiring users to update their SELinux policies is not an onerous requirement. Paul Moore puts it this way:

Maybe I'm in the minority here, but in my mind once you step away from the distro supplied kernel (also applies to other packages, although those are arguably less critical) you should also bear the responsibility to make sure you upgrade/tweak/install whatever other bits need to be fixed.

Morton did not buy that argument, saying:

Nope. Releasing a non-backward-compatible kernel.org kernel is a big deal.

We'll do it sometimes, with long notice, much care and much deliberation.

We did it this time by sheer accident. That's known in the trade as a "bug".

But SELinux developer Stephen Smalley points out that permissions checks are not normally considered part of the kernel to user space interface. It is something of a gray area, though. Clearly the standard UNIX permission checks are part of that interface, at least partially because the kernel does handle the policy for those checks. Since the policies that govern the decisions about SELinux access denial come from user space, it is a bit hard to argue that changes to the kernel will not ripple out. Smalley describes the problem:

I should note here that for changes to SELinux, we have gone out of our way to avoid such breakage to date through the introduction of compatibility switches, policy flags to enable any new checks, etc (albeit at a cost in complexity and ever creeping compatibility code). But changes to the rest of the kernel can just as easily alter the set of permission checks that get applied on a given operation, and I don't think we are always going to be able to guarantee that new kernel + old policy will Just Work.

One possible solution to the immediate problem was floated by Smalley: SELinux could change the label that it returns for symbolic links under /proc. It is not clear that anyone really wants that change, and there has been no movement to add it. As Morton says, "people who are shipping 2.6.25- and 2.6.26-based distros probably wouldn't want such a patch in their kernels anyway."

Longer term, Eric Biederman asks about supporting xattrs for /proc. That would allow user space to label the proc filesystem appropriately, removing one of the special cases. Unfortunately, doing so would create yet another incompatibility between newer kernels and older user spaces.

In the end, because the bug was only seen by Morton, many months after it was introduced, it may just be ignored. The larger issue of how permissions checks fit into the kernel to user space interface, though, may rear its head again.

Page editor: Jonathan Corbet

Copyright © 2008, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds