Kernel development

Brief items

Kernel release status

The current development kernel is 2.6.36-rc7, released on October 6. "This should be the last -rc, I'm not seeing any reason to keep delaying a real release. There was still more changes to drivers/gpu/drm than I really would have hoped for, but they all look harmless and good. Famous last words." The short-form changelog is in the announcement; the full changelog is also available.

As of this writing, the final 2.6.36 release is not out, but it can be expected at almost any time.

Stable updates: there have been no stable updates in the last week.

Quotes of the week

As the situation exists today, implementations for many 'high-level' programming languages operate in competition with the kernel. Garbage Collection without Paging demonstrated a 218x improvement in latencies and a 41x improvement in throughput by integration of GC and the kernel's paging system (requiring a patch to the Linux kernel). Intelligent, rational people often balk at the notion of a high-level language-as-an-OS because GC, in their experience, introduces so many performance issues - never mind that GC has never had a fair, portable opportunity to leverage the hardware features because the OS kernel is hogging them!
-- "dmbarbour"

No matter how hard I try, I always read this as "DAMAGED". Which I can't help but imagine subliminally influences the reader's opinion of the patches.

Of course I am excellent at naming things, see "chunkfs" and "relatime". But some ideas for naming various concepts in this patch:


-- Valerie Aurora

Quite frankly, if somebody has something in "next" (and really meant for the _next_ merge window, not the current one) that is marked for stable, I think that shows uncommonly bad taste. And that, in turn, means that the "stable" tag is also very debatable. It clearly cannot be important enough to really be for stable if it's not even being aggressively pushed into the current -rc.
-- Linus Torvalds

So changing kernel interfaces that get exported to user space is always a disaster. Anybody who _designs_ for that kind of disaster shouldn't be participating in kernel development, because they've shown themselves to be unable to understand the pain and suffering.
-- Linus Torvalds

Hello NSA. I'm the first person ever banned from linux-kernel. I was banned for spewing hackish off-topic stuff like a working stack machine interpreter daemon, "Why the Plan 9 C compiler doesn't have asm("")", and a packages-friendly internationalization of the file names tree. Appended below is a trivial shell function that gets rid of make.
-- Rick Hohensee is back (thanks to Valerie Aurora)

Dear Martin,

Are you from the same HTC mentioned here?


If so, please ask again in 90-120 days. Until then, you're on your own.

-- Matt Mackall

Linsched for 2.6.35 released

Linsched is a user-space simulator which runs the Linux scheduling subsystem; it is meant to help developers working on improving (or understanding) the scheduler. A new version, based on 2.6.35, has been released. "Since Linsched allows arbitrary hardware topologies to be modeled, it enables testing of scheduler changes on hardware that may not be easily accessible to the developer. For example, most developers don't have access to a quad-core quad-socket box, but they can use LinSched to see how their changes affect the scheduler on such boxes." Google (the source of this release, but not the original developer) uses Linsched to validate its scheduler work.

No fanotify for 2.6.36

By Jonathan Corbet
October 12, 2010
The fanotify subsystem (originally "TALPA") was designed as a hook allowing anti-malware applications to intercept - and possibly block - file-oriented system calls. Getting this code into the mainline has been a long process, involving redefined requirements, reworking the low-level VFS notification code, and redesigning the user-space interface. After all that work, fanotify developer Eric Paris was able to get the code merged during the 2.6.36 merge window. Developers started using the interface to do interesting things; Lennart Poettering has mentioned, for example, using it to monitor file accesses to improve system bootstrap times. This long story, it seemed, was near an end.

Along came Tvrtko Ursulin, who pointed out a problem with the fanotify system calls; he then followed up with a second issue. It seems that the results of permission decisions were not always being handled correctly, and that the fanotify_init() system call had, somewhere along the way, lost the intended priority argument. The second issue, in particular, is serious because it affects the user-space ABI, which must be maintained indefinitely.

Eric acknowledged the problems and started to ponder ways to get around them before the 2.6.36 release, but Alan Cox advised a more cautious approach:

Given two chunks of "oh dear" last minute stuff would it be safer to simply punt and just pull the syscall/prototype itself (leaving the rest) for the release. That can go into the first pass of the next kernel tree, and if it the fixes and priority bits all work out may well then be tiny enough for -stable.

Eric, not entirely pleased with the idea, carried on the discussion for a while. Eventually, though, he sent in a patch disabling the fanotify system calls:

This feature can be added in an ABI compatible way in the next release (by using a number of bits in the flags field to carry the info) but it was suggested by Alan that maybe we should just hold off and do it in the next cycle, likely with an (new) explicit argument to the syscall. I don't like this approach best as I know people are already starting to use the current interface, but Alan is all wise and noone on list backed me up with just using what we have. I feel this is needlessly ripping the rug out from under people at the last minute, but if others think it needs to be a new argument it might be the best way forward.

Linus took the patch, so, while the fanotify code will be present in the 2.6.36 release, it will not be accessible from user space. Whether the problems can be fixed in a way which is suitable for a 2.6.36.y stable release remains to be seen.
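Eric's suggestion of carrying the priority in spare bits of the flags field can be illustrated with a small user-space sketch. The bit positions, the mask, and the helper names below are hypothetical; they are not taken from the fanotify code and only show why such a scheme would be ABI-compatible: a caller which leaves those bits clear gets a priority of zero, just as it would have before.

```python
# Hypothetical layout, for illustration only: bits 2-3 of the flags
# word are assumed to be unused and are borrowed to hold a priority.
PRIORITY_SHIFT = 2
PRIORITY_MASK = 0x3 << PRIORITY_SHIFT


def pack_flags(flags: int, priority: int) -> int:
    """Combine ordinary flag bits with a 2-bit priority value."""
    if flags & PRIORITY_MASK:
        raise ValueError("flags already use the priority bits")
    if not 0 <= priority <= 3:
        raise ValueError("priority does not fit in two bits")
    return flags | (priority << PRIORITY_SHIFT)


def unpack_priority(flags: int) -> int:
    """Extract the priority; old-style callers yield zero."""
    return (flags & PRIORITY_MASK) >> PRIORITY_SHIFT
```

The cost of this approach is that the borrowed bits can never be used for anything else, which is one reason an explicit new argument might be preferred.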

Kernel development news

ARM's multiply-mapped memory mess

By Jonathan Corbet
October 12, 2010
As a general rule, kernel changes which break drivers at run time are not seen as a good thing. Silent data corruption is also seen as the sort of outcome that the development community would rather avoid. What happens when it becomes necessary to choose one or the other? A long-running debate in the ARM community provides at least one answer.

First, some background. Contemporary processors do not normally address memory directly; instead, memory accesses are mediated through mappings created in the hardware's memory management unit. Depending on the processor, those mappings may be controlled through segment registers, page table entries, or some other means. The mapping will translate a virtual address into a physical address, but it also controls how the processor will access that memory and, perhaps, cache its contents.

As explained by ARM maintainer Russell King, ARM processors have a number of attributes which affect how memory mappings work. There is the concept of a memory type; "normal memory" is subject to reordering of reads and writes, while "device memory" is not, for example. There is also a bit indicating whether memory can be shared between processors; unshared memory is faster because there is no need to worry about cross-processor cache coherency. Then, like many CPUs, ARM processors can specify different caching behavior in the mapping; RAM might be mapped with writeback caching enabled, while device memory is uncached.

The ARM kernel maps RAM as normal memory with writeback caching; it's also marked non-shared on uniprocessor systems. The ioremap() function, used to map I/O memory for CPU use, is different: that memory is mapped as device memory, uncached, and, maybe, shared. These different mappings give the expected behavior for both types of memory. Where things get tricky is when somebody calls ioremap() to create a new mapping for system RAM.

The problem with these multiple mappings is that they will have differing attributes. As of version 6 of the ARM architecture, the specified behavior in that situation is "unpredictable." Users, as a rule, are not enamored of "unpredictable" behavior, especially when their data is involved. So it would make sense to avoid multiple memory mappings with differing attributes. The ARM architecture has traditionally allowed this kind of mapping, though, and a number of drivers, as a result, rely on being able to remap RAM in this way.

Back in April, Russell raised an alarm about this issue, and posted a patch causing ioremap() to fail when the target is system RAM. This change avoids the potential data corruption issue, but at the cost of breaking every driver using ioremap() in this way. There were complaints at the time, so the patch sat out the 2.6.35 development cycle, but Russell merged it for 2.6.36. There it sat until, with the release imminent, Felipe Contreras posted a patch backing out the change, saying:

Many drivers are broken, and there's no alternative in sight. Such a big change should stay as a warning for now, and only later should it actually fail.

Russell was not impressed. In his view, remapping RAM in this way is a dangerous technique which will lead to data corruption sooner or later. Despite being warned six months ago, driver developers have not fixed the problem - there are as many broken drivers now as there were before. So, he says, there is no benefit to waiting any longer; the dangerous behavior should be stopped before somebody gets burned.

On the other side, driver developers point out that everything "seems to work" as it is, so there is no urgent need for change. Furthermore, Russell's patch looks to them like an API change, and the normal rule of kernel development is that anybody making internal API changes is charged with cleaning up any resulting messes. Fixing the drivers is not a trivial task, and it's Russell's contention that they have always been broken, so he is not willing (or necessarily able) to make them all work again.

The situation looked stalled, with a reversion of the patch looking like the only way forward. But, in fact, it looks like there is a way out, and it comes in two pieces. The first is to allow those mappings for one more cycle, but to put in a user-visible warning when they happen. As Andrew Morton put it:

We *do* have a plan: as of 2.6.36, the kernel will emit a WARN_ON trace when a driver does this. Offending code will be discovered, developers will get bug reports from worried users, etc. This is usually pretty effective.

It is the "worried users" who have been missing from the equation so far; they can provide a type of pressure which, seemingly, is unavailable to worried subsystem maintainers.
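The difference between the two policies (refusing the mapping outright, or allowing it with a warning) can be modeled in user space. This is a sketch only: the RAM range is made up, and model_ioremap() is a hypothetical stand-in which mimics the decision being discussed, not the kernel's ioremap() implementation.

```python
import warnings

# Hypothetical (start, end) physical address ranges holding system RAM.
SYSTEM_RAM = [(0x8000_0000, 0x9000_0000)]


def overlaps_ram(phys: int, size: int, ram=SYSTEM_RAM) -> bool:
    """True if [phys, phys + size) intersects any system RAM range."""
    end = phys + size
    return any(phys < r_end and r_start < end for r_start, r_end in ram)


def model_ioremap(phys: int, size: int, strict: bool):
    """Model the mapping decision.

    strict=True  -> refuse to remap system RAM (return None)
    strict=False -> allow the mapping, but emit a visible warning
    """
    if overlaps_ram(phys, size):
        if strict:
            return None
        warnings.warn("remapping system RAM as device memory",
                      RuntimeWarning)
    return ("device", phys, size)
```

In strict mode a driver which remaps RAM simply stops working; in the warning mode it keeps running, still subject to the "unpredictable" behavior, while leaving a trace for users to report.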

The other piece of the solution is to give driver developers a way to obtain a chunk of physically-contiguous RAM which can be remapped in this way. Such memory cannot be mapped simultaneously as system RAM. One nice idea would be to simply unmap system memory when it is put to a device's use, but that proves to be difficult to implement. The alternative is to just set aside some memory at boot time and never let the kernel use it for any other purpose; drivers can then allocate from that pool when they need to. Russell has posted a patch which makes this kind of memory set-aside possible.
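The set-aside approach amounts to a simple allocator over a region the kernel never touches. The following is a minimal model of that idea, assuming a bump allocator with page alignment; the class name, addresses, and allocation strategy are illustrative assumptions and say nothing about how Russell's patch actually works.

```python
class ReservedPool:
    """Models a physically contiguous region set aside at boot time."""

    def __init__(self, base: int, size: int):
        self.base = base
        self.end = base + size
        self.next_free = base

    def alloc(self, size: int, align: int = 4096):
        """Return the start address of a chunk, or None if it won't fit."""
        # Round the next free address up to the requested alignment.
        start = (self.next_free + align - 1) // align * align
        if start + size > self.end:
            return None
        self.next_free = start + size
        return start
```

Since the pool is never mapped as ordinary system RAM, a driver can map chunks from it with device attributes without creating a second, conflicting mapping.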

So this particular situation will probably have a happy outcome, presuming that the plan described above is carried out and that no users are burned by unpredictable mappings with the 2.6.36 kernel. But it highlights some ongoing problems. It can be very hard to get developers to fix things, especially if the current code "seems to work." Those developers also became aware of the change at a very late date - if, indeed, they are even aware of it now. It seems that testing of -rc kernels by developers is not happening as much as we would like. Still, the development process seems to work, and problems like this are overcome eventually.

Synaptics multitouch coming - maybe

By Jonathan Corbet
October 11, 2010
Single-pointer, mouse-based interfaces may be with us for a while yet, but much of the interesting user-interface development activity these days involves multitouch interfaces - those which can track multiple input position events simultaneously. Multitouch can be found on a number of touchscreen-based smartphones and on some trackpad-based laptops. There's also a lot of interesting potential for multitouch in other places - think about multiple people working on a table-size (or wall-size) display. At the bottom level, though, detecting multitouch events requires support from the hardware, and, for a number of touchpad-based systems, that hardware comes from Synaptics.

The Linux driver for Synaptics touchpads does not currently support multitouch mode for the usual reason: Synaptics has not deigned to tell the world how to actually use its hardware. There is hope for change, though: Chase Douglas has recently posted a set of Synaptics driver patches which add multitouch support based on information obtained through reverse engineering. There are evidently still a few glitches to work out, but the mode appears to work. Alas, that does not necessarily mean we'll have Synaptics multitouch support in the near future; the real difficulties may be outside of the technical realm.

When Chase posted his patches, Takashi Iwai - better known for his ALSA sound driver work - responded that he had also worked on multitouch support:

Great! Finally someone found it out! I found this and made a series of patches in 4 months ago. Since then, Novell legal prohibited me to send the patches to the upstream due to "possible patent infringing". Now you cracked out. Yay.

In the ensuing discussion, it became clear that Chase's patch was, in fact, Takashi's patch with some improvements added. Takashi had apparently posted the patch set once before Novell legal called a halt to the exercise; that work had been stashed in a Launchpad page until Chase stumbled across it, made some improvements, and resubmitted the code. (Just to be clear: it does not appear that Chase was trying to take credit for somebody else's work; he just hadn't understood the original source of the code.)

Evidently, Takashi sees the independent posting of the code as being sufficient to get around Novell legal's objections to its merging; he responded with enthusiasm despite being allegedly on vacation. Chase's kernel patches have been topped up with a series of patches taking advantage of the new kernel support and adding user-level support for nice things like three-finger and "pinch" gestures. All of this code has seemingly been waiting for its chance to escape into the wild; all it took was for somebody else to start pushing the kernel-side code.

So it seems that the floodgates are open and multitouch support on Synaptics devices will be available to all. But there could yet be a catch. As Chase noted: "If you're the originator of the work, and my patch is accepted, I think we'll need your SOB on it." Without a signoff from Takashi, this code may not be accepted into the mainline. Takashi has suggested that his signoff from the initial posting could be used, but he appears to be unwilling to repost the code with a signoff now.

And that's where the trouble could come: your editor has had no contact with Novell's legal department and has no special knowledge, but it would not be surprising to learn that the concerns that department had about this code might not be swept aside quite so easily. It's possible that the new signoff from Takashi might not be forthcoming. Or, possibly, distributors will get cold feet for the same reason that Novell legal did and decide not to enable the feature. This code has been in the wild for some months now, but it has not found its way into users' systems; the increasing attention being paid to it now may not be enough to change that fact.

The inability to use Takashi's code - if, indeed, it comes to that - would not be a huge problem. The important thing is the information on how the hardware works; given that, some energetic hacker would undoubtedly reimplement the changes in short order. Patent concerns could be harder to work around, though. Without knowledge of which patents were at issue, it's hard to say how big an obstacle they could be. By some accounts, multitouch interfaces in general are patented, though that does not seem to have stopped some companies from incorporating such interfaces into their products. If it stops nervous Linux distributors, though, Linux users as a whole will be the losers.

That, of course, is the nature of the software patent system. But the scenario described above is highly speculative at this point. The important thing is that the code (along with the associated hardware information) is out there and available for those who would incorporate it into their systems. Hopefully it will be more widely distributed soon. Unfortunately, the wait for those nice wall-size displays may be just a little bit longer.

Statistics for the 2.6.36 development cycle

By Jonathan Corbet
October 13, 2010
As this is being written, the last 2.6.36 prepatch has (with luck) been released and the final release can be expected within a few days. So it is time to have a look at how this development cycle has gone. There are a couple of things which distinguish 2.6.36 from its predecessors in interesting ways.

The 2.6.36 kernel will incorporate about 9400 changesets contributed by 1159 developers. It thus continues a recent trend toward less active development cycles; here is what we have seen over the course of the last year or so:


The work which pushed up the changeset numbers in previous development cycles (shoveling out-of-tree code into the staging directory being at the top of the list) continues to wind down, as does work in other areas (like new filesystems). As a result, the kernel is going through a period of relatively low flux - but only relative to the last couple of years - and stabilization. That said, it's worth noting that, unless something unexpected happens, the 2.6.36 development cycle will be one of the shortest in recent memory; as a result, the number of changesets merged per day is the highest since 2.6.30.

Perhaps more interesting is this set of numbers: in 2.6.36, the development community added 604,000 lines of code and deleted 651,000 - for a total loss of almost 47,000 lines of code. This is the first time since the beginning of the git era that the size of the kernel source has gone down. Given that, perhaps it is appropriate to start with a look at who has been so busily removing code from the kernel:

Most lines removed - 2.6.36
Developer                 Lines removed   Percent
Sam Ravnborg                     205813     31.6%
Benjamin Herrenschmidt           133666     20.5%
Amerigo Wang                      19145      2.9%
Tony Luck                          8418      1.3%
Greg Kroah-Hartman                 7094      1.1%
Kiran Divekar                      4487      0.7%
Palash Bandyopadhyay               4457      0.7%
Vincent Sanders                    3467      0.5%
Dave Jones                         2600      0.4%
Christoph Hellwig                  2163      0.3%

Sam Ravnborg and Ben Herrenschmidt both got to the top of the list through the removal of a bunch of defconfig files, part of a general cleanup inspired by some grumpiness from Linus back in June; Sam also finished up some SPARC unification work. Amerigo Wang removed a number of old and unused drivers. Between the three of them, they got rid of almost 360,000 lines of code - a laudable bit of work.
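The arithmetic behind these figures is easy to check. The sketch below uses only numbers quoted in this article; note that the 604,000 and 651,000 totals are rounded, so the computed percentages are approximate.

```python
# Rounded totals for the 2.6.36 cycle, as quoted in the text.
lines_added = 604_000
lines_deleted = 651_000

# Lines removed by the top three developers, from the table.
removed = {
    "Sam Ravnborg": 205_813,
    "Benjamin Herrenschmidt": 133_666,
    "Amerigo Wang": 19_145,
}

# Net change in the size of the kernel source (negative = shrinkage).
net_change = lines_added - lines_deleted

# Combined removals by the top three developers.
top_three = sum(removed.values())

# Each developer's share of all deleted lines, in percent.
shares = {name: round(100 * n / lines_deleted, 1)
          for name, n in removed.items()}
```

Here net_change comes out to a loss of 47,000 lines and top_three to 358,624, matching the "almost 360,000" figure above.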

Looking at code changes in general for the 2.6.36 development cycle yields this picture:

Most active 2.6.36 developers

By changesets
Developer                 Changesets   Percent
Vasiliy Kulikov                  160      1.7%
Eric Paris                       124      1.3%
Dan Carpenter                    122      1.3%
Chris Wilson                     117      1.3%
Eric Dumazet                     108      1.2%
Uwe Kleine-König                 103      1.1%
Axel Lin                          98      1.0%
Johannes Berg                     97      1.0%
Al Viro                           96      1.0%
Julia Lawall                      89      1.0%
Tejun Heo                         88      0.9%
Joe Perches                       83      0.9%
Christoph Hellwig                 73      0.8%
Alex Deucher                      71      0.8%
Ben Skeggs                        69      0.7%
John W. Linville                  68      0.7%
Stefan Richter                    64      0.7%
Stephen M. Cameron                62      0.7%
Felix Fietkau                     60      0.6%
Randy Dunlap                      59      0.6%

By changed lines
Developer                 Lines changed   Percent
Sam Ravnborg                     208270     19.4%
Benjamin Herrenschmidt           134811     12.5%
Chris Metcalf                     53204      4.9%
Omar Ramirez Luna                 51087      4.8%
Amerigo Wang                      19191      1.8%
Jarod Wilson                      16020      1.5%
Felix Fietkau                     11898      1.1%
Alan Olsen                        11650      1.1%
Mike Thomas                       11087      1.0%
Lars-Peter Clausen                10795      1.0%
Tony Luck                          9351      0.9%
Tetsuo Handa                       7955      0.7%
Casey Leedom                       7888      0.7%
John Johansen                      7591      0.7%
Greg Kroah-Hartman                 7195      0.7%
Charles Clément                    6864      0.6%
Dmitry Kravkov                     6754      0.6%
Kiran Divekar                      6753      0.6%
Ben Collins                        6540      0.6%
Christoph Hellwig                  6045      0.6%

On the changesets side, Vasiliy Kulikov leads with a long list of mostly small fixes, mostly in device driver code. The bulk of Eric Paris's work is the addition of the fanotify subsystem - work which, as of this writing, will not be enabled for the 2.6.36 release due to user-space ABI concerns. Dan Carpenter is another master of small fixes, usually for problems identified by static analysis tools. Chris Wilson had a large number of changes to the Intel i915 driver - and seemingly an even larger number fixing the resulting problems. Eric Dumazet's changes were a large number of fixes and improvements to the networking subsystem.

Three of the top five in the "lines changed" column have already been mentioned above. The other two are Chris Metcalf, who added the new "Tile" architecture, and Omar Ramirez Luna, who added the TI dspbridge driver to the staging tree.

Only one top-five developer (Dan Carpenter) was also in the top five for 2.6.35; there are a lot of new faces on the list this time around.

There were 184 employers (that we could identify) who contributed code to the 2.6.36 kernel. The top corporate supporters were:

Most active 2.6.36 employers

By changesets
Employer                              Changesets   Percent
Red Hat                                     1129     12.1%
Texas Instruments                            189      2.0%
Societe Francaise de Radiotelephone          108      1.2%
Atheros Communications                        99      1.1%

By lines changed
Employer                              Lines changed   Percent
Red Hat                                       76455      7.1%
Texas Instruments                             63521      5.9%
ST Ericsson                                    8390      0.8%
Atheros Communications                         7762      0.7%

For the most part, this list looks the way it has for most development cycles, but there are a couple of new names here. One is Tilera, the company behind the Tile architecture, which got its support merged for 2.6.36. The other name appearing here for the first time is Canonical, which got the AppArmor security module code merged at last. Meanwhile, one should not forget the other 164 companies which do not appear on the above list; the commercial ecosystem around the Linux kernel remains strong and diverse.

Finally, your editor decided to rerun an old experiment to look at the longevity of code in the kernel. Every line in the kernel source was mapped back to the kernel release where it was last changed, then the totals for each release were plotted. The resulting picture looks like this:

[Bar chart]

At 1.6% of the total, 2.6.36 represents a relatively small piece of the total code base - the smallest for a long time. Almost 29% of the kernel code still dates back to the beginning of the git era, down from 31% last February. While much of our kernel code is quite new - 31% of the code comes from 2.6.30 or newer - much of it has also hung around for a long time.
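The tallying step of this experiment can be sketched in a few lines. How each line was mapped back to a release is not described here, so the sketch assumes that step has already been done (for example, with git blame and a tag lookup) and that its result is a sequence of release names, one per source line. The function name and the sample input in the usage note are illustrative only.

```python
from collections import Counter


def release_shares(line_releases):
    """Given one release name per source line, return the percentage
    of lines last changed in each release."""
    counts = Counter(line_releases)
    total = sum(counts.values())
    return {release: 100.0 * n / total for release, n in counts.items()}
```

Calling release_shares(["2.6.12", "2.6.12", "2.6.12", "2.6.36"]) on a toy four-line "kernel" would attribute 75% of the code to 2.6.12 and 25% to 2.6.36; plotting such shares for every release yields a chart like the one above.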

All told, 2.6.36 was a relaxed development cycle with relatively few big new features and a fair amount of cleanup. That is certainly part of how it was able to be stabilized in a shorter-than-usual period, and with fewer than the usual number of regressions (56 reported as of October 10, as opposed to 100 for 2.6.35-rc6). Whether 2.6.36 represents a new norm for a slightly slower kernel development process remains to be seen. As of this writing, the linux-next tree contains 5850 changesets, most of which are presumably intended for 2.6.37. Quite a few changes still typically do not appear in linux-next prior to the opening of the merge window, so we should see more changes than that merged for 2.6.37. Still, current linux-next does not look like a huge wave of pent-up changes waiting to fly into the mainline; 2.6.37 may or may not exceed 2.6.36 in the number of changes, but it does not look like it will be breaking any records.

Page editor: Jonathan Corbet

Copyright © 2010, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds