Kernel development

Brief items

Kernel release status

The current 2.6 development kernel is 2.6.26-rc5, released on June 4. As is usual at this point in the release cycle, it consists mostly of bug fixes. There are a fair number of changes in the core kernel code, mostly for scheduler issues, including some reverts to address performance regressions. "Another week, another batch of mostly pretty small fixes. Hopefully the regression list is shrinking, and we've fixed at least a couple of the oopses on Arjan's list." See the long-format changelog for all the details. A 2.6.26-rc6 release is probably coming soon.

The current -mm tree is 2.6.26-rc5-mm2, a bug-fix update to 2.6.26-rc5-mm1, which was also released this week. The main additions are the unprivileged mounts tree and a "large number of deep changes to memory management."

The current stable 2.6 kernel is, released on June 9. It has a whole pile of bugfixes, with none that are specifically called out as security related. "It contains a number of assorted bugfixes all over the tree. Users are encouraged to update." See the LWN announcement for some discussion about potential security issues with this release. Also, note that was released on June 7 with "one security bug fix. If you are using CIFS or SNMP NAT you could be vulnerable and are encouraged to upgrade."

For older kernels: was released on June 6. "It only fixes a vulnerability in the netfilter ip_nat_snmp_basic module (CVE-2008-1673). If you don't use it, you don't need to upgrade."

Kernel development news

A new kernel tree: linux-staging

By Jake Edge
June 11, 2008

There's a new kernel tree in town. The linux-staging tree was announced by Greg Kroah-Hartman on June 10. It is meant to hold drivers and other kernel patches that are working their way toward the mainline, but still have a ways to go. The intention is to collect them all together in one tree to make access and testing easier for interested developers.

According to Kroah-Hartman, linux-staging (or -staging as it will undoubtedly be known) "is an outgrowth of the Linux Driver Project, and the fact that there have been some complaints that there is no place for individual drivers to sit while they get cleaned up and into the proper shape for merging." By collecting the patches in one place, it will increase their visibility in the kernel community, potentially attracting more developers to assist in fixing, reviewing, and testing them.

The intent is for -staging to house self-contained patches—Kroah-Hartman mentions drivers and filesystems—that should not affect anyone who is not using them. Because of that, he is hoping that -staging can get included in the linux-next tree. As he says to Stephen Rothwell, maintainer of -next, in the announcement:

Yes, I know it contains things that will not be included in the next release, but the inclusion and basic build testing that is provided by your tree is invaluable. You can place it at the end, and if there is even a whiff of a problem in any of the patches, you have my full permission to drop them on the floor and run away screaming (and let me know please, so I can fix it up.)

The -next tree is meant for things that are headed for inclusion in the "N+1" kernel (where 2.6.N is the release under development), so including code not meant for that release is bending the rules a bit. As of this writing, Rothwell has not responded to the request to include -staging, but it would clearly benefit those patches to have a wider audience—with only a small impact on -next. There is no set timeline for patches to move from -staging into mainline, Kroah-Hartman says:

Based on some of the work that is needed on some of these drivers, it is much longer than N+2, unless we have some people step up to help out with the work. It's almost all janitorial work to do, but I know I personally don't have enough time to do it all, and can use the help.

The -staging tree is seen as a great place for Kernel Janitors and others who are interested in learning about kernel development to get their start. The announcement notes: "The code in this tree is in desperate need of cleanups and fixes that can be trivially found using 'sparse' and 'scripts/'." In the process of cleaning up the code, folks can learn how to create patches and how to get them accepted into a tree. From there, the hope is that more difficult tasks will be undertaken—with -staging or other kernel code—leading to a new crop of kernel hackers.

The current status of -staging shows 17 patches, most of which are drivers from the Linux Driver Project. Kroah-Hartman is actively encouraging more code to be submitted for -staging, as long as it meets some criteria for the tree. The tree is not meant to be a dumping ground for drivers that are being "thrown over the wall" in hopes that someone else will deal with them. Nor is it meant for code that is being actively worked on by a group of developers in another tree somewhere (the reiser4 filesystem is mentioned as an example); it is for code that would otherwise languish.

The reaction on linux-kernel has so far been favorable, with questions focusing on which kinds of patches (new architectures in particular) are appropriate for the tree. The -staging tree fills a niche that has not yet been covered by other trees. It also serves multiple purposes, from giving new developers a starting point to providing additional reviewing and testing opportunities for new drivers and other code. With luck, that will hasten the arrival of new features, along with new developers.

A summary of 2.6.26 API changes

By Jonathan Corbet
June 11, 2008

The 2.6.26 development cycle has stabilized to the point that it's possible to look at the internal API changes which have resulted. They include:

  • At long last, support for the KGDB interactive debugger has been added to the x86 architecture. There is a DocBook document in the Documentation directory which provides an overview of how to use this new facility. Some useful features (e.g. KGDB over Ethernet) are not yet supported, but this is a good start.

  • Page attribute table (PAT) support is also (again, at long last) available for the x86 architecture. PATs allow for fine-grained control of memory caching behavior with more flexibility than the older MTRR feature. See Documentation/x86/pat.txt for more information.

  • ioremap() on the x86 architecture will now always return an uncached mapping. Previously, it had taken a more relaxed approach, leaving the caching as the BIOS had set it up. The practical result was to almost always create uncached mappings, but with occasional exceptions. Drivers which depend on a cached mapping will now break; they will need to use ioremap_cache() instead. See this article for more information on this change and caching in general.

  • The generic semaphores patch has been merged. The semaphore code also has new down_killable() and down_timeout() functions.

  • The final users of struct class_device have been converted to use struct device instead. The class_device structure, along with its associated infrastructure, has been removed.

  • The nopage() virtual memory area operation has been removed; all in-tree code is now using fault() instead.

  • The object debugging infrastructure has been merged.

  • Two new functions (inode_getsecid() and ipc_getsecid()), added to support security modules and the audit code, provide general access to security IDs associated with inodes and IPC objects. A number of superblock-related LSM callbacks now take a struct path pointer instead of struct nameidata. There is also a new set of hooks providing generic audit support in the security module framework.

  • The now-unused ieee80211 software MAC layer has been removed; all of the drivers which needed it have been converted to mac80211. Also removed are the sk98lin network driver (in favor of skge) and bcm43xx (replaced by b43 and b43legacy).

  • The ata_port_operations structure used by libata drivers now supports a simple sort of operation inheritance, making it easier to write drivers which are "almost like" existing code, but with small differences.

  • A new function (ns_to_ktime()) converts a time value in nanoseconds to ktime_t.

  • Greg Kroah-Hartman is no longer the PCI subsystem maintainer, having passed that responsibility on to Jesse Barnes.

  • The seq_file code now accepts a return value of SEQ_SKIP from the show() callback; that value causes any accumulated output from that call to be discarded.

  • The Video4Linux2 API now defines a set of controls for camera devices; they allow user space to work with parameters like exposure type, tilt and pan, focus, and more.

  • On the x86 architecture, there is a new configuration parameter which allows gcc to make its own decisions about the inlining of functions, even when functions are declared inline. In some cases, this option can reduce the size of the kernel's text segment by over 2%.

  • The legacy IDE layer has gone through a lot of internal changes which will break any remaining out-of-tree IDE drivers.

  • A condition which triggers a warning from WARN_ON will now also taint the kernel.

  • The get_info() interface for /proc files has been removed. There is also a new function for creating /proc files:

        struct proc_dir_entry *proc_create_data(const char *name, mode_t mode,
                                                struct proc_dir_entry *parent,
                                                const struct file_operations *proc_fops,
                                                void *data);

    This version adds the data pointer, ensuring that it will be set in the resulting proc_dir_entry structure before user space can try to access it.

  • The klist type now has the usual-form macros for declaration and initialization: DEFINE_KLIST() and KLIST_INIT(). Two new functions (klist_add_after() and klist_add_before()) can be used to add entries to a klist in a specific position.

  • kmap_atomic_to_page() is no longer exported to modules.

  • There are some new generic functions for performing 64-bit integer division in the kernel:

        u64 div_u64(u64 dividend, u32 divisor);
        u64 div_u64_rem(u64 dividend, u32 divisor, u32 *remainder);
        s64 div_s64(s64 dividend, s32 divisor);
        s64 div_s64_rem(s64 dividend, s32 divisor, s32 *remainder);

    Unlike do_div(), these functions are explicit about whether signed or unsigned math is being done. The x86-specific div_long_long_rem() has been removed in favor of these new functions.

  • There is a new string function:

         bool sysfs_streq(const char *s1, const char *s2);

    It compares the two strings while ignoring an optional trailing newline.

  • The prototype for i2c probe() methods has changed:

         int (*probe)(struct i2c_client *client, 
                      const struct i2c_device_id *id);

    The new id argument supports i2c device name aliasing.
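
For readers who want to experiment with the semantics of the new division helpers listed above, here is a minimal user-space sketch. The function names and argument lists are taken from the prototypes in this article; the bodies, the typedefs standing in for the kernel's u64/s64 types, and the split_ns() helper are assumptions made for illustration. The in-kernel implementations are architecture-specific and will look different.

```c
#include <assert.h>
#include <stdint.h>

/* User-space stand-ins for the kernel's fixed-width types (assumption). */
typedef uint64_t u64;
typedef uint32_t u32;
typedef int64_t  s64;
typedef int32_t  s32;

/* Unsigned 64-by-32 division; the remainder comes back through a pointer. */
static inline u64 div_u64_rem(u64 dividend, u32 divisor, u32 *remainder)
{
	assert(divisor != 0);
	*remainder = (u32)(dividend % divisor);
	return dividend / divisor;
}

static inline u64 div_u64(u64 dividend, u32 divisor)
{
	u32 remainder;

	return div_u64_rem(dividend, divisor, &remainder);
}

/*
 * Signed variant. C99 division truncates toward zero, so the remainder
 * takes the sign of the dividend. The overflow case (INT64_MIN / -1)
 * is not handled in this sketch.
 */
static inline s64 div_s64_rem(s64 dividend, s32 divisor, s32 *remainder)
{
	assert(divisor != 0);
	*remainder = (s32)(dividend % divisor);
	return dividend / divisor;
}

static inline s64 div_s64(s64 dividend, s32 divisor)
{
	s32 remainder;

	return div_s64_rem(dividend, divisor, &remainder);
}

/* Hypothetical helper: split nanoseconds into seconds and leftover ns. */
static inline void split_ns(u64 ns, u64 *sec, u32 *nsec)
{
	*sec = div_u64_rem(ns, 1000000000u, nsec);
}
```

Having separate signed and unsigned entry points means the caller states up front which kind of arithmetic is wanted, which is the explicitness the article contrasts with do_div().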

One change which did not happen in the end was the change to 4K kernel stacks by default on the x86 architecture. This is still a desired long-term goal, but it is hard to say when the developers might have enough confidence to make this change.
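
Returning to sysfs_streq() from the list above: the behavior described there (comparing two strings while ignoring an optional trailing newline) can be modeled in user space as follows. This is a sketch of the described semantics only, not the kernel's code; the names streq_ignoring_newline(), len_without_newline(), and parse_switch() are hypothetical, and corner cases such as multiple trailing newlines may be treated differently by the real function.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Length of s, not counting a single trailing newline if one is present. */
static inline size_t len_without_newline(const char *s)
{
	size_t len = strlen(s);

	if (len > 0 && s[len - 1] == '\n')
		len--;
	return len;
}

/* Hypothetical model: equal if the strings match once one trailing
 * newline (if any) has been stripped from each. */
static inline bool streq_ignoring_newline(const char *s1, const char *s2)
{
	size_t n1, n2;

	assert(s1 != NULL && s2 != NULL);
	n1 = len_without_newline(s1);
	n2 = len_without_newline(s2);
	return n1 == n2 && memcmp(s1, s2, n1) == 0;
}

/*
 * Hypothetical store-style parser: a value written with "echo on" arrives
 * as "on\n", so a plain strcmp() against "on" would fail.
 * Returns 1 for "on", 0 for "off", -1 for anything else.
 */
static inline int parse_switch(const char *buf)
{
	if (streq_ignoring_newline(buf, "on"))
		return 1;
	if (streq_ignoring_newline(buf, "off"))
		return 0;
	return -1;
}
```

The parse_switch() example shows the likely motivation: attribute values written from a shell normally carry a trailing newline, and a comparison that tolerates it saves each driver from stripping the newline by hand.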

Andrew Morton on kernel development

By Jonathan Corbet
June 11, 2008

Andrew Morton is well-known in the kernel community for doing a wide variety of different tasks: maintaining the -mm tree for patches that may be on their way to the mainline, reviewing lots of patches, giving presentations about working with the community, and, in general, handling lots of important and visible kernel development chores. The way he works is changing, though, so we asked him a few questions by email. He responded at length about the -mm tree and how that is changing with the advent of linux-next, kernel quality, and what folks can do to help make the kernel better.

Years ago, there was a great deal of worry about the possibility of burning out Linus. Life seems to have gotten easier for him since then; now instead, I've heard concerns about burning out Andrew. It seems that you do a lot; how do you keep the pace and how long can we expect you to stay at it?

I do less than I used to. Mainly because I have to - you can't do the same thing at a high level of intensity for over five years and stay sane.

I'm still keeping up with the reviewing and merging but the -mm release periods are now far too long.

There are of course many things which I should do but which I do not.

Over the years my role has fortunately decreased - more maintainers are running their own trees and the introduction of the linux-next tree (operated by Stephen Rothwell) has helped a lot.

The linux-next tree means that 85% of the code which I used to redistribute for external testing is now being redistributed by Stephen. Some time in the next month or two I will dive into my scripts and will find a way to get the sufficiently-stable parts of the -mm tree into linux-next and then I will hopefully be able to stop doing -mm releases altogether.

So. The work level is ramping down, and others are taking things on.

What can we do to help?

I think code review would be the main thing. It's a pretty specialised function to review new code well. The people who specialise in the area which the new code is changing are the best reviewers but unfortunately I will regularly find myself having to review someone else's stuff.

Secondly: it would help if people's patches were less buggy. I still have to fix a stupidly large number of compile warnings and compilation errors and each -mm release requires me to perform probably three or four separate bisection searches to weed out bad patches.

Thirdly: testing, testing, testing.

Fourthly: it's stupid how often I end up being the primary responder on bug reports. I'll typically read the linux-kernel list in 1000-email batches once every few days and each time I will come across multiple bug reports which are one to three days old and which nobody has done anything about! And sometimes I know that the person who is responsible for that part of the kernel has read the report. grr.

Is it your opinion that the quality of the kernel is in decline? Most developers seem to be pretty sanguine about the overall quality problem. Assuming there's a difference of opinion here, where do you think it comes from? How can we resolve it?

I used to think it was in decline, and I think that I might think that it still is. I see so many regressions which we never fix. Obviously we fix bugs as well as add them, but it is very hard to determine what the overall result of this is.

When I'm out and about I will very often hear from people whose machines we broke in ways which I'd never heard about before. I ask them to send a bug report (expecting that nothing will end up being done about it) but they rarely do.

So I don't know where we are and I don't know what to do. All I can do is to encourage testers to report bugs and to be persistent with them, and I continue to stick my thumb in developers' ribs to get something done about them.

I do think that it would be nice to have a bugfix-only kernel release. One which is loudly publicised and during which we encourage everyone to send us their bug reports and we'll spend a couple of months doing nothing else but try to fix them. I haven't pushed this much at all, but it would be interesting to try it once. If it is beneficial, we can do it again some other time.

There have been a number of kernel security problems disclosed recently. Is any particular effort being put into the prevention and repair of security holes? What do you think we should be doing in this area?

People continue to develop new static code checkers and new runtime infrastructure which can find security holes.

But a security hole is just a bug - it is just a particular type of bug, so one way in which we can reduce the incidence rate is to write less bugs. See above. More careful coding, more careful review, etc.

Now, is there any special pattern to a security-affecting bug? One which would allow us to focus more resources on preventing that type of bug than we do upon preventing "average" bugs? Well, perhaps. If someone were to sit down and go through the past five years' worth of kernel security bugs and pull together an overall picture of what our commonly-made security-affecting bugs are, then that information could perhaps be used to guide code-reviewers' efforts and code-checking tools.

That being said, I have the impression that most of our "security holes" are bugs in ancient crufty old code, mainly drivers, which nobody runs and which nobody even loads. So most metrics and measurements on kernel security holes are, I believe, misleading and unuseful.

Those security-affecting bugs in the core kernel which affect all kernel users are rare, simply because so much attention and work gets devoted to the core kernel. This is why the recent splice bug was such a surprise and head-slapper.

I have sensed that there is a bit of confusion about the difference between -mm and linux-next. How would you describe the purpose of these two trees? Which one should interested people be testing?

Well, things are in flux at present.

The -mm tree used to consist of the following:

  • 80-odd subsystem maintainer trees (git and quilt), eg: scsi, usb, net.
  • various patches which I picked up which should be in a subsystem maintainer's tree, but which for one of various reasons didn't get merged there. I spend a lot of time acting as backup for leaky maintainers.
  • patches which are mastered in the -mm tree. These are now organised as subsystems too, and I count about 100 such subsystems which are mastered in -mm. eg: fbdev, signals, uml, procfs. And memory management.
  • more speculative things which aren't intended for mainline in the short-term, such as new filesystems (eg reiser4).
  • debugging patches which I never intend to go upstream.

The 80-odd subsystem trees in fact account for 85% of the changes which go into Linux. Pretty much all of the remaining 15% are the only-in-mm patches.

Right now (at 2.6.26-rc4 in "kernel time"), the 80-odd subsystem trees are in linux-next. I now merge linux-next into -mm rather than the 80-odd separate trees.

As mentioned previously, I plan to move more of -mm into linux-next - the 100-odd little subsystem trees.

Once that has happened, there isn't really much left in -mm. Just

  • the patches which subsystem maintainers leaked. I send these to the subsystem maintainers.
  • the speculative not-for-next-release features
  • the not-to-be-merged debugging patches.

Do you have any specific goals for the development of the kernel over the next year or so? What would they be?

Steady as she goes, basically.

I keep on hoping that kernel development in general will start to ramp down. There cannot be an infinite number of new features out there! Eventually we should get into more of a maintenance mode where we just fix bugs, tweak performance and add new drivers. Famous last words.

And it's just vaguely possible that we're starting to see that happening now. I do get a sense that there are less "big" changes coming in. When I sent my usual 1000-patch stream at Linus for 2.6.26 I actually received an email from him asking (paraphrased) "hey, where's all the scary stuff?"

In the early-May discussions, Linus said a couple of times that he does not think code review helps much. Do you agree with that point of view?


How would you describe the real role of code review in the kernel development process?

Well, it finds bugs. It improves the quality of the code. Sometimes it prevents really really bad things from getting into the product. Such as rootholes in the core kernel. I've spotted a decent number of these at review time.

It also increases the number of people who have an understanding of the new code - both the reviewer(s) and those who closely followed the review are now better able to support that code.

Also, I expect that the prospect of receiving a close review will keep the originators on their toes - make them take more care over their work.

There clearly must be quite a bit of communication between you and Linus, but much of it, it seems, is out of the public view. Could you describe how the two of you work together? How are decisions (such as when to release) made?

Actually we hardly ever say anything much. We'll meet face-to-face once or twice a year and "hi how's it going".

We each know how the other works and I hope we find each other predictable and that we have no particular issues with the other's actions. There just doesn't seem to be much to say, really.

Is there anything else you would like to say to LWN's readers?

Sure. Please do contribute to Linux, and a great way of doing that is to test latest mainline or linux-next or -mm and to report on any problems which you encounter.

Nothing special is needed - just install it on as many machines as you dare and use them in your normal day-to-day activities.

If you do hit a bug (and you will) then please be persistent in getting us to fix it. Don't let us release a kernel with your bug in it! Shout at us if that's what it takes. Just don't let us break your machines.

Our testers are our greatest resource - the whole kernel project would grind to a complete halt without them. I profusely thank them at every opportunity I get :)

We would like to thank Andrew for taking time to answer our questions.

Page editor: Jake Edge

Copyright © 2008, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds