Kernel development

Brief items

Kernel release status

The 3.5 kernel was released on July 21; see Linus's announcement for some low-level details. Headline features in 3.5 include the CoDel queue management algorithm (a piece of the solution to the bufferbloat problem), the seccomp filters sandboxing mechanism, autosleep (an alternative to Android's opportunistic suspend mechanism), the uprobes user-space probe subsystem, the contiguous memory allocator, the new kcmp() system call, metadata checksumming in the ext4 filesystem, and a lot more. See the KernelNewbies 3.5 page for more information.

Stable updates: 3.0.38 and 3.4.6 were released on July 19 with the usual set of important fixes. The 3.2.24 update is in the review process as of this writing; its release can be expected any time.

Quotes of the week

The magic constant police and the whitespace police are the TSA of the Linux kernel. In theory they are there to make things safer, but silently everyone thinks that these were the little kids that always got bullied in kindergarten, and this is their revenge to the rest of the population.
-- Arjan van de Ven

FWIW, I'm all for performance backports. They do have a downside though (other than the risk of bugs slipping in, or triggering latent bugs).

When the next enterprise kernel is built, marketeers ask for numbers to make potential customers drool over, and you _can't produce any_ because you wedged all the spiffy performance stuff into the crusty old kernel.

-- Mike Galbraith

CRtools 0.1 released

The OpenVZ blog has the announcement of the release of CRtools 0.1. "It is our ultimate goal to merge all bits and pieces of OpenVZ to the mainstream Linux kernel. It's not a big secret that we failed miserably trying to merge the checkpoint/restore [CPT] functionality (and yes, we have tried more than once). The fact that everyone else failed as well soothes the pain a bit, but is not really helpful. The reason is simple: CPT code is big, complex, and touches way too many places in the kernel. So we* came up with an idea to implement most of CPT stuff in user space, i.e. as a separate program not as a part of the Linux kernel. In practice this is impossible because some kernel trickery is still required here and there, but the whole point was to limit kernel intervention to the bare minimum. Guess what? It worked even better that we expected. As of today, after about a year of development, up to 90% of the stuff that is needed to be in the kernel is already there, and the rest is ready and seems to be relatively easy to merge."

Kernel development news

3.6 merge window part 1

By Jonathan Corbet
July 25, 2012
Linus traditionally waits for a day or so after a major release before beginning to merge patches for the next cycle, but, with 3.6, he started right in. As of this writing, some 4,300 non-merge changesets have been pulled into the mainline; much of the activity thus far has been from the networking and ARM subsystems. Significant user-visible changes include:

  • The perf events subsystem now has support for the "uncore" performance measurement unit on Intel Nehalem and Sandy Bridge CPUs.

  • The x86 architecture now supports the reboot=bios and reboot-cpu command-line options on 64-bit processors (as well as on 32-bit, which has been supported for a long time).

  • "Suspend to both" support allows the system to be suspended after writing a hibernation image to disk. Then, should power run out before the suspended system is resumed, it can be restarted from the disk image instead.

  • The CANFD extension to the controller area network (CAN) protocol is now supported.

  • Numerous netfilter modules have gained proper namespace support. The netfilter user-space connection tracking helper infrastructure has also been merged.

  • The Bluetooth layer now has "three-wire UART" support, enabling Bluetooth operations over serial port connections.

  • The TCP small queues patch set, another piece of the solution to the bufferbloat problem, has been merged.

  • The TCP fast open protocol extension has been merged. TCP fast open is a patch out of Google that reduces the overhead of TCP connection setup, hopefully making protocols like HTTP go faster.

  • A long effort to remove the IPv4 routing cache from the networking subsystem has come to its conclusion. David Miller wrote:

    The ipv4 routing cache is non-deterministic, performance wise, and is subject to reasonably easy to launch denial of service attacks. The routing cache works great for well behaved traffic, and the world was a much friendlier place when the tradeoffs that led to the routing cache's design were considered.

    What it boils down to is that the performance of the routing cache is a product of the traffic patterns seen by a system rather than being a product of the contents of the routing tables.

    The replacement code simplifies the networking subsystem and, hopefully, gives better performance on high-volume systems.

  • New hardware support includes:

    • Processors and systems: Freescale BSC9131RDB reference boards, Altera SOCFPGA Cyclone V systems, Marvell Armada 370 and Armada XP boards, TI OMAP5 processors, TI EVMC6678LE evaluation boards, and Freescale (Motorola) Coldfire 5251/5253 and 5441x processors.

    • Audio: TI Isabelle audio ICs, ST-Ericsson AB8500 codecs, Dialog DA732x audio codecs, Wolfson Micro WM5102 and WM5110 audio controllers, and ST STA529 audio amplifiers.

    • Input: Lenovo ThinkPad USB keyboards with trackpoint and Roccat Savu gaming mice.

    • Miscellaneous: Samsung S2MPS11 voltage regulators, Maxim 77686 voltage regulators, TI/National Semiconductor LP8720/LP8725 voltage regulators, Dialog Semiconductor DA9052 PMICs, Honeywell Humidicon HIH-6130/HIH-6131 humidity sensors, Wolfson Micro WM831x and WM832x PMICs, and NVIDIA Tegra20 APB DMA controllers.

    • Networking: Ralink RT3290 WiFi controllers, Sony PaSoRi contactless reader NFC controllers, Atmel RF230/231 radio transceivers, Broadcom BCM8706 and BCM8727 PHYs, and Asix AX88172A USB 2.0 Ethernet interfaces.

Changes visible to kernel developers include:

  • The obsolete static_branch() interface has been removed in favor of static_key_true() and static_key_false(). Some information on this interface can be found in this article.

  • Some initial work has been done to separate the dynamic tick code from the idle task, setting the ground for stopping the timer tick on non-idle CPUs.

  • The power domains subsystem has seen some integration with the cpuidle code to handle situations where devices share power lines with CPU cores.

  • The VFS layer has seen some significant changes. There is a new atomic_open() inode operation that combines the process of looking up, possibly creating, and opening a file into a single, atomic operation. The whole "open intents" mechanism has been removed. Numerous other operations have had prototype changes. The deferred fput() changes have been merged, simplifying the process of cleaning up file structures.

  • The PowerPC architecture now supports the jump label mechanism.

  • The NLMSG_NEW() and NLMSG_PUT() macros have been removed from the netlink interface.

  • The input subsystem has a new interface for the creation of user-space drivers; see Documentation/hid/uhid.txt for details.

  • There is a new grouping mechanism for I/O memory management units intended to help enable safe device access to virtualized guests.

This merge window can be expected to last until sometime around August 4, so there is quite a bit of code that can be expected to find its way into the mainline before the -rc1 release happens. See next week's Kernel Page for coverage of the continuation of the 3.6 merge window.

The UAPI header file split

By Michael Kerrisk
July 25, 2012

Patches that add new software features often gain the biggest headlines in free software projects. However, once a project reaches a certain size, refactoring work that improves the overall maintainability of the code is arguably at least as important. While such work does not improve the lives of users, it certainly improves the lives of developers, by easing later work that does add new features.

With around 15 million lines of code (including 17,161 .c files and 14,222 .h files) in the recent 3.5 release, the Linux kernel falls firmly into the category of projects large enough that periodic refactoring is a necessary and important task. Sometimes, however, the sheer size of the code base means that refactoring becomes a formidable task, one that verges on being impossible if attempted manually. At that point, an enterprising kernel hacker may well turn to writing code that refactors the kernel code. David Howells's UAPI patch series, which has been proposed for inclusion during the last few kernel merge windows, was created using such an approach.

The UAPI patchset was motivated by David's observation that when modifying the kernel code:

I occasionally run into a problem where I can't write an inline function in a header file because I need to access something from another header that includes this one. Due to this, I end up writing it as a #define instead.

He went on to elaborate that this problem of "inclusion recursion" in header files typically occurs with inline functions:

Quite often it's a case of an inline function in header A wanting a struct [or constant or whatever] from header B, but header B already has an inline function that wants a struct from header A.

As is the way of such things, a small itch can lead one to thinking about more general problems, and how to solve them, and David has devised a grand nine-step plan of changes to achieve his goals, of which the current patch set is just the first step. However, this step is, in terms of code churn, a big one.

What David wants to do is to split out the user-space API content of the kernel header files in the include and arch/xxxxxx/include directories, placing that content into corresponding headers created in new uapi/ subdirectories that reside under each of the original directories. As well as being a step toward solving his original problem and performing a number of other useful code cleanups, David notes that disintegrating the header files has many other benefits. It simplifies and reduces the size of the kernel-only headers. More importantly, splitting out the user-space APIs into separate headers has the desirable consequence that it "simplifies the complex interdependencies between headers that are [currently] partly exported to userspace".

There is one other benefit of the UAPI split that may be of particular interest to the wider Linux ecosystem. By placing all of the user-space API-related definitions into files dedicated solely to that task, it becomes easier to track changes to the APIs that the kernel presents to user space. In the first instance, these changes can be discovered by scanning the git logs for changes in files under the uapi/ subdirectories. Easing the task of tracking user-space APIs would help many other parts of the ecosystem, for example, C library maintainers, scripting language projects that maintain language bindings for the user-space API, testing projects such as LTP, documentation projects such as man-pages, and perhaps even LWN editors preparing summaries of changes in the merge window that occurs at the start of each kernel release cycle.
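As a rough illustration of the log scanning described above, the following Python sketch lists the files under uapi/ directories that were touched in a range of commits. It is only a sketch under stated assumptions: the helper names (is_uapi_path, uapi_changes) and the revision range are hypothetical, and it presumes a kernel tree in which the split has already been merged.

```python
import subprocess


def is_uapi_path(path):
    """Return True if any directory component of the path is named 'uapi'."""
    return "uapi" in path.split("/")[:-1]


def uapi_changes(rev_range, repo="."):
    """List files under uapi/ directories touched in rev_range.

    rev_range is any range that git understands, such as 'old-tag..new-tag'
    (a placeholder here, not a pair of real tags).
    """
    out = subprocess.run(
        ["git", "-C", repo, "log", "--name-only", "--pretty=format:", rev_range],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    # Blank lines separate commits in this output; drop them and de-duplicate.
    return sorted({p for p in out.splitlines() if p and is_uapi_path(p)})
```

Matching on the directory name, rather than on a fixed prefix, catches both include/uapi/ and the per-architecture arch/xxxxxx/include/uapi/ trees with a single test.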

The task of disintegrating each of the header files into two pieces is in principle straightforward. In the general case, each header file has the following form:

    /* Header comments (copyright, etc.) */

    #ifndef _XXXXXX_H     /* Guard macro preventing double inclusion */
    #define _XXXXXX_H

    [User-space definitions]

    #ifdef __KERNEL__

    [Kernel-space definitions]

    #endif /* __KERNEL__ */

    [User-space definitions]
  
    #endif /* End prevent double inclusion */

Each of the above parts may or may not be present in individual header files, and there may be multiple blocks governed by #ifdef __KERNEL__ preprocessor directives.

The part of this file that is of most interest is the code that falls inside the outermost #ifndef block that prevents double inclusion of the header file. Everything inside that block that is not nested within an #ifdef __KERNEL__ block should move to the corresponding uapi/ header file. The content inside the #ifdef __KERNEL__ block remains in the original header file, but the #ifdef __KERNEL__ and its accompanying #endif are removed.

A copy of the header comments remains in the original header file, and is duplicated in the new uapi/ header file. In addition, a #include directive needs to be added to the original header file so that it includes the new uapi/ header file, and of course a suitable git commit message needs to be supplied for the change.

The goal is to modify the original header file to look like this:

    /* Header comments (copyright, etc.) */

    #ifndef _XXXXXX_H     /* Guard macro preventing double inclusion */
    #define _XXXXXX_H

    #include <uapi/path/to/header.h>

    [Kernel-space definitions]

    #endif /* End prevent double inclusion */

The corresponding uapi/ header file will look like this:

    /* Header comments (copyright, etc.) */

    #ifndef _UAPI__XXXXXX_H     /* Guard macro preventing double inclusion */
    #define _UAPI__XXXXXX_H

    [User-space definitions]

    #endif /* End prevent double inclusion */

Of course, there are various details to handle in order to correctly automate this task. First of all, sometimes the script should produce only one result file. If there is no #ifdef __KERNEL__ block in the original header, the original header file is in effect renamed to the uapi/ file. Where the header file is disintegrated into two files, there are many other details that need to be handled. For example, if there are #include directives that are retained at the top of the original header file, then the #include for the generated uapi/ file should be placed after those #include directives (in case the included uapi/ file has dependencies on them). Furthermore, there may be pieces of the original header that are explicitly not intended for kernel space (i.e., they are for user-space only)—for example, pieces governed by #ifndef __KERNEL__. Those pieces should migrate to the uapi/ file, retaining the guarding #ifndef __KERNEL__.
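The core of this transformation can be sketched in a few lines of Python. This is not David's tooling (his scripts are written in shell and Perl and handle far more cases); it is a minimal illustration with hypothetical function names. It assumes that the header comment and guard macro have already been separated from the body, recognizes only the literal form #ifdef __KERNEL__ at the top level of the body, and makes no attempt to place the new #include after existing ones.

```python
import re

KERNEL_IFDEF = re.compile(r"^\s*#\s*ifdef\s+__KERNEL__\b")
ANY_IF = re.compile(r"^\s*#\s*if(def|ndef)?\b")
ENDIF = re.compile(r"^\s*#\s*endif\b")


def split_body(lines):
    """Split guard-stripped header lines into (uapi_lines, kernel_lines)."""
    uapi, kernel = [], []
    depth = 0  # conditional nesting inside an #ifdef __KERNEL__ block; 0 = outside
    for line in lines:
        if depth == 0:
            if KERNEL_IFDEF.match(line):
                depth = 1          # drop the #ifdef __KERNEL__ line itself
            else:
                uapi.append(line)  # #ifndef __KERNEL__ blocks land here intact
            continue
        if ANY_IF.match(line):
            depth += 1
        elif ENDIF.match(line):
            depth -= 1
            if depth == 0:
                continue           # drop the #endif that closes __KERNEL__
        kernel.append(line)
    return uapi, kernel


def render(comment, guard, body):
    """Wrap body lines in the header comment and a double-inclusion guard."""
    lines = [comment, f"#ifndef {guard}", f"#define {guard}", *body,
             f"#endif /* {guard} */"]
    return "\n".join(lines) + "\n"


def split_header(comment, guard, body_lines, uapi_include):
    """Return (kernel_text, uapi_text); kernel_text is None for a pure rename."""
    uapi, kernel = split_body(body_lines)
    uapi_text = render(comment, "_UAPI_" + guard, uapi)
    if not kernel:
        return None, uapi_text     # no kernel-only content: in effect a rename
    kernel_text = render(comment, guard, [f"#include <{uapi_include}>", *kernel])
    return kernel_text, uapi_text
```

Tracking the nesting depth is what allows conditional blocks inside the kernel-only region to survive the move, while only the outer #ifdef __KERNEL__ and its matching #endif are discarded.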

David's scripts handle all of the aforementioned details, and many others as well, including making corresponding changes to .c source files and various kernel build files. Naturally, no scripting can correctly handle all possible cases in human-generated files, so part of the current patch set includes pre-patches that add markers to "coach" the scripts to do the right thing in those cases.

Writing scripts to automate this sort of task becomes a major programming project in its own right, and the shell and Perl scripts (.tar.xz archive) that accomplish the task total more than 1800 lines. (Using scripting to generate the patch set has the notable benefit that the patch set can be automatically refreshed as the relevant kernel source files are changed by other kernel developers. Given that the UAPI patches touch a large number of files, this is an important consideration.)

Knowing the size of those scripts, and the effort that must have been required to write them, gives us a clue that the scale of the actual changes to the kernel code must be large. And indeed they are. In its current incarnation, the UAPI patch series consists of 74 commits, of which 65 are scripted (the scripted changes produce commits to the kernel source tree on a per-directory basis). Altogether, the patches touch more than 3500 files, and the diff of the changes amounts to over 300,000 lines.

The scale of these changes brings David to his next problem: how to get the changes accepted by Linus. The problem is that it's impossible to manually review source code changes of this magnitude. Even a partial review would require considerable effort, and would not provide ironclad guarantees about the remaining unreviewed changes. In the absence of such reviews, when Linus received David's request to pull these patches in the 3.5 merge window, he employed a time-honored strategy: the request was ignored.

Although David first started working on these changes around a year ago, Linus has not to date directly commented on them. However, back in January Linus accepted some preparatory patches for the UAPI work, which suggests that he's at least aware of the proposal and possibly willing to entertain it. Other kernel developers have expressed support for the UAPI split (1 and 2). However, probably because of the magnitude of the changes, getting actual reviews and Acked-by: tags has to date proved to be a challenge. Given the impossibility of a complete manual review of the changes, the best hope would seem to be to have other developers review the conceptual approach employed by David's scripts, possibly review the scripts themselves, perform a review of a sample of the changed kernel source files, and perform kernel builds on as many different architectures as possible. (Aspiring kernel hackers might note that much of the review task on this quite important piece of kernel work does not require deep understanding of the workings of the kernel.)

Getting sufficient review of any set of kernel patches, let alone a set this large, is a perennial difficulty. Things at least took a step forward with David's request to Linus to have the patches accepted for the currently open 3.6 merge window, when Arnd Bergmann provided his Acked-by: for the entire patch series. Whether that will prove enough, or whether Linus will want to see formal agreement from additional developers before accepting the patches is an open question. If it proves insufficient for this merge window, then perhaps a rethink will be required next time around about how to have such a large change accepted into the mainline kernel.

Who wrote 3.5

July 25, 2012

This article was contributed by Greg Kroah-Hartman.

Now that the 3.5 Linux kernel has been released, it's time for the traditional look at who wrote it. Here we'll try to summarize who did all of the work that went into this release.

Fastest-changing kernel ever

The 3.5 kernel was released one day faster than the 3.4 kernel was, in 62 days. The last time a kernel was released this quickly was back in 2005 with the 2.6.14 kernel release (61 days).

In those 62 days, the kernel developers crammed in a record-breaking 176.73 changes per day (7.36 changes per hour). This is the fastest-changing kernel that has been recorded since I started keeping track of this development metric back in the 2.5 kernel release series.

These changes resulted in the following overall changes:

Changes in 3.5
571987 lines added
358836 lines removed
135848 lines modified

The kernel is still growing at a fairly constant rate of 1.37% in the number of lines and files, which is similar to the growth rate of the past three kernel releases.
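The figures above can be cross-checked with a little arithmetic. The Python sketch below derives two values that are not stated directly (the approximate total changeset count and the net line growth) from the ones that are; the variable names are illustrative only, and the derived total is an estimate, since the daily rate is itself a rounded number.

```python
DAYS = 62                 # length of the 3.5 development cycle
CHANGES_PER_DAY = 176.73  # rate quoted above
LINES_ADDED = 571987
LINES_REMOVED = 358836

total_changes = round(DAYS * CHANGES_PER_DAY)  # roughly 10,957 changesets
changes_per_hour = CHANGES_PER_DAY / 24        # about 7.36
net_lines = LINES_ADDED - LINES_REMOVED        # 213,151 more added than removed

print(f"{total_changes} changesets, {changes_per_hour:.2f} per hour, "
      f"net {net_lines:+} lines")
```

The estimated total also squares with the tables that follow: 239 changesets out of roughly 10,957 works out to about 2.2%.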

Individual contributions

1,195 different developers contributed patches to the 3.5 kernel; those developers worked for at least 194 different companies. The names of the contributing developers are pretty familiar to those who track these statistics:

Most active 3.5 developers

By changesets
Greg Kroah-Hartman           239   2.2%
Axel Lin                     191   1.7%
Mark Brown                   187   1.7%
H. Hartley Sweeten           135   1.2%
David S. Miller              131   1.2%
Daniel Vetter                130   1.2%
Al Viro                      128   1.2%
Stephen Warren               121   1.1%
Tejun Heo                    112   1.0%
Eric Dumazet                 105   1.0%
Hans Verkuil                 102   0.9%
Paul Mundt                   102   0.9%
Johannes Berg                102   0.9%
Shawn Guo                    102   0.9%
Thomas Gleixner               98   0.9%
Dan Carpenter                 86   0.8%
Sam Ravnborg                  84   0.8%
Chris Wilson                  79   0.7%
Trond Myklebust               74   0.7%
Eric W. Biederman             73   0.7%
Jiri Slaby                    73   0.7%
Arnaldo Carvalho de Melo      71   0.6%
Artem Bityutskiy              68   0.6%
Hans de Goede                 68   0.6%
Takashi Iwai                  64   0.6%

By changed lines
Paul Gortmaker             44000   5.7%
Viresh Kumar               20425   2.7%
Steven Rostedt             14615   1.9%
H. Hartley Sweeten         13083   1.7%
Dave Airlie                12217   1.6%
Sakari Ailus               10835   1.4%
Dong Aisheng               10574   1.4%
Sonic Zhang                10494   1.4%
Paul Walmsley              10084   1.3%
Ben Skeggs                 10000   1.3%
Rob Herring                 9886   1.3%
Sascha Hauer                9602   1.3%
Stephen Warren              9365   1.2%
Parav Pandit                8846   1.2%
Nicholas Bellinger          8704   1.1%
Linus Walleij               8496   1.1%
Shawn Guo                   7797   1.0%
David S. Miller             7445   1.0%
Phil Edworthy               7189   0.9%
Sam Ravnborg                6752   0.9%
Hans Verkuil                6718   0.9%
Alexander Shishkin          6668   0.9%
Tejun Heo                   6579   0.9%
Greg Kroah-Hartman          6524   0.9%
Vladimir Serbinenko         6451   0.8%

In the quantity category (remember, we don't judge quality), I did a large number of cleanup patches removing old USB logging macros from the system; those patches account for the majority of my changes in the 3.5 kernel. Axel contributed a great number of regulator driver fixes and enhancements, and Mark Brown did the majority of his work in the sound system-on-a-chip drivers area. H. Hartley Sweeten has been working on cleaning up the Comedi (data acquisition) drivers to get them ready to move out of the staging area of the kernel. This work has him showing up in these statistics for the first time. And rounding out the top five is David Miller with a large number of networking core and driver patches.

Along with H. Hartley Sweeten, Daniel Vetter is also a newcomer to the "top changesets" list. His contributions came from numerous changes and enhancements to the Intel graphics drivers. Although Hans Verkuil is also a name that might not be familiar to many, his contributions to the Video4Linux drivers and core code show he is a core contributor to a subsystem that many users rely on every day.

Considering the statistics in lines changed, Paul Gortmaker leads by virtue of the fact that he deleted all of the old Token Ring drivers from the kernel. Viresh Kumar did a lot of SPEAr processor and driver work, adding numerous new drivers for the platform. Steven Rostedt did a large amount of development on ftrace and ktest (a kernel-testing tool). H. Hartley Sweeten did the aforementioned Comedi driver cleanup work, and Dave Airlie made major changes in the area of graphics drivers.

Reviewing the work

All kernel patches are reviewed and given a "Signed-off-by" line by a subsystem maintainer before they are committed to the Linux kernel. The developers with the most sign-offs for the 3.5 kernel were as follows:

Developers with the most signoffs (total 20391)
Greg Kroah-Hartman          1216   6.0%
David S. Miller              922   4.5%
Mauro Carvalho Chehab        605   3.0%
Mark Brown                   549   2.7%
John W. Linville             493   2.4%
Linus Torvalds               424   2.1%
Andrew Morton                373   1.8%
Daniel Vetter                268   1.3%
Dave Airlie                  255   1.3%
Al Viro                      197   1.0%
Axel Lin                     191   0.9%
Trond Myklebust              173   0.8%
Arnaldo Carvalho de Melo     165   0.8%
James Bottomley              164   0.8%
Artem Bityutskiy             157   0.8%
Kyungmin Park                156   0.8%
Samuel Ortiz                 154   0.8%
Linus Walleij                153   0.8%
Ingo Molnar                  150   0.7%
Wey-Yi W Guy                 146   0.7%
Thomas Gleixner              139   0.7%
Stephen Warren               136   0.7%
H. Hartley Sweeten           135   0.7%
Shawn Guo                    131   0.6%
Paul Mundt                   128   0.6%

I ended up doing the most sign-offs for this kernel release because of many changes in the staging and USB subsystems. David Miller follows with his work in the networking and networking driver trees, as well as in the IDE drivers. Mauro is the maintainer of the Video4Linux subsystem, Mark Brown is the maintainer of the embedded sound drivers, and John Linville is the maintainer of the wireless driver subsystem.

These numbers reflect the picture of what has been happening in the past few kernel releases, with the majority of changes happening in the staging and networking areas of the kernel.

Who sponsored this work

Here is the list of the companies who sponsored the developers doing the work for this kernel release, and the number of changes attributed to them:

Top changeset contributors by employer
(None)                      1343  12.3%
Red Hat                     1123  10.2%
Intel                       1061   9.7%
(Unknown)                    860   7.8%
Linaro                       519   4.7%
Novell                       440   4.0%
Texas Instruments            313   2.9%
IBM                          282   2.6%
Linux Foundation             279   2.5%
Google                       265   2.4%
Samsung                      251   2.3%
Oracle                       204   1.9%
Renesas Electronics          201   1.8%
MiTAC                        191   1.7%
NVIDIA                       188   1.7%
Wolfson Microelectronics     187   1.7%
(Consultant)                 160   1.5%
NetApp                       153   1.4%
Vision Engraving Systems     135   1.2%
Qualcomm                     121   1.1%

Longtime readers of this series of articles will notice that Linaro has appeared in the top 5 kernel developer companies by number of contributions for the first time. This is due to the increased number of patches Linaro has been contributing, as well as the organization's wish to have the member company employees' contributions be counted as coming from Linaro, instead of the member company itself, as we had previously been doing.

A newcomer to the top 20 companies is Vision Engraving Systems, thanks to the Comedi development work from H. Hartley Sweeten. With his work, hopefully this subsystem can move out of the staging area of the kernel in a future release.

Other than the large jump from Linaro, the other companies in the top 25 are well known. Even NVIDIA—despite Linus's well-publicized, and in my opinion well-deserved, criticism of its Linux graphics driver development efforts—continues to be a large contributor to the kernel in the area of embedded processor support for its products. Texas Instruments, Samsung, MiTAC, Wolfson Microelectronics, Qualcomm, Renesas, and Nokia are also primarily focused in the embedded Linux area, showing the wide range of ongoing company support for Linux in embedded systems.

Work continues as usual

With the 3.5 kernel release, the number of contributors remains as high as in previous releases, the rate of contributions is greater than ever (as measured by number of patches per day), and the rate of increase in the size of the kernel code remains the same as it has been for the past year. This shows that the kernel development community is still growing, and maintaining its incredibly rapid development cycle, ensuring that Linux remains the largest software engineering project ever.

Page editor: Jonathan Corbet

Copyright © 2012, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds