
Kernel development

Brief items

Kernel release status

The 4.9 merge window remains open; see the separate article below for a summary of the work merged so far.

Stable updates: 4.8.1, 4.7.7, and 4.4.24 were released on October 7.


Quotes of the week

As a _singlular_ argument, "it's for out-of-tree code" is weak. As an _additional_ argument, it has value. Saying "this only helps out-of-tree code" doesn't carry much weight. Saying "this helps kernel security, even for out-of-tree code" is perfectly valid. And a wrinkle in this is that some day, either that out-of-tree code, or brand new code, will land in the kernel, and we don't want to continue to require authors be aware of an opt-in security feature. The kernel should protect itself (and all of itself, including out-of-tree or future code) by default.
Kees Cook

this email is all in small letters because my gpg key expired so I couldn't sign the tag, and it's too early in the morning for me to go do gpg stuff.
Dave Airlie

I'm happy that you have found alternative identity management model, but I'm not sure this "all lower key" thing is considered a technically valid alternative to pgp signing from an identity validation standpoint.

I will have to ask around the security people to see what they think.

Linus Torvalds (thanks to Daniel Stone)


Kernel development news

4.9 Merge window part 2

By Jonathan Corbet
October 12, 2016
As of this writing, Linus has pulled 13,488 non-merge changesets into the mainline repository for the 4.9 development cycle. That suggests that not only will 4.9 be the busiest cycle in the kernel's history, but that it will surpass the previous record (3.15, at 13,722 changesets) before the merge window closes. The merging of the greybus driver code has a lot to do with that but, even without greybus, there is a lot going on this time around.

Among the user-visible changes merged since last week's summary are:

  • The system calls for the memory protection keys feature have been merged. The pkey_alloc(), pkey_free(), and pkey_mprotect() calls are as described in this article, but the pkey_get() and pkey_set() calls, which can be implemented purely in user space, were not included. See Documentation/x86/protection-keys.txt for details.

  • The bottleneck bandwidth and RTT (BBR) congestion control algorithm has been merged.

  • The BATMAN mesh networking subsystem has a new netlink-based configuration mechanism that will, over time, supersede and replace the older, debugfs-based interface.

  • The netfilter module supports a new "quota" mechanism designed to enable the enforcement of byte quotas. There's also a new random-number generation module intended to enable the random distribution of packets (across multiple queues, for example).

  • There is a new just-in-time BPF compiler that can be used to load BPF programs for execution within Netronome network interfaces. In 4.9, only the cls_bpf classifier module will take advantage of this capability.

  • The filesystems in user space (FUSE) module now supports POSIX access-control lists.

  • The Greybus subsystem has been merged. This bus was intended for the "Project Ara" phone, which has since been canceled, but Greg Kroah-Hartman successfully argued for its inclusion anyway. This merge includes the entire development history for this code, some 2,400 changesets in total.

  • There is a new set of resource limits controlling how many namespaces may be created within any given user namespace. See Documentation/sysctl/user.txt for details.

  • The hardware latency tracer (which seeks to flush out latencies caused by the hardware itself) has moved into the mainline from the realtime tree. See Documentation/trace/hwlat_detector.txt for details and usage information.

  • The ubifs filesystem now supports overlayfs and the O_TMPFILE file-creation option.

  • New hardware support includes:

    • Systems and processors: Broadcom BCM53573-based processors.

    • Audio: Nuvoton NAU8810 audio codecs, Realtek RT5660/RT5663/RT5668 audio codecs, X-Powers AC100 audio codecs, and Samsung Exynos SoC low power audio subsystems.

    • Industrial I/O: Maxim thermocouple sensors, Measurement Computing CIO-DAC digital-to-analog converters, Asahi Kasei AK8974 3-axis magnetometers, Domintech DMARD05/DMARD06/DMARD07 accelerometers, Texas Instruments ADC161S626 1-channel differential analog-to-digital converters, Texas Instruments ADC12130/ADC12132/ADC12138 analog-to-digital converters, MediaTek mt65xx analog-to-digital converters, Linear Technology LTC2485 analog-to-digital converters, Analog Devices AD8801/AD8803 digital-to-analog converters, Apex Embedded Systems STX104 analog-to-digital converters, mCube MC3230 digital accelerometers, and Murata ZPA2326 pressure sensors.

    • Media: Atmel image sensor controllers, Analog Devices AD5820 lens voice coils, Techwell TW5864 video/audio grabber/encoders, STMicroelectronics HVA multi-format video encoders, STMicroelectronics STiH4xx HDMI CEC interfaces, and Gennum GS1662 HD/SD-SDI serializers.

    • Miscellaneous: Rockchip RK818 power-management chips, Elan eKTF2127 touchscreen controllers, Microsemi PQI SCSI controllers, Intel integrated sensor hubs, Cavium ThunderX I2C buses, Cavium ThunderX random number generators, APM X-Gene SoC performance monitoring units, Qualcomm external bus interfaces (version 2), JDI LT070ME05000 WUXGA DSI panels, and Amlogic Meson PWM controllers.

    • Networking: Microsemi VSC85xx PHYs, Amazon Elastic Network adapters, Thunder RGX/RGMII MAC interfaces, Chelsio crypto coprocessors, Qualcomm EMAC gigabit Ethernet controllers, and Qualcomm Atheros QCA8K Ethernet switches.

    • Pin Control / GPIO: Aspeed G4/G5 pin and GPIO controllers, NextThing GR8 pin controllers, X-Powers AXP209 PMIC GPIO controllers, Intel Whiskey Cove PMIC GPIO controllers, Diamond Systems GPIO-MM controllers, Technologic Systems FPGA I2C GPIO controllers, and TI LP873X PMIC GPIO controllers.

    • Thermal: Qualcomm TSENS temperature sensors, QorIQ thermal monitoring units, and Intel Broxton PMIC thermal monitors.
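
The memory protection keys item in the list above notes that setting a key's access rights can be done purely in user space. A minimal sketch of why that is possible: on x86, the per-thread rights for each key live in the 32-bit PKRU register, two bits per key, and user space can read and write that register with the RDPKRU and WRPKRU instructions. The helpers below only perform the bit arithmetic on a PKRU value held in an ordinary variable; the names pkru_get_rights() and pkru_set_rights() are hypothetical, and the actual register access is deliberately left out so that the code runs on machines without protection-key support.

```c
#include <stdint.h>

/* Per-key rights bits, matching the values Linux uses for
 * PKEY_DISABLE_ACCESS and PKEY_DISABLE_WRITE. */
#define PKEY_DISABLE_ACCESS 0x1u
#define PKEY_DISABLE_WRITE  0x2u
#define PKRU_BITS_PER_KEY   2     /* PKRU holds 16 keys, two bits each */

/* Hypothetical helper: extract the two rights bits for one key.
 * The caller is assumed to pass a pkey in the range 0..15. */
static inline unsigned int pkru_get_rights(uint32_t pkru, int pkey)
{
    return (pkru >> (pkey * PKRU_BITS_PER_KEY)) & 0x3u;
}

/* Hypothetical helper: return a copy of pkru with one key's rights
 * replaced; all other keys are left untouched. */
static inline uint32_t pkru_set_rights(uint32_t pkru, int pkey,
                                       unsigned int rights)
{
    int shift = pkey * PKRU_BITS_PER_KEY;

    pkru &= ~((uint32_t)0x3u << shift);          /* clear the old bits */
    pkru |= (uint32_t)(rights & 0x3u) << shift;  /* install the new bits */
    return pkru;
}
```

A real user-space implementation would presumably read the current value with RDPKRU, pass it through a helper like pkru_set_rights(), and write the result back with WRPKRU; no system call is needed for that step.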

Changes visible to kernel developers include:

  • The handling of messages printed with printk() has changed for the case of single-line messages created with multiple printk() calls. The rule has long been that the continuation lines should be marked with the KERN_CONT pseudo log level, but that requirement has not been enforced for several years. As of this commit, the use of KERN_CONT is again mandatory; without it, output will be garbled. Many places in the kernel will need fixing; for the short term, expect some ugly output from 4.9-rc kernels.

  • The "kthread_worker" API has seen a number of changes. These include the renaming of most functions to start with "kthread_" (e.g. init_kthread_worker() becomes kthread_init_worker()), the addition of kthread_create_worker() and kthread_destroy_worker(), support for delayed kthread work, and support for freezable kthreads.

  • The network subsystem has added a module called "strparser"; its job is to parse (in-kernel) application-layer protocol messages from a TCP connection. See Documentation/networking/strparser.txt for details.

  • The handling of extended attributes in filesystems has changed. Filesystems that support extended attributes should create an xattr_handlers structure with its low-level methods and attach it to the superblock structure. The setxattr(), getxattr() and removexattr() inode operations are no longer used and have been removed.

  • The rename() inode operation has gained a flags argument. In truth, rename() was removed and the rename2() operation was, well, renamed; all in-kernel filesystems have been updated to reflect the change.

  • The new function current_time() returns the current time at the proper resolution for storage in a specific filesystem; it replaces the old CURRENT_TIME macro. Among other things, the new API is year-2038 safe.
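
The printk() item in the list above can be illustrated with a small user-space model. This is a sketch only, not kernel code: mock_printk() and log_buf are hypothetical stand-ins, and the model assumes that a message without KERN_CONT always begins a new line, terminating any unfinished line before it. The KERN_CONT definition is modeled on the kernel's header, where the marker is a start-of-header byte followed by the letter "c".

```c
#include <stdio.h>
#include <string.h>

#define KERN_SOH  "\001"          /* start-of-header marker */
#define KERN_CONT KERN_SOH "c"    /* "continue the previous line" */

static char log_buf[256];         /* hypothetical stand-in for the kernel log */

/* Hypothetical user-space model of the rule described above: only a
 * message marked with KERN_CONT may extend an unfinished line. */
static void mock_printk(const char *msg)
{
    size_t len = strlen(log_buf);
    int is_cont = strncmp(msg, KERN_CONT, 2) == 0;

    if (is_cont)
        msg += 2;                 /* strip the two-byte marker */
    else if (len > 0 && log_buf[len - 1] != '\n')
        /* An unmarked message ends the partial line first. */
        strncat(log_buf, "\n", sizeof(log_buf) - len - 1);

    len = strlen(log_buf);
    strncat(log_buf, msg, sizeof(log_buf) - len - 1);
}
```

In this model, mock_printk("probing:") followed by mock_printk(KERN_CONT " ok\n") yields the single line "probing: ok", while omitting KERN_CONT from the second call splits the text across two lines. That split is the kind of ugly output the item warns about for code that has not yet been fixed.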

At this point, it seems likely that things will slow down considerably as the 4.9 merge window approaches its scheduled closing on October 16.


On Linux kernel maintainer scalability

By Jonathan Corbet
October 12, 2016

LinuxCon Europe
LWN's traditional development statistics article for the 4.6 development cycle ended with a statement that the process was running smoothly and that there were no process scalability issues in sight. Wolfram Sang started his 2016 LinuxCon Europe talk by taking issue with that claim. He thinks that there are indeed scalability problems in the kernel's development process. A look at his argument is of interest, especially when contrasted with another recent talk on maintainer scalability.

Beyond changesets merged

Sang's core point is that looking at the number of patches merged only tells part of the story; it says nothing about what had to happen to get those patches into the mainline. Looking at the last few years' worth of development cycles, he noted that relatively few patches carry tags beyond the Signed-off-by applied by the developer and the committer. In particular, around the 3.0 days, only about 20% of the patches in the mainline had an Acked-by, Reviewed-by, or Tested-by tag indicating that anybody other than the maintainer had seriously looked at them. That number is closer to 40% in current kernels, he said; it is a clear improvement, but still does not make him happy. For a properly scalable kernel process, he said, we should have much higher levels of review by developers who are not the subsystem maintainer.
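
The tag-based metric described above could be approximated as follows. This is a rough sketch and not Sang's actual methodology: it assumes the raw commit messages are already available as strings, and the function names has_review_tag() and reviewed_percent() are hypothetical. It counts a patch as reviewed if any line of its message begins with one of the three tags named in the text; it makes no attempt to check who supplied the tag.

```c
#include <stddef.h>
#include <string.h>

/* Return 1 if any line of a commit message starts with a review-style tag. */
static int has_review_tag(const char *msg)
{
    static const char *const tags[] = {
        "Acked-by:", "Reviewed-by:", "Tested-by:"
    };
    const char *line = msg;

    while (line && *line) {
        for (size_t i = 0; i < sizeof(tags) / sizeof(tags[0]); i++)
            if (strncmp(line, tags[i], strlen(tags[i])) == 0)
                return 1;
        line = strchr(line, '\n');   /* advance to the next line, if any */
        if (line)
            line++;
    }
    return 0;
}

/* Percentage of messages in msgs[0..n) carrying at least one review tag. */
static double reviewed_percent(const char *const *msgs, size_t n)
{
    size_t hits = 0;

    for (size_t i = 0; i < n; i++)
        hits += (size_t)has_review_tag(msgs[i]);
    return n ? 100.0 * (double)hits / (double)n : 0.0;
}
```

A message carrying only Signed-off-by lines does not count, which matches the distinction drawn in the text between sign-offs and evidence of outside review.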

Another metric one can look at is the time difference between the date on the patch and the date on which it was first committed to a git tree. The Ethernet driver maintainers, he said, are heroes: 80% of all the patches were accepted within two weeks. A number of other subsystems do not do anywhere near as well, and some have gotten significantly worse. I2C, Sang's own subsystem, has stayed about the same over the last three years, which surprised him. As the workload has increased, it has come to feel like things are getting much worse.

The time-to-commit metric may be useful, but it is not without its flaws. The final version of a patch may have been committed fairly quickly, but previous versions could have languished without review for a long time. Patches that are rejected or that get lost are not considered at all.
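
As a sketch of how the time-to-commit figure might be computed, assume each patch is reduced to two timestamps, roughly corresponding to the author date and the commit date that git records. The struct patch_times type and the percent_within_two_weeks() function are hypothetical names chosen for illustration. The flaws noted above apply in full: only the final, accepted version of each patch is ever seen by such a calculation.

```c
#include <stddef.h>
#include <time.h>

struct patch_times {
    time_t authored;    /* date on the patch */
    time_t committed;   /* date it was first committed to a git tree */
};

#define TWO_WEEKS_SECS (14 * 24 * 60 * 60)

/* Percentage of patches committed within two weeks of their patch date. */
static double percent_within_two_weeks(const struct patch_times *p, size_t n)
{
    size_t fast = 0;

    for (size_t i = 0; i < n; i++)
        if (difftime(p[i].committed, p[i].authored) <= TWO_WEEKS_SECS)
            fast++;
    return n ? 100.0 * (double)fast / (double)n : 0.0;
}
```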

One way to try to get a better handle on things is to look at the Patchwork systems for the subsystems that use it, and, in particular, to look at the backlog of patches found there. For I2C, it shows a relatively low backlog until about 3.16, when he gave up on trying to keep up with the flow and fell behind. The ACPI subsystem has an amazing backlog of zero. The relevant maintainer (Rafael Wysocki) was in the room; he noted that it depends on how a subsystem uses Patchwork. He said that he quickly marks a lot of patches as inapplicable; Sang replied that he doesn't even have the time to do that. The ext4 filesystem shows a linear growth in its backlog, up to about 800 patches currently. The numbers for several other subsystems were shown; almost all of them are going up.

The problem, Sang said, is that the number of committers is not scaling to match the growing number of contributors to the kernel. We are getting more reviewers, but they are coming in slowly and are not anywhere near enough. As a result, the number of unprocessed patches is on the increase.

How can this problem be addressed? Users can help by commenting on and, especially, testing patches. Developers need to be aware that sloppiness is often a problem; they should acknowledge when they have done suboptimal work. Developers need to take part in reviewing; if nothing else, they should review their own patches. For maintainers, working harder is not generally the solution; that just leads to burnout. They should get their tools in order and automate tasks whenever possible; looking at what other maintainers are using can be helpful. Companies should allow and encourage their developers to spend time reviewing patches.

What he does not want to see is a "kernel infrastructure initiative". The Core Infrastructure Initiative, run by the Linux Foundation as a way to channel resources to important but underfunded projects, is a good thing, but it is a reaction to a problem that got out of control. Things had to go wrong first. Sang would rather see action now to keep things from getting to that state.

For I2C, Sang intends to step back a bit. He will become one of the I2C developers, one of its architects, and one of its reviewers, but he will not be the only one. That may slow things down in the short term, since he will be doing less patch review. The advantage is that he will stay sane, and will have the time and energy to try to address the problem on higher levels.

The maintainer as bottleneck

While Sang intends to step back on patch review, his plan still calls for him to be the sole committer of patches for the I2C subsystem. In this context, it is interesting to look at another talk, given at Kernel Recipes one week earlier by i915 graphics driver maintainer Daniel Vetter. He, too, made the point that maintainers don't scale, but he would rather see maintainers get help at all levels.

One year ago, he would have said that there was no problem in the i915 subsystem. Applying patches was relatively easy, after all. He had never reviewed the majority of the patches there; i915 has a number of developers who can do that. But, as the single maintainer, he gave the subsystem "a bus factor of one"; when he wasn't available for any reason, things simply came to a stop.

At the 2015 Kernel Summit, Linus Torvalds said that he has come to like the group maintainer model, where more than one person takes responsibility for a given subsystem. Vetter wanted to give that a try, but he quickly ran into a problem: nobody was willing to sign up as the co-maintainer for the i915 subsystem. He was, however, able to find developers who were willing to commit patches for i915; indeed, he signed up 15 of them. He figured he would experiment with the multiple-committer model for one release cycle. After all, nobody had ever really tried this before in the kernel, so it must be a stupid idea.

That was one year ago, he said, and disaster has failed to materialize. Instead, he has "seriously happy contributors," and a whole set of reviewers who can apply the patches they look at. He is now "a bored maintainer," and all of the nagging and begging to get code merged has gone away. He has found that commit rights are a strong carrot that can be used to get developers and companies to contribute — and to be careful about the work they do. It also leads to "distributed conflict management" that makes life easier.

So what does he do anymore? His main job at this point, as "the" maintainer for i915, is communications with the outside, including any work that requires coordination with other subsystem trees. He connects developers with the appropriate reviewers, and puts together the pull requests to send work upstream. And, of course, he "takes the blame for everything".

To make this model work, he said, a subsystem clearly needs a team of developers, and non-maintainer reviews must be the norm. The group should be consistent, with developers who stay around; otherwise, enforcement via social feedback will not work well. Good documentation and tools are necessary; i915 has a set of process documents on this page. When somebody makes a mistake, if possible, a check should be put into the tools to keep it from happening again.

Good testing is crucial to this model. A multi-committer tree can never be rebased, so there is no way to remove embarrassing mistakes. They really need to be avoided in the first place; that requires good pre-commit testing to ensure that the obscure corner cases do not break.

The rough consensus model works best for a group like this. The default on any patch is "no action", so a developer's full disagreement will stop things. What's important, he said, is to have agreement on the goals for the subsystem; disagreement on the path taken toward those goals is acceptable. A good rule of thumb is "if you push a patch and there's screaming on IRC, you shouldn't have done it."

In general, he said, the kernel could probably benefit from more maintainer groups like this. It is a more efficient way to maintain busy subsystems, especially those that currently have a lot of submaintainer trees.

Meanwhile in Berlin

Fast-forward one week; your editor raised this idea in Sang's talk and asked whether the single-committer model might be part of the scalability problems raised there. The developers in that room tended toward skepticism over whether the idea could work outside of the i915 tree. Wysocki, in particular, seemed to feel that there were relatively few submaintainers who could be trusted with full commit access. These maintainers push patches that must be rejected fairly often, so they should not be able to commit directly to the subsystem tree.

Perhaps these developers, too, would be pleasantly surprised if they were to run an experiment with more widely distributed commit rights. In any case, it seems likely that growing numbers of developers and patches will put more stress on subsystem maintainers. If those maintainers are not to become a choke point for kernel development, ways to spread the work they do will be required.

[Your editor thanks both the Linux Foundation and Kernel Recipes for supporting his travel to these events.]


Patches and updates

Kernel trees

Greg KH Linux 4.8.1 Oct 07
Sebastian Andrzej Siewior 4.8-rt1 Oct 06
Greg KH Linux 4.7.7 Oct 07
Greg KH Linux 4.4.24 Oct 07
Steven Rostedt 4.4.23-rt33 Oct 05
Steven Rostedt 3.12.64-rt86 Oct 05

Architecture-specific

Core kernel code

Device drivers

Device driver infrastructure

Documentation

Michael Kerrisk (man-pages) man-pages-4.08 released Oct 08

Filesystems and block I/O

Andreas Gruenbacher Richacls (Core and Ext4) Oct 11

Memory management

Security-related

jobol@nonadev.net LSM ptags: Secure tagging of processe Oct 06

Virtualization and containers

Haozhong Zhang Add Dom0 NVDIMM support for Xen Oct 10

Miscellaneous

Stephen Hemminger iproute 4.8 Oct 09

Page editor: Jonathan Corbet


Copyright © 2016, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds