|
|
Subscribe / Log in / New account

5.16 Merge window, part 1

By Jonathan Corbet
November 4, 2021
As of this writing, Linus Torvalds has pulled exactly 6,800 non-merge changesets into the mainline repository for the 5.16 kernel release. That is probably a little over half of what will arrive during this merge window, so this is a good time to catch up on what has been pulled so far. There are many significant changes and some large-scale restructuring of internal kernel code, but relatively few ground-breaking new features.

Changes pulled in the first half of the 5.16 merge window include:

Architecture-specific

  • The Arm 8.6 timer extensions are now supported.
  • MIPS systems have a new just-in-time compiler for the BPF virtual machine.
  • KFENCE is now supported on PA-RISC machines.
  • Hitting the TOC button ("transfer of control") on a PA-RISC machine will now cause the kernel to enter the debugger.
  • RISC-V now has support for virtualization with KVM, an improvement that took rather longer than the developers would have liked.
  • The kernel has gained support for Intel's Advanced Matrix Extension (AMX) feature. This required extensive discussion and a major reworking of the existing floating-point support code.

Core kernel

  • The futex_waitv() system call (described in this article) has been merged.
  • The CPU scheduler has gained an understanding of "clusters", a hardware arrangement where multiple cores share the same L2 cache. The cluster-aware scheduler will take pains to distribute tasks across all clusters in the system to balance the load on caches across the machine.
  • The tracefs interface to the tracing mechanism now supports the setting of owner and group permissions; this feature can be used to give a specific group access to tracing functionality. The "other" permission bits still cannot be set to allow access to the world as a whole, though.
  • As usual there is a pile of BPF changes. The new bpf_trace_vprintk() helper can output information without the three-argument limit of bpf_trace_printk(). Support for calling kernel functions in loadable modules from BPF has been added. There is a new bloom-filter map type. Unprivileged BPF is now disabled by default. There is also a new document describing the licensing of various BPF components and the requirements for users.

Filesystems and block I/O

  • The block layer continues to see a series of performance optimizations resulting in significantly better per-core operation rates.
  • There is now support for multi-actuator (rotating) disks that can access multiple sectors at the same time. This commit documents the sysfs interface for these drives.
  • There is a new ioctl() command (CDROM_TIMED_MEDIA_CHANGE) for detecting media-change events in CDROM drives. Evidently people are still using CDROM drives...
  • The EROFS filesystem has gained simple multiple-device support.

Hardware support

  • Media: OmniVision OV13B10 sensors, Hynix Hi-846 sensors, and R-Car image signal processors.
  • Miscellaneous: Microchip external interrupt controllers, Apple mailbox controllers, Ingenic JZ47xx SoCs SPI controllers, Cadence XSPI controllers, Maxim MAX6620 fan controllers, Intel Keem Bay OCS elliptic-curve encryption accelerators, ACPI WMAA backlight interfaces, Intel ISHTP eclite embedded controllers, Barco P50 GPIO, and Samsung S6D27A1 DPI panels.
  • Networking: Realtek RTL8365MB-VC Ethernet switches, Realtek 802.11ax wireless chips, Asix AX88796C-SPI Ethernet adapters, and Mellanox MSN4800-XX line cards.

Networking

  • There is a new, user-settable socket option called SO_RESERVE_MEM. It has the effect of reserving some kernel memory permanently for the relevant socket; that, in turn, should speed networking operations, especially when the system is under memory pressure. Note that the feature is only available when memory control groups are in use, and the reserved memory is charged against the group's quota.
  • The In-situ Operations, Administration, and Maintenance (IOAM) support has been enhanced support the encapsulation of IOAM data into in-transit packets. This commit has a little bit of further information.
  • The ethtool netlink API has gained the ability to control transceiver modules; see this commit and this commit for more information.
  • The netfilter subsystem can now classify packets at egress time; see this commit for some more information.
  • Support for Automatic Multicast Tunneling (RFC 7450) has been added.
  • There are two new sysctl knobs controlling what happens to the ARP cache when a network device loses connectivity. arp_evict_nocarrier says whether ARP cache entries should be deleted when an interface loses carrier, while ndisc_evict_nocarrier does the same thing for the neighbor discovery table. Both exist to allow cache entries to be retained when a WiFi interface moves between access points on the same network. This commit and this one contain more information.

Security-related

  • Most of the work for strict memcpy() bounds checking has been merged. The actual patches enabling bounds checking across the kernel have not (yet) been merged, though, pending fixes for some remaining problems.
  • The io_uring subsystem has gained audit support.
  • The SELinux and Smack security modules can now apply security policies to io_uring operations.
  • Auditing will now record the contents of the open_how structure passed to openat2().
  • The integrity measurement architecture (IMA) can now apply rules based on the group ID of files and the users accessing them.
  • The default Spectre-mitigation behavior for seccomp() threads has changed, resulting in fewer mitigations being applied and a corresponding performance increase. One can read this commit for the reasoning behind this change, but the short version is that the extra mitigations weren't really buying any more security.

Internal kernel changes

  • The folio patch set, which has been the topic of much discussion over the past several months, was the very first thing to be merged for 5.16. This work adds a new "folio" type to represent pages that are known not to be tail pages, then reworks the internal memory-management APIs to use this type. The result is better type clarity and even a small performance increase — and a lot more work to do in the future.
  • There is a new internal function, cc_platform_has(), that provides a generic interface for kernel code to query the presence of confidential-computing features. Its first use is to replace mem_encrypt_active() for checking whether memory encryption is in use.

Assuming that the usual two-week schedule holds, the 5.16 merge window can be expected to close on November 14. Once that happens, we'll be back with a summary of the remaining changes pulled for the next kernel release.

Index entries for this article
KernelReleases/5.16


to post comments


Copyright © 2021, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds