
The first part of the 6.14 merge window

By Jonathan Corbet
January 23, 2025
As of this writing, just over 4,300 non-merge changesets have been pulled into the mainline repository for the 6.14 release. Many of the pull requests include remarks saying that activity has been relatively low this time around, presumably due to the holidays. So those 4,300 changesets are probably closer to the merge-window halfway point than usual. Much of the work merged thus far looks more like incremental improvements than major new initiatives, but there still have been a number of interesting changes in the mix.

Some of the most significant changes pulled into the mainline so far are:

Architecture-specific

  • The PowerPC architecture has gained lazy preemption support.
  • X86 systems using AMD's Secure Encrypted Virtualization feature now support a secure timestamp counter for guests. In short, it allows guests to read timestamps that cannot be manipulated by the host.
  • AMD's energy-use counters for CPU cores are now supported in the perf events subsystem.

Core kernel

  • The pid_max sysctl knob sets the highest number that can be used for a process ID; it has the effects of limiting the size of PID values and of limiting the total number of processes that may exist. In 6.14, pid_max is now tied to the PID namespace, allowing it to be set independently within containers. It is hierarchical, so no namespace can set pid_max to a value higher than that found in any of its parent namespaces. See this commit for more information about this change.
  • When a program is launched with execveat(), the name of the executed file as stored in its directory entry will be shown in /proc rather than (as is done in current kernels) the file-descriptor number that was used. See this article for details on this change.
  • The new "dmem" control-group controller regulates access to device memory, such as that found on graphics cards. Documentation is sparse, but there is a brief guide to the configuration of this controller available.
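
The hierarchical behavior of pid_max described above can be illustrated with a small model. This is only a sketch: the PidNamespace class and its effective_pid_max() method are hypothetical names invented for the example, not kernel structures, and the model does not say whether the kernel rejects an oversized value at write time or enforces the limit when PIDs are allocated. It only shows the observable result, which is that a namespace is bounded by the smallest limit among itself and its ancestors.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass
class PidNamespace:
    """Toy stand-in for a PID namespace (hypothetical, not a kernel structure)."""

    name: str
    pid_max: int
    parent: Optional["PidNamespace"] = None

    def effective_pid_max(self) -> int:
        """Return the smallest pid_max on the path from this namespace to the root."""
        limit = self.pid_max
        ns = self.parent
        while ns is not None:
            limit = min(limit, ns.pid_max)
            ns = ns.parent
        return limit


# Illustrative values only; the numbers are assumptions chosen for the example.
root = PidNamespace("init", pid_max=4_194_304)
container = PidNamespace("container", pid_max=32_768, parent=root)
nested = PidNamespace("nested", pid_max=1_000_000, parent=container)

# The nested namespace asks for more than its parent allows, so the
# parent's limit is the one that applies.
print(nested.effective_pid_max())  # 32768
```

In this model, a container can be given a smaller PID range than the host, and anything nested inside that container cannot escape the container's limit.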

Filesystems and block I/O

  • The pidfs filesystem can now create file handles (when requested by a name_to_handle_at() call); these can be used to create a system-wide unique identifier for processes even on 32-bit systems. It is also now possible to bind-mount pidfds.
  • The statx() system call can now return the required alignment for read operations on a file; that alignment may be different than the requirement for writes, and some applications can benefit from knowing both.
  • Some Btrfs configurations give the filesystem a choice of multiple devices when the time comes to read a specific block. In current kernels, the PID of the reading process is used to make that decision, but that will focus all read traffic onto a single device in a single-reader workload. The 6.14 kernel adds a couple of new policy options that can implement either round-robin read balancing or simply focus reads onto a specific device. See this commit for instructions on enabling round-robin, or this one to set a specific device.
  • The bcachefs filesystem has a lot of changes after missing the 6.13 development cycle; these include a major on-disk format change that will require a "big and expensive" format upgrade. These changes include self-healing improvements, filesystem-checking time "improved by multiple orders of magnitude", and more; see this merge message for more information.
  • The md-linear personality of the MD (multiple devices) driver, which essentially concatenates block devices, was removed in 6.8 as being deprecated and unmaintained. It seems that there were still users of it, though, so it has been restored for 6.14. This change is also marked for the stable updates, so it should propagate to the older kernels as well.
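
To see why PID-based device selection in Btrfs concentrates a single reader's traffic on one device, while round-robin spreads it out, consider the following toy simulation. It is a sketch under stated assumptions: taking the PID modulo the number of devices is an assumed way of turning a PID into a device choice, the function and class names are hypothetical, and the real round-robin policy has tunables that are not modeled here.

```python
from collections import Counter
from itertools import count


def pick_by_pid(pid: int, num_devices: int) -> int:
    """PID-based choice (assumed to be a simple modulo): one reader, one device."""
    return pid % num_devices


class RoundRobin:
    """Cycle through the devices in order, one read at a time."""

    def __init__(self, num_devices: int) -> None:
        self._counter = count()
        self._num_devices = num_devices

    def pick(self) -> int:
        return next(self._counter) % self._num_devices


def simulate(reads: int, num_devices: int, pid: int):
    """Count how many reads each device receives under both policies."""
    by_pid = Counter(pick_by_pid(pid, num_devices) for _ in range(reads))
    rr = RoundRobin(num_devices)
    by_rr = Counter(rr.pick() for _ in range(reads))
    return by_pid, by_rr


# A single reader (one PID) issuing 1000 reads against two mirrored devices.
by_pid, by_rr = simulate(reads=1000, num_devices=2, pid=4321)
print(dict(by_pid))  # every read lands on the same device
print(dict(by_rr))   # reads are split evenly between the devices
```

With many readers, the PID-based policy balances reasonably well because the PIDs differ; the imbalance only shows up when one process does most of the reading.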

Hardware support

  • Clock: Qualcomm X1P42100 graphics clock controllers, Qualcomm QCS615 and SM8750 global clock controllers, Qualcomm SM8750 TCSR clock controllers, Qualcomm SM8750 display clock controllers, Qualcomm IPQ CMN PLL clock controllers, and Qualcomm SM6115 low power audio subsystem clock controllers.
  • Graphics: Synopsys Designware MIPI DSI host DRM bridges and ZynqMP DisplayPort audio interfaces.
  • Hardware monitoring: TI TPS25990 monitoring interfaces, Intel common redundant power supply monitors, and Analog Devices ADM1273 hot-swap controllers.
  • Miscellaneous: NVMe PCI endpoint function targets, Loongson memory controllers, AMD AI engines, STMicroelectronics LED1202 I2C LED controllers, TI LP8864/LP8866 4/6 channel LED drivers, KEBA SPI interfaces, and Airoha EN7581 SoC CPU-frequency controllers.
  • Networking: NXP S32G/S32R Ethernet interfaces, Realtek 8922AE-VS PCI wireless network adapters, and QNAP microcontroller unit cores.

Miscellaneous

  • The samples directory in the kernel repository contains a new program, mountinfo, which demonstrates the use of the statmount() and listmount() system calls.
  • When Rust 1.84.0 (or later) is available, Rust code in the kernel will use the derive(CoercePointee) feature for pointer coercion. That feature is on the Rust-language stabilization track, and its use is an important step toward using only stable Rust features in the kernel. This merge message shows how it can be used.

Networking

  • The RxRPC protocol implementation can now make use of huge UDP frames for better throughput. Support for the RACK-TLP loss-detection algorithm has also been added.
  • There is a new per-network-namespace sysctl knob — tcp_tw_reuse_delay — that controls how long the system will wait before reusing the port number of a closed TCP socket; its value is in milliseconds.
  • It is now possible to select whether an interface MAC or PHY should be used as the provider of PTP timestamps; this merge message gives some examples of how to do this that are presumably intelligible to people familiar with such things.
  • IPsec IP-TFS/AGGFRAG (RFC 9347) is now supported.
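
As a rough illustration of the tcp_tw_reuse_delay knob mentioned above, the sketch below reads its current value on systems where it exists. The path under /proc/sys/net/ipv4 is an assumption based on where the other TCP sysctl knobs live, and read_tw_reuse_delay_ms() is a hypothetical helper name; on kernels older than 6.14 the file will simply be absent.

```python
from pathlib import Path
from typing import Optional

# Assumed location of the knob; the article does not give the path.
DEFAULT_PATH = Path("/proc/sys/net/ipv4/tcp_tw_reuse_delay")


def read_tw_reuse_delay_ms(path: Path = DEFAULT_PATH) -> Optional[int]:
    """Return the delay in milliseconds, or None if the knob cannot be read."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        # Missing file (older kernel), no permission, or unexpected contents.
        return None


delay = read_tw_reuse_delay_ms()
if delay is None:
    print("tcp_tw_reuse_delay is not available here")
else:
    print(f"TIME-WAIT reuse delay: {delay} ms ({delay / 1000:.3f} s)")
```

Since the knob is per network namespace, the value read this way applies only to the namespace of the calling process.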

Security-related

  • The "xperms" SELinux feature allows policies to target specific ioctl() calls or netlink messages. In-kernel documentation is missing, but this wiki page has some information.

Internal kernel changes

  • The kernel's annotation system, used to add information about code (such as "this jump is safe without a retpoline"), would previously create a different ELF section for each annotation type. There is now a generic annotation infrastructure that gathers all of that information into a single section.

The 6.14 merge window can be expected to remain open through February 2, with the 6.14 release most likely happening on March 23. This timing seems more certain than usual, just because it will maximize editorial pain at LWN due to the Linux Storage, Filesystem, Memory Management, and BPF Summit starting on March 24. One way or another, we'll survive the experience and tell you how it goes.




dmem controller docs

Posted Jan 23, 2025 16:11 UTC (Thu) by sima (subscriber, #160698) [Link]

Note that the new dmem controller also comes with full docs for the internal api:

https://docs.kernel.org/next/core-api/cgroup.html#device-...

I was worried for a moment from the article whether dri-devel folks suddenly skimped on docs, but I think it's all nicely there.

New AT_EXECVE_CHECK for code integrity for interpreters

Posted Jan 23, 2025 16:29 UTC (Thu) by bluca (subscriber, #118303) [Link] (6 responses)

Feels like this one deserved to be mentioned as well, as it opens the door for closing a major gap between Linux and other OSes like Windows:

Author: Mickaël Salaün <mic@digikod.net>
exec: Add a new AT_EXECVE_CHECK flag to execveat(2)

Add a new AT_EXECVE_CHECK flag to execveat(2) to check if a file would
be allowed for execution. The main use case is for script interpreters
and dynamic linkers to check execution permission according to the
kernel's security policy. Another use case is to add context to access
logs e.g., which script (instead of interpreter) accessed a file.

Combined with IPE, dm-verity and an enlightened interpreter, this will allow code integrity checks for scripts too!

New AT_EXECVE_CHECK for code integrity for interpreters

Posted Jan 23, 2025 16:47 UTC (Thu) by acarno (subscriber, #123476) [Link] (3 responses)

I feel like I remember reading more about this flag, but I can't find an LWN article. If I'm reading the man page for execveat(2) correctly, this takes in a directory file descriptor and a pathname. Can this lead to TOCTOU bugs if the underlying file is modified between the AT_EXECVE_CHECK call and the actual script execution? I'd imagine it would be safer to do this via a file descriptor, no?

New AT_EXECVE_CHECK for code integrity for interpreters

Posted Jan 23, 2025 17:40 UTC (Thu) by bluca (subscriber, #118303) [Link]

The feature changed names a few times; this is a recent LWN article about it: https://lwn.net/Articles/982085/

New AT_EXECVE_CHECK for code integrity for interpreters

Posted Jan 23, 2025 20:25 UTC (Thu) by tux3 (subscriber, #101245) [Link]

Perhaps this one: https://lwn.net/Articles/982085/

Or it may have been one of the predecessors, O_MAYEXEC/trusted_for(), linked in this last article.

New AT_EXECVE_CHECK for code integrity for interpreters

Posted Jan 23, 2025 20:28 UTC (Thu) by pbonzini (subscriber, #60935) [Link]

You can use fexecve() to exec a file descriptor, but internally it is implemented using execveat(). This used to have issues (https://lwn.net/Articles/999770/) but they are also fixed in 6.14.

New AT_EXECVE_CHECK for code integrity for interpreters

Posted Jan 24, 2025 2:41 UTC (Fri) by corbet (editor, #1) [Link] (1 responses)

That one came in after the cutoff; you'll get to read about it (again) in part 2.

New AT_EXECVE_CHECK for code integrity for interpreters

Posted Jan 24, 2025 13:47 UTC (Fri) by bluca (subscriber, #118303) [Link]

Nice, thank you, looking forward to that!

Bcachefs

Posted Jan 26, 2025 19:48 UTC (Sun) by Donieck67 (guest, #175152) [Link] (1 responses)

Bcachefs is a filesystem like Btrfs or ZFS. Is it true that it will be removed from the kernel?

Bcachefs remaining in-kernel

Posted Jan 27, 2025 9:08 UTC (Mon) by farnz (subscriber, #17727) [Link]

At the moment, bcachefs will remain in the mainline kernel, and there are no current plans to remove it, unless Kent decides to remove it.


Copyright © 2025, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds