
LWN.net Weekly Edition for February 27, 2025

Welcome to the LWN.net Weekly Edition for February 27, 2025

This edition contains the following feature content:

  • Python interpreter adds tail calls: a new tail-call-based bytecode interpreter brings a significant speedup to CPython.
  • A possible path for cancelable BPF programs: run-time resource tracking as a step toward killable BPF programs.
  • Slabs, sheaves, and barns: a per-CPU object cache aims to speed up the slab allocator.
  • Support for atomic block writes in 6.13: the kernel gains a user interface for untorn writes.
  • Filesystem support for block sizes larger than the page size: XFS decouples the filesystem block size from the page size.
  • AlmaLinux considers EPEL 10 rebuild for older hardware: rebuilding EPEL 10 for x86_64_v2 systems.
  • Multi-host testing with the pytest-mh framework: a pytest plugin for testing applications that span multiple hosts.

This week's edition also includes these inner pages:

  • Brief items: Brief news items from throughout the community.
  • Announcements: Newsletters, conferences, security updates, patches, and more.

Please enjoy this week's edition, and, as always, thank you for supporting LWN.net.

Comments (none posted)

Python interpreter adds tail calls

By Daroc Alden
February 26, 2025

The Faster CPython project has been working to speed up the Python interpreter for the past several years. Now, Ken Jin, a member of the project, has merged a new set of changes that have been benchmarked as improving performance by 10% for some architectures. The only change is switching from using computed goto statements to using tail calls as part of the implementation of Python's bytecode interpreter — but that change allows modern compilers to generate significantly better code.

Python's many interpreters

When running a Python program, the interpreter compiles the program to custom Python bytecode — instructions for a high-level stack-based virtual machine. This is a big speed advantage because the bytecode is a denser, more linear representation of the Python program. Instead of needing to process some data structure representing the program, the inner loop of the interpreter can just fetch each instruction sequentially and interpret it. The bytecode is still fairly high-level, however, and Python's new just-in-time compiler (JIT) cannot turn it directly into machine code without first collecting type information by actually running the code.

Prior to Jin's changes, the CPython code base boasted three different interpreters: a switch-based bytecode interpreter, a bytecode interpreter using computed goto statements, and a micro-op interpreter. The micro-op interpreter is a relatively recent addition that is also the work of the Faster CPython project. It is used for validating the implementation of Python's JIT compiler, and doesn't need to run outside of that context, so its performance is not a central concern. On the other hand, every instruction in a Python program is executed by one of the two bytecode interpreters at least once in order to gather the information that the JIT compiler needs to produce machine code.

The bytecode interpreters have the job of running the bytecode until the JIT compiler can take over — and, since it takes time for the JIT compiler to warm up, their performance is important to the overall performance of the program. Whether that interpreter uses switch statements or computed goto statements, its operation is conceptually the same: the interpreter fetches the next instruction, looks it up, jumps to the code to handle that instruction, and then loops.

The difference between the two interpreters comes down to how they jump between the implementation of bytecode instructions. The switch-based interpreter is conceptually laid out like this, although the actual implementation is a bit messier:

    while (true) {
        instruction i = get_next_instruction();
        switch (i) {
            case POP:
                // Code for a POP instruction
                ...
                break;
            case ADD:
                // Code for an ADD instruction
                ...
                break;
            ...
        }
    }

The computed-goto-based interpreter uses a GNU C extension ("labels as values") to look up a label in a jump table and then jump to it, resulting in code that instead looks something like this:

    static void* dispatch_table[] = { &&POP, &&ADD, ... };

    #define DISPATCH() goto *dispatch_table[get_next_instruction()]

    DISPATCH();

    POP:
        // Code for POP
        ...
        DISPATCH();
    ADD:
        // Code for ADD
        ...
        DISPATCH();

In this version of the interpreter, the loop is replaced by the sequence of jumps from each bytecode instruction to the next one.

The need for different interpreters is purely performance-motivated. They implement the same loop over the bytecode, but Clang and GCC generate better code for the computed-goto-based version. The switch-based interpreter is an older design that exists in order to support the Microsoft Visual C++ (MSVC) compiler, which doesn't implement computed-goto statements. Most Linux users running the Python build from their distribution will use the computed-goto interpreter.

The main reason the computed-goto-based version is faster is due to the limitations of the CPU's branch predictor. In the switch-based design, a single native instruction (corresponding to the switch) makes an indirect jump to the code for every bytecode instruction. This quickly exhausts the limited prediction capabilities of the branch predictor, making every indirect jump into a slow misprediction. The computed-goto-based design spreads those indirect jumps out over multiple locations (everywhere DISPATCH() is called) at the end of each bytecode instruction's implementation. This lets the branch predictor perform better because it can save more context, and also because it can learn which bytecode instructions are more likely to follow each other.

But there are still problems with the version using computed gotos. Combining the implementations for all of the bytecode instructions into a single large function is hard on the compiler. Most compilers have a limit for how large a function they'll attempt to optimize, but even without that, optimizing a single gigantic function is hard. Many operations, such as register allocation, scale poorly to large functions. Clang and GCC produce valid, working code, but it's not the best code that they could produce with some help.

Tail calls

This is where tail calls come in. Tail calls are an old idea in computer science — when the last thing that one function does is call another function, the first function can simply jump into the second function without using a call instruction, reusing the stack frame. This saves memory and time by not pushing an extra return address to the stack, and lets the second function return directly to the first function's caller. In Python's case, changing the interpreter to use tail calls to jump between the implementation of different bytecode instructions lets the implementation be broken up into many smaller functions, which is easier for the compiler to optimize and therefore produces better code.

Tail calling is usually presented as an optimization, and both Clang and GCC do perform tail-call optimization opportunistically, but until relatively recently, that optimization could not be guaranteed, making it impossible to rely on. Some languages, such as Scheme, make it part of the language specification that tail calls must use a jump instruction. This lets programmers using those languages implement loops with recursion without ever overflowing the stack. With C, since it wasn't guaranteed, any program written in that style would have a chance of producing stack-overflow errors if there were a long enough sequence of un-optimized tail calls. In Python's case, that could affect any non-trivial Python program.

Since 2021, Clang has supported a [[clang::musttail]] attribute on function calls, which instructs the compiler that the call must be an optimized tail call. If the compiler can't comply for some reason, it will raise an error. GCC added support for the same attribute in 2024.

With the [[musttail]] attribute ensuring that switching to using tail calls would not cause stack overflow errors, Python could consider changing implementations. Unfortunately, just switching to tail calls with no other changes actually introduced a performance regression, Jin said.

The problem was function-call overhead. The prologue of a C function must do certain things in order to uphold the ABI requirements of the architecture. In particular, many architectures require functions to save the values in certain registers ("callee-saved" registers) to the stack. Then the epilogue of the function restores those values to the registers. When tail calling from one function to another, this means that the first function will restore a set of registers from the stack, jump to the second function, and then the second function will immediately put those registers back on the stack, where they just were. This is completely unnecessary, but because the C ABI requires it, every function would repeat this wasted work.

The solution comes from another attribute, preserve_none. Functions with this attribute don't preserve any registers for their callers. The calling convention was originally created for the Glasgow Haskell Compiler (GHC), which is why old documents sometimes refer to it as "ghccc". [Although C++11 and C23 standardized the [[...]] attribute syntax for C++ and C, respectively, a reader correctly pointed out that preserve_none itself has not actually been standardized yet.]

Making random functions in a program preserve_none will likely not have much of an impact on performance, since it just puts the burden of preserving any registers on their callers. But for Python's bytecode interpreter, once the program starts executing bytecode instructions, every subsequent call will be a tail call to another preserve_none function. (Or, at the very end of the program, a call to Python's cleanup code and then exit()). So the compiler can completely remove the function prologue and epilogue, eliminating the overhead from using function calls.

When using both attributes, Jin's new tail-call-based interpreter inherits the nice performance benefits of the computed-goto-based version, while also making it easier for the compiler to figure out optimal register allocations and other local optimizations.
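
To make the design concrete, here is a stand-alone toy interpreter written in the same conceptual style as the earlier examples. It is only a sketch of the technique: the state structure, opcode set, and helper names are invented for this illustration, the attribute spellings are the GNU-style ones accepted by recent versions of Clang, and CPython's generated interpreter passes more state between handlers and is far more elaborate.

    #include <stdio.h>

    enum { OP_PUSH1, OP_ADD, OP_PRINT, OP_HALT };

    struct vm_state {
        const unsigned char *ip;     /* next bytecode instruction */
        int stack[16];
        int sp;
    };

    /* Every handler uses the preserve_none calling convention. */
    typedef __attribute__((preserve_none)) void instr_fn(struct vm_state *s);

    static instr_fn op_push1, op_add, op_print, op_halt;

    static instr_fn *dispatch_table[] = { op_push1, op_add, op_print, op_halt };

    /* Fetch the next opcode and tail-call its handler; the musttail
       attribute turns the call into a guaranteed jump. */
    #define DISPATCH(s) \
        __attribute__((musttail)) return dispatch_table[*(s)->ip++](s)

    static __attribute__((preserve_none)) void op_push1(struct vm_state *s)
    {
        s->stack[s->sp++] = 1;       /* code for a PUSH1 instruction */
        DISPATCH(s);
    }

    static __attribute__((preserve_none)) void op_add(struct vm_state *s)
    {
        s->sp--;                     /* code for an ADD instruction */
        s->stack[s->sp - 1] += s->stack[s->sp];
        DISPATCH(s);
    }

    static __attribute__((preserve_none)) void op_print(struct vm_state *s)
    {
        printf("%d\n", s->stack[s->sp - 1]);
        DISPATCH(s);
    }

    static __attribute__((preserve_none)) void op_halt(struct vm_state *s)
    {
        (void)s;    /* a plain return goes straight back to run()'s call site */
    }

    static void run(const unsigned char *code)
    {
        struct vm_state s = { .ip = code, .sp = 0 };
        dispatch_table[*s.ip++](&s);
    }

    int main(void)
    {
        static const unsigned char program[] = {
            OP_PUSH1, OP_PUSH1, OP_ADD, OP_PRINT, OP_HALT
        };
        run(program);    /* prints 2 */
        return 0;
    }

Because every handler ends in a guaranteed tail call to the next one, the bytecode program executes as a chain of jumps between small functions, which is exactly the property that lets the compiler optimize each handler in isolation.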

Impacts on Python

Normally, implementing a fourth interpreter for a language would seem like a fairly big maintenance burden. In Python's case, however, the build process uses an interpreter generator to create interpreters from a shared description of the bytecode instructions. So Jin's change is only about 200 lines of Python code that extends the interpreter generator to produce the tail-calling version.

The new interpreter does have some drawbacks, though. The main one is that compilers have only recently started supporting both required attributes, so builds of Python made with older compilers won't benefit. The full benefit of the changes also depends on profile-guided optimization and link-time optimization; without those, the compilers can't take full advantage of the optimization opportunities provided by the smaller functions. The official Python releases are built with profile-guided optimization, link-time optimization, and a recent compiler; so are many of the Python builds shipped by distributions, although a few distributions, such as NixOS, disable profile-guided optimization for reasons of build reproducibility. In any case, these limits shouldn't cause problems for most Python users.

Jin measured the performance impact of the change as 10% on average (using the geometric mean), but up to 40% on some benchmarks that spend most of their time executing bytecode. He said that the "speedup is roughly equal to 2 minor CPython releases worth of improvements. For example, CPython 3.12 roughly sped up by 5%." The change generated relatively little discussion, which is perhaps unsurprising given such a large speedup from such a small change. Jin's work has been merged, and should be available to end users in Python 3.14, expected in October 2025.

Comments (19 posted)

A possible path for cancelable BPF programs

By Daroc Alden
February 25, 2025

The Linux kernel supports attaching BPF programs to many operations. This is generally safe because the BPF verifier ensures that BPF programs can't misuse kernel resources, run indefinitely, or otherwise escape their boundaries. There is continuing tension, however, between trying to expand the capabilities of BPF programs and ensuring that the verifier can handle every edge case. On February 14, Juntong Deng shared a proof-of-concept patch set that adds some run-time checks to BPF to make it possible in the future to interrupt a running BPF program.

When initially conceived, BPF had strict limits on the number of instructions a program could contain, and did not permit loops, limiting how long a program could run. This is important because the kernel will call BPF hooks during many time-sensitive operations, so a misbehaving BPF program that managed to run for too long could potentially cause kernel hangs or other problems. Over time, those limits have been gradually expanded for two reasons: the verifier has become more capable — handling loops, more complicated functions, etc. — and developers have discovered that complicated, long-running BPF programs are quite useful, prompting them to ask for the limits to be loosened.

Theoretically, BPF could have some kind of watchdog system to kill programs that take too long at run time, instead of trying to bound the run time of the program statically. This would have some advantages, such as allowing for programs with complex control flow that the verifier cannot currently understand. The major problem with this idea is that BPF programs can hold kernel resources, such as locks, and therefore killing them is not as simple as just interrupting their execution.

Deng's patch set tackles this problem by adding relatively low-overhead tracking of acquired kernel resources. With dynamic tracking, the process of killing a BPF program gets much simpler, because the kernel can release any resources held by the program at any time. The patch set doesn't yet implement the actual watchdog and the associated killing of BPF programs, though. Instead, it is more of a proof of concept to show that such a thing could be made to work with BPF's use of kernel resources.

Deng is also clear that even once the patch set is ready, it will still just be an intermediate step. The verifier should continue being improved until it can track which resources need to be cleaned up at each point in the program statically:

Note that this patch series is not intended to replace pre-runtime/post-runtime efforts, and having no runtime overhead is always better than having some. Our final goal is to have no runtime overhead. In the future, as these no-runtime-overhead solutions mature, the runtime-overhead solutions can be disabled.

This is similar to the relationship between BPF JIT and BPF interpreter. We always know that JIT is better and should be used eventually, but the interpreter is a not too bad alternative when JIT is not ready or cannot be used, and can help us support some features faster.

The details

Any solution needs to perform well, given how prevalent BPF programs are in performance-sensitive parts of the kernel. In this case, that means avoiding dynamic allocation for resource tracking. Luckily, the verifier already tracks when BPF programs acquire or release resources. Deng's patch set adds code to determine the maximum number of resources that a program can hold simultaneously, and allocates a static table of that size to hold resource information. The table holds a series of slots, each initialized with a pointer to the next one, forming a linked list of free slots.

The table needs to hold not only pointers to the resources in question, but also the type of each resource. BPF programs can hold several different types of kernel resources, which all need to be acquired and released using specific functions. For example, bpf_task_from_pid() is used to acquire a reference to a task_struct, which is released with bpf_task_release().
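
As a rough illustration of that pattern, here is a minimal BPF program fragment (in the C dialect used for BPF programs) that acquires and releases a task reference. The attach point and the vmlinux.h-based build setup are only assumptions for the example; nothing here is taken from Deng's patches.

    #include "vmlinux.h"
    #include <bpf/bpf_helpers.h>

    /* kfunc declarations; a real program would typically get these from a
       generated header. */
    extern struct task_struct *bpf_task_from_pid(s32 pid) __ksym;
    extern void bpf_task_release(struct task_struct *p) __ksym;

    SEC("tp_btf/task_newtask")      /* illustrative attach point */
    int watch_init(void *ctx)
    {
        struct task_struct *task = bpf_task_from_pid(1);

        if (!task)
            return 0;

        /* ... use the acquired task reference ... */

        /* The verifier insists that this release always happens; Deng's
           hooks would additionally record the acquire/release at run time. */
        bpf_task_release(task);
        return 0;
    }

    char LICENSE[] SEC("license") = "GPL";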

In order to look up the correct release function based on the type of the resource, Deng's patch builds a table of types. Theoretically, when a BPF program is killed, the code can do binary search in the table to find the resource type, and thus its release function. Most of the time, however, the BPF program will release the resource normally. When this happens, the kernel needs to find the entry in the resource table for the resource and add it to the list of free slots. This is done using a fixed-size hash table so that the lookup is fast.

Overall, Deng's patch adds a small amount of run-time overhead to the process of acquiring and releasing resources. When acquiring, the code needs to look up the resource type using binary search, pop a slot off the free slots list, and insert an entry into the hash table. When releasing, the code needs to look up the entry in the hash table, remove it from the table, and push a slot onto the free slots list. All of these steps take an average of constant time, with the exception of looking up the resource type, which takes time logarithmic in the number of BPF types that can be tracked in this way.
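
The following sketch shows the general shape of that bookkeeping in plain C. It is not code from Deng's patch set, and it leaves out the type table and the hash table, but it illustrates how a statically sized slot table with an embedded free list makes acquisition and release constant-time operations.

    #include <stddef.h>
    #include <stdio.h>

    #define MAX_RES_SLOTS 32   /* the verifier would compute the real bound */

    struct res_slot {
        void *resource;              /* the acquired kernel object          */
        int   type_id;               /* index into the table of types       */
        struct res_slot *next_free;  /* links unused slots into a free list */
    };

    static struct res_slot slots[MAX_RES_SLOTS];
    static struct res_slot *free_list;

    static void init_slots(void)
    {
        for (int i = 0; i < MAX_RES_SLOTS - 1; i++)
            slots[i].next_free = &slots[i + 1];
        slots[MAX_RES_SLOTS - 1].next_free = NULL;
        free_list = &slots[0];
    }

    /* Called when the program acquires a resource. */
    static struct res_slot *track_acquire(void *resource, int type_id)
    {
        struct res_slot *slot = free_list;

        if (!slot)                   /* cannot happen if the table is sized
                                        from the verifier's bound */
            return NULL;
        free_list = slot->next_free;
        slot->resource = resource;
        slot->type_id = type_id;
        /* a real implementation also inserts the slot into a hash table
           keyed by the resource pointer, for fast lookup on release */
        return slot;
    }

    /* Called when the program releases the resource normally. */
    static void track_release(struct res_slot *slot)
    {
        slot->resource = NULL;
        slot->next_free = free_list;
        free_list = slot;
    }

    int main(void)
    {
        int dummy;

        init_slots();
        struct res_slot *s = track_acquire(&dummy, 0);
        track_release(s);
        printf("tracked and released one resource\n");
        return 0;
    }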

In order to actually do this work, the patch changes the verifier to insert a call to a hook function that does the accounting and then forwards the call to the real function every time the program calls an acquire or release function. This works well for kfuncs (kernel functions exposed to BPF), and means that the kfuncs themselves don't need to be modified. Kfuncs are already annotated with KF_ACQUIRE and KF_RELEASE flags to indicate that they are involved in acquiring or releasing a resource. These functions, of which there are 39 pairs in the 6.13 kernel, all have similar signatures, taking only a single argument of their associated type. Deng's code uses that information to figure out the association between kfuncs and types.

That approach doesn't work for BPF helpers (functions exposed through an older and more brittle mechanism), however. Helpers don't have the same kinds of annotations. A full solution to making BPF programs killable will need to hard-code information on which BPF helpers are associated with different types.

Deng later suggested using the same hook mechanism to add run-time tracing for BPF programs. The verifier could, in a special debug mode, add a hook for every kfunc, not only the ones associated with acquiring or releasing resources. That could potentially extend something like the kernel's ftrace mechanism to work with BPF programs.

The future of BPF

Deng's patch set has not seen any discussion yet — but the underlying idea of adding more run-time checks to BPF is not new. BPF's reliance on the verifier to run code without most run-time checking is one of the things that makes it unusual among language runtimes. Other compiled languages can use sophisticated analyses to remove redundant checks or indirections, but they usually approach the problem by starting with all of the necessary checks and then selectively pruning away the ones the compiler can prove are not needed. BPF, on the other hand, essentially makes anything the verifier can't understand the programmer's problem.

Does this unique feature of BPF need to be preserved at any cost, requiring kernel developers to eschew any approach that introduces run-time overhead? Or, as developers push for BPF to become more capable, is the more common approach taken by other languages the inevitable future? Deng's patch tries to walk a middle path — introducing a feature with run-time overhead, with the expectation that it will be replaced when the verifier is improved enough to track how resources need to be released. But it's hard not to see the need for patches like this as indicative of the fact that developers want access to new capabilities that the verifier cannot provide.

The verifier is already complex, and only going to become more so. Whether that will drive BPF developers to move away from implementing everything in the verifier over time remains to be seen. Either way, the topic seems likely to provoke some amount of discussion in the coming months.

Comments (7 posted)

Slabs, sheaves, and barns

By Jonathan Corbet
February 24, 2025
The kernel's slab allocator is responsible for the allocation of small (usually sub-page) chunks of memory. For many workloads, the speed of object allocation and freeing is one of the key factors in overall performance, so it is not surprising that a lot of effort has gone into optimizing the slab allocator over time. Now that the kernel is down to a single slab allocator, the memory-management developers have free rein to add complexity to it; the latest move in that direction is the per-CPU sheaves patch set from slab maintainer Vlastimil Babka.

Many kernel developers interact with the slab allocator using functions like kmalloc(), which can allocate objects of any (reasonable) size. There is a lower level to the slab allocator, though, that deals with fixed-size objects; it is used heavily by subsystems that frequently allocate and free objects of the same size. The curious can see all of the special-purpose slabs in their system by looking at /proc/slabinfo. There are many core-kernel operations that involve allocating objects from these slabs and returning them, so the slab allocator has gained a number of features, including NUMA awareness and bulk operations, to accelerate allocation and freeing.

But, naturally, it's never fast enough.

One of the keys to improved performance on today's multiprocessor systems is to avoid interaction between the CPUs whenever possible. Interactions lead to contention, locking delays, and cache-line bouncing, all of which hurt performance; a CPU that can act as if the rest of the system weren't there can go forward at full speed. This understanding has driven the adoption of per-CPU data structures across the kernel. The slab allocator already makes use of per-CPU data, but it still has enough cross-CPU interactions to slow things down.

Sheaves are a concept introduced by Babka's patch set; in essence, they are a per-CPU cache of objects that can be handed out in response to allocation requests without the need to interact with any other CPUs in the system. By default, sheaves are disabled, but they can be enabled for a specific slab by setting a non-zero value in the new field sheaf_capacity in the kmem_cache_args structure passed to kmem_cache_create(). The value is the number of objects that should be cached in a single sheaf; the patch adding sheaf usage to the maple-tree data structure sets it to 32.
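
Based on that description, opting a cache into sheaves might look something like the sketch below; it uses the recently added kmem_cache_args-based form of kmem_cache_create(). The structure and cache names are invented, and the sheaf_capacity field exists only in the patch set, not in mainline kernels.

    #include <linux/errno.h>
    #include <linux/init.h>
    #include <linux/slab.h>

    struct my_node {
        unsigned long key;
        void *value;
    };

    static struct kmem_cache *my_node_cache;

    static int __init my_cache_init(void)
    {
        struct kmem_cache_args args = {
            /* cache up to 32 objects per CPU in a sheaf */
            .sheaf_capacity = 32,
        };

        my_node_cache = kmem_cache_create("my_node", sizeof(struct my_node),
                                          &args, SLAB_HWCACHE_ALIGN);
        return my_node_cache ? 0 : -ENOMEM;
    }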

When sheaves are enabled, the allocator will maintain a sheaf with the given number of objects for each CPU. An allocation request will be satisfied from this sheaf whenever possible, and freed objects will be placed back into the sheaf if there is room. That turns allocation and free operations into purely local assignments that can be executed quickly; no locking (or even atomic operations) required. There is a second (backup) sheaf maintained for each CPU as well; when the main sheaf is found to be empty, an object will be allocated from the backup sheaf instead. If the main sheaf is full when an object is freed, that object will be placed into the backup sheaf if possible.

When both sheaves are full, there will no longer be a place to stash a freed object with a simple assignment; that is where the "barn" comes in. The barn is simply a place to keep sheaves that are not currently being used for caching by any CPU; there is one barn for each NUMA node in the system. Once a CPU has filled its sheaves, it will try to place one in the barn; if a CPU's sheaves are empty, it will try to obtain a full one from the barn. In either case, this operation is slower, since locking is required to safely access the shared barn, but it is still faster than going all the way into the slab allocator.

The barn holds two sets of sheaves — one for full sheaves, and one for empty sheaves. If a CPU is freeing a lot of objects, it can place its full sheaves into the barn and obtain empty ones to replace them. There is a limit to the number of sheaves the barn can hold; it is wired to ten each for full and empty sheaves in the current patch set. If a CPU tries to place a sheaf into a full barn, that sheaf will be freed, along with any objects it contains, back into the slabs they came from.

As described so far, this series has the potential to greatly accelerate memory-allocation operations for workloads that allocate and free a lot of slab objects. But there are a couple of other features that are added later in the series to make sheaves more useful.

One of those is an enhancement to kfree_rcu(), which will delay the freeing of an object until after a read-copy-update (RCU) grace period has passed, ensuring that the object no longer has any active references. A third per-CPU sheaf is maintained to hold objects freed with kfree_rcu(); once the sheaf fills, it is passed to the RCU subsystem for the grace-period wait. Once that has happened, an attempt will be made to put the sheaf back into the barn.

The other addition is preallocation. There are many places in the kernel where memory must be allocated without delay, and certainly without blocking the allocating thread. There are also numerous code paths that cannot deal with an allocation failure in any reasonable way. In most of these cases, there is an opportunity to preallocate the needed memory before going into the more constrained code. The kernel has long maintained subsystems like mempools to meet this need.

But, if the kernel can go into critical code in the knowledge that it has a per-CPU sheaf full of objects available to it, a lot of these problems (and the need for mempools) go away. To that end, the series provides a set of functions for working with special-purpose sheaves. A call to kmem_cache_prefill_sheaf() will return a sheaf containing at least the requested number of objects, grabbing it out of the barn if possible. Then, kmem_cache_alloc_from_sheaf() can be used to allocate objects from the sheaf with guaranteed success for at least the requested number of objects. Other functions can be used to return a sheaf to the barn or to place more objects into a sheaf. These special-purpose sheaves act similarly to mempools, but they are intended to be short-lived, unlike mempools which usually exist for the life of the system.
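
A use of the prefill interface might look roughly like the following sketch. Only kmem_cache_prefill_sheaf() and kmem_cache_alloc_from_sheaf() are named above; their argument lists here are assumptions based on that description rather than the series' final API, and my_node_cache comes from the earlier hypothetical example.

    static int my_critical_operation(void)
    {
        struct slab_sheaf *sheaf;
        struct my_node *node;

        /* Before the critical section: get a sheaf guaranteed to hold at
           least ten objects, from the barn if one is available. */
        sheaf = kmem_cache_prefill_sheaf(my_node_cache, GFP_KERNEL, 10);
        if (!sheaf)
            return -ENOMEM;

        /* Inside the critical section: allocation from the sheaf cannot
           fail for the first ten objects. */
        node = kmem_cache_alloc_from_sheaf(my_node_cache, GFP_NOWAIT, sheaf);
        if (!node)
            return -ENOMEM;
        node->key = 0;

        /* ... use node ... */

        /* Afterward, another function in the series returns the sheaf to
           the barn (or frees it); it is not shown here. */
        return 0;
    }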

This series appears to be relatively uncontroversial, though perhaps developers are reserving their comments for the upcoming Linux Storage, Filesystem, Memory-Management, and BPF Summit, to be held in late March. Given the appetite for faster memory allocation and freeing, though, sheaves and barns seem likely to be added to the mix at some point.

Comments (17 posted)

Support for atomic block writes in 6.13

February 20, 2025

This article was contributed by Ritesh Harjani and Ojaswin Mujoo

Atomic block writes, which have been discussed here a few times in the past, are block operations that either complete fully or do not occur at all, ensuring data consistency and preventing partial (or "torn") writes. This means the disk will, at all times, contain either the complete new data from the atomic write operation or the complete old data from a previous write. It will never have a mix of both the old and the new data, even if a power failure occurs during an ongoing atomic write operation. Atomic writes have been of interest to many Linux users, particularly database developers, as this feature can provide significant performance improvements.

The Linux 6.13 merge window included a pull request from VFS maintainer Christian Brauner titled "vfs untorn writes", which added the initial atomic-write capability to the kernel. In this article, we will briefly cover what these atomic writes are, why they are important in the database world, and what is currently supported in the 6.13 kernel.

To support atomic writes, changes were required across various layers of the Linux I/O stack. At the VFS level, an interface was introduced to allow applications to request atomic write I/O, along with enhancements to statx() to query atomic-write capabilities. Filesystems had to ensure that physical extent allocations were aligned to the underlying device's constraints, preventing extents from crossing atomic-write boundaries. For example, NVMe namespaces may define atomic boundaries; writes that straddle these boundaries will lose atomicity guarantees.

The block layer was updated to prevent the splitting of in-flight I/O operations for atomic write requests and to propagate the device constraints for atomic writes to higher layers. Device drivers were also modified to correctly queue atomic write requests to the hardware. Finally, the underlying disk itself must support atomic writes at the hardware level. Both NVMe and SCSI provide this feature, but in different ways; NVMe implicitly supports atomic writes for operations that remain within specified constraints, but SCSI requires a special command to ensure atomicity.

Why do databases care?

A common practice in databases is to perform disk I/O in fixed-size chunks, with 8KB and 16KB being popular I/O sizes. Databases also, however, maintain a journal that records enough information to enable recovery from a possible write error. The idea is that, if the write of new data fails, the database can take the old data present on disk as a starting point and use the information in the journal to reconstruct the new data. However, this technique is based on the assumption that the old data on disk is still consistent after the error, which may not hold if a write operation has been torn.

Tearing may happen if the I/O stack doesn't guarantee atomicity. The multi-KB write issued by the database could be split by the kernel (or the hardware) into multiple, smaller write operations. This splitting could result in a mix of old and new data being on disk after a write failure, thus leading to inconsistent on-disk data which can't be used for recovery.

To work around this possibility, databases employ an additional technique called "double write". In this approach, they first write a copy of the older data to a temporary storage area on disk and ensure that the operation completes successfully before writing to the actual on-disk tables. In case of an error in that second write operation, databases can recover by performing a journal replay on the saved copy of the older data, thus ensuring an accurate data recovery. But, as we can guess, these double writes come at a significant performance cost, especially for write-heavy workloads. This is the reason atomicity is sought after by databases; if the I/O stack can ensure that the chunks will never be torn, then databases can safely disable double writes without risking data corruption and, hence, can get that lost performance back.

Current state in Linux

As discussed during LSFMM+BPF 2024, some cloud vendors might already advertise atomic-write support using the ext4 filesystem with bigalloc, a feature that enables cluster-based allocation instead of per-block allocation. This helps to properly allocate aligned physical blocks (clusters) for atomic write operations. However, claiming to support atomic writes after auditing code to convince oneself that the kernel doesn't split a write request is one thing, while properly integrating atomic-write support with a well-defined user interface that guarantees atomicity is another.

With the Linux 6.13 release, the kernel provides a user interface for atomic writes using direct I/O. Although it has certain limitations (discussed later in this article), this marks an important step toward enabling database developers to explore these interfaces.

A block device's atomic-write capabilities are stored in struct queue_limits. These limits are exposed to user space via the sysfs interface at /sys/block/<device>/queue/atomic_*. The files atomic_write_unit_min and atomic_write_unit_max indicate the minimum and maximum number of bytes that can be written atomically. If these values are nonzero, the underlying block device supports atomic writes. However, hardware support alone is not sufficient; as mentioned earlier, the entire software stack, including the filesystem, block layer, and VFS, must also support atomic writes.

How to use the atomic-write feature

Currently, atomic-write support is only enabled for a single filesystem block. Multi-block support is under development, but those operations bring some more constraints that are still being discussed in the community. To utilize the current atomic write feature in Linux 6.13, the filesystem must be formatted with a block size that is suitable for the application's needs. A good choice is often 16KB.

Note, though, that ext4 does not support filesystem block sizes greater than the system's page size, so, on systems with 4KB page size (such as x86), ext4 cannot use a block size of 16KB and, thus, cannot support atomic write operations of that size. On the other hand, XFS recently got large block size support, allowing it to handle block sizes greater than page size. Note also that there is no problem with ext4 or XFS if the page size of the system itself is either 16KB or 64KB (such as on arm64 or powerpc64 systems), as both filesystems can handle block sizes less than or equal to the system's page size.

The following steps show how to make use of the atomic-write feature:

  1. First create a filesystem (ext4 or xfs) with a suitable block size based on the atomic-write unit supported by the underlying block device. For example:
        mkfs.ext4 -b 16K /dev/sdd
        mkfs.xfs -bsize=16K /dev/sdd
    
  2. Next, use the statx() system call to confirm whether atomic writes are supported on a file by the underlying filesystem. Unlike checking the block device sysfs path, which only indicates whether the underlying disk supports atomic writes, statx() allows the application to query whether it is possible to request an atomic write operation on a file and determine the supported unit size, which also ensures that the entire I/O stack supports atomic writes.

    To facilitate atomic writes, statx() now exposes the following fields when the STATX_WRITE_ATOMIC flag is passed:

    • stx_atomic_write_unit_min: Minimum size of an atomic write request.
    • stx_atomic_write_unit_max: Maximum size of an atomic write request.
    • stx_atomic_write_segments_max: Upper limit for segments — the number of separate memory buffers that can be gathered into a write operation (e.g., the iovcnt parameter for IOV_ITER). Currently, this is always set to one.
    • The STATX_ATTR_WRITE_ATOMIC flag in the stx_attributes field is set if atomic writes are supported.

    An example statx() snippet would look like the following:

        struct statx stat_buf;

        statx(AT_FDCWD, file_path, 0, STATX_BASIC_STATS | STATX_WRITE_ATOMIC, &stat_buf);

        printf("Atomic write Min: %u\n", stat_buf.stx_atomic_write_unit_min);
        printf("Atomic write Max: %u\n", stat_buf.stx_atomic_write_unit_max);
    
  3. Finally, to perform an atomic write, open the file in O_DIRECT mode and issue a pwritev2() system call with the RWF_ATOMIC flag set. Ensure that the total length of the write is a power of two that falls between atomic_write_unit_min and atomic_write_unit_max, and that the write starts at a naturally aligned offset in the file with respect to the total length of the write.

Currently, pwritev2() with RWF_ATOMIC supports only a single iovec and is limited to a single filesystem block write. This means that filesystems, when queried via statx(), report both the minimum and maximum atomic-write unit as a single filesystem block (e.g., 16KB in the example above).
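
Putting the steps above together, a minimal atomic-write program might look like the following sketch. The file path and the 16KB size are assumptions carried over from the earlier mkfs example, error handling is abbreviated, and RWF_ATOMIC is defined manually (with the value from the kernel's UAPI headers) in case the C library headers do not provide it yet.

    #define _GNU_SOURCE
    #include <fcntl.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/uio.h>
    #include <unistd.h>

    #ifndef RWF_ATOMIC
    #define RWF_ATOMIC 0x00000040    /* from the kernel's uapi headers */
    #endif

    int main(void)
    {
        size_t len = 16 * 1024;      /* one 16KB filesystem block */
        void *buf;

        int fd = open("/mnt/data/dbfile", O_RDWR | O_DIRECT);
        if (fd < 0)
            return 1;

        /* O_DIRECT requires an aligned buffer. */
        if (posix_memalign(&buf, len, len))
            return 1;
        memset(buf, 0xab, len);

        struct iovec iov = { .iov_base = buf, .iov_len = len };

        /* A single 16KB write at a naturally aligned offset; it either
           lands on disk completely or not at all. */
        ssize_t ret = pwritev2(fd, &iov, 1, 0, RWF_ATOMIC);
        if (ret != (ssize_t)len)
            return 1;

        close(fd);
        free(buf);
        return 0;
    }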

The future

Kernel developers have implemented initial support for direct I/O atomic writes that are limited to a single filesystem block. However, there is ongoing work that aims to extend the support to multi-block atomic writes for both the ext4 and XFS filesystems. Despite its limitations, this feature provides a foundation for those interested in atomic-write support in Linux. It also presents an opportunity for users, such as database developers, to start exploring and experimenting with the feature; they can also collaborate with the community to enhance it, as it is still under active discussion and development.

Comments (18 posted)

Filesystem support for block sizes larger than the page size

February 20, 2025

This article was contributed by Pankaj Raghav

The maximum filesystem block size that the kernel can support has always been limited by the host page size for Linux, even if the filesystems could handle larger block sizes. The large-block-size (LBS) patches that were merged for the 6.12 kernel removed this limitation in XFS, thereby decoupling the page size from the filesystem block size. XFS is the first filesystem to gain this support, with other filesystems likely to add LBS support in the future. In addition, the LBS patches have been used to get the initial atomic-write support into XFS.

LBS is an overloaded term, so it is good to clarify what it means in the context of the kernel. The term LBS is used in this article to refer to places where the filesystem block size is larger than the page size of the system. A filesystem block is the smallest unit of data that the filesystem uses to store file data on the disk. Setting the filesystem block size will only affect the I/O granularity in the data path and will not have any impact on the filesystem metadata.

Long history

The earliest use case for LBS came from the CD/DVD world, where reads and writes had to be performed in 32KB or 64KB chunks. LBS support was proposed to handle these devices and avoid workarounds at the device-driver level, but it was never merged. Beyond these historical needs, LBS enables testing filesystem block sizes larger than the host page size, allowing developers to verify XFS functionality with 64KB blocks on x86_64 systems that do not support 64KB pages. This is particularly valuable given the increasing adoption of architectures with larger page sizes.

Another emerging use case for LBS comes from modern high-capacity solid-state storage devices (SSDs). Storage vendors are increasing their internal mapping unit (commonly called the Indirection Unit or IU) beyond 4KB to support these devices. When I/O operations are not sized for this larger IU, the device must perform read-modify-write operations, increasing the write amplification factor. LBS enables filesystems to match their block size with the device's IU, avoiding these costly operations.

Although LBS sounds like it has something to do with the block layer, the block-size limit in the kernel actually comes from the page cache, not the block layer. The main requirement to get LBS support for filesystems is the ability to track a filesystem block as a single unit in the page cache. Since a block is the smallest unit of data, the page cache should not partially evict a single block during writeback.

There were multiple attempts in the past to add LBS support. The most recent effort from Dave Chinner in 2018 worked around the page-cache limitation by adding the IOMAP_F_ZERO_AROUND flag in iomap. This flag pads I/O operations with zeroes if the size is less than a single block. The patches also removed the writepage() callback to ensure that the entire large block was written during the writeback. This effort was not upstreamed as folios were getting traction, which had the potential to solve the partial-eviction problem directly at the virtual filesystem layer.

Large folio support was added to the page cache, and it was enabled in XFS in 5.18. If a filesystem supports large folios, then the page cache will opportunistically allocate larger folios in the read and write paths based on the size of the I/O operation. The filesystem calls mapping_set_large_folios() during inode initialization to enable the large-folios feature. But the page cache could still fall back to allocating an order-0 folio (a single page) if the system is running low on memory, so there is no guarantee on the size of the folios.

The LBS patches, which were developed by me and some colleagues, add a way for the filesystem to inform the page cache of the minimum and maximum order of folio allocation to match its block size. The page cache will allocate large folios that match the order constraints set by the filesystem and ensure that no partial eviction of blocks occurs. The mapping_set_folio_min_order() and mapping_set_folio_order_range() APIs have been added to control the allocation order in the page cache. The order information is encoded in the flags member of the address_space struct.

Setting the minimum folio order is sufficient for filesystems to add LBS support, since they only need a guarantee on the smallest folio allocated in the page cache. Filesystems can set the minimum folio order based on the block size during inode initialization. Existing callers of mapping_set_large_folios() will not notice any change in behavior, because that function now sets the minimum order to zero and the maximum order to MAX_PAGECACHE_ORDER.
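
A filesystem's use of the new API might look roughly like the sketch below; the helper function is hypothetical and simplified, but it shows the core idea of deriving the minimum folio order from the block size at inode-initialization time.

    #include <linux/fs.h>
    #include <linux/pagemap.h>

    /* Hypothetical helper, called while setting up a new inode. */
    static void example_setup_inode_mapping(struct inode *inode,
                                            unsigned int blkbits)
    {
        unsigned int min_order = 0;

        /* A block of size (1 << blkbits) needs folios of at least
           2^(blkbits - PAGE_SHIFT) pages when it is larger than a page. */
        if (blkbits > PAGE_SHIFT)
            min_order = blkbits - PAGE_SHIFT;

        /* Guarantee that folios caching this inode's data are never
           smaller than one filesystem block. */
        mapping_set_folio_min_order(inode->i_mapping, min_order);
    }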

Under memory pressure, the kernel will try to break up a large folio into individual pages, which could violate the promise of minimum folio order in the page cache. The main constraint is that the page cache must always ensure that the folios in the page cache are never smaller than the minimum order. Since the 6.8 kernel, the memory-management subsystem has the support to split a large folio into any lower-order folio. LBS support uses this feature to always maintain the minimum folio order in the page cache even when a large folio is split due to memory pressure or truncation.

Other filesystems?

Readers might be wondering if it will be trivial to add LBS support to other filesystems since the page cache infrastructure is now in place. The answer unfortunately is: it depends. Even though the necessary infrastructure is in place in the page cache to support LBS in any filesystem, the path toward adding this support depends on the filesystem implementation. XFS has been preparing for LBS support for a long time, which resulted in LBS patches requiring minimal changes in XFS.

A filesystem needs to support large folios in order to support LBS, so any filesystem that uses buffer heads in the data path cannot support LBS at the moment. XFS developers moved away from buffer heads and designed iomap to address the shortcomings of buffer heads. While there is work underway to support large folios in buffer heads, it might take some time before it gets added. Once large-folio support is in place in a filesystem, adding LBS support is mostly a matter of finding any corner cases that make assumptions about the block size.

LBS has already found a use case in the kernel. The guarantee that the memory in the page cache representing a filesystem block will not be split has been used by the atomic-write support in XFS for 6.13. The initial support in XFS will only allow writing one filesystem block atomically; the drive needs to be formatted with a filesystem block size equal to the desired atomic-write size.

The next focus of the LBS project is to remove the logical-block-size restriction for block devices. Similar to the filesystem block size, the logical block size, which is the smallest size that a storage device can address, is restricted to the host page size due to limitations in the page cache. Block devices cache data in the page cache when applications perform buffered I/O operations directly on the devices, and they use buffer heads by default to interact with the page cache. So large-folio support is needed in buffer heads to remove this limitation for block devices.

Many core changes, such as large-folio support, XFS using iomap instead of buffer heads, multi-page bvec support in the block layer, and so on, took place over the 17 years after the first LBS attempt. As a result, LBS support could be added with relatively few changes. With that done for 6.12, XFS finally supports all the features that it supported in IRIX before it was ported to Linux in 2001.

[I would like to thank Luis Chamberlain and Daniel Gomez for their contributions to both this article and the LBS patches. Special thanks to Matthew Wilcox, Hannes Reinecke, Dave Chinner, Darrick Wong, and Zi Yan for their thorough reviews and valuable feedback that helped shape the LBS patch series.]

Comments (19 posted)

AlmaLinux considers EPEL 10 rebuild for older hardware

By Joe Brockmeier
February 24, 2025

The AlmaLinux project has published a request for comments (RFC) on rebuilding Fedora's Extra Packages for Enterprise Linux (EPEL), which provides additional software for Red Hat Enterprise Linux (RHEL) and its derivatives, to support older x86_64 hardware that is not supported by EPEL 10. While this may sound simple on the surface, the proposed rebuild carries a few potential risks that the AlmaLinux and EPEL contributors would like to avoid. The AlmaLinux Engineering Steering Committee (ALESCo) is currently considering feedback and will vote on the RFC in March.

Alma and EPEL

AlmaLinux is one of the projects that sprang up in the wake of Red Hat discontinuing CentOS Linux. The project provides a binary-compatible clone of RHEL using sources from CentOS Stream, rather than attempting to be a 1:1 rebuild. It occasionally deviates from RHEL to provide faster security updates, provide expanded hardware support, or re-add features and packages that Red Hat has discontinued, such as SPICE protocol support for QEMU, and Firefox and Thunderbird in AlmaLinux 10. Most of those changes have been fairly modest, but the project will be going a bit farther with the next major RHEL release.

The upcoming RHEL 10 release, and Stream 10, are only targeting x86_64 systems with the x86_64_v3 instruction set architecture (ISA), which means Advanced Vector Extensions (AVX) and AVX2. The decision to drop support for v2 systems has not been entirely popular, since there are still recent CPUs (such as Intel's Atom series) that do not have support for AVX2.

To accommodate users who have hardware without AVX2 support, AlmaLinux is producing two builds of AlmaLinux 10—one which follows upstream Stream and RHEL with v3-optimized binaries by default, and one which targets x86_64_v2. However, that leaves a gap for users on the v2 builds: access to packages in the EPEL 10 repositories, which will also be built for v3 only and would not run on hardware that only supports v2.

Without access to EPEL packages, the utility of a RHEL clone is significantly reduced for many users. In recognition of that fact, ALESCo members Jonathan Wright, Neal Gompa, and Andrew Lukoshko submitted an RFC in January for the AlmaLinux project to take on rebuilding EPEL to support AlmaLinux 10, and AlmaLinux Kitten (the project's rolling development release) for the v2 builds. (AlmaLinux 10 builds for v3 can use EPEL normally.)

The proposal is straightforward: AlmaLinux would pull the latest stable source RPMs (SRPMs) from EPEL on a daily basis and rebuild those packages with the proper compiler options for v2, and then put them in an optional repository that can be enabled with DNF just like EPEL for other AlmaLinux releases. There will be two production branches for Alma's EPEL: one that targets AlmaLinux 10 stable and one that targets Kitten. The repository will be an optional target for AlmaLinux mirrors, so mirrors won't have to bear the burden of the extra disk space for the v2 EPEL repository by default. The RFC also states that no changes are permitted to EPEL sources "downstream", so if a package does not build for v2 without modification the project will have to contribute fixes to EPEL.

In addition to providing the obvious benefit of EPEL packages for users on v2 builds, the RFC states that this work will also lay the groundwork for "future alternative architecture work" to support other architectures that CentOS does not support. The RFC does not specify which architectures those might be, but I emailed the RFC authors about the proposal and Wright indicated that RISC-V is one of the architectures that the project has in mind.

Potential confusion

One of the concerns with this plan is the potential for confusion between the official EPEL repositories and the AlmaLinux rebuild. To no one's surprise, EPEL maintainers would prefer not to have to field bugs that might be caused by rebuilding EPEL 10 packages for x86_64_v2. EPEL team lead Carl George suggested that the rebuild RPMs carry a custom distribution tag (or dist tag) to indicate that they are built by Alma and not EPEL. George said that users rarely know to check a package's Vendor tag, but the EPEL Bugzilla template asks for the package's name, version, and release (NVR), which would capture the custom tag and make it obvious it was not provided by EPEL.

A custom dist tag would allow EPEL packagers to summarily close bugs filed against packages that had been rebuilt against AlmaLinux rather than built by the EPEL project. While that might sound less than helpful, George said that it would avoid wasting time on both sides by speeding the process of putting the bug in front of an AlmaLinux developer interested in resolving the problem. It would also, he said, help avoid "reputation damage" for the AlmaLinux project by reducing the number of misfiled bugs with EPEL.

Gompa initially seemed to object to providing a custom dist tag, and said that it would be "a significant deviation from our existing policy". George noted that this was necessary so that EPEL package maintainers could distinguish Alma rebuilds from EPEL builds. After some back-and-forth between Gompa and George, Wright said that he was leaning towards the dist tag solution:

It's pretty common knowledge though that Alma itself is a fully separate distro from RHEL. Adding EPEL to the mix, which aside from Oracle's thing has never really been widely rebuilt anywhere, could cause confusion. I see no harm in differing the dist-tag to help make it more obvious that it is different since there is no downside to doing so that has been brought up yet, or that I can think of.

With that, Gompa agreed and suggested a .alma_altarch suffix for the dist tag.

Expected to pass

Wright said via email that the next ALESCo meeting is on March 5, that most of the technical details have been ironed out, and that the proposal is expected to pass.

The demise of CentOS Linux has been widely viewed, not unreasonably, in a negative light. However, the projects that are filling the void left by CentOS Linux—with the freedom to deviate from RHEL and provide options Red Hat has no interest in supporting—may end up being better for the community in the long run.

Comments (6 posted)

Multi-host testing with the pytest-mh framework

February 21, 2025

This article was contributed by Pavel Březina

The pytest-mh project is a plugin that provides a multi-host test framework for the popular pytest unit-testing framework and test runner. Work on pytest-mh started in 2023 to solve a multitude of issues that cropped up for developers and testers when testing the SSSD project, which is a client for enterprise identity management. I was not happy with the state of testing of the SSSD project and wanted to create something that would increase test readability, remove duplication, eliminate errors, and provide multi-host testing capabilities, while having the flexibility to build a new API around it. Finally, I also wanted something that can be used by anyone to test their projects as well.

The pytest-mh plugin is licensed under the GPLv3. (The pytest framework is under the MIT license.) It provides building blocks to build a test framework around a project, and tools that connect the tests to one or more hosts. Developers can write tests in Python that execute commands on remote systems as well as implement complex automated setup and teardown to ensure that each test starts with a well-defined environment.

The pytest-mh plugin is designed for applications that work on multiple hosts, such as client/server applications. Unlike pytest, pytest-mh is not designed for unit testing. Instead it allows users to write tests that will exercise the application as a complete product, which is often referred to as system testing or application testing. Projects do not need to be written in Python to work with pytest-mh, making it a potential solution for all kinds of projects.

The main focus of pytest-mh is to test applications by executing commands on virtual machines, containers, or remote hosts; then to automatically revert all changes that were made to the host during a test to make sure that every single test starts with a fresh and clearly defined environment. It also has a well-documented, and extensible, API that helps to write readable and maintainable test cases. While small or single-host projects can benefit from the framework, it is meant for projects that consist of multiple components that usually run on different hosts—typical examples are client/server and frontend/backend models, or projects that are tightly integrated with the operating system, where writing tests can often be more difficult than implementing new features.

Brief introduction to pytest

The pytest command-line application supports automatic collection of tests from a directory structure, test parameterization (running a single test with different parameters), and fixtures (functions that provide context for tests, such as content or environment details). It has a comprehensive plugin system that can be used to alter and enhance its behavior and functionality.

This snippet shows an example of a test for the strip() method, a built-in method on Python string objects that removes leading and trailing whitespace. The test provides a single body that is run multiple times with different parameters.

    import pytest
    
    @pytest.mark.parametrize("value", ["hello", " hello", "hello ", " hello "])
    def test_strip(value):
        assert value.strip() == "hello"
    
    $ pytest -vvv tests/test_strip.py
    ============ test session starts ============
    platform linux -- Python 3.13.1, pytest-8.3.4, pluggy-1.5.0 -- /usr/bin/python3
    cachedir: .pytest_cache
    rootdir: /var/home/pbrezina/workspace/pytest-mh
    configfile: pytest.ini
    collected 4 items
    
    tests/test_strip.py::test_strip[hello] PASSED                            [ 25%]
    tests/test_strip.py::test_strip[ hello] PASSED                           [ 50%]
    tests/test_strip.py::test_strip[hello ] PASSED                           [ 75%]
    tests/test_strip.py::test_strip[ hello ] PASSED                          [100%]
    
    ============= 4 passed in 0.01s =============

More information about pytest can be found on the project's documentation site including a get started guide.

What is multi-host testing

Multi-host testing, as the name suggests, is a kind of application testing that requires multiple hosts (usually virtual machines or containers) to run a test. While it is often possible to run all components of the application on a single host, that may not always be the case or it may require non-trivial work to enable testing on a single machine. Therefore it is often beneficial to install individual components of the application on multiple hosts to simplify the work and provide a testing environment that is closer to reality.

Consider SSSD as an example project. SSSD provides a system daemon that connects to various authentication and authorization backends such as LDAP, FreeIPA, SambaDC, and Active Directory. When requested, it reads data from one or more backends, stores it in a local cache, then passes the data via defined interfaces to other system components such as nsswitch, PAM, sudo, or autofs. SSSD therefore requires a complex test environment with a client connecting to many different backends with addition of NFS and Kerberos Key Distribution Center (KDC) servers in order to test autofs and Kerberos integration.

SSSD is written in C, and used to run tests on a single machine inside a chroot, utilizing cwrap to gain control over many system calls. While testing using cwrap was possible, those tests were limited to the LDAP backend, and adding the new functionality required by new SSSD features was quite difficult and time-consuming. SSSD switched to pytest-mh to make testing easier and more straightforward. Much of the code from the tests has been moved into a reusable framework that is supported by pytest-mh's built-in features, reducing the amount of time spent worrying about how to test the software and leaving more time for deciding what to test.

Introduction to pytest-mh

The pytest project provides markers (pytest.mark.$markername), which are decorators that can be used to perform simple filtering by keyword. For example, "pytest -m $markername" would only run tests marked with $markername. They can also be used to implement advanced features inside a plugin. To support testing across multiple hosts, pytest-mh includes a topology marker (@pytest.mark.topology). A topology declares the hosts and roles that are required to run a test. It can be used to allow developers and testers to execute commands on the hosts, and to automatically filter out tests that cannot be run because some of the required hosts are not currently available. If a test requires two hosts, such as "ldap" and "client", and only the "client" host is available, then the test is skipped.

The supported connectors for running commands on remote hosts are currently SSHClient, for connecting via SSH, and ContainerClient, for connecting to containers running under Docker or Podman. It is expected that the project under test is fully installed on one of the defined hosts; the tests then execute commands on the hosts in order to exercise the application. Here is a simple test from SSSD's test suite that uses the pytest-mh topology marker to associate the test with specific topology requirements.

    @pytest.mark.topology(KnownTopology.LDAP)
    def test_sssctl__check_missing_id_provider(client: Client):
        # create sssd.conf and start SSSD with the default configuration for an LDAP server.
        client.sssd.start()
    
        # remove id_provider parameter from domain section.
        client.sssd.config.remove_option("domain/test", "id_provider")
        client.sssd.config_apply(check_config=False)
    
        # Check the error message in output of # sssctl config-check
        output = client.host.conn.run("sssctl config-check", raise_on_error=False)
        assert "[rule/sssd_checks]: Attribute 'id_provider' is missing in section 'domain/test'." in output.stdout_lines[1]
    
    $ pytest --mh-config=./mhc.yaml -k test_sssctl__check_missing_id_provider -vvv
    ...
    tests/test_sssctl.py::test_sssctl__check_missing_id_provider (ldap) PASSED   [100%]

This test is associated with an LDAP topology by the pytest.mark.topology marker. This particular topology from the sssd-test-framework declares that the test requires one client and one LDAP host for it to run. If this requirement cannot be satisfied, the test is skipped.

The test starts SSSD with the default configuration, makes changes that render the configuration invalid, and applies those changes; all of this is implemented in the sssd-test-framework using pytest-mh building blocks. The test then runs sssctl config-check on the client host and asserts on its output. The client fixture is dynamically created by pytest-mh; it is an instance of the client role, which gives access to the client host and to the various functionality required by SSSD tests.

Multi-host topology

When pytest is run, it collects all tests from the available test files, but each test may require different hosts in order to run properly. Running such a test suite directly would often result in lots of timeouts and failed tests, unless the tests that should not be executed are explicitly filtered out, either by deselecting them on the command line or by splitting the suite into multiple smaller sets.

The pytest-mh project solves this by associating each test with a multi-host topology, which declares the set of hosts and roles that are required to run the test. A test suite may include many topologies. pytest-mh requires a configuration file that defines which hosts and roles are currently available. The available hosts may satisfy more than one topology; in that case, all of the tests that can be run are executed.

The topology from the previous example can be defined as:

    LDAP = TopologyMark(
        "ldap",
        Topology(TopologyDomain("myproject", client=1, ldap=1)),
        fixtures=dict(client='myproject.client[0]', ldap='myproject.ldap[0]')
    )

This says that the LDAP topology requires one host that supports the client role and one host that supports the ldap role. It also defines the names of the fixtures: client and ldap, pointing to the relevant host objects.

A configuration file that supports this topology can define two hosts with the required roles, along with how to access them via SSH, which is the default connector.

    domains:
    - id: myproject
      hosts:
      - hostname: client.myproject.test
        role: client
        conn:
          type: ssh
          host: 192.168.0.10
          user: root
          password: Secret123
      - hostname: ldap.myproject.test
        role: ldap
        conn:
          type: ssh
          host: 192.168.0.20 # IP address or hostname
          user: root
          private_key: /my/private/key/path
          private_key_password: Secret123

In this example, pytest-mh will establish an SSH connection to the hosts as the root user, using a password for the client host and a private key for the LDAP host.

Topology parameterization

Some projects provide one interface to the user but may fetch data from different backends (e.g. different SQL databases) or over different setups (e.g. an encrypted versus an unencrypted communication channel). SSSD is a perfect example, as it provides user information from various identity-management solutions. The result is always the same for the user who runs id pbrezina, but the code path triggered inside SSSD is completely different for each backend. Normally, testing this would require duplicating test code with slight modifications to its setup, such as adding a user to the IPA server instead of Active Directory and then configuring SSSD accordingly. However, pytest-mh brings pytest's parameterization to multi-host topologies.

Topology parameterization allows a developer or tester to associate a single test with multiple topologies, which can rapidly increase code coverage. The test is then run multiple times, once for each topology; any setup that differs between topologies can be handled by a topology controller. The following test uses SSSD's topology controllers and KnownTopologyGroup.AnyProvider, which expands to multiple topologies: Active Directory, Samba DC, LDAP, and IPA. The test is executed once per topology, with the same body but exercising completely different code paths; the output below shows a test failure for the IPA backend.

    @pytest.mark.topology(KnownTopologyGroup.AnyProvider)
    def test_identity__lookup_username_with_id_command(client: Client, provider: GenericProvider):
        ids = [("user1", 10001), ("user2", 10002), ("user3", 10003)]
        for user, id in ids:
            provider.user(user).add(uid=id, gid=id + 500)
    
        client.sssd.domain["ldap_id_mapping"] = "false"
        client.sssd.start()
    
        for name, uid in ids:
            result = client.tools.id(name)
            assert result is not None, f"User {name} was not found using id!"
            assert result.user.name == name, f"Username {result.user.name} is incorrect, {name} expected!"
            assert result.user.id == uid, f"User id {result.user.id} is incorrect, {uid} expected!"
    
    $ pytest --mh-config=./mhc.yaml -k test_identity__lookup_username_with_id_command -vvv
    ...
    tests/test_identity.py::test_identity__lookup_username_with_id_command (ad)    PASSED [ 25%]
    tests/test_identity.py::test_identity__lookup_username_with_id_command (ipa)   FAILED [ 50%]
    tests/test_identity.py::test_identity__lookup_username_with_id_command (ldap)  PASSED [ 75%]
    tests/test_identity.py::test_identity__lookup_username_with_id_command (samba) PASSED [100%]

    ================================== FAILURES ===================================
    ____________ test_identity__lookup_username_with_id_command (ipa) _____________

    client = <sssd_test_framework.roles.client.Client object at
    0x7f7d92bfb770>,
    provider = <sssd_test_framework.roles.ipa.IPA object at 0x7f7d92691e80>

        @pytest.mark.topology(KnownTopologyGroup.AnyProvider)
        def test_identity__lookup_username_with_id_command(client: Client, provider: GenericProvider):
            ids = [("user1", 10001), ("user2", 10002), ("user3", 10003)]
            for user, id in ids:
                provider.user(user).add(uid=id, gid=id + 500)
        
            client.sssd.domain["ldap_id_mapping"] = "false"
            client.sssd.start()
        
            for name, uid in ids:
                result = client.tools.id(name)
    >           assert result is not None, f"User {name} was not found using id!"
    E           AssertionError: User user1 was not found using id!
    E           assert None is not None

    tests/test_identity.py:30: AssertionError
    ------------------------------ Captured log setup -----------------------------
    ...
    ========================== short test summary info ============================
    FAILED
    tests/test_identity.py::test_identity__lookup_username_with_id_command
    (ipa) - AssertionError: User user1 was not found using id!
    assert None is not None
    ===== 1 failed, 2 passed, 690 deselected, 3 warnings in 74.47s (0:01:14) ======

Building blocks

Besides filtering and running the tests, pytest-mh provides many building blocks and ready-to-use utilities. The building blocks are classes used to implement the custom hosts, roles, and topologies needed to test a project; the utilities are shared code. One utility, probably the most heavily used, handles filesystem management: reading and writing files, creating directories, uploading, downloading, and so on. Other utilities can start or stop systemd services, manipulate the system firewall, fail tests when a core dump or an SELinux denial is detected, or simulate network delays.

All of the utilities included in pytest-mh come with thorough setup and teardown code to make sure that the host is set up as expected and that all changes are automatically reverted. If a file is created, it is automatically removed when the test is done; if a host is blocked by the firewall, it is unblocked at the end of the test; and so on.
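
As a rough illustration of how such a utility might be used, here is a sketch (not code from the SSSD test suite); the client.fs attribute and the exact method names are assumptions based on the framework's conventions:

    @pytest.mark.topology(KnownTopology.LDAP)
    def test_example__write_and_read_file(client: Client):
        # Write a file on the remote client host; the filesystem utility
        # records the change so that it can be reverted automatically.
        client.fs.write("/etc/myproject.conf", "debug_level = 9\n")

        # Read the file back over the same connection and check its contents.
        assert "debug_level = 9" in client.fs.read("/etc/myproject.conf")

        # No explicit cleanup is needed: when the test finishes, the
        # utility's teardown code removes the file (or restores the
        # original contents) as part of reverting all recorded changes.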

Further reading

This article only scratches the surface of pytest-mh's capabilities. The best place to go from here is the pytest-mh documentation and its getting-started guide, which walks through a small, basic test suite for sudo. For a real-world use case, look at the sssd-test-framework and the SSSD tests; another good example is the shadow-utils test suite, which is much less complex. pytest-mh is still a young project, so contributions are welcome, whether they are bug reports, feature requests, or pull requests to fix bugs and add features.

Comments (9 posted)

Building an open-source battery

February 26, 2025

This article was contributed by Koen Vervloesem


FOSDEM

FOSDEM 2025 featured the usual talks about open-source software, but, as always, the conference also offered the opportunity to discover some more exotic and less software-centric topics. That's how I learned about the Flow Battery Research Collective (FBRC), which is building what will eventually become an open-source home battery. Daniel Fernández Pinto represented the collective at FOSDEM with his talk "Building an Open-Source Battery for Stationary Storage" in the "Energy: Accelerating the Transition through Open Source" developer room (devroom).

The open-source battery project cooperates closely with Utrecht University's FAIR-Battery project and is fully financed by the NLnet Foundation. The FBRC is a relatively new project that started last year. Fernández, a chemist, had been doing battery research at home and documenting his findings on a blog since 2019. Electrochemical engineer Kirk Smith discovered Fernández's blog and proposed joining forces. That led to the formation of a project to "build an open-source battery aimed at solar and wind storage in the long term", while aiming to create kits for academic purposes in the short term.

Fernández started his talk with a brief explanation of how lithium-ion (Li-ion) batteries work, since that is the battery technology most of us are familiar with. "In lithium-ion batteries, we're basically just moving lithium ions from a graphite substrate to a metal oxide." He underscored how thin such a battery is: the cross-section is actually just a quarter of a human hair's thickness. "What prevents such a battery from shorting is a 5µm separator", he elaborated. What we commonly refer to as a Li-ion battery is actually composed of thousands of these layers packed and rolled together. The advantage of Li-ion batteries is their high energy density, but his explanation was followed by the obligatory images of fires at large battery installations. He noted: "any puncture in these tiny layers and it all goes up in flames".

Redox flow battery

Fernández then described an alternative type of energy storage: the redox flow battery (RFB), which he characterized as "more robust for large-scale energy storage". The technology actually dates back to the 1980s. Rather than incorporating solid-state layers, a flow battery consists of two tanks that store the reagents: a positive electrolyte (a solution that accepts electrons) and a negative electrolyte (a solution that donates electrons). Each tank has a pump that circulates its solution within its own circuit (tank and pipes); the two circuits meet in a cell in the middle, where the fluids exchange electrons across a membrane. This approach is easily scalable: the tanks can be enlarged if greater energy storage is needed, while the cell size can be increased if more power is required.

Naturally, the question arises whether RFBs can compete with other battery technologies. Fernández presented a chart with five variables: energy density, power density, safety/sustainability, initial affordability, and cycle life. This clearly illustrated that RFBs are not nearly as dense as Li-ion batteries: "They have probably a tenth, or even a twentieth, of the energy density of lithium-ion batteries." However, they hold significant advantages in safety, affordability, and cycle life.

As for safety, RFBs are aqueous systems, "so they don't catch fire". The reagents are generally also more environmentally friendly than the chemistry in Li-ion batteries, Fernández noted. Additionally, "if something breaks in the cell, you can take it apart and replace it", and the reagents can be simply replaced if they stop working. "In a lithium-ion battery, any small failure is critical and destroys the battery."

Why an open-source battery?

"There is currently no open-source battery initiative at all", Fernández stated, whether for Li-ion or flow batteries, hence the motivation for him and Smith to establish one. He is aware of some other open-source projects that aim to reproduce "just a cell", but upon reading the research papers, he realized that there's a lot of missing information, such as instructions on how to build the pumps, reservoirs, and electronics.

Regarding this matter, Fernández said that flow-battery research faces reproducibility issues: "a lot of researchers publish completely different results because of their varying setups, and we wanted to create a cell that could serve as a standard cell for flow battery research". Even though the FBRC intends to sell kits with battery cells, Fernández emphasized that they want everyone to be able to build the cells on their own.

Roadmap

By the end of 2024, the FBRC had completed a bench-top battery cell, with a cell area of less than 10cm², capable of supplying a low voltage and low current. This will evolve into the kit that the FBRC plans to sell "probably this year or next year". As the project is open-source, buying the kit isn't necessary, since anyone can build the bench-top cell following the provided instructions. "We're currently testing chemistries and different materials", Fernández said, to deliver a kit with reproducible results.

By mid-2025, the FBRC aims to have a large-format cell with a cell area exceeding 600cm², capable of supplying low voltage with a high current. Although this will still be a single cell, it should be scalable for larger-scale energy storage. Then, by the end of this year, Fernández hopes to have built a stack of tens of these large-format cells, capable of supplying the high voltage and high current to power a house.

The goal is to replicate something akin to Redflow's ZBM3, which is a 10kWh zinc-bromine flow battery with a continuous power rating of 3kW. Fernández's mention of Redflow was no coincidence: the company went bankrupt at the end of 2024, and that is part of why he is so keen to make an open-source battery. "Redflow invested a lot of time in these cells, which are pretty good. If this had been open-source, people could've used the knowledge to start other businesses, or even build the batteries themselves."

The cheapest pump that works

Fernández also described some of the prototypes that they built. The first prototype of the battery cell was intended to be built from polypropylene, with silicone gaskets and small diaphragm pumps, but the design offered numerous opportunities for leaks and other problems. They actually 3D-printed this design with resin, "because we never managed to print polypropylene well enough". Regarding this initial design, Fernández said: "None of this worked. The tubing didn't work, the cell didn't work, the pumps didn't work." To control the pumps, Fernández used an Arduino and an inexpensive motor driver, "in total costing less than 20 euros for the electronic part".

In the second attempt, they had polypropylene bodies manufactured, and the diaphragm pumps were replaced by peristaltic pumps. The design saw several improvements, and Fernández showed a video of the pumps in operation, adding "that pump is not supposed to be orange"; it was, though, because it had leaked orange fluid. This was all conducted in Fernández's apartment, he said, joking that he was still standing, "so it's not that dangerous".

They then moved to a second design intended to prevent the leaks, and the pumps were changed once again. Fernández explained this choice: "We began with the cheapest possible pump we could get, I broke it, and then we moved to the next one. I've been doing that iteratively, breaking every tier of AliExpress pump until I had the cheapest model that works."

This second design worked pretty well, initially with zinc iodide as the chemistry, since it is readily available in the EU without requiring a license to buy chemical materials. "It's not like you can drink it, but it's not extremely toxic", Fernández added. They did some tests, and this design yielded around half of the energy density of commercial flow batteries.

In flow-battery research, the membrane in the middle is traditionally an expensive Nafion membrane. In contrast, the FBRC's design uses "a very fancy microporous membrane that's called photopaper". According to Fernández, this still exhibits some leakage and resistance; the resulting energy efficiency of 65% is not particularly high, he explained, "but it's very cheap to achieve".

Following this, Fernández showed that they can also achieve higher densities, "at the level of a commercial flow battery", although the graph stopped at two discharge cycles because other parts of the system failed at that point, such as the tubing and pumps: "As things become more energy-dense, they become more reactive. We only had two cycles here because of the corrosion."

Thus, for the next step, they acquired a beefier impeller pump "the size of a fist". While the previous pumps could move 60mL per minute, the new pump manages 6,000mL per minute, which is necessary to scale up to the large-format cell.

Getting involved

Fernández concluded his talk by describing some ways to get involved in the project. Firstly, individuals can assemble a kit using the online documentation. "Nobody has attempted this, so we're not sure if the instructions are any good", Fernández joked, adding that "we want to make the documentation better". Additionally, just testing whether 3D-printing the pieces works is also valuable "because we need to make sure that the pieces can be printed on a range of printers and with different materials".

Similarly, individuals can also assist by testing various tubing materials or pumps. Additionally, as the project scales up, building and testing larger-scale cells will be useful, although Fernández advised that this should only be done with water "because we don't want anyone to die helping us". Lastly, from an electronics standpoint, the project doesn't have a battery-management system yet, which is essential for larger-scale flow batteries.

The Flow Battery Research Collective is an intriguing initiative to develop an open-source home battery. Fernández and Smith have clearly focused on an approach that is affordable, safe to handle, and built from parts and chemicals that are easy to source. Hopefully their project can make battery research more reproducible and help to democratize home batteries.

[While I was unable to attend FOSDEM in person, I watched the live-stream of the talk.]

Comments (80 posted)

Two new site features: full-text RSS and automatic dark mode

One of the often-requested LWN site features that has languished the longest on our to-do list is full-text RSS feeds. We are happy to announce that, finally, there is a set of such feeds available; the full set can be seen on our feeds page. This is a subscriber-only feature, and it works by creating a unique fetch URL for each user. We will, of course, be counting on our readers to not share those URLs.

Another feature we have had requests for is to automatically present the site in dark-mode colors when a reader's browser has been configured to prefer it. That feature, too, is now available. In this case, we had to think about the interaction between automatic selection and the color customization that the site has long had. The conclusion we reached is that, if custom colors have been configured for an account, they will win out over the automatic selection. There is a new preference in the customization area to change this default if desired.

Both of these features — and the other enhancements we have made recently — were enabled by the support of LWN's subscribers. By making it possible to bring in new staff last year, you created the space to improve the site experience while keeping up with the writing. We thank all of you for your support.

Comments (32 posted)

Page editor: Jonathan Corbet

Brief items

Security

Security quote of the week

The reason for their silence? Fear of mean tweets—many generated by bot networks. Fear of being primaried. Fear of the digital mob that Elon Musk can direct with a few keystrokes—a mob increasingly composed of artificial accounts and coordinated influence operations. These aren't just personal failures of courage—they represent something far more dangerous: the complete surrender of democratic institutions to manufactured technological intimidation.

Let's be clear about the historical magnitude of this choice: They're trading their eyewitness testimony of war crimes for social media comfort. They're choosing X followers—most of whom are probably bots tied to various influence operations, both foreign and domestic—over the international order that has prevented nuclear war for three generations.

The supreme irony is that the pressure they're surrendering to isn't even real. These senators are abandoning witnessed truth about war crimes in response to artificially generated outrage. They're choosing bot approval over bomb evidence. The digital mob they fear is largely synthetic—but the consequences of their cowardice will be catastrophically real.

Mike Brock on the US Senate

Comments (15 posted)

Kernel development

Kernel release status

The current development kernel is 6.14-rc4, released on February 23. Linus said: "This continues to be the right kind of 'boring' release: nothing in particular stands out in rc4".

Stable updates: 6.13.4, 6.12.16, 6.6.79, and 6.1.129 were released on February 21.

The 6.13.5, 6.12.17, and 6.6.80 updates are in the review process; they are due at any time.

Comments (none posted)

Linus on Rust and the kernel's DMA layer

At the end of January we ran this article on the discussions around a set of Rust bindings for the kernel's DMA-mapping layer. Many pixels have been expended on the topic across the net since then, most recently in this sprawling email thread. Linus Torvalds has now made his feelings known on the topic:

You are not forced to take any Rust code, or care about any Rust code in the DMA code. You can ignore it.

But "ignore the Rust side" automatically also means that you don't have any *say* on the Rust side.

You can't have it both ways. You can't say "I want to have nothing to do with Rust", and then in the very next sentence say "And that means that the Rust code that I will ignore cannot use the C interfaces I maintain".

The code in question seems highly likely to be merged for the 6.15 release.

Comments (72 posted)

A change in maintenance for the kernel's DMA-mapping layer

The conversation around the merging of a set of Rust abstractions for the kernel's DMA-mapping layer has mostly settled after Linus Torvalds made it clear that the code would be accepted. One other consequence of this decision, though, is that Christoph Hellwig has quietly stepped down from the maintenance of the DMA-mapping code. Marek Szyprowski will be the maintainer of that layer going forward. Hellwig has maintained that code for many years; his contributions will be missed.

Comments (96 posted)

Quotes of the week

For example, this merge window I did have that unusual "this doesn't work for my rust build" situation, but that one was caught and fixed before the merge window even closed. Guess what *wasn't* caught, and then wasn't fixed until -rc3? A bog-standard build error on the esoteric platform called "i386".
Linus Torvalds

The difference between network and security developers is that a network developer thinks 10 microseconds is a long time, while a security developer thinks 10 years is no time at all.
Casey Schaufler

Comments (none posted)

Distributions

Armbian 25.2 released

Version 25.2 of the Armbian Linux distribution for single-board computers (SBCs) has been released. Notable changes in this release include support for many new SBCs, an upgrade to Linux kernel 6.12.x, and more. See the changelog for a complete list.

Comments (1 posted)

Gentoo now offers qcow2 disk images

The Gentoo Linux project has announced the availability of qcow2 images for amd64 (x86_64) and arm64 (aarch64), and plans to "eventually" offer images for the riscv64 and loongarch64 architectures.

The images, updated weekly, include an EFI boot partition and a fully functional Gentoo installation; either with no network activated but a password-less root login on the console ("no root pw"), or with network activated, all accounts initially locked, but cloud-init running on boot ("cloud-init").

Comments (none posted)

Distributions quote of the week

To me, it's not about "fun" of swapping out bits of the distribution (trust me, fun it ain't). And it's not about shipping "longterm" stale kernels. It's about the fact that we have no resources to have the kernel package support the hardware people want to use Fedora on. From the Raspberry Pi 5 to the new RISC-V stuff, the "mainline-only" approach only works if there's folks supporting that effort with engineering resources. And we don't have that. As a concrete example, Fedora Mobility has struggled to figure out a reference device because of the lack of options in the mainline kernel.

Justin Forbes does great work maintaining the Fedora kernel, but he's the only one. In 2018, we had three kernel engineers. We briefly had four in 2019, but today, we're down to one. And because of the way the kernel package is maintained (with CKI and secure boot and such), it seems like only Red Hatters are able to maintain that package. So where's the community opportunity to enable the things that they want to use Fedora on? So far, it seems by creating custom kernel builds in COPR for remixes.

This is a point where I feel we have some misalignment of interests. I think it's fair to say we *want* to remain tracking mainline and there are a lot of benefits to doing so. But we're sacrificing a lot for it, including first-mover advantage and opportunities that follow from that.

Neal Gompa

Comments (none posted)

Development

Aqualung 2.0 released

Version 2.0 of the Aqualung gapless music player has been released. Aqualung supports playback of a wide range of audio formats, ripping CDs to WAV, FLAC, Ogg Vorbis, or MP3, and subscribing to podcasts via RSS or Atom feeds. The primary change in this release is the migration from GTK2 to GTK3, which resulted in dropping support for custom skins.

Comments (none posted)

Emacs 30.1 released

The Emacs extensible text editor (among other things) has made a security release to address two vulnerabilities. Emacs 30.1 has fixes for CVE-2025-1244, which is a shell-command-injection flaw in the man.el man-page browser, and for CVE-2024-53920, which is a code-execution vulnerability in the flymake syntax-checking mode. LWN covered the flymake problems back in December.

Full Story (comments: 5)

Rust 1.85.0 released

Version 1.85.0 of the Rust language has been released. Changes in the release include support for async closures, some convenience iterators for tuples, and a number of stabilized APIs. The headline feature, though, is that this release stabilizes the Rust 2024 edition, described as "the largest edition we have released". The 2024 edition guide has a detailed listing of all the changes that were incorporated this time around.

Comments (13 posted)

Page editor: Daroc Alden

Announcements

Newsletters

Distributions and system administration

Development

Emacs News February 24
This Week in GNOME February 21
Golang Weekly February 26
LLVM Weekly February 24
OCaml Weekly News February 25
Perl Weekly February 24
This Week in Plasma February 22
PyCoder's Weekly February 25
Weekly Rakudo News February 24
Ruby Weekly News February 20
This Week in Rust February 19
Wikimedia Tech News February 24

Meeting minutes

Calls for Presentations

CFP Deadlines: February 27, 2025 to April 28, 2025

The following listing of CFP deadlines is taken from the LWN.net CFP Calendar.

Deadline      Event Dates              Event                                           Location
February 28   March 24 - March 28      TechWeekStorage 25                              Geneva, Switzerland
March 1       April 26                 Central Pennsylvania Open Source Conference     Lancaster, PA, US
March 1       April 26                 21st Linux Infotag Augsburg                     Augsburg, Germany
March 2       June 12 - June 14        DevConf.CZ                                      Brno, Czech Republic
March 10      June 26 - June 27        Linux Security Summit North America             Denver, CO, US
March 16      July 24 - July 29        GUADEC 2025                                     Brescia, Italy
March 16      May 24 - May 25          Journées du Logiciel Libre                      Lyon, France
March 28      October 12 - October 14  All Things Open                                 Raleigh, NC, US
March 30      July 1 - July 3          Pass the SALT Conference                        Lille, France
March 31      June 26 - June 28        Linux Audio Conference                          Lyon, France
March 31      August 9 - August 10     COSCUP 2025                                     Taipei City, Taiwan
April 13      April 30 - May 5         MiniDebConf Hamburg                             Hamburg, Germany
April 14      August 25 - August 27    Open Source Summit Europe                       Amsterdam, Netherlands
April 27      June 13 - June 15        SouthEast LinuxFest                             Charlotte, NC, US

If the CFP deadline for your event does not appear here, please tell us about it.

Upcoming Events

Events: February 27, 2025 to April 28, 2025

The following event listing is taken from the LWN.net Calendar.

Date(s)              Event                                                         Location
March 6 - March 9    SCALE 22x                                                     Pasadena, CA, US
March 10 - March 11  FOSS Backstage                                                Berlin, Germany
March 10 - March 14  Netdev 0x19                                                   Zagreb, Croatia
March 13 - March 15  FOSSASIA Summit                                               Bangkok, Thailand
March 18             Nordic PGDay 2025                                             Copenhagen, Denmark
March 18 - March 20  Linux Foundation Member Summit                                Napa, CA, US
March 20             pgDay Paris                                                   Paris, France
March 22 - March 23  Chemnitz Linux Days 2025                                      Chemnitz, Germany
March 24 - March 26  Linux Storage, Filesystem, Memory-Management and BPF Summit   Montreal, Canada
March 24 - March 28  TechWeekStorage 25                                            Geneva, Switzerland
March 29             Open Source Operating System Annual Technical Conference      Beijing, China
April 1 - April 2    FediForum Unconference                                        online
April 7 - April 8    sambaXP 2025                                                  Göttingen, Germany
April 8 - April 10   SNIA SMB3 Interoperability Lab EMEA                           Goettingen, Germany
April 14 - April 15  foss-north 2025                                               Gothenburg, Sweden
April 26             Central Pennsylvania Open Source Conference                   Lancaster, PA, US
April 26             21st Linux Infotag Augsburg                                   Augsburg, Germany

If your event does not appear here, please tell us about it.

Security updates

Alert summary February 20, 2025 to February 26, 2025

Dist. ID Release Package Date
AlmaLinux ALSA-2025:1675 8 bind 2025-02-20
AlmaLinux ALSA-2025:1681 9 bind 2025-02-21
AlmaLinux ALSA-2025:1676 8 bind9.16 2025-02-20
AlmaLinux ALSA-2025:1670 9 bind9.18 2025-02-24
AlmaLinux ALSA-2025:1737 8 libpq 2025-02-25
AlmaLinux ALSA-2025:1738 9 libpq 2025-02-21
AlmaLinux ALSA-2025:1671 9 mysql 2025-02-21
AlmaLinux ALSA-2025:1673 8 mysql:8.0 2025-02-20
AlmaLinux ALSA-2025:1742 9 postgresql 2025-02-21
AlmaLinux ALSA-2025:1736 8 postgresql:13 2025-02-25
AlmaLinux ALSA-2025:1739 8 postgresql:15 2025-02-25
AlmaLinux ALSA-2025:1741 9 postgresql:15 2025-02-21
AlmaLinux ALSA-2025:1740 8 postgresql:16 2025-02-25
AlmaLinux ALSA-2025:1743 9 postgresql:16 2025-02-24
Debian DSA-5869-1 stable chromium 2025-02-21
Debian DLA-4060-1 LTS djoser 2025-02-20
Debian DLA-4066-1 LTS fort-validator 2025-02-24
Debian DLA-4063-1 LTS gnutls28 2025-02-21
Debian DLA-4065-1 LTS krb5 2025-02-23
Debian DLA-4061-1 LTS libtasn1-6 2025-02-21
Debian DLA-4064-1 LTS libxml2 2025-02-22
Debian DLA-4059-1 LTS mosquitto 2025-02-20
Debian DLA-4067-1 LTS nodejs 2025-02-25
Debian DLA-4068-1 LTS php-nesbot-carbon 2025-02-25
Debian DLA-4052-2 LTS postgresql-13 2025-02-21
Debian DLA-4062-1 LTS python-werkzeug 2025-02-21
Fedora FEDORA-2025-c0c371a0b6 F40 chromium 2025-02-22
Fedora FEDORA-2025-acbfdd26a1 F41 chromium 2025-02-24
Fedora FEDORA-2025-166f075581 F40 crun 2025-02-26
Fedora FEDORA-2025-5e5783f0d1 F40 gnutls 2025-02-26
Fedora FEDORA-2025-a62f1e771c F41 gnutls 2025-02-20
Fedora FEDORA-2025-b268fceaec F40 kernel 2025-02-20
Fedora FEDORA-2025-cca2fcc70c F41 kernel 2025-02-20
Fedora FEDORA-2025-a5edb54660 F40 libtasn1 2025-02-26
Fedora FEDORA-2025-9b659aa327 F41 libtasn1 2025-02-20
Fedora FEDORA-2025-dd577cf35f F40 microcode_ctl 2025-02-20
Fedora FEDORA-2025-62f6cb2785 F40 openssh 2025-02-24
Fedora FEDORA-2025-18cb3f852d F41 openssh 2025-02-20
Fedora FEDORA-2025-becf280371 F40 openssl 2025-02-26
Fedora FEDORA-2025-d37ad923f5 F40 proftpd 2025-02-22
Fedora FEDORA-2025-835949b994 F41 proftpd 2025-02-22
Fedora FEDORA-2025-fb4c448085 F41 python3.10 2025-02-20
Fedora FEDORA-2025-81304012fc F41 python3.11 2025-02-20
Fedora FEDORA-2025-2543c24e23 F40 python3.12 2025-02-21
Fedora FEDORA-2025-b353a46e0c F40 python3.8 2025-02-23
Fedora FEDORA-2025-bec494726c F41 python3.8 2025-02-23
Fedora FEDORA-2025-66c560fa22 F40 python3.9 2025-02-20
Fedora FEDORA-2025-be080d5ed4 F41 python3.9 2025-02-20
Fedora FEDORA-2025-0e5d6864d8 F40 vaultwarden 2025-02-23
Fedora FEDORA-2025-5f07738947 F41 vaultwarden 2025-02-23
Fedora FEDORA-2025-3e178bb819 F40 vim 2025-02-24
Fedora FEDORA-2025-a71acb72e9 F41 vim 2025-02-21
Mageia MGASA-2025-0075 9 emacs 2025-02-25
Mageia MGASA-2025-0071 9 gnutls 2025-02-25
Mageia MGASA-2025-0077 9 iniparser 2025-02-26
Mageia MGASA-2025-0079 9 kernel, kmod-virtualbox, kmod-xtables-addons 2025-02-26
Mageia MGASA-2025-0078 9 kernel-linus 2025-02-26
Mageia MGASA-2025-0072 9 krb5 2025-02-25
Mageia MGASA-2025-0073 9 libxml2 2025-02-25
Mageia MGASA-2025-0070 9 neomutt 2025-02-24
Mageia MGASA-2025-0074 9 vim 2025-02-25
Oracle ELSA-2025-1675 OL8 bind 2025-02-24
Oracle ELSA-2025-1681 OL9 bind 2025-02-24
Oracle ELSA-2025-1676 OL8 bind9.16 2025-02-24
Oracle ELSA-2025-1670 OL9 bind9.18 2025-02-24
Oracle ELSA-2025-1737 OL8 libpq 2025-02-24
Oracle ELSA-2025-1738 OL9 libpq 2025-02-24
Oracle ELSA-2025-1047 OL7 libsoup 2025-02-24
Oracle ELSA-2025-1671 OL9 mysql 2025-02-24
Oracle ELSA-2025-1673 OL8 mysql:8.0 2025-02-24
Oracle ELSA-2025-1582 OL8 nodejs:18 2025-02-24
Oracle ELSA-2025-1611 OL8 nodejs:22 2025-02-24
Oracle ELSA-2025-1613 OL9 nodejs:22 2025-02-24
Oracle ELSA-2025-1742 OL9 postgresql 2025-02-24
Oracle ELSA-2025-1736 OL8 postgresql:13 2025-02-24
Oracle ELSA-2025-1739 OL8 postgresql:15 2025-02-24
Oracle ELSA-2025-1741 OL9 postgresql:15 2025-02-24
Oracle ELSA-2025-1740 OL8 postgresql:16 2025-02-24
Oracle ELSA-2025-1743 OL9 postgresql:16 2025-02-24
Red Hat RHSA-2025:1685-01 EL6 bind 2025-02-20
Red Hat RHSA-2025:1718-01 EL7 bind 2025-02-20
Red Hat RHSA-2025:1687-01 EL8.2 bind 2025-02-20
Red Hat RHSA-2025:1691-01 EL8.4 bind 2025-02-20
Red Hat RHSA-2025:1684-01 EL8.6 bind 2025-02-20
Red Hat RHSA-2025:1681-01 EL9 bind 2025-02-20
Red Hat RHSA-2025:1679-01 EL8.6 bind9.16 2025-02-20
Red Hat RHSA-2025:1678-01 EL8.8 bind9.16 2025-02-20
Red Hat RHSA-2025:1295-01 EL9.2 buildah 2025-02-20
Red Hat RHSA-2025:1372-01 EL8 container-tools:rhel8 2025-02-20
Red Hat RHSA-2025:1207-01 EL8.6 container-tools:rhel8 2025-02-20
Red Hat RHSA-2025:1275-01 EL8.8 container-tools:rhel8 2025-02-20
Red Hat RHSA-2025:1737-01 EL8 libpq 2025-02-20
Red Hat RHSA-2025:1720-01 EL8.2 libpq 2025-02-20
Red Hat RHSA-2025:1735-01 EL8.4 libpq 2025-02-20
Red Hat RHSA-2025:1745-01 EL8.6 libpq 2025-02-20
Red Hat RHSA-2025:1744-01 EL8.8 libpq 2025-02-20
Red Hat RHSA-2025:1738-01 EL9 libpq 2025-02-20
Red Hat RHSA-2025:1725-01 EL9.0 libpq 2025-02-20
Red Hat RHSA-2025:1733-01 EL9.2 libpq 2025-02-20
Red Hat RHSA-2025:1732-01 EL9.4 libpq 2025-02-20
Red Hat RHSA-2025:1755-01 EL9.0 mysql 2025-02-24
Red Hat RHSA-2025:1767-01 EL9.2 mysql 2025-02-24
Red Hat RHSA-2025:1756-01 EL9.4 mysql 2025-02-24
Red Hat RHSA-2025:1766-01 EL8.6 mysql:8.0 2025-02-24
Red Hat RHSA-2025:1757-01 EL8.8 mysql:8.0 2025-02-24
Red Hat RHSA-2025:1296-01 EL9.2 podman 2025-02-20
Red Hat RHSA-2025:1742-01 EL9 postgresql 2025-02-20
Red Hat RHSA-2025:1728-01 EL9.0 postgresql 2025-02-20
Red Hat RHSA-2025:1727-01 EL9.2 postgresql 2025-02-20
Red Hat RHSA-2025:1726-01 EL9.4 postgresql 2025-02-20
Red Hat RHSA-2025:1736-01 EL8 postgresql:13 2025-02-20
Red Hat RHSA-2025:1724-01 EL8.4 postgresql:13 2025-02-20
Red Hat RHSA-2025:1723-01 EL8.6 postgresql:13 2025-02-20
Red Hat RHSA-2025:1729-01 EL8.8 postgresql:13 2025-02-20
Red Hat RHSA-2025:1739-01 EL8 postgresql:15 2025-02-20
Red Hat RHSA-2025:1721-01 EL8.8 postgresql:15 2025-02-20
Red Hat RHSA-2025:1741-01 EL9 postgresql:15 2025-02-20
Red Hat RHSA-2025:1722-01 EL9.2 postgresql:15 2025-02-20
Red Hat RHSA-2025:1730-01 EL9.4 postgresql:15 2025-02-20
Red Hat RHSA-2025:1740-01 EL8 postgresql:16 2025-02-20
Red Hat RHSA-2025:1743-01 EL9 postgresql:16 2025-02-20
Red Hat RHSA-2025:1731-01 EL9.4 postgresql:16 2025-02-20
Red Hat RHSA-2025:1750-01 EL7 python3 2025-02-24
Red Hat RHSA-2025:1813-01 EL9.2 python3.11-urllib3 2025-02-25
Red Hat RHSA-2025:1793-01 EL9.4 python3.11-urllib3 2025-02-25
Red Hat RHSA-2025:0595-01 EL8 redis:6 2025-02-20
Red Hat RHSA-2025:1802-01 EL9.2 tuned 2025-02-25
Red Hat RHSA-2025:1785-01 EL9.4 tuned 2025-02-25
Slackware SSA:2025-051-01 ark 2025-02-20
Slackware SSA:2025-050-01 libxml2 2025-02-19
Slackware SSA:2025-056-02 tigervnc 2025-02-25
Slackware SSA:2025-056-01 xorg 2025-02-25
SUSE SUSE-SU-2025:0719-1 SLE15 SES7.1 oS15.6 Maven 2025-02-26
SUSE SUSE-SU-2025:0601-1 SLE15 oS15.6 brise 2025-02-21
SUSE openSUSE-SU-2025:14829-1 TW chromedriver 2025-02-22
SUSE openSUSE-SU-2025:0070-1 osB15 chromium 2025-02-21
SUSE openSUSE-SU-2025:0074-1 osB15 crun 2025-02-24
SUSE openSUSE-SU-2025:14823-1 TW dcmtk 2025-02-21
SUSE openSUSE-SU-2025:0068-1 osB15 dcmtk 2025-02-20
SUSE SUSE-SU-2025:0599-1 MP4.3 SLE15 oS15.4 oS15.6 emacs 2025-02-21
SUSE SUSE-SU-2025:0611-1 MP4.2 MP4.3 SLE15 SLE-m5.5 oS15.6 google-osconfig-agent 2025-02-21
SUSE openSUSE-SU-2025:14815-1 TW google-osconfig-agent 2025-02-19
SUSE SUSE-SU-2025:0622-1 SLE12 grafana 2025-02-21
SUSE SUSE-SU-2025:0624-1 SLE15 oS15.3 oS15.4 oS15.5 oS15.6 grafana 2025-02-21
SUSE SUSE-SU-2025:0623-1 SLE15 oS15.6 grafana 2025-02-21
SUSE SUSE-SU-2025:0629-1 SLE12 grub2 2025-02-21
SUSE SUSE-SU-2025:0607-1 SLE15 SLE-m5.2 SES7.1 oS15.3 grub2 2025-02-21
SUSE openSUSE-SU-2025:14822-1 TW grub2 2025-02-20
SUSE SUSE-SU-2025:0602-1 SLE15 SLE-m5.5 SES7.1 oS15.6 helm 2025-02-21
SUSE openSUSE-SU-2025:0067-1 osB15 java-17-openj9 2025-02-20
SUSE SUSE-SU-2025:0675-1 SLE12 java-1_8_0-ibm 2025-02-24
SUSE SUSE-SU-2025:0674-1 SLE15 SES7.1 oS15.6 java-1_8_0-ibm 2025-02-24
SUSE openSUSE-SU-2025:14824-1 TW java-23-openjdk 2025-02-21
SUSE SUSE-SU-2025:0517-2 MP4.2 SLE15 SLE-m5.1 SLE-m5.2 SES7.1 oS15.3 kernel 2025-02-21
SUSE SUSE-SU-2025:0603-1 SLE11 kernel 2025-02-21
SUSE openSUSE-SU-2025:14817-1 TW kubernetes1.30-apiserver 2025-02-19
SUSE openSUSE-SU-2025:14816-1 TW kubernetes1.30-apiserver 2025-02-19
SUSE openSUSE-SU-2025:14819-1 TW kubernetes1.30-apiserver 2025-02-19
SUSE openSUSE-SU-2025:14818-1 TW kubernetes1.31-apiserver 2025-02-19
SUSE openSUSE-SU-2025:14832-1 TW libprotobuf-lite28_3_0 2025-02-25
SUSE openSUSE-SU-2025:14825-1 TW luanti 2025-02-21
SUSE SUSE-SU-2025:0605-1 MP4.3 SLE15 SLE-m5.1 SLE-m5.2 SLE-m5.3 SLE-m5.4 SLE-m5.5 SES7.1 oS15.3 openssh 2025-02-21
SUSE SUSE-SU-2025:0659-1 SLE12 openssh 2025-02-24
SUSE openSUSE-SU-2025:14820-1 TW openssh 2025-02-19
SUSE SUSE-SU-2025:0613-1 SLE15 oS15.6 openssl-1_1 2025-02-21
SUSE SUSE-SU-2025:0690-1 MP4.3 SLE15 SLE-m5.3 SLE-m5.4 oS15.4 ovmf 2025-02-24
SUSE SUSE-SU-2025:0609-1 SLE15 SLE-m5.5 oS15.5 ovmf 2025-02-21
SUSE SUSE-SU-2025:0608-1 SLE15 oS15.6 ovmf 2025-02-21
SUSE SUSE-SU-2025:0712-1 SLE-m5.1 SLE-m5.2 SLE-m5.3 SLE-m5.4 SLE-m5.5 pam_pkcs11 2025-02-25
SUSE SUSE-SU-2025:0688-1 SLE12 pam_pkcs11 2025-02-24
SUSE SUSE-SU-2025:0689-1 SLE15 oS15.6 pam_pkcs11 2025-02-24
SUSE SUSE-SU-2025:0606-1 SLE12 postgresql13 2025-02-21
SUSE SUSE-SU-2025:0619-1 SLE15 SES7.1 postgresql13 2025-02-21
SUSE SUSE-SU-2025:0632-1 MP4.3 SLE15 SES7.1 postgresql14 2025-02-21
SUSE SUSE-SU-2025:0615-1 SLE12 postgresql14 2025-02-21
SUSE SUSE-SU-2025:0631-1 SLE15 oS15.6 postgresql14 2025-02-21
SUSE SUSE-SU-2025:0633-1 MP4.3 SLE15 SES7.1 postgresql15 2025-02-21
SUSE SUSE-SU-2025:0634-1 SLE12 postgresql15 2025-02-21
SUSE SUSE-SU-2025:0614-1 SLE15 oS15.6 postgresql15 2025-02-21
SUSE SUSE-SU-2025:0636-1 MP4.3 SLE15 SES7.1 postgresql16 2025-02-21
SUSE SUSE-SU-2025:0637-1 SLE12 postgresql16 2025-02-21
SUSE SUSE-SU-2025:0635-1 SLE15 oS15.6 postgresql16 2025-02-21
SUSE SUSE-SU-2025:0618-1 MP4.3 SLE15 SES7.1 postgresql17 2025-02-21
SUSE SUSE-SU-2025:0655-1 SLE12 postgresql17 2025-02-24
SUSE SUSE-SU-2025:0616-1 SLE15 oS15.6 postgresql17 2025-02-21
SUSE openSUSE-SU-2025:14827-1 TW proftpd 2025-02-21
SUSE SUSE-SU-2025:0692-1 SLE15 SLE-m5.1 SLE-m5.2 SES7.1 oS15.3 qemu 2025-02-24
SUSE openSUSE-SU-2025:14828-1 TW radare2 2025-02-21
SUSE openSUSE-SU-2025:0072-1 osB15 radare2 2025-02-21
SUSE openSUSE-SU-2025:14821-1 TW ruby3.4-rubygem-grpc 2025-02-19
SUSE SUSE-SU-2025:0638-1 MP4.3 SLE15 oS15.4 webkit2gtk3 2025-02-21
SUSE SUSE-SU-2025:0639-1 SLE12 webkit2gtk3 2025-02-21
SUSE SUSE-SU-2025:0691-1 SLE15 oS15.6 webkit2gtk3 2025-02-24
Ubuntu USN-7297-1 20.04 22.04 24.04 24.10 ProFTPD 2025-02-25
Ubuntu USN-7292-1 18.04 20.04 22.04 dropbear 2025-02-25
Ubuntu USN-7281-1 20.04 22.04 24.04 24.10 gnutls28 2025-02-20
Ubuntu USN-7286-1 22.04 24.04 24.10 iniparser 2025-02-24
Ubuntu USN-7269-2 24.04 intel-microcode 2025-02-24
Ubuntu USN-7300-1 14.04 kernel 2025-02-25
Ubuntu USN-7287-1 20.04 22.04 24.04 24.10 libcap2 2025-02-24
Ubuntu USN-7275-2 24.04 libtasn1-6 2025-02-20
Ubuntu USN-7302-1 14.04 16.04 18.04 20.04 22.04 24.04 24.10 libxml2 2025-02-25
Ubuntu USN-7296-1 16.04 18.04 linux, linux-hwe 2025-02-25
Ubuntu USN-7293-1 18.04 20.04 linux, linux-hwe-5.4 2025-02-25
Ubuntu USN-7301-1 22.04 24.04 linux, linux-lowlatency, linux-lowlatency-hwe-6.8 2025-02-25
Ubuntu USN-7288-1 22.04 linux, linux-lowlatency 2025-02-24
Ubuntu USN-7276-1 24.10 linux, linux-lowlatency 2025-02-19
Ubuntu USN-7298-1 14.04 16.04 linux, linux-lts-xenial 2025-02-25
Ubuntu USN-7277-1 24.10 linux-aws, linux-azure, linux-gcp, linux-oracle, linux-raspi, linux-realtime 2025-02-19
Ubuntu USN-7234-5 18.04 linux-aws-5.4 linux-raspi-5.4 2025-02-25
Ubuntu USN-7294-1 18.04 20.04 linux-azure, linux-azure-5.4, linux-bluefield, linux-gcp, linux-gcp-5.4, linux-ibm-5.4 2025-02-25
Ubuntu USN-7289-1 22.04 linux-azure, linux-azure-fde, linux-gkeop, linux-nvidia, linux-oracle 2025-02-24
Ubuntu USN-7289-2 20.04 linux-azure-5.15, linux-azure-fde-5.15, linux-oracle-5.15 2025-02-25
Ubuntu USN-7291-1 20.04 22.04 linux-gcp, linux-gcp-5.15, linux-gke 2025-02-25
Ubuntu USN-7304-1 24.04 linux-gcp, linux-gke, linux-gkeop 2025-02-26
Ubuntu USN-7289-3 22.04 linux-ibm 2025-02-25
Ubuntu USN-7262-2 16.04 linux-kvm 2025-02-24
Ubuntu USN-7288-2 20.04 linux-lowlatency-hwe-5.15 2025-02-25
Ubuntu USN-7303-1 22.04 24.04 linux-nvidia, linux-nvidia-6.8, linux-nvidia-lowlatency 2025-02-26
Ubuntu USN-7305-1 22.04 linux-raspi 2025-02-26
Ubuntu USN-7295-1 20.04 linux-xilinx-zynqmp 2025-02-25
Ubuntu USN-7284-1 16.04 18.04 20.04 22.04 24.04 24.10 netty 2025-02-24
Ubuntu USN-7285-1 20.04 22.04 24.10 nginx 2025-02-24
Ubuntu USN-7278-1 20.04 22.04 24.04 openssl 2025-02-20
Ubuntu USN-7271-2 24.04 python-virtualenv 2025-02-25
Ubuntu USN-7280-1 20.04 22.04 24.04 24.10 python3.10, python3.12, python3.8 2025-02-20
Ubuntu USN-7290-1 16.04 18.04 20.04 22.04 rails 2025-02-25
Ubuntu USN-7279-1 22.04 24.04 24.10 webkit2gtk 2025-02-20
Ubuntu USN-7299-1 20.04 22.04 24.04 24.10 xorg-server, xwayland 2025-02-25
Full Story (comments: none)

Kernel patches of interest

Kernel releases

Linus Torvalds Linux 6.14-rc4 Feb 23
Greg Kroah-Hartman Linux 6.13.4 Feb 21
Greg Kroah-Hartman Linux 6.12.16 Feb 21
Greg Kroah-Hartman Linux 6.6.79 Feb 21
Greg Kroah-Hartman Linux 6.1.129 Feb 21
Luis Claudio R. Goncalves 5.10.234-rt126 Feb 19

Architecture-specific

Build system

Core kernel

Development tools

Device drivers

Gustavo Silva BMI270 data ready interrupt support Feb 19
Varadarajan Narayanan Add PCIe support for Qualcomm IPQ5332 Feb 20
Steffen Trumtrar LED: Add basic LP5860 LED matrix driver Feb 20
Antoniu Miclaus Add support for AD4080 ADC Feb 20
Fabrizio Castro Add DMAC support to the RZ/V2H(P) Feb 20
Gal Pressman Symmetric OR-XOR RSS hash Feb 20
Vinod Govindapillai drm/i915/fbc: FBC Dirty rect feature support Feb 20
Peter Hilber Add virtio_rtc module Feb 19
Dimitri Fedrau via B4 Relay can: flexcan: add transceiver capabilities Feb 21
Manikanta Mylavarapu Add NSS clock controller support for IPQ9574 Feb 21
Krzysztof Kozlowski drm/msm: Add support for SM8750 Feb 21
Aurelien Aptel nvme-tcp receive offloads Feb 21
Jedrzej Jagielski ixgbe: Add basic devlink support Feb 21
bingbu.cao@intel.com Intel IPU7 PCI and input system device drivers Feb 21
Hans-Frieder Vogt via B4 Relay net: tn40xx: add support for AQR105 based cards Feb 22
Thippeswamy Havalige Add support for AMD MDB IP as Root Port Feb 24
Damon Ding Add eDP support for RK3588 Feb 24
Stanimir Varbanov Add PCIe support for bcm2712 Feb 24
Sasha Finkelstein via B4 Relay Driver for Apple Z2 touchscreens. Feb 24
Sasha Finkelstein via B4 Relay Driver for pre-DCP apple display controller. Feb 24
Md Sadre Alam Add QPIC SPI NAND driver Feb 24
Aditya Garg Touch Bar DRM driver for x86 Macs Feb 24
Christian Bruel Add STM32MP25 PCIe drivers Feb 24
Francesco Dolcini ASoC: wm8904: Add DMIC and DRC support Feb 24
Matti Vaittinen Support ROHM BD79124 ADC Feb 24
Clément Le Goffic Introduce HDP support for STM32MP platforms Feb 25
Alexis Czezar Torreno Add support for ADP5055 triple buck regulator. Feb 25
Krishna Chaitanya Chundru PCI: Enable Power and configure the TC956x PCIe switch Feb 25
Nícolas F. R. A. Prado Enable DMIC for Genio 700/510 EVK Feb 25
Svyatoslav Ryhel Tegra114: implement EMC support Feb 25
Sebastian Reichel Rockchip W552793DBA-V10 panel support Feb 25
Alisa-Dariana Roman Add support for AD7191 Feb 26
Biju Das Add support for RZ/G2L GPT Feb 26
Danilo Krummrich Initial Nova Core series Feb 26
Aradhya Bhatia drm/tidss: Add OLDI bridge support Feb 26

Device-driver infrastructure

Documentation

Alejandro Colomar man-pages-6.12 released Feb 24
Mauro Carvalho Chehab Implement kernel-doc in Python Feb 24

Filesystems and block layer

Memory management

Networking

Security-related

Virtualization and containers

Miscellaneous

Tao Chen Add prog_kfunc feature probe Feb 22
Lucas De Marchi kmod 34 Feb 21

Page editor: Joe Brockmeier


Copyright © 2025, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds