
In search of a stable BPF verifier

By Daroc Alden
April 14, 2025

LSFMM+BPF

BPF is, famously, not part of the kernel's promises of user-space stability. New kernels can and do break existing BPF programs; the BPF developers try to fix unintentional regressions as they happen, but the whole thing can be something of a bumpy ride for users trying to deploy BPF programs across multiple kernel versions. Shung-Hsi Yu and Daniel Xu had two different approaches to fixing the problem that they presented at the 2025 Linux Storage, Filesystem, Memory-Management, and BPF Summit.

BPF in stable kernels

Yu presented remotely about the problem of running BPF programs on stable kernels. He began by recapping the process by which a patch ends up in a stable kernel: by including a "CC: stable@vger.kernel.org" tag in the patch, by being picked up by the AUTOSEL patch-selection process, or when the developer explicitly asks the stable team to include it. A patch that has been identified for inclusion in a stable kernel by one of those means needs to meet three additional criteria: the patch is present in mainline, the patch applies cleanly to the stable tree, and the stable tree builds after applying it.
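The selection rules Yu recapped can be sketched as a small predicate. This is only an illustrative model, not the stable team's actual tooling; the Patch fields and the function name are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Patch:
    # Nomination routes: any one of these is enough.
    has_cc_stable_tag: bool = False       # "CC: stable@vger.kernel.org" tag
    picked_by_autosel: bool = False       # chosen by the AUTOSEL process
    requested_by_developer: bool = False  # explicit request to the stable team
    # Additional criteria: all three are required.
    in_mainline: bool = False
    applies_cleanly: bool = False
    tree_builds_after_apply: bool = False


def eligible_for_stable(patch: Patch) -> bool:
    """Return True if the patch was nominated and meets all three criteria."""
    nominated = (
        patch.has_cc_stable_tag
        or patch.picked_by_autosel
        or patch.requested_by_developer
    )
    meets_criteria = (
        patch.in_mainline
        and patch.applies_cleanly
        and patch.tree_builds_after_apply
    )
    return nominated and meets_criteria
```

Nothing in this predicate asks whether the patch behaves correctly on the older code base.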

That whole process encodes some hidden assumptions, Yu said. For one, it assumes that a patch will work as-is when applied to an older code base, or at least that a patch which doesn't will be caught by the stable-release testers. But "the elephant in the room is that it doesn't really work." Patches taken from the current code base and applied to an older code base are not guaranteed to work. Patches sent to stable get less review, and stable-kernel testers do not systematically exercise BPF functionality.

Yu proposed adding the stable trees to the BPF continuous-integration testing, starting with the 6.12 long-term-support kernel and continuing from there. That prospect isn't as simple as just running the tests on a new branch, however; if an error is found, where should it be sent? Yu listed three possibilities: the stable mailing list, a group of volunteers, or the BPF mailing list. Of the three options, he would most prefer to have a dedicated group of volunteers, but that may not be possible — it depends on whether anyone is willing to step up.

Even if the stable tree is added to BPF's automated tests, the way the tests themselves are updated poses a problem. Usually, a fix and a test verifying the fix are submitted as part of the same patch set. If a fix is backported to a stable kernel, though, its accompanying test might not go back along with it. Yu spoke with Greg Kroah-Hartman, who agreed that he would accept BPF selftests into the stable trees, but he needs someone to identify them and ask for them to be backported.
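Identifying the selftests that were left behind could, in principle, be partly automated. The following is a minimal sketch under stated assumptions: a patch set is modeled as a list of (commit ID, changed paths) pairs, the set of backported commit IDs is already known, and the function names and the example file name are hypothetical. Only the selftest directory prefix reflects the real kernel tree layout.

```python
SELFTEST_PREFIX = "tools/testing/selftests/bpf/"


def touches_selftests(files):
    """True if any changed path lives under the BPF selftest directory."""
    return any(path.startswith(SELFTEST_PREFIX) for path in files)


def missing_selftests(patch_set, backported):
    """Find selftest commits left behind by a partial backport.

    patch_set:  list of (commit_id, [changed paths]) from one patch set
    backported: set of commit IDs already present in the stable tree
    """
    # If nothing from this patch set was backported, there is nothing to ask for.
    if not any(commit_id in backported for commit_id, _ in patch_set):
        return []
    return [
        commit_id
        for commit_id, files in patch_set
        if touches_selftests(files) and commit_id not in backported
    ]


# Hypothetical patch set: one fix plus its accompanying test.
example = [
    ("fix1", ["kernel/bpf/verifier.c"]),
    ("test1", ["tools/testing/selftests/bpf/progs/verifier_example.c"]),
]
print(missing_selftests(example, {"fix1"}))
```

The resulting list would only be a starting point for a human to review before asking for the backport.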

Finally, Yu said, not all fixes can be backported. That's fine — stable trees miss out on a lot of fixes in the name of stability — but it does pose a problem for security-relevant fixes. Changes to the BPF verifier often qualify as security-relevant.

None of these problems are really BPF specific, Yu emphasized. Maybe there is some low-hanging fruit that can make the experience of using BPF on stable kernels less painful, though, and maybe the BPF developers can find someone willing to step up and take on that work.

The assembled BPF developers pointed out some additional challenges that Yu might have overlooked, such as more load on the test machines and on the maintainers, but seemed generally in agreement that something needed to be done.

Modularizing the BPF verifier

Xu had a different vision of how to enable stability for users of BPF. Xu's long-term goal is to ship BPF changes more quickly; the BPF subsystem changes quickly, but long-term support kernels stick around for a long time. The effort of trying to support BPF programs across a number of kernel versions in the face of that reality makes BPF painful to use, he said. His plan is to try to make the BPF subsystem into a kernel module that can be patched, shipped, and updated independently of the main kernel.

That's a long and difficult project, however, so he would like to start with just the verifier. Separating the verifier out into a kernel module is a good place to start, Xu said, because it is already architecturally a pure function: a function that transforms an input into an output without otherwise affecting the state of the system. The verifier takes in information about a BPF program and outputs a judgment on whether it is safe, plus some information used by the just-in-time compiler; it doesn't call (many) other kernel functions. Also, changes to the verifier are among the things that cause the most frustration for users.
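To make the "pure function" shape concrete, here is a toy sketch. It is not the kernel verifier and performs none of its real checks; the instruction encoding, the Verdict type, and the verify() function are all invented for illustration. The only point is that the result depends solely on the input program and no outside state is read or modified.

```python
from typing import NamedTuple, Tuple

NUM_REGS = 11  # BPF defines registers r0 through r10


class Verdict(NamedTuple):
    safe: bool
    reason: str
    insn_count: int  # stand-in for the extra data handed to the JIT


def verify(program: Tuple[tuple, ...]) -> Verdict:
    """Toy check that depends only on `program` and touches no global state."""
    count = len(program)
    if not program or program[-1] != ("exit",):
        return Verdict(False, "program does not end with exit", count)
    for index, insn in enumerate(program):
        if insn == ("exit",):
            continue
        if len(insn) == 3 and insn[0] == "mov":
            _, reg, _imm = insn
            if not 0 <= reg < NUM_REGS:
                return Verdict(False, f"bad register at insn {index}", count)
        else:
            return Verdict(False, f"unknown insn at {index}", count)
    return Verdict(True, "ok", count)
```

Because verify() has no side effects, calling it twice on the same program gives the same verdict. That property is what makes code of this shape comparatively easy to move into a separate module or to run under a fuzzer.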

Xu put together a proof of concept, which he described as a pretty simple change, except for some complexity around supporting out-of-tree builds. From his testing, the modularized verifier seems to work. Currently, his proof of concept only encapsulates a single file: verifier.c. He still thinks that's a useful starting point, however.

[Daniel Xu]

Xu examined the commits between kernel versions 6.3 and 6.13 to get some statistics. Most commits that touch verifier.c don't change the rest of the kernel. These commits could, theoretically, be easily ported to a standalone stable verifier. Of the commits that affected only verifier.c, there were at least 73 bug fixes, based on the "Fixes" tags in the commits.
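A count of this kind could be reproduced along the following lines. This is a sketch, not Xu's actual script: it assumes the commit history has already been extracted into (message, changed paths) pairs, and the summarize() name and the sample data are made up.

```python
VERIFIER = "kernel/bpf/verifier.c"


def has_fixes_tag(message):
    """True if any line of the commit message starts with a 'Fixes:' tag."""
    return any(line.startswith("Fixes:") for line in message.splitlines())


def summarize(commits):
    """commits: iterable of (message, [changed paths]).

    Returns (commits touching the verifier,
             commits touching only the verifier,
             verifier-only commits carrying a Fixes tag).
    """
    touching = verifier_only = verifier_only_fixes = 0
    for message, files in commits:
        if VERIFIER not in files:
            continue
        touching += 1
        if set(files) == {VERIFIER}:
            verifier_only += 1
            if has_fixes_tag(message):
                verifier_only_fixes += 1
    return touching, verifier_only, verifier_only_fixes


# Invented sample data, only to show the expected input shape.
sample = [
    ("bpf: tighten a bounds check\n\nFixes: 0123456789ab (\"bpf: example\")", [VERIFIER]),
    ("bpf: add a feature", [VERIFIER, "include/linux/bpf_verifier.h"]),
    ("net: unrelated change", ["net/core/dev.c"]),
]
print(summarize(sample))
```

Counting "Fixes" tags gives a lower bound, since not every bug fix carries one; that is consistent with the "at least" in Xu's figure.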

So modularizing the verifier will, at least, allow users to receive some bug fixes on an otherwise unchanging kernel. The next question Xu attempted to address with his examination was: are there any other files that could be moved into his kernel module to further decrease the reliance on the rest of the kernel? To answer that question, he looked at files that were edited in the same commits as the verifier.

The most commonly edited was bpf_verifier.h, which is "conceptually private", but in actuality is referenced by a few other files. After that, there was a long tail of other files that were less commonly modified. Xu admitted that this analysis wasn't as rigorous as it could be — for one thing, it should really consider patch sets as well as commits, since applying a single commit from a patch set is occasionally problematic — but he thought that this was still a useful exercise.
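The co-change analysis described above amounts to counting which other files appear in commits that touch the verifier. A minimal sketch, assuming each commit has been reduced to its list of changed paths (the function name and sample data are hypothetical):

```python
from collections import Counter


def co_changed_files(commits, target="kernel/bpf/verifier.c"):
    """Rank files by how often they change in the same commit as `target`.

    commits: iterable of lists of changed paths, one list per commit
    """
    counts = Counter()
    for files in commits:
        unique = set(files)
        if target in unique:
            # Count every other file in the commit once.
            counts.update(unique - {target})
    return counts.most_common()


# Invented sample data, only to show the expected input shape.
sample = [
    ["kernel/bpf/verifier.c", "include/linux/bpf_verifier.h"],
    ["kernel/bpf/verifier.c", "include/linux/bpf_verifier.h", "kernel/bpf/core.c"],
    ["kernel/bpf/core.c"],
]
for path, count in co_changed_files(sample):
    print(count, path)
```

Extending this to whole patch sets, as Xu suggested, would mean grouping commits by series before counting. That grouping is not attempted here.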

For the next steps, Xu wants to move the code to build the modularized verifier out of tree and add continuous-integration testing across many different kernel versions. From his explanation, I believe his intended workflow is for the BPF developers to continue maintaining the BPF verifier in-tree, with the out-of-tree version functioning similarly to stable kernels: as a more stable alternative that still receives cherry-picked bug fixes.

If that goes well, and the modularized verifier is actually helpful, he plans to look at implementing a well-defined interface to make it easier to maintain the verifier out-of-tree. He also intends to modularize more components of the BPF subsystem, work with distributions on distributing appropriate versions of the verifier, and eventually support running the verifier in user space as part of compilers targeting BPF.

Someone in the audience asked why, if Xu wanted to make a version of the verifier that could run across multiple kernels, he didn't use BPF's struct_ops mechanism. Xu replied that it was a cool idea, but that the verifier probably executes a lot more than a million instructions (the current limit for BPF programs). Daniel Borkmann wanted to know whether the verifier was really what was giving users who need to support multiple kernel versions problems — isn't the changing set of kernel functions available to be called by BPF programs a bigger problem? Xu didn't think so, pointing out that functions are either there or they're not, so dealing with missing functions is relatively straightforward. But currently he is dealing with "a bunch" of different verifiers in production, and would really like to get it down to just two or three.

Eduard Zingerman asked whether Xu expected to be able to reduce the entanglement between the existing verifier code and the rest of the kernel further; Xu thought that it was probably possible, and shared a list of kernel symbols that the verifier currently refers to for the assembled developers to ponder. Another person wanted to know whether a modularized verifier would allow for more thorough testing. Xu thought so. Fuzz testing, especially, would be easier without having to run the whole kernel. BPF development would also be nicer if tweaking the verifier did not require rebuilding the whole kernel.

Alexei Starovoitov asked whether Xu had thought about how to handle changes to the verifier that actually change the memory layout of the verifier state. Xu replied that it would need to be addressed on a case-by-case basis. Other developers expressed skepticism that a modular verifier would actually be helpful to users, given that users often need to wait on new kernel functions to become available to BPF programs. David Faust, on the other hand, was enthusiastic about the idea of being able to run the verifier code in user space, since he has repeatedly asked for a way to let GCC verify the BPF code it generates.

Whether the modular verifier will be adopted remains to be seen — and many BPF developers are skeptical — but Xu's working proof of concept suggests that it may not be as daunting a task as it first appears.


Index entries for this article
Kernel: BPF/Verifier
Conference: Storage, Filesystem, Memory-Management and BPF Summit/2025



Is BPF’s nature part of the problem?

Posted Apr 14, 2025 18:58 UTC (Mon) by DemiMarie (subscriber, #164188) [Link]

One of the main problems with BPF seems to be that the verifier is extremely ad hoc, with no formal specification of what should and should not be accepted. Would it make sense to have such a spec?

Userspace BPF runtimes?

Posted Apr 15, 2025 0:37 UTC (Tue) by ringerc (subscriber, #3071) [Link]

It'd be very interesting to see how this crosses over with userspace BPF runtimes like bpftime (https://github.com/eunomia-bpf/bpftime), rbpf (https://github.com/qmonnet/rbpf) etc.

Being able to run a verifier in userspace would be extremely handy in testing etc.

Just drop the verifier

Posted Apr 15, 2025 6:18 UTC (Tue) by epa (subscriber, #39769) [Link] (7 responses)

Why does the BPF verifier have to live in the kernel at all? If you have root, and it’s not some locked-down Secure Boot system, then at any time you can build some C code and load it into the kernel as a module. There is no verification that the C code doesn’t have infinite loops or memory trampling. You run it at your own risk. So why can’t you load arbitrary BPF? Of course you’d probably want to verify it first, but that can be done in user space using the verifier of your choice.

Just drop the verifier

Posted Apr 15, 2025 7:27 UTC (Tue) by kxxt (subscriber, #172895) [Link]

The most attractive point of BPF for me is that I can rest assured that, most of the time, loading a BPF program will not mess up my system. Having root or not does not really matter in such cases. If something can be done in a BPF program and product A implements that, then I wouldn't choose product B where it is implemented via out-of-tree kernel modules.

Just drop the verifier

Posted Apr 15, 2025 7:33 UTC (Tue) by taladar (subscriber, #68407) [Link]

I think dropping the verifier is a bad idea, since the idea that people will be disciplined enough to use it anyway has been thoroughly debunked, but it could maybe move to user space, with callbacks from the kernel to invoke it, similar to some other existing parts of the kernel.

Just drop the verifier

Posted Apr 15, 2025 8:45 UTC (Tue) by ballombe (subscriber, #9523) [Link]

The verifier needs to match the kernel evaluator.

Just drop the verifier

Posted Apr 15, 2025 17:05 UTC (Tue) by raven667 (guest, #5198) [Link] (3 responses)

Isn't that how SystemTap worked, building little tracing modules and loading them? I think for the kernel, though, you can't exclude some use case just because it's not relevant to you; it's relevant to someone, so the central kernel has to work within those constraints, and there is less value in maintaining separate behavior when those constraints aren't active than in just engineering with those constraints all the time. I'm not sure what the use case would be for loading unvalidated BPF that wouldn't be better served by just loading a custom module; the whole point of using BPF is to provide a safety shield, and using BPF without that safety doesn't seem to make sense.

Just drop the verifier

Posted Apr 15, 2025 21:17 UTC (Tue) by epa (subscriber, #39769) [Link] (2 responses)

I’m not saying to drop the safety and load BPF without validating (although there are those taking that stronger position, arguing that runtime checks are enough). I am saying to run the validator in user space and then once you’re satisfied tell the kernel to load the code. Those separate steps could be packaged up in some tool. There is no need for ‘nannying’ from the kernel to verify the code itself. The computer is there to obey the user, not the other way round: if you have root access and you have decided some BPF is safe to load, it should let you.

Just drop the verifier

Posted Apr 16, 2025 2:22 UTC (Wed) by raven667 (guest, #5198) [Link] (1 responses)

I guess I don't know what the standard is for trusting the result of a process in user space, I think in most contexts the kernel does some sanity checking of anything crossing the boundary from user space, to protect its own address space.

This makes me think of the swap convo: maybe a special namespace could be standardized which contains only a limited image of necessary trusted utilities, maybe keeping the initial ramdisk around for this purpose, that the kernel can trust is only accessible to the legit admin and is somewhat inside the security boundary of the kernel although not in its address space. A special admin container.

Just drop the verifier

Posted Apr 16, 2025 16:53 UTC (Wed) by epa (subscriber, #39769) [Link]

I guess you're right -- if invalid BPF could result in memory trampling then the kernel needs to protect itself. Though, again, in principle you could build a kernel module in C and load that without any checks.

I was thinking more of the verification that checks BPF programs always terminate and other higher-level properties. It should be possible to say "trust me" on that.

Downstream CI could be helpful for catching regressions

Posted Apr 15, 2025 7:39 UTC (Tue) by kxxt (subscriber, #172895) [Link]

For any serious BPF project, it would be good to have continuous integration that tests the BPF program across various kernel versions. (My current strategy is testing on the 6.1, 6.6, and 6.12 LTS kernels, the latest stable kernel, and the latest RC kernel.)

Personally, I currently only test whether the BPF program loads or not on various kernel versions: https://github.com/kxxt/tracexec/blob/19669575d66f112fbdf...

That has caught one stable kernel regression in 6.6 LTS: https://lore.kernel.org/all/MEYP282MB2312C3C8801476C4F262...

(Thanks to Shung-Hsi Yu for fixing it!)

Drop it

Posted Apr 15, 2025 9:00 UTC (Tue) by Cyberax (✭ supporter ✭, #52523) [Link]

Seriously, drop the verifier. It's your fifth wheel.

BPF already has time based termination, and soon runtime-based lock order checking. Just add mandatory bounds checking, and that's it.


Copyright © 2025, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds