Unprivileged BPF and authoritative security hooks

By Jonathan Corbet
April 27, 2023

When the developers of the Linux security module (LSM) subsystem find themselves disagreeing with other kernel developers, it tends to be because those other developers don't think to — or don't want to — add security hooks to their shiny new subsystems. Sometimes, though, the addition of new hooks by non-LSM developers can also create some friction. Andrii Nakryiko's posting of a pair of BPF-related security hooks raised a couple of interesting questions, one of which spurred a fair amount of discussion, and one that did not.

Nakryiko proposed the addition of two new LSM hooks to control access to BPF functionality. The first would govern the creation of BPF maps, while the second was meant to control the loading of BPF type format (BTF) data that describes functions and data structures within the kernel. The plan is to not stop there, though:

This patch set implements and demonstrates an overall approach starting with BPF map and BTF object creation, first two steps in the lifetime of a typical BPF applications. Next step would be to do similar changes for BPF_PROG_LOAD command to allow BPF program loading and verification.

There is nothing in this part of the plan that is inherently controversial; if there are use cases for access control over these features beyond checking for the CAP_BPF capability, then the addition of these hooks to enable the creation of a policy to implement that control can make sense. But that is not quite how these hooks are meant to operate. Instead, they can be used to bypass the CAP_BPF check entirely, meaning that they can make the covered functionality available to processes that lack that capability.
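The difference between the two kinds of hook can be sketched in a few lines. This is only an illustration of the two decision models described above, not the logic of the posted patches; the function names and the use of `None` for "no opinion" are assumptions made for the example.

```python
from typing import Optional


def restrict_only(has_cap_bpf: bool, hook_allows: bool) -> bool:
    """Traditional LSM model: the hook can only narrow what the capability grants."""
    return has_cap_bpf and hook_allows


def authoritative(has_cap_bpf: bool, hook_verdict: Optional[bool]) -> bool:
    """Authoritative model (hypothetical): a hook verdict replaces the capability check."""
    if hook_verdict is None:
        # The hook has no opinion, so fall back to the capability check.
        return has_cap_bpf
    # A True verdict grants access even to a process lacking CAP_BPF.
    return hook_verdict


# A process without CAP_BPF asks to create a BPF map and the hook approves.
print(restrict_only(False, True))   # hook approval cannot add privilege
print(authoritative(False, True))   # hook approval is sufficient on its own
```

In the restrict-only model the hook's approval is irrelevant once the capability check fails; in the authoritative model the capability check is never consulted when the hook gives an answer.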

Authoritative hooks

The LSM subsystem has its origin in the first Kernel Summit in 2001. At that time, there was a desire to get an early version of SELinux into the kernel, but Linus Torvalds pointed out that there were other approaches to increased security under development, and he did not want to commit the kernel to any one of them. Instead, he asked for the creation of a framework that would allow multiple security mechanisms to be supported.

That framework, implementing an extensive set of hooks that can make security decisions at the relevant points in the system-call paths, eventually was merged as the Linux security module subsystem. But, before that could happen, there was a heated discussion (covered in LWN at the time) over whether the LSM subsystem should support hooks that could grant privileges that a process did not have, or whether they would only be able to add restrictions to those already implemented by the kernel's other access-control mechanisms. In the end, the decision was made that "authoritative hooks" — those that could increase privilege — would not be allowed. Among other things, this rule was seen as a way of keeping security modules from introducing security holes in their own right.

There have been a number of security modules added in the 21 years since that decision was made, but they have all been held to that rule. Easing the ban on authoritative hooks has occasionally been discussed over those years, but has never been seriously considered. So, when Nakryiko proposed adding a couple of authoritative hooks, LSM maintainer Paul Moore quickly responded:

One of the hallmarks of the LSM has always been that it is non-authoritative: it cannot unilaterally grant access, it can only restrict what would have been otherwise permitted on a traditional Linux system. Put another way, a LSM should not undermine the Linux discretionary access controls, e.g. capabilities.

The real solution, he said, would be to revise how the BPF code implements the CAP_BPF capability. Kees Cook disagreed, suggesting that these hooks could be seen as "fine-grained access control" rather than actually bypassing enforcement, but Moore stood firm in his opposition to the idea.

Nakryiko protested that the idea was to increase security by making it finer-grained than the single CAP_BPF capability allows. The restriction-only model, he said, would be more brittle in the end. He also added that there are a couple of real problems with capability-based enforcement when user namespaces are involved. The first is that many BPF programs, such as those that interact with tracing, inherently have a view of the entire system and cannot really be contained within a namespace. So a capability check for CAP_BPF cannot be namespace-aware.

Beyond that, though, it is currently not even possible to give a process CAP_BPF if it's running within a user namespace due to the way that the capability checks are implemented in the BPF subsystem. As a result, he argued, it is not really possible for programs running within a user namespace to make use of BPF at all. The proposed hooks were intended to provide a way around this shortcoming.

Casey Schaufler, who had been in favor of authoritative hooks back in 2001, was unsympathetic now:

This doesn't sound like a problem, it sounds like BPF is explicitly designed to prevent interference by namespaces. But in some cases you now want to limit it by namespaces.
It appears that the desired uses of BPF are no longer compatible with its original security model. That's unfortunate, and likely to require a significant change to the implementation of BPF.

Or, as Moore put it: "Changing the very core behavior of the LSM layer in order to work around an issue with another access control mechanism is a non-starter". Nakryiko has received the message and has promised to come back with a different approach. It thus seems that a complete solution to the problems encountered by the BPF community is a somewhat distant prospect at this point.

Unprivileged BPF

The quiet part of the discussion is an apparent change within the BPF community with regard to security. Quoting again from Nakryiko's cover letter:

Such LSM hook semantics gives ability to have safer-by-default policy of not giving applications any of the CAP_BPF/CAP_PERFMON/CAP_NET_ADMIN capabilities, normally required to be able to use BPF subsystem in the kernel. Instead, all the BPF processes could be left completely unprivileged, and only allowlisted exceptions for trusted and verified production use cases would be granted permission to work with bpf() syscall, as if those application had root-like capabilities.

In the early days of extended BPF, some effort went into making it possible to use BPF without any special privileges. By 2019, though, the idea of unprivileged BPF use had been explicitly deprecated. BPF co-maintainer Alexei Starovoitov described Linux as "a single-user system" and proclaimed that no further attempts would be made to enable use of BPF without privilege. The amount of pain involved in keeping the system secure had simply become too much; the advent of the Spectre vulnerabilities just made things worse.

So it is interesting to see the BPF developers talking about unprivileged operation again, even if done under the watchful eye of a security policy. There does not appear to have been any discussion on the BPF list about changes in the privilege model overall, so it is not entirely clear how this all came about.

What does seem clear is that, if the BPF developers want to move away from the simple CAP_BPF check, they are going to have to revisit many of the security-related decisions that they have made so far. The method of adding authoritative LSM hooks does not appear to be viable for mainline inclusion, so some thought is going to have to be put into other solutions, including perhaps rethinking the user-namespace issue. This does not look like a problem that is amenable to a quick solution.

Posted Apr 27, 2023 15:23 UTC (Thu) by Cyberax (✭ supporter ✭, #52523)

For a second, I thought that Linux would add authoritative LSM hooks. Which would have been great. Oh well...

Posted Apr 27, 2023 19:57 UTC (Thu) by geofft (subscriber, #59789)

I'm thinking about the suggestion in the list discussion to grant all users CAP_BPF and use the LSM to restrict it again, and the reductio ad absurdum that you might as well grant all users root and use the LSM to restrict their abilities, and whether it actually is absurd.

If you look at it from another angle, it seems like the problem is that the standard Linux security controls - caps, file ownership/permissions, user IDs and their namespacing, etc. - basically constitute an LSM with unusual rules: it cannot be turned off and it sits in front of any other LSM. What you really want is for the standard controls to be treated like any other stackable LSM (now that LSMs are stackable), removable, and most importantly configurable. Just like you can tell AppArmor or SELinux to e.g. label or not label packets, you should be able to tell the standard security controls to get out of the way for a particular thing, like BPF. Or like user account checks, for that matter.

There is in fact a little bit of precedent for turning off user account checks in the form of the CONFIG_MULTIUSER option, which when disabled gets rid of UIDs and GIDs and capabilities and all associated system calls and effectively runs everything as root. And now that I go look up the history of it, I see there was objection to it precisely because of the no-authoritative-LSMs rule: https://lwn.net/Articles/631853/

So maybe this really isn't that absurd and we should treat "running without permission checks" as no more technically unreasonable than "running without SELinux." This would basically redefine the problem so that the authoritativeness of an LSM is no longer a concept: if the traditional-permissions LSM is loaded and is configured to enable checks for an operation, then it stacks just like any other LSM, and the remaining LSMs in the stack are "non-authoritative" in current speak. If not, they're effectively "authoritative."

Posted Apr 28, 2023 7:31 UTC (Fri) by taladar (subscriber, #68407)

The problem would still exist though. Even if every mechanism to make security decisions is a stackable LSM (or some other, newly designed security plug-in system) you still need to decide what each of those can do.

Can each LSM only block operations, making the operation forbidden if any of them does? Can each LSM on its own allow operations even if other LSMs want to block it? Can each LSM veto decisions of other LSMs that run earlier or later than itself only?

If you aren't careful with permission systems you design something overly complex that leads to more accidental errors while its expressiveness goes unused due to its complexity.

Posted Apr 28, 2023 11:10 UTC (Fri) by farnz (subscriber, #17727)

If everything's stackable LSMs, and you assume a competent administrator, you only need the ability to remove permissions at each layer.

Before the LSM stack gets to make decisions, your user is omnipotent, and can do everything. Each layer of the stack can reject the user's request; if nothing in the stack rejects the request, then the user is allowed to do the thing.

It then becomes debuggable - all requests are approved by default, and the kernel can tell you which part of the LSM stack rejected any given request. You can thus design your stack so that all layers deny by default, and use the kernel's advice to open up permissions as and when you need them.
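A minimal sketch of such a deny-only stack, assuming each layer is a plain function that returns True to let a request pass; the layer names and request fields are invented for illustration and do not correspond to any kernel interface.

```python
from typing import Callable, Optional

Request = dict
Layer = Callable[[Request], bool]  # True lets the request pass, False rejects it


def evaluate(stack: list[tuple[str, Layer]], request: Request) -> tuple[bool, Optional[str]]:
    """Allow by default; return the name of the first layer that rejects."""
    for name, layer in stack:
        if not layer(request):
            return False, name
    return True, None


# Hypothetical two-layer stack: ownership check, then a BPF allowlist.
stack = [
    ("ownership", lambda r: r["uid"] == r["owner"]),
    ("bpf-policy", lambda r: r["op"] != "bpf_map_create" or r["allowlisted"]),
]

request = {"uid": 1000, "owner": 1000, "op": "bpf_map_create", "allowlisted": False}
allowed, rejected_by = evaluate(stack, request)
print(allowed, rejected_by)
```

Reporting the rejecting layer by name is what makes the stack debuggable in the sense described above: an empty stack allows everything, and each denial can be traced to one layer.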

Posted May 4, 2023 6:35 UTC (Thu) by ringerc (subscriber, #3071)

This sounds good in theory. What I've found is that you often land up with situations where only layer C has the knowledge necessary to say that some particular case should be an exception to the more general rules applied by higher layers, but it doesn't know enough to enforce the restriction elsewhere.

The "campus" layer knows you can only use your key card at the campus you study at. The correct keycard is sufficient to grant access.

The "police and emergency services" layer knows that you can get in if you know the right keypad code, but doesn't know anything about key cards and campuses.

The "rich alumni" layer knows that no matter what anyone else says, you WILL let them in anywhere unconditionally because you want their cheque book.

It's hard to compose security with orthogonal layers of checks if each layer can only give a final denial or a no-decision.

With that said, authoritative permit rules are massive foot guns and it's incredibly hard to design a system that's secure and easy to understand when you use them. I'm more interested in having a default-deny model where each layer can say "yep that's ok", "abstain" or "deny". An approve decision requires at least one explicit approval from a module and no denies. Most of the time most modules would abstain or approve, not deny.

I've always found it easier to layer these sorts of models where you default to denying then you layer grants.
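The three-state rule described above (at least one explicit approval and no denies) might be sketched as follows; the `Verdict` enum and `decide` function are names invented for this illustration.

```python
from enum import Enum
from typing import Iterable


class Verdict(Enum):
    APPROVE = "approve"
    ABSTAIN = "abstain"
    DENY = "deny"


def decide(verdicts: Iterable[Verdict]) -> bool:
    """Default deny: allow only with at least one approval and no denies."""
    collected = list(verdicts)
    if Verdict.DENY in collected:
        return False
    return Verdict.APPROVE in collected


# The campus layer approves the key card; the keypad layer has no opinion.
print(decide([Verdict.APPROVE, Verdict.ABSTAIN]))
# Nobody has an opinion, so the request fails closed.
print(decide([Verdict.ABSTAIN, Verdict.ABSTAIN]))
```

Because a single deny always wins, the order in which the layers are consulted does not affect the outcome in this sketch.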

Posted May 4, 2023 10:14 UTC (Thu) by Karellen (subscriber, #67644)

> I've always found it easier to layer these sorts of models where you default to denying then you layer grants.

I personally think this sounds like a better approach than "default allow with layered rejections", because it sounds like it has a better chance to "fail closed", which seems like a better security paradigm than "fail open".

Posted May 4, 2023 11:30 UTC (Thu) by farnz (subscriber, #17727)

I like this - because you have three states ("no decision", "allow", "deny"), the policy can fail closed on no decision.

And it avoids the composability issue that "authoritative allow" brings in - if your policy includes a "deny this access" rule, you can't be surprised by a later "allow this access" rule, since the "allow" rule can't override you.

Posted May 4, 2023 15:10 UTC (Thu) by Wol (subscriber, #4433)

This is why I don't like the Windows and Linux implementations of ACLs as I understand them. All sorts of confusing rules.

Pr1me ACLs were simple. Default whatever (defaulted to none). Groups were additive and over-rode default. Named were absolute and over-rode everything else.

So if I didn't want Jo Bloggs to see anything in my project directory, an acl of "Jo Bloggs : none" was definitive.

So provided your security layer could categorise an "allow" or "deny" as being at the group or personal level, a personal deny would be final, a group allow could be over-ridden.

Cheers,
Wol
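The precedence rules in the comment above (default, then additive groups, then absolute named entries) could be modeled roughly like this. It follows only the description given in the comment; the dictionary layout and the `effective_rights` name are assumptions, and no claim is made about how the real system stored its ACLs.

```python
def effective_rights(user: str, groups: list[str], acl: dict) -> set[str]:
    """Resolve rights: a named entry is absolute, group entries add up, else the default."""
    named = acl.get("named", {})
    if user in named:
        # A named entry overrides everything else, even when it grants nothing.
        return set(named[user])
    group_acl = acl.get("groups", {})
    matches = [group_acl[g] for g in groups if g in group_acl]
    if matches:
        # Group entries are additive and override the default.
        return set().union(*matches)
    return set(acl.get("default", ()))


acl = {
    "default": set(),
    "groups": {"project": {"read"}, "admins": {"read", "write"}},
    "named": {"jo": set()},  # the equivalent of "Jo Bloggs : none"
}

print(effective_rights("jo", ["project", "admins"], acl))
print(effective_rights("ann", ["project", "admins"], acl))
```

The named entry for "jo" yields an empty set regardless of group membership, which is the "definitive" behavior the comment describes.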

Posted May 1, 2023 18:27 UTC (Mon) by bartoc (guest, #124262)

One problem with this is that it means you are trusting the LSM to actually get the security model right, and when designing a new subsystem you then have a wider field of possible security models that you need to analyze to figure out if what you're doing is secure or could break any of them.

It might make sense to do this but have the LSM implementing the initial restrictions developed as part of the kernel and always applied in front of any other LSM, just like the classical DAC model is used today. However, because LSMs are code this doesn't really help you that much. There also may well be some hooks that you want this "base" LSM to have access to but shouldn't be available to other LSMs, and at that point you've almost gotten back to where you started.

Ultimately you need a security model to _actually_ analyze. Today most LSMs act more as security firebreaks that have a high probability of mucking up at least one link in some exploit chain than they do as airtight security where any mechanism of bypassing them is considered a bug.

Posted Apr 28, 2023 1:38 UTC (Fri) by developer122 (guest, #152928)

If Linux has truly become single user (single user of servers, personal computers, and embedded devices) then we might as well strip out all the user IDs, group IDs, filesystem permissions, access lists, and all the other access control mechanisms. After all, who needs to access control one's self? :P

Posted Apr 28, 2023 2:09 UTC (Fri) by geofft (subscriber, #59789)

Android uses user IDs to great effect to sandbox applications from each other, even though Android is almost always used as a single-user OS. (On the other hand, iOS runs everything as a single user and has a separate kernel sandboxing thing, kind of like a mix between seccomp and LSMs. Both approaches have had bugs but have basically been sound designs overall, so maybe this is an argument that single-user machines don't really need UIDs.)

I'm not sure if this is what was meant, but I can see the argument that Linux is a single-person OS and powerful features like BPF should be controlled and assigned to UIDs by a single person.

Posted Apr 28, 2023 3:21 UTC (Fri) by raven667 (subscriber, #5198)

> can see the argument that Linux is a single-person OS

Sure, most Linux systems are owner-operated, but as soon as you accept a use case where this isn't true, then you end up needing all the complexity and policy for multi-user systems, so you might as well plan for that from the start.

Posted Apr 29, 2023 6:36 UTC (Sat) by developer122 (guest, #152928)

I wonder if the entire concept of user IDs, filesystem permissions, etc could be exported into one or more LSMs.

Posted May 3, 2023 10:36 UTC (Wed) by smurf (subscriber, #17840)

No reason it can't be AFAIK.

Posted Apr 28, 2023 4:45 UTC (Fri) by Cyberax (✭ supporter ✭, #52523)

Quite a few container workloads basically do just that. I wouldn't mind an option to just disable all DAC entirely in containers for a small speedup.

Posted Apr 28, 2023 22:36 UTC (Fri) by dbnichol (subscriber, #39622)

CAP_DAC_OVERRIDE?

Posted Apr 28, 2023 9:24 UTC (Fri) by farnz (subscriber, #17727)

Even though I am the only user of my laptop, I have multiple Linux users on it with different permissions; they provide a form of sandboxing between tasks for me, so that (for example) I can run a build as a user that can only pull from my local git repo, and cannot read my files otherwise, nor is it permitted network access. This, in turn, helps me catch stupid mistakes before I trigger CI - forgetting to git add a new file is one of my favourite tricks.

I was inspired to do this by Android, which uses a similar trick for isolation between applications.

Posted Apr 28, 2023 13:59 UTC (Fri) by ballombe (subscriber, #9523)

Please read this in the context it was written.
The context was that userspace bpf made privilege escalation so easy that we could as well run everything as root.
It was not a comment on personal computer use.
<https://lwn.net/ml/netdev/20190813215823.3sfbakzzjjykyng2...>

Posted Apr 28, 2023 18:17 UTC (Fri) by Karellen (subscriber, #67644)

Well, sometimes it's nice to be unable to accidentally overwrite your boot sector when you're just trying to dial the modem. ;-)

Posted Apr 29, 2023 22:39 UTC (Sat) by rcampos (subscriber, #59737)

Wouldn't seccomp notify (which notifies a user-space agent when a syscall is made) be useful here?

I mean, the agent can execute the syscall on behalf of the unprivileged process if it deems it safe to do so. The process has no privileges nor capabilities.

Or is going to user space too slow for this use case?