Security requirements for new kernel features
Most of the operations that can be performed within io_uring follow the usual I/O patterns — open a file, read data, write data, and so on. These operations are the same regardless of the underlying device or filesystem that is doing the work. There always seems to be a need for something special and device-specific, though, and io_uring is no exception. For the kernel as a whole, device-specific operations are made available via ioctl() calls. That system call, however, has built up a reputation as a dumping ground for poorly thought-out features, and there is little desire to see its usage spread.
In early 2021, io_uring maintainer Jens Axboe floated an idea for a command passthrough mechanism that would be specific to io_uring. A year and some later, that idea has evolved into uring_cmd, which was pulled into the mainline during the 5.19 merge window. There is a new io_uring operation that, in turn, causes an invocation of the underlying device or filesystem's uring_cmd() file_operations function. The actual operation to be performed is passed through to that function with no interpretation in the io_uring layer. The first user is the NVMe driver, which provides a direct passthrough operation.
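The dispatch described above can be pictured with a small user-space model. This is only a sketch, not kernel code: `ToyFile`, `toy_nvme_uring_cmd()`, and `submit_uring_cmd()` are hypothetical names invented for illustration, and the choice of `EOPNOTSUPP` for a file without a handler is an assumption of the model. The point it tries to show is that the middle layer forwards the command number and payload without looking at either.

```python
import errno
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ToyFile:
    """Stand-in for an open file; uring_cmd mimics the file_operations member."""
    name: str
    uring_cmd: Optional[Callable[[int, bytes], int]] = None


def toy_nvme_uring_cmd(cmd_op: int, payload: bytes) -> int:
    """Hypothetical driver handler: only the driver knows what cmd_op means."""
    return len(payload)  # pretend the result is the number of bytes handled


def submit_uring_cmd(file: ToyFile, cmd_op: int, payload: bytes) -> int:
    """Model of the io_uring layer: forward the command without interpreting it."""
    if file.uring_cmd is None:
        return -errno.EOPNOTSUPP  # assumption: no handler means "not supported"
    return file.uring_cmd(cmd_op, payload)


nvme = ToyFile("toy-nvme", toy_nvme_uring_cmd)
plain = ToyFile("toy-regular-file")
```

Calling `submit_uring_cmd(nvme, 1, b"abcd")` reaches the driver handler, while the same call on `plain` fails in the model because that file provides no handler.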
Missing security hooks
Just over one year ago, there was a bit of a disagreement after the developers of the kernel's Linux Security Module (LSM) and auditing subsystems figured out that there were no security or auditing hooks in all of that new io_uring code. That put io_uring operations outside the control of any security module that a given system might be running and made those operations invisible to auditing. Those gaps were filled in, but not before the security developers expressed their unhappiness about how io_uring had been designed and merged without thought for LSM and audit support.
Given that, one might expect that the addition of a new feature like uring_cmd would have seen more involvement from the security community. To an extent, that happened; Luis Chamberlain posted a patch adding LSM support back in March. In short, it added a new security_uring_async_cmd() hook that would be called before passing a command through to the underlying code; it could examine that command and decide whether to allow or deny the operation. There were some disagreements over how well this would work; in particular, Casey Schaufler complained that security modules would have to gain an understanding of every device-specific command, which clearly would not scale well. The conversation wound down shortly thereafter.
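As a rough illustration of the "call a hook before passing the command through" idea, here is a toy user-space model. It does not reproduce the real `security_uring_async_cmd()` signature; the `Hook` type, `run_hooks()`, `passthrough()`, and the subsystem names are all assumptions made for this sketch.

```python
import errno
from typing import Callable, Iterable

# Hypothetical hook signature: (subsystem name, opaque command number)
# returns 0 to allow or a negative errno to deny.
Hook = Callable[[str, int], int]


def run_hooks(hooks: Iterable[Hook], subsystem: str, cmd_op: int) -> int:
    """Return 0 only if every registered hook allows the command."""
    for hook in hooks:
        rc = hook(subsystem, cmd_op)
        if rc != 0:
            return rc  # the first denial wins
    return 0


def passthrough(hooks: Iterable[Hook], subsystem: str, cmd_op: int,
                handler: Callable[[int], int]) -> int:
    """Consult the hooks first; call the driver handler only when allowed."""
    rc = run_hooks(hooks, subsystem, cmd_op)
    if rc != 0:
        return rc
    return handler(cmd_op)


def deny_toy_char(subsystem: str, cmd_op: int) -> int:
    """Example policy: refuse every command aimed at the made-up 'toy-char' subsystem."""
    return -errno.EACCES if subsystem == "toy-char" else 0
```

With `deny_toy_char` registered, a command for `"toy-char"` never reaches its handler, while commands for other subsystems go through unchanged. Note that the hook sees only an opaque command number, which is the scaling concern raised in the discussion.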
When the new feature was pushed into the mainline, there was no LSM support included with it. On July 13, Chamberlain reposted his patch adding the new security hook. Schaufler was equally unimpressed this time around:
You're passing the complexity of uring-cmd directly into each and every security module. SELinux, AppArmor, Smack, BPF and every other LSM now needs to know the gory details of everything that might be in any arbitrary subsystem so that it can make a wild guess about what to do. And I thought ioctl was hard to deal with.
SELinux and audit maintainer Paul Moore agreed with that assessment. The end result, he said, was that security modules would be unable to distinguish between low-level operations, so they would end up simply enabling all io_uring passthrough commands for any given subsystem or none of them; "I think we can all agree that is not a good idea". He later acknowledged that there does not appear to be a better solution at hand and that merging Chamberlain's patch looked like the only path forward: "Without any cooperation from the io_uring developers, that is likely what we will have to do". The current plan appears to be to get Chamberlain's patch into the mainline during the next merge window, with backports to the stable kernels to be done thereafter.
Grumpiness
This particular problem appears to be solved, albeit in a way that is less than satisfying to the security community. A better solution may materialize in the future, though providing a way to control access to device-specific functionality in a general way is a hard problem. But a harder problem may be addressing the residual grumpiness in the security community and preventing such problems from recurring in the future. As Moore put it:
I feel that expressing frustration about the LSMs being routinely left out of the discussion when new functionality is added to the kernel is a reasonable response; especially when one considers the history of this particular situation.
For his part, Axboe acknowledged that the security concerns should not have been allowed to fall through the cracks, but he didn't necessarily offer a lot of hope for changes in the future:
I guess it's just somewhat lack of interest, since most of us don't have to deal with anything that uses LSM. And then it mostly just gets in the way and adds overhead, both from a runtime and maintainability point of view, which further reduces the motivation.
Even when the motivation is there, mistakes can happen. Kernel development is a complex business. A lot of effort has gone into making the kernel sufficiently modular that developers need not worry about what is happening in the rest of the system, but there are limits to how far that process can go.
For example, developers must be aware of locking and the locking requirements of subsystems they call into or things may go badly wrong. Memory must be handled according to the constraints placed on the memory-management subsystem, and developers creating complex caches may have to implement shrinkers to release memory on demand. CPU hotplug affects many subsystems and must be taken into account. The same is true of power-management events. Changes to the user-space API can create unhappiness years later. Inattention to latency constraints may create trouble in realtime applications. A failure to properly document a subsystem will make life harder for developers and users — but they are all used to that by now.
And, of course, a failure to provide proper security hooks will hobble the ability of administrators to control process behavior by way of LSM policies.
The fact that developers do not always succeed in keeping all of these constraints in mind — and consequently make mistakes — is unsurprising. Catching such omissions is one of the reasons for the existence of the kernel's sometimes tiresome review process. But nothing ensures that a given change will be properly reviewed by, for example, a developer who understands the needs of Linux security modules, and there is little that forces the suggestions from any such review to be heeded.
So important things will occasionally fall through the cracks, and it is not clear that much can be done to improve the situation. It would be wonderful if more companies would pay developers to spend more time reviewing patches to provide, as an example, an overall security-oriented eye on code heading into the mainline, but that does not appear to be the world that we are living in. Attempts to impose requirements with a more bureaucratic process would mostly create friction and lead to the distribution of more out-of-tree (and severely unreviewed) code.
The best path toward improvement may be, as Axboe put it, "one subsystem being aware of another one's needs". Working toward that goal, along with the ability to fix mistakes in the stable kernels when they do happen, seems to work reasonably well most of the time.
Index entries for this article | |
---|---|
Kernel | Development model/Code review |
Kernel | io_uring |
Kernel | Security/Security modules |
Posted Jul 28, 2022 14:51 UTC (Thu)
by khuey (guest, #158560)
[Link] (2 responses)
Posted Jul 28, 2022 15:02 UTC (Thu)
by magfr (subscriber, #16052)
[Link]
In a perfect world all of the security stuff would be unnecessary but the world is sadly not perfect.
Posted Jul 28, 2022 22:51 UTC (Thu)
by cschaufler (subscriber, #126555)
[Link]
Posted Jul 29, 2022 8:24 UTC (Fri)
by zdzichu (subscriber, #17118)
[Link] (23 responses)
Posted Jul 29, 2022 13:01 UTC (Fri)
by mathstuf (subscriber, #69389)
[Link] (22 responses)
Posted Jul 29, 2022 15:17 UTC (Fri)
by jhoblitt (subscriber, #77733)
[Link]
Posted Jul 29, 2022 18:14 UTC (Fri)
by josh (subscriber, #17465)
[Link] (19 responses)
Posted Jul 30, 2022 20:52 UTC (Sat)
by andresfreund (subscriber, #69562)
[Link]
For production use having to build a custom kernel requires a decent scale to be a good decision. The overhead of various unused features in common distribution kernels is a problem.
Posted Aug 4, 2022 21:44 UTC (Thu)
by cschaufler (subscriber, #126555)
[Link] (17 responses)
I cherish the memory of the Unix system that ran a sophisticated management program five to ten times faster when audit was enabled than when it wasn't. When the characteristics of disparate sub-systems provide mutual benefit it's a wonderful thing. You'll never know that can happen if you don't at least try.
Posted Aug 5, 2022 1:28 UTC (Fri)
by Cyberax (✭ supporter ✭, #52523)
[Link] (16 responses)
And it's pretty much used in these situations now. SELinux is useful if you are a giant corp with a huge development staff that is OK with torturing themselves by writing SELinux policies.
> Today the system that doesn't use security modules is an odd duck indeed.
Like, pretty much all classic desktops? I've yet to see a developer with "serious" LSMs like SELinux turned on.
I think, some serious soul-searching on the side of LSM developers is in order.
Posted Aug 5, 2022 8:47 UTC (Fri)
by Wol (subscriber, #4433)
[Link] (1 responses)
The OP said that Android comes with SELinux switched on.
I think you're forgetting that (a) your "classic desktop" is actually a niche use case for Linux, and (b) even then, all of the "big boys" - RH, Ubuntu, SUSE - probably do have SELinux switched on. It's just not that visible ...
My system is gentoo - of course I haven't enabled it. But as Linux goes, gentoo and stuff like that is very much the minority ...
Cheers,
Posted Aug 9, 2022 12:58 UTC (Tue)
by anton (subscriber, #25547)
[Link]
Posted Aug 5, 2022 9:03 UTC (Fri)
by mw_skieske (guest, #144003)
[Link] (1 responses)
Hi there!
Do you know every Fedora Desktop has, in fact, SELinux in enforcing mode?
❯ getenforce
kind regards
People who don't do SELinux are just lazy.
Posted Aug 5, 2022 12:39 UTC (Fri)
by corbet (editor, #1)
[Link]
Posted Aug 5, 2022 13:33 UTC (Fri)
by pizza (subscriber, #46)
[Link]
Fedora and RHEL, at least, have SELinux on out of the box.
All but one of my Linux installations have SELinux enabled (the exception is a heavily-used snowflake shell server whose install predates this SELinux stuff), including my two daily-use desktops (and several others I am responsible for).
Additionally, nearly every Android device out there relies on SELinux to enforce app isolation.
Posted Aug 8, 2022 22:43 UTC (Mon)
by cschaufler (subscriber, #126555)
[Link] (10 responses)
Posted Aug 8, 2022 22:57 UTC (Mon)
by Cyberax (✭ supporter ✭, #52523)
[Link] (9 responses)
AppArmor in Ubuntu is a bit more sane, because it doesn't require crazy labelling and impenetrable policies.
> And there's every cloud provider
Not every. And I know how one large provider works internally (a couple of years outdated, but I doubt it has changed much).
Heck, here's what EC2 offers for their own supported in-house distribution:
> [ec2-user@ip-172-31-0-166 ~]$ getenforce
> When we search our souls it's not about whether we should do a better job of getting out of the way, it's about how we can provide more of the features developers are screaming for and still maintain performance.
How long did it take to build stackable LSMs? For a decade the inability to run multiple LSMs made anything but SELinux/AppArmor impractical.
Sorry. But right now LSMs are just an impediment that most people try to wave away so it won't bother them. Large companies like Google have time and money to invest in getting it into shape, sure. But that's a far cry from being a useful and productive feature. Unlike cgroups or namespaces that are widely accepted by developers.
Posted Aug 9, 2022 16:16 UTC (Tue)
by cschaufler (subscriber, #126555)
[Link] (7 responses)
Posted Aug 9, 2022 20:15 UTC (Tue)
by Cyberax (✭ supporter ✭, #52523)
[Link] (6 responses)
It's not that it's not useful, additional mitigations are great. It's that the amount of effort that needs to be expended to make use of SELinux is just not comparable with the amount of protection it provides. I long ago tried to make sense of policies and to create my own toy policies, but failed miserably. TOMOYO is rigorously undocumented and I haven't touched Smack because it doesn't even look in any way "simplified".
AppArmor is a bit better, since it at least doesn't require labelling across all of the filesystem which is nothing but security theater compared to just using paths. Its policies are also easier to understand.
One feature that I really personally would have liked is an ability to use LSMs to _grant_ permissions instead of taking them away.
Posted Aug 9, 2022 20:24 UTC (Tue)
by Cyberax (✭ supporter ✭, #52523)
[Link]
Posted Aug 9, 2022 20:29 UTC (Tue)
by corbet (editor, #1)
[Link] (4 responses)
Posted Aug 10, 2022 21:39 UTC (Wed)
by cschaufler (subscriber, #126555)
[Link] (3 responses)
Posted Aug 11, 2022 0:09 UTC (Thu)
by Cyberax (✭ supporter ✭, #52523)
[Link] (2 responses)
Various systems (like IAM policies in AWS or ACLs in Windows) typically consider "Deny" to be a veto on any allowing ACLs/policies.
Posted Aug 11, 2022 17:50 UTC (Thu)
by cschaufler (subscriber, #126555)
[Link] (1 responses)
Posted Aug 11, 2022 18:50 UTC (Thu)
by Cyberax (✭ supporter ✭, #52523)
[Link]
That would actually help and make time investment into SELinux be worthwhile, as it will open up _new_ possibilities. Performance impact is another question, and it'd be interesting to see if removing the DAC entirely in favor of MAC would help.
Posted Aug 23, 2022 7:26 UTC (Tue)
by daenzer (subscriber, #7050)
[Link]
I've been working on the graphics stack (mostly between Mesa / mutter / Xwayland) as part of the Red Hat desktop group for 3 years. In this time, I've never had to disable SELinux on Fedora. AFAIK my colleagues are leaving it enabled as well.
There can sometimes be minor SELinux related issues when upgrading to a new beta release, but those are usually quickly fixed.
It seems to me what you think you know about this is hearsay and/or outdated.
Posted Jul 30, 2022 10:14 UTC (Sat)
by Wol (subscriber, #4433)
[Link]
That way, people who don't want the hassle/overhead just don't bother registering god with io_uring. People who are paranoid, or need accounting, or whatever, configure god to reject calls it doesn't know about (and the writers of said calls will quickly get bug reports saying "your io_uring call doesn't work - missing security module").
And if this is added *quickly*, before io_uring gets too embedded, it means that "no security module no run" is a realistic option. The later it gets left, the harder it gets to turn that on without all hell breaking loose ...
Cheers,
Posted Aug 15, 2022 18:03 UTC (Mon)
by jezuch (subscriber, #52988)
[Link]
> And, of course, a failure to provide proper security hooks will hobble the ability of administrators to control process behavior by way of LSM policies.
My $DAYJOB recently introduced a checklist in the pull request template. It pertains mostly to release notes and documentation, but I imagine it could at least help here. Of course people will ignore it, will misjudge the requirements etc, but maybe in the case of the bigger pull requests someone will insist on it being at least seriously considered.
Performance impact
Wol
All I could find says that Ubuntu does not have SELinux enabled by default. You apparently don't count Debian among the big boys, but it does not have SELinux enabled by default, either.
Enforcing
That last line could really have been done without; there is no need to insult people you disagree with on something like this. Please don't do that here.
> Disabled
It seems you have one point of agreement with Casey, anyway: at one point, at least, he too wanted authoritative hooks in the LSM subsystem.
Authoritative hooks
Wol