
A security model for systemd


By Joe Brockmeier
November 5, 2025

All Systems Go!

Linux has many security features and tools that have evolved over the years to address threats as they emerge and security gaps as they are discovered. The result is, as Lennart Poettering observed at the All Systems Go! conference held in Berlin, somewhat random and not a "clean" design. To many observers, that may also appear to be the case for systemd; however, Poettering said that he does have a vision for how all of the security-related pieces of systemd are meant to fit together. He wanted to use his talk to explain "how the individual security-related parts of systemd actually fit together and why they exist in the first place".

I did not have a chance to attend the All Systems Go! conference this year, but watched the recording of the talk after it was published. The slides are also available.

What is a security model?

Poettering said that when he started drafting his slides it dawned on him that he had used the phrase "security model" frequently, but without knowing its formal definition. So he turned to Wikipedia's definition, which states:

A computer security model is a scheme for specifying and enforcing security policies. A security model may be founded upon a formal model of access rights, a model of computation, a model of distributed computing, or no particular theoretical grounding at all.

That definition was pleasing, he said, because he could just "pull something out of my hair and it's a security model." Of course, he wanted to be a bit more formal than that. Considering the threats in the world we actually live in was the place to begin.

Thinking about threats

Today's systems are always exposed, he said. They are always connected; even systems that people do not think about, such as those in cars, are effectively always online waiting for updates. And systems are often in physically untrusted environments. Many systems are hosted by cloud providers and outside the physical control of their users. Users also carry around digital devices, such as phones, tablets, and laptops: "So it is absolutely essential that we talk about security to protect them both from attacks on the network and also locally and physically."


The next thing is to think about what is actually being attacked. Poettering described some of the possible scenarios; one type of attack might take advantage of a vulnerability in unprivileged code, while another might try to exploit privileged code to make it execute something it was not supposed to. It could be an attack on the kernel from user space. "We need to know what's being attacked in order to defend those parts from whomever is attacking them."

Attacks also have different goals, he said. Some attacks may target user data, others may attempt to backdoor a system, and still others may be focused on using a system's resources or conducting a denial-of-service (DoS) attack. The type of attack determines the type of protections to be used. Encryption, he said, is useful if one is worried about data exfiltration, but not so much for a DoS.

Poettering said that he also thought about where attacks are coming from. For example, does an attacker have physical access to a system, is the attack coming over a network, or is the attack coming from inside the system? Maybe a user has a compromised Emacs package, or something escapes a web browser's sandbox. Not all of these attack sources are relevant to systemd, of course, but thinking about security means understanding that attacks can come from everywhere.

FLOUTing security

The bottom line is that the approach to defending against attacks depends on where they come from and what the intention of the attack is. Poettering put up a new slide, which he said was the most important of all the slides in his presentation. It included his acronym for systemd's security model, "FLOUT":

Frustrate attacks
Limit exposure after successful attacks
Observe attacks
Undo attacks
Track vulnerabilities

"I call this 'FLOUT': frustrate, limit, observe, undo, and track. And I think in systemd we need to do something about all five of them".

The first step is to "frustrate" attackers; to make attacks impossible. "Don't even allow the attack to happen and all will be good." But, it does not always work that way; software is vulnerable, and exploits are inevitable. That is why limiting exposure with sandboxing is important, he said. If a system is exploited, "they might do bad stuff inside of that sandbox, but hopefully not outside of it."

Since exploits are inevitable, it is also necessary to be able to observe the system and know not only that an attack happened, but how it happened as well. And, once an attack has happened and been detected, it must be undone. With containers and virtual machines, it is less important to have a reset function, Poettering said: "Just delete the VM or container, if you have the suspicion that it was exploited, and create a new one". But that approach does not work so well with physical devices. "We need to always have something like a factory reset that we can return to a well-defined state" and know that it is no longer exploited. Finally, there is tracking vulnerabilities. Ideally, he said, you want to know in advance if something is vulnerable.

Poettering returned to a theme from the beginning of the talk: the fact that Linux, and its security features, were not designed "in a smooth, elegant way". There are so many different security components, he complained, ranging from the original Unix model with UIDs and GIDs, to user namespaces. "And if you want to use them together, it's your problem". Too much complexity means less security.

He said that he preferred universal security mechanisms to fine-grained ones. This means finding general rules that always apply and implementing security policies that match those rules, rather than trying to apply policies for specific projects or use cases. He gave the example that device nodes should only be allowed in /dev. That is a very simple security policy that is not tied to any specific hardware.
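As a concrete illustration (a minimal sketch, not an example from the talk), systemd can express that universal rule for a service with a single sandboxing directive, rather than with a daemon-specific policy:

    [Service]
    # Give the service its own minimal /dev, containing only
    # pseudo-devices such as /dev/null, and drop CAP_MKNOD so
    # that it cannot create device nodes elsewhere.
    PrivateDevices=yes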

But that is not how many of Linux's security mechanisms are built. SELinux, for instance, requires a specific policy for each daemon. Then, one might write the policy that forbids that daemon from creating device nodes. But that is much more fragile and difficult to maintain, he said. "It's much easier figuring out universal truths and enforcing them system-wide". To do that, components should be isolated into separate worlds.

Worlds apart

Poettering said that he liked to use the word "worlds" because, so far, it is not used much in the Linux community. The term "worlds" could be replaced with "containers", "sandboxes", "namespaces", and so on. The important concept is that something in a separate world is not only restricted from accessing resources that are outside of that world, it should not see those resources at all.

So to keep the complexity of these sandboxes small, it's good if all these objects are not even visible, not even something you have to think about controlling access to, because they are not there, right?

Security rules should be that way, he said, and deal with isolation and visibility. That is different than the way SELinux works; everything still runs in the same world. An application may be locked down, but it still sees everything else.

The next fundamental thing to think about, he said, is to figure out what an application is in the first place and how to model it for security. It is not just an executable binary, but a combination of libraries, runtimes, data resources, configuration files, and more, all put together. To have a security model, "we need to model apps so that we know how to apply the security" to them.

Ideally, an app would be something like an Open Container Initiative (OCI) image or Flatpak container that has all of its resources shipped in an "atomic" combination; that is, all of the components are shipped together and updated together. In this way, he said, each application is its own world. Here, Poettering seemed to be comparing the update model for Docker-type containers and Flatpak containers to package-based application updates, where an application's dependencies might be updated independently; he said that "non-atomic behavior" is a security vulnerability because different components may not be tested together.

Another piece of a security model is delegation; components need to be able to talk to one another and delegate tasks. On the server side, the database and web server must be able to talk to one another. On the desktop, the application that needs a Bluetooth device needs to be able to talk to the application that manages Bluetooth devices.

Security boundaries

Poettering also talked about different types of security boundaries. Security sandboxes are one type of boundary that most people already think about, as are boundaries between user IDs (UIDs). A system's different boot phases are yet another type of boundary; for example, during certain parts of the boot process there are values that are measured into the TPM. After that phase of the boot process is finished, it "kind of blows a fuse" and the system can no longer modify those values, which provides a security boundary.
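Those measurements can be inspected on a running system; as a hedged sketch (assuming a reasonably recent systemd on hardware with a TPM), systemd-analyze can list the platform configuration registers (PCRs):

    # List the PCRs known to systemd; PCR 11 is extended by
    # systemd-pcrphase at boot-phase transitions (for example
    # "enter-initrd" and "leave-initrd"), after which secrets
    # sealed to the earlier phase can no longer be unsealed.
    $ systemd-analyze pcrs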

He said that there are also distinctions that are important between code, configuration, and state. Code is executable, but configuration is not. These resources should be kept separate; state and configuration should be mutable, but code should not be mutable "because that's an immediate exploit, basically, if some app or user manages to change the code".

Along with the security boundaries are the technologies that enforce those boundaries; for example, Linux namespaces, SELinux security labels, CPU rings, and others.

Distributions

The Linux distribution code-review model is supposed to be a security feature, he said. It means that users do not have to download software from 500 different sources they "cannot possibly understand if they are trustworthy or not". Instead, users rely on distributions to do some vetting of the code.

However, Poettering said that there are problems with this model: namely that it does not scale and it is too slow. Distributions cannot package everything, and they cannot keep up with how quickly developers release software. Plus, code reviews are hard, even harder than programming. "So do we really trust all the packagers and the distributions to do this comprehensively? I can tell you I'm not." This is not to disrespect distribution packagers, he said: "I'm just saying that because I know I'm struggling with code reviews, and so I assume that other people are not necessarily much better than me".

One never knows, he said, if distribution packagers are actually reviewing the code they package, and "sometimes it becomes visible that they don't; let's hope that those are the exceptions". Sandboxing and compartmentalizing, Poettering said, is essential to ensure that users do not have to rely solely on code review for protection.

Rules

Having examined all the things that one has to think about when creating a security model, Poettering wanted to share the rules that he has come up with. The first is that kernel objects should be authenticated before they are instantiated. "We should minimize any interaction with data, with objects, with stuff that hasn't been authenticated yet because that is always where the risk is."

Poettering also said that security should focus on images, not files; look at the security of an entire app image, rather than trying to examine individual files (or "inodes" as he put it). "We should measure everything in combination before we use it". He brought up sandboxing again, and said that it was necessary to "isolate everywhere".

Another rule is that a comprehensive factory reset is a must, he said. This cannot be an afterthought, but something that needs to be in the system right away. And, finally, "we need to really protect our security boundaries".

But, he said, a security model still has to be useful. And, "as most of us here are hackers", there needs to be a break-glass mode that allows for making temporary changes and debugging. Use of a break-glass mode should be a measured and logged event, though: "Even if you are allowed to do this, there needs to be a trace of it afterward". Such a mode should not allow a developer to exfiltrate data from a system, and it should possibly even invalidate data in some way.

Linux misdesigns

Next, Poettering identified some of the things he felt were misdesigns in the Linux and Unix security models that he does not want to rely on. His first gripe was with the SUID (or "setuid") bit on files. This is not a new topic for him; in 2023, in response to a GNU C library (glibc) vulnerability, Poettering said that general-purpose Linux distributions should get rid of SUID binaries. Instead, he suggested using interprocess communication (IPC) to manage executing a privileged operation on behalf of an unprivileged user.
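systemd's run0 tool, added in systemd v256, is one example of that approach; a brief sketch of its use (the service name is illustrative):

    # Privilege via IPC instead of SUID: run0 asks the service
    # manager to spawn the command, after polkit authentication;
    # the run0 binary itself carries no SUID bit.
    $ run0 systemctl restart nginx.service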

He also felt that the Linux capabilities implementation is a terrible thing. The feature is "kind of necessary", but a design mistake. For example, CAP_SYS_ADMIN is "this grab bag of privileges of the super user". He complained that this one capability is "so much bigger than all the other ones that it's a useless separation" of privileges. However, complaints about CAP_SYS_ADMIN are neither new nor rare; Michael Kerrisk, for example, enumerated several in his LWN article about it in 2012.

In any case, Poettering did acknowledge that capabilities are "not entirely useless", and that systemd makes heavy use of capabilities. However, "we only make use of it because it's there, and it's really basic, and you cannot even turn it off in the kernel".
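As a minimal sketch of that use, a hypothetical unit for a service that only needs to bind to privileged ports might read:

    [Service]
    User=www-data
    # Remove every capability from the bounding set except the
    # one the service actually needs; notably, this keeps the
    # CAP_SYS_ADMIN grab bag out of reach.
    CapabilityBoundingSet=CAP_NET_BIND_SERVICE
    # Grant that one capability to the unprivileged service user.
    AmbientCapabilities=CAP_NET_BIND_SERVICE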

One of the core Unix designs that Linux has inherited is "everything is a file". That is, he said, not actually true. There are certain kinds of objects that are not inodes, such as System V semaphores and System V shared memory. That is a problem, because they are objects with a different type of access control than inodes, where "at least we know how security works".

Implementation in systemd

"Now, let's be concrete", Poettering said. It was time to explain how systemd implements the security model that he had discussed, and where its components fit into the FLOUT framework. The first was to sandbox services, to limit exposure; systemd has a number of features for putting services into their own sandbox.

Another is using dm-verity and signatures for discoverable disk images (DDIs) that are inspected to ensure they meet image policies. Verifying disk images would frustrate attackers, as well as provide observability; if a disk image does not match the signature, that is a sign of tampering. Systemd's factory reset features provide the "undo" part of the FLOUT framework; in systemd v258 the project added the ability to reset the TPM as well as disk partitions. LWN covered that in August 2025.
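To return to image verification for a moment: the systemd-dissect tool can inspect a DDI and check it against an image policy. In this sketch (with a hypothetical image name), the policy requires a signed dm-verity root partition:

    # Inspection fails if the root partition is not protected
    # by a signed dm-verity hash tree.
    $ systemd-dissect --image-policy=root=signed example.raw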

Poettering said that we should also "try really hard to do writable XOR executable mounts". A filesystem should either be mounted writable, so that its contents can be modified, or mounted executable, so that binaries can be run from it, but never both. If that were implemented through the whole system, he said, it would be much more secure. Systemd provides tools to do this, in part, with its system extension features. Systemd can mount system extension images (sysext) for /usr and /opt, and configuration extension images (confext) for /etc. The default is to mount these extensions read-only, though it is possible to make them writable.
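In practice, merging a system extension looks something like this (the image name is illustrative):

    # Install a sysext image and overlay it onto the host; the
    # merged /usr hierarchy is mounted read-only.
    $ cp my-extension.raw /var/lib/extensions/
    $ systemd-sysext merge
    $ systemd-sysext status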

Systemd also uses the TPM a lot, "for fundamental key material" to decrypt disks (systemd-cryptsetup) and service credentials (systemd-creds). That, he said, helped to frustrate attackers and limit access. Finally, he quickly mentioned using the varlink IPC model for delegating and making requests to services, which also helped as a way to limit access.
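A short sketch of the credentials side, with illustrative names; the secret is sealed to the local TPM, so it can only be decrypted on the same machine:

    # Encrypt a secret with a TPM2-bound key.
    $ systemd-creds encrypt --with-key=tpm2 --name=dbpass \
          plaintext.txt /etc/credstore.encrypted/dbpass

A service unit can then declare "LoadCredentialEncrypted=dbpass" (assuming a systemd recent enough to search /etc/credstore.encrypted automatically) and read the decrypted secret from its private $CREDENTIALS_DIRECTORY at runtime.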

Questions

One member of the audience wanted to know how Poettering would replace capabilities if he had a magic wand capable of doing so. "If you don't like it, what would you like to see instead?" Poettering responded that his issue was not with the capability model per se, but with the actual implementation in Linux. He said that he liked FreeBSD's Capsicum: "if they would implement that, that would be lovely".

Another attendee asked when systemd would enable the no-new-privileges flag. Poettering said that it was already possible to use that flag with systemd because it does not have SUID binaries. "We do not allow that". But, he said, it does not mean that the rest of the system is free of SUID binaries. It should be the goal, "at least in well-defined systems", to just get rid of SUID binaries.


Index entries for this article
Conference: All Systems Go!/2025




OpenBSD pledge & unveil are also nice

Posted Nov 5, 2025 16:15 UTC (Wed) by rbranco (subscriber, #129813) [Link]

There are implementations for Linux using seccomp & Landlock:
- https://github.com/jart/pledge/
- https://github.com/marty1885/landlock-unveil

Relationship with kernel

Posted Nov 5, 2025 17:49 UTC (Wed) by SLi (subscriber, #53131) [Link] (1 responses)

I know this is neither the intent nor quite true, but one gets the feeling that the mentality is a bit of a resigned "this is what the kernel folks have decided to give us, and we'll do our best with it"—i.e. there's a large development barrier between systemd and the kernel. This is, of course, probably both good and bad; but given that systemd is relatively central to modern Linux systems, do you think it would make sense to even try to develop them together? If not now, then at some point in the future?

Now it has a bit of a feeling of waterfall development with agreed responsibilities and "you stay there, I stay here".

I'm not even saying this is bad. It's actually very good that the userspace/kernel API gets defined well and narrowly. Rather, do you see this as a hindrance?

Relationship with kernel

Posted Nov 5, 2025 18:17 UTC (Wed) by bluca (subscriber, #118303) [Link]

Fortunately it is not quite the case, generally speaking. Lots of stuff gets added because we ask for them and are the primary users - see most of the PID FD interfaces that were added in the past few years, and lots of cgroups stuff before that too.
However, there are tons of _existing_ interfaces/systems/whatnot that can't really change, as it would be a massive compat break to do so, and a humongous task on top of that, so it is true that we are resigned to e.g. file caps being what they are.
Adding new things is much much easier than changing existing, entrenched subsystems.

So as always it's nuanced, and there's a bit of both at play.

capabilities and, er, capabilities

Posted Nov 5, 2025 18:16 UTC (Wed) by smcv (subscriber, #53363) [Link]

> how Poettering would replace capabilities if he had a magic wand capable of doing so ... He said that he liked FreeBSD's Capsicum

... so, he'd like to replace the thing that is named "capabilities" (but, confusingly, is not a capability-based security model with the meaning used in e.g. https://en.wikipedia.org/wiki/Capability-based_security) with a capability-based security model. Terminology is hard!

Images are a false simplification

Posted Nov 5, 2025 18:32 UTC (Wed) by nim-nim (subscriber, #34454) [Link] (10 responses)

Images are no easier to check or prove than packages. A giant store of isolated images is just a giant store of isolated things, much like Amazon or Alibaba is a giant store of isolated gadgets, where you have no clue if anything is genuine or safe to use; most often it is not genuine or safe, and getting an idea of whether it is genuine or safe requires someone poking inside products looking at what they are built from. Which is more or less equivalent to decomposing an app system into packages.

The disconnect is expecting too much of package validation and too little of image validation, because we wish for things to be simpler than they are.

That being said containment of app systems is clearly worthwhile however you assemble those app systems.

Images are a false simplification

Posted Nov 5, 2025 20:28 UTC (Wed) by ebee_matteo (subscriber, #165284) [Link] (9 responses)

I think the point being made is rather that, by virtue of using immutable images that can be verified before being started, you can trust that they were not tampered with from the point where they were signed.

Of course, you are right that this does solve just the integrity problem and does not prove authenticity.

The trust boundary however is pushed a bit further: at the point where you can validate an image was signed by the right people with the right keys.

Compare with a Debian package, whose files can be modified on disk after installation by a malicious user. And of course, an image also often implies a reproducible environment (e.g. controlled env variables, etc.) which makes it a bit harder to exploit.

Images are a false simplification

Posted Nov 5, 2025 21:05 UTC (Wed) by bluca (subscriber, #118303) [Link] (7 responses)

> Of course, you are right that this does solve just the integrity problem and does not prove authenticity.

It provides authenticity too, as the point being made was about signed dm-verity images. The signature is verified by the kernel keyring, so both authenticity and integrity are covered.

Of course this is not the case when using more antiquated image formats such as Docker, where it's just a bunch of tarballs in a trenchcoat, but systemd has been supporting better image formats for a long time now.

Images are a false simplification

Posted Nov 5, 2025 22:06 UTC (Wed) by nim-nim (subscriber, #34454) [Link] (2 responses)

Either way the value is limited.

We’ve known for quite a long time that it is useless to install genuine foo if genuine foo cannot resist exploitation as soon as it is put online (as companies deploying golden Windows images discovered as soon as networking became common), and we’ve known for quite a long time that attackers rarely bother altering components in flight; they typosquat and trick you into installing genuine, authenticated malware (not different from the counterfeits that flood marketplaces and that Amazon or Alibaba will happily ship you in their genuine state).

Security comes from the ability to rebuild and redistribute stuff when it has a hole (honest mistakes) and from poking inside stuff that will be redistributed to check it actually is what it pretends to be (deliberate poisoning). And then you can sign the result and argue if your signing is solid or not, but signing is only worthwhile if the two previous steps have been done properly.

Images are a false simplification

Posted Nov 6, 2025 1:22 UTC (Thu) by Nahor (subscriber, #51583) [Link] (1 responses)

> And then you can sign the result and argue if your signing is solid or not, but signing is only worthwhile if the two previous steps have been done properly.

If what you built and distributed can easily be replaced without you knowing, then those two steps are of limited value too.

And continuing your line of thought, if signing/immutability/rebuild/distribution are all done right, they are useless if you don't verify the source code you're using.

And even if the source code verification is done right, it is useless if the person doing the verification and signing of the code can be corrupted or coerced with a $5 wrench.

And even if [...]

TL;DR: what you're arguing is that security is pointless and of limited value because there will always be a point where you have to trust something or someone. There will always be a weak link. All we can do is ensure that most links are safe to reduce the attack surface. Using images is one step in that direction.

Images are a false simplification

Posted Nov 6, 2025 7:32 UTC (Thu) by nim-nim (subscriber, #34454) [Link]

That’s the “trust the vendor” thought, but in our world we definitely do not trust the vendor.

We trust a system, where maximum transparency, accountability and absence of lockdown keep vendors honest. We trust the regulator, that forces vendors to provide a minimum of information on their pretty boxes, we trust consumer associations, that check the regulator is not captured by vendors, we trust people that perform reviews, tests and disassembly of products, the more so they are diverse and independent and unable to collude with one another, we trust competition and the regulations that enforce this competition and prevent vendors from cornering and locking down some part of the market.

And then you can add a certificate of genuine authenticity to the mix, but most of the things you'll buy in real life don’t come with those, because that’s no more than icing on the cake. Trusting the vendor produces 737 MAXes. This is not a fluke but human nature. You're usually better served by checking other things, such as the quality of materials and assembly.

Performing third-party checks is hard, and long, and those checks are usually incomplete, be it by distributions or in the real world, while printing authenticity certificates is easy. It is very tempting to slap shiny certificates on an opaque box and declare mission accomplished, but it is not. More so if the result is reducing vendor accountability and dis-incentivizing doing things right (I’m not saying that’s the case here, but it is the usual ending of let's-trust-the-vendor initiatives).

Images are a false simplification

Posted Nov 5, 2025 22:17 UTC (Wed) by ebee_matteo (subscriber, #165284) [Link] (3 responses)

> It provides authenticity too, as the point being made was about signed dm-verity images. The signature is verified by the kernel keyring, so both authenticity and integrity are covered.

Yes, authenticity against a digital signature.

But trust has to start somewhere. You need to trust the signing keys, or somebody that transitively approved the key, e.g. as a CA.

In other words, you can prove an image was signed against a key, but if I manage to convince you to trust my public key, I can still run malicious software on your machine.

I still haven't seen the problem of supply-chain attacks being solved (by anybody, regardless of the technology employed).

Images are a false simplification

Posted Nov 5, 2025 22:23 UTC (Wed) by bluca (subscriber, #118303) [Link] (2 responses)

> But trust has to start somewhere.

Yes, and this is a solved problem on x86-64: you trust the vendor who sold you your CPU. You have to anyway, since it's your CPU, and it's silly to pretend otherwise.
That CPU verifies the firmware signature, which verifies the bootloader signature, which verifies the UKI signature, which verifies the dm-verity signature.

Images are a false simplification

Posted Nov 6, 2025 1:32 UTC (Thu) by Nahor (subscriber, #51583) [Link] (1 responses)

> this is a solved problem on x86-64

Not really. It's more that it is an unsolvable problem (or at least impractical to solve), so we choose to stop there.

> you trust the vendor who sold you your CPU

Plenty of people will argue you can't ("blabla manufacturing blabla China blabla" and "blabla NSA blabla backdoor blabla")

Images are a false simplification

Posted Nov 6, 2025 2:46 UTC (Thu) by intelfx (subscriber, #130118) [Link]

> Plenty of people will argue you can't ("blabla manufacturing blabla China blabla" and "blabla NSA blabla backdoor blabla")

That's the point of the GP, which I believe you have missed.

If you don't trust your CPU vendor enough to believe that their root of trust implementation is not subverted by your malicious actor of choice, then why would you trust *anything* that comes out of that CPU against the same malicious actor? The only logical course of action would be to throw the CPU away immediately.

And if you haven't done that, then it necessarily follows that you *do* trust the CPU vendor, so it's fine if they implement a root of trust too.

Images are a false simplification

Posted Nov 6, 2025 8:10 UTC (Thu) by taladar (subscriber, #68407) [Link]

On the other hand, images are one step further away from the actual source of the code (as in the dev team, not the files with the lines), which means there is one more layer that knows and cares less about the details, one more layer to be outdated, one more layer you have to penetrate to figure out which open security issues exist, and one more layer you need to rebuild once a security issue is fixed.

Images have quite frankly left me totally unconvinced that those who build them do actually care about security issues enough to even check for open issues, much less rebuild them every single time one gets fixed.

What good is having the authentic image if the image contains a mere few hundred open security holes of various (but not just low) severity?

SELinux and containers are complementary

Posted Nov 6, 2025 7:54 UTC (Thu) by tomf (subscriber, #113110) [Link] (1 responses)

> But that is not how many of Linux's security mechanisms are built. SELinux, for instance, requires a specific policy for each daemon. Then, one might write the policy that forbids that daemon from creating device nodes. But that is much more fragile and difficult to maintain, he said. "It's much easier figuring out universal truths and enforcing them system-wide". To do that, components should be isolated into separate worlds.

I'm reminded of a quote, maybe from Dan Walsh, in which he says SELinux policy complexity reflects an underlying oversharing, and that the SELinux policy files for containers are simple precisely because containers define separate worlds.

Meanwhile, SELinux has prevented some container escapes. So I think of containers and SELinux as being complementary.

I see the author also wrote https://www.redhat.com/en/blog/selinux-mitigates-containe... :)

SELinux and containers are complementary

Posted Nov 6, 2025 7:59 UTC (Thu) by Cyberax (✭ supporter ✭, #52523) [Link]

SELinux policy reflects the inherently wrong approach it takes to security. It starts with a simple idea that you just need to be able to compare the label lists. It can be formally analyzed (it's just set operations after all).

But then it turns out that you need to be able to label everything. And propagate the labels. And then have escape hatches from all of that. So pretty much every complex Linux installation ends up with SELinux disabled; some vendors don't even bother with it. Amazon ships their Amazon Linux with SELinux disabled by default.

AppArmor offers comparable security but much simpler policies. Yet it has never gained any traction because it's not complex enough, apparently.

