Systemd heads for a big round-number release
Posted May 7, 2024 20:42 UTC (Tue) by intelfx (subscriber, #130118) In reply to: Systemd heads for a big round-number release by bluca
Parent article: Systemd heads for a big round-number release
You mean "disk-image" container managers? Not sure what to call it, but I'm pretty sure podman is already truly unprivileged...
Anyway, I can see this could be pretty useful but also dangerous because it would allow the kernel to trip over a potentially malicious filesystem image. I'm assuming polkit/verity integration is there for this exact reason, with polkit covering the workstation use-case and verity covering the "container fleet" use-case?
Posted May 7, 2024 21:17 UTC (Tue)
by bluca (subscriber, #118303)
[Link] (14 responses)
Nope, uses setuid binaries
> Anyway, I can see this could be pretty useful but also dangerous because it would allow the kernel to trip over a potentially malicious filesystem image. I'm assuming polkit/verity integration is there for this exact reason, with polkit covering the workstation use-case and verity covering the "container fleet" use-case?
Yes, pretty much
Posted May 8, 2024 1:01 UTC (Wed)
by intelfx (subscriber, #130118)
[Link] (1 responses)
Okay, what am I missing here?
# find /usr -perm /u+s,g+s -print0 | parallel -0 -X pacman -Qqo | sort -u | grep -Fxf <(pactree --linear --unique podman)
Posted May 8, 2024 15:00 UTC (Wed)
by pbonzini (subscriber, #60935)
[Link]
Posted May 8, 2024 1:07 UTC (Wed)
by geofft (subscriber, #59789)
[Link] (11 responses)
So, for instance, this is more unprivileged than anything that requires a new version of systemd to be running as root: an unprivileged user will often be on some system that sets up subuids/subgids for them and has the setuid helpers because that's what the distro does by default, but the system is running some existing stable release of a distro. (Also, all of this only works if unprivileged user namespaces are enabled, which some distros are clamping down on.)
It is possible to get podman to work without the setuid binaries at all, though - assuming you don't have subuids set up, you can just do podman run --uidmap 0:0:1, and it will realize it doesn't need to map things. (If you do have /etc/subuid and /etc/subgid files but newuidmap/newgidmap aren't actually setuid, e.g. because you're in some no-new-privs mode or using strace or whatever, it's currently a little bit annoying but doable. I got this working with suitable fake newuidmap and newgidmap commands.)
But also, this is a sort of different thing from what systemd-nspawn has gained a privileged helper for. There isn't anything that handles UID/GID mapping in this setup, is there? So the limitation, which would apply to both systemd-nspawn and fully unprivileged podman, is that you only get a single UID/GID inside the container. Which is often fine but not always.
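To make the single-UID limitation concrete, here is a minimal sketch of how a uid_map table (lines of "inside outside count", the format of /proc/<pid>/uid_map) translates IDs. The uid 1000 for the caller and the range starting at 100000 are assumptions for illustration, and the function names are hypothetical rather than anything podman or nspawn provides.

```python
OVERFLOW_UID = 65534  # kernel default, see /proc/sys/kernel/overflowuid


def parse_uid_map(text):
    """Parse uid_map-style lines: '<inside> <outside> <count>'."""
    entries = []
    for line in text.splitlines():
        if line.strip():
            inside, outside, count = (int(field) for field in line.split())
            entries.append((inside, outside, count))
    return entries


def to_outside(entries, inside_uid):
    """Return the outer uid for a uid inside the namespace, or None if unmapped."""
    for inside, outside, count in entries:
        if inside <= inside_uid < inside + count:
            return outside + (inside_uid - inside)
    return None


def to_inside(entries, outside_uid):
    """Return how an outer uid appears inside the namespace."""
    for inside, outside, count in entries:
        if outside <= outside_uid < outside + count:
            return inside + (outside_uid - outside)
    return OVERFLOW_UID  # unmapped ids show up as the overflow uid


# Single-UID mode: container uid 0 is the calling user (uid 1000 assumed).
single = parse_uid_map("0 1000 1\n")
# Multi-UID mode: container uids 1..65536 also map onto an assumed subuid range.
multi = parse_uid_map("0 1000 1\n1 100000 65536\n")
```

With the single-entry map, only container uid 0 corresponds to anything outside, so a process inside cannot switch to a second usable uid. The two-entry map is the kind of table that needs a privileged helper to install.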
The privileged helper for systemd-nspawn is, as I understand it, an IPC interface to get a mounted root filesystem for the container. podman does not need privilege for that - but it does need FUSE accessible to unprivileged users, which is another common but not guaranteed configuration. (Or, I think, it can just copy a bunch of files with the default "vfs" driver.) I'm curious if you considered a FUSE approach of some sort to allow using untrusted images.
Posted May 8, 2024 1:12 UTC (Wed)
by intelfx (subscriber, #130118)
[Link]
Posted May 8, 2024 1:23 UTC (Wed)
by geofft (subscriber, #59789)
[Link] (5 responses)
Posted May 8, 2024 9:52 UTC (Wed)
by bluca (subscriber, #118303)
[Link] (4 responses)
https://www.freedesktop.org/software/systemd/man/devel/sy...
Posted May 8, 2024 14:11 UTC (Wed)
by intelfx (subscriber, #130118)
[Link] (3 responses)
Posted May 8, 2024 14:28 UTC (Wed)
by bluca (subscriber, #118303)
[Link] (1 responses)
Posted May 8, 2024 14:34 UTC (Wed)
by daroc (editor, #160859)
[Link]
Posted May 8, 2024 14:35 UTC (Wed)
by paulj (subscriber, #341)
[Link]
Posted May 8, 2024 9:55 UTC (Wed)
by bluca (subscriber, #118303)
[Link] (2 responses)
Posted May 8, 2024 13:47 UTC (Wed)
by rbranco (subscriber, #129813)
[Link] (1 responses)
Posted May 8, 2024 13:53 UTC (Wed)
by bluca (subscriber, #118303)
[Link]
Posted May 8, 2024 21:55 UTC (Wed)
by paravoid (subscriber, #32869)
[Link]
I think you're talking about the fuse-overlayfs driver. That's actually not needed anymore: the kernel's overlayfs these days* supports user namespaces, and can thus be used without root (assuming unprivileged user namespaces are enabled of course). So FUSE is not required anymore.
*: Linux >= 5.11
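A rough way to check for this from a script is to compare the running kernel's release string against 5.11. This is only a heuristic sketch: the function names are hypothetical, and it assumes the distribution kernel has neither backported nor disabled the feature, which a version number alone cannot tell you.

```python
import re


def parse_kernel_release(release):
    """Extract (major, minor) from a release string such as '6.8.0-31-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def overlayfs_in_userns_expected(release):
    """True if the release is at least 5.11 (version heuristic only)."""
    return parse_kernel_release(release) >= (5, 11)
```

On a live system the input would typically come from `platform.release()` in the standard library.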
Posted May 7, 2024 23:05 UTC (Tue)
by smcv (subscriber, #53363)
[Link] (4 responses)
Sadly, no. If it was, it would have the same limitations as bubblewrap, which can only provide two user IDs (the one that is mapped to the caller's uid, and the kernel's overflow uid, which you can think of as "me" and "not me" respectively) and the analogous situation for group IDs.
This is a kernel-imposed restriction, because the kernel doesn't know that it's OK for me (uid 1000, say) to run arbitrary code as some mapped uid (uid 100000, say). That policy is provided by /etc/subuid and /etc/subgid, which are read by setuid programs like newuidmap but are not special to the kernel.
That's usually fine for a typical bubblewrap use-case like Flatpak that only wants to run a single app, with the wider system protected from the app, but no further privilege separation within it; but it would not be enough for a whole-system container manager like Podman or Incus that wants to distinguish between various different uids inside the container, or perhaps even run a whole OS from init upwards.
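As an illustration of the policy check described above, the following sketch parses subuid-style lines ("owner:start:count") and decides whether a requested range falls inside what has been delegated to a user. It is a simplified model, not newuidmap's actual code: the function names and the sample entry for "alice" are assumptions, and the real helper also handles cases such as mapping the caller's own uid.

```python
def parse_subid(text):
    """Parse subuid/subgid-style lines of the form 'owner:start:count'."""
    ranges = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        owner, start, count = line.split(":")
        ranges.setdefault(owner, []).append((int(start), int(count)))
    return ranges


def may_map(ranges, owner, first_uid, count):
    """True if [first_uid, first_uid + count) lies inside one range delegated to owner."""
    return any(
        start <= first_uid and first_uid + count <= start + size
        for start, size in ranges.get(owner, [])
    )


# Hypothetical delegation: alice may use uids 100000..165535.
SAMPLE = "alice:100000:65536\n"
```

The kernel itself never reads this file; it only sees a privileged process writing the map, which is why the check has to live in a setuid helper or some other privileged service.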
Posted May 8, 2024 1:04 UTC (Wed)
by intelfx (subscriber, #130118)
[Link]
Ah, I see. I did not know it worked this way.
Posted May 8, 2024 1:21 UTC (Wed)
by geofft (subscriber, #59789)
[Link]
(on a side note, thank you for maintaining bubblewrap, it's awesome)
Posted May 8, 2024 4:43 UTC (Wed)
by josh (subscriber, #17465)
[Link] (1 responses)
This was the problem I attempted to solve many years ago with "supplementary UIDs", which would have allowed the login mechanism to give the user access to a range of additional UIDs that it could use as it saw fit. It used a "setusers" syscall, analogous to "setgroups".
Unfortunately, in the course of developing this, the theoretical possibility came up of a file that gives more permission to "group" or "other" than to "user", so the ability to drop a UID from your current identity was considered a security issue. (The observation that supplementary GIDs would already allow that as well led to a security bugfix to prevent that, which is why container GID management has the extra "setgroups" hoop to jump through.)
Because of that, I gave up on the patch. I would love to see someone revive it; it would be really useful for containers.
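The permission concern above can be shown with a simplified model of the classic Unix access check, in which exactly one class (owner, group, or other) is consulted. This sketch ignores ACLs, capabilities, and everything else the kernel really considers; the function name and the sample IDs are assumptions for illustration.

```python
def may_read(mode, file_uid, file_gid, uid, gids):
    """Classic Unix read check: only the first matching class is consulted."""
    if uid == file_uid:
        return bool(mode & 0o400)  # owner bits only, even if group/other allow more
    if file_gid in gids:
        return bool(mode & 0o040)  # group bits only
    return bool(mode & 0o004)      # everyone else


# A file with mode ----r--r-- owned by uid 1000 denies its owner...
owner_denied = may_read(0o044, 1000, 1000, 1000, {1000})
# ...but the same process under a different uid falls through to "other".
other_allowed = may_read(0o044, 1000, 1000, 100000, set())
```

The same pattern applies to groups: with mode 0o604 and the file's group among the caller's groups, access is denied, while dropping that group lets the "other" bits apply.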
Posted May 11, 2024 11:38 UTC (Sat)
by ringerc (subscriber, #3071)
[Link]
dbus
krb5
pam
shadow
util-linux
podman uses the newuidmap/newgidmap commands to get access to a range of subuids. There are cases where this is close enough to "unprivileged," because those commands can come from the OS and have a pretty stable/minimal interface, and an unprivileged user can install whatever version of podman they like via whatever means (e.g. building from source) and it will work with the existing newuidmap/newgidmap binaries - nothing in podman itself needs to be setuid.
podman unprivileged (Systemd heads for a big round-number release)
And this is an IPC API, so anything else can make use of it, if it is enabled on the system, not just nspawn.
Well, bubblewrap's manpage explicitly mentions using those tools:
So I don't think that this is really a limitation of either podman or bubblewrap. By themselves they can only put you in single-UID mode (see my other comment for how to do this with podman); with the setuid helpers, they can both support multi-UID mode. podman defaults to calling the helpers and bubblewrap won't do it on its own, but that's more because of what the two projects are trying to be.