Race-free process creation in the GNU C Library
Posted Sep 1, 2023 19:27 UTC (Fri) by mb (subscriber, #50428)
In reply to: Race-free process creation in the GNU C Library by bluca
Parent article: Race-free process creation in the GNU C Library
One additional nail in the coffin of unprivileged containers?
>The way polkit/dbus
I'm talking about the fundamental pidfd API. Any process could use pidfds.
Posted Sep 1, 2023 19:35 UTC (Fri)
by bluca (subscriber, #118303)
[Link] (16 responses)
I'm pretty sure those can have /proc too?
$ id -u
> I'm talking about the fundamental pidfd API. Any process could use pidfds.
Sure, to do process tracking. But what kind of process would you need to track in a chroot? Besides, it's all moot: this is not glibc's fault. The kernel provides this interface, so that's what glibc can use to provide an abstraction.
Posted Sep 1, 2023 19:36 UTC (Fri)
by bluca (subscriber, #118303)
[Link]
Posted Sep 1, 2023 20:57 UTC (Fri)
by pbonzini (subscriber, #60935)
[Link] (11 responses)
Any process that wants to spawn a process and use pidfd, but also write the pid in a log file or debug trace? Ignoring portability for a second, it could even be something like make or cargo.
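To make the dependency concrete, here is a minimal sketch of how a PID can be recovered from a pidfd by reading the "Pid:" field of /proc/self/fdinfo/<fd>. It only illustrates the mechanism and is not glibc's implementation; the function names and the `proc_root` parameter are hypothetical, and the sketch assumes the fdinfo text uses tab-separated "key:value" lines.

```python
from typing import Optional


def parse_pidfd_fdinfo(text: str) -> Optional[int]:
    """Return the value of the "Pid:" field from fdinfo text, or None if absent."""
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        # Match the key exactly so that "NSpid:" is not mistaken for "Pid:".
        if sep and key.strip() == "Pid":
            return int(value.strip())
    return None


def pid_from_pidfd(pidfd: int, proc_root: str = "/proc") -> Optional[int]:
    """Look up the PID behind a pidfd; None when procfs cannot be read."""
    try:
        with open(f"{proc_root}/self/fdinfo/{pidfd}") as f:
            return parse_pidfd_fdinfo(f.read())
    except OSError:
        # For example, a chroot in which /proc is not mounted.
        return None
```

Without a mounted procfs the lookup simply has nowhere to read from, which is the failure mode at issue here. A caller should also treat non-positive values as "no usable PID" rather than logging them as-is.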
Posted Sep 1, 2023 21:19 UTC (Fri)
by bluca (subscriber, #118303)
[Link] (10 responses)
Posted Sep 1, 2023 21:30 UTC (Fri)
by pbonzini (subscriber, #60935)
[Link] (9 responses)
Posted Sep 1, 2023 23:23 UTC (Fri)
by bluca (subscriber, #118303)
[Link] (4 responses)
Posted Sep 1, 2023 23:46 UTC (Fri)
by josh (subscriber, #17465)
[Link] (3 responses)
Posted Sep 2, 2023 0:43 UTC (Sat)
by bluca (subscriber, #118303)
[Link] (2 responses)
Posted Sep 2, 2023 1:08 UTC (Sat)
by josh (subscriber, #17465)
[Link] (1 responses)
(That operation would still be useful when passed a pidfd from elsewhere, but not *necessary* for the common case where you got the pidfd by creating a process.)
Posted Sep 2, 2023 1:37 UTC (Sat)
by bluca (subscriber, #118303)
[Link]
Posted Sep 3, 2023 4:14 UTC (Sun)
by IanKelling (subscriber, #89418)
[Link] (3 responses)
I don't think it is hypothetical. From my sysadmin perspective, I often build software in a chroot without a /proc mount. Very rarely, the build has needed it and I wanted to know why. Bind mounting /proc, I see find shows 546,160 user-listable files and 304,803 user-readable files. Making that a requirement for creating processes, just because you opt in to an API that avoids a race condition, would be more or less a regression in my book.
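A rough way to reproduce that kind of count without find is sketched below. The function name is hypothetical, and "readable" is taken to mean that os.access() reports read permission for the calling user, which may not match the exact find invocation used above.

```python
import os
from typing import Tuple


def count_files(root: str) -> Tuple[int, int]:
    """Return (listed, readable) counts for non-directory entries under root."""
    listed = readable = 0
    # os.walk silently skips directories that cannot be listed (onerror=None).
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            listed += 1
            # os.access follows symlinks; a dangling link counts as unreadable.
            if os.access(os.path.join(dirpath, name), os.R_OK):
                readable += 1
    return listed, readable
```

Calling count_files("/proc") inside the chroot gives a sense of how much becomes visible once procfs is mounted. The numbers fluctuate between runs because processes come and go while the tree is being walked.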
Posted Sep 3, 2023 10:26 UTC (Sun)
by bluca (subscriber, #118303)
[Link] (2 responses)
Posted Sep 4, 2023 9:16 UTC (Mon)
by taladar (subscriber, #68407)
[Link] (1 responses)
Posted Sep 4, 2023 9:53 UTC (Mon)
by bluca (subscriber, #118303)
[Link]
Posted Sep 1, 2023 22:07 UTC (Fri)
by geofft (subscriber, #59789)
[Link] (2 responses)
Meanwhile, the kernel has a feature where, if your current /proc is in any way overmounted, you're not allowed to mount a new /proc, because that would give you access to the files that are supposed to be hidden from you. This is also, in isolation, an understandable / defensible feature.
The intersection of these features is that you can't correctly mount /proc inside a nested container or container-like thing inside a non-privileged Kubernetes container. If you make a new pidns (either because you're root or via a new userns, as in your example), all the paths in /proc are wrong because they refer to outer PIDs.
(The intersection of these features also ceases to be really defensible in the case where you don't allow your Kubernetes workloads to run as UID 0, which is a really good idea on its own.)
There have been some patches for a second procfs (whose exact name I'm forgetting) that provides /proc/$pid/ and the /proc/self/ symlink but not anything else in /proc, but I don't think they've been merged. If those could get merged and guaranteed mountable by anyone with CAP_SYS_ADMIN in their current namespace, regardless of what the existing /proc outside it looks like or even whether it exists, that would satisfactorily address the issue.
I suppose another option would be for /proc to always enumerate the calling process's PID namespace, but maybe that gets weird with open file descriptors passed between PID namespaces.
Posted Sep 1, 2023 22:28 UTC (Fri)
by bluca (subscriber, #118303)
[Link] (1 responses)
Posted Sep 2, 2023 1:56 UTC (Sat)
by cyphar (subscriber, #110703)
[Link]
In fact this also means you can bypass the check entirely -- if you have a "safe" subset=pid mount in your namespace, the kernel will allow you to mount an unmasked (fully-fledged) procfs.
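To see which kind of procfs a namespace currently has, one option is to inspect /proc/self/mountinfo. The sketch below assumes that the option (spelled `subset=pid` in proc(5)) shows up in the per-superblock options, the last field of each mountinfo line. The function name is hypothetical, and octal escapes such as \040 in mount paths are not decoded.

```python
from typing import List, Tuple


def proc_mounts(mountinfo: str) -> List[Tuple[str, bool]]:
    """Return (mount_point, is_pid_subset) for each procfs line in mountinfo text."""
    result = []
    for line in mountinfo.splitlines():
        # Optional fields end at a lone "-"; fstype, source and super options follow.
        left, sep, right = line.partition(" - ")
        if not sep:
            continue
        left_fields = left.split()
        right_fields = right.split()
        if len(left_fields) < 5 or len(right_fields) < 3:
            continue
        fstype, super_opts = right_fields[0], right_fields[2]
        if fstype != "proc":
            continue
        # Field 5 (index 4) is the mount point.
        result.append((left_fields[4], "subset=pid" in super_opts.split(",")))
    return result
```

Feeding it the contents of /proc/self/mountinfo lists every procfs instance visible in the mount namespace and flags the restricted ones. Whether the kernel then permits a full procfs mount is decided by the kernel, not by this check.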
1000
$ unshare -U -m --mount-proc -p -f
$ mount | grep img
proc on /tmp/img type proc (rw,nosuid,nodev,noexec,relatime)