Filesystem images and unprivileged containers
Posted Sep 15, 2016 7:38 UTC (Thu) by TheJH (subscriber, #101155)
Parent article: Filesystem images and unprivileged containers
No. That ID maps to some UID outside the container, but *not* the "nobody" user. Just some UID that is not special in any way except that it's not listed in /etc/passwd, but instead (if you're using unprivileged containers) in a range in /etc/subuid.
There is a situation with user namespaces where you see "nobody" users, but that's when you are *inside* the namespace and looking at stuff outside that has a UID that isn't mapped into the namespace, not the other way around.
> It has effectively bet its cloud business on user namespaces.
That's pretty bold.
> That would mean there would be a "double shift" of IDs: once from the namespace to kernel and then from the kernel to the filesystem view.
That's how everything(*) works with user namespaces. As soon as a UID crosses the kernel-userspace boundary, it is mapped. The same thing happens if you use e.g. SCM_CREDENTIALS: the sender's values are mapped to the kernel view when sent and mapped back to the user view when received.
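The double mapping can be sketched as two table lookups. This is a simplified Python model under stated assumptions, not the kernel's implementation: the maps are (inside start, kernel start, count) tuples, 65534 stands in for the overflow UID, every name here is hypothetical, and raising an error for an unmapped sender UID is a choice of this sketch.

```python
# Simplified model of the "double shift"; not the kernel implementation.
OVERFLOW_UID = 65534  # assumed overflow UID, usually shown as "nobody"


def ns_to_kernel(extents, uid):
    """First shift: a namespace-local UID becomes a kernel-internal UID."""
    for ns_start, kernel_start, count in extents:
        if ns_start <= uid < ns_start + count:
            return kernel_start + (uid - ns_start)
    return None  # no kernel-internal UID corresponds to this value


def kernel_to_ns(extents, kuid):
    """Second shift: a kernel-internal UID as seen from a namespace."""
    for ns_start, kernel_start, count in extents:
        if kernel_start <= kuid < kernel_start + count:
            return ns_start + (kuid - kernel_start)
    return OVERFLOW_UID  # unmapped in the observing namespace


def pass_credentials(sender_map, receiver_map, sender_uid):
    """Model a UID handed from one namespace to another via the kernel."""
    kuid = ns_to_kernel(sender_map, sender_uid)
    if kuid is None:
        # Model choice: refuse values the sender's namespace cannot express.
        raise ValueError("UID is not mapped in the sender's namespace")
    return kernel_to_ns(receiver_map, kuid)


# Hypothetical maps: two containers with disjoint ranges, plus the init
# namespace with an identity mapping.
container_a = [(0, 100000, 65536)]
container_b = [(0, 200000, 65536)]
init_ns = [(0, 0, 4294967295)]

print(pass_credentials(container_a, init_ns, 0))      # host view of A's root
print(pass_credentials(container_a, container_a, 0))  # round trip inside A
print(pass_credentials(container_a, container_b, 0))  # A's root seen from B
```

With these made-up ranges, a receiver in the init namespace sees container A's root as 100000, a receiver in the same container sees 0 again, and a receiver in container B, which has no mapping for that kernel-internal UID, sees the overflow UID.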
> Shiftfs is effectively an in-kernel bindfs that uses ID ranges to avoid that problem in bindfs.
My main concern with shiftfs is that, as far as I understand, this will reduce the boundary between containers and even between a container and the init namespace. With bindfs, the backing filesystems for all containers will use the same UID ranges, right? So a container root could store a setuid container-uid-0 binary inside the container and thereby create a setuid shared-container-root-uid binary on the host system? Even if that binary is inside a uid-0-owned, mode-0700 directory, it still seems a bit brittle.
*: Except signal sending, and that's ugly and slightly buggy. kill() takes the kuid, maps it down, then userns_fixup_signal_uid() maps it back up again and then back down into another namespace.
