user namespace and uid namespace

Posted Apr 1, 2011 1:19 UTC (Fri) by mfedyk (guest, #55303)
Parent article: The 2.6.39 merge window concludes

I'd like to point out that we had (or have) the uid namespace that went in around 2.6.27. Given the existence of the user namespace, it apparently didn't cover capabilities within the uid namespace (i.e., it left capabilities global to all namespaces).

I haven't been following this part of the namespace progression so I can only surmise what happened to the uid namespace.

Are there two namespaces now, uid and user with user ns being a superset of uid ns? Or was uid ns extended to cover capabilities within uid ns and renamed to user ns?

Now let me quote from the linked article:

> The user that creates the namespace will have all capabilities in that namespace, not just the set of capabilities they have in the parent. Essentially, the creator has the privileges of the root user in any namespace he or she creates.

Now since LXC doesn't have OpenVZ's simfs that lets you create a mountpoint based on any arbitrary directory, if you leave an unmodified distro in a container and you haven't used a separate filesystem (or btrfs subvolume), that distro can remount the host's filesystem as read-only (which typically happens just at the end of a halt or reboot inside the container).

One current workaround for this is to disable the CAP_SYS_ADMIN (or VXC_SECURE_MOUNT in Linux VServer[1]) capability. Since the allowed capabilities are reset fully open upon the creation of a new user namespace, how do you limit child namespaces from causing trouble on your host system and share a filesystem with LXC?

OpenVZ is great in this respect because you can have one filesystem with many containers on it without needing to use image files and loop mounts or lvm.

user namespace and uid namespace

Posted Apr 3, 2011 11:02 UTC (Sun) by ebiederm (subscriber, #35028)

The uid/user namespace that went in around 2.6.27 is the same one under discussion. Unfortunately the implementation was massively incomplete and did not handle any case where things from different user namespaces were mixed.

In particular, the user namespace is still moving in the direction of converting all of the checks from simple uid equality to comparing the tuple of user namespace and uid.
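To illustrate the difference between the two kinds of check, here is a toy Python model. It is only a sketch, not kernel code: the `UserNamespace` and `Cred` classes and both function names are hypothetical, made up for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UserNamespace:
    name: str  # hypothetical label; the kernel compares namespace identity


@dataclass(frozen=True)
class Cred:
    user_ns: UserNamespace
    uid: int


def same_owner_old(a: Cred, b: Cred) -> bool:
    # Old-style check: plain uid equality, blind to namespaces.
    return a.uid == b.uid


def same_owner_new(a: Cred, b: Cred) -> bool:
    # Namespace-aware check: compare the (user namespace, uid) tuple.
    return (a.user_ns, a.uid) == (b.user_ns, b.uid)


init_ns = UserNamespace("init")
container_ns = UserNamespace("container")

host_root = Cred(init_ns, 0)
container_root = Cred(container_ns, 0)
```

With the old check, uid 0 in the container is indistinguishable from uid 0 on the host; with the tuple comparison the two are different identities.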

As for the specific question about remounting a filesystem: the filesystem piece of the permission checks has yet to be updated.

The reason getting a full set of capabilities will be harmless is that it is actually equivalent to dropping all capabilities. The capabilities will only apply to objects and namespaces created after you create the user namespace. So once this is properly implemented, you simply won't be able to do anything dangerous, but you will be able to use facilities that today are root-only solely because suid root applications could otherwise be spoofed.
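A rough way to picture this is to treat every object as owned by some user namespace and to check capabilities relative to that owner. The Python sketch below is a simplified toy model under that assumption, not the kernel's implementation; `UserNamespace`, `Task`, `Obj`, and `ns_capable_model` are hypothetical names used only here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(eq=False)
class UserNamespace:
    name: str
    parent: Optional["UserNamespace"] = None
    creator_uid: Optional[int] = None  # uid, in the parent, that created it


@dataclass(eq=False)
class Task:
    user_ns: UserNamespace
    uid: int
    caps: frozenset = frozenset()


@dataclass(eq=False)
class Obj:
    name: str
    owner_ns: UserNamespace  # assumption: each object belongs to one namespace


def ns_capable_model(task: Task, target_ns: UserNamespace, cap: str) -> bool:
    # Walk from the target namespace up towards the initial namespace.
    ns = target_ns
    while ns is not None:
        if ns is task.user_ns:
            # Target is the task's own namespace or a descendant of it.
            return cap in task.caps
        if ns.parent is task.user_ns and ns.creator_uid == task.uid:
            # The creator of a namespace holds every capability inside it.
            return True
        ns = ns.parent
    # Target is outside the task's namespace subtree: no privilege at all.
    return False


init_ns = UserNamespace("init")
child_ns = UserNamespace("child", parent=init_ns, creator_uid=1000)

container_root = Task(child_ns, 0, frozenset({"CAP_SYS_ADMIN"}))
creator = Task(init_ns, 1000)

host_fs = Obj("host filesystem", init_ns)
inner_mount = Obj("mount made inside the namespace", child_ns)
```

In this model, `container_root` holds `CAP_SYS_ADMIN` over `inner_mount` but has nothing over `host_fs`, which is the sense in which a full capability set in a new namespace amounts to no capabilities on the host.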


Copyright © 2018, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds