Sysfs and namespaces

Posted Aug 28, 2008 9:26 UTC (Thu) by liljencrantz (guest, #28458)
Parent article: Sysfs and namespaces

Uninformed question:

Roughly how close are we to having fully working, usable namespaces in mainline kernel?

By «fully working, usable» I mean a setup where you can run multiple fake operating systems under the same actual kernel, each with its own init process and each running a different set of services. Basically everything you do today using Xen, but at a higher speed and with lower memory overhead, and without the option of running different kernel versions on different systems.

Sysfs and namespaces

Posted Aug 28, 2008 12:51 UTC (Thu) by danpb (subscriber, #4831) [Link]

We are working on supporting this in libvirt's "LXC" driver (LinuX Containers). This driver uses the clone() syscall along with the new CLONE_NEW{PID,UTS,USER,NS,IPC,NET} flags to create a container that is isolated from the "host" operating system.
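
As a rough illustration of the flags named above, here is a minimal Python sketch that builds the combined CLONE_NEW* mask. It is not libvirt's code. The flag values are the ones defined in <linux/sched.h>; the helper names namespace_mask and try_unshare are made up for this example. It also swaps clone() for unshare(2), which accepts the same flags and is easier to call through ctypes, although unshare() only learned some of these flags (CLONE_NEWPID, for one) in much later kernels.

```python
import ctypes

# Flag values from <linux/sched.h>; they are part of the kernel ABI.
CLONE_FLAGS = {
    "NS":   0x00020000,  # mount namespace
    "UTS":  0x04000000,  # hostname and domain name
    "IPC":  0x08000000,  # System V IPC objects
    "USER": 0x10000000,  # user and group IDs
    "PID":  0x20000000,  # process IDs
    "NET":  0x40000000,  # network stack
}


def namespace_mask(names):
    """OR together the CLONE_NEW* flags for the given short names."""
    mask = 0
    for name in names:
        mask |= CLONE_FLAGS[name]
    return mask


def try_unshare(mask):
    """Hypothetical helper: call unshare(2), return 0 or an errno value.

    Assumes a Linux libc; most flags need CAP_SYS_ADMIN, so expect EPERM
    when running unprivileged.
    """
    libc = ctypes.CDLL(None, use_errno=True)
    if libc.unshare(ctypes.c_int(mask)) == 0:
        return 0
    return ctypes.get_errno()


if __name__ == "__main__":
    mask = namespace_mask(["PID", "UTS", "USER", "NS", "IPC", "NET"])
    print(hex(mask))  # prints 0x7c020000
```

Calling try_unshare(CLONE_FLAGS["UTS"]) as root would move only the calling process into a fresh UTS namespace; a real container launcher would pass the mask to clone() so that the child starts life inside the new namespaces.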

There are roughly two ways of using this capability:

- Workload isolation for applications. The application shares the same root filesystem as the host, perhaps with a few extra mount points and custom networking.

- Security isolation for applications. The application has a totally isolated private root filesystem, custom networking, etc - nothing is shared with the host OS.

As of 2.6.26, only the workload isolation use case is usable. Well, that's not quite true: we can do the private root filesystem too, but it is not secure because we're still lacking some kernel capabilities. For workload management we will be integrating with cgroups to control CPU, memory, and other resource limits.

For the security isolation use case to be usable in the real world, the sysfs namespace patch is one of the core missing pieces. The second is a device namespace, so that the nodes in /dev/ and /dev/pts inside the container are separated from those of the host OS. It is not clear when this latter capability is going to appear. If it appears before 2.6.29 I'd be surprised...

Sysfs and namespaces

Posted Aug 28, 2008 16:20 UTC (Thu) by iabervon (subscriber, #722) [Link]

Note that there's a different variation that might be useful (and might be complete either before or after that): being able to have different users see a partially different system. For example, giving each non-root user a different /tmp directory (subdirectories of the real /tmp). It would also be possible to have a single machine with multiple heads, where each of these would appear as the only (or, at least, main) head; if you plug a USB mouse into the USB hub built into your monitor, it controls your pointer and not anybody else's, for example, and you own the auto-mount of the USB memory stick you plug in. And it might be nice to be able to have a developer on a shared system able to run an instance of postgres that seems to that user to be system-wide, but is actually private, without the postgres processes able to tell that they're not system-wide.
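
The per-user /tmp idea can be sketched with a private mount namespace plus a bind mount. This is only an illustration under stated assumptions: the /tmp/<user> layout and the names private_tmp_for and enter_private_tmp are invented here, the constants come from <linux/sched.h> and <sys/mount.h>, and the mounting step needs CAP_SYS_ADMIN. A real setup would also have to chown the directory to the user and run this at login.

```python
import ctypes
import os

CLONE_NEWNS = 0x00020000  # new mount namespace, from <linux/sched.h>
MS_BIND = 0x1000          # bind mount, from <sys/mount.h>
MS_REC = 0x4000           # apply recursively
MS_PRIVATE = 1 << 18      # stop mount events propagating to other namespaces


def private_tmp_for(user, base="/tmp"):
    """Hypothetical layout: each user's private directory is base/<user>."""
    if not user or "/" in user or user in (".", ".."):
        raise ValueError("unsafe user name: %r" % (user,))
    return os.path.join(base, user)


def enter_private_tmp(user):
    """Give the calling process its own view of /tmp (needs privilege)."""
    libc = ctypes.CDLL(None, use_errno=True)
    libc.mount.argtypes = [ctypes.c_char_p, ctypes.c_char_p,
                           ctypes.c_char_p, ctypes.c_ulong, ctypes.c_void_p]
    target = private_tmp_for(user)
    os.makedirs(target, mode=0o700, exist_ok=True)
    if libc.unshare(CLONE_NEWNS) != 0:
        raise OSError(ctypes.get_errno(), "unshare(CLONE_NEWNS) failed")
    # Keep the bind mount below from leaking back into the host namespace.
    if libc.mount(None, b"/", None, MS_REC | MS_PRIVATE, None) != 0:
        raise OSError(ctypes.get_errno(), "making / private failed")
    # From here on, /tmp in this namespace is really /tmp/<user>.
    if libc.mount(target.encode(), b"/tmp", None, MS_BIND, None) != 0:
        raise OSError(ctypes.get_errno(), "bind mount on /tmp failed")


if __name__ == "__main__":
    print(private_tmp_for("alice"))  # prints /tmp/alice
```

Processes started after enter_private_tmp() inherit the namespace, so they see the private directory as /tmp while everyone else still sees the real one.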

Sysfs and namespaces

Posted Aug 28, 2008 18:01 UTC (Thu) by ebiederm (subscriber, #35028) [Link]

From a high level, what remains looks something like:
- The last couple of bugs with signal handling and init
fixed in the pid namespace

- sysfs

- The uid namespace

If you are someone who can accept less than perfection, you can build
a better chroot today.

I'm hoping that once the current round of changes settles out we
can get a chroot-like tool out to people so non-experts can
start using this code.

The short-term goal is not to be a Xen replacement but to correctly
implement the namespaces we have and to do something useful, which
basically amounts to building a better chroot and starting to reduce
the differences between Vserver and OpenVZ.


Sysfs and namespaces

Posted Sep 3, 2008 18:37 UTC (Wed) by jlokier (guest, #52227) [Link]

I find myself wondering if these containers are nestable.

That is, the whole reason we need any virtualisation is that applications (whole working systems) expect something which strongly resembles a single Linux box. Virtualisation provides that illusion, while isolating the application.

In the old days, it was enough to use 'processes' and 'directories' :-)
But applications grew, and did cleverer things like configure their own firewalls and virtual networks, and decided they really depend on a thing which looks strongly like a single Linux box.

Pretty soon, someone is going to decide that these containers are really neat, that you can put Apache in one, DNS in another, SMTP in another, etc., and build whole working systems like that.

Then someone else is going to want to take that working system, and run _that_ in a container... Will it work? Will the containers nest?

Sysfs and namespaces

Posted Sep 4, 2008 18:06 UTC (Thu) by adobriyan (guest, #30858) [Link]

It should, in theory, work and nest.

Sysfs and namespaces

Posted Sep 4, 2008 20:18 UTC (Thu) by ebiederm (subscriber, #35028) [Link]

Yes. The in-kernel solutions are nestable. The out-of-tree solutions like OpenVZ and Vserver appear to have architectural limits that keep them from nesting today.

