Reworking User-Mode Linux
[Posted November 26, 2002 by corbet]
User-Mode Linux (UML) is Jeff Dike's "port" of the Linux kernel to itself;
a UML instance runs as a set of processes on a "real" Linux system. UML
has long been useful as a kernel development tool - it's nice to have a
development environment which can be tweaked with normal debuggers, and
which can crash without taking down the host system. In recent times,
there has been a growing level of interest in UML for virtual hosting and
honeypot applications as well. Users (or attackers) can be given root
access to a UML instance without, one hopes, endangering the host system.
UML has traditionally worked by running every UML process as a process on
the host system. The kernel lives up at the top of each process's address
space; transitions to and from "kernel mode" are handled with signals. The
problem with this mode of operation is that it is hard to make secure,
since the UML kernel's memory range is accessible to the processes it is
running. This mode is also slow, since it involves frequent memory
protection changes and signals.
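
The trade-off in the traditional mode can be illustrated with a deliberately simplified toy model. It is not UML code: the class and method names are hypothetical, signals are left out entirely, and it assumes that protecting the kernel region means flipping its protection on every entry to and exit from "kernel mode", a detail the description above does not spell out.

```python
class SharedAddressSpace:
    """Toy model: UML kernel and UML process share one host address space."""

    def __init__(self, protect_kernel):
        self.protect_kernel = protect_kernel
        # Without protection the kernel region is always reachable.
        self.kernel_accessible = not protect_kernel
        self.kernel_data = "kernel data"
        self.protection_changes = 0

    def _set_kernel_access(self, allowed):
        # Stand-in for one memory protection change on the host.
        self.kernel_accessible = allowed
        self.protection_changes += 1

    def read_kernel(self):
        if not self.kernel_accessible:
            raise PermissionError("kernel region is protected")
        return self.kernel_data

    def system_call(self):
        # Enter "kernel mode", touch kernel memory, return to user mode.
        if self.protect_kernel:
            self._set_kernel_access(True)
        result = self.read_kernel()
        if self.protect_kernel:
            self._set_kernel_access(False)
        return result


# Unprotected: code running as the UML process can read kernel memory.
leaky = SharedAddressSpace(protect_kernel=False)
leaked = leaky.read_kernel()

# Protected: the stray read fails, but every system call costs two changes.
guarded = SharedAddressSpace(protect_kernel=True)
try:
    guarded.read_kernel()
    blocked = False
except PermissionError:
    blocked = True
for _ in range(3):
    guarded.system_call()
```

In this model the only way to keep the kernel region out of reach is to pay for a pair of protection changes on each transition, which is the tension between security and speed described above.
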
So Jeff has released a patch which fixes
these problems by radically changing how UML works. In the new scheme, a
UML instance runs as exactly two processes on the host system. One is the
UML kernel, while the other takes turns running user-space processes. The
result is more secure (kernel space, being in a separate process, is now
completely inaccessible), and significantly faster as well. There is,
according to Jeff, only one disadvantage to the new way of doing things: it
can't actually be implemented on a stock Linux kernel. This is the sort of
nagging little problem that has been the downfall of many a great
development project.
The problem has to do with how the user-space process works. That process
needs to run each UML process in its own address space. In other words,
every time the UML kernel decides to switch to a new process, the
host-system process running the UML processes needs a whole new memory
management data structure. The Linux kernel does not currently have the
ability to switch a process's memory environment in this manner.
Jeff's solution is to create a magic file called /proc/mm.
Opening this file creates a new address space; that address space can be
modified by writing to the file. When the file is closed, the address
space is deleted. There is also a set of ptrace() extensions,
one of which allows the caller to change the address space of the traced
process. By using /proc/mm to create a separate address space for
each UML process, the UML kernel can give each of its processes its own
view of the world within a single host system process. Problem solved.
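
The lifecycle described above (open creates an address space, write modifies it, close deletes it, and a ptrace() extension switches the traced process between them) can be sketched as a small in-memory model. This is only an illustration of the semantics: it does not touch the real /proc/mm or ptrace(), and every class and method name here (`ProcMM`, `HostProcess`, `switch_mm`) is hypothetical rather than taken from the patch.

```python
import itertools


class AddressSpace:
    """Toy stand-in for a host memory-management context."""

    def __init__(self, ident):
        self.ident = ident
        self.mappings = {}  # virtual address -> contents


class ProcMM:
    """Models the open / write / close lifecycle described for /proc/mm."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.live = {}  # handle -> AddressSpace

    def open(self):
        # Opening the file creates a new, empty address space.
        space = AddressSpace(next(self._ids))
        self.live[space.ident] = space
        return space.ident

    def write(self, handle, address, contents):
        # Writing to the file modifies that address space.
        self.live[handle].mappings[address] = contents

    def close(self, handle):
        # Closing the file deletes the address space.
        del self.live[handle]


class HostProcess:
    """The single host process that takes turns running UML processes."""

    def __init__(self, proc_mm):
        self.proc_mm = proc_mm
        self.current = None

    def switch_mm(self, handle):
        # Stand-in for the ptrace() extension that changes the
        # address space of the traced process.
        self.current = self.proc_mm.live[handle]

    def read(self, address):
        return self.current.mappings.get(address)


proc_mm = ProcMM()
shell = proc_mm.open()
editor = proc_mm.open()
proc_mm.write(shell, 0x1000, "shell text")
proc_mm.write(editor, 0x1000, "editor text")

host = HostProcess(proc_mm)
host.switch_mm(shell)
seen_as_shell = host.read(0x1000)
host.switch_mm(editor)
seen_as_editor = host.read(0x1000)
proc_mm.close(shell)
```

The same address yields different contents depending on which address space is current, while only one host process is ever doing the reading.
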
It all looks like it works well. The /proc/mm approach may run
into some rough sailing on linux-kernel; a system call
implementation (or even /dev) might be better received. However
it is implemented, this new feature is exactly that: a new
feature. Adding new features into the virtual memory and process
management subsystems is exactly what is not supposed to happen during this
phase of 2.5 development.