Kexec
[Posted November 13, 2002 by corbet]
One of the remaining features that may yet get merged is the "Kexec" patch
by Eric Biederman. This patch performs what may seem to be a
straightforward task - it reboots the system directly into a new kernel.
Things are not always as simple as they seem, however, and this patch has
been through an extended period of reworking on its way toward (probable)
inclusion.
One might wonder what the use of Kexec is, given that people have somehow
managed to reboot their systems for years now. Kexec differs from a normal
reboot in that the old kernel loads the new one, and jumps to it,
directly. There is no need to reset the hardware and go through the whole
BIOS startup routine. So, reboots are faster and, perhaps, more reliable.
There is also an obvious advantage for kernel developers, who can simply
say "boot that image" without having to tell a boot loader (such as LILO)
about it first.
Rebooting on the fly in this manner is not an entirely easy thing to do.
The new kernel, after all, probably wants to sit in the same part of memory
as the current one. So the new kernel cannot be put into its real place
until the old kernel has finished shutting down gracefully. But, by that
point, the old kernel is no longer in a position to load the new one from
user space, or from anywhere else.
So the Kexec code has to start by buffering a copy of the new kernel
somewhere else in memory. When user space indicates that it has a new
kernel to boot, the Kexec code allocates a big pile of memory pages to hold
the kernel code. These pages are scattered through (non-high) memory, so
the buffered image is not contiguous or otherwise ready to execute.
Also allocated along with the memory for the kernel code is the "reboot
code buffer." This buffer is typically just a single page. When the time
comes to boot into the new kernel, the Kexec code does the following:
- Shuts down the kernel, and tries to reset devices to a known state.
The code does not unmount filesystems, kill processes, etc.; that work
is expected to have been done by user space prior to the reboot call.
- Copies a small bit of assembly code into the reboot code buffer. This
code's job is to take the set of pages holding the new kernel and copy
them into their real destination - typically overwriting the old
kernel.
- Jumps (via a return, actually) into the new kernel.
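The copy step above can be sketched in ordinary user-space C. This is only an illustration of the idea, not code from the patch: the staged_page structure, the relocate_pages() function, and the SKETCH_PAGE_SIZE constant are all hypothetical names, and the real work is done by a small piece of assembly running from the reboot code buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096   /* assumed page size for this sketch */

/* Hypothetical record: one buffered page and where it finally belongs. */
struct staged_page {
    const unsigned char *src;   /* where the page was buffered */
    size_t dest_offset;         /* page-aligned offset in the final image */
};

/*
 * Copy every buffered page to its real destination. In the kernel the
 * destination typically overlaps the old kernel, which is why the copy
 * must run from a separate buffer; here "dest" is just a plain array.
 */
static void relocate_pages(unsigned char *dest, size_t dest_size,
                           const struct staged_page *pages, size_t npages)
{
    for (size_t i = 0; i < npages; i++) {
        assert(pages[i].dest_offset % SKETCH_PAGE_SIZE == 0);
        assert(pages[i].dest_offset + SKETCH_PAGE_SIZE <= dest_size);
        memcpy(dest + pages[i].dest_offset, pages[i].src, SKETCH_PAGE_SIZE);
    }
}
```

The pages may be listed in any order and may live anywhere in memory; only the recorded destination offset determines where each one ends up.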
The original Kexec patch created a kexec() system call which would
load the new kernel image as described above, and immediately reboot into
that image. That approach, however, wasn't quite what Linus had in mind,
even though Linus likes the Kexec idea in general. Why not, asked Linus,
split up the operations of loading the new kernel and rebooting into it?
The reasoning for splitting these operations has mostly to do with other
possible uses for Kexec. For example, one can imagine all kinds of things
that could be done when the kernel panics: boot into a debugger or crash
dump generator, or just bring up that old 2.2 kernel that always worked.
The problem is that, when the system has gone into a panic, you really do
not want it digging around in the filesystem looking for an image to boot;
that needs to have been set up ahead of time. And the only way to do that
is to split the load and reboot steps.
So the current patch has a kexec_load() system call which loads a
kernel image into memory. Then, a new LINUX_REBOOT_CMD_KEXEC command
for the existing reboot() call finishes the task. This version of Kexec
still does not handle the panic case, but it has most of the
infrastructure needed to do that.
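The value of the two-step split can be shown with a small user-space model. Nothing here is the patch's actual interface: sketch_load() merely stands in for kexec_load(), sketch_reboot_kexec() stands in for reboot() with LINUX_REBOOT_CMD_KEXEC, and the structure and error codes are invented for the illustration.

```c
#include <stddef.h>

/* Hypothetical return codes for this sketch only. */
enum { SKETCH_OK = 0, SKETCH_NO_IMAGE = -1 };

/* Hypothetical record of the image staged by the load step. */
struct staged_kernel {
    const void *image;
    size_t      size;
};

static struct staged_kernel staged;   /* zero-initialized: nothing staged */

/* Step 1 (stands in for kexec_load()): remember the image for later. */
static int sketch_load(const void *image, size_t size)
{
    if (image == NULL || size == 0)
        return SKETCH_NO_IMAGE;
    staged.image = image;
    staged.size = size;
    return SKETCH_OK;
}

/*
 * Step 2 (stands in for the reboot command): needs no file access at all,
 * only the image staged earlier. That is what would make it usable from
 * a context such as a panic.
 */
static int sketch_reboot_kexec(void)
{
    if (staged.image == NULL)
        return SKETCH_NO_IMAGE;
    /* A real kernel would shut down devices and jump to the image here. */
    return SKETCH_OK;
}
```

Because the second step touches only memory that was set up ahead of time, it never has to go digging around in the filesystem.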