initramfs and where user space truly begins
[Posted July 11, 2006 by corbet]
The initramfs mechanism was added to the 2.5.46 kernel. With initramfs, a
boot-time filesystem can be created (in cpio format) and appended to the
kernel image file.
When the system boots, it will have access to the filesystem from the very
beginning of the bootstrap process - long before it reaches the point of
being able to mount disks. Initramfs works much like the venerable initrd
facility, but, unlike initrd, initramfs does not require the system to be
able to mount a disk and find the filesystem image.
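To make the "cpio format" mentioned above a little more concrete, here is a
minimal Python sketch which assembles a tiny archive in the "newc" cpio
layout, the variant the kernel's initramfs unpacker understands. The helper
names (newc_entry, build_archive) are invented for this illustration, every
file is assumed to be a root-owned executable regular file, and real images
are normally produced with the cpio utility or the kernel build system rather
than by hand.

```python
def newc_entry(name, data=b"", mode=0o100644, ino=0):
    """Return one "newc" cpio record: ASCII header, file name, file data."""
    name_bytes = name.encode("ascii") + b"\0"
    fields = [
        ino,              # inode number
        mode,             # file type and permission bits
        0, 0,             # uid, gid (root)
        1,                # link count
        0,                # modification time
        len(data),        # file size in bytes
        0, 0,             # device major, minor
        0, 0,             # rdev major, minor
        len(name_bytes),  # name length, counting the trailing NUL
        0,                # checksum, unused with the 070701 magic
    ]
    # Magic string followed by thirteen 8-digit hexadecimal fields.
    header = b"070701" + b"".join(("%08X" % f).encode("ascii") for f in fields)
    record = header + name_bytes
    record += b"\0" * (-len(record) % 4)  # pad header + name to 4 bytes
    record += data
    record += b"\0" * (-len(record) % 4)  # pad file data to 4 bytes
    return record


def build_archive(files):
    """Build an archive from a {name: bytes} mapping (hypothetical helper)."""
    out = b""
    for ino, (name, data) in enumerate(sorted(files.items()), start=1):
        # Assumption: every entry is an executable regular file.
        out += newc_entry(name, data, mode=0o100755, ino=ino)
    # The archive ends with a special empty entry named TRAILER!!!
    return out + newc_entry("TRAILER!!!", mode=0)


if __name__ == "__main__":
    archive = build_archive({"init": b"#!/bin/sh\n"})
    print(len(archive), "bytes")
```

Such an archive would still need a real /init program and whatever
directories and device nodes that program expects before it could serve as a
useful boot-time filesystem.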
Initramfs is increasingly useful as hardware becomes more complex. Often,
simply finding the root filesystem can involve complex hardware setup,
conversations across the network, getting cryptographic keys, piecing
together RAID or LVM volumes, and more. Currently, much of this work is
done inside the kernel itself, leading to kernel code which duplicates
user-space tools - but with less review and maintenance. Moving this work
into a user-space boot-time filesystem promises to shrink the kernel, make
the boot process more reliable, and allow distributors (and users) to
customize the early bootstrap process in interesting ways.
Thus far, however, use of initramfs has been limited; in particular, all of
the early boot code remains in the kernel. One of the blocking points has
been the need for a minimal C library which would work in that
environment. This library (klibc) has been under development, slowly, for
years. That work has recently culminated in a set of klibc patches posted by
H. Peter Anvin. Klibc is now in a position to help rework the Linux
bootstrap process - and to force discussion of just how the kernel should
interact with tightly-coupled utilities.
The core klibc patch includes replacements for a long list of C library
functions and system call wrappers. It is sufficient, for example, to
support a minimal shell called "dash" and a port of the gzip utility.
There is a root filesystem mounting utility which can handle several
filesystem types, obtain an IP address using bootp or DHCP, perform NFS
mounts, assemble RAID volumes, resume suspended systems, and more. Much of
the code which performs those functions can then be removed from the kernel
itself. Klibc and the kinit program which comes with it appear to be
getting close to ready for real use.
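As a rough sketch of the kind of decisions such a utility has to make, the
following Python fragment turns a kernel command line into an ordered list of
early-boot steps. It is not kinit's actual logic: the function names and the
ordering of the steps are assumptions made for illustration, and nothing is
actually mounted or configured. Only the command-line parameters themselves
(root=, nfsroot=, ip=, md=, resume=, init=) are real kernel options.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into a {key: value} dictionary."""
    params = {}
    for word in cmdline.split():
        key, sep, value = word.partition("=")
        # Bare flags such as "ro" are recorded as True.
        params[key] = value if sep else True
    return params


def plan_boot(cmdline):
    """Return the steps a kinit-like program might take (illustrative only)."""
    p = parse_cmdline(cmdline)
    steps = []
    if p.get("ip") in ("dhcp", "bootp"):
        steps.append(f"configure network via {p['ip']}")
    if "md" in p:
        steps.append(f"assemble RAID array {p['md']}")
    if "resume" in p:
        steps.append(f"check {p['resume']} for a suspend image")
    root = p.get("root", "")
    if root == "/dev/nfs":
        steps.append(f"mount NFS root {p.get('nfsroot', '(server default)')}")
    else:
        steps.append(f"mount root filesystem {root}")
    # Finally, hand control to the real init on the mounted root.
    steps.append(f"exec {p.get('init', '/sbin/init')}")
    return steps


if __name__ == "__main__":
    for step in plan_boot("root=/dev/nfs nfsroot=10.0.0.1:/export ip=dhcp ro"):
        print(step)
```

Each of these steps corresponds to code which, today, lives inside the kernel;
the point of the klibc work is that it could be ordinary user-space code
instead.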
This code, like other efforts to move core kernel features into user space,
raises a number of questions. Some of these are likely to come up at the
kernel summit in Ottawa, but a real solution is likely to be rather longer
in coming.
The fundamental question is this: are klibc and kinit part of the kernel?
They consist of code which used to be part of the kernel itself, and which
is a necessary part of the kernel bootstrap process - if the related code
is removed from the kernel, the kernel will not be able to run
without kinit. Both components are tightly tied to the kernel, to the
point that a kernel upgrade may often require upgrading kinit and klibc as
well. A system where the kernel and kinit go out of sync may well fail to
boot.
To many developers, these reasons are more than adequate to justify
packaging (and building) kinit and klibc with the kernel itself. If the
code is kept and built together, it has a much higher chance of continuing
to function as a coherent whole. Every kernel/kinit combination will have
been tested together and will be known to work. If, instead, the two are
separated, the resulting kinit will be, in essence, a large body of kernel
code which is not reviewed and maintained with the rest of the system. The
quality of kinit could be expected to suffer, complaints from users could
grow, and differences between distributions could increase.
On the other hand, if kinit must be part of the kernel, one could well ask
just where the line should be drawn. Should udev, which has
suffered from (rare) kernel version incompatibilities, be included? How
about the user-space software suspend code? Cluster membership utilities?
Filesystem checkers? Wireless network authentication daemons? Unless
Linux is going to head toward a more BSD-like organization (an unlikely
prospect), we will not see all of the above tools included in the kernel
tarball anytime soon. And so, according to some, kinit and klibc should be
maintained as out-of-kernel packages like any other user-space code.
There is another important issue here, however: compatibility between
distributions and between kernel versions. Earlier this year, your editor
had a system running a development distribution fail to boot; that
distribution's maintainers had concluded that, since the
distribution-specific initrd image mounted /proc and
/sys, there was no reason for the initialization scripts to do so
as well. Your editor, who has never had much use for initrd, was left with
a system which was unable to run a vanilla kernel.org kernel. That
particular change was (after your editor complained) backed out, but the
issue remains: distribution-specific initialization code can make it
impossible to run kernels obtained from elsewhere. Ted Ts'o has also
pointed out an initialization problem which makes RHEL4 unable to run
current kernels on some systems. He says:

    Kinit SHOULD be merged into the kernel, and the responsibility of
    creating the initrd/initramfs image should be moved from the
    distribution into the kernel build process. There can and should
    be a way for distro's to add their own "value add specials" into
    the initrd/initramfs image, but we have to take over creating the
    base initial userspace environment.
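The /proc and /sys incident described above suggests one defensive habit for
initialization code: check whether a filesystem is mounted rather than
assuming that an earlier stage took care of it. The Python sketch below
illustrates the idea; the function name missing_mounts is made up, real init
scripts are usually shell scripts, and the sketch only reports what is
missing rather than mounting anything.

```python
import os


def missing_mounts(required=("/proc", "/sys"), is_mount=os.path.ismount):
    """Return the required mount points which are not currently mounted.

    The is_mount parameter exists so that the check can be exercised
    without touching the real system; by default os.path.ismount is used.
    """
    return [path for path in required if not is_mount(path)]


if __name__ == "__main__":
    for path in missing_mounts():
        # A real script would mount the filesystem here instead of complaining.
        print(f"{path} is not mounted; the initrd did not set it up")
```

An initialization script written this way works whether or not a
distribution-specific initrd ran first, which is exactly the property the
system in the story above lacked.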
This is a discussion which could go on for some time; it could become one
of the more contentious issues at the kernel summit. There is a subset of
the kernel development community which has a strong desire to move as much
code as possible into user space. Not everybody agrees that this is the
right approach, but, to the extent that code is shoved out of kernel space,
there must be a vision describing how all of the pieces will continue to
work well together into the future. That vision does not yet appear to
exist.