KS2008: Bootstrap code

By Jonathan Corbet
September 16, 2008
LWN's 2008 Kernel Summit coverage
Initramfs is a useful tool; it allows a filesystem (in cpio format) to be tacked on to the end of the kernel executable image. When the kernel boots, it unpacks the filesystem into RAM and mounts it as the initial root filesystem. Therein will be found enough bootstrap code to get the system properly initialized and running from the real root filesystem. It is possible to boot a system without an initramfs, but essentially all distributors make use of this facility.
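
To make the cpio format mentioned above a little more concrete, here is a minimal Python sketch that assembles an archive in the "newc" layout, the cpio variant read by the kernel's initramfs unpacker. The function names (newc_entry, build_archive) are invented for illustration, ownership and timestamps are simply zeroed, and real images are normally produced by tools such as cpio or the kernel's own usr/gen_init_cpio rather than by code like this.

```python
def _pad4(buf: bytearray) -> None:
    """Pad buf with NUL bytes up to the next multiple of four."""
    buf.extend(b"\0" * (-len(buf) % 4))


def newc_entry(name: str, data: bytes = b"", mode: int = 0o100644, ino: int = 0) -> bytes:
    """Return one archive member: 110-byte ASCII header, name, then data."""
    name_bytes = name.encode() + b"\0"      # the stored name is NUL-terminated
    fields = [
        ino,              # inode number
        mode,             # file type and permission bits
        0, 0,             # uid, gid (zeroed in this sketch)
        1,                # link count
        0,                # mtime (zeroed in this sketch)
        len(data),        # file size
        0, 0, 0, 0,       # device and rdev major/minor numbers
        len(name_bytes),  # name length, including the NUL
        0,                # checksum, unused by the "070701" variant
    ]
    # Magic string followed by thirteen 8-digit hexadecimal fields.
    out = bytearray(b"070701" + b"".join(b"%08X" % f for f in fields))
    out += name_bytes
    _pad4(out)            # header plus name is padded to a 4-byte boundary
    out += data
    _pad4(out)            # and so is the file data
    return bytes(out)


def build_archive(files: dict) -> bytes:
    """Pack {name: contents} into an archive ending with the TRAILER!!! member."""
    members = [newc_entry(n, d, ino=i + 1) for i, (n, d) in enumerate(files.items())]
    members.append(newc_entry("TRAILER!!!", mode=0))
    return b"".join(members)
```

Calling build_archive({"init": b"#!/bin/sh\n"}) returns a byte string in this layout. A usable image would also need directories, device nodes, and an executable init, all of which the sketch leaves out.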

Dave Jones, the Fedora kernel maintainer, made the claim that the initramfs code is one of the most boring parts of any distribution. Even so, all distributors still roll their own initramfs code. It is a pain, and it doesn't make any sense. So Dave looked into what's going on in this code to see if the situation could be made any better.

The Red Hat initramfs image, used in Fedora, is the product of many years' worth of heritage and workarounds. Whenever the developers have run into an early bootstrap problem, they have thrown another hack into the initramfs code to make things work again. This code is ugly, but nobody wants to switch to anybody else's version. They fear that a different initramfs will lack all those hard-earned workarounds, and, besides, everybody feels that their particular solution is the best.

So what does the initramfs code do? Its job is to load any necessary storage drivers, then wait for the storage devices to settle. The swap system needs to be enabled. If the swap partition contains a hibernation signature, a resume from disk operation is begun. Otherwise the initramfs code must find the root filesystem (an operation which may require setting up the device mapper or getting networking going), mount it, then switch over to the real operating system. Red Hat's version has to support a wide variety of root filesystems, and contains a lot of crufty code.
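
The sequence described above can be written down as a small decision procedure. The following Python sketch only produces an ordered list of step names and touches no hardware; the function plan_boot and its parameters are hypothetical and are not taken from any distribution's initramfs.

```python
def plan_boot(resume_signature, needs_device_mapper=False, needs_network=False):
    """Return the ordered early-boot steps for one (hypothetical) system."""
    steps = [
        "load storage drivers",
        "wait for storage devices to settle",
        "enable swap",
    ]
    if resume_signature:
        # A hibernation image was found in swap: resume instead of booting.
        steps.append("resume from disk")
        return steps

    # Finding the root filesystem may need extra setup first.
    if needs_device_mapper:
        steps.append("set up device mapper")
    if needs_network:
        steps.append("bring up networking")

    steps += [
        "find root filesystem",
        "mount root filesystem",
        "switch to real root",
    ]
    return steps
```

The early return models the point that a resume from disk replaces the rest of the normal boot path. Everything distribution-specific, such as which drivers to load and how the root device is located, is hidden behind the step names.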

The situation is pretty much the same with the other distributors. Where things differ, it often has to do with differing kernel configurations, and, in particular, differences of opinion over whether specific code should be built into the kernel or built as a module.

Differences between initramfs setups can create some annoying problems. Sometimes these differences are enough to cause some kernel configurations to fail on one distribution. It would make life easier for everybody if a more uniform set of tools were used for early system initialization. This code could be part of the kernel tree, and it could change, when needed, in response to kernel changes. In the end, things would just work.

There are a few details that would have to be dealt with. Some distributions use the in-kernel hibernation (suspend-to-disk) code, while others are using TuxOnIce. It seems like maybe it's time for everybody to standardize on one hibernation solution. While most distributions have long since switched over to the parallel ATA drivers, some are still using the older IDE subsystem. Not everybody supports root filesystems on iSCSI devices. And so on. But these are problems which should be amenable to a solution.

Dave is going to start by adding a "make mkinitrd" option to the kernel build system; it will create a version of the Fedora mkinitrd for now. Others will be encouraged to join in and help make it work for everybody.

Beyond that, Dave suggested that the developers could start to build a set of reference boot scripts in the kernel. Once again, this is an area where distributors tend to roll their own code; they could benefit from bits of code showing the best way to initialize parts of the system. Al Viro pointed out that there will be problems coming from the fact that different distributors use different shells in their early boot code. That led to an extended discussion of the evils of nash and the celebrations which will ensue upon its eagerly-awaited demise.

There was some brief discussion of klibc - a small version of the C library intended for use in initramfs code. That project has been stalled for some time due to lack of interest; it could probably be restarted without too much trouble. The problem is that, despite all their wishes, distributions often end up having to use glibc in their initramfs filesystems. The biggest driver here appears to be internationalization, which is not properly handled by the various stripped-down libc implementations out there.

Getting back to the concept of a uniform set of initramfs tools, Linus suggested that the process could start with some baby steps. The kernel could include some bits of code which are automatically added into whatever initramfs image the distributor provides. There are challenges to making that work too, of course. The best way, perhaps, is just to dump everybody's initramfs and start over with a new, clean version. That project may get underway before too long.



KS2008: Bootstrap code

Posted Sep 16, 2008 9:23 UTC (Tue) by nix (subscriber, #2304)

I'd be really annoyed if it became difficult to roll your own initramfs. I've got systems which run entirely from initramfs and systems which do very peculiar things from it (such as kicking up NBD connections to remote hosts and bringing up RAID arrays from them, and even, in one case, getting a root filesystem from an LVM-atop-two-RAIDs-atop-multiple-NBDs). Having to ditch all that or work around attempts to jam another initramfs system down our throats would be extremely irritating. (Of course I could just not use that initramfs and forward-port the existing assemble-your-own-initramfs-from-usr/ code, but that's extra work for, from my perspective, no gain at all.)

KS2008: Bootstrap code

Posted Sep 16, 2008 10:47 UTC (Tue) by drag (subscriber, #31333)

I have found that working with Debian's Initramfs stuff isn't too difficult.

For example, if you have busybox installed when Debian generates the initrd then it includes that. Then throwing an extra 'sh' in a script or whatnot is a very easy thing to do and makes it easy to troubleshoot.

Adding your own functionality is pretty decent.

There are two places in the file system that you have to concern yourself with when editing the initramfs. One is the /etc/initramfs-tools/ directory and the second is /usr/share/initramfs-tools/.

The directory layout of both is the same, with the exception of /usr/share/initramfs-tools/ having the init and hook-functions scripts. Users are supposed to edit the stuff in the /etc directory, and packages install their hooks and whatnot into the /usr/share/ stuff.

You add your own hook scripts for adding binaries and such. Then you have directories like scripts/init-premount/ or scripts/local-bottom/ where you stick your scripts for adding functionality.

And not only that, it's decently documented. 'man initramfs-tools' has useful documentation on the boilerplate code your scripts need, which helper functions are available, and whatnot.

It's nice that it's fairly modular and adding or removing scripts for waking up from hibernation (for example) won't break the scripts for doing nfs-root and whatnot.

It's not perfect, of course.

Now compared to that... I tried working with Fedora's initramfs stuff figuring they have something similar that would be editable.

Needless to say I didn't get very far at all. It was very unpleasant.

--------------------------

Now it's all fine and dandy to have some humongous distro-specific init script you just shove into the root of your initramfs and have all sorts of uncommented magic for setting everything up...

But from a third party's perspective this is unusable. If they want to standardize on an initrd that can support multiple distros, it needs to be treated as a mini-Linux distribution in its own right. Something that has a clean directory structure, good documentation, and is familiar.

If you add busybox support you should be able to shell out into it. It should have an init script structure that is similar to what is used in a regular Linux distribution.

A user will need to be able to see the directory structure and the files and whatnot that will be included in the final initramfs image. Something that they can chroot into and fart around with would be very handy. So you should be able to have an option in the initramfs build scripts that, instead of running cpio and compressing the directory, stops there and leaves the directories in a /tmp directory somewhere.
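
The "stop before cpio" option described above might look roughly like the following Python sketch. The names build_tree, keep_tree and pack are hypothetical and do not correspond to any existing initramfs tool; the packing step is passed in as a callback so that the sketch stays self-contained.

```python
import shutil
import tempfile
from pathlib import Path


def build_tree(files, keep_tree=False, pack=None):
    """Stage {relative path: bytes} in a temporary directory.

    With keep_tree=True the staging directory is left in place and its
    path is returned, so it can be inspected or chrooted into. Otherwise
    the optional pack callback is run on it and the directory is removed.
    """
    staging = Path(tempfile.mkdtemp(prefix="initramfs-"))
    for rel, data in files.items():
        target = staging / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(data)

    if keep_tree:
        return staging

    try:
        # pack stands in for the real "cpio and compress" step.
        return pack(staging) if pack else None
    finally:
        shutil.rmtree(staging)
```

With keep_tree=True the caller is responsible for deleting the directory afterwards.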

That sort of thing.

-------------------

As a general rule:
If you don't document stuff well and you don't make it simple for people to understand and work with, then it's going to be much easier for end users to write their own stuff than to bang their heads on their tables trying to understand yours.

THAT is why each distro writes their own right now.

KS2008: Bootstrap code: kdump kernel

Posted Sep 16, 2008 10:46 UTC (Tue) by maneesh_soni (subscriber, #7770)

Initramfs gets a little more complicated in the case of a kdump kernel boot. It is used to automatically save the old kernel's image (/proc/vmcore) to a local filesystem or remotely through NFS/FTP/SCP etc. It also does things like filtering the vmcore or other housekeeping jobs. Here also different distros have different approaches.

klibc

Posted Sep 16, 2008 17:29 UTC (Tue) by katzj (subscriber, #23350)

The other reason glibc ends up being pulled into the initramfs is that distributions don't want to have multiple copies built of all of the tools used in the initramfs. When "all of the tools" was an all-in-one tool that loaded some modules and did mount(2), that wasn't such a big deal. But once you get to wanting to do raid, lvm, encryption, network setup, nfs, nbd, iSCSI, etc., you end up with a much longer list of binaries. And it just means more bits on disk, more things to have to build and inevitably more bugs to track down.

klibc

Posted Sep 16, 2008 18:48 UTC (Tue) by drag (subscriber, #31333)

That's why it's important to be modular.

With Debian's initramfs, for example, the packages for NFS, uswsusp, dm-crypt, etc. will install their own scripts for pulling in binaries from the file system, plus init scripts, into /usr/share/initramfs-tools/ and get those included. Then when you install those packages the initramfs-based initrd is rebuilt.

This way you're only including support for things that are installed and that you're probably going to want to use. If you don't have uswsusp support in your OS then you don't have it in your initrd.

klibc

Posted Sep 16, 2008 21:27 UTC (Tue) by jengelh (subscriber, #33263)

And Red Hat's initramfs is anything but modular. Heck, it even replicated a lot of features in its "nash" thing, like raidautorun, which, ahem, got obsoleted when mdadm came to life.

klibc

Posted Sep 17, 2008 9:36 UTC (Wed) by rh-kzak (subscriber, #51571)

I think it's better to have multiple copies built of the tools (=same code) than *duplicate code* in nash. It's cheaper to rebuild against klibc than to maintain completely separate code in (Fedora/RHEL) nash.

KS2008: Bootstrap code

Posted Sep 16, 2008 23:47 UTC (Tue) by jstultz (subscriber, #212)

I just wish boot-required kernel modules were automatically attached to the kernel file. That would cleanly split the kernel-specific portions of the initrd (keeping them with the kernel) from the generic distro portions of the initrd.

That way there could ideally be one distro-specific initrd file for all of the kernels one might build or install.

KS2008: Bootstrap code

Posted Sep 17, 2008 21:42 UTC (Wed) by vmole (subscriber, #111)

"The best way, perhaps, is just to dump everybody's initramfs and start over with a new, clean version."

Yes, let's throw away 10 years of bug fixes. That worked really well for the Netscape->Mozilla project...

Better to spend some time looking at what the different big distributions do (because they probably have hit the most weird situations), pick one, and ask the other distributions "Why won't this work for you?"

KS2008: Bootstrap code

Posted Sep 18, 2008 3:15 UTC (Thu) by rahulsundaram (subscriber, #21946)

As you can see from the rest of the article, this is precisely what is being done, using Fedora's as a starting point.

KS2008: Bootstrap code

Posted Sep 18, 2008 16:33 UTC (Thu) by vmole (subscriber, #111)

That's what Dave Jones is doing. If you read the rest of the article, you see there are some other approaches being considered :-).

I think the whole thing is going to be a tough sell. The first time a customer's system won't boot because of the new init code, there will be strong temptation to revert.

KS2008: Bootstrap code

Posted Sep 19, 2008 2:39 UTC (Fri) by opalmirror (subscriber, #23465)

"I think the whole thing is going to be a tough sell. The first time a customer's system won't boot because of the new init code, there will be strong temptation to revert."

Like the PATA hard disk support for SCSI? Initial pain, long-term gain.

solving the wrong problem?

Posted Sep 19, 2008 2:13 UTC (Fri) by roelofs (subscriber, #2599)

"This code is ugly, but nobody wants to switch to anybody else's version. They fear that a different initramfs will lack all those hard-earned workarounds, and, besides, everybody feels that their particular solution is the best."

That's sociology, and I claim it's the fundamental problem. The rest of the article mostly talks about technology and implies that it will somehow address the sociological issues, but it seems clear to me that that's not the case. (Or at least it's clear to me that the argument has not been successfully made.)

"Better to spend some time looking at what the different big distributions do (because they probably have hit the most weird situations), pick one, and ask the other distributions 'Why won't this work for you?'"

"As you can see from the rest of the article, this is precisely what is being done, using Fedora's as a starting point."

To restate it slightly: a Red Hat/Fedora guy is starting with the Red Hat/Fedora solution and hoping/expecting that it will work for everybody after a bit (or even a lot) of tinkering.

That, too, is sociology, but of a different sort. Assuming we can agree that the fundamental problem is sociological--and I'm under no illusions that we'll even agree on that, but bear with me ;-) --then it would seem that the approach Dave has taken could do with a fair bit of improvement. In particular, if you really want to convince others that "we can all just get along," maybe the starting point should be someone else's solution (perhaps Debian's or SuSE's or whatever). That would make a strong statement, and it would become an even stronger statement if/when Fedora and Red Hat officially adopted the new initramfs. (Of course, if they never did, that would also make a strong statement, albeit of a different flavor. :-) )

Greg

solving the wrong problem?

Posted Sep 22, 2008 19:43 UTC (Mon) by tytso (subscriber, #9993)

"To restate it slightly: a Red Hat/Fedora guy is starting with the Red Hat/Fedora solution and hoping/expecting that it will work for everybody after a bit (or even a lot) of tinkering."

Actually, what Dave said is that all three distros' initrd systems suck, just in different ways, and he's going to start from scratch to create something cleaner. It will be something that is designed to work on Fedora, yes, but other distros will be welcome to comment and submit improvements to allow it to work for them.

Dave didn't write the original Red Hat / Fedora initrd, so he doesn't have any pride/ego investment in the current code. So honestly, what he offered to do is really the best the community could hope for, and I think it will ultimately lead to improvements for everyone.

solving the wrong problem?

Posted Sep 22, 2008 23:00 UTC (Mon) by rahulsundaram (subscriber, #21946)

You might want to read

http://kernelslacker.livejournal.com/130745.html

Might clear up some wrong assumptions here.

Copyright © 2008, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds