Managing dynamic device naming
Richard Gooch beat the rush and started worrying about this problem some years ago; the result was devfs. The devfs code has been in the mainline kernel since the 2.3 days, but it is not heavily used. It puts naming policy firmly in the kernel itself (you get /dev/disc whether you like it or not), and it solves persistent permissions issues by way of a daemon process and a "make a tarball at shutdown" technique that strikes some as inelegant. Some kernel developers have also made a longstanding hobby of complaining about the quality of the devfs code.
The end result is that there would seem to be an opening for a different approach. One alternative began to come into focus this week with the release of udev 0.1. udev is an effort by Greg Kroah-Hartman (and others) to push the device naming issue completely into user space, with the result that the kernel hackers would be free to go off and argue about something else. The current udev implementation is a minimal demonstration of the concept, but the longer-term vision calls for three distinct components:
- "namedev" is a subsystem which has the job of coming up with useful
names for devices. It could make use of whatever information is
available: device numbers, hardware ID numbers, filesystem labels,
etc.; it would then apply the site's particular policy to produce a
suitable name. On simple systems, a simple flat file (or hardcoded
names) would suffice; the 4000-disk monster system could dedicate one
drive to a relational database for device naming.
- "libsysfs" would provide a common API for obtaining information about
devices from sysfs.
- "udev" is a separate application which is run in response to hotplug events; it uses the above two modules to gather the information it needs, then creates or removes device nodes as appropriate.
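As a rough illustration of the "namedev" idea in its flat-file form, the sketch below maps whatever device attributes are known onto a site-chosen name. Everything in it is hypothetical: the DeviceInfo fields, the SITE_RULES table and the choose_name() function are invented for this example and do not come from the udev code.

```python
# Hypothetical sketch of a "namedev"-style policy lookup. None of these
# names come from udev itself; they exist only to illustrate the idea.
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class DeviceInfo:
    kernel_name: str               # name reported by the kernel
    serial: Optional[str] = None   # hardware ID number, if known
    label: Optional[str] = None    # filesystem label, if known


# Site policy as a flat table: (attribute, value) -> preferred name.
# The entries are made-up examples.
SITE_RULES: Dict[Tuple[str, str], str] = {
    ("serial", "EXAMPLE-12345"): "backup_disk",
    ("label", "home"): "home_volume",
}


def choose_name(dev: DeviceInfo,
                rules: Dict[Tuple[str, str], str] = SITE_RULES) -> str:
    """Return the first policy match, falling back to the kernel's name."""
    for attribute in ("serial", "label"):
        value = getattr(dev, attribute)
        if value is not None and (attribute, value) in rules:
            return rules[(attribute, value)]
    return dev.kernel_name
```

A larger site could swap the dictionary for a database query without changing the caller, which is roughly the flexibility the design is after.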
In the current release, everything is bundled together into a single "udev" binary. It requires a series of patches on top of 2.5.67 to create hotplug events when kobjects are registered (these patches have been merged into Linus's BitKeeper repository, and thus will be unnecessary for 2.5.68 and later kernels), and, even then, can only work with devices which export their device number via sysfs. Still, your editor had no trouble making it work on his sacrificial system. Loading the simple block driver from the driver porting series caused a set of block device nodes to be created in /udev - with no changes to the driver required. The basic idea works.
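The core step, turning a device number exported via sysfs into a device node, can be sketched as follows. This is not udev's code. It assumes the sysfs attribute reads as decimal "major:minor" text, which is the form used by later kernels; the format at the time of this release is not stated here. The plan_node() helper and the "example0" name are hypothetical.

```python
# Sketch only: work out what device node to create from a sysfs "dev"
# attribute. Assumes the attribute text looks like "254:0".
import os
import stat
from typing import Tuple


def parse_dev_attribute(text: str) -> Tuple[int, int]:
    """Split 'major:minor' text into two integers."""
    major, minor = text.strip().split(":")
    return int(major), int(minor)


def plan_node(name: str, dev_text: str, root: str = "/udev",
              block: bool = True) -> Tuple[str, int, int]:
    """Return (path, mode, device number) for the node to be created."""
    major, minor = parse_dev_attribute(dev_text)
    node_type = stat.S_IFBLK if block else stat.S_IFCHR
    # 0o600 is an arbitrary conservative permission choice for the sketch.
    return os.path.join(root, name), node_type | 0o600, os.makedev(major, minor)
```

The returned values could then be handed to os.mknod(path, mode, device), which requires root privileges; the sketch stops short of that so it can be run safely.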
A lot of work remains to be done before udev is ready for prime time, however. Some of the issues needing resolution are:
- Robust management of device events. The current hotplug mechanism
creates a separate process for each event, each of which runs whatever
program has been designated to handle those events. Among other
things, this mechanism has race conditions; if a device is quickly
attached and removed, the unplug event could end up being processed
first. Attaching a large disk array could create an "event storm"
that threatens to overwhelm the system. So there is a fair amount of
interest in serializing events, but little agreement on how that
should be done.
- A related issue is that multiple programs may want to receive hotplug
events. One might load a driver, another runs udev, yet another
mounts partitions on a newly-attached disk, etc. Possible solutions
here include using Greg's /sbin/hotplug multiplexor, distributing
events in user space with D-BUS, or distributing them in the kernel
via a new event interface.
- How desirable is per-site device naming policy anyway? A world where each distribution, if not each installation, has its own device naming scheme does not look like an improvement to a lot of people. Vendors cringe at trying to support that sort of setup. So there is a need for some sort of common policy. The Linux Standard Base decrees that the LANANA devices.txt file is the definitive authority for standard device names, which is a start. But there is a strong desire for more flexible and generic naming (all disks under /dev/disk, for example, with no distinction between SCSI and IDE drives); the device list will probably have to be revised to fit the dynamic, very large systems of the future.
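To make the first issue in this list concrete: one conceivable way to serialize events, offered purely as an illustration and not as any of the proposals under discussion, is to assume that every event carries a monotonically increasing sequence number and to hold back any event that arrives early. The EventSerializer class and the sequence numbers are assumptions made for this sketch.

```python
# Illustrative reorder buffer. It assumes each event arrives with a
# sequence number, which is an assumption of this sketch.
from typing import Dict, List


class EventSerializer:
    def __init__(self, first_seq: int = 1) -> None:
        self.next_seq = first_seq          # next sequence number to release
        self.pending: Dict[int, str] = {}  # early arrivals, keyed by number

    def submit(self, seq: int, event: str) -> List[str]:
        """Accept one event; return the events now safe to process, in order."""
        self.pending[seq] = event
        ready: List[str] = []
        while self.next_seq in self.pending:
            ready.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return ready
```

With this in place, a "remove" event numbered 2 that arrives before the "add" event numbered 1 is simply held until the "add" shows up. A real implementation would also need a timeout so that a lost event cannot stall the queue forever.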
All of these issues should be solvable, of course, and the fact that they
are being discussed indicates that people are getting serious about solving
the problems. The 2.6 kernel will probably go out with the larger
dev_t and, perhaps, some hooks for udev-like programs. Things
could get more interesting once the 2.7 development series opens up,
however.
