Managing dynamic device naming

[Posted April 15, 2003 by corbet]

The coming increase in the size of dev_t adds to the urgency of the device naming problem. Even if device numbers remain entirely static, there will be management issues to deal with. Consider the case of SCSI disks, for example. The wider dev_t will make it possible to have thousands of disks on a single system, and the maximum number of partitions will be increased to 64. /dev is already a big directory on modern distributions - over 12,000 entries on a Red Hat Linux 7.3 system, 2000 in the cciss subdirectory alone. It is unwieldy to work with now, but consider what happens when the device names for all those new drives and partitions are added; now /dev has several hundred thousand entries. And we haven't even begun to look at all those new serial ports, tape drives, printers, and CueCat barcode readers we'll be able to add.

Richard Gooch beat the rush and started worrying about this problem some years ago; the result was devfs. The devfs code has been in the mainline kernel since the 2.3 days, but it is not heavily used. It puts naming policy firmly in the kernel itself (you get /dev/disc whether you like it or not), and it solves persistent permissions issues by way of a daemon process and a "make a tarball at shutdown" technique that strikes some as inelegant. Some kernel developers have also made a longstanding hobby of complaining about the quality of the devfs code.

The end result is that there would seem to be an opening for a different approach. One alternative began to come into focus this week with the release of udev 0.1. udev is an effort by Greg Kroah-Hartman (and others) to push the device naming issue completely into user space, with the result that the kernel hackers would be free to go off and argue about something else. The current udev implementation is a minimal demonstration of the concept, but the longer-term vision calls for three distinct components:

"namedev" is a subsystem which has the job of coming up with useful names for devices. It could make use of whatever information is available: device numbers, hardware ID numbers, filesystem labels, etc.; it would then apply the site's particular policy to produce a suitable name. On simple systems, a simple flat file (or hardcoded names) would suffice; the 4000-disk monster system could dedicate one drive to a relational database for device naming.
"libsysfs" would provide a common API for obtaining information about devices from sysfs.
"udev" is a separate application which is run in response to hotplug events; it uses the above two modules to gather the information it needs, then creates or removes device nodes as appropriate.

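The naming step described above can be sketched in a few lines. This is only an illustration of the idea, not udev's actual code: the `DeviceInfo` fields, the `SITE_NAMES` table, the `by-label/` prefix, and the priority order are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DeviceInfo:
    """Attributes a namedev-like layer might see (field names are hypothetical)."""
    kernel_name: str               # name the kernel assigned, e.g. "sda1"
    major: int
    minor: int
    serial: Optional[str] = None   # hardware ID, if the device exposes one
    label: Optional[str] = None    # filesystem label, if any


# Hypothetical site policy: a flat table mapping hardware serials to names.
SITE_NAMES = {"SN-0042": "backup_disk"}


def choose_name(dev: DeviceInfo) -> str:
    """Apply the policy in priority order: site table, label, kernel name."""
    if dev.serial is not None and dev.serial in SITE_NAMES:
        return SITE_NAMES[dev.serial]
    if dev.label:
        return f"by-label/{dev.label}"
    return dev.kernel_name
```

On a small system the table could stay a flat file like this one; the large-system case would swap the dictionary lookup for a database query without changing the caller.
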
In the current release, everything is bundled together into a single "udev" binary. It requires a series of patches on top of 2.5.67 to create hotplug events when kobjects are registered (these patches have been merged into Linus's BitKeeper repository, and thus will be unnecessary for 2.5.68 and later kernels), and, even then, can only work with devices which export their device number via sysfs. Still, your editor had no trouble making it work on his sacrificial system. Loading the simple block driver from the driver porting series caused a set of block device nodes to be created in /udev - with no changes to the driver required. The basic idea works.
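
As a rough sketch of the node-creation step, the following turns a device number read from sysfs into the arguments a tool would pass to mknod. It assumes the attribute holds text of the form "major:minor", which is how later kernels present it; the format in 2.5.67 may differ. The function names and the "exampledisk0" device name are hypothetical.

```python
import os
import stat
from typing import Tuple


def parse_dev_attribute(text: str) -> Tuple[int, int]:
    """Split an assumed 'major:minor' string into two integers."""
    major_str, minor_str = text.strip().split(":")
    return int(major_str), int(minor_str)


def node_request(name: str, dev_text: str, root: str = "/udev") -> Tuple[str, int, int]:
    """Return (path, mode, device) for a block node; nothing is created here."""
    major, minor = parse_dev_attribute(dev_text)
    path = os.path.join(root, name)
    mode = stat.S_IFBLK | 0o660          # block device, rw for owner and group
    return path, mode, os.makedev(major, minor)
```

A caller running as root could then pass the three values to `os.mknod(path, mode, device)` to create the node.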

A lot of work remains to be done before udev is ready for prime time, however. Some of the issues needing resolution are:

Robust management of device events. The current hotplug mechanism creates a separate process for each event, each of which runs whatever program has been designated to handle those events. Among other things, this mechanism has race conditions; if a device is quickly attached and removed, the unplug event could end up being processed first. Attaching a large disk array could create an "event storm" that threatens to overwhelm the system. So there is a fair amount of interest in serializing events, but little agreement on how that should be done.
A related issue is that multiple programs may want to receive hotplug events. One might load a driver, another runs udev, yet another mounts partitions on a newly-attached disk, etc. Possible solutions here include using Greg's /sbin/hotplug multiplexor, distributing events in user space with D-BUS, or distributing them in the kernel via a new event interface.
How desirable is per-site device naming policy anyway? A world where each distribution, if not each installation, has its own device naming scheme does not look like an improvement to a lot of people. Vendors cringe at trying to support that sort of setup. So there is a need for some sort of common policy. The Linux Standard Base decrees that the LANANA devices.txt file is the definitive authority for standard device names, which is a start. But there is a strong desire for more flexible and generic naming (all disks under /dev/disk, for example, with no distinction between SCSI and IDE drives); the device list will probably have to be revised to fit the dynamic, very large systems of the future.
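
One way to picture the serialization problem from the first item is a reorder buffer. The sketch below assumes every event carries a monotonically increasing sequence number, which is not something the article says the current hotplug mechanism provides; the `EventSerializer` class is hypothetical.

```python
import heapq


class EventSerializer:
    """Hold out-of-order events until all earlier ones have arrived."""

    def __init__(self, first_seq=1):
        self._next = first_seq   # sequence number we are waiting for
        self._pending = []       # min-heap of (seq, action, device)

    def submit(self, seq, action, device):
        """Accept one event; return the events that can now run, in order."""
        heapq.heappush(self._pending, (seq, action, device))
        ready = []
        while self._pending and self._pending[0][0] == self._next:
            ready.append(heapq.heappop(self._pending))
            self._next += 1
        return ready
```

If the remove event (sequence 2) arrives before the add event (sequence 1), it is held back and both are released together in the right order. A real implementation would also need a timeout so that a lost event cannot stall the queue forever.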

All of these issues should be solvable, of course, and the fact that they are being discussed indicates that people are getting serious about solving the problems. The 2.6 kernel will probably go out with the larger dev_t and, perhaps, some hooks for udev-like programs. Things could get more interesting once the 2.7 development series opens up, however.

Managing dynamic device naming

Posted Apr 17, 2003 4:50 UTC (Thu) by komarek (guest, #7295) [Link]

I've always been on the side that claims devfs works well. There are a few persistence-related things I don't care for, but overall it seems to be a good solution. One thing I've always thought would be nice is multiple dev filesystems union-mounted on top of each other. You could boot with a basic devfs that had your console and such, and then mount devfs over the top of that with whatever union policy was used.

That still doesn't take care of ownership and permissions, but it could eliminate the tarball mess and perhaps even reduce the duties of the daemon.

What I'd really like to know is when we'll have union mounts. Last I checked the kernel code, they weren't supported. We don't need to be fancy about it, like the last time I saw someone making excuses for why the community hadn't done it yet. Union mounts would be tremendously helpful in many areas. My pet favorite is union mounting a microdrive over the top of a skeleton root filesystem on my iPaq.

-Paul Komarek

Managing dynamic device naming

Posted Apr 17, 2003 13:35 UTC (Thu) by ken (subscriber, #625) [Link] (3 responses)

This is hopeless; people will never agree on how this should be done.
Personally I thought the issue would go away when Richard got
devfs in, but apparently not.

Is it really a problem that the name of the device is set by the kernel?
It's not like you can name them anything you want anyway; they had better be
static or users would go nuts.

Managing dynamic device naming

Posted Apr 17, 2003 17:12 UTC (Thu) by cpeterso (guest, #305) [Link] (1 responses)

Since no one will ever agree on the proper naming scheme, I think pushing the naming policy to userland makes sense. There is only one "official" Linux kernel, so someone will always be upset with a kernel naming policy. In userland, there can be infinite competition for udev-like projects. Sounds good to me!

The kernel should implement mechanism, not policy.

Managing dynamic device naming

Posted Apr 18, 2003 23:43 UTC (Fri) by wolfrider (guest, #3105) [Link]

> In userland, there can be infinite competition for udev-like projects. Sounds good to me!

"Are you brain-dead?!" You *want* your devices to be named consistently across distros, you know.

--How about a "limited devfs" where only the types of devices that NEED lots of space (disks) have to go thru it?

Managing dynamic device naming

Posted Apr 17, 2003 18:42 UTC (Thu) by iabervon (subscriber, #722) [Link]

One important aspect of the device naming problem is that it isn't uniform. There are a number of devices which are completely standard (e.g., /dev/null). There are a number where the permissions don't need to be persistent, because they will be set when the device is opened (ttyp*). There are a number of symlinks. There are a few things that aren't devices but are in /dev (initctl, log). There are a few cases where there are a ton of nodes which follow a pattern (fd*, partitions of disks).

This all means that it's easy to find an entry that justifies just about any idea, but hard to find a design that fits all of the behaviours desired even for the current set of devices.

larger dev_t ?

Posted Apr 17, 2003 17:14 UTC (Thu) by cpeterso (guest, #305) [Link] (1 responses)

Will Linux 2.6's dev_t be 32 bits as earlier planned or 64 bits? There were some experimental kernel patches for 64 bits recently. I think Linus should just go all the way to 64 bits. It seems inevitable, so why create two dev_t disruptions (16->32 now and 32->64 later) instead of just one (16->64)?

larger dev_t ?

Posted Apr 17, 2003 17:16 UTC (Thu) by corbet (editor, #1) [Link]

64 bits looks like the likely outcome, but we'll only know for sure when Linus merges a patch...

Managing dynamic device naming

Posted Apr 17, 2003 18:37 UTC (Thu) by oneukum (guest, #3970) [Link]

I am afraid that your description of the important race in the userland approach
is wrong. We can live with an unplug being processed before a plug; the fix is obvious.

The really difficult race is a replug race. It goes like this.
- device A plugged in
- udev configures A (including permissions)
- device A unplugged
- device B plugged in

This is the culprit. Under the current system, device numbers are reused, which
means that for a short window a device can be accessed with permissions it
should not have.
Unfortunately, any userland solution has this problem.
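
The replug sequence above can be walked through with a toy model. This is a simplified simulation, not kernel behaviour: it assumes the lowest free device number is handed to the next device, and the names "disk0" and "alice" are made up for the example.

```python
def simulate_replug():
    """Return (device reachable through the stale node, owner on that node)."""
    in_use = {}   # device number -> device currently bound to it
    nodes = {}    # /dev name -> (device number, owner set by udev)

    def plug(device):
        # Assumption: the lowest free number is reused for the new device.
        num = min(n for n in range(256) if n not in in_use)
        in_use[num] = device
        return num

    num_a = plug("A")                  # device A plugged in
    nodes["disk0"] = (num_a, "alice")  # udev configures A, including permissions
    del in_use[num_a]                  # device A unplugged; node not yet removed
    plug("B")                          # device B plugged in, reusing A's number

    devnum, owner = nodes["disk0"]     # the stale node is still on disk
    return in_use[devnum], owner
```

Until the remove event for A is processed, the node created for A opens device B with A's owner, which is the window described above.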

Managing dynamic device naming

Posted Apr 19, 2003 2:37 UTC (Sat) by giraffedata (guest, #1954) [Link]

First of all, the kernel is naming the device under any system. In the non-devfs approach, it is naming it with a number; with devfs it names it in a more friendly and flexible name space.

With devfs, you are not stuck with the name the device driver chose. User space can generate symbolic links to satisfy custom naming requirements.

Note that in the proc filesystem, the kernel chooses the file names; I've never heard anyone complain about not being able to choose the name for /proc/filesystems or /proc/scsi/qla2000.

We must strive to get away from device numbers. They are primitive. We're beyond that, and need something easier on us humans.

Managing dynamic device naming

Posted Apr 24, 2003 9:02 UTC (Thu) by job (guest, #670) [Link] (1 responses)

I never understood why some people don't like devfs and why
surprisingly many distros don't ship it by default. I've used it extensively for
over four years and it has been very good to me.

I love the newer naming scheme they use where a disk can be addressed
both using its IDE ID and system disk number (under /dev/discs). You can
immediately see what peripherals the system can see (plugged in, working and
with the right driver in the kernel!) in the file system.

Together with LVM I've learned to love it. That way whole volume groups just
show up *using their symbolic names* in /dev, regardless of which port I
plug the disks into (and note: plural)! And now that the first release of EVMS 2
using the 2.5 DM is out, I'm keen to switch over to that system instead as it is
much more flexible.

Managing dynamic device naming

Posted Jul 31, 2003 10:39 UTC (Thu) by jdthood (guest, #4157) [Link]

I agree. I have never had a single problem with devfs.
And I have never heard a single argument against devfs
that wasn't easily refuted. If there is a killer argument
against devfs that I haven't heard yet, please post it here.

Managing dynamic device naming

Posted May 2, 2004 15:48 UTC (Sun) by RicoWSuaveIII,Esq (guest, #21321) [Link]

Every system having private device naming schemes seems like a M$ conspiracy to put GNU/Linux out of commission ;-)

Userspace? Fine. But the naming scheme just *has* to be uniform over all systems and all distros. Let the kernel implement policy in these cases. It is a reasonable way to enforce uniformity (major and minor numbers are policy too! ;)

How can the kernel get away from using policy altogether, and still insert device drivers as pluggable modules? (and we don't want hardware access strictly in userland!) I don't want the kernel having to consult text files for this! That's ridiculous.

One repeated argument for devfs (applies to udev now too) is that progs can scan the fs to discover if a device is present. If you change the naming, then you lose the capability. Your "special" distro has to alter every app to look for devices in your "special" places.

Besides, if you had to have your own personal device naming scheme (foreseeably necessary for a super-secure router or some such) you could take advantage of the openness of the source. In closed source, separation of policy and mechanism is much more important. With open source, hard-coding certain policies is necessary to maintain compatibility (which everyone I've heard speak thinks is the biggest obstacle to open source).

About serialization of the udev processes to eliminate race conditions:

It seems to me that locks or semaphores could be used to eliminate this problem. Until a device has been removed, the device name cannot be reused. Of course, this *needs* to be stored in (or at least be changed from) userspace, so that orphaned locks or semaphores can be dealt with (when processes die without running their cleanup routines, or when orphans are created by bugs).

If the locks or semaphores are stored in the kernel, then an interface could be provided to some userspace program for fixing incorrectly locked states. (It's the kernel, so reentry isn't too important unless you're using smp, right? Even with clustering, machines don't share a single kernel image in memory, right? --Correct me, please, if I'm mistaken)