A constant feature of development kernel summaries is "device model work."
Perhaps it's time to take a look at what the device model actually is, and
where it's going.
The device model effort has its roots in the 2001 Kernel
Summit. It had become clear, at that point, that support of advanced
power management would require a more structured approach to the management
of devices in the Linux kernel. There has traditionally been no
centralized registry of devices in the kernel - no way to just ask the
system what devices were connected to it. Power management needs not only
the answer to that question, but also some idea of how all the devices are
plugged together. It doesn't do to shut down a SCSI controller before
stopping all of the peripherals connected to that controller, for example.
So the device model work, done mainly by Patrick Mochel, started by
adapting the existing PCI device scheme to represent a full system. At the
center of the scheme is struct device, which, of course,
represents a single device in the system. This structure contains quite a
few fields, including no fewer than six different list heads; some of these
fields will be examined shortly.
One type of device, of course, is a bus. There is a device
structure for each bus, along with a bus_type structure for each
type of bus. Almost every device on a system is reached via (at least) one
bus, and the device model topology reflects that. Each bus device
maintains, via the children list in its device structure,
a list of all devices plugged into that bus. By looking at the
bus_list field of any device in the system, the kernel can find
all other devices attached to the same bus.
Each device structure also maintains a parent pointer (to
another struct device, of course), and an entry into another
list (called simply node) of all its siblings under the same
parent. This hierarchy may look a lot like the bus lists already
mentioned, but the two are distinct. A device may be on a USB bus, but its
parent may be the USB hub to which it is connected. Similarly, a SCSI tape
drive may be reached through a PCI bus, but its parent is the SCSI host
adaptor.
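The relationship between these fields can be sketched in ordinary user-space C. This is only an illustration, not the kernel's definition: the field names parent, node, children, and bus_list come from the description above, while the list helpers, the devices head in bus_type, and the device_attach() function are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Circular doubly-linked list, in the spirit of the kernel's list heads. */
struct list_head {
    struct list_head *next, *prev;
};

static void list_init(struct list_head *head)
{
    head->next = head->prev = head;
}

static void list_add_tail(struct list_head *item, struct list_head *head)
{
    item->prev = head->prev;
    item->next = head;
    head->prev->next = item;
    head->prev = item;
}

static size_t list_count(const struct list_head *head)
{
    size_t n = 0;
    for (const struct list_head *p = head->next; p != head; p = p->next)
        n++;
    return n;
}

/* Assumption: each bus type anchors the list that bus_list entries join. */
struct bus_type {
    const char      *name;
    struct list_head devices;
};

/* Simplified stand-in for struct device; only fields named in the text. */
struct device {
    const char      *name;
    struct device   *parent;    /* device this one hangs off of */
    struct list_head node;      /* entry in the parent's children list */
    struct list_head children;  /* head of the list of child devices */
    struct list_head bus_list;  /* entry in the list of devices on the bus */
};

static void bus_init(struct bus_type *bus, const char *name)
{
    bus->name = name;
    list_init(&bus->devices);
}

static void device_init(struct device *dev, const char *name)
{
    dev->name = name;
    dev->parent = NULL;
    list_init(&dev->node);
    list_init(&dev->children);
    list_init(&dev->bus_list);
}

/* Hypothetical helper: link a device under its parent and onto its bus. */
static void device_attach(struct device *dev, struct device *parent,
                          struct bus_type *bus)
{
    assert(dev != NULL);
    if (parent != NULL) {
        dev->parent = parent;
        list_add_tail(&dev->node, &parent->children);
    }
    if (bus != NULL)
        list_add_tail(&dev->bus_list, &bus->devices);
}
```

With these definitions, a USB reader attached with a hub as its parent appears once in the hub's children list and once in the USB bus list, which is the distinction the text draws between the two views.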
Thus, it is the parent pointers and node lists that model the true
hierarchy of the devices in the system. One could suspend a computer by
starting at the top-level devices and doing a depth-first traversal of the
device hierarchy via each device's children list. In fact, the
device model makes this sort of traversal easy by maintaining a separate
"global device list" which contains every device on the system, in the
depth-first order.
As an example, your editor's system is represented in the driver model with
a hierarchy like the following:
root
    pci0
        PCI host bridge
        ISA bridge
        IDE interface
        USB controller
            USB bus
                Lexar SmartMedia reader
        ACPI bridge
        SCSI adaptor
            SCSI bus 0
                Target 0 (disk drive)
                    Partition 1
                    Partition 2
                Target 1 (DAT tape)
                    st0
                    nst0
                    ...
                Target 4 (CDRW)
        Audio controller
            MIDI port
        Ethernet controller
        Graphics card
    sys
        Interrupt controller
        8253 Interval timer
        floppy controller
Each entry in the hierarchy above is one device structure in the
model; each device's children list holds the entries indented one level
below that device. The global device list, by contrast, contains the full
hierarchy shown above, in order from top to bottom. ("sys" is a virtual bus
for devices not otherwise connected to a system bus.)
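As a rough illustration of how a top-to-bottom global list falls out of the hierarchy, here is a small user-space sketch of a depth-first walk. The first_child and next_sibling pointers are simplified stand-ins for the children and node lists, and device_add_child() and flatten_depth_first() are names invented for this example; the kernel maintains its list as devices are registered rather than by walking the tree like this.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified device: plain pointers stand in for the kernel's list heads. */
struct device {
    const char    *name;
    struct device *parent;
    struct device *first_child;   /* stand-in for the children list */
    struct device *next_sibling;  /* stand-in for the node list entry */
};

/* Append a child after any existing children, keeping discovery order. */
static void device_add_child(struct device *parent, struct device *child)
{
    assert(parent != NULL && child != NULL);
    struct device **slot = &parent->first_child;
    while (*slot != NULL)
        slot = &(*slot)->next_sibling;
    *slot = child;
    child->parent = parent;
    child->next_sibling = NULL;
}

/*
 * Depth-first walk: record a device, then each of its subtrees in turn.
 * Writes at most max entries into out and returns the number written.
 */
static size_t flatten_depth_first(struct device *dev, struct device **out,
                                  size_t max)
{
    size_t n = 0;
    if (dev == NULL || max == 0)
        return 0;
    out[n++] = dev;
    for (struct device *c = dev->first_child; c != NULL; c = c->next_sibling)
        n += flatten_depth_first(c, out + n, max - n);
    return n;
}
```

Applied to a cut-down version of the tree above (root, pci0, IDE interface, USB controller, USB bus, sys), the walk yields the devices in exactly the order they are printed in the listing.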
The model, as described so far, shows the hierarchy of the system, but does
not allow the kernel to actually do much with those devices. The
next step involves a new generic structure:
struct device_driver, which is registered for each driver in
the system. This structure tells the system what type of bus the driver
expects to work with, and provides a set of useful functions. One of those
functions is probe; when a new device is discovered on the system
the base code calls the probe function of every likely-looking
driver for the relevant bus until a driver agrees to manage the device.
The system then sets the driver pointer in the device
structure, and knows how to find the right driver for the device from then
on.
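The probing step might look something like the following user-space sketch. Only probe, the driver pointer, and the idea of matching on bus type come from the description above; the registration list, the return-value convention (0 means the driver accepts the device), and the names driver_register(), device_bind(), and demo_reader_probe() are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct device;

struct bus_type {
    const char *name;
};

/* Simplified stand-in for struct device_driver. */
struct device_driver {
    const char           *name;
    struct bus_type      *bus;                    /* bus type it works with */
    int                 (*probe)(struct device *dev); /* 0 = will manage it */
    struct device_driver *next;                   /* registration list link */
};

struct device {
    const char           *name;
    struct bus_type      *bus;
    struct device_driver *driver;  /* set once a driver claims the device */
};

/* Assumption: registered drivers are kept on one simple linked list. */
static struct device_driver *registered_drivers;

static void driver_register(struct device_driver *drv)
{
    assert(drv != NULL && drv->probe != NULL);
    drv->next = registered_drivers;
    registered_drivers = drv;
}

/*
 * Offer a newly discovered device to each driver for its bus type until
 * one accepts. Returns 0 if a driver was bound, -1 if none wanted it.
 */
static int device_bind(struct device *dev)
{
    assert(dev != NULL);
    for (struct device_driver *drv = registered_drivers; drv; drv = drv->next) {
        if (drv->bus != dev->bus)
            continue;               /* wrong kind of bus, do not even ask */
        if (drv->probe(dev) == 0) {
            dev->driver = drv;
            return 0;
        }
    }
    return -1;
}

/* Hypothetical probe: claims only devices whose name mentions "reader". */
static int demo_reader_probe(struct device *dev)
{
    return strstr(dev->name, "reader") != NULL ? 0 : -1;
}
```

A device that no driver accepts simply keeps a NULL driver pointer in this sketch.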
This driver pointer is not used for normal, user-space accesses to
the device - that is still handled through the device arrays (indexed by
the device's major number). What that pointer can be used for,
however, is power management and hotplug events. If the kernel has been
told to suspend the system, for example, it now need only pass through the
global device list, calling the suspend function found in the
device driver structure for each device. Similarly, if the user unplugs a
device, the kernel can call that device's remove function to let
the driver know.
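A minimal sketch of those two uses of the driver pointer follows. The suspend and remove callbacks are named in the text; everything else here (the array form of the global list, suspend_all(), device_unplug(), and the demo callbacks) is hypothetical. The sketch also assumes the list is walked from the end, so that children are suspended before the devices they hang off of, in keeping with the SCSI controller example earlier; the text does not say which direction the kernel uses.

```c
#include <assert.h>
#include <stddef.h>

struct device;

/* Simplified driver structure with the two callbacks discussed above. */
struct device_driver {
    const char *name;
    int (*suspend)(struct device *dev);
    int (*remove)(struct device *dev);
};

struct device {
    const char           *name;
    struct device_driver *driver;
    int                   suspend_order; /* demo only: 0 = still running */
};

/*
 * Walk the global list backwards, calling each driver's suspend function.
 * With a top-to-bottom depth-first list, this reaches children before
 * their parents. Stops and returns the error if any driver refuses.
 */
static int suspend_all(struct device **global_list, size_t count)
{
    assert(global_list != NULL || count == 0);
    for (size_t i = count; i-- > 0; ) {
        struct device *dev = global_list[i];
        if (dev->driver != NULL && dev->driver->suspend != NULL) {
            int err = dev->driver->suspend(dev);
            if (err != 0)
                return err;
        }
    }
    return 0;
}

/* Tell the driver its device has gone away, then drop the binding. */
static int device_unplug(struct device *dev)
{
    int err = 0;
    assert(dev != NULL);
    if (dev->driver != NULL && dev->driver->remove != NULL)
        err = dev->driver->remove(dev);
    dev->driver = NULL;
    return err;
}

/* Hypothetical callbacks that only record the order of the calls. */
static int demo_suspend_count;

static int demo_suspend(struct device *dev)
{
    dev->suspend_order = ++demo_suspend_count;
    return 0;
}

static int demo_remove(struct device *dev)
{
    (void)dev;
    return 0;
}
```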
The above is sufficient to handle the basic functions needed by power
management and to support hotpluggable devices. It also unifies much of
the device probing and accounting logic in the kernel, allowing the removal
of a great deal of duplicated code. The device model work
has not stopped there, however. One recent (2.5.32) addition is the notion
of device classes and interfaces. The "class" of a device is the basic
function that it performs - it could be an "input" or "storage" device, for
example. Not much is done with the class information currently, but the
structure is there for class-level drivers to affect how the device is
managed.
"Interfaces" are paths to the device from user space - normally entries in
/dev. Devices which implement a given interface can be expected
to respond in certain, well-defined ways. As with classes, about all that
is done with interfaces, for now, is to remember them. But that could
change.
This discussion, so far, has left out an important subsystem which, while
technically not part of the device model, is intimately tied in with it.
"driverfs" is a virtual filesystem which provides a userspace
representation of the driver model data structure. This filesystem,
normally mounted at /devices, contains (currently) three top-level
directories:
- root contains the entire device tree in the usual
hierarchical form. By digging around in /devices/root, users
(or code) can get a handle on how the system is put together.
Driverfs also makes it easy for devices to export tunable parameters
(much like those found in /proc/sys) which can be found - and
tweaked - in the device tree.
- class contains an entry for each device class
registered in the system. Further down, an entry for every device
which implements that class can be found (it's a symbolic link to the
entry in the /devices/root tree). There are also entries for
each interface registered with a class, and, again, a symbolic link
for every device implementing the interface.
- bus lists each bus type (not each physical bus) on
the system and the devices managed by each.
(See
this example /devices listing,
which corresponds to the system hierarchy shown above, to see how it all
goes together).
Some readers may be noting a certain similarity between driverfs and
devfs. They do resemble each other in that they are both kernel-generated
virtual filesystems which contain entries for the devices in the system.
They differ, however, in that driverfs is intended to be a physical
representation of the system, while devfs is intended to provide user-space
access to the devices themselves. A devfs user can mount
/dev/discs/disc0; somebody perusing driverfs can, with sufficient
typing pain, find the directory
/devices/root/pci0/00:0e.0/scsi0/0:0:0:0/0:0:0:0:p1, but there's
nothing there to mount. Instead, a bunch of information - including the
device's major and minor numbers - is available.
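The long path above is just the device hierarchy spelled out, one directory per device. A user-space sketch can show the correspondence by walking parent pointers; the bus_id field name and the device_path() helper are assumptions for this example and say nothing about how driverfs itself builds its directories.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Simplified device: a directory name plus a pointer to the parent. */
struct device {
    const char    *bus_id;   /* hypothetical: name of the directory entry */
    struct device *parent;
};

/*
 * Build "/devices/<ancestor>/.../<dev>" into buf by recursing up to the
 * top of the hierarchy first. Returns the path length, or -1 if the
 * buffer is too small.
 */
static int device_path(const struct device *dev, char *buf, size_t len)
{
    int used, n;

    assert(buf != NULL && len > 0);
    if (dev == NULL) {
        /* Past the top-level device: emit the mount point. */
        used = snprintf(buf, len, "/devices");
        if (used < 0 || (size_t)used >= len)
            return -1;
        return used;
    }
    used = device_path(dev->parent, buf, len);
    if (used < 0)
        return -1;
    n = snprintf(buf + used, len - (size_t)used, "/%s", dev->bus_id);
    if (n < 0 || (size_t)n >= len - (size_t)used)
        return -1;
    return used + n;
}
```

Given a chain of devices named root, pci0, 00:0e.0, scsi0, 0:0:0:0, and 0:0:0:0:p1, the helper reproduces the partition's path quoted above.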
So devfs and driverfs serve different purposes, but driverfs (with
/sbin/hotplug) could conceivably
supplant devfs in future kernels. While driverfs is not intended to be the
way users access devices, all the information needed to create
/dev nodes is (or can be) there. In the future, the /sbin/hotplug
script may be used to configure all devices as they are discovered in the
system; there is no reason why that script cannot use the driverfs
information (including class and interface information) to create
/dev nodes implementing whatever policy the system administrator
likes. The result would be a flexible device naming and administration
scheme which removes policy from the kernel code.
That all remains in the future, however; the device model and driverfs are
still works in progress. Most driver code does not yet interface with the
device model; thus far, there has been little need to change the drivers
themselves, since the PCI code has done the necessary device registration.
Full implementation of classes and interfaces, however, is likely to
require digging into the driver code, and that could take a little while.
It could yet happen for 2.6, however.