The Driver Porting Series now includes several articles on how kobjects
work as a way of tieing together data structures and managing reference
counts. Experience shows, however, that truly envisioning how
kobject-linked data structures tie together is a difficult task. In the
hope of shedding a bit more light in this direction, and as a way for your
editor to exercise his minimal skills with the "dia" diagram editor, this
article will show how some of the crucial data structures in the block
layer are connected.
The core data structure in this investigation is the kobject. In the
diagrams that follow, kobjects will be represented by the small symbol you
see to the right. The upper rectangle represents the kobject's parent
field, while the other two are its entries in the doubly-linked list that
implements a kset. Not all kobjects belong to a kset, so those links will
often be empty.
At the root of the block subsystem hierarchy is a subsystem called
block_subsys; it is defined in drivers/block/genhd.c. As
you'll recall from The Zen of Kobjects, a
subsystem is a very simple structure, consisting of a semaphore and a
kset. The kset will define, in its ktype field, what type of
kobjects it will contain; for block_subsys, this field is set to
ktype_block. Pictorially, we can show this structure as seen on
Each kset contains its own kobject, and block_subsys is no
exception. In this case, the kobject's parent field is explicitly set to
NULL (indicated by the ground symbol in the picture). As a
result, this kobject will be represented in the top level of the sysfs
hierarchy; it is the kobject which lurks behind /sys/block.
A block subsystem is not very interesting without disks. In the block
hierarchy, disks are defined by a struct gendisk, which can be
found in <include/linux/genhd.h>. The gendisk interface is
described in this article. For our
purposes, we will represent a gendisk as seen on the left; note that it has
the inevitable embedded kobject inside it. A gendisk's kobject does not
have an explicit type pointer; its membership in the block_subsys
kset takes care of that. But its parent and kset
pointers both point to the kobject within block_subsys, and the
kset pointers are there too. The result, for a system with two disks,
would be a structure that looks like this:
Things do not end there, however; a gendisk structure is a complicated
thing. It contains, among other things, an array of partition entries (of
type struct hd_struct),
each of which has embedded within it, yes, a kobject. The parent of each
partition is the disk which contains it. It would have been possible to
implement the list of partitions as a kset, but things weren't done that
way. Partitions are a relatively static item, and their ordering matters,
so they were done as a simple array. We depict that array as seen on the
As you can see, the kobject type of a partition is ktype_part.
This type implements the attributes you will see in the sysfs
entries for each partition, including the starting block number and size.
Another item associated with each gendisk is its I/O request queue. The
queue, too, contains a kobject (of type queue_ktype) whose parent
is the associated gendisk. The I/O scheduler ("elevator") in use with an
I/O request queue is also represented in the hierarchy. The scheduler's
kobject's type depends on which scheduler is being used; the (default)
anticipatory scheduler uses as_ktype. The resulting piece of the
puzzle looks as portrayed on the left.
The request queue and I/O scheduler information in sysfs is currently
read-only. There is no reason, however, why sysfs attributes could not be
used to change I/O scheduling parameters on the fly. The selectable I/O scheduler patch uses sysfs
attributes to change I/O schedulers completely, for example.
Putting it all together
So far, we have seen a number of disconnected pieces. The full diagram can
be found on this page
is a bit wide to be placed inline with the text (a small, illegible
version appears to the right). Also on that page, you'll
find a corresponding diagram showing the sysfs names the correspond to each
The data structure as described is the full implementation of the
/sys/block subtree of sysfs. The full sysfs tree contains rather
more than this, of course. For each gendisk which shows up under
/sys/block, there will be a separate entry under
/sys/devices which describes the underlying hardware. Internally,
the link between the two is contained in the driverfs_dev field of
the gendisk structure. In sysfs, that link is represented as a symbolic
link between the two sub-trees.
Hopefully this series of pictures helps in the visualization of a portion
of the sysfs tree and the device model data structure that implements it.
The device model brings a great deal of apparent complexity, but, once the
underlying concepts are grasped, the whole thing is approachable.
to post comments)