|
|
Subscribe / Log in / New account

Toward a safer sysfs

The story of sysfs (and the device model in general) is a long and complicated one. The creation of a single data structure to represent the system's hardware and software configuration was long overdue; many tasks (power management, for example) cannot be done properly without it. Sysfs adds value to that structure by representing it to user space. This structure is useful in many ways, but it has also brought its share of hassles. Exposing kernel data structures to user space makes it hard to change those structures without breaking the user-space API; it also exposes every one of them to user-space initiated lifecycle problems.

Internally, the core building block for the device model is the kobject. Objects represented in sysfs - devices, for example - each contain a kobject which, among other things, is the focal point for sysfs access. The kobject also contains a reference count for the containing object which is used to manage its lifecycle. A given kobject and its containing data structure can be deleted when the reference count goes to zero - and not before. Reference counting works, but it can lead to surprises.

As an example, consider a USB device - a mouse, say. When this device is plugged into the system, a suitable device structure (containing a kobject) is created and registered with the kernel. When the mouse is unplugged, that structure is released. But imagine what happens if a user-space process opens a sysfs file associated with the mouse device while it is present, and keeps that file open long after the physical device goes away. The kernel must be able to handle operations on that open sysfs file, even though the driver thinks that the device it represents is long gone. The reference counting in the kobject makes this work - most of the time. The potential for confusion is high, though, especially with drivers which have not been written with this sort of lifecycle management in mind.

Back at the end of March, Tejun Heo posted a discussion of device model lifecycle issues which points out this problem and a few others. His argument is that the need to manage objects with different lifecycles makes programming with the device model hard - something developers have known for some time. Even the core device model maintainers will admit that it's easy to get things wrong.

More recently, Tejun has followed up with a patch set which attempts to simplify the situation. There is a great deal of cleanup work in these patches, and one small API change, but the core change is this: it enables a clean separation of the lifecycles of sysfs objects and the underlying data structures they represent. As a result, it is no longer necessary for code outside of sysfs to be concerned about the fact its data structures may have a shorter life than the sysfs objects representing those structures.

A sysfs directory (which represents a kobject) is represented within the kernel by struct sysfs_dirent. In current kernels, if the sysfs_dirent structure exists, its underlying kobject is expected to exist as well. It is not possible for the kobject to go away as long as the sysfs_dirent structure exists; that means that the structure containing the kobject must continue to exist as long as any references to the sysfs files exist. Tejun's patch works by eliminating that requirement.

In the modified sysfs, each sysfs_dirent contains a new counter called s_active. This counter tracks the number of active references to the object; these references are the ones which involve the associated kobject at the current moment. A user-space process which is holding a sysfs file open will not increase the s_active count until it performs an actual operation on that file, and the reference remains only for as long as it takes to complete the operation. Since most sysfs operations are quite fast, active references will not normally be held for long.

The active count, as it happens, is maintained with an rwsem - a reader/writer semaphore. Active references are tracked as readers, so there can be any number of them outstanding at a given time. The code to obtain an active reference works with a call to down_read_trylock(), meaning that it will take a "lock" (a reference) if one is available, but it will not block if the operation fails. All of the relevant sysfs operations have been changed to obtain active references before referencing the kobject - and they make sure that the reference was granted. If an attempt to obtain an active reference fails, sysfs fails the higher-level operation with -ENODEV.

The only way down_read_trylock() will fail is if another thread holds a writer lock on the semaphore - or is in the process of waiting for the readers to get out of the way so it can get that lock. Should something happen which causes the underlying kobject to go away, the cleanup code will call down_write() on the s_active rwsem in the sysfs_dirent entry, thus taking a writer lock. This call will cause any future attempts to obtain an active reference to fail; it will also block until all currently-existing active references are released.

The end result of all this is that, once the final kobject_put() call has completed for a given kobject, there will be no further attempts to access that kobject from sysfs. The kobject (and its containing data structure) can be safely deleted, and the driver need worry no more about it.

As an added bonus, there is no longer any need to increase module reference counts when sysfs attributes are being accessed. A driver which is being unloaded will release all of its devices, meaning that sysfs will no longer make any calls into the driver module anyway; the module reference count becomes superfluous. So Tejun's patch removes the owner field from attribute structures - a change which ripples through a significant amount of driver code.

There have been some comments on how the patches are implemented, but no disagreement with the ultimate goal; these changes could go in as soon as 2.6.22. Tejun would appear to have more improvements in mind, but, even with no further changes, the current patches go a long way toward making sysfs safer and easier to work with.

Index entries for this article
KernelDevice model
KernelSysfs


to post comments

Toward a safer sysfs

Posted Apr 12, 2007 2:58 UTC (Thu) by flewellyn (subscriber, #5047) [Link] (2 responses)

Okay, so what DOES happen when you try to read from, or write to, a sysfs file for which the
device no longer exists?

I mean, obviously this new code will prevent the system from causing weasels to erupt from your
nostrils or whatever other catastrophic event might occur, but surely there should be some
defined behavior here? Something along the lines of "device no longer exists, go away"?

Toward a safer sysfs

Posted Apr 12, 2007 3:09 UTC (Thu) by corbet (editor, #1) [Link]

Exactly that - it returns ENODEV.

Toward a safer sysfs

Posted Apr 19, 2007 18:55 UTC (Thu) by jd (guest, #26381) [Link]

Does that mean a device driver to make weasels erupt from a user's
nostrils will be broken? Can we add a special API for such drivers, say
next April 1st?


Copyright © 2007, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds