LWN.net Logo

Reworking disk events handling

By Jonathan Corbet
January 19, 2011
A look inside any contemporary desktop-oriented system is likely to reveal a process which is steadfastly polling removable drives on the off chance that somebody might have removed or inserted a disk. Indeed, as your editor can attest, it can be hard to turn that polling off; there's little room in the world for strange people who have their own ideas of what they want to happen when they put a disk into a drive. Be that as it may, if the system is going to poll drives, it would be nice to do so in the best way possible. That is not currently the case, but, thanks to a patch by Tejun Heo, drive polling should be better in 2.6.38.

There are a few problems with how polling is done on Linux; these were nicely outlined by Tejun in the patch changelog. Polling on Linux requires opening the device; this is a somewhat heavyweight operation which does not naturally line up with other operations which might wake the processor. Opening the device in this way might interfere with other users; optical disk burning, in particular, is susceptible to this kind of problem. And polling the disk in this way generates a different set of commands than Windows uses; as Linux driver developers have discovered many times, behavior that differs from Windows is not well tested by vendors and tends to have unpleasant bugs. All that notwithstanding, user-space polling works well enough most of the time, but it would still be nice to make it better.

Tejun's patch works by moving the polling into the kernel. That makes the polling more efficient by removing the need to open the device and by allowing the kernel to delay polling wakeups until something else is going on as well. There is a new function added to struct block_device_operations which should be implemented by drivers:

    unsigned int (*check_events) (struct gendisk *disk, unsigned int clearing);

This function should check the device for new events and return a mask of any which were found. Two events are currently defined: DISK_EVENT_MEDIA_CHANGE and DISK_EVENT_EJECT_REQUEST, the latter of which is new. The clearing parameter is a mask of events which should be cleared until they happen again.

The old media_changed() block device operation still exists, but its use has been deprecated; drivers should be updated to use check_events() instead. Drivers should also, before adding a block device, initialize two new struct gendisk fields:

    unsigned int events;
    unsigned int async_events;

A mask of all events which can be reported by the device should be stored in events, while async_events should list the events which can be reported without needing to poll the device.

A new sysctl knob (block.events_dfl_poll_msecs) tells the kernel how often it should (by default) poll block devices. A value of zero (the default, currently) disables polling entirely. Polling intervals for specific devices can be set in their sysfs directories. If a device says that it can report all events asynchronously, and no polling interval has been explicitly set for it, that device will not be polled at all.

Since user space is no longer polling the device with this scheme, it needs a new way to find out when a disk event has happened. These events are now signaled via a uevent, meaning they can be handled via udev or some other utility which is watching those events. Note that any driver which handles asynchronous event reporting must call kobject_uevent_env() itself to send the event to user space. No driver in 2.6.38-rc1 does that; the first developer to add such a call may want to add a helper function to the core block code for that purpose.

Since polling is disabled by default, the kernel will behave the way it always has and existing user space applications will work. Once the user space environments have been changed to take advantage of this feature, they can turn on kernel polling and stop opening the devices themselves. That should lead to better power consumption and more reliable operation, which can only be a good thing.


(Log in to post comments)

Reworking disk events handling

Posted Jan 23, 2011 2:36 UTC (Sun) by dankamongmen (subscriber, #35141) [Link]

superb; optical drive polling is why i force HAL off my machines (and subsequently lose access to the various .deb's which dep upon it).

Reworking disk events handling

Posted Mar 10, 2011 19:18 UTC (Thu) by riteshsarraf (subscriber, #11138) [Link]

Thanks to the dev. This will be a far better option than what HAL/udisks did.

Copyright © 2011, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds