Catching code which sleeps on the job

The kernel is full of code which is not allowed to sleep. Anything which is handling an interrupt or otherwise running out of process context, for example, should not try to go to sleep. This particular case is easy to catch in the scheduler, but others are not. For example, any code which is holding a spinlock cannot sleep either. Sleeping in this situation can lead to deadlocks (some other process spinning on the lock can prevent the holder from running again and releasing the lock), mutual exclusion failures (on uniprocessor systems where spinlocks are optimized out), or, at a minimum, excessive lock hold time and lock contention.

The problem is that it can be easy to sleep in the wrong places. Sleeps are often not done directly; instead, a piece of atomic code calls a function which calls some other function which sleeps. The "sleep tendency" of functions is not always documented, and, in any case, kernel hackers, being human, can make mistakes, even if it seems, at times, that they don't sleep.

Until recently, these mistakes have been hard to catch. There was no "I'm running in an atomic section" flag, and thus no way for the kernel to know that it was sleeping in a bad place until something went badly wrong. The preemptible kernel patch changed all that, however. Any place where the code cannot sleep is also certainly a bad place for that code to be preempted. So the functions which mark atomic sections (such as spinlock operations) now set a "don't preempt me" flag.
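
As a rough illustration of that flag, here is a minimal user-space sketch in C. It is not the preemptible kernel patch itself: the names toy_atomic_depth, toy_spin_lock(), toy_spin_unlock(), and toy_in_atomic() are all hypothetical, and the flag is modeled as a nesting counter on the assumption that atomic sections can nest.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
static int toy_atomic_depth;    /* nonzero means "don't preempt me" */

/* Entering an atomic section bumps the counter. */
static void toy_spin_lock(void)
{
    toy_atomic_depth++;
    /* ... a real implementation would acquire the lock here ... */
}

/* Leaving an atomic section drops the counter again. */
static void toy_spin_unlock(void)
{
    assert(toy_atomic_depth > 0);   /* every unlock must pair with a lock */
    /* ... a real implementation would release the lock here ... */
    toy_atomic_depth--;
}

/* True while any atomic section is still open. */
static bool toy_in_atomic(void)
{
    return toy_atomic_depth != 0;
}
```

A counter rather than a single bit means that releasing an inner lock does not clear the mark while an outer lock is still held.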

But once you have that flag, why not use it to detect sleeps in the wrong place? Andrew Morton posted a patch which does exactly that, and Linus merged it on the spot. The patch was titled "increase traffic on linux-kernel," and it has done exactly that. There are, it turns out, quite a few places where sleeping functions are called within code that is supposed to be atomic. These mistakes are being fixed almost as quickly as they are found. A small patch has done a lot to eliminate a whole class of kernel programming errors.
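
The idea can be sketched in a few lines of user-space C. This is only an illustration of the technique, not the merged patch: toy_might_sleep(), toy_alloc_blocking(), toy_helper(), and the counters are hypothetical names, and the "sleep" is just a comment. The buggy path reaches the sleeping function indirectly, through a helper, which is how these mistakes tend to hide.

```c
#include <stdio.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
static int toy_atomic_depth;    /* nonzero means "don't preempt me" */
static int toy_bad_sleeps;      /* sleeps caught inside atomic sections */

/* Called at the top of any function that may sleep. */
static void toy_might_sleep(const char *where)
{
    if (toy_atomic_depth != 0) {
        toy_bad_sleeps++;
        fprintf(stderr, "sleeping function called from atomic context: %s\n",
                where);
    }
}

/* A function that may block; it announces that before doing anything. */
static void toy_alloc_blocking(void)
{
    toy_might_sleep("toy_alloc_blocking");
    /* ... a real implementation could block here ... */
}

/* An innocent-looking helper that hides the sleep one call deeper. */
static void toy_helper(void)
{
    toy_alloc_blocking();
}

/* Bug: calls the helper while the atomic section is open. */
static void toy_buggy_path(void)
{
    toy_atomic_depth++;         /* stands in for taking a spinlock */
    toy_helper();
    toy_atomic_depth--;         /* stands in for releasing it */
}

/* Fix: do the blocking work before entering the atomic section. */
static void toy_correct_path(void)
{
    toy_helper();
    toy_atomic_depth++;
    toy_atomic_depth--;
}
```

Because the check sits in the sleeping function rather than in its callers, it fires no matter how many layers of calls separate the lock from the sleep.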

Copyright © 2002, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds