The Big Kernel Lock lives on
It was recently noted that
ioctl() system calls are still executed with the Big Kernel Lock
(BKL) held. A suggestion was made that drivers which can implement
ioctl() without the BKL held should be specially flagged as a way
of increasing parallelism. That suggestion looks like it will not get very
far. But it did pique your editor's interest in current use of the BKL.
Besides, there hasn't been a whole lot else going on this week.
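As a rough illustration of the flagging idea, here is a userspace sketch in C. Everything in it is hypothetical: the ioctl_no_bkl flag, the demo_file_ops structure, and the lock_kernel_sim()/unlock_kernel_sim() helpers are invented for the example and are not the kernel's interfaces, and a simple counter stands in for the BKL. It shows only the shape of the proposal: the dispatcher takes the big lock around a driver's ioctl() handler unless the driver has declared that it does not need it.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the BKL: a nesting counter, not a real lock. */
static int bkl_depth;

static void lock_kernel_sim(void)
{
    bkl_depth++;
}

static void unlock_kernel_sim(void)
{
    assert(bkl_depth > 0);
    bkl_depth--;
}

/* Hypothetical driver operations table. */
struct demo_file_ops {
    int (*ioctl)(unsigned int cmd, unsigned long arg);
    bool ioctl_no_bkl;   /* hypothetical flag: handler does its own locking */
};

/* Demo handler: reports whether the stand-in BKL was held when it ran. */
static int demo_ioctl(unsigned int cmd, unsigned long arg)
{
    (void)cmd;
    (void)arg;
    return bkl_depth > 0;
}

const struct demo_file_ops legacy_driver  = { demo_ioctl, false };
const struct demo_file_ops flagged_driver = { demo_ioctl, true };

/* Dispatcher: take the big lock unless the driver has opted out. */
int dispatch_ioctl(const struct demo_file_ops *ops,
                   unsigned int cmd, unsigned long arg)
{
    int ret;

    if (!ops->ioctl_no_bkl)
        lock_kernel_sim();
    ret = ops->ioctl(cmd, arg);
    if (!ops->ioctl_no_bkl)
        unlock_kernel_sim();
    return ret;
}
```

Calling dispatch_ioctl() on legacy_driver runs the handler with the stand-in lock held; calling it on flagged_driver runs the same handler without it.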
The BKL is an artifact from when the Linux kernel first supported multiprocessor systems. Making the kernel safe for concurrent access from multiple CPUs has been a multi-year task; it is not a job that could have been done all at once at the beginning. So Linux 2.0 supported SMP systems by way of the BKL, which only allowed one processor to be running kernel code at any given time. The BKL is essentially a spinlock, but with a couple of interesting properties:
The BKL made SMP Linux possible, but it didn't scale very well. Its overhead could be felt even with two processors, and it made running on anything larger problematic. So the kernel developers have been breaking the BKL into finer-grained locks ever since. Thus, for example, the block I/O subsystem went from the BKL to its own lock (io_request_lock) in 2.2, and from that to individual queue locks in 2.6. The kernel now has thousands of locks, and some people had assumed that the BKL would be gone by 2.6. As it turns out, there are still over 500 lock_kernel() calls in the 2.6.6 kernel. For the curious, here are some of the places which still rely on this old, system-wide lock:
Given how poorly the BKL is viewed, it may be surprising that so many places in the kernel still use it. The simple fact is that, with regard to the BKL, all of the low-hanging fruit has long since been taken. For most of the remaining calls, removing the BKL is not worth the trouble and code churn. So, while removal of the remaining calls over the 2.7 development series looks entirely possible, it would not be surprising if that does not happen.
The Big Kernel Lock lives on Posted May 27, 2004 4:21 UTC (Thu) by ncm (subscriber, #165) [Link] Would somebody please explain, briefly, how the BKL and its users interact with the (approximately) myriad other locks in the kernel? I.e. does the BKL only guard what is not guarded by any other lock? Might a driver need to take the BKL and another, finer-grained lock, before proceeding? Is there a natural order in which locks are taken?
The Big Kernel Lock lives on Posted May 27, 2004 11:24 UTC (Thu) by kunitz (guest, #3965) [Link] Alan Cox, I believe, emphasized: Locks protect data, not threads. As long as two threads don't access the same data, they are not required to share the same lock. Today most of the kernel data is protected by granular locks; however, there is still data protected by the big kernel lock. So finding all the users of the big kernel lock is the easy part; you must then find out which data is actually protected, and you must introduce granular locks to protect that data.
Even in the pre-SMP times you had to lock data against interrupt handlers. Linus simply disabled and enabled interrupts in the critical sections using the infamous cli()/sti() pairs. I believe the simplicity of that solution inspired the big kernel lock.
The Big Kernel Lock lives on Posted May 27, 2004 11:54 UTC (Thu) by corbet (editor, #1) [Link] The BKL is a special lock; its purpose still, essentially, is to protect resources not covered by some other lock. Modern code running under the BKL may well take other locks, but it will be unaware of it - the locks will be taken further down the call chain. Once the code itself becomes lock-aware, the need for the BKL should go away.
And yes, it is actually quite important to define the order in which locks are taken. If the same two locks can be taken in either order, the system will eventually deadlock. Lock ordering rules (and, in general, figuring out which locks you need) get to be a real problem as the number of locks grows; people like Larry McVoy have been warning for years that overly fine-grained locking leads to an unmaintainable kernel.
The Big Kernel Lock lives on Posted May 27, 2004 12:33 UTC (Thu) by brugolsky (subscriber, #28) [Link] I'm sure that you meant this, but just to clarify: fine-grained locks, in and of themselves, are not the problem. One can lock a list, or lock the individual elements; the choice generally impacts performance. Excessive lock depth (i.e., level of nesting) results in unmaintainable code. It seems to be generally agreed that the cliff lies not far beyond four locks.
The Big Kernel Lock lives on Posted May 27, 2004 23:43 UTC (Thu) by nix (subscriber, #2304) [Link] `Seven, plus or minus two'... and since we don't want to restrict kernel maintainership to those who are lucky enough to have big short-term memories, less than five seems a good point to stop.
The Big Kernel Lock lives on Posted Jun 2, 2004 22:30 UTC (Wed) by shane (subscriber, #3335) [Link] Not to speak for Mr. Corbet, but I'm pretty sure he actually was referring to having too many locks. The problem is deadlock: one thread holding lock A, waiting for lock B; the other holding lock B, waiting for lock A. This is the simplest example (well, holding A and waiting for A is simpler, but you get the idea). Any circular chain of references is possible, and causes the same problem. This problem is easier to hit when you use many different locks. A programmer's natural inclination is to lock each resource as it is needed. However, in order to prevent deadlock you should always lock in the same order. Which means that if any thread ever needs lock A and lock B, it always locks A and then locks B. This is not always optimal, as lock A may be held for a period of time when it is not needed.
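The ordering rule described above can be sketched in userspace C. This is an illustration under stated assumptions, not kernel code: toy_lock is a hypothetical spinlock built on the C11 atomic_flag type, and ordering locks by address is just one common convention for picking a single global order (the kernel's own ordering rules are documented per subsystem). The point is that lock_pair() takes the two locks in the same order no matter which way round the caller names them, so the A-then-B versus B-then-A deadlock cannot arise between callers of lock_pair().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy spinlock built on C11 atomic_flag; a stand-in for a kernel spinlock. */
typedef struct { atomic_flag held; } toy_lock;
#define TOY_LOCK_INIT { ATOMIC_FLAG_INIT }

bool toy_trylock(toy_lock *l)
{
    /* test-and-set returns the old value, so "false" means we took the lock. */
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

void toy_lock_acquire(toy_lock *l)
{
    while (!toy_trylock(l))
        ;   /* spin until the holder releases it */
}

void toy_unlock(toy_lock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* The ordering rule: of any two locks, the lower address is taken first. */
toy_lock *first_in_order(toy_lock *a, toy_lock *b)
{
    return (uintptr_t)a < (uintptr_t)b ? a : b;
}

/* Take both locks in the global order, whichever way the caller names them. */
void lock_pair(toy_lock *a, toy_lock *b)
{
    toy_lock *first = first_in_order(a, b);
    toy_lock *second = (first == a) ? b : a;

    assert(a != b);   /* taking the same spinlock twice would self-deadlock */
    toy_lock_acquire(first);
    toy_lock_acquire(second);
}

/* Release order does not matter for deadlock avoidance. */
void unlock_pair(toy_lock *a, toy_lock *b)
{
    toy_unlock(a);
    toy_unlock(b);
}

/* Two example locks, named after the A and B of the discussion. */
toy_lock lock_a = TOY_LOCK_INIT;
toy_lock lock_b = TOY_LOCK_INIT;
```

The cost noted in the comment still applies: a thread that needs the second lock first must nonetheless take the first one early and hold it for longer than it strictly needs to.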
The Big Kernel Lock lives on Posted May 27, 2004 17:54 UTC (Thu) by stuart2048 (subscriber, #6241) [Link] OK, so the BKL is a big ugly spin lock (or small and simple, depending on your perspective ;-). What about the thousands of smaller grained locks in the kernel (thousands -- really???). I'm curious how they are implemented.
The Big Kernel Lock lives on Posted May 28, 2004 0:13 UTC (Fri) by giraffedata (subscriber, #1954) [Link] They're the same kind of spin lock. But because they're small, and consequently not ugly, they are preferable. Small just means only a few things use each one.
There's no reason that one CPU shouldn't access a proc file while another CPU accesses a sound card. But today that can't happen because they both use the BKL. The proc file access uses it to serialize proc file accesses and the sound card uses it to serialize sound card accesses, and as a byproduct they also mutually exclude each other. The only reason they both use the BKL is programmer laziness. If we find the energy, we can make one lock for proc files and another for sound cards and remove the ugliness. (Actually, I'm sure we would go much finer grained than that). I guess I should admit that the BKL isn't really the same as the fine-grained locks because of the BKL's unique property that it gets automatically released across sleeps. It would be even uglier if it didn't do that.
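The proc-file and sound-card example can be modelled in a few lines of userspace C. This is only a sketch: toy_lock is a hypothetical spinlock built on C11 atomic_flag, big_lock plays the role of the BKL, and proc_lock and sound_lock are the invented per-subsystem replacements. The helper models one CPU being inside a proc file access while a second CPU tries to begin a sound card access, and reports whether the second could proceed immediately.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Toy spinlock built on C11 atomic_flag; a stand-in for a kernel spinlock. */
typedef struct { atomic_flag held; } toy_lock;
#define TOY_LOCK_INIT { ATOMIC_FLAG_INIT }

static bool toy_trylock(toy_lock *l)
{
    /* test-and-set returns the old value, so "false" means we took the lock. */
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

static void toy_unlock(toy_lock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* Coarse scheme: one lock shared by unrelated subsystems (the BKL's role). */
toy_lock big_lock = TOY_LOCK_INIT;

/* Fine-grained scheme: each subsystem's data gets its own lock. */
toy_lock proc_lock  = TOY_LOCK_INIT;
toy_lock sound_lock = TOY_LOCK_INIT;

/*
 * Model "CPU 0" being inside a proc file access while "CPU 1" tries to
 * start a sound card access. Returns true if CPU 1 could proceed at once,
 * false if it would have to spin waiting for CPU 0.
 */
bool sound_can_overlap_proc(toy_lock *proc_guard, toy_lock *sound_guard)
{
    bool proc_ok, sound_ok;

    proc_ok = toy_trylock(proc_guard);     /* CPU 0 enters the proc code */
    assert(proc_ok);
    (void)proc_ok;

    sound_ok = toy_trylock(sound_guard);   /* CPU 1 tries the sound code */
    if (sound_ok)
        toy_unlock(sound_guard);

    toy_unlock(proc_guard);                /* CPU 0 leaves the proc code */
    return sound_ok;
}
```

Passing big_lock for both guards reproduces the accidental mutual exclusion described above; passing proc_lock and sound_lock lets the two accesses overlap. The sketch does not model the BKL's release across sleeps.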
The Big Kernel Lock lives on Posted May 27, 2004 17:57 UTC (Thu) by iabervon (subscriber, #722) [Link] I'd guess that most of the uses of the BKL will go away for the purpose of removing the BKL code, or for the purpose of determining and documenting what data each area touches. What exactly does rpciod use that's protected by the BKL? Is that data still protected by the BKL? Will somebody know when redoing the locking wherever it is that rpciod has to be changed accordingly?
The Big Kernel Lock lives on Posted Jun 15, 2006 12:14 UTC (Thu) by shamalwinchurkar (guest, #38390) [Link] Would somebody please explain, briefly: are the write() and read() system calls of device drivers also called by the kernel while it holds the BKL?
Copyright © 2004, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds