Brief items
[...] announced by Linus on May 22. The most significant changes are certainly the scheduling domains patch, and, surprisingly, the full set of object-based reverse mapping patches, including the anon_vma work. This patch also includes a generic msleep() function for millisecond-scale waits, a CPU frequency control update, a set of autofs4 patches, a set of patches to shrink the heavily-used dentry structure, the "filtered wakeup" mechanism (see the May 5 Kernel Page), a libata update, some architecture updates, the removal of the Intermezzo filesystem due to lack of use and support, a sysctl variable giving "huge page" access to an administrator-specified group, the ability to re-enable interrupts while waiting in spin_lock_irqsave() (for all architectures now), support in reiserfs for quotas and extended attributes, the NUMA API, a big ramdisk fixup, and lots of fixes. See the long-format changelog for the details.
Linus's BitKeeper repository contains an implementation of separate interrupt stacks for the PPC64 architecture, an ALSA update, and a fair number of fixes.
The current tree from Andrew Morton is 2.6.6-mm5. Recent additions to -mm include a reworking of the symbolic link following code (allowing the eventual increase of the maximum symbolic link depth from five to eight), a new block I/O request barrier implementation (for IDE and SCSI), and the usual collection of fixes. Andrew has also quietly restored the 8KB stack option on x86 systems.
The current 2.4 prepatch is 2.4.27-pre3; no prepatches have been released since May 18.
Kernel development news
Linus defends the change in this way:
Most "implementation details" fit into rather less than 40 individual patches, do not involve difficult special cases (such as making all uses of mremap() work correctly), and avoid making significant changes to core parts of the virtual memory subsystem. That said, one should note that the core decision-making VM code has not been changed; the algorithm for choosing pages to move into and out of memory is the same as before. It is also notable that there have been almost no VM-related problem reports since 2.6.7-rc1 was released. This particular change may just work out in the short term after all.
A related topic is the 4G/4G patch, which separates kernel and user space entirely so that each can make full use of the 4G virtual address space on 32-bit systems. This patch has been considered for merging for some time, but has never quite found its way in. Most developers see it as an ugly hack (though, perhaps, a necessary one), and there is fear of the (possibly overstated) performance overhead that the 4G/4G mode imposes. Even so, some people wonder when this patch might be merged.
The answer seems to be "never, if at all possible." The motivations behind this patch are (1) to make more kernel-space low memory available on large-memory systems, and (2) to provide a larger virtual address space for applications. The first reason may well have just become moot; the anon_vma patch was merged because, among other things, it significantly reduces the amount of low memory used by the VM subsystem. The initial reports suggest that the current VM code handles 32GB of memory nicely on 32-bit systems. Since 32-bit systems rarely come more heavily loaded than that (so far), it is thought that the VM has gotten as good as it needs to be on those systems.
The real hope, however, is that a serious transition to 64-bit systems will happen before too long. The x86 architecture has been stretched much further than anybody would have expected it to go, and x86_64 makes the transition so easy that there is very little reason not to do it. The 4G/4G patch is likely to hang around (and be included by some distributors) for some time; if nothing else, all of the currently-deployed monster x86 systems are likely to go on running for a while yet. But the mainline kernel may just get away with saying "switch to 64-bit" and leaving that particular patch out.
[...] noted that ioctl() system calls are still executed with the Big Kernel Lock (BKL) held. A suggestion was made that drivers which can implement ioctl() without the BKL held should be specially flagged as a way of increasing parallelism. That suggestion looks like it will not get very far. But it did pique your editor's interest in current use of the BKL. Besides, there hasn't been a whole lot else going on this week.
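The flagging idea can be pictured with a small user-space sketch. Everything here is hypothetical: the structure, the flag name ioctl_no_big_lock, and the dispatch function are invented for illustration, and a pthread mutex stands in for the BKL. It is not the interface that was proposed, only a picture of the general shape of such a scheme.

```c
#include <assert.h>
#include <pthread.h>

/* Stand-in for the Big Kernel Lock; a plain mutex is enough for this sketch. */
static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER;
static int big_lock_acquisitions;   /* how often the dispatcher took the lock */

/* Hypothetical driver descriptor; the flag name is invented for illustration. */
struct demo_driver {
    int (*ioctl)(unsigned int cmd, unsigned long arg);
    int ioctl_no_big_lock;          /* nonzero: driver does its own locking */
};

/* Call the driver's ioctl handler, taking the big lock unless flagged. */
static int demo_dispatch_ioctl(const struct demo_driver *drv,
                               unsigned int cmd, unsigned long arg)
{
    int ret;

    assert(drv != NULL && drv->ioctl != NULL);

    if (drv->ioctl_no_big_lock)
        return drv->ioctl(cmd, arg);    /* flagged: no global serialization */

    pthread_mutex_lock(&big_lock);
    big_lock_acquisitions++;
    ret = drv->ioctl(cmd, arg);
    pthread_mutex_unlock(&big_lock);
    return ret;
}

/* Trivial handler used to exercise the dispatcher: returns the command. */
static int demo_echo_ioctl(unsigned int cmd, unsigned long arg)
{
    (void)arg;
    return (int)cmd;
}
```

An unflagged driver is serialized against every other user of the big lock; a flagged one runs in parallel and must protect its own data.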
The BKL is an artifact from when the Linux kernel first supported multiprocessor systems. Making the kernel safe for concurrent access from multiple CPUs has been a multi-year task; it is not a job that could have been done all at once at the beginning. So Linux 2.0 supported SMP systems by way of the BKL, which only allowed one processor to be running kernel code at any given time. The BKL is essentially a spinlock, but with a couple of interesting properties:
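Two properties commonly attributed to the BKL are that a thread already holding it may take it again, and that it is dropped when the holder sleeps and reacquired on wakeup. The following is a minimal user-space sketch of that behavior, not kernel code: the demo_ names are invented, a pthread mutex stands in for the spinlock, and a per-thread counter plays the role of a per-task lock depth.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t demo_bkl = PTHREAD_MUTEX_INITIALIZER;

/* Per-thread nesting count; 0 means this thread does not hold the lock. */
static _Thread_local int demo_lock_depth;

/* Recursive acquire: only the outermost call touches the real lock. */
static void demo_lock_kernel(void)
{
    if (demo_lock_depth++ == 0)
        pthread_mutex_lock(&demo_bkl);
}

/* Recursive release: only the outermost call drops the real lock. */
static void demo_unlock_kernel(void)
{
    assert(demo_lock_depth > 0);
    if (--demo_lock_depth == 0)
        pthread_mutex_unlock(&demo_bkl);
}

/*
 * Model of sleeping while holding the lock: release it completely,
 * run the "sleep", then reacquire and restore the nesting depth.
 */
static void demo_sleep_with_lock(void (*do_sleep)(void))
{
    int saved = demo_lock_depth;

    if (saved > 0) {
        demo_lock_depth = 0;
        pthread_mutex_unlock(&demo_bkl);
    }
    do_sleep();
    if (saved > 0) {
        pthread_mutex_lock(&demo_bkl);
        demo_lock_depth = saved;
    }
}

/* Helper for experiments: records the depth observed during the "sleep". */
static int demo_depth_seen_asleep = -1;

static void demo_record_depth(void)
{
    demo_depth_seen_asleep = demo_lock_depth;
}
```

Because the depth is restored after the sleep, callers need not know whether they slept; the cost is that anything protected only by this lock may have changed in the meantime.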
The BKL made SMP Linux possible, but it didn't scale very well. Its overhead could be felt even with two processors, and it made running on anything larger problematic. So the kernel developers have been breaking the BKL into finer-grained locks ever since. Thus, for example, the block I/O subsystem went from the BKL to its own lock (io_request_lock) in 2.2, and from that to individual queue locks in 2.6. The kernel now has thousands of locks, and some people had assumed that the BKL would be gone by 2.6.
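The move from one subsystem-wide lock to per-queue locks can be illustrated in user space. This is only a sketch under assumptions: struct demo_queue and its functions are hypothetical and bear no relation to the real block-layer structures; the point is simply that work on one queue no longer excludes work on another.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical queue carrying its own lock instead of sharing a global one. */
struct demo_queue {
    pthread_mutex_t lock;   /* protects only this queue's fields */
    int pending;
};

static void demo_queue_init(struct demo_queue *q)
{
    pthread_mutex_init(&q->lock, NULL);
    q->pending = 0;
}

/* Add one request; contends only with other users of the same queue. */
static void demo_queue_add(struct demo_queue *q)
{
    pthread_mutex_lock(&q->lock);
    q->pending++;
    pthread_mutex_unlock(&q->lock);
}

/* Try to enter the queue's critical section; returns 1 if the lock was free. */
static int demo_queue_try_enter(struct demo_queue *q)
{
    return pthread_mutex_trylock(&q->lock) == 0;
}

static void demo_queue_leave(struct demo_queue *q)
{
    pthread_mutex_unlock(&q->lock);
}
```

With a single global lock, holding it for one queue would make every other queue wait; here two queues can be locked at the same moment, and only a second attempt on the same queue is refused.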
As it turns out, there are still over 500 lock_kernel() calls in the 2.6.6 kernel. For the curious, here are some of the places which still rely on this old, system-wide lock:
Given how poorly the BKL is viewed, it may be surprising that so many places in the kernel still use it. The simple fact is that, with regard to the BKL, all of the low-hanging fruit has long since been taken. For most of the remaining calls, removing the BKL is not worth the trouble and code churn. So, while removal of the remaining calls over the 2.7 development series looks entirely possible, it would not be surprising if that does not happen.
[...] resigned from the project on May 5. Many of his packages have been picked up by others or have gone into the orphan state, but the kernel packages are important enough to require more careful handling.
The actual process of selecting the new kernel maintainer would appear to have been done in private; we were not able to get an answer from the Debian leader about just how it was done. The results have now been made public, however. The Debian kernel will now be maintained by a team, with William Lee Irwin and Al Viro at the core. Additional helpers include Troy Benjegerdes, Dann Frazier, Goto Masanori, Christoph Hellwig, Benjamin Herrenschmidt, Anton Blanchard, and Arjan van de Ven.
In other words, Debian will now have a set of kernel packages maintained by active kernel developers. This should help to improve the quality of Debian's kernels (though, it should be said, complaints about Mr. Xu's kernels were rare) and to improve the feedback from Debian into the kernel development process. Mr. Irwin's plans include "aggressive mainline tracking" and, eventually, a unified source package for all architectures supported by Debian. Expect some interesting things from the Debian kernel in the near future.
Page editor: Jonathan Corbet
Copyright © 2004, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds