LWN Weekly Edition
Kernel development

Kernel release status

The current development kernel is 2.6.0-test1, which was released by Linus on July 13. As is appropriate in this stage of development, this patch consists (almost) entirely of fixes. See the long-format changelog for the details.

The last of the 2.5 kernels was 2.5.75, released on July 10. This patch merged the anticipatory I/O scheduler (covered here last January), a new set of "kblockd" kernel threads (designed to handle block I/O operations without creating more such operations themselves), a scary new "nointegrity" JFS mount option, some software suspend tweaks, and, of course, lots of fixes and updates. See the long-format changelog for more.

Linus's BitKeeper tree contains a handful of small fixes, as of this writing.

Alan Cox has gotten back into the 2.6 prepatch business; his latest is 2.6.0-test1-ac2. This patch is made up almost entirely of fixes which have not yet made their way to Linus.

Andrew Morton's 2.6.0-test1-mm1 is a much more bleeding-edge affair; it contains the latest ACPI code, the SELinux security module, a bunch of asynchronous I/O work, the 64-bit dev_t type, and much other stuff. The -mm tree is also where the bulk of the scheduler interactivity work is being done.

The current stable kernel is 2.4.21. The 2.4.22 process continues to move relatively quickly; 2.4.22-pre6 (consisting almost entirely of fixes) was released on July 14.
Kernel development news

The 2.6 test series begins

On July 13, Linus began the 2.6.0-test series of development kernels. The move to the -test naming scheme indicates that the 2.5 development period is truly done, and that the focus is now strongly on stabilization. To that end, the -test1 release restricted itself to fixes and updates - except for the addition of Andries Brouwer's cryptoloop driver.

This sort of announcement usually results in a flurry of "but X hasn't been merged yet" postings. Things are much quieter this time around. It would seem that the features that the developers want to see in the kernel are, for the most part, in place. There are a few remaining loose ends, however:
In the past, Linus has not always been successful in making this kind of freeze stick. This time around, however, Andrew Morton will be involved in the stabilization process. Since Andrew will also be maintaining the resulting 2.6 kernel, he'll have a strong incentive to keep a lid on things during the test phase. Now, of course, is the time for people with an interest in 2.6 to try out the -test releases. Before trying out a 2.6-test kernel for the first time, however, a reading of Dave Jones's "what to expect" document is highly recommended (Joe Pranevich's Wonderful World of Linux 2.6 is also worth a look). Also note that putting a 2.6-test kernel on a production system is a risky thing to do; there are still known bugs and security issues to be dealt with.
64GB on 32-bit systems

Once upon a time - not that long ago - the Linux kernel was unable to work with more than 1GB of physical memory (actually, just a little bit less). This limit was imposed by a couple of fundamental design decisions in the kernel:
The 3/1 split was not imposed by any particular external factor; instead, it was a compromise chosen to balance two limits. The portion of the address space given over to user addresses limits the maximum size of any individual process on the system, while the kernel's portion limits the maximum amount of physical memory which can be supported. Allowing the kernel to address more memory would reduce the maximum size of every process in the system, to the chagrin of Lisp programmers and Mozilla users worldwide. There were, however, patches in circulation to change the address space split for specific needs.

The 2.3 development series added the concept of "high memory," which is not directly addressable by the kernel. High memory complicated kernel programming a bit - kernel code cannot access an arbitrary page in the system without setting up an explicit page-table mapping first. But the payoff that comes with high memory is that much larger amounts of physical memory can now be supported. Multi-gigabyte Linux systems are now common.

High memory has not solved the problem entirely, however. The kernel is still limited to 1GB of directly-addressable low memory. Any kernel data structure which is frequently accessed must live in low memory, or system performance will be hurt. Increasingly, low memory is becoming the new limiting factor on system scalability.

Consider, for example, the system memory map, which consists of a struct page structure for every page of physical memory in the system. The memory map is a fundamental kernel data structure which must be placed in low memory. It takes up 40 bytes for every (4096-byte) page in the system; that overhead may seem small until you consider that, if you want to put 64GB of memory into an x86 box, the memory map will grow to some 640 megabytes. This structure thus takes most of low memory by itself. Low memory must also be used for every other important data structure, free memory, and the kernel code itself.
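The memory map arithmetic above is easy to check. The following is a minimal sketch in Python, using only the two figures quoted in the text (40 bytes per struct page, 4096-byte pages); the function name `mem_map_bytes` and the constant names are illustrative and do not come from the kernel source.

```python
# Figures taken from the text; names are illustrative, not kernel identifiers.
PAGE_SIZE = 4096        # bytes per physical page
STRUCT_PAGE_SIZE = 40   # bytes of memory map per page
GIB = 1024 ** 3
MIB = 1024 ** 2


def mem_map_bytes(physical_bytes, page_size=PAGE_SIZE, entry_size=STRUCT_PAGE_SIZE):
    """Return the memory map size for a given amount of physical memory."""
    pages = physical_bytes // page_size
    return pages * entry_size


if __name__ == "__main__":
    # Show how the memory map grows with installed memory.
    for gib in (1, 4, 16, 64):
        size = mem_map_bytes(gib * GIB)
        print(f"{gib:>2} GB of RAM -> {size // MIB} MB of memory map")
```

The overhead is a bit under 1% of physical memory at any size, but because all of it has to fit into low memory, it is the absolute figure (640MB at 64GB) that matters.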
For a 64GB system, 1GB of low memory is insufficient to even allow the system to boot, much less do the sort of serious processing that such machines are bought for.

One approach to solving this problem is page clustering - grouping physical pages into larger virtual pages. Among other things, this technique reduces the size of the memory map. Page clustering was covered here back in February.

Recently, Ingo Molnar posted a patch which takes a very different approach. Rather than try to squeeze more into 1GB of low memory, Ingo's patch makes low memory bigger. This is done by creating separate page tables to be used by user-space and kernel code, eliminating the need to split the virtual address space between the two realms. With this patch, a user-space process has a page table which gives it access to (almost) the full 4GB virtual address space. When the system goes into kernel mode (via a system call or interrupt), it switches over to the kernel page tables. Since none of the kernel page table space must be given to user processes, the kernel, too, can use the full 4GB address space. The maximum amount of addressable low memory thus quadruples.

There are, of course, costs to this approach, or it would have been adopted a long time ago. The biggest problem is that the processor's translation lookaside buffer (the TLB, a hardware cache which stores the results of page table lookups) must be flushed when the page tables are changed. Flushing the TLB hurts because subsequent memory accesses will be slowed by the need to do a full, multi-level page table lookup. And, as it turns out, the TLB flush is, itself, a slow operation on x86 processors. The additional overhead is enough to cause a significant slowdown, especially for certain kinds of loads. The cost imposed by the separated page tables is more than most users will want to pay.
For those who have applications requiring large amounts of memory - and who, for whatever reason, cannot just get a 64-bit system - this patch may well be the piece that makes everything work. Of course, the chances of such a patch getting into the mainline kernel before 2.7 are about zero. But it would not be surprising to see it show up in certain vendors' distributions as an option.
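To see why quadrupling low memory matters on a 64GB machine, here is a rough budget, sketched in Python. It assumes idealized round figures: exactly 1GB of kernel address space under the 3/1 split and a full 4GB with separate page tables. The text notes that the real numbers are slightly smaller in both cases, so these are upper bounds. The names `KERNEL_SPACE` and `low_memory_left` are hypothetical and used only for illustration.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2
PAGE_SIZE = 4096        # bytes per page, from the text
STRUCT_PAGE_SIZE = 40   # bytes of memory map per page, from the text

# Assumption: idealized kernel address-space sizes with no reserved regions.
KERNEL_SPACE = {
    "3/1 split": 1 * GIB,
    "separate page tables": 4 * GIB,
}


def low_memory_left(physical_bytes, kernel_space_bytes):
    """Low memory remaining once the memory map has been placed in it."""
    mem_map = (physical_bytes // PAGE_SIZE) * STRUCT_PAGE_SIZE
    # Low memory cannot exceed either installed RAM or the kernel's address space.
    low_memory = min(physical_bytes, kernel_space_bytes)
    return low_memory - mem_map


if __name__ == "__main__":
    for name, space in KERNEL_SPACE.items():
        left = low_memory_left(64 * GIB, space)
        print(f"{name}: {left // MIB} MB of low memory left after the memory map")
```

Under these assumptions, the 3/1 split leaves well under half of low memory for everything else, while the separated page tables leave several gigabytes. The sketch says nothing about the TLB flush cost, which is the price paid for that extra room.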
Bug trackers and kernel development

The Kernel Bug Tracker ("bugme") is a BugZilla system run by the Open Source Development Lab. It currently holds information on over 300 reported bugs in the 2.5 kernel. The Tracker is seen by many as a useful tool that brings some organization and discipline to the task of stabilizing the kernel. So it came as a surprise to many when David Miller, maintainer of the networking subsystem, requested that networking bugs not be entered into the Tracker. It is, he says, the wrong way of solving the problem.

The complaint with bug tracking systems is that they try to centralize what is otherwise an inherently distributed process. Bugs accumulate in the database, and a single person gets the job of managing all the bugs for a particular subsystem. If that person does not devote a significant amount of time to the task, the tracking system quickly clogs up with outdated reports, duplicated entries, and generally useless stuff. The time that goes into maintaining the bug tracker is, of course, time that is not available to actually fix the bugs.

The proper way of dealing with bugs, according to David, is to simply report them to the relevant mailing list. The report will be seen by the developers who can fix the bug, others who have been affected by the bug can contribute additional information, and fixes can be publicly discussed. And people who, for whatever reason, do not want to deal with a particular bug report can simply hit "delete" and the message goes away.

Of course, the "goes away" part is not always popular with those who report bugs; they would rather see the report hang around and annoy people until one of them deals with the problem. But anybody who has sent a few bug reports to a public list knows that those reports can simply vanish without a trace - a rather unsatisfying result. Why bother to report bugs if the reports can simply be ignored?
According to David (and others), the lossy nature of mailing list bug reporting is actually a feature. Bug reporting, it is said, is a process similar to patch submission. Users who do not get satisfaction from a bug report should resubmit it. If the bug is not important enough for the user to "maintain" the report, it's not worth a whole lot of effort to fix.

The "submit and retry" approach does have some advantages. Since it puts more of the responsibility for bug reports on the users submitting those reports, it scales more reliably as the number of users increases. Unimportant or "operator error" bugs vanish automatically without anybody having to shovel them out of a bug tracking system. Bugs which are fixed by (seemingly) unrelated patches also fade away automatically. The whole thing works in a scalable way without the need for central managers.

This approach is foreign and scary, however, to those who feel the need to track every bug and keep a firm hand on the development process. It provides FUD fodder for those who would portray free software development as immature and untrustworthy. It's also frustrating to those who want to retain bug report information for statistical or data mining purposes. It is, however, typical of how the kernel development process works in general. And that process, for all its faults, has produced excellent results over the years as the kernel (and its development team) has grown.
Page editor: Jonathan Corbet
Copyright © 2003, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds