LWN Weekly Edition
Kernel development

Kernel release status

The current development kernel is 2.5.52, which was released by Linus on December 15. It consists mostly of fixes and updates, of course, but there's also a bunch of changes from Andrew Morton's "-mm" tree (including the long-term fix for the ext3 data=journal corruption bug), XFS and JFS updates, more module fixes, and a kconfig update. See the long-format changelog for the gory details.

The current stable kernel is 2.4.20; Marcelo released the second 2.4.21 prepatch on December 18. This large patch is mostly made up of ia-64 updates, but it also includes some NFS fixes, a couple of ext3 fixes, a bunch of stuff from the "-ac" tree, a new megaraid driver, and various other fixes and updates.

For those using very stable kernels: Alan Cox has announced the first 2.2.24 release candidate. It contains a handful of bug fixes, including one for a new denial of service vulnerability caused when somebody runs mmap() on a /proc/pid/mem file.
Kernel development news

How to speed up system calls

It all started with an observation that system calls on a modern Pentium 4 processor are far slower than on older CPUs. It seems that, for whatever reason, software interrupts generated with the int instruction are very slow with the P4 processor. Since x86 Linux invokes system calls with "int $0x80", that slowness makes itself felt - especially with system calls (like getpid()) that would, otherwise, be very fast.

There is an obvious solution to this problem: use the sysenter instruction instead. sysenter is quite a bit faster on modern Pentium processors. There are just a couple of problems: not all x86 processors support sysenter, and sysenter steps on registers in ways that can be hard to work around.

The lack of across-the-board support for sysenter is a problem. The kernel maintains a set of flags telling it what capabilities a given processor has; other processor-specific options are set at configuration time. System calls, however, are not invoked from the kernel - that is the C library's job. The last thing glibc needs is to be trying to figure out, at run time, the right way to invoke system calls.

Linus's solution to this problem is a patch which brings back a variant of an old idea. As of 2.5.53, the kernel will map a global, read-only page at the top of every process's address space. That page contains the optimal code for executing a system call on the current processor. Whenever glibc needs to call into the system, it simply sets up the registers and, rather than doing the old int $0x80, it jumps into the new page. The C library still needs to do a runtime test (since older kernels will lack this "vsyscall" page), but it need not concern itself with the detailed capabilities of different processors.

Keeping the registers straight turned out to be a trickier problem. The way sysenter steps on registers makes it hard to invoke system calls with more than five parameters.
Various schemes were looked at, including creating a new "extra argument block" or simply requiring that six-argument system calls be invoked the old way. Linus finally came up with a tricky solution that makes it all work, however; those of you who like digging through x86 assembly may want to peek at his "absolutely wonderfully disgusting solution" to the problem. "I'm a disgusting pig, and proud of it to boot."

The result of all this: the gettimeofday() system call runs in just over half the time on a P4 processor. The speedup on Pentium 3's is less - a factor of 1.2 - but is still worthwhile.

Now that the vsyscall page is in place, will it be used for other things, such as implementing gettimeofday() entirely in user space? The answer, for now, appears to be "no". Getting a user-space gettimeofday() right is, seemingly, harder than it looks; there are synchronization issues, especially on some SMP systems where the clocks may not be synchronized by the hardware. So a user-space gettimeofday() appears to not be in the works, for now at least.
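
For readers who want a rough feel for the per-call cost discussed above, here is a minimal C sketch of a micro-benchmark. It is not taken from Linus's patch or from any posted benchmark; the helper name avg_syscall_ns() is made up for this illustration, and the code assumes a Linux system whose C library provides syscall() and clock_gettime(). It goes through syscall() directly so that the request really reaches the kernel, and it measures whatever entry mechanism the C library happens to use on the machine at hand; by itself it cannot tell int $0x80 from sysenter.

```c
/* Illustrative sketch only: rough cost of a trivial system call. */
#define _GNU_SOURCE
#include <time.h>
#include <unistd.h>
#include <sys/syscall.h>

/*
 * Hypothetical helper (not from the kernel or glibc): returns the
 * average number of nanoseconds taken by one getpid system call,
 * measured over `iterations` back-to-back calls.
 */
static double avg_syscall_ns(long iterations)
{
    struct timespec start, end;

    if (iterations <= 0)
        return 0.0;

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (long i = 0; i < iterations; i++)
        syscall(SYS_getpid);    /* almost no work is done in the kernel */
    clock_gettime(CLOCK_MONOTONIC, &end);

    /* Convert the elapsed interval to nanoseconds, then average it. */
    double elapsed_ns = (end.tv_sec - start.tv_sec) * 1e9
                      + (end.tv_nsec - start.tv_nsec);
    return elapsed_ns / (double)iterations;
}
```

Calling avg_syscall_ns(1000000) from a small main() and printing the result on two kernels (or two processors) gives a crude comparison. The figure includes loop and library overhead, so only the difference between runs means much.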
Whatever happened to the feature freeze?

While most people seem to think that the new system call mechanism makes sense, the question has come up: what kind of feature freeze are we in if we're adding things like a whole new way of doing system calls? Alan Cox, perhaps, had the most direct comment:
Linus. you are doing the slow slide into a second round of
development work again, just like mid 2.3, just like 1.3.60, ...
Given the high hopes that have been placed on this feature freeze actually working, this sort of remark is something to be concerned about. Linus has acknowledged the concern, and started a discussion on how patches should be reviewed. Looking ahead:
I thought about the code freeze require buy-in from three of four
people (me, Alan, Dave and Andrew come to mind) for a patch to go
in, but that's probably too draconian for now. Or is it (maybe
start with "needs approval by two" and switch it to three when
going into code freeze)?
There seems to be fairly widespread agreement, however, that this approach could be overly bureaucratic for now. Each development kernel release still contains hundreds of patches (636 for 2.5.51; in 2.5.52 there were "only" 153); people are understandably nervous about having that many patches go through a committee. Or even worse, being on the committee.

Of course, Larry McVoy has an elaborate approach involving BitKeeper all planned out, but, given that a couple of people on the short list don't use BitKeeper, things will probably not go that way.

Andrew Morton has suggested simply adopting a set of guidelines for what can be accepted. The suggested list:
Anything outside of that list would not be included at this point. As the freeze gets harder, items are dropped off the list, until only bug fixes are left. Given everybody's time constraints, the relatively informal approach is the most likely one to be adopted at this point. The important thing, in the end, is that everybody agrees that the feature freeze is important and is keeping an eye out for violations. As long as that continues, things will hopefully not get too far out of control.
Supporting hardware crypto in the kernel

Now that the kernel has its own cryptographic API, James Morris is thinking about how to support cryptographic hardware. A number of cards which perform cryptographic functions exist, and it would be nice to be able to make full use of these cards with a Linux system. Quite a few issues need to be considered on the way there, however, including:
And so on. Now is the time to get these decisions right; anybody who is interested in the interface to cryptographic hardware should probably have a look at James's posting and join the discussion.
Elks Distribution Edition 0.0.5 released

Don't throw away that old 80286 system yet - with the just-announced release of EDE (Elks Distribution Edition) 0.0.5, that system, too, can run Linux. EDE comes with a bleeding-edge 0.1.1 kernel and a new elkscmd package; see the announcement for the details. (Thanks to Alan Cox).
Page editor: Jonathan Corbet
Copyright © 2002, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds