Kernel development
Brief items
Kernel release status
The current 2.6 kernel is 2.6.1, which was released on January 8. The contents of this kernel are pretty much as described last week: a whole lot of fixes along with a few new features (MSI support, EFI support, a couple of internal API changes, etc.). See the long-format changelog for the details.
The latest patch from Andrew Morton, as of this writing, is 2.6.1-mm3. Recent additions to the -mm tree include some anticipatory I/O scheduler work ("This is the 114th patch against the anticipatory scheduler and we're nearly finished, honest"), improved CPU scheduler support for hyperthreaded processors, working modular IDE drivers, a number of big architecture updates, some SELinux updates, several NFS fixes, an ALSA update, the kthread abstraction (discussed here last week), and many other fixes and updates.
The current 2.4 kernel is 2.4.24; Marcelo has released no 2.4.25 prepatches since 2.4.25-pre4 on January 6.
Kernel development news
Kernel page editor Down Under
This week's Kernel Page is a little thin as a result of its normal editor being in Australia to attend Linux.Conf.AU. There are limits to the sort of kernel content that can be written over a conference wireless link while simultaneously making a show of listening to whoever is speaking. This page will be back to its normal form next week.
Read-copy-update and interrupt latency
The read-copy-update (RCU) algorithm has found many applications since it was added to the 2.5 kernel. By eliminating lock contention in many situations, RCU can greatly improve performance and scalability on multiprocessor systems. For more information on how RCU works, see this description or this Driver Porting Series article. Or talk to the SCO Group, which claims to own any code which ever even dreamed of using RCU.
It turns out, however, that there is one little problem with RCU - its effect on interrupt response times. RCU works by setting aside cleanup work until a later time, when it is known that the data structures of interest have no further references in the kernel. That cleanup work is done with a software interrupt, meaning it can happen after a hardware interrupt or at rescheduling time. But the list of RCU-protected data to be cleaned up can get quite long; it is used, for example, in high-turnover data structures like the dentry cache. So that software interrupt can, potentially, take a long time to run. The RCU cleanup code, in other words, can monopolize a processor for a relatively long period at just the times when a high-priority process might be trying to run.
Dipankar Sarma has taken a look at the situation and found that processing RCU callbacks can, in some situations, take as much as 400 microseconds or so. That may not seem like a lot of time, but it can be enough to significantly increase response latencies. So he has sent out a set of patches which address the problem.
In modern-day kernel programming, it sometimes seems like there is a standard answer to every problem: create a new kernel thread. Dipankar's patch does exactly that; it adds a new per-CPU "krcud" thread which handles RCU cleanup whenever the list of callbacks gets to be too long. Short callback lists are still dealt with at software interrupt time, since that is a faster way of doing things. But, if the list is too long (256 entries, by default) and, in particular, if there is a real-time process waiting to run, the tail end of the list is delegated over to krcud and control is returned to the scheduler.
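The split between software interrupt processing and krcud can be modeled in a few lines of userspace C. What follows is only a sketch of the idea as described above, not code from Dipankar's patch: every name in it (SOFTIRQ_BATCH_LIMIT, struct cpu_rcu, softirq_process(), krcud_process(), simulate()) is invented for illustration, the "thread" is just a second function call, and the handoff is assumed to happen only when a real-time process is waiting.

```c
#include <stdlib.h>

/* Threshold from the text above; the macro name is made up for this sketch. */
#define SOFTIRQ_BATCH_LIMIT 256

struct rcu_cb {
    struct rcu_cb *next;
    void (*func)(void *arg);
    void *arg;
};

/* Per-CPU state: callbacks ready to run, and those handed to the thread. */
struct cpu_rcu {
    struct rcu_cb *ready;
    struct rcu_cb *deferred;
};

/* Add a callback to the ready list (pushed at the head for brevity). */
static void queue_cb(struct cpu_rcu *cpu, struct rcu_cb *cb,
                     void (*func)(void *), void *arg)
{
    cb->func = func;
    cb->arg = arg;
    cb->next = cpu->ready;
    cpu->ready = cb;
}

/*
 * Software-interrupt-time processing: run callbacks directly, but if a
 * real-time task is waiting and the batch limit has been reached, hand
 * the rest of the list to the thread. Returns the number invoked here.
 */
static int softirq_process(struct cpu_rcu *cpu, int rt_waiting)
{
    int n = 0;

    while (cpu->ready) {
        if (rt_waiting && n >= SOFTIRQ_BATCH_LIMIT) {
            /* Append the unprocessed tail to the thread's list. */
            struct rcu_cb **end = &cpu->deferred;
            while (*end)
                end = &(*end)->next;
            *end = cpu->ready;
            cpu->ready = NULL;
            break;
        }
        struct rcu_cb *cb = cpu->ready;
        cpu->ready = cb->next;      /* unlink before the callback runs */
        cb->func(cb->arg);
        n++;
    }
    return n;
}

/* What the per-CPU thread would do when woken: drain the deferred list. */
static int krcud_process(struct cpu_rcu *cpu)
{
    int n = 0;

    while (cpu->deferred) {
        struct rcu_cb *cb = cpu->deferred;
        cpu->deferred = cb->next;
        cb->func(cb->arg);
        n++;
    }
    return n;
}

/* Trivial callback: bump the counter passed in as the argument. */
static void count_cb(void *arg)
{
    (*(int *)arg)++;
}

/*
 * Queue n callbacks, run both stages, and report how many ran in each.
 * Returns the total number invoked, or -1 if allocation fails.
 */
int simulate(int n, int rt_waiting, int *in_softirq, int *in_thread)
{
    struct cpu_rcu cpu = { NULL, NULL };
    int invoked = 0;
    struct rcu_cb *cbs = calloc(n > 0 ? n : 1, sizeof(*cbs));

    if (!cbs)
        return -1;
    for (int i = 0; i < n; i++)
        queue_cb(&cpu, &cbs[i], count_cb, &invoked);
    *in_softirq = softirq_process(&cpu, rt_waiting);
    *in_thread = krcud_process(&cpu);
    free(cbs);
    return invoked;
}
```

In this model, calling simulate() with 300 callbacks and a waiting real-time task runs 256 of them at software interrupt time and leaves the remaining 44 for the thread; with no real-time task waiting, all 300 run in the first stage. A real implementation would also have to wake the thread and deal with locking and preemption, all of which is left out here.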
Dipankar reports good results in his tests, with overall system latencies of less than 400 microseconds. He's not pushing this patch for inclusion yet; it needs more testing first. But, if things pan out, a faster-responding 2.6 kernel may result in the near future.
Keeping printk() under control
Log messages from the kernel can often be an indispensable aid in tracking down problems or generally figuring out what is going on inside the system. As most system administrators find out sooner or later, however, kernel logging can also become a problem in its own right. If a situation develops which causes the kernel to continually spew out logging information, disks can fill up and log messages can be lost. What can be worse, however, is when log messages sent to the console cause the kernel to spend all of its time just scrolling the console frame buffer. In this case, the system can become completely unresponsive. The logging code already tries to mitigate this problem by detecting and suppressing streams of identical messages. That simple mechanism breaks down, however, when the messages being logged differ from each other.
As a way of improving the situation, Anton Blanchard has put together a new rate limiting scheme which has found its way into the -mm patch tree. This code, which is derived from a rate limiting mechanism used in the networking subsystem, does not automatically solve the problem, since it requires explicit changes to code which could generate message floods. Such code is often easy to identify, however, and easy to fix.
The patch adds a new function:
int printk_ratelimit(void);
Code which could generate lots of messages should call printk_ratelimit() and only call printk() if the return value is nonzero. In other words, printk_ratelimit() returns zero when rate limiting is currently in effect and printk() output should be avoided.
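The calling convention might look something like the following from the caller's side. This is a userspace sketch only: printk is mapped onto printf, printk_ratelimit() is replaced by a trivial stand-in which lets a fixed number of messages through, and the "mydev" driver, its report_device_errors() function, and the budget variable are all hypothetical.

```c
#include <stdio.h>

/* Userspace stand-in so the sketch compiles outside the kernel. */
#define printk printf

/* Hypothetical: messages allowed before the stand-in limiter says no. */
static int budget = 3;

/* Stand-in for the new function: nonzero means "go ahead and print". */
static int printk_ratelimit(void)
{
    if (budget > 0) {
        budget--;
        return 1;
    }
    return 0;
}

/*
 * A driver-style error path which might fire many times in a row.
 * Returns the number of messages which were actually logged.
 */
int report_device_errors(int errors)
{
    int logged = 0;

    for (int i = 0; i < errors; i++) {
        /* Only log when the rate limiter allows it. */
        if (printk_ratelimit()) {
            printk("mydev: error %d\n", i);
            logged++;
        }
    }
    return logged;
}
```

With the stand-in above, a burst of 100 errors produces only three lines of output; the other 97 calls return quietly. The driver still does whatever error handling it needs in every case; only the logging is skipped.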
By default, the code limits messages to one every five seconds. It will, however, allow ten messages through in a short period before the rate limiting clamps down on the rest. These values are, of course, tuneable via sysctl parameters.
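One common way to get "a burst, then one message per interval" behavior is a token bucket. The sketch below is a userspace model written under that assumption; it is not taken from the patch, and struct ratelimit, ratelimit_init(), ratelimit_ok(), and flood_at() are names made up for the example. Time is passed in explicitly, in seconds, so that the behavior is easy to follow.

```c
/* Hypothetical userspace model; names are illustrative, not the kernel's. */
struct ratelimit {
    unsigned long interval;  /* seconds of credit one message costs */
    unsigned long burst;     /* messages allowed back to back */
    unsigned long tokens;    /* accumulated credit, in seconds */
    unsigned long last;      /* time of the previous call */
};

/* Start with a full bucket so that an initial burst gets through. */
void ratelimit_init(struct ratelimit *rl, unsigned long interval,
                    unsigned long burst, unsigned long now)
{
    rl->interval = interval;
    rl->burst = burst;
    rl->tokens = interval * burst;
    rl->last = now;
}

/* Returns nonzero if a message may be printed at time `now`. */
int ratelimit_ok(struct ratelimit *rl, unsigned long now)
{
    unsigned long max = rl->interval * rl->burst;

    /* Earn credit for the time which has passed (time must not go back). */
    rl->tokens += now - rl->last;
    rl->last = now;
    if (rl->tokens > max)
        rl->tokens = max;       /* the bucket never holds more than a burst */

    if (rl->tokens >= rl->interval) {
        rl->tokens -= rl->interval;
        return 1;
    }
    return 0;
}

/* Attempt `attempts` messages at the same instant; count those allowed. */
int flood_at(struct ratelimit *rl, unsigned long now, int attempts)
{
    int allowed = 0;

    for (int i = 0; i < attempts; i++)
        if (ratelimit_ok(rl, now))
            allowed++;
    return allowed;
}
```

Initialized with an interval of 5 and a burst of 10 (the defaults described above), this model lets the first ten of twenty simultaneous messages through and drops the rest. Five seconds later, one more message is allowed; after a long quiet period the bucket refills and a full burst of ten is possible again.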
A mechanism like this is only useful if it is used throughout the code. Core kernel code can be fixed up relatively easily; the patch includes a fix for the page allocator, for example. The source of message floods, however, is often a driver which wants to be sure that its "my device has joined the Dark Side" messages are heard. Fixing all of those is a daunting task, but even a partial solution leaves the kernel less susceptible to this particular problem than before.
Page editor: Forrest Cook