Brief items
The current 2.6 kernel is 2.6.1, which was
released on January 8. The contents of
this kernel are pretty much as described last week: a whole lot of fixes
along with a few new features (MSI support, EFI support, a couple of
internal API changes, etc.). See the long-format changelog for the details.
The latest patch from Andrew Morton, as of this writing, is 2.6.1-mm3. Recent additions to the -mm tree
include some anticipatory I/O scheduler work ("This is the 114th
patch against the anticipatory scheduler and we're nearly finished,
honest"), improved CPU scheduler support for hyperthreaded
processors, working modular IDE drivers, a number of big architecture
updates, some SELinux updates, several NFS fixes, an ALSA update, the
kthread abstraction (discussed here last
week), and many other fixes and updates.
The current 2.4 kernel is 2.4.24; Marcelo has released no 2.4.25
prepatches since 2.4.25-pre4 on
January 6.
Kernel development news
This week's Kernel Page is a little thin as a result of its normal editor being in Australia to attend Linux.Conf.AU. There are limits to the sort of kernel content that can be written over a conference wireless link while simultaneously making a show of listening to whoever is speaking. This page will be back to its normal form next week.
The read-copy-update (RCU) algorithm has found many applications since it
was added to the 2.5 kernel. By eliminating lock contention in many
situations, RCU can greatly improve performance and scalability on
multiprocessor systems. For more information on how RCU works, see
this description or
this Driver Porting Series
article. Or talk to the SCO Group, which claims to own any code which
ever even dreamed of using RCU.
It turns out, however, that there is one little problem with RCU: its
effect on interrupt response times. RCU works by setting aside cleanup
work until a later time, when it is known that the data structures of
interest have no further references in the kernel. That cleanup work is
done with a software interrupt, meaning it can happen after a hardware
interrupt or at rescheduling time. But the list of RCU-protected data to
be cleaned up can get quite long; it is used, for example, in high-turnover
data structures like the dentry cache. So that software interrupt can,
potentially, take a long time to run. The RCU cleanup code, in other
words, can monopolize a processor for a relatively long period at just the
times when a high-priority process might be trying to run.
Dipankar Sarma has taken a look at the
situation and found that processing RCU callbacks can, in some
situations, take as much as 400 microseconds or so. That may not seem like
a lot of time, but it can be enough to significantly increase response
latencies. So he has sent out a set of patches which address the problem.
In modern-day kernel programming, it sometimes seems like there is a
standard answer to every problem: create a new kernel thread. Dipankar's
patch does exactly that; it adds a new per-CPU "krcud" thread which handles
RCU cleanup whenever the list of callbacks gets to be too long. Short
callback lists are still dealt with at software interrupt time, since that
is a faster way of doing things. But, if the list is too long (256
entries, by default) and, in particular, if there is a real-time process
waiting to run, the tail end of the list is handed off to krcud and
control is returned to the scheduler.
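The split between software interrupt and thread can be sketched in userspace C. This is an illustration only, not Dipankar's patch: the names rcu_cb, process_callbacks(), count_cb(), SOFTIRQ_LIMIT, and the rt_task_waiting flag are all hypothetical, and the sketch assumes that both conditions (a long list and a waiting real-time task) must hold before work is deferred. Only the 256-entry default comes from the description above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an RCU callback: a function plus its argument. */
struct rcu_cb {
    struct rcu_cb *next;
    void (*func)(void *arg);
    void *arg;
};

/* Default threshold from the article; the macro name is an assumption. */
#define SOFTIRQ_LIMIT 256

/* Example callback: bump a counter so callers can see how many ran. */
void count_cb(void *arg)
{
    (*(int *)arg)++;
}

/*
 * Run callbacks from the list in "software interrupt" context. Once limit
 * callbacks have run and a real-time task is waiting, stop and return the
 * unprocessed tail, which the caller would hand to a per-CPU thread.
 * Returns NULL when the whole list was processed here.
 */
struct rcu_cb *process_callbacks(struct rcu_cb *list, size_t limit,
                                 int rt_task_waiting, size_t *done)
{
    size_t n = 0;

    assert(done != NULL);
    while (list != NULL) {
        struct rcu_cb *next;

        if (n >= limit && rt_task_waiting)
            break;                 /* leave the tail for the thread */
        next = list->next;         /* read first: func may free the node */
        list->func(list->arg);
        list = next;
        n++;
    }
    *done = n;
    return list;
}
```

A caller would check the return value: a non-NULL tail is what would be queued for the thread, while a NULL return means the short-list fast path handled everything.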
Dipankar reports good results in his tests, with overall system latencies
of less than 400 microseconds. He's not pushing this patch for inclusion
yet; it needs more testing first. But, if things pan out, a
faster-responding 2.6 kernel may result in the near future.
Log messages from the kernel can often be an indispensable aid in tracking
down problems or generally figuring out what is going on inside the
system. As most system administrators find out sooner or later, however,
kernel logging can also become a problem in its own right. If a situation
develops which causes the kernel to continually spew out logging
information, disks can fill up and log messages can be lost. What can be
worse, however, is when log messages sent to the console cause the kernel
to spend all of its time just scrolling the console frame buffer. In this case,
the system can become completely unresponsive.
The logging code already tries to mitigate this problem by detecting and
suppressing streams of identical messages. That simple mechanism breaks
down, however, when the messages being logged differ from each other.
As a way of improving the situation, Anton Blanchard has put together a new
rate limiting scheme which has found its way into the -mm patch tree. This
code, which is derived from a rate limiting mechanism used in the
networking subsystem, does not automatically solve the problem, since it
requires explicit changes to code which could generate message floods.
Such code is often easy to identify, however, and easy to fix.
The patch adds a new function:
int printk_ratelimit(void);
Code which could generate lots of messages should call
printk_ratelimit() and only call printk() if the return
value is nonzero. In other words, printk_ratelimit() returns zero
when rate limiting is currently in effect and printk() output
should be avoided.
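The calling pattern looks roughly like the following. So that the fragment can be compiled outside the kernel, printk() is mapped onto printf() and printk_ratelimit() is replaced by a trivial stand-in that lets a fixed number of messages through; both stand-ins, the budget variable, and the report_error() function are hypothetical and exist only for illustration.

```c
#include <stdio.h>

/* Userspace stand-ins (assumptions): the real functions live in the kernel. */
#define printk printf

int budget = 3;   /* pretend the limiter will allow three messages */

int printk_ratelimit(void)
{
    if (budget > 0) {
        budget--;
        return 1;   /* nonzero: OK to print */
    }
    return 0;       /* zero: rate limiting in effect */
}

/* Hypothetical driver error path; returns 1 if the message was printed. */
int report_error(int code)
{
    if (printk_ratelimit()) {
        printk("mydev: error %d\n", code);
        return 1;
    }
    return 0;       /* message suppressed */
}
```

The point is simply that the check wraps the printk() call itself, so a suppressed message costs one function call and no console output.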
By default, the code limits messages to one every five seconds. It will,
however, allow ten messages through in a short period before the rate
limiting clamps down on the rest. These values are, of course, tuneable via
sysctl parameters.
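One common way to get this "burst, then trickle" behavior is a token bucket. The sketch below is a userspace illustration under that assumption, not the code from the patch: the ratelimit structure, ratelimit_init(), and ratelimit_allow() are hypothetical names, and time is passed in explicitly, in seconds, to keep the example self-contained.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical limiter state; all times are in seconds. */
struct ratelimit {
    long interval;  /* seconds needed to earn one message (article: 5) */
    long burst;     /* messages allowed in a short period (article: 10) */
    long tokens;    /* accumulated credit, in seconds */
    long last;      /* time of the previous call */
    long missed;    /* messages suppressed since the last one printed */
};

void ratelimit_init(struct ratelimit *rl, long interval, long burst, long now)
{
    assert(rl != NULL && interval > 0 && burst > 0);
    rl->interval = interval;
    rl->burst = burst;
    rl->tokens = interval * burst;   /* start with a full burst available */
    rl->last = now;
    rl->missed = 0;
}

/* Returns nonzero if a message may be printed at time now, zero otherwise. */
int ratelimit_allow(struct ratelimit *rl, long now)
{
    long cap;

    assert(rl != NULL);
    cap = rl->interval * rl->burst;
    rl->tokens += now - rl->last;    /* earn credit as time passes */
    rl->last = now;
    if (rl->tokens > cap)
        rl->tokens = cap;            /* never bank more than one burst */
    if (rl->tokens >= rl->interval) {
        rl->tokens -= rl->interval;  /* spend one message's worth */
        rl->missed = 0;
        return 1;
    }
    rl->missed++;
    return 0;
}
```

With an interval of 5 and a burst of 10, a flood of calls at the same instant gets ten messages through, after which one more is allowed for every five seconds that pass. The missed counter is there so that a caller could report how many messages were dropped.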
A mechanism like this is only useful if it is used throughout the code.
Core kernel code can be fixed up relatively easily; the patch includes a
fix for the page allocator, for example. The source of message floods,
however, is often a driver which wants to be sure that its "my device has
joined the Dark Side" messages are heard. Fixing all of those is a
daunting task, but even a partial solution leaves the kernel less
susceptible to this particular problem than before.
Page editor: Forrest Cook