The current 2.6 prepatch is 2.6.16-rc5, released
on February 26.
Says Linus: "There's not much to say about this: people have been
pretty good, and it's just a random collection of fixes in various random
areas." Details can be found in the changelog.
The mainline git repository contains, as of this writing, several dozen
fixes merged since -rc5 was released.
The current -mm tree is 2.6.16-rc5-mm1. Recent changes
to -mm include a relayfs API change, a new set of notifier patches, a big rework
of the /proc code, and the return of the swap prefetching patch.
Kernel development news
It's not funny anymore. The current rate at which new GPL violations get reported and/or discovered, especially from the appliance/embedded market is really alarming.
For example, I haven't yet seen a single linux-based NAS product that was even remotely license compliant when first analyzing it. And I'm not only talking about the SoHo NAS boxes with one or two hard disk drives, but even about enterprise storage systems.
-- Harald Welte
Last month, Greg Kroah-Hartman announced
that OSDL had accepted a set of recommendations aimed at improving its
relations with the kernel development community. One of those
recommendations was naming a kernel developer to the OSDL board of
directors. OSDL has now followed through by announcing that SCSI
subsystem maintainer James Bottomley will be joining the board.
Last week's Kernel Page looked at
the stability of the user-space interface, especially regarding areas
like sysfs, which are not always regarded as being part of the kernel ABI.
This week, Greg Kroah-Hartman has made an attempt to make the issue more
evident through a set of ABI
documentation. Included in his patch is a proposal for a
different way of looking at ABI stability issues.
Linus has, in the recent past, taken a hard line on changes that break
interfaces to user space:
If you cannot maintain a stable kernel interface, then you damn
well should not send your patches in for inclusion in the standard
kernel. Keep your own "HAL-unstable" kernel and ask people to test
It really is that easy. Once a system call or other kernel
interface goes into the standard kernel, it stays that way. It
doesn't get switched around to break user space.
Greg has, instead, taken the approach that not all kernel interfaces
should be seen as stable from the outset. So he has proposed five
different classifications for ABI stability:
- Stable. Interfaces classified as stable will not break "for at
least two years," and probably quite a bit longer. The Linux system
call interface is classified in this way.
- Testing. A "testing" interface is one which has been through
most of the development process. It is not expected to change, but,
that notwithstanding, the possibility of an incompatible change before
the interface becomes "stable" does exist. This is the time for
user-space programs to begin to make real use of the interface, but
user-space developers need to pay attention to what is happening on
the kernel side. The sysfs files under /sys/class have been
designated as having a "testing" level of stability by Greg's patch.
- Unstable. This classification is for relatively new interfaces
which are expected to change as problems in the initial implementation
become clear. Sysfs files under /sys/devices are classified as unstable.
- Private. This class describes interfaces which are intended to
be hidden behind a user-space library and which should not be used
directly by applications. The ALSA sound system is an example of a
private interface.
- Obsolete marks interfaces which are destined to be removed, and
which should not be used at all. Few long-time observers will be
surprised to see that Greg marked devfs as being obsolete.
Linus doesn't like the unstable and private
classifications, calling them "excuses for bad habits." But it is true
that inclusion in the mainline can stress an interface in surprising ways,
leading to a need for changes. Interface design is hard, even if you don't
have to get everything right the first time. So it may make some sense to
allow unstable interfaces into the kernel for a short while - as long as
they are clearly documented as such. Thus far, there has been no way to
warn developers that a certain interface, perhaps, shouldn't be relied upon.
The notion of private interfaces looks harder to justify. There has been
some talk of shipping user-space libraries for private interfaces with the
kernel, just to help ensure that the whole package provides a stable
application interface for any release. That seems like a fairly unlikely
change, however, at least for big interfaces like ALSA.
Changes will likely be made (this scheme might be classified "unstable" at
this point), but it seems probable that it will, in some form, be adopted.
That can only be a good thing for people interested in a stable user-space
interface; once the expectations have been reasonably well documented,
it will be easier to live up to them.
There are a few patches in circulation which merit a quick look.
What if you could improve kernel performance by 10% without writing any
code? Arjan van de Ven has posted a patch which, he says, does
just that - at least, for some specific benchmarks. This patch uses an
obscure gcc option which causes the compiler to put every function into its
own ELF section. Then, the linker is instructed to arrange those functions
into a specific order in the final executable.
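As a rough user-space illustration of the mechanism, the sketch below assumes the option in question is gcc's -ffunction-sections (the text above does not name it). The function names are invented for the example, and the linker-script fragment in the comment is only one plausible way to express an ordering.

```c
/*
 * Assumed build:    gcc -O2 -ffunction-sections -c hot_cold.c
 * Assumed check:    objdump -h hot_cold.o
 *
 * With -ffunction-sections, each function below is emitted into its
 * own section (.text.hot_path, .text.cold_path) rather than into one
 * shared .text section.  A linker script can then list those sections
 * in whatever order is wanted, for example (illustrative fragment):
 *
 *     .text : { *(.text.hot_path) *(.text.cold_path) *(.text*) }
 */
#include <stdio.h>

/* Hypothetical function that runs all the time. */
int hot_path(int x)
{
    return x * 2 + 1;
}

/* Hypothetical error handler that almost never runs. */
int cold_path(int err)
{
    fprintf(stderr, "unexpected error %d\n", err);
    return -1;
}
```

Nothing in the C source changes; the placement is decided entirely by the compiler flag and the link order, which is why the approach requires "no code."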
A typical, current x86-64 kernel (the architecture Arjan has been working
with) fills on the order of 4MB of memory. The kernel uses large pages to
hold its text, but a kernel of that size will still require at least two
translation lookaside buffer (TLB) entries to cover its entire code body. But some kernel
functions are used more heavily than others; much of the code in the kernel
- error handling, for example - never gets run at all if you are lucky.
So, if all of the regularly-used functions are moved to the beginning of
the kernel image, the kernel should be able to operate with a single TLB
entry for its text - most of the time. TLB entries are important: if an address is found in
the TLB, the processor can avoid looking it up in the page tables, speeding
access significantly. They are also scarce. So allowing the kernel to
operate within a single TLB entry makes a big difference.
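The arithmetic behind that claim is simple enough to sketch. The helper below is hypothetical, and it assumes the 2MB large-page size used on x86-64; the 1.5MB figure for the "hot" portion of the kernel in the note that follows is made up purely for illustration.

```c
#include <stddef.h>

/*
 * Number of large-page TLB entries needed to map `bytes` of kernel
 * text, given the large-page size.  This is a plain round-up division.
 */
size_t tlb_entries_needed(size_t bytes, size_t page_size)
{
    return (bytes + page_size - 1) / page_size;
}
```

With 2MB pages, a 4MB kernel image needs two entries, while a hypothetical 1.5MB cluster of frequently-used functions at the start of the image fits within one.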
There are some details to work out yet. Optimizing TLB use will require
that the kernel be loaded at a TLB-aligned address, which is not currently
done on many architectures. There is another part of Arjan's patch which,
using another gcc option, can move blocks marked with unlikely()
into a separate section. Since this option can expand the code, require
long-distance jumps within functions, and make stack backtraces hard to
read, it is not yet clear whether it makes sense or not. Then, there is
the issue of ordering the functions properly. That task will require
looking at a lot of kernel profiles to be sure that some workloads won't be
optimized at the expense of others. But, once these issues are taken care
of, a reorganized and faster kernel will likely result.
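For readers who have not looked at unlikely() closely, here is a minimal sketch. The two macros match the way the kernel defines them on top of gcc's __builtin_expect(); the checked_divide() function is a made-up example. One gcc option that splits hot and cold blocks into separate sections is -freorder-blocks-and-partition, but whether that is the option used in the patch is an assumption.

```c
/* Branch-prediction hints built on a gcc builtin. */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/*
 * Hypothetical example: the zero-divisor branch is marked as cold, so
 * the compiler is free to move it away from the common path.
 * Returns 0 on success and stores the quotient in *out; returns -1
 * on a zero divisor and leaves *out untouched.
 */
int checked_divide(int num, int den, int *out)
{
    if (unlikely(den == 0)) {
        /* Cold block: a candidate for being placed out of line. */
        return -1;
    }
    *out = num / den;
    return 0;
}
```

The hint does not change what the function computes; it only affects how the generated code is laid out.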
On another front: it is generally easy to see, on a Linux system, what
resources a given process is using. What's harder to find out is what
the process is not using because the resources are not available. As a way
of giving more visibility to that side of the equation, Shailabh Nagar has
been working on a set of task
delay accounting patches. This facility is intended for use with
large-scale load management applications, but the information may be useful
in other contexts as well.
This patch adds a new structure (struct task_delay_info) which is
attached to the task structure. It contains a lock, a couple of timestamp
variables, and sets of delay counters. Whenever a process goes into a
delayed state (meaning, currently, waiting on a run queue, performing
synchronous block I/O, or waiting for a page fault), the time is noted. At
the end of the delay, when the process can run again, the system notes how
much time has passed and updates a counter in the task_delay_info
structure. Thus, over time, one can get a picture of how much time the
process has spent waiting for things when it would rather have been running.
Perhaps the most complicated part of the patch set is the netlink interface
used to report delay statistics back to user space. This interface has
been carefully written to be as generic as possible on the theory that it
may eventually be used for other sorts of process-related reporting as
well. There has been a request that some of this information, at least,
also be made available through /proc, so that it could be easily
displayed by tools like top.
Finally, those who worked with kernel modules in 2.4 and prior kernels will remember
the MODULE_PARM() macro, used to define load-time parameters.
This macro has been deprecated since 2004, but there
are still a few hundred uses of MODULE_PARM() spread across
several dozen files in the 2.6.16-rc kernels. These old uses came to
attention recently when gcc started optimizing them out. Given the choice
between making the old macro work with current gcc and simply getting rid
of it, Rusty Russell chose to get
rid of it. This patch has not yet been merged anywhere, but it seems
uncontroversial. If there are any out-of-tree modules still using
MODULE_PARM(), updating them soon might be a good idea.
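For anybody facing that conversion, a minimal sketch of a module using the replacement interface might look like the following. The module and its debug_level parameter are hypothetical; module_param() and MODULE_PARM_DESC() are the existing 2.6 macros, and the code is meant to be built against kernel headers rather than as a user-space program.

```c
#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/moduleparam.h>

/* Hypothetical load-time parameter. */
static int debug_level;

/*
 * Old 2.4 style, now deprecated:
 *     MODULE_PARM(debug_level, "i");
 *
 * 2.6 style: name, type, and sysfs permissions (0 means no sysfs entry).
 */
module_param(debug_level, int, 0444);
MODULE_PARM_DESC(debug_level, "Verbosity of debug output (hypothetical)");

static int __init example_init(void)
{
    printk(KERN_INFO "example: debug_level=%d\n", debug_level);
    return 0;
}

static void __exit example_exit(void)
{
}

module_init(example_init);
module_exit(example_exit);
MODULE_LICENSE("GPL");
```

The main difference for module authors is that the type is given as a C type name rather than a format string, and that a permissions argument decides whether the value shows up in sysfs.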
While there are a number of hopeful developments around the support of
wireless network cards in Linux, that support remains one of the larger
roadblocks for many users. It is thus always a welcome thing when a major
manufacturer announces Linux support - and the beginnings of a working
driver - for their products. So when Intel recently announced a
project to support its 3945ABG wireless adapters, there was a certain
amount of celebration. There was also some criticism, however, which
highlights an ongoing issue with wireless support under Linux.
The ipw3945 project currently
has a developer release of the driver, with a stable version expected
within a few weeks. This release supports all of the basic features one
would expect, with some additional features (quality of service, for example)
"not officially supported." It should, in other words, be enough to allow
use of the device.
It would seem that there is little to complain about here. But there is
this little paragraph from the announcement:
In order to meet the requirements of all geographies into which our
adapters ship (over 100 countries) we have placed the regulatory
enforcement logic into a user space daemon that we provide as a
binary under the same license agreement as the microcode. We
provide that binary pre-compiled as both a 32-bit and 64-bit
application. The daemon utilizes a sysfs interface exposed by the
driver in order to communicate with the hardware and configure the
required regulatory parameters.
The requirement for a binary-only blob brought out some concerns from
developers who think that the regulatory-agency requirement has been
overblown, and that it is not actually necessary to lock down the code in
this way. Others disagree, noting that regulations in many parts of the
world are quite strict with regard to allowing any user modification of
hardware which can transmit. It is probably true that, in order to be able
to offer this product in many parts of the world, Intel must lock down much
of this logic in binary-only code.
Given that, however, Intel has chosen an interesting way to go about it.
The closed code is not part of the driver itself; it is a daemon which runs
entirely in user space. The driver itself is fully free software. So
there is no non-free code going into the kernel, which is surely a step in
the right direction.
The regulatory daemon controls the hardware by way of a special file
exported through sysfs. The driver then interprets those commands - which
enable or disable specific channels, set maximum power values, and so on -
and programs the hardware accordingly. A quick look at the (15,000-line)
driver source is sufficient to find the code which actually controls the
hardware.
So, in other words, this arrangement has not actually locked down much of
anything. The daemon comes with the usual "thou shalt not reverse
engineer" provisions, but there are people in parts of the world who can
safely ignore that requirement. It would seem that little work beyond
running the daemon under strace would be required. It might also
be possible to write a replacement just by studying the driver code,
without looking at the Intel-supplied daemon at all. One way or another,
it seems likely that a free replacement for the regulatory daemon will come
along, sooner or (not much) later.
Patches and updates
Core kernel code
- Junio C Hamano: GIT 1.2.3.
(February 23, 2006)
Filesystems and block I/O
Page editor: Jonathan Corbet