The current 2.6 kernel is 2.6.3, which was released
on February 17. Only a handful of
patches have gone in since the last release candidate. Overall, 2.6.3
includes a great deal of internal cleanup work, the removal of the USB
scanner driver (in favor of the user-space libusb solution), the new
generic DMA pool mechanism, "context mount" support for SELinux,
a big ALSA update, a fix for the new mremap() vulnerability,
and quite a few architecture
updates. See the long-format changelog for the details.
During the last week, we also saw 2.6.3-rc3
(changelog) and 2.6.3-rc4 (changelog).
The current kernel tree from Andrew Morton is 2.6.3-mm1. Recent additions to the -mm tree
include some more scheduler improvements, a new CPU hotplug implementation,
journaled quotas for the ext3 filesystem, and numerous fixes.
2.6.3-mm1 also contains the new device mapper crypto target
code. This target allows the creation of encrypted filesystems by way of
the device mapper (LVM) subsystem. If things work out, this approach is
likely to replace the (buggy) cryptoloop driver; if you have an interest in
encrypted filesystems, testing out this patch might be a good idea.
The current 2.4 kernel is 2.4.25, released by Marcelo on February 18. Among
other things, this release includes the mremap() vulnerability
fix. Marcelo has had a busy week, having previously released 2.4.25-rc2, -rc3, and -rc4.
Kernel development news
I suspect most samba developers are already technically
insane... Of course, since many of them are Australians, you can't tell.
-- Linus Torvalds
It all started as a JFS bug report. The JFS
filesystem, it seems, gets upset when user space passes it file names
encoded in the UTF-8 format. Rather than create or open a file with the
name as given, it gives up and returns EINVAL. Patches which fix
the problem have been posted, but the resulting discussion has taken rather
longer to be resolved.
JFS has an "iocharset" option which can be used to state
explicitly, at mount time, which character encoding is being used. There
were calls on linux-kernel for this option to be added to other filesystems
as well. The idea was rather strongly shot down, however, for a few
reasons. One of those is that multiple users could be simultaneously using
different character encodings on the same filesystem; a global option for
the whole filesystem clearly will not be able to address that case.
The real reason, however, is that performing character set conversion
requires the kernel to interpret the file name strings being passed to it
from user space. The kernel hackers are very resistant to the imposition
of any such policy; it would go against decades of Unix tradition.
Officially, the kernel has no policy regarding which character set is being
used for file names, content, or anything else. In each case, the kernel
sees nothing more than a stream of bytes.
That said, the kernel does have some policies regarding file names: they
use "/" as a directory delimiter, and they are terminated by a
NUL (zero) byte. This policy rules out the use of many encodings
which are sometimes employed to represent non-ASCII characters; the
fixed-width wide encodings all tend to use lots of bytes containing zero.
In reality, the only practical choices for representing characters beyond
the ASCII set are iso-8859-1 (which allows the representation of characters
used in many continental European languages) and UTF-8, which can encode
pretty much anything.
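The effect of the "/" and NUL rules on candidate encodings can be checked from user space. The following is a minimal sketch, not anything the kernel does; the sample name and the helper special_bytes() are made up for illustration, and U+012F was picked only because its UTF-16 form happens to contain the byte 0x2F.

```python
# Bytes the kernel treats specially inside a path: NUL ends the string,
# "/" separates components.
SPECIAL = {0x00: "NUL", 0x2F: "/"}


def special_bytes(encoded: bytes) -> list:
    """Names of kernel-significant bytes found in an encoded file name."""
    return sorted({SPECIAL[b] for b in encoded if b in SPECIAL})


# Hypothetical file name: "café-" followed by U+012F (i with ogonek).
name = "caf\u00e9-\u012f"

for codec in ("utf-8", "utf-16-le", "utf-32-le"):
    encoded = name.encode(codec)
    found = special_bytes(encoded)
    print(f"{codec:10} {encoded!r} -> {found or 'clean'}")
```

UTF-8 only ever uses bytes below 0x80 for the ASCII characters themselves, which is why a multi-byte sequence can never smuggle in a stray "/" or NUL; the fixed-width encodings offer no such guarantee.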
UTF-8 is relatively easy to use; for US users it looks just like ASCII, but
it can handle a far wider range of characters while not breaking (most)
code which uses traditional C strings. Thus it is often said that UTF-8 is
the encoding used by the Linux kernel. That statement is a mistake,
however: Linux does not use any particular encoding. If user space uses
UTF-8 to represent extended characters, everything will work. But nothing
forces user space to work in that way.
This approach keeps policy out of the kernel, but some developers are not
entirely happy with it. The lack of policy can lead to user-space
confusion in a number of ways. For example, if a user creates a file called
WéîrdÑàmë, that name could be represented in the
filesystem in more than one way. Depending on how user space is configured, it could choose
either iso-8859-1 or UTF-8; the encoding of that name will be quite
different depending on that choice. A different user space could interpret
the file name differently in the future, resulting in unreadable filenames
and confused users. The kernel, lacking a character encoding policy of its
own, will do nothing to help prevent this situation.
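To make the ambiguity concrete, here is a small user-space sketch showing the two byte strings that could end up on disk for that name, and what happens when each is later read back under the other assumption. The variable names are illustrative only; the name is spelled with escapes so the example does not depend on how this text itself is encoded.

```python
# "WéîrdÑàmë" written with explicit code points (precomposed characters).
name = "W\u00e9\u00eerd\u00d1\u00e0m\u00eb"

as_latin1 = name.encode("iso-8859-1")   # one byte per character
as_utf8 = name.encode("utf-8")          # two bytes per accented character

print(len(as_latin1), as_latin1)
print(len(as_utf8), as_utf8)

# A UTF-8 name read by an iso-8859-1 user space: every byte decodes,
# but the result is not the name the user typed.
misread = as_utf8.decode("iso-8859-1")
print(repr(misread))

# An iso-8859-1 name read by a UTF-8 user space: the bytes are not even
# legal UTF-8, so a strict decoder refuses them.
try:
    as_latin1.decode("utf-8")
    latin1_is_valid_utf8 = True
except UnicodeDecodeError:
    latin1_is_valid_utf8 = False
print(latin1_is_valid_utf8)
```

In both directions the kernel has stored and returned exactly the bytes it was given; the damage comes entirely from the mismatch in user space.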
Confusion over character sets can also facilitate the creation of security holes; code which
attempts to clean up file names can fail if evil characters are given in an
unexpected encoding. Code which expects UTF-8 must also be careful when
dealing with the Linux kernel because the kernel itself makes no effort to
ensure that any string is, in fact, a legal UTF-8 encoding.
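One well-known example of this kind of problem is the "overlong" UTF-8 form of "/". The sketch below is an illustration under stated assumptions: naive_is_safe() and lenient_decode_2byte() are hypothetical stand-ins for a careless filter and a careless decoder, not code from any real program.

```python
def naive_is_safe(raw: bytes) -> bool:
    """Hypothetical filter: reject names containing a literal "/" or NUL."""
    return b"/" not in raw and b"\x00" not in raw


def lenient_decode_2byte(raw: bytes) -> str:
    """Hypothetical sloppy decoder: accepts any two-byte sequence,
    including overlong forms that a conforming decoder must reject."""
    return chr(((raw[0] & 0x1F) << 6) | (raw[1] & 0x3F))


def is_valid_utf8(raw: bytes) -> bool:
    """Strict check using Python's own decoder, which rejects overlong forms."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


overlong_slash = b"\xc0\xaf"        # illegal two-byte spelling of U+002F

print(naive_is_safe(overlong_slash))          # the filter sees no "/"
print(lenient_decode_2byte(overlong_slash))   # the sloppy decoder produces one
print(is_valid_utf8(overlong_slash))          # a strict decoder refuses it
```

Since the kernel will happily store such a byte sequence as a file name, any user-space code that assumes UTF-8 needs to do the strict validation itself.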
To complicate the situation even more, Andrew Tridgell posted another reason why, he thinks, the kernel will
have to adopt a specific character encoding: case insensitivity. Tridgell says:
The reason is that I think that eventually the Linux kernel will
need to efficiently support a userspace policy of
case-insensitivity and the only way to do case-insensitive filename
operations is to interpret those byte streams as a particular encoding.
Needless to say, the idea of implementing case-insensitive filesystem
operations in the kernel was not particularly popular. Not too many kernel
hackers want to complicate the filesystem code to implement what they see
as being a broken Windows feature to begin with. There are other
difficulties as well: case-insensitive matching must be done differently in
different languages. The end result is that case insensitive lookups are
not very likely to make it into the kernel anytime soon.
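The language dependence is easy to demonstrate with the Turkish dotted and dotless "i". The sketch below compares Unicode's default case folding with a simplified Turkish variant; fold_turkish() is a hypothetical helper that handles only the two special letters and is not a complete implementation of Turkish casing rules.

```python
def fold_default(s: str) -> str:
    """Language-neutral Unicode case folding."""
    return s.casefold()


def fold_turkish(s: str) -> str:
    """Simplified Turkish folding (assumption: only I and İ need special care).

    In Turkish, capital I pairs with dotless ı (U+0131), and capital
    İ (U+0130) pairs with the ordinary dotted i.
    """
    return s.replace("I", "\u0131").replace("\u0130", "i").casefold()


# Under default rules these two names are the same file.
print(fold_default("FILE") == fold_default("file"))

# Under Turkish rules they are not: "FILE" folds to "fıle".
print(fold_turkish("FILE") == fold_turkish("file"))
print(fold_turkish("FILE"))
```

A kernel that did the comparison itself would have to pick one of these answers for every user of the filesystem, which is exactly the sort of policy decision the developers want to avoid.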
Linus is not averse to trying to help out Samba and other applications
which wish to implement case-insensitive behavior, however. He has proposed a new "magic_open()"
interface which would make it easier for user space to perform
case-insensitive lookups without actually doing that work in the kernel.
This interface would likely require quite a bit of work before it would do
what the Samba developers need, but something derived from it could just
make an appearance in the 2.7 development series.
Meanwhile, the kernel does not seem likely to adopt any sort of official
encoding anytime soon. The problems that result from the lack of an
encoding policy are mostly seen as user space issues. Proper locale
support is still relatively new in Linux, and many rough edges remain.
Given the high level of interest in high-quality localization support in
Linux, however, one might expect those edges to be smoothed down.
(For those who would like to learn more about UTF-8, see this FAQ or RFC 3629.)
The kernel function invalidate_mmap_range() is not something which
has a lot of callers. Its job is to invalidate all memory mappings which
cover a specific part of a file, presumably because the contents of the
relevant pages have changed on disk. This function is currently exported
only to GPL-licensed modules.
Paul McKenney has requested that this
function be exported to all modules. It seems that IBM's GPFS filesystem
needs it, and that filesystem is not free software. The claim is that the
filesystem is an entirely independent development, and is thus not derived
from the kernel; it should not have to be licensed under the GPL to be
loadable into the kernel.
Andrew Morton says he is not opposed to the
patch. One might think it would not be too controversial,
especially since that function was first created and submitted by...Paul McKenney. There are
developers, however, who believe that any module which is digging that
deeply into the virtual memory subsystem cannot help but be derived, in
some fashion, from the Linux kernel. There is also, perhaps, a certain
desire to demonstrate that even IBM can't obtain arbitrary access to the
kernel for proprietary modules.
In general, the kernel hackers are more interested in seeing their work be
useful and used than in fighting licensing battles.
So one might expect to
see this patch eventually get incorporated. In more recent times, however,
some developers have been adopting a firmer position with regard to
proprietary modules. This patch may still get in, but it's likely to have
a harder time than would have once been the case.
The atomic_t type in the Linux kernel is a simple integer variable
with a set of operations which are guaranteed to be atomic without the need
for explicit locking. For years, atomic_t variables have operated
under the constraint that they can be expected to hold no more than 24
bits; this limitation was forced by the Sparc32 architecture, which used
the other eight bits to implement the atomic operations.
As of 2.6.3, this limitation no longer holds. This patch by Keith M Wesolowski has changed
the Sparc32 implementation to a version (taken from the PA-RISC
architecture) which provides full 32-bit atomic variables.
The new implementation works by creating a small array (four entries) of
spinlocks. When an operation is to be performed on an atomic variable, one
of those spinlocks is chosen by a hash function; the code holds the given
lock while manipulating the variable. The result is proper locking for
atomic operations without doubling the size of every atomic_t in
the system. The patch was quickly picked up and merged, and kernel
programmers have one less strange limitation to worry about.
Patches and updates
- Andrew Morton: 2.6.3-mm1. (February 18, 2004)
- Bernhard Rosenkraenzer: 2.4.25-pac1. (February 18, 2004)
Page editor: Jonathan Corbet