Kernel development

Brief items

Kernel release status

The current 2.6 prepatch is 2.6.12-rc2, announced by Linus on April 4. Changes this time include a number of architecture updates, an XFS update, some netpoll improvements, a big USB update, an ALSA update, a number of networking tweaks, and lots of fixes. Says Linus: "This is also the point where I ask people to calm down, and not send me anything but clear bug-fixes etc. We're definitely well into -rc land. So keep it quiet out there." The long-format changelog has the details.

No patches have been merged into Linus's BitKeeper repository since the -rc2 release. Given recent events, one should not expect more patches to end up there anytime soon.

The current -mm tree is 2.6.12-rc2-mm1. Recent changes to -mm include a new version of the crash dump code, a reiser4 update, a patch optionally removing all BUG() and printk() calls (shrinks the kernel but with significant side effects), an InfiniBand update, some scheduler tweaks, and various fixes.

The current 2.4 kernel is 2.4.30, which was released by Marcelo (with no changes from -rc4) on April 3.


Kernel development news

Time for a new semaphore type?

The Linux kernel uses two basic mutual exclusion primitives internally: spinlocks (which are fast, but require that critical sections be atomic) and semaphores (which are slower, but can sleep). These mechanisms are adequate for most uses, but there are exceptions. Trond Myklebust has encountered one of those exceptions when working on the NFSv4 code. In NFSv4, there are situations where non-atomic code must obtain a lock, but the thread cannot block at that point without risking deadlocks. So Trond set out to add an asynchronous capability to the Linux semaphore implementation - a way to request that a function be called at some point in the future when the semaphore becomes available. He encountered a little problem, however: each architecture implements its own, highly-optimized semaphore code, often in assembly language. To add functionality to semaphores, he would have to dig into more than 20 different implementations, and, somehow, ensure that they all still work afterward.

Rather than dive into that jungle, Trond elected to start over. The result is a new semaphore type which Trond calls an "iosem." At its core, an iosem looks much like a regular semaphore:

    #include <linux/iosem.h>

    void iosem_init(struct iosem *sem);
    void iosem_lock(struct iosem *sem);
    void iosem_unlock(struct iosem *sem);

A call to iosem_lock() is similar to a call to down(); it will block until the semaphore is available.

The definition of an iosem structure is simple:

    struct iosem {
        unsigned long state;
        wait_queue_head_t wait;
    };

Whenever a thread releases the lock, it will perform a wakeup on the given wait queue entry. For the synchronous locking case, that will cause the threads waiting for the lock to be scheduled; one of them will then succeed in acquiring that lock. Everything works as one might expect.

2.6 wait queues are flexible things, however. In particular, it is possible to replace the function that is called when a wakeup occurs; this capability turns a wait queue into a fairly general notification mechanism. The iosem code takes advantage of this mechanism to allow different things to happen when an iosem becomes available. For example, consider this interface:

    struct iosem_work {
        struct work_struct work;
        struct iosem_wait waiter;
    };

    void iosem_work_init(struct iosem_work *work,
                         void (*func) (void *), void *data);

    int iosem_lock_and_schedule_work(struct iosem *sem,
                                     struct iosem_work *work);

A thread using this interface sets up a function (func), then calls iosem_lock_and_schedule_work(). If the iosem is available, func will be called immediately, with the lock held. Otherwise, a special entry will be added to the iosem's wait queue, and the call to iosem_lock_and_schedule_work() will return immediately. At some future time, func will be called (with the lock held) by way of a workqueue. Either way, func must release the lock when it is done.
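The control flow described above can be modeled in user space. What follows is a minimal, single-threaded sketch and not Trond's code: every `sketch_` name is hypothetical, the return values (1 for "ran immediately", 0 for "queued") are an assumption, and the deferred function is called directly from the unlock path, where the real patch would hand it to a workqueue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names come from the kernel. */
struct sketch_work {
    void (*func)(void *data);   /* runs with the lock held; must unlock */
    void *data;
    struct sketch_work *next;   /* link in the FIFO of deferred requests */
};

struct sketch_sem {
    bool locked;
    struct sketch_work *head, *tail;    /* requests waiting for the lock */
};

static void sketch_sem_init(struct sketch_sem *sem)
{
    sem->locked = false;
    sem->head = sem->tail = NULL;
}

static void sketch_work_init(struct sketch_work *w,
                             void (*func)(void *), void *data)
{
    w->func = func;
    w->data = data;
    w->next = NULL;
}

/* Synchronous, non-blocking acquire: 1 on success, 0 if already held. */
static int sketch_trylock(struct sketch_sem *sem)
{
    if (sem->locked)
        return 0;
    sem->locked = true;
    return 1;
}

/* Release the lock, or pass it straight to the oldest deferred request. */
static void sketch_unlock(struct sketch_sem *sem)
{
    struct sketch_work *w = sem->head;

    if (w == NULL) {
        sem->locked = false;
        return;
    }
    sem->head = w->next;
    if (sem->head == NULL)
        sem->tail = NULL;
    w->func(w->data);           /* the lock stays held for the callback */
}

/* Assumed convention: 1 if func ran immediately, 0 if it was queued. */
static int sketch_lock_and_schedule_work(struct sketch_sem *sem,
                                         struct sketch_work *w)
{
    assert(w->func != NULL);
    if (sketch_trylock(sem)) {
        w->func(w->data);
        return 1;
    }
    w->next = NULL;
    if (sem->tail != NULL)
        sem->tail->next = w;
    else
        sem->head = w;
    sem->tail = w;
    return 0;                   /* the caller never blocks */
}

/* Example callback: record the order of execution, then drop the lock. */
struct sketch_job {
    struct sketch_sem *sem;
    int id;
    int *order;                 /* caller-provided log array */
    int *n;                     /* number of entries written so far */
};

static void sketch_job_run(void *data)
{
    struct sketch_job *job = data;

    job->order[(*job->n)++] = job->id;
    sketch_unlock(job->sem);
}
```

In this model a request made while the lock is free runs at once; a request made while the lock is held returns 0 without running anything, and its callback fires only when the holder calls sketch_unlock().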

Other sorts of behavior could easily be added to this interface. Since the same code is used for all architectures, the iosem mechanism is relatively easy to extend. There has been some interest from maintainers of other parts of the kernel (asynchronous I/O, for example) in using this mechanism. There have been a few complaints, however, about the name and about adding a wholly new mutual exclusion primitive to the kernel. In particular, Benjamin LaHaise (who has recently resurfaced on the kernel lists) has stated that it would be better to rationalize the current semaphore implementation - and said that he would do the work. So, while an asynchronous semaphore implementation is likely to get into the kernel, the form it will take is not yet clear.


Finding the boundaries for stable kernel patches

Greg Kroah-Hartman started off the review process for the next 2.6.11.x stable release in the usual way: by posting all of the patches proposed for inclusion in that kernel. The development community was invited to complain about any patches which did not appear to meet the criteria for the extra-stable 2.6 kernels. This time around, somebody complained.

The patch in question is a fix to the BIC TCP congestion control algorithm (congestion avoidance, including BIC, was covered here two weeks ago). BIC is supposed to perform a binary search to quickly find the optimal congestion window size. Due to a mistake in the TCP dropped packet code, however, that search was not being performed, and BIC was not working as expected. The (very small) patch makes BIC work the way its designers intended, and would seem to be a useful addition.
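For readers who have not seen the binary search idea, here is a much-simplified sketch of it. This is not the kernel's BIC code: the `sketch_` names, the per-round-trip growth cap, and the 0.8 decrease factor are all assumptions made for illustration, and the real algorithm has further phases that are left out here.

```c
#include <assert.h>

#define SKETCH_SMAX 16u     /* assumed cap on growth per round trip */

struct sketch_bic {
    unsigned cwnd;          /* current congestion window, in segments */
    unsigned last_max;      /* window size when the last loss occurred */
};

/* On packet loss: remember the failing window, then back off. */
static void sketch_bic_on_loss(struct sketch_bic *b)
{
    assert(b->cwnd >= 1);
    b->last_max = b->cwnd;
    b->cwnd -= b->cwnd / 5;             /* assumed factor of roughly 0.8 */
    if (b->cwnd < 2)
        b->cwnd = 2;
}

/* Once per loss-free round trip: jump halfway toward the old maximum. */
static void sketch_bic_on_rtt(struct sketch_bic *b)
{
    if (b->cwnd < b->last_max) {
        unsigned step = (b->last_max - b->cwnd) / 2;

        if (step > SKETCH_SMAX)
            step = SKETCH_SMAX;         /* do not grow too abruptly */
        if (step < 1)
            step = 1;                   /* always make some progress */
        b->cwnd += step;
    } else {
        b->cwnd += 1;                   /* past the old maximum: probe slowly */
    }
}
```

Starting from a window of 100 segments, a loss drops this sketch to 80; the following loss-free round trips take it to 90, 95, and 97, halving the remaining distance each time. If the search step is never taken, the window only creeps back up, which is the sort of behavior the patch is meant to correct.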

As Ted Ts'o pointed out, however, the rules for these kernels include:

It must fix a real bug that bothers people (not a, "This could be a problem..." type thing.)

It is safe to say that the kernel mailing lists have not been overwhelmed by users complaining that BIC was not converging properly on the best congestion window size. In fact, no users have complained. So, it could be argued, the BIC fix, while worthy, should be merged for 2.6.12 and left out of the 2.6.11.x series.

An answer came from David Miller:

An incorrect implementation of any congestion control algorithm has ramifications not considered when the congestion control author verified the design of his algorithm. This has a large impact on every user on the internet, not just Linux machines.

David concluded that, since BIC is enabled by default in the 2.6 kernel, this sort of implementation fix should take a high priority. This view seems likely to prevail for this particular patch. Expect more debates, however, as the kernel developers figure out just where the line should be drawn for patches being considered for inclusion into the stable 2.6 kernels.


The kernel and binary firmware

Device firmware is a perennial issue in certain circles. As long as non-free firmware is safely contained within the device it controls, everybody seems to be happy. Increasingly, however, firmware must be loaded from the host system. People who want no non-free software on their computers resist the idea of having binary-only firmware linked into their kernel. Certain Debian developers have long tried to extract all non-free firmware from their distribution. Recently, the issue has come up again with a new twist: the fear that, even if a firmware blob comes with a free license, it cannot be distributed as part of the kernel because it's not in "the preferred form for modification."

The form of a solution to everybody's concerns has been available for some time: extract the firmware from the kernel source, and load it from user space at device initialization time. The firmware can then carry its own license, worries about conflicts with kernel licensing can go away, and distributors can judge each firmware blob's free software credentials using their own criteria. It would seem like a solution which would make everybody happy; the reality, however, is that this approach has not been taken in many cases. One might conclude that nobody (not even the most vocal complainers) has been sufficiently motivated to get into the code and actually pull out the firmware in this manner. There is some truth to that claim, but there is also a little more going on. The simple fact is that the infrastructure needed to make the user-space firmware mechanism work well is not ready.

The kernel contains support for user-space firmware loading by way of request_firmware(). When a driver decides it needs a firmware blob to feed its device, it can call request_firmware(); that call will result in a hotplug event. User space can then see which device's firmware is needed, locate it in the filesystem, and feed it back to the driver.

One problem with this interface is that it is too simple. Some hardware, notably the tg3 network adaptor, does not want a simple firmware blob. Instead, its firmware looks like a regular executable image - it has text, read-only data, and writable data sections. There is also associated metadata needed for the driver to actually load the firmware into the card. To accommodate complex devices like the tg3, somebody will have to extend the request_firmware() interface; that work has not yet happened.
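To illustrate why a flat blob falls short for such a device, consider a purely hypothetical container layout in which a small header describes the text, read-only data, and writable data sections ahead of the payload. Nothing below is the tg3 driver's actual format or any proposed request_firmware() extension; the `sketch_` names and the field layout are invented for this sketch, and the header is assumed to be stored in host byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical section indices for a three-section firmware image. */
enum { SKETCH_TEXT, SKETCH_RODATA, SKETCH_DATA, SKETCH_NSECT };

struct sketch_section {
    uint32_t load_addr;     /* where the section belongs in device memory */
    uint32_t offset;        /* byte offset of the payload within the blob */
    uint32_t len;           /* payload length in bytes */
};

/* The metadata a driver would need in addition to the raw bytes. */
struct sketch_fw_header {
    struct sketch_section sect[SKETCH_NSECT];
};

/*
 * Copy the header out of the blob and check that every section lies
 * entirely inside it, after the header. Returns 0 on success, -1 on error.
 */
static int sketch_fw_check(const uint8_t *blob, size_t size,
                           struct sketch_fw_header *out)
{
    assert(blob != NULL && out != NULL);
    if (size < sizeof(*out))
        return -1;
    memcpy(out, blob, sizeof(*out));

    for (int i = 0; i < SKETCH_NSECT; i++) {
        const struct sketch_section *s = &out->sect[i];

        if (s->offset < sizeof(*out) || s->offset > size)
            return -1;
        if (s->len > size - s->offset)  /* subtraction cannot underflow */
            return -1;
    }
    return 0;
}
```

A driver handed only an opaque array of bytes has no way to learn the load addresses or section boundaries; in a scheme like this one, that information travels with the firmware file itself.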

Once that issue has been dealt with, there is still the problem of actually getting the firmware onto the system. Loading the firmware often must be done before the host system will function in any useful way, so it must be present on a freshly-installed system. Often, it will have to be part of the initrd or initramfs image used at boot time. There is thus a clear case for packaging the firmware as part of the kernel source itself; the two depend on each other anyway. That solution would clearly displease some users, however, so a separate firmware distribution seems called for. Mechanisms will need to be put into place so that user space knows where to find the firmware distribution, so that the kernel build process can create bootable kernels, etc.

These problems are all clearly amenable to solution; it is simply a matter of a suitably-motivated developer finding the time to do the work. Whether that will happen remains to be seen; most of the commercial distributors, who might be expected to fund this sort of infrastructural work, do not appear to be overly concerned about the firmware issue. So solving this problem may fall on the Debian developers, and they have a few other things on their plate at the moment.



Page editor: Jonathan Corbet

Copyright © 2005, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds