Brief items
announced by Linus on January 20. A massive set of patches was merged into this release; included therein is a new Qlogic SCSI driver, a bunch of USB work, infrastructural work to better support hotplug block devices, several architecture updates, some I/O scheduler work, a rework of the PCMCIA drivers, sysfs support for several new types of devices, an XFS update, and much more. See the long-format changelog for the details.
The latest kernel from Andrew Morton, as of this writing, is 2.6.1-mm5. Recent additions to the -mm tree include a working modular IDE implementation, improved x86 CPU type selection options, a user-mode Linux update, and many other fixes.
The current 2.4 kernel is 2.4.24. Marcelo released 2.4.25-pre5 on January 15; a "deadly mistake" there forced the release of 2.4.25-pre6 one day later. The 2.4.25 prepatches have been getting steadily smaller; there may be a release candidate coming in the near future.
Kernel development news
Driver authors - and their users - might have a much easier time if drivers could be written to run in user space. In addition to mitigating the above-mentioned kernel programming issues, user-space driver development would allow the creation of a stable ABI; it also, presumably, would eliminate any licensing issues associated with closed-source drivers. User-space driver writers could also use any language they choose, "even Python."
Peter and company have set out to make user-space drivers possible. Some of the necessary pieces are already in place. Standard Linux will allow a suitably privileged process to access I/O ports, for example. Low-address memory-mapped I/O registers can be accessed via a mmap() of /dev/mem. There is also an interface which gives user-space processes access to the PCI configuration space; this interface works via ioctl() calls on /proc files, though, thus upsetting the sensibilities of most kernel hackers. These facilities are enough to allow some user-space drivers (particularly XFree86) to work, but they are not sufficient to enable a wider range of drivers to move out of the kernel.
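The /dev/mem approach mentioned above can be sketched in C as follows. This is a minimal illustration, not code from the Gelato patch: the mmio_map() and mmio_unmap() helpers and the struct mmio_window type are hypothetical names, and the path is passed as a parameter so that the same code can be tried against an ordinary file. Opening /dev/mem itself requires suitable privileges.

```c
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper type: one mapped window of device registers. */
struct mmio_window {
    void *base;              /* page-aligned start, as returned by mmap() */
    size_t map_len;          /* number of bytes actually mapped */
    volatile uint8_t *regs;  /* pointer to the offset the caller asked for */
};

/*
 * Map 'len' bytes starting at byte offset 'phys' of 'path' (normally
 * "/dev/mem"). Returns 0 on success and -1 on failure.
 */
int mmio_map(const char *path, off_t phys, size_t len, struct mmio_window *w)
{
    long page = sysconf(_SC_PAGESIZE);
    off_t aligned = phys & ~((off_t)page - 1);  /* mmap() needs a page-aligned offset */
    size_t slack = (size_t)(phys - aligned);    /* distance from page start to 'phys' */
    void *base;
    int fd;

    fd = open(path, O_RDWR | O_SYNC);
    if (fd < 0)
        return -1;

    base = mmap(NULL, len + slack, PROT_READ | PROT_WRITE, MAP_SHARED,
                fd, aligned);
    close(fd);  /* the mapping remains valid after the descriptor is closed */
    if (base == MAP_FAILED)
        return -1;

    w->base = base;
    w->map_len = len + slack;
    w->regs = (volatile uint8_t *)base + slack;
    return 0;
}

/* Release a window obtained from mmio_map(). */
void mmio_unmap(struct mmio_window *w)
{
    munmap(w->base, w->map_len);
}
```

A driver would call mmio_map("/dev/mem", address, length, &w) with the address of its device's register block and then read or write through w.regs; the volatile qualifier keeps the compiler from caching or eliding those accesses.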
One of the big gaps is interrupts; there is no way, currently, for user-space processes to register and respond to device interrupts. A patch from the Gelato project addresses this gap by creating a set of files under /proc. A process wanting to deal with interrupt 11, say, would open /proc/irq/11/irq. Reading the resulting file descriptor enables the interrupt and blocks the process until a device interrupt happens; control then returns to user-space, which can figure out what to do. A typical user-space driver will set up a separate thread to wait for interrupts in this manner; the actual work can be handed off to a different thread within the program.
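The interrupt-waiting thread described above might look something like the following sketch. It assumes only what the text says about the patch: that a read() on the per-interrupt file blocks until an interrupt arrives. The number of bytes returned by that read is an assumption here (any positive count is treated as "one interrupt"), and irq_path(), irq_wait_loop(), and count_irq() are hypothetical names, not part of the patch.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical callback type: invoked once per interrupt. */
typedef void (*irq_handler_t)(int irq, void *ctx);

/* Build the path described in the article, e.g. "/proc/irq/11/irq". */
int irq_path(int irq, char *buf, size_t len)
{
    int n = snprintf(buf, len, "/proc/irq/%d/irq", irq);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Wait for up to 'max_events' interrupts on the file at 'path', calling
 * 'handler' after each one. Returns the number of interrupts handled,
 * or -1 if the file cannot be opened.
 */
int irq_wait_loop(const char *path, int irq, int max_events,
                  irq_handler_t handler, void *ctx)
{
    char buf[16];
    int handled = 0;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;

    while (handled < max_events) {
        /* Per the article: enables the interrupt, then blocks until it fires. */
        ssize_t n = read(fd, buf, sizeof(buf));

        if (n < 0 && errno == EINTR)
            continue;   /* interrupted by a signal; just wait again */
        if (n <= 0)
            break;      /* error or end of file: stop waiting */
        handler(irq, ctx);
        handled++;
    }
    close(fd);
    return handled;
}

/* Example handler: count interrupts; real work would be handed off here. */
void count_irq(int irq, void *ctx)
{
    (void)irq;
    (*(int *)ctx)++;
}
```

In a real driver this loop would run in its own thread, with the handler doing little more than queueing work for another thread, as the text suggests.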
Peter presented some graphs showing that interrupt response times suffer very little when interrupt handlers run in user space. The main limitation at the moment seems to be the fact that shared interrupts are not supported.
Another thing that user-space processes cannot normally do is set up DMA operations. To enable DMA, a new set of system calls has been added. The interface appears to be in a bit of flux, but it will be something like the following. The driver starts by opening a special file for device operations:
int usr_pci_open(int bus, int slot, int function);
There is then a function for setting up DMA mappings:
int usr_pci_map(int fd, int cmd, struct mapping_info *info);
The cmd argument can be USR_ALLOC_CONSISTENT to set up a long-lived consistent mapping, or USR_MAP to create a streaming, scatter/gather mapping. In either case, the info argument is used to pass in the relevant information, and to get the necessary address(es). There is also, of course, a USR_UNMAP operation for when the DMA is complete.
Many user-space drivers will be able to obtain their requests directly from user space; the X server works in this way. Many other drivers, however, will need to hook into the kernel for this information. The current patch includes a mechanism (Peter described it as ugly) for a user-space block driver to register itself with the kernel and get I/O requests. It works by opening another special file and using it to communicate requests and responses back and forth. A similar interface apparently exists for network drivers.
Getting a user-space driver patch into the kernel could be an interesting challenge. Many kernel hackers, certainly, resist changes that appear to push Linux toward a microkernel architecture - or which might legitimize binary-only drivers. On the other hand, some drivers bring a great deal of baggage into the kernel with them which might be better kept in user space; think of some of the code required by some sound drivers or the modulation software needed by "linmodem" drivers. The ability to run these drivers in user space could be a nice thing to have.
See the Gelato user-level drivers page for more information.
Andi Kleen has been putting some effort into making the kernel smaller through the use of some relatively new and obscure gcc options. He starts with -Os, as do most kernel shrinkers; this one simply tells the compiler to optimize for size rather than strictly for performance. Anecdotal evidence suggests that -Os not only produces a smaller kernel, but that the resulting code often runs faster as well.
The next step was to use -funit-at-a-time. This option is new; it will be part of the upcoming gcc 3.4 release. It causes the compiler to parse the entire source file before it begins generating code, which allows better inlining and the dropping of unused functions. The outcome was a reduction of a little over 3% in kernel text size. The reasons for this shrinkage require further investigation; it may be that there is a significant amount of dead code in the kernel.
Finally, Andi has also enabled -mregparm=3, which instructs the compiler to pass up to three function arguments in registers, rather than on the stack. This option helps even more than -funit-at-a-time. Using all three options, Andi is able to reduce the text size by over 700KB.
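The effect of -mregparm=3 can be requested for a single function with gcc's regparm attribute, which makes for a small illustration. The sketch below is not taken from Andi's patch; REGPARM3 and weighted_sum() are hypothetical names. The attribute only has meaning on 32-bit x86, so the macro expands to nothing elsewhere.

```c
/*
 * On 32-bit x86, regparm(3) asks gcc to pass the first three integer
 * arguments in registers (EAX, EDX, ECX) instead of on the stack.
 */
#if defined(__i386__)
#define REGPARM3 __attribute__((regparm(3)))
#else
#define REGPARM3  /* no effect on other architectures */
#endif

/*
 * Callers and callee must agree on the convention: code compiled against
 * a declaration without the attribute would push a, b, and c on the
 * stack, while this function would look for them in registers.
 */
REGPARM3 int weighted_sum(int a, int b, int c)
{
    return a + 2 * b + 3 * c;
}
```

The comment on weighted_sum() is the single-function version of the binary module problem: a module built with the old stack-based convention and a kernel built with -mregparm=3 disagree about where arguments live.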
There is one potential problem with -mregparm=3, however: it changes the calling conventions within the kernel, and thus breaks binary modules. As one might imagine, some kernel developers are more worried about this than others. Red Hat kernel packager Arjan van de Ven has stated that he is using this option, and intends to build production kernels that way as well. As always, sympathy for the difficulties encountered by distributors of binary-only modules is low. If the kernel hackers decide that this option is worth using, they'll not let some broken binary modules stop them.
FUTEXes are an improvement on what came before, but they do not yet provide the functionality that some users - particularly real-time system implementers - would like to have. To help fill in the gap, Iñaky Pérez-González has been working (with others) on a new set of "robust mutexes" which go by the name of FUSYNs. The project has a simple web site based at OSDL and a set of patches. Some information can be found in fusyn.txt, which is included with the patch.
FUSYNs enhance FUTEXes with:
Future plans include the addition of features like condition variables, reader/writer locks, spinlocks, etc.
Inside the kernel, this functionality is implemented through the addition of some new facilities which could be useful beyond the FUSYN code. The "vlocator" structure allows the kernel to associate objects with user-space processes via a hash table. In the longer term, vlocators could be used to provide some relief for the ever-growing task structure. The unfortunately-named "fuqueue" functions much like an ordinary kernel wait queue, except that wakeups take process priority into account - only the highest-priority process is awakened. To support this functionality, a new "plist" type is added; it implements a general, priority-sorted, doubly-linked list capability.
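To make the idea of a priority-sorted, doubly-linked list concrete, here is a minimal sketch. It is not the plist code from the FUSYN patch, whose interface is not described here; the prio_list_* names are hypothetical, and the sketch assumes that a larger number means a higher priority and that waiters of equal priority are served in arrival order.

```c
#include <stddef.h>

/* One list entry; the list head is a sentinel node of the same type. */
struct prio_node {
    int prio;                        /* assumption: larger value = higher priority */
    struct prio_node *prev, *next;
};

/* Initialize an empty circular list. */
void prio_list_init(struct prio_node *head)
{
    head->prio = 0;
    head->prev = head->next = head;
}

int prio_list_empty(const struct prio_node *head)
{
    return head->next == head;
}

/* Insert 'n' so that the list stays sorted, highest priority first. */
void prio_list_add(struct prio_node *head, struct prio_node *n)
{
    struct prio_node *pos = head->next;

    /* Skip entries of higher or equal priority (keeps FIFO order among equals). */
    while (pos != head && pos->prio >= n->prio)
        pos = pos->next;

    /* Link 'n' in just before 'pos'. */
    n->next = pos;
    n->prev = pos->prev;
    pos->prev->next = n;
    pos->prev = n;
}

/* Remove and return the highest-priority entry, or NULL if the list is empty. */
struct prio_node *prio_list_pop_highest(struct prio_node *head)
{
    struct prio_node *n = head->next;

    if (n == head)
        return NULL;
    n->prev->next = n->next;
    n->next->prev = n->prev;
    n->next = n->prev = n;
    return n;
}
```

With the list kept sorted at insertion time, waking "only the highest-priority process" is a constant-time removal from the front; the cost is a linear walk on insertion.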
The reaction to posts of FUSYN patches on linux-kernel has tended to be quiet. There does not appear to be any strong opposition to the addition of this capability to the kernel. Whether FUSYNs go into 2.6, or have to wait for 2.7, however, remains to be seen.
Page editor: Jonathan Corbet
Copyright © 2004, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds