See also: last week's Kernel page.
The current development kernel release is 2.3.17. Once again, this is a very large patch. There are a lot of driver changes, and one can see the first results of Alan Cox's attempts to clean up the SCSI code and make it somewhat more readable. Note that this isn't the long-awaited rewrite of the SCSI layer; it's just enough cleanup to let him get on with tracking down other problems...
The current stable kernel release is still 2.2.12. Some problems are still being reported with this kernel - in particular, there still appears to be a memory leak that tends to turn up on systems running the INN news server. As of this writing, the developers are still trying to chase it down.
Trouble with 2.2 - sort of. In the August 19 issue of LWN we reported that RAID 0.90 was being folded into the 2.2.12 kernel. A week later we had to update that report and note that the RAID patches had been pulled back out. Why? Too many people objected to such a large change - which requires new user tools - going into a stable kernel minor release.
Now it appears that the NFS server patches will suffer a similar fate. These patches, developed by H.J. Lu and others, are absolutely necessary for sites doing serious NFS service with Linux systems. Heterogeneous environments, in particular, frequently turn up problems with the stock 2.2 NFS server. The patches add no new functionality; they just make the server actually work. But they require recent versions of the user-space tools.
Serious users of both RAID and NFS have been applying these patches by hand for as long as they have been using the 2.2 kernel. A number of distributions also ship versions of the kernel with the patches applied. The natives on linux-kernel are starting to get restless. These patches are considered necessary by many just to get a working system. Why do they not find their way into the mainstream kernel?
A few problems seem to be at work here - the requirement for updated user-space tools not least among them - but the underlying lesson is that stable kernels, increasingly, will have to remain truly stable. Even important changes get blocked out at minor release time.
So how does the kernel make progress in this environment? The recipe would seem to be more frequent major releases, each of which contains a rather smaller set of changes. If stable kernels are truly stable at (or shortly after) their release, and a new release is not more than two years away, people can calmly wait for larger changes to be integrated.
The 2.3 feature freeze, first promised almost a month ago, still has not been announced. If a 2.4 release - which can contain working RAID and NFS implementations - is to happen before the end of the year, this freeze needs to happen soon. If it's not already too late.
Big memory and Raw I/O. LWN first reported on the "big memory patch," which allows Intel-based Linux systems to address up to 4GB of memory, back in the August 19 issue. This week Siemens and SuSE, the sponsors of that development, issued a press release announcing the patch and pointing out that it was merged into 2.3.15.
There is a remaining loose end or two, however, with the big memory patch. In particular, it breaks Stephen Tweedie's raw I/O patch, which was also recently added to the development series. The raw I/O patch allows data to be transferred directly between user-space buffers and a device. There is an obvious performance gain in some situations, since a copy through the kernel's buffer cache can be avoided.
Just as important, however, is simply avoiding the cache altogether. Caching some kinds of data is wasteful, since it will not be needed again. Rather than improving performance, caching such transient data only forces everything else out of the cache, leading to a sluggish system. Anybody who has had to wait for the window system to respond after a large program build or file copy has seen this mechanism in action. Caching can also be a problem when disks are shared between more than one system.
Why is there trouble with raw I/O in particular? It seems that quite a few devices out there are unable to address high memory - memory above 2GB. Attempts to tell such devices to move data to or from high memory can result in total failure at best, and a corrupted system is a distinct possibility. The kernel is careful to keep its own buffers in lower memory so that this sort of problem does not arise. But raw I/O uses user-space buffers, which can end up anywhere. For this reason, the big memory patch currently disallows any sort of raw I/O to high memory.
The solution in this case appears to be "bounce buffers." A bounce buffer is a kernel-space buffer which lives in low memory. When I/O is requested to a high memory page, and the device cannot handle it, an intermediate copy is made via the bounce buffer. This technique defeats the "zero copy" aspect of raw I/O, but preserves the other advantages. It can also be implemented so that bounce buffers are only used when they are truly needed. A proper implementation with bounce buffers should not only solve the raw I/O problem, but it should also allow the page cache to exist in high memory.
Finally, when the day arrives that more than 4GB of memory can be supported, bounce buffers will become even more necessary. A lot of PCI devices out there do not handle 64-bit addressing and will need help at that point, even if they currently work with high memory. (Thanks to Stephen Tweedie, whose linux-kernel messages were ruthlessly plundered for this article).
A few other patches and updates released this week:
Section Editor: Jon Corbet
September 9, 1999