Brief items

The current 2.6 prepatch is 2.6.17-rc6, released on June 5. It contains several fixes for serious problems; none of them look immediately security-related, however.
No patches have been merged into the mainline repository since -rc6, as of this writing.
The current -mm tree is 2.6.17-rc6-mm1. Recent changes to -mm include improved force feedback support in the input driver and a large number of patches related to the locking validator.
Kernel development news
Perfection is the enemy of progress and of success. We risk moving back to the case we got into in 2.4 when merging got so hard that most vendors shipped kernels bearing no relationship to the "upstream" tree. Probably worse this time as there is no common "unofficial" tree like -ac so they will all ship different variants and combinations.
-- Alan Cox

Andrew Morton has posted a 2.6.18 merge plan summary describing how he expects to dispose of the patches currently sitting in the -mm tree. There has been occasional talk of doing a bugfix-only kernel cycle, but it's clear that 2.6.18 won't be that cycle - there are a lot of patches tagged for merging.
The features which are expected to be merged are interesting, but they are best discussed once they hit the mainline repository; until then, their fate remains uncertain. So, for now, suffice to say that 2.6.18 will likely include an S/390 hypervisor filesystem, a number of memory management patches, some software suspend improvements, a new i386 hardware clock subsystem, some SMP scheduler improvements, the swap prefetch patches (maybe), priority-inheriting futexes, a rework of the /proc/pid code, a number of MD (RAID) improvements, a new kernel-space inotify API, and a bunch of code from subsystem trees which does not appear in -mm directly. As is usual, a great deal of code will be flowing into the mainline for the next release.
It can also be interesting to look at what will not be merged; from Andrew's posting, a number of big patch sets are likely to be held back.
In particular, some dismay has been expressed regarding how long it can take to get drivers into the mainline. It seems that, perhaps, the quality bar is being set too high. It is always possible to find things to criticize in a body of code, but sometimes the best thing to do is to proceed with the code one has and improve it as part of an ongoing process. There is concern that reviewers are insisting on perfection and keeping out code which is good enough, and which could be of value to Linux users.
All of this is subject to change when the merge window actually opens. Developers are making cases for specific patches; Ingo Molnar is asking for reconsideration of the generic IRQ and lock validator patches, for example. Watch this space in the coming weeks to see what really happens.
As it happens, a number of USB users have found that, on upgrading to 2.6.16, their systems no longer work. But, in this case, this "regression" is not seen as such by the developers and is not likely to change. This issue is a good demonstration of the sort of tradeoffs which operating system developers must make.
USB ports can supply power to the devices plugged into them; this power is sufficient to drive many devices, as well as totally unrelated items (such as USB-powered LED lamps). There are limits to the amount of power which can be supplied, however. USB devices will communicate their maximum current draw to the host, which can then decide whether it has the capacity available or not. If sufficient power is not available, the device will not be allowed to configure itself and operate.
There are many rules in the USB specification on how power configuration should work. One of those applies to unpowered USB hubs - the ones which lack a power supply of their own. The total current drawn by an unpowered hub cannot be allowed to exceed what the host can supply; in particular, the USB specification limits devices on unpowered USB hubs to 100 mA of current. Even if only one hub port is in use, that single port is limited to that value, despite the fact that a larger draw should work in that situation.
Prior to 2.6.16, the Linux kernel did not actually check power requirements before configuring devices. With 2.6.16, however, any device whose stated maximum power requirement exceeds 100 mA will not be allowed to configure itself on an unpowered hub. Thus, devices which worked in that mode in earlier kernels now fail to operate; not all users are entirely pleased.
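The shape of that check can be illustrated with a short C sketch. In the USB configuration descriptor, the bMaxPower field expresses the device's maximum draw in units of 2 mA; everything else here - the function name and the budget assumed for powered hubs - is an illustrative assumption, not the kernel's actual code.

```c
#include <assert.h>
#include <stdbool.h>

#define UNPOWERED_HUB_BUDGET_MA 100  /* per-port limit on a bus-powered hub */
#define POWERED_HUB_BUDGET_MA   500  /* assumed budget when a supply is present */

/* Decide whether a device may configure itself, given the bMaxPower
 * value from its configuration descriptor (in 2 mA units) and whether
 * its parent hub is bus-powered.  Illustrative sketch only. */
bool may_configure(unsigned bMaxPower, bool hub_is_bus_powered)
{
    unsigned draw_ma = bMaxPower * 2;
    unsigned budget_ma = hub_is_bus_powered ? UNPOWERED_HUB_BUDGET_MA
                                            : POWERED_HUB_BUDGET_MA;
    return draw_ma <= budget_ma;
}
```

A device declaring exactly 100 mA still configures on an unpowered hub; one declaring more is refused there but accepted on a powered hub.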
The argument has been made that, since these configurations almost always work in the real world, the kernel should not be shutting them down now. The fact is, however, that running hardware outside of its specifications is always a dangerous thing to do. Often one will get away with it, but sometimes things can fail badly. A fairly large class of USB devices are mass storage devices; the consequences of power-related problems with these devices could include corrupted data and damaged hardware. These are not consequences which the USB developers wish to inflict on their users, so, instead, they refuse to operate devices out of their specifications.
To the developers, the fact that some previously-working hardware now fails to operate is not a regression. It is a bug fix, with the kernel finally performing some due diligence which should have been happening all along. They do not intend to change this behavior.
As it happens, it is possible to convince the kernel to override its good sense and configure the device anyway. It is not easy, however. Essentially, the steps are:

lsusb -v
echo -n 1 > /sys/bus/usb/devices/1-2.3/bConfigurationValue

The configuration value and the device path (1-2.3 here) must be replaced with the actual values determined from the lsusb output.
Needless to say, this sequence of steps is not entirely easy - and it must be repeated each time the device is plugged in. For those who are comfortable writing udev rules, this configuration change can be automated without too much trouble. Perhaps the desktop environments will eventually be made smart enough to detect this situation and offer (with suitable scary warnings) to override the kernel for specific devices. But it might just be better to buy a powered hub or plug the device directly into the host.
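As a sketch of what such automation might look like, a udev rule along these lines (modern udev syntax, with hypothetical vendor and product IDs standing in for the real device, and assuming configuration 1 is the over-budget one) would apply the override each time the device appears:

```
ACTION=="add", SUBSYSTEM=="usb", ATTR{idVendor}=="abcd", ATTR{idProduct}=="1234", ATTR{bConfigurationValue}="1"
```

The ATTR{} assignment on the right-hand side writes the value into the corresponding sysfs attribute, performing the same operation as the echo command above.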
There is one situation, however, where the current scheduler does not work as well as one would like. Imagine a simple system with two processors. If two CPU-bound processes, each running at normal priority, are started on this system, the scheduler will eventually run one process on each CPU. If two niced (low-priority) processes (also CPU-bound) are then started, one would normally expect the scheduler to ensure that those processes get less CPU time than the normal-priority processes.
If the processes are distributed such that one normal-priority and one low-priority process end up on each CPU, that expectation will be met; the low-priority processes will get a relatively small amount of CPU time. It is just as likely, however, that both normal-priority processes will end up on the same CPU, with the two low-priority processes on the other. In this case, the two normal-priority processes will be contending for the same CPU, while the low-priority processes fight for the other. As a result, the low-priority processes will get as much CPU time as the others, their reduced priority notwithstanding. That is almost certainly not what the user had in mind when the process priorities were set.
The problem is that the scheduler looks only at the length of the run queue on each CPU, without taking priorities into account. So, in either case above, the CPUs appear to be equally busy, and no redistribution of processes will occur. To fix this problem, the load balancing code must be made to understand that not all running processes are created equal.
A solution can be found in the "smpnice" patch set, implemented by Peter Williams with input from a number of other developers. The smpnice code changes the load balancer so that it does not just look at run queue lengths. Instead, each process is assigned a "load weight," which is derived from its priority. When load balancing decisions are made, the scheduler compares total load weights rather than the length of the run queues. If a load weight imbalance is detected, the scheduler will move a process to bring things back into line. If the imbalance is large, high-priority processes will be moved; when the imbalance is small, however, a low-priority process will be moved instead.
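A toy C sketch can show the idea. The 1024 scale and the roughly 25% weight change per nice step below are illustrative assumptions chosen for demonstration, not the values used by the actual smpnice patches:

```c
#include <assert.h>

#define NICE0_WEIGHT 1024  /* assumed weight of a normal-priority task */

/* Map a nice level to a load weight: each positive nice step shrinks
 * the weight, each negative step grows it.  Illustrative sketch only. */
unsigned task_weight(int nice)
{
    unsigned w = NICE0_WEIGHT;
    int steps = nice > 0 ? nice : -nice;
    for (int i = 0; i < steps; i++)
        w = nice > 0 ? w * 4 / 5 : w * 5 / 4;
    return w;
}

/* A run queue's load is the sum of its tasks' weights,
 * not the number of tasks it holds. */
unsigned queue_load(const int *nices, int n)
{
    unsigned load = 0;
    for (int i = 0; i < n; i++)
        load += task_weight(nices[i]);
    return load;
}
```

With this weighting, a CPU running two nice-0 tasks carries far more load than one running two nice-19 tasks, so the balancer sees an imbalance - and moves a normal-priority task - where a simple run-queue-length comparison would see two equally busy processors.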
The basic idea makes sense, but this set of patches has been a long time in development. The scheduling code is full of subtle heuristics which are easily upset. So early versions of the smpnice patches caused benchmark regressions and ran into a number of difficulties. For example, a processor running a very high-priority process will tend to appear to be the most heavily loaded, with the result that load balancing no longer occurs between other processors on the system. This problem was fixed by ignoring processors which have no processes which can be moved. Some load balancing heuristics which would move high-priority processes were broken, resulting in suboptimal scheduling decisions; now, if a process would have the highest priority on the new CPU, it is considered first for moving. Various stability problems, where processes would oscillate between processors, have also been ironed out.
With all of these fixes applied, the smpnice code appears to be stabilizing, with the result that it might just make it into the 2.6.18 kernel. That should improve life for people running multiple-priority workloads on SMP systems.
Patches and updates
Core kernel code
Filesystems and block I/O
Page editor: Jonathan Corbet
Copyright © 2006, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds