Brief items
The current 2.6 development kernel is 2.6.27-rc7,
released on September 21.
"All the changes are small - the biggest individual ones are
literally things like a few m68k defconfig changes and the trivial cleanups
in the MAINTAINERS file." Details can be found in the long-format changelog.
Several dozen fixes have been merged into the mainline git repository since
the 2.6.27-rc7 release.
The linux-next tree has abruptly taken a
break; it is expected to return around October 13.
Kernel development news
And exactly as in the theory of relativity, two people on different
CPU's can actually validly _disagree_ about the ordering of the
same event. There are things that act as "light-cones" and are
borders for what everybody can agree on, but basically, in the
absence of explicit locks, it is very possible that no such thing
as "ordering" may even exist.
--
Linus Torvalds
- early_printk("Kernel really alive\n");
+ early_printk("Kernel really alive! It's alive! IT'S ALIIIIIIIIIVE!\n");
--
Bill Nottingham
Still experimental, not for inclusion, but given that I am now
finding more bugs in the rest of Linux than in this code, I suspect
that it is getting close.
--
Paul McKenney (Thanks to
Steven Rostedt)
By Jonathan Corbet
September 24, 2008
The 2.6.27-rc regression list
posted on September 21 contains - deep within the list - an entry
reading "e1000e: 2.6.27-rc1 corrupts EEPROM/NVM". One might be forgiven
for missing it; the list of regressions is still (unfortunately) long, and
there is nothing there to indicate that it is a notable problem. But it
is: this particular bug goes beyond breaking networking; when it bites, it
corrupts the EEPROM on the device, causing it to cease to function
forevermore (or, at least, until the user can manage to flash the EEPROM
with working code). This is a problem which is worth fixing.
As of this writing, though, nobody seems to know what the problem is.
There was some confusion resulting from the fact that the related e1000
driver also suffered from an EEPROM corruption problem - but that turns out
to have been an entirely different bug. The e1000 problem was fixed by
putting a lock around accesses to the EEPROM, preventing corruption caused
by concurrent access. But something else is going on with the e1000e.
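The kind of fix described for the e1000 driver can be illustrated with a small user-space sketch. This is not the driver code: `SimulatedEeprom`, `hammer()`, and `run_demo()` are hypothetical names invented for the illustration, and a Python `threading.Lock` stands in for whatever locking primitive the kernel patch actually used. The point is only that a multi-step read-modify-write on shared storage must be serialized.

```python
import threading


class SimulatedEeprom:
    """Toy stand-in for a NIC EEPROM; not the real e1000 code."""

    def __init__(self, size=64):
        self._words = [0] * size
        self._lock = threading.Lock()

    def update_word(self, offset, delta):
        # The read-modify-write must happen as one unit; without the lock,
        # two threads can interleave here and lose an update.
        with self._lock:
            value = self._words[offset]
            self._words[offset] = (value + delta) & 0xFFFF

    def read_word(self, offset):
        with self._lock:
            return self._words[offset]


def hammer(eeprom, offset, count):
    """Apply `count` increments to one word of the simulated EEPROM."""
    for _ in range(count):
        eeprom.update_word(offset, 1)


def run_demo(threads=4, count=1000):
    """Run several concurrent writers and return the final word value."""
    eeprom = SimulatedEeprom()
    workers = [threading.Thread(target=hammer, args=(eeprom, 0, count))
               for _ in range(threads)]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return eeprom.read_word(0)
```

With the lock in place, every increment is accounted for no matter how the threads are scheduled; removing it makes lost updates possible.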
Figuring out what that "something else" is appears to be a challenge. The
problem is not readily reproducible, and there is the small complication that
triggering the bug more than once requires the replacement of the affected
hardware. It's not even clear which kernel versions are affected, though
it appears that only the 2.6.27 development series shows the bug. There is
some correlation between e1000e corruptions and graphics driver crashes,
leading David Miller to pursue a hypothesis that the real culprit is changes
to the X server, but that idea has not yet been
proven. Other developers suspect a concurrency-related problem similar to
the e1000 bug.
As of this writing, the bulk of what is known can be found in this
advisory from Mandriva. Kernel developers are adding information to the kernel bugzilla
entry as they find it.
It has been suggested that anybody running 2.6.27 on a potentially affected
system might want to save a copy of the current EEPROM contents with a
command like:
ethtool -e eth0 > eth0.eeprom
(That assumes, of course, that the relevant device is eth0 on your
system.) With the saved data, it should be possible to recover the device
if the worst happens; without it, chances are that victims will have to return
their systems to the vendor.
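A saved dump is also useful for noticing whether the EEPROM has changed since the backup was made. The following is a minimal sketch, assuming the dump uses the usual hex layout of one offset per row (for example `0x0010:` followed by space-separated byte values); that layout is an assumption here, and the helper names `parse_ethtool_dump()` and `first_difference()` are hypothetical.

```python
def parse_ethtool_dump(text):
    """Turn an 'ethtool -e' style hex dump into bytes (layout assumed)."""
    data = bytearray()
    for line in text.splitlines():
        prefix, sep, rest = line.partition(":")
        # Keep only rows such as "0x0010:  de ad be ef"; skip header lines.
        if not sep or not prefix.strip().lower().startswith("0x"):
            continue
        data.extend(int(token, 16) for token in rest.split())
    return bytes(data)


def first_difference(saved, current):
    """Return the first differing byte offset, or None if identical."""
    for offset, (a, b) in enumerate(zip(saved, current)):
        if a != b:
            return offset
    if len(saved) != len(current):
        # One dump is a strict prefix of the other.
        return min(len(saved), len(current))
    return None
```

Feeding the contents of the saved file and of a fresh dump through `parse_ethtool_dump()` and comparing the results with `first_difference()` would show the first offset at which the two disagree, if any.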
In one sense, this bug demonstrates that the system works. It was caught
while the kernel was still in the stabilization phase; one can be certain
that it will be obliterated somehow before any stable 2.6.27 release comes
out. On the other hand, the first report
of this problem hit the net on August 8; the problem was known for
over a month before distributors started responding to it and the all-out
hunt for the cause began. That is a long time for any regression to
persist, but it is especially long when one is dealing with a regression
which has the ability to regress hardware back to a stone-age state.
The distributors have now responded; most of them have withdrawn kernels
with the affected drivers. So far, nobody has posted tools to help
affected users recover their hardware (suggestions to use ibautil
should be ignored and forgotten about as soon as possible). Such a tool
is forthcoming, but it would be hard to blame the relevant
engineers for focusing on fixing the problem first. With any luck at all,
the root cause will have been isolated by the time you read this.
There is one thing that will not have changed, though. Testers of
unstable software - especially the kernel - have often been warned that
said software can do all kinds of terrible things to their systems. It is
easy to ignore those warnings; even -rc1 kernels actually work for most
people, most of the time. But, as we have seen in this case, the
potential for catastrophic bugs is real. Development code can brick your
network adapter, scramble your filesystems, open up severe security holes,
or save your documents as OOXML. When experimenting with unstable code -
even if it has been neatly packaged by your distributor - it is always
prudent to have good backups and an even better sense of humor.
By Jonathan Corbet
September 24, 2008
On the final day of the Linux Plumbers Conference, Keith Packard ran a
microconference dedicated to future displays. A number of topics were
discussed there, but the key session had to do with the near-term future of
Linux video drivers. Longtime LWN readers will be more than familiar with
the story: Linux has multiple subsystems charged with managing graphics
hardware, the user-space driver model adopted by XFree86 leads to all kinds
of problems, support for 3D graphics is not what it should be, etc. That
whole story was recounted here, but with a notable difference: solutions
are in the final stabilization stages, and these problems will soon be
history.
There are two major components to the work which is being done: graphics
memory management and kernel-based mode setting. A contemporary graphics
processor (GPU) is really a CPU in all respects, including the possession
of a sophisticated memory management unit. Managing the sharing of memory
between user space, the kernel, and the GPU is fundamental to the
implementation of correct, high-performance graphics. One year ago, the TTM subsystem looked like the
solution to the memory management problem, but TTM grew increasingly
unworkable as the understanding of the problem improved. So now the Graphics Execution Manager (GEM)
code looks like the way forward; it is currently being prepared for merging
into the mainline kernel.
Kernel-based mode setting, instead, is meant to get user-space code out of
the business of messing around directly with the hardware. Putting the
kernel in charge of the configuration of the video adapter has a long list
of advantages. Suspend and resume have a much better chance of working,
for example. Once the X server stops accessing hardware directly, it no
longer needs to run as root; having that much untrusted code running with
full privileges has made people nervous for many years. In the current
scheme, the kernel cannot change the graphics mode if it needs to; that
means that, for example, if the system panics, a graphical user will never
see the message. With kernel-based mode setting, the kernel can switch to
a different mode and allow the user to frantically try to read the message
before it scrolls off the screen. Kernel-based mode setting will also make
fast user switching work much better, without the need to use a separate
virtual terminal for each user session.
One of the first topics of discussion was: how does the kernel decide when
to switch to the panic screen to show the user an important message? There
are quite a few different paths by which the kernel can indicate distress;
should a kernel message be presented every time a WARN_ON()
condition is encountered? There would appear to be a need to unify the
error paths in the kernel to help simplify this kind of decision. Jesse
Barnes suggested that the kernel could simply switch on every message
emitted with printk(), on the theory that such a policy would lead
to a rapid and welcome reduction in kernel verbosity.
The real debate in this session, though, had to do with development
process. As has been discussed
previously on LWN, much of the video driver work is done outside of the
mainline kernel tree. We are now seeing a big chunk of that work being
prepared for a merge. But the new mode setting interface is a big API
change which will require adjustments from user space; a new kernel
expecting to handle mode setting may not give the best results when run
with an older user space X server. So there will be a big flag day of
sorts when everything changes and all of the new code gets run for the
first time.
Linus is not pleased with the notion of a video graphics flag day; he made
a long appeal for a more incremental approach to fixing the video driver
work. In his opinion, the flag day will lead to a whole bunch of untested
code being made active all at once; there will certainly be design mistakes
which show up, and the whole thing will fail to work properly. At which
point another flag day will be required. Linus was not impressed by the
claim that Fedora users have selflessly been testing this code for
everybody; in his view, the kernel developers are not doing this testing.
He sees the whole thing as a recipe for disaster.
The real problem - and the reason for the out-of-tree development - is that
all of this work requires the creation of a number of new, complex
user-space ABIs. That is true for both mode setting and memory management,
and the two cannot be easily separated from each other. Until the
combination as a whole is seen to work, the video driver developers simply
cannot commit themselves to a stable user-space interface - and that means
that their code cannot be merged.
As an example, TTM was cited. Had that code been pushed when it looked
like the right solution, there would now be even bigger problems to solve.
In summary, the graphics developers believe that the approach they are
taking is as incremental as they can make it. Whether they convinced Linus
of that fact is unclear, but he eventually seemed to accept the plan. He
did ask for them to push the mode setting code upstream first, but that
code cannot work without memory management support. So GEM will go into
the mainline ahead of kernel-based mode setting. Once everything is in the
kernel, it will be possible to boot a system with either kernel-based or
user-space mode setting, so both new and old distributions will be
supported. Someday, in the distant future, support for mode setting in
user space can be removed. Much sooner than that, though, we should all be
running much-improved graphics code and will have long since forgotten how
things used to be.
By Jake Edge
September 24, 2008
A subtle change in 2.6.25 recently left Andrew Morton with a less than
completely functioning system, but it also demonstrated a user-space
interface that may sometimes be overlooked: SELinux. The problem stemmed
from a change to facilitate containers by making /proc/net into a
symbolic link, which tripped up SELinux policies that had been
written for earlier kernels. Putting policy into user space is a guiding
principle of kernel development, but it can sometimes create an unexpected
need to keep those policies synchronized with the kernel.
The change itself was fairly minor, making /proc/net be a symbolic
link to /proc/self/net so that containers would only see their
network devices, rather than those of the enclosing system. But when
Morton ran a recent kernel on his Fedora Core 5 and 6 systems, he got:
sony:/home/akpm> ifconfig -a
Warning: cannot open /proc/net/dev (Permission denied). Limited output.
Further investigation found that even ls got permission errors
when looking at /proc/net. As is usual with mysterious
"permission denied" errors, SELinux was the underlying cause.
When the change was made, back in March, it was reviewed by the SELinux
developers, but no one noticed that it would cause an additional permission
check—on the symbolic link itself. So, when resolving things like
/proc/net/dev or other entries in that directory, the "labels" on
the symbolic link were checked. Of course, /proc is a synthetic
filesystem, so the labels are generated from SELinux code rather than
retrieved from extended attributes (xattrs).
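The reason a symbolic link can attract its own check is that the link is a filesystem object separate from its target. A small sketch can show this without touching /proc or SELinux at all: a temporary directory stands in for /proc, and the helper names (`build_mock_proc()`, `describe_path()`, `demo()`) are hypothetical. It models only the link/target distinction, not the SELinux labeling itself.

```python
import os
import tempfile


def describe_path(path):
    """Report whether path is a symlink, plus its target and inodes."""
    info = {"is_symlink": os.path.islink(path), "target": None}
    if info["is_symlink"]:
        info["target"] = os.readlink(path)
    # lstat() describes the link itself; stat() follows it to the target.
    info["link_inode"] = os.lstat(path).st_ino
    info["target_inode"] = os.stat(path).st_ino
    return info


def build_mock_proc(root):
    """Create root/self/net/dev and a root/net -> self/net symlink."""
    real_dir = os.path.join(root, "self", "net")
    os.makedirs(real_dir)
    with open(os.path.join(real_dir, "dev"), "w") as f:
        f.write("lo: 0 0\n")
    link = os.path.join(root, "net")
    os.symlink(os.path.join("self", "net"), link)
    return link


def demo():
    """Build the mock tree and read a file through the symlink."""
    with tempfile.TemporaryDirectory() as root:
        link = build_mock_proc(root)
        info = describe_path(link)
        # Opening net/dev traverses the link, then the real directory.
        with open(os.path.join(link, "dev")) as f:
            info["dev_contents"] = f.read()
        return info
```

Because the link has its own inode and metadata, a security module can apply a check to it that never existed while the same name referred to a plain directory.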
Distributions have updated their policies to allow access to the symbolic
link (probably by noticing the SELinux denial in log messages), so most folks
never saw the problem. As Morton found out, though, existing distribution
policy files (those shipped with FC5 and FC6, for example) would still
disallow the access. Morton regularly runs newer
kernels with older distributions to try to catch exactly this kind of
error; he is probably one of very few, perhaps the only one, doing that.
Because the distribution-supplied kernel was being changed, some argued
that requiring users to update their SELinux policies is not an onerous
requirement.
Paul Moore puts it this way:
Maybe I'm in the minority here, but in my mind once you step away from the
distro supplied kernel (also applies to other packages, although those
are arguably less critical) you should also bear the responsibility to
make sure you upgrade/tweak/install whatever other bits need to be
fixed.
Morton did not buy that argument, saying:
Nope. Releasing a non-backward-compatible kernel.org kernel is a big
deal.
We'll do it sometimes, with long notice, much care and much deliberation.
We did it this time by sheer accident. That's known in the trade as a
"bug".
But SELinux developer Stephen Smalley points out that permissions checks
are not normally considered part of the kernel to user space interface. It
is something of a gray area, though. Clearly the standard UNIX permission
checks are part of that interface, at least partially because the
kernel does handle the policy for those checks. Since the policies that
govern the decisions about SELinux
access denial come from user space, it is a bit hard to argue that
changes to the kernel will not ripple out. Smalley describes the problem:
I should note here that for changes to SELinux, we have gone out of our
way to avoid such breakage to date through the introduction of
compatibility switches, policy flags to enable any new checks, etc
(albeit at a cost in complexity and ever creeping compatibility code).
But changes to the rest of the kernel can just as easily alter the set
of permission checks that get applied on a given operation, and I don't
think we are always going to be able to guarantee that new kernel + old
policy will Just Work.
One possible solution to the immediate problem was floated by Smalley:
SELinux could change the
label that it returns for symbolic links under /proc. It is not
clear that anyone really wants that change, and there has been no movement
to add it. As Morton says, "people who are shipping 2.6.25-
and 2.6.26-based distros probably
wouldn't want such a patch in their kernels anyway."
Longer term, Eric Biederman asks about
supporting xattrs for /proc. That would allow user space to label
the proc filesystem appropriately, removing one of the special cases.
Unfortunately, doing so would create yet another incompatibility between
newer kernels and older user spaces.
In the end, because the bug was only seen
by Morton, many months after it was introduced, it may just be ignored.
The larger issue of how permissions checks fit into the kernel to user
space interface, though, may rear its head again.
Page editor: Jonathan Corbet