By Jonathan Corbet
February 2, 2010
As of this writing, there have not yet been any distributor updates for the
vulnerability which will become known as CVE-2010-0307. This particular
bug does not (as far as your editor knows) allow a complete takeover of a
system, but it can be used for
denial-of-service attacks, or in a situation where an attacker with
unprivileged local access wishes to force a reboot. It is also an
illustration of the hazards which come with old and tricky code.
Mathias Krause reported the problem at the
end of January. It seems that, on an x86_64 system, a kernel panic can be
forced by trying (and failing) to exec() a 64-bit program while
running in 32-bit mode, then triggering a core dump. There does not seem
to be a way to exploit this bug to run arbitrary code - but those who would
take over systems have shown enough creativity in situations like this that
one can never be sure. Even without that, though, the ability to take any
64-bit x86 system down is not a good thing. Current kernels are affected,
as are older ones; your editor is not aware of anybody having taken the
time to determine when the problem first appeared, but Mathias has shown that
2.6.26 kernels contained the bug.
The execve() system call is the means by which a process stops
running one program and starts running a new one. It must clean up most
(but not all) of the state associated with the old program, resetting
things for the new one. In this process, there is a "point of no return":
the place where the system call is committed to making the change and can
no longer back out. Before this point, any sort of failure should lead to
an error return from the system call (which otherwise is not expected to
return at all); afterward, the only recourse is to kill the process
outright.
Sometime after the point of no return, execve() must adjust the
"personality" of the process to match the new executable image. For
example, a 64-bit process switching to a 32-bit image must go into the
32-bit personality. In the past, personalities have also been used to
emulate other operating environments - running SYSV binaries, for example.
The personality changes a number of aspects of the environment the program
runs in, though, as we'll see, fewer than it once did.
In the past, personality changes have included filesystem namespace
changes. That was necessary because the process of starting the new
executable could require looking up other images, such as an "interpreter"
image to run the new program. The lookup clearly had to happen prior to
the point of no return; if the lookup fails then the system call should
fail. So some aspects of the new image's environment had to be present
while the process was still running in the context of the old image.
The solution, at the time, was to put some brutal hacks into the low-level
SET_PERSONALITY() macro. This macro's job is to switch the
process to a new personality, but, post-hack, it no longer did that.
Instead, it would make the namespace changes, but leave most of the
environment unchanged, setting the special TIF_ABI_PENDING task
flag to remind the kernel that, at a later point, it needed to complete the
personality change. Over time, the namespace changes were removed from the
kernel, but this two-step personality switch mechanism remained.
This hackery allowed SET_PERSONALITY() to be called before the
point of no return without breaking the process of tearing down the old
image. What was missing, though, was any mechanism for fully restoring the
old personality should things change after the SET_PERSONALITY()
call. In effect, that call became the real point of no return,
since the kernel had no way of going back to how things were before.
There aren't too many ways that execve() could fail in the window
between the SET_PERSONALITY() call and the official point of no
return. But one is all it takes, and one easily accessible failure mode is
an inability to find the "interpreter" for the new image. The interpreter
need not be an executable; it's really the execution environment as a
whole. As it happens, there's no means by which a 32-bit process can run a
64-bit image; trying to do so leads to a failure in just the wrong part of
the execve() call. Control will return to the calling program,
but with a partially-corrupted personality setup.
As it happens, the most common response to an execve() failure is
to inform the user and exit; the calling program wasn't expecting to be
running any more, so it will normally just bail out. So the schizophrenic
personality it's running under will likely never be noticed. But if the
calling program instead takes a signal which forces a core dump, the
confused personality information will lead to an equally confused kernel and a
panic.
In summary, what we have here is a combination of tricky code, made worse
by inter-architecture compatibility concerns, implementing behavior which
is no longer needed - and doing it wrong. For added fun, it's worth noting
that this problem was reported in December,
but it fell through the cracks and remained unfixed.
The initial solution proposed by Linus was
to simply remove the early SET_PERSONALITY() call. After a bit of
discussion, though, Linus and H. Peter Anvin concluded that it was better
to fix the code for real. The result was a pair of patches, the
first of which splits flush_old_exec() (which contained the point
of no return deeply within) into two functions meant to run before and
after that point. This patch also gets rid of the early
SET_PERSONALITY() call. The
second patch then eliminates the TIF_ABI_PENDING hack, simply
doing the full personality change at the point of no return.
These changes were merged just prior to the release of 2.6.33-rc6. This is
a fairly significant pair of patches to put into the core kernel at this
late stage in the 2.6.33 development cycle. And, indeed, they have caused
some problems, especially with non-x86 architectures. Distributors looking
to backport this fix into older kernels may well find themselves looking
for a way to simplify it. But security fixes are important, and fixes
which get rid of cobweb-encrusted code which could be hiding other problems
are even better. The remaining problems should be cleaned up in short
order, and the 2.6.33 kernel will be better for it.
(
Log in to post comments)