The x86_64 DOS hole
Mathias Krause reported the problem at the end of January. It seems that, on an x86_64 system, a kernel panic can be forced by trying (and failing) to exec() a 64-bit program while running in 32-bit mode, then triggering a core dump. There does not seem to be a way to exploit this bug to run arbitrary code - but those who would take over systems have shown enough creativity in situations like this that one can never be sure. Even without that, though, the ability to take any 64-bit x86 system down is not a good thing. Current kernels are affected, as are older ones; your editor is not aware of anybody having taken the time to determine when the problem first appeared, but Mathias has shown that 2.6.26 kernels contained the bug.
The execve() system call is the means by which a process stops running one program and starts running a new one. It must clean up most (but not all) of the state associated with the old program, resetting things for the new one. In this process, there is a "point of no return": the place where the system call is committed to making the change and can no longer back out. Before this point, any sort of failure should lead to an error return from the system call (which otherwise is not expected to return at all); afterward, the only recourse is to kill the process outright.
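As a small user-space illustration of that contract, the sketch below calls execve() on a path that is assumed not to exist and observes the error return. The function name and the path are invented for this example; only execve() and errno come from the standard C library.

```c
#include <errno.h>
#include <unistd.h>

/*
 * Try to exec a path that is assumed not to exist. On success execve()
 * never returns; on failure it returns -1 and the caller is still running
 * the old program. Returns the errno value seen after the failure.
 */
int try_exec_missing(void)
{
    char *const argv[] = { "no-such-program", NULL };
    char *const envp[] = { NULL };

    errno = 0;
    int rc = execve("/nonexistent/no-such-program", argv, envp);

    /* Reaching this line at all means the exec failed. */
    return (rc == -1) ? errno : 0;
}
```

On a typical Linux system the lookup fails long before the point of no return, so the caller simply continues with errno set to ENOENT.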
Sometime after the point of no return, execve() must adjust the "personality" of the process to match the new executable image. For example, a 64-bit process switching to a 32-bit image must go into the 32-bit personality. In the past, personalities have also been used to emulate other operating environments - running SYSV binaries, for example. The personality changes a number of aspects of the environment the program runs in, though, as we'll see, fewer than it once did.
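User space can inspect the result of this switch through the personality() system call. The following is a minimal sketch, assuming a glibc system that provides <sys/personality.h>; the two helper names are made up for illustration.

```c
#include <sys/personality.h>

/* Return the current persona without changing it, or -1 on error. */
int current_persona(void)
{
    /* Passing 0xffffffff asks the kernel to report, not to modify. */
    return personality(0xffffffff);
}

/* Nonzero if the execution-domain bits select the 32-bit Linux persona. */
int is_linux32(int persona)
{
    return (persona & PER_MASK) == PER_LINUX32;
}
```

Calling is_linux32(current_persona()) in a program started under a 32-bit personality (for example through a tool such as setarch) would be expected to return nonzero, while an ordinary 64-bit process normally reports PER_LINUX.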
In the past, personality changes have included filesystem namespace changes. That was necessary because the process of starting the new executable could require looking up other images, such as an "interpreter" image to run the new program. The lookup clearly had to happen prior to the point of no return; if the lookup fails then the system call should fail. So some aspects of the new image's environment had to be present while the process was still running in the context of the old image.
The solution, at the time, was to put some brutal hacks into the low-level SET_PERSONALITY() macro. This macro's job is to switch the process to a new personality, but, post-hack, it no longer did that. Instead, it would make the namespace changes, but leave most of the environment unchanged, setting the special TIF_ABI_PENDING task flag to remind the kernel that, at a later point, it needed to complete the personality change. Over time, the namespace changes were removed from the kernel, but this two-step personality switch mechanism remained.
This hackery allowed SET_PERSONALITY() to be called before the point of no return without breaking the process of tearing down the old image. What was missing, though, was any mechanism for fully restoring the old personality should things fail after the SET_PERSONALITY() call. In effect, that call became the real point of no return, since the kernel had no way of going back to how things were before.
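The ordering problem can be modeled in a few lines of user-space C. This is a toy sketch, not kernel code: the structure, the enum, and the function names are all invented here, and the boolean field only stands in for the role the text assigns to TIF_ABI_PENDING.

```c
#include <stdbool.h>

/* Toy stand-ins for per-task state; none of these are kernel definitions. */
enum toy_abi { TOY_ABI_32, TOY_ABI_64 };

struct toy_task {
    enum toy_abi abi;    /* ABI the task is really running under */
    bool abi_pending;    /* models the "finish the switch later" reminder */
};

/* Step one: record that a switch is wanted, but do not perform it. */
static void toy_set_personality(struct toy_task *t, enum toy_abi wanted)
{
    t->abi_pending = (wanted != t->abi);
}

/*
 * Model of the problematic ordering: the early personality call comes
 * first, then a lookup that may fail. Returns 0 on success, -1 on failure.
 */
int toy_exec_buggy(struct toy_task *t, enum toy_abi wanted, bool interp_found)
{
    toy_set_personality(t, wanted);   /* early, before the commit point */

    if (!interp_found)
        return -1;                    /* error return; reminder left set */

    /* Point of no return: complete the deferred switch. */
    if (t->abi_pending) {
        t->abi = wanted;
        t->abi_pending = false;
    }
    return 0;
}
```

In this model, a failed call leaves the task with its old ABI but with the reminder still set, which is the kind of half-switched state the paragraph above describes.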
There aren't too many ways that execve() could fail in the window between the SET_PERSONALITY() call and the official point of no return. But one is all it takes, and one easily accessible failure mode is an inability to find the "interpreter" for the new image. The interpreter need not be an executable; it's really the execution environment as a whole. As it happens, there's no means by which a 32-bit process can run a 64-bit image; trying to do so leads to a failure in just the wrong part of the execve() call. Control will return to the calling program, but with a partially-corrupted personality setup.
As it happens, the most common response to an execve() failure is to inform the user and exit; the calling program wasn't expecting to be running any more, so it will normally just bail out. So the schizophrenic personality it's running under will likely never be noticed. But if the calling program instead takes a signal which forces a core dump, the confused personality information will lead to an equally confused kernel and a panic.
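The "inform the user and exit" pattern usually looks something like the sketch below. The function name is hypothetical, and the exit status of 127 is borrowed from shell convention rather than taken from the article.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child that tries to exec `path`. If the exec fails, the child
 * exits immediately with status 127. Returns the child's exit status,
 * or -1 if it could not be determined.
 */
int spawn_and_wait(const char *path)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        char *const argv[] = { (char *)path, NULL };
        char *const envp[] = { NULL };

        execve(path, argv, envp);
        /* Only reached when execve() failed: bail out right away. */
        _exit(127);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A child that leaves this quickly never gives the kernel a reason to look at its personality again, which is why the inconsistent state so rarely matters in practice.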
In summary, what we have here is a combination of tricky code, made worse by inter-architecture compatibility concerns, implementing behavior which is no longer needed - and doing it wrong. For added fun, it's worth noting that this problem was first reported in December, but it fell through the cracks and remained unfixed.
The initial solution proposed by Linus was to simply remove the early SET_PERSONALITY() call. After a bit of discussion, though, Linus and H. Peter Anvin concluded that it was better to fix the code for real. The result was a pair of patches, the first of which splits flush_old_exec() (which contained the point of no return deeply within) into two functions meant to run before and after that point. This patch also gets rid of the early SET_PERSONALITY() call. The second patch then eliminates the TIF_ABI_PENDING hack, simply doing the full personality change at the point of no return.
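The shape of the fix can be sketched in user space as a prepare step that may fail but changes nothing, followed by a commit step that cannot fail. This is a toy model under invented names and does not reproduce the actual patches.

```c
#include <stdbool.h>

/* Toy types for illustration only; not kernel definitions. */
enum abi_kind { KIND_32, KIND_64 };

struct model_task {
    enum abi_kind abi;
};

/* Everything that can still fail lives here and must not modify the task. */
static int model_prepare(const struct model_task *t, bool interp_found)
{
    (void)t;
    return interp_found ? 0 : -1;
}

/* Runs only after the commit; performs the whole switch in one step. */
static void model_commit(struct model_task *t, enum abi_kind wanted)
{
    t->abi = wanted;
}

int model_exec_fixed(struct model_task *t, enum abi_kind wanted,
                     bool interp_found)
{
    if (model_prepare(t, interp_found) != 0)
        return -1;      /* task untouched, so an error return is safe */

    /* Point of no return. */
    model_commit(t, wanted);
    return 0;
}
```

With this ordering there is no intermediate state to restore: a failed call returns with the task exactly as it was, and no pending flag is needed.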
These changes were merged just prior to the release of 2.6.33-rc6. This is a fairly significant pair of patches to put into the core kernel at this late stage in the 2.6.33 development cycle. And, indeed, they have caused some problems, especially with non-x86 architectures. Distributors looking to backport this fix into older kernels may well find themselves looking for a way to simplify it. But security fixes are important, and fixes which get rid of cobweb-encrusted code which could be hiding other problems are even better. The remaining problems should be cleaned up in short order, and the 2.6.33 kernel will be better for it.
Index entries for this article | |
---|---|
Kernel | Security/Vulnerabilities |
Security | Linux kernel |
Posted Feb 4, 2010 2:45 UTC (Thu)
by NCunningham (guest, #6457)
[Link] (2 responses)
This hackery allowed SET_PERSONALITY() to be called before the point of new return
should be "no return", methinks.
Posted Feb 4, 2010 16:04 UTC (Thu)
by xav (guest, #18536)
[Link] (1 responses)
Posted Feb 14, 2010 21:49 UTC (Sun)
by efexis (guest, #26355)
[Link]
Posted Feb 4, 2010 15:14 UTC (Thu)
by ortalo (guest, #4654)
[Link] (3 responses)
So, to me, most of the additional effort that may be needed currently for security would be to find a way to convey this trust to the less knowledgeable users. To convey it *honestly* of course. They may be very grateful for this additional tranquility, don't you think?
PS: One caveat with this process however, only average tranquility of the user base may improve. While appeasing the user base, we will probably spot empty holes in our own assurance statements. Most users certainly won't miss them but we may ourselves worry about them and end up sleeping a little less well than before.
Posted Feb 4, 2010 16:46 UTC (Thu)
by NAR (subscriber, #1313)
[Link] (1 responses)
You should read some of spender's comments here at LWN to get a slightly different view...
Given such stories, all in all, I do not especially worry about my kernel being exploited to actually harm me.
I don't know: problem first reported and missed on 15th December 2009, then again reported on 28 January 2010 - that's six weeks of head start for the black hats.
Posted Feb 4, 2010 18:35 UTC (Thu)
by ortalo (guest, #4654)
[Link]
6 weeks from first spot to correction for a DoS is not bad news from my point of view - that's precisely the point.
Furthermore, to show you that I am not simply of the angelic kind, 6 weeks is not an "honest" number. 6 weeks of vulnerability is over-optimistic, it does not take into account the time it will take for this correction to reach the standard Ubuntu kernel on my own computer.
Let's be honest, even pessimistic. Especially as such time measurements are probably not very relevant. Anyway, we have a need for evaluation before addressing the evaluation result.
I am confident Linux will not compare unfavorably - first because I suspect few systems will dare try to stand the comparison. And even in this case it will be worth knowing that linux <=X.Y.Z cannot be used for specific security applications. (*If* competitors can prove to be better of course, and if users cannot wait for X.Y+1.Z... ;-)
Posted Feb 5, 2010 19:36 UTC (Fri)
by PaXTeam (guest, #24616)
[Link]
can you provide a list of security fixes in the current stable tree (2.6.32.x) then? a whole world would be eternally grateful for that.
Posted Feb 4, 2010 22:32 UTC (Thu)
by jimparis (guest, #38647)
[Link] (1 responses)
Posted Feb 11, 2010 13:36 UTC (Thu)
by gvy (guest, #11981)
[Link]
Posted Feb 5, 2010 18:42 UTC (Fri)
by dmills (guest, #54200)
[Link]
I don't know anything like enough about the interaction of core files and the execution environment to know the answer, anyone?
Regards, Dan.
Linux kernel developers evidently demonstrate pretty high maturity with respect to security issues. (And it has been like this for nearly as long as I can remember...)
They also adhere to the general philosophy of public disclosure. (More precisely, no one among them has ever taken action to prevent permanently the disclosure of a vulnerability. They fix it.)
Given such stories, all in all, I do not especially worry about my kernel being exploited to actually harm me. (I suspect you don't either, do you?)
And I like this idea of fighting in a frequently fear-driven field using peaceful assurance. (Should be devastating... ;-)
Linux kernel developers evidently demonstrate pretty high maturity with respect to security issues. (And it has been like this for nearly as long as I can remember...)
It does not take into account the fact that some black hat (especially a "well-funded" one) may have spotted the vulnerability much earlier. (That's where there is often black magic at work in threat evaluation. But this parameter does exist, even though unobservable.)
It does not take into account the fact that several (how many btw?) distributed kernel versions may have the same flaw and that some of them have been deployed in the field and will never be corrected.
> precisely, no one among them has ever took action to prevent permanently
> the disclosure of a vulnerability. They fix it.)
A Linux/x86_64 DoS vulnerability
http://git.altlinux.org/people/ldv/packages/?p=startup.gi...
(I've been using kernel.core_pattern = /dev/null for a bit longer and it happened to be useful more than once)
Do core dumps contain the personality of the process?
If so, is it possible to artificially craft a core file that when loaded in a debugger sets up an invalid personality?