By Jake Edge
January 25, 2012
A privilege escalation in the kernel is always a serious threat that
leads kernel hackers and distributions to scramble to close the hole
quickly. That's exactly what happened after a January 17 report from Jüri
Aedla to the closed kernel security mailing list. But most people
didn't learn of the hole from Aedla (since he posted to a closed list), but
instead from Jason Donenfeld (aka zx2c4) who posted a detailed look at the flaw on January
22. The fix was made by Linus Torvalds and went
into the mainline on January 17, though with a commit message that
obfuscated the security implications—something that didn't sit well
with some.
The problem and exploit
The problem itself stems from the removal
of the restriction on writes to /proc/PID/mem that was merged for the
2.6.39 kernel. It was part of a patch set
that was specifically targeted at allowing debuggers to write to the memory
of processes easily via the /proc/PID/mem file. Unfortunately, it
left open a hole that Aedla and Donenfeld (at least) were able to exploit.
The posting by Donenfeld is worth a read for those interested in how
exploits of this sort are created. The problem starts with the fact that
the open() call for /proc/PID/mem does no additional
checking beyond the normal VFS permissions before returning a file
descriptor. That will prove to be a mistake, and one that Torvalds's fix
remedies. Instead of checks at open() time, the code would check
in write() and only allow
writing if the process being written to is the same as the process doing
the writing (i.e. task == current).
That restriction seems like it would make an exploit difficult, but it can
be avoided with an exec() and coercing the newly run program to do
the writing to itself. That will be dangerous if the newly run program is a setuid executable for
example. But there is another test that is
meant to block that particular path, by testing that
current->self_exec_id has the same value as it did at
open() time. self_exec_id is incremented every time that
a process does an exec(), so it will definitely be different after
executing the setuid binary. But, since it is simply incremented, one can
arrange (via fork()) to have a child process with the same
self_exec_id as the main process after the setuid
exec()
is done.
The child with the "correct" self_exec_id value (which it gets by doing
an exec()) can then
open the parent's /proc/PID/mem file (since there are no extra
checks on the open()) and pass the descriptor back to the parent via
Unix sockets. The parent then needs to arrange that the setuid
executable writes to that file descriptor once a seek() to the
proper address has been done. Finding that proper address and getting the
binary to write to the fd are the final pieces of the puzzle.
Donenfeld's example uses su
because it is not compiled as a position-independent executable (PIE) for
most distributions, which makes it easier to figure out which address to
use. He exploits the fact that su prints an error message when it
is passed an unknown username and the error message helpfully prints the
username passed. That allows the exploit to pass shellcode (i.e. binary
machine language that spawns a shell when executed) as the argument to
su.
After printing the error message, su calls the exit()
function (really exit@plt), which is what Donenfeld's exploit
overwrites. It finds the
address of the function using objdump, subtracts the length of the
error message that gets printed before the argument, and seeks the file to
that location. It uses dup2() to connect stderr to the
/proc/PID/mem file descriptor and execs
su "shellcode".
In pseudocode, it might look something like this:
if (!child && fork()) { /* child flag set based on -c */
/* first program invocation, this is parent, wait for fd from child */
fd = recv_fd(); /* get the fd from the child */
dup2(2, 15);
dup2(fd, 2); /* make fd be stderr */
lseek(fd, offset); /* offset to overwrite location */
exec("/bin/su", shellcode); /* will have self_exec_id == 1 */
}
else if (!child) {
/* this is the child from the fork(), exec with child flag */
exec("thisprog", "-c"); /* this program with -c (child) */
}
else {
/* child after exec, will have self_exec_id == 1 */
fd = open("/proc/PPID/mem", O_RDWR); /* open parent PID's mem file */
send_fd(fd); /* send the fd to the parent */
}
Of course Aedla's
proof-of-concept
or Donenfeld's
exploit
code are likely to be even more instructive.
It's obviously a complicated multi-step process, but it is also a completely
reliable way to get root privileges. Updates to Donenfeld's post show
exploits for distributions like Fedora that do build su as a PIE,
or for Gentoo where the read permissions on setuid binaries have been
removed so objdump can't be used to find the address of
the exit function. For Fedora, gpasswd can be substituted as it
is not built as a PIE, while on Gentoo, ptrace() can be used to
find the needed address. While it was believed that address space layout
randomization (ASLR) for PIEs would make exploitation much more difficult, that
proved to be only a small hurdle, at least on 32-bit systems.
The fix and reactions
The fix hit the mainline without any coordination with Linux
distributions. Kees Cook, who works on ChromeOS security (and
formerly was a member of the Ubuntu security team), told LWN that Red Hat has a person on the closed
kernel security mailing list, so it was aware of the problem but did not
share that information on the Linux distribution security list. "I've been told this will
change in the future, but I'm worried it will be overlooked again",
he said. The first indication that
other distributions had was likely from Red Hat's Eugene Teo's request for a CVE on the
oss-security mailing list.
As Cook points out, the abrupt public disclosure of the bug (via a mainline
commit) runs
counter to the policy described in
the kernel's Documentation/SecurityBugs
file, where the default policy is to leave roughly seven days between
reports to the mailing list and public disclosure to allow time for vendors
to fix the problem. Cook is concerned that bugs
reported to security@kernel.org are not being handled reasonably:
The current behavior of security@kernel.org harms end users, harms
distros, and harms security researchers, all while ignoring their own
published standards of notification. I have repeatedly seen the security@kernel.org list hold a
double-standard of "it is urgent to publish this security fix" and
"it's just a bug like any other bug". If it were just a bug, there
should be no problem in delaying publication. If it were an urgent
security fix, all the distros should be notified.
The "just a bug" refers to statements that Torvalds has made over the years
about security bugs being no different than any other kind of bug. In
email, Torvalds
described it this way:
To me, a bug is a bug. Nothing more, nothing less. Some bugs are
critical, but it's not about some random "security" crap - it could be
because it causes a machine to crash, or it could be because it causes
some user application to misbehave.
In keeping with that philosophy, Torvalds does not disclose the security
relevance of a fix in the commit message: "I think the whole 'mark this patch as having security implications' is
pure and utter garbage". Even if there is a known security problem
that is being fixed, his commit
messages do not reflect that, as with the message for the
/proc/PID/mem fix:
Jüri Aedla reported that the /proc/<pid>/mem handling really isn't very
robust, and it also doesn't match the permission checking of any of the
other related files.
This changes it to do the permission checks at open time, and instead of
tracking the process, it tracks the VM at the time of the open. That
simplifies the code a lot, but does mean that if you hold the file
descriptor open over an execve(), you'll continue to read from the _old_
VM.
Torvalds's commit message stands in pretty stark contrast to Aedla's report
to security@kernel.org (linked above):
I have found a privilege escalation vulnerability, introduced by making
/proc/<pid>/mem writable. It is possible to open /proc/self/mem as stdout or
stderr before executing a SUID. This leads to SUID writing to it's own memory.
This "masking" of the actual reason for a commit doesn't site well with
either Cook or Teo (who also responded to an email query). Cook "cannot overstate how much I am
against this kind of masking", while Teo pointed out that this
particular bug is in no way unique:
There are many kernel vulnerabilities that were fixed silently in the
upstream kernel. This is not the first one, nor will be the last one I'm
afraid.
Both Teo and Cook were in agreement that disclosing what is known about a
fix at the time it is applied can only help distributions and others trying
to track kernel development. Torvalds, on the other hand, is concerned
about attackers reading commit messages, which could lead to more attacks
against Linux systems. He has a well-known contempt for security
"clowns" that seems to also factor into his reasoning:
So I just ignore the idiots, and go "fix things asap, but try not to
help black hats". No games, no crap, just get the damn work done and
don't make a circus out of it.
Both the security camps hate me. The full disclosure people think I
try to hide things (which is true), while the embargo people think I
despise their corrupt arses (which is also true).
The strange thing is that by explicitly not putting the known
security implications of a patch into the commit message, Torvalds
is treating security bugs differently. They are no longer "just
bugs" because some of the details of the bug are being purposely omitted.
That may make it difficult for "black hats"—though it would be
somewhat surprising if it did—but it definitely makes it more difficult
for those who are trying to keep Linux users secure. Worse yet, it makes
it more difficult down the road when someone is looking at a commit (or
reversion) in isolation because they may miss out on some important context.
Silent security fixes are a hallmark of proprietary software, and
Torvalds's policy resembles that to some extent. It could be argued (and
presumably would be by Torvalds and others) that the fixes aren't
silent since they go into a public repository and that is
true—as far as it goes. By deliberately omitting important
information about the bug, which is not done for most or all other
bugs, perhaps they aren't so much silent as they are "muted" or, sadly,
"covered up". There is definitely a lot of validity to Torvalds's
complaints about the security "circus", but his reaction to that circus
may not be in the best interests of the kernel community either.
(
Log in to post comments)