By Jonathan Corbet
February 19, 2013
A security-oriented firm called Trustwave recently sent out a preview of
an upcoming report [PDF] that features some focused criticism of how the
Linux community handles security vulnerabilities. Indeed, it says:
"Software developers vary greatly in their ability to respond and
patch zero-day vulnerabilities. In this study, the Linux platform had the
worst response time, with almost three years on average from initial
vulnerability to patch." Whether or not one is happy with how
security updates work with Linux, three years sounds like a rather longer
response time than most of us normally expect. Your editor decided to
examine the situation by focusing on two vulnerabilities that are said to
be included in the Trustwave report and one that is not.
Three years?
As of this writing, Trustwave's full report is not available, so a detailed
look at its claims is not possible. But, according to a
ZDNet article, the average response time was calculated from these two
"zero-day" vulnerabilities:
- CVE-2009-4307: a divide-by-zero crash
in the ext4 filesystem code. Causing this oops requires convincing
the user to mount a specially-crafted ext4 filesystem image.
- CVE-2009-4020: a buffer overflow in
the HFS+ filesystem exploitable, once again, by convincing a user to
mount a specially-crafted filesystem image on the target system.
The ext4 problem was reported on
October 1, 2009 by R.N. Sastry, who had been doing some filesystem fuzz
testing. The report included the filesystem image that triggered the bug —
that is the "exploit code" that Trustwave used to call this bug a zero-day
vulnerability. Since the problem was limited to a kernel oops, and since
it required the victim's cooperation (in the form of mounting the
attacker's filesystem) to trigger, the ext4 developers did
not feel the need to drop everything and fix it immediately; Ted Ts'o
committed a fix toward the end of November. SUSE was the first
distributor to issue an update containing the fix; that happened on
January 17, 2010. Red Hat did not put out an update until the end of March
(nearly six months after the problem was disclosed), and Mandriva waited
until February of 2011.
One might argue that things happened slowly, even for an extremely
low-priority bug, but where does "three years" come from? It turns out
that the fix did not work properly on the x86 architecture; Xi Wang
reported the problem's continued existence on December 26, 2011, and sent
a proper fix on January 9, 2012. A new CVE number (CVE-2012-2100) was
assigned for the problem
and the fix was promptly committed into the mainline. Distributors were a
bit slow to catch up, though; Debian issued an update in March, Ubuntu in
May, and Red Hat waited until mid-November — nearly eleven months after
disclosure — to ship the fix to its users. The elapsed time from the
initial disclosure until Red Hat's shipping an update that fixes the
problem properly is, indeed, just over three years.
The story for the HFS/HFS+ vulnerability is similar. An initial patch
fixing a buffer overflow in the HFS filesystem was posted by Amerigo Wang
at the beginning of December, 2009. The fix was committed by Linus on
December 15, and distributor updates began with Red Hat's on
January 19, 2010. Some distributors were rather slower, but it was
another hard-to-exploit bug that was deemed to have a low priority.
The problem is that the kernel supports another (newer) filesystem called
HFS+. It is a separate filesystem implementation, but it contains a fair
amount of code that was cut-and-pasted from the original HFS
implementation, much as ext4
started with a copy of the ext3 code. The danger of this type of code
duplication is well known: developers will fix a bug in one copy but not
realize that the same issue may be present in the other copy as well.
Naturally enough, that was the case here; the HFS+ filesystem had the same
buffer overflow vulnerability, but nobody thought to do anything about it
until Timo Warns quietly told a few kernel developers about it at the end
of April 2012. Greg Kroah-Hartman committed a fix on May 4, and the
problem was publicly disclosed a few days after that. Once again, a new
CVE number (CVE-2012-2319) was assigned, and, once again,
distributors dawdled with the fixes; openSUSE sent an update in June, while
Red Hat waited until October, five months after the problem became known.
The time period from the initial disclosure of the HFS vulnerability until
Red Hat's update for the HFS+ problem was just short of three years.
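For readers who want to check the arithmetic, the two elapsed-time figures can be reproduced with a short Python sketch. It uses only the dates given above; where only a month is given (Red Hat's mid-November and October updates, and the "beginning of December" HFS patch), the day of the month is an assumption chosen for illustration, so the results are approximate. The constant and function names are likewise made up for this example.

```python
from datetime import date

# Dates taken from the article. Where the article gives only a month,
# the day is an ASSUMPTION made for illustration and is marked as such.
EXT4_DISCLOSED = date(2009, 10, 1)        # report by R.N. Sastry
EXT4_FIRST_UPDATE = date(2010, 1, 17)     # first distributor update (SUSE)
EXT4_LAST_UPDATE = date(2012, 11, 15)     # Red Hat, "mid-November" (day assumed)

HFS_DISCLOSED = date(2009, 12, 1)         # "beginning of December" (day assumed)
HFS_FIRST_UPDATE = date(2010, 1, 19)      # first distributor update (Red Hat)
HFSPLUS_LAST_UPDATE = date(2012, 10, 15)  # Red Hat, "October" (day assumed)


def years_between(start: date, end: date) -> float:
    """Elapsed time in years, using an average year of 365.25 days."""
    return (end - start).days / 365.25


# The "longest possible" interpretation: first disclosure of the original
# bug until Red Hat shipped the second, proper fix.
ext4_years = years_between(EXT4_DISCLOSED, EXT4_LAST_UPDATE)
hfs_years = years_between(HFS_DISCLOSED, HFSPLUS_LAST_UPDATE)
average_years = (ext4_years + hfs_years) / 2

# For contrast: first disclosure until the first distributor update.
ext4_first_days = (EXT4_FIRST_UPDATE - EXT4_DISCLOSED).days
hfs_first_days = (HFS_FIRST_UPDATE - HFS_DISCLOSED).days

if __name__ == "__main__":
    print(f"ext4, disclosure to last fix:    {ext4_years:.2f} years")
    print(f"HFS/HFS+, disclosure to last fix: {hfs_years:.2f} years")
    print(f"average of the two:               {average_years:.2f} years")
    print(f"ext4, disclosure to first update: {ext4_first_days} days")
    print(f"HFS, disclosure to first update:  {hfs_first_days} days")
```

Under these assumptions the ext4 interval comes out slightly above three years and the HFS/HFS+ interval slightly below, while the first distributor updates arrived within a few months in both cases.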
One could look at this situation two ways. On one hand, Trustwave has clearly chosen
its vulnerabilities carefully, then applied an interpretation that yielded
the longest delay possible. Neither story above describes a zero-day
vulnerability knowingly left open for three years; for most of that time,
it was assumed that the problems had been fixed. That is doubly true for
the HFS+ filesystem, for which the vulnerability was not even disclosed
until May, 2012. Given the nature of the vulnerabilities, it is highly
unlikely that the black hats were jealously guarding them in the meantime;
the odds are good that no system has ever been compromised by exploiting
either one of them. Trustwave's claims, if they are indeed built on these
two vulnerabilities, are dubious and exaggerated at best.
On the other hand, even low-priority vulnerabilities requiring the victim's
cooperation should be fixed — and fixed properly — in a timely manner,
and it is not at all clear that happened with these problems. The
response to the ext4 problem was arguably fast enough given the nature of
the problem, but the fact that the problem persisted on the obscure x86
architecture suggests that the testing applied to that fix was, at best,
incomplete. In the HFS/HFS+ case, one could argue that somebody
should have thought to check for copies of the bug elsewhere. The fact
that the HFS and HFS+ filesystems are nearly unused and nearly unmaintained
did not
help in this case, but attackers do not restrict themselves to
well-maintained code. And, for both bugs, distributors took their time to get
the fixes out to their users. We can do better than that.
Meanwhile, in 2013
Perhaps the slowness observed above is the natural response to
vulnerabilities that nobody is actually all that worried about. Had they
been something more serious, it could be argued, the response would have
been better. As it happens, there is an open issue at the time of this
writing that can be examined to see how well we do respond; the answer
is a bit discouraging.
On January 20, a discussion on the private kernel security list went public
with a patch posting by Oleg Nesterov. It seems that the Linux
implementation of the ptrace() system call contains a race condition: a
traced process's registers can be changed in a way that causes the kernel
to restore that process's stack contents to an arbitrary location. The end
result is the ability to run arbitrary code in kernel mode. It is a local
attack,
in that the attacker needs to be able to run an exploit program on the
target system. But, given the ability to run such a program, the attacker
can obtain full root privileges. That is the kind of vulnerability
that needs quick attention; it puts every system out there at the mercy of
any untrusted users that may have accounts there — or at the mercy of any
attacker that may be able to
compromise a network service to run an arbitrary program.
On February 15, the vulnerability was disclosed as such, complete with handy exploit
code for those who do not wish to write their own. Most victims are
unlikely to apply the kernel patch included with the exploit that makes the
race condition easier to
hit; the exploit also needs the ability to run a process with real-time
priority to win the race more reliably.
But, even without the patch or real-time scheduling, a sufficiently patient
attacker should be able to
time things right eventually. Solar Designer reacted to the disclosure this way:
"I haven't looked into this closely yet, but at first glance it
looks like the worst Linux kernel vulnerability in a few years.
For distro vendor kernels (rather than mainline, which was patched
almost a month ago), this is a 0-day."
Arguably this should not be a zero-day vulnerability: the public discussion
of the fix is nearly one month old, and the private discussion had been
going on for some time before. But, as of this writing, no distributors
have issued updates for this problem. That leads to some obvious
questions; quoting Solar Designer again:
"The mainline commits from January are by Oleg Nesterov of Red Hat.
Why wasn't(?) the issue handled with due severity within Red Hat,
then - such that Red Hat would at the very least have a statement
on whether and which of their kernels are affected by now."
One assumes that such a statement will be forthcoming in the near future. In the meantime,
users and system administrators worldwide need to be worried about whether
their systems are vulnerable and who might be exploiting the problem.
Once again, we can do better than that. This bug was known to be a serious
vulnerability from the outset; one of the developers who reported it
(Salman Qazi, of Google) also provided the exploit code to show how severe
the situation was. Distributors knew about the problem and had time to
respond to it — but that response did not happen in a timely manner. The
ptrace() problem will certainly be
straightened out in less than three years, but that still may not be a
reason for pride. Users should not be left wondering what the situation is
(at least) one month after distributors know about a serious vulnerability.