By Jake Edge
July 3, 2013
Reporting 1200 bugs, some of which may have security implications, is
sure to overwhelm any distribution's bug-handling abilities. So it was
rather helpful that Alexandre Rebert started by posting to the debian-devel mailing list
rather than just flooding the bug tracker.
Beyond just the sheer number of bugs, though, there is a question of
dealing with so many potential security issues, which are generally handled
differently than regular bugs.
Rebert
and other security researchers at Carnegie Mellon University (CMU) found the bugs
in binaries from the Debian repositories using an automated bug finder
called Mayhem.
Mayhem is a closed-source research project at CMU CyLab that
uses symbolic execution on binary programs to find exploitable bugs in the
code.
It does its job by looking for load and store instructions that can be
influenced by the inputs to the program. It examines the paths
through the program using a "hybrid symbolic execution" mechanism that
combines normal execution of the program with symbolic execution of an
intermediate language representation that is created whenever a tainted
(i.e. dependent on
user input) branch condition is detected. The symbolic execution looks for
ways to exploit the tainted code and builds an exploit if it can. The
Mayhem paper goes into a lot more detail, perhaps enough for others to
reproduce the technique.
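The first step described above, spotting stores whose address can be influenced by input, can be illustrated with a toy taint tracker. This is only a sketch under heavy assumptions: the three-instruction "machine", the `Value` class, and the `run()` helper are all invented for illustration, and the code tracks a single taint bit rather than building the symbolic formulas and exploits that the Mayhem paper describes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Value:
    """A concrete value plus a flag saying whether it depends on program input."""
    concrete: int
    tainted: bool = False


def add(a: Value, b: Value) -> Value:
    # Taint propagates: the result depends on input if either operand does.
    return Value(a.concrete + b.concrete, a.tainted or b.tainted)


def run(program, user_input):
    """Run a toy program; return the indexes of stores through tainted addresses."""
    regs = {"in": Value(user_input, tainted=True)}  # "in" holds the user input
    memory = {}
    findings = []
    for pc, (op, dst, a, b) in enumerate(program):
        if op == "const":
            regs[dst] = Value(a)
        elif op == "add":
            regs[dst] = add(regs[a], regs[b])
        elif op == "store":
            addr, val = regs[a], regs[b]
            if addr.tainted:
                # The input controls where this write lands: worth a closer look.
                findings.append(pc)
            memory[addr.concrete] = val.concrete
    return findings


# Hypothetical program: one store to a fixed address, one to base + input.
program = [
    ("const", "base", 0x1000, None),
    ("const", "one", 1, None),
    ("add", "ptr", "base", "in"),
    ("store", None, "base", "one"),  # address is a constant
    ("store", None, "ptr", "one"),   # address depends on the input
]

findings = run(program, 8)
```

Only the second store (instruction index 4) is flagged. A real tool would then reason symbolically about which input values make that write useful to an attacker.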
The bugs are "exploitable" in the sense that each crash can be turned into
arbitrary code execution. While code execution bugs are serious, the programs in question are
typically run by regular users from the shell, so being able to get a shell
(which is the usual proof of concept used by demonstration exploits as
well as by Mayhem) is not a huge accomplishment. But being able to get a
shell means that an exploit could do anything the user could do, including
exposing or deleting files, participating in a botnet, sending spam, and so
on. The exploits require specially crafted arguments and/or input files to
trigger the bugs, so users would have to be tricked into running the
programs that way.
Of course, any setuid programs or those accessible via the web or other
internet services are a much larger concern. That's not to downplay what
the Mayhem team has done in any way, but fuzzing has shown us that
arbitrary inputs to programs often lead to crashes—the trick is finding a
way to get users to provide crafted inputs that lead to an interesting (to
the attacker) result. Regardless, the bugs do need to be fixed, and
the Mayhem team has provided a wealth of information to do just that.
Each bug report comes with a tar file (an example for
gcov was provided with Rebert's message) that contains a script to
reproduce the problem, files containing the arguments and input that cause
the crash, the core dump, and more. Reports for each of the bugs were sent
to the appropriate Debian package maintainers, though some of those
addresses were
actually mailing lists, as Paul Wise pointed
out. That allows us to see some of the reports, including
one
for the nfsidmap binary in the nfs-common package. Rebert's
message also linked to a text file that lists
all of the affected packages and their maintainers.
There are almost certainly more bugs out there for Mayhem to find, as the
team limited the search space of the tool, allowing just five minutes of
run time per binary. They also limited the bugs reported to one per binary
and five per package. There are likely to be plenty of duplicate bugs on
the list as well; bugs in libraries may well appear for multiple binaries.
And, of course, the bugs aren't limited to Debian, as many of the packages
will be in the repositories of lots of different distributions; few, if
any, of the bugs will be Debian-specific.
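The reporting caps mentioned above (one bug per binary, five per package) amount to a simple filter over the raw crash list. The sketch below assumes crashes arrive as (package, binary, crash id) tuples in discovery order; that layout and the `select_reports()` name are made up for illustration and say nothing about how the Mayhem team actually implemented the limits.

```python
from collections import defaultdict

MAX_PER_BINARY = 1   # caps taken from the limits described in the article
MAX_PER_PACKAGE = 5


def select_reports(crashes):
    """Keep at most one crash per binary and five per package, in input order."""
    per_binary = defaultdict(int)
    per_package = defaultdict(int)
    selected = []
    for package, binary, crash_id in crashes:
        if per_binary[(package, binary)] >= MAX_PER_BINARY:
            continue  # this binary already has its one report
        if per_package[package] >= MAX_PER_PACKAGE:
            continue  # this package already has its five reports
        per_binary[(package, binary)] += 1
        per_package[package] += 1
        selected.append((package, binary, crash_id))
    return selected
```

Note that a filter like this does nothing about duplicates across binaries: a crash inside a shared library would still be reported once for every binary that trips over it.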
Unfortunately, there is no automated way to extract addresses for the
upstream developers or mailing lists from the Debian packages. The bug
reports may ultimately need to make their way upstream, but the Mayhem team
couldn't find a way to do that, so they started with the Debian
maintainers. As Andreas
Tille noted, some
packages may have implemented the machine-readable debian/copyright
file, which might provide an upstream contact and email address. But,
for security reports, even that may not be the right place to send the
message.
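For packages that do use the machine-readable format, the contact lives in the optional Upstream-Contact field of the file's first (header) paragraph. A rough sketch of pulling it out follows; the `upstream_contact()` helper and the sample text are hypothetical, and the parser handles only the simple "Field: value" layout with indented continuation lines, so it is an illustration rather than a full implementation of the format.

```python
def upstream_contact(copyright_text):
    """Return the Upstream-Contact value from a machine-readable
    debian/copyright file, or None if it is absent or the file is free-form."""
    header = {}
    field = None
    for line in copyright_text.splitlines():
        if not line.strip():
            if header:
                break  # a blank line ends the header paragraph
            continue
        if line[0] in " \t":
            # Indented line: continuation of the previous field, if any.
            if field is not None:
                header[field] += "\n" + line.strip()
            continue
        name, sep, value = line.partition(":")
        if not sep:
            continue  # not a "Field: value" line
        field = name.strip().lower()
        header[field] = value.strip()
    # Machine-readable files start with a Format field; free-form ones do not.
    if not header.get("format"):
        return None
    contact = header.get("upstream-contact", "").strip()
    return contact or None


# Made-up example file; the package and the contact are fictitious.
SAMPLE = """\
Format: https://www.debian.org/doc/packaging-manuals/copyright-format/1.0/
Upstream-Name: example
Upstream-Contact: Jane Doe <jane@example.org>
Source: https://example.org/

Files: *
Copyright: 2013 Jane Doe
License: GPL-2+
"""

contact = upstream_contact(SAMPLE)
```

Since the field is optional and many packages still ship free-form copyright files, a `None` result would be common, which is the limitation Tille's suggestion runs into.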
But, in fact, Rebert has recognized that the
security tag on most of the proposed bug reports was probably not accurate. "It looks like a majority of the crashes have
little security implications", he said, so that tag will be removed
before the actual bug reports get submitted. It isn't clear that a
security contact would be needed in the majority of cases but, since Mayhem
sets out to find exploitable bugs, "responsible disclosure" might still
indicate that a security list or email should be used to report the problems.
The problem is, in some ways, similar to the question of where bugs should be filed that we
reported on last week. Which bug tracker (distribution or upstream) to use
is contentious enough when looking at single bugs reported by users; 1200
bugs increases the scale of the problem significantly. The clear
indication is that Mayhem could find lots more if it were given free rein,
though the duplicates need to be eliminated or substantially reduced or the
team risks overwhelming distributions and upstreams.
The "huge pile of bugs" problem is a consequence of the closed-source
nature of Mayhem. If the tool were available to be used by various
projects' developers as part of their testing, the bugs could be
found and fixed in the normal course of development. Rebert mentioned the
possibility of creating some kind of Mayhem web service, but it would be
far more useful if the tool were free software (even "free as in beer" would
be better than the existing situation). Since public funds were used to
develop the tool, one might hope the public would get a bit more out of
that spending. The Mayhem paper mentions that the
US Defense Advanced Research Projects
Agency (DARPA) helped fund some of the work, but, alas, that funding doesn't
seem to come with a mandate to publish the source.
It's clear that running Mayhem on the 23,000 or so binaries found in the
Debian "Wheezy" repository has found real bugs, some of which are
"exploitable" in limited scenarios. Some are probably worse than that,
however, and as the tool improves, it may be able to home in on more
dangerous bugs. One might guess that CMU and the Mayhem developers plan
to commercialize Mayhem. That is, of course, their prerogative, but it is
unfortunate that tools like Mayhem and the Coverity static analyzer
(which came out of Stanford University)
are not free software tools. One suspects they would see much more
use—and, possibly,
improvement—if they were.