Fuzz and strings

By Jake Edge
November 19, 2014

The strings utility is commonly used to scan binary files for "interesting" strings. But, as Michał Zalewski found out when fuzz testing the tool, it does a lot more than just loop through the file looking for bytes of interest (e.g. ASCII). In fact, by default it does so much more that scanning random files with strings is seriously contraindicated.

One of the more frequent uses for strings is to scan binaries from various potentially dodgy sites to check for the presence of certain code (say, GPL-covered code), undocumented options, or other bits buried in the binary. But, as Zalewski showed, that could lead directly to code execution, which is probably not what the investigator intended. The problem stems from the fact that, by default, strings tries to be "helpful" by using other code at its disposal to parse the file, and that code turns out to be far more dangerous than a simple byte scan.

strings comes from the GNU binutils project, which also provides the "Binary File Descriptor" library (libbfd). That library is used by several tools in binutils to tease apart various object file formats into an abstract representation. When strings detects an executable or object file in a format understood by libbfd, it uses the library to only search those pieces that might contain strings of interest. Unfortunately, libbfd appears to have some serious flaws in parsing the format of some object files, which can lead to exploitable crashes.

When running his american fuzzy lop (afl) fuzzer, Zalewski found "a range of troubling and likely exploitable out-of-bounds crashes due to very limited range checking". He went on to show an example where the code reads and writes using an attacker-controlled pointer, which is a recipe for code execution. In addition, he said that many Linux distributions do not build strings with address-space layout randomization (ASLR), making attacks using strings even easier. For example, many binaries in Fedora are not built as position-independent executables (PIEs), which is what allows ASLR. Fedora doesn't build PIEs for everything because of the extra overhead in executing an ASLR-protected binary. Network-facing programs (e.g. sshd), however, are shipped with ASLR protection.
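To make the PIE distinction concrete, here is a minimal sketch (not taken from Zalewski's post) that reads the ELF header directly with od rather than handing the file to a libbfd-based tool. The elf_type function name is hypothetical, and the sketch assumes a POSIX shell with od and tr available. Note that a type of ET_DYN covers both PIEs and shared objects, so the result is a hint rather than proof that ASLR applies to the main executable.

```shell
# Hypothetical helper: report the ELF e_type of a file without parsing
# anything beyond the first 18 bytes of the header.
elf_type() {
    # Bytes 0-3 must be the ELF magic number: 0x7f 'E' 'L' 'F'.
    magic=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
    [ "$magic" = "7f454c46" ] || { echo "not-elf"; return 1; }

    # Byte 5 (EI_DATA) is 1 for little-endian, 2 for big-endian.
    data=$(od -An -tu1 -j5 -N1 "$1" | tr -d ' \n')

    # e_type is the 16-bit field at offset 16; pick its low-order byte.
    if [ "$data" = "2" ]; then off=17; else off=16; fi
    t=$(od -An -tu1 -j"$off" -N1 "$1" | tr -d ' \n')

    case "$t" in
        2) echo "ET_EXEC (fixed address, not PIE)" ;;
        3) echo "ET_DYN (PIE or shared object)" ;;
        *) echo "other ($t)" ;;
    esac
}

# Example: elf_type "$(command -v strings)"
```

Only the magic number, the byte-order flag, and the type field are consulted, so a malformed file has very little to work with.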

Any attack requires somehow convincing the victim to run strings on a crafted binary file; that could result in the compromise of the account used to run the program, which is presumably—hopefully—not root. Using the -a (or --all) switch for strings will revert its behavior to what folks arguably expect: a simple loop through the binary looking for consecutive bytes of interest. That is unlikely to lead to any kind of compromise. Zalewski suggested that distributions may want to consider making strings -a the default in the future.
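For illustration, the "simple loop" behavior can be approximated with standard text tools. This is only a sketch of the idea, not how strings itself is implemented: the plain_strings name is hypothetical, and the minimum run length of four characters and the treatment of tab as printable are assumptions chosen to roughly match the strings defaults.

```shell
# Hypothetical helper: a plain byte scan, roughly what "strings -a" does.
# It never interprets object-file structure; it only filters bytes.
plain_strings() {
    # Replace every byte that is not printable ASCII (or tab) with a
    # newline, then keep only the runs that are at least 4 characters long.
    LC_ALL=C tr -c '\11\40-\176' '\n' < "$1" | LC_ALL=C grep -E '.{4,}'
}

# Example: plain_strings suspicious.bin
```

Because the input is treated as an opaque byte stream, there is no file-format parser for a crafted binary to confuse.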

Other tools in binutils (e.g. objdump) may also suffer from similar problems, as they use libbfd too. Users of those tools may be a bit more aware that they are taking some risk when running them on random, possibly attacker-controlled binaries, however. The comments on the post indicate that GDB may also be vulnerable since it uses libbfd as well.

Zalewski reported the problem to the binutils project, leading to some fixes, though there has not yet been a new binutils release. Interestingly, Tavis Ormandy found a related bug all the way back in 2005.

In an aside, Zalewski noted that he did the fuzzing on his matchbox-sized 10-core fuzzing cluster. That report compares the price-performance tradeoffs for several different fuzzing platforms (e.g. "cloud", virtual private server, his Intel Edison cluster). With luck, that may help lead others to start doing more focused fuzzing of our tools.


Index entries for this article
Security: Fuzzing
Security: Vulnerabilities/Code execution



Fuzz and strings

Posted Nov 20, 2014 8:42 UTC (Thu) by jezuch (subscriber, #52988) [Link] (4 responses)

> Fedora doesn't build PIEs for everything because of the extra overhead in executing an ASLR-protected binary.

May I postulate for consideration a proposition that computers are fast enough these days and we have enough "spare cycles" not to worry about things like that for all except the most CPU-intensive or latency-sensitive programs?

Fuzz and strings

Posted Nov 20, 2014 10:58 UTC (Thu) by faramir (subscriber, #2327) [Link] (1 responses)

For a program that you start and leave running for a long time, you might be correct. However, if one is running an intensive shell script, it might start up a simple program thousands of times. Or even one-liners like the following:

find . -type f -name '*.txt' -exec grep foobar \{\} /dev/null \;

I'm sure there are more efficient ways to do that (probably xargs), but I still think it is worth considering.

Fuzz and strings

Posted Nov 21, 2014 9:51 UTC (Fri) by oever (guest, #987) [Link]

Yes, there is a more efficient way to do that. You can make that command roughly 8x faster by using + instead of \;. Also, you do not need to escape the {}.

  find . -type f -name '*.txt' -exec grep foobar {} +

The difference between \; and + is that \; runs the command for each file, while + uses as many arguments as will fit in 128k bytes.

  find ~ -type f -exec bash -c 'echo $#' {} +
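The difference in process count can be seen on a small scale with the following sketch. The temporary directory and the three file names are made up for the demonstration; each run of the child sh prints one line, so counting lines counts invocations.

```shell
# Create a scratch directory with three files (hypothetical names).
demo_dir=$(mktemp -d)
touch "$demo_dir/a.txt" "$demo_dir/b.txt" "$demo_dir/c.txt"

# With \; the command is started once per file.
per_file=$(find "$demo_dir" -type f -exec sh -c 'echo run' sh {} \; | wc -l)

# With + the file names are batched into as few invocations as possible.
batched=$(find "$demo_dir" -type f -exec sh -c 'echo run' sh {} + | wc -l)

echo "per-file runs: $per_file, batched runs: $batched"
rm -rf "$demo_dir"
```

With only three files, the per-file form starts three processes while the batched form starts one; the gap grows with the number of files found.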

Fuzz and strings

Posted Nov 20, 2014 18:44 UTC (Thu) by dlang (guest, #313) [Link] (1 responses)

> May I postulate for consideration a proposition that computers are fast enough these days and we have enough "spare cycles" not to worry about things like that for all except the most CPU-intensive or latency-sensitive programs?

This may be true for your personal laptop, but on servers the question isn't "how much spare processor does this system have?" but rather "how many systems do I need to have to serve my users?". If you do something that takes an extra 10% CPU that means that you have to buy 10% more systems.

Fuzz and strings

Posted Nov 21, 2014 0:22 UTC (Fri) by raven667 (subscriber, #5198) [Link]

There is a weird dumbbell curve where efficiency matters a lot for small battery-powered systems (decreasing as battery capacity increases), very little for single-user desktops and high-powered servers, and then very much again in large environments with consolidation and virtualization.

My co-worker has a funny story from when they managed a team in the late '90s. A developer wanted to write software in a high-level language like C++ or something to save half the development time. Their response was that even if the developer saved half the time, if the thousands of deployed systems each required a $1k RAM upgrade, that money would pay for a developer several times over to make the software more efficient, and they didn't have the time or money to do rework or deploy upgrades, so please just make it efficient the first time through, kthxbye.

Fuzz and strings

Posted Nov 20, 2014 16:20 UTC (Thu) by xz (guest, #86176) [Link] (1 responses)

readelf is not affected. Its manpage says:

This program performs a similar function to objdump but it goes into more detail and it exists
independently of the BFD library, so if there is a bug in BFD then readelf will not be affected.

And indeed it is not linked to libbfd.

Fuzz and strings

Posted Nov 20, 2014 16:37 UTC (Thu) by jake (editor, #205) [Link]

> readelf is not affected.

good point ... fixed now, thanks ...

jake

Fuzz and strings

Posted Nov 22, 2014 19:34 UTC (Sat) by jwilk (subscriber, #63328) [Link] (2 responses)

-a/--all will be default in binutils 2.25:
https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;...

--all option to choose to do less extra stuff?

Posted Dec 4, 2014 15:23 UTC (Thu) by perlwolf (guest, #46060) [Link] (1 responses)

How can an option named "--all" sensibly mean "don't use all of the extra capabilities"? I'd expect a --all option to "turn on everything" not "turn off the binary stuff".

--all option to choose to do less extra stuff?

Posted Dec 4, 2014 15:48 UTC (Thu) by mpr22 (subscriber, #60784) [Link]

> How can an option named "--all" sensibly mean "don't use all of the extra capabilities"?

When it means "print all the strings" instead of "print some of the strings".


Copyright © 2014, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds