
Jones: system call abuse

Dave Jones has been fuzzing Linux system calls lately, and has found a bug in the interaction between perf and mprotect(). He has plans for adding other fuzzing techniques and expects that this is just the first bug that will be found. "So I started exploring the idea of writing a tool that instead of passing random junk, actually passed semi sensible data. If the first thing a syscall does is check if a value is between 0 and 3, then passing rand() % 3 is going to get us further into the function than it would if we had just passed rand() unmasked. There are a bunch of other things that can be done too. If a syscall expects a file descriptor, pass one. If it expects an address of a structure, pass it realistic looking addresses (kernel addresses, userspace addresses, 'weird' looking addresses)."
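The approach described in the quote can be sketched roughly as below. This is only an illustration, not code from Dave's tool: the enum, the function name and the three address constants are all made up for the example, and the range case assumes "between 0 and 3" means inclusive.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical argument kinds; not taken from the real tool. */
enum arg_kind { ARG_RANGE_0_3, ARG_FD, ARG_ADDRESS };

/* Placeholder addresses, assumed purely for illustration. */
#define FAKE_KERNEL_ADDR 0xffffffff81000000ULL
#define FAKE_USER_ADDR   0x0000000000400000ULL
#define FAKE_WEIRD_ADDR  0x0000000000000001ULL

/* Return a value likely to get past the first sanity check for `kind`.
 * `open_fd` is a descriptor the caller has already opened. */
uint64_t semi_sensible_arg(enum arg_kind kind, int open_fd)
{
    switch (kind) {
    case ARG_RANGE_0_3:
        return (uint64_t)(rand() % 4);      /* 0..3 inclusive (assumed) */
    case ARG_FD:
        assert(open_fd >= 0);               /* caller must pass a real fd */
        return (uint64_t)open_fd;
    case ARG_ADDRESS:
        switch (rand() % 3) {               /* three address flavours */
        case 0:  return FAKE_KERNEL_ADDR;
        case 1:  return FAKE_USER_ADDR;
        default: return FAKE_WEIRD_ADDR;
        }
    }
    return 0;
}
```

A caller would build each syscall argument with something like semi_sensible_arg(ARG_FD, fd) rather than passing raw rand() output.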


Jones: system call abuse

Posted Nov 9, 2010 22:34 UTC (Tue) by Wummel (guest, #7591) [Link] (4 responses)

Dave also makes some implicit statements here which I think are the reasons why fuzzing is not more widely seen in application or API development:
a) usable fuzzing requires in-depth or expert knowledge of the function that is to be tested
b) it requires quite a lot of work to get useful fuzzing results (i.e. detecting "real" bugs)

So with this in mind I am happy Dave is investing his time to do such useful security testing :-)

Jones: system call abuse

Posted Nov 9, 2010 23:12 UTC (Tue) by wahern (subscriber, #37304) [Link] (3 responses)

Much of that in-depth knowledge can be determined by a program. See KLEE (http://klee.llvm.org/).

With KLEE you flag interesting variables and then it attempts to examine all the possible code paths dependent on the value (e.g. range) of that variable, generating test cases to intelligently fuzz those paths.

The more code KLEE must examine the more time it takes (hours or days or forever), and it can only handle deterministic code (i.e. no user input), so KLEE is a tool, not a solution. But a very interesting tool.

Jones: system call abuse

Posted Nov 9, 2010 23:33 UTC (Tue) by wahern (subscriber, #37304) [Link] (2 responses)

An example analogous to the `rand() % 3' case: http://blog.llvm.org/2010/04/whats-wrong-with-this-code.html

Jones: system call abuse

Posted Nov 9, 2010 23:45 UTC (Tue) by dmarti (subscriber, #11625) [Link]

Also from Dave: user space sucks...so if you know how it sucks from that tool, you know possible values to pass into this tool?

Jones: system call abuse

Posted Nov 10, 2010 0:19 UTC (Wed) by proski (guest, #104) [Link]

That's a wonderful example of a program being smarter than the user expected it to be!

Jones: system call abuse

Posted Nov 10, 2010 0:51 UTC (Wed) by roc (subscriber, #30627) [Link] (2 responses)

Check out the CMU "Ballista" project that did this ten years ago.

Jones: system call abuse

Posted Nov 10, 2010 5:07 UTC (Wed) by ejr (subscriber, #51652) [Link]

My kingdom for a "like" button.

Jones: system call abuse

Posted Nov 11, 2010 17:33 UTC (Thu) by vonbrand (guest, #4458) [Link]

Ballista fails to compile due to using ancient C++ (or GCC, whatever).

smart fuzzing

Posted Nov 10, 2010 2:00 UTC (Wed) by tialaramex (subscriber, #21167) [Link]

I've done this kind of thing twice in the dim past, and would like to find spare time to do it again because I find it quite rewarding. I have a feeling that most people don't feel the same way.

In SANE I fed just slightly unexpected values into the buffer size parameters, finding (as I expected given the mysterious crashes reported) that several backends wrongly assumed they would be asked for buffers that were at least a certain size, or were a multiple of some small integer like 4 even though the specification does not require this.

Last time I looked, my test code still lived in the SANE command line tools; hopefully new driver authors are testing their code with it.

For LADSPA I wrote a tool named 'demolition' which sees what happens when legal but extraordinary values are fed into a LADSPA audio plugin, either as parameters or as audio data. It judges which values will be considered extraordinary in part by examining the plugin's built-in metadata. A compliant plugin should at worst run very slowly (and a watchdog timer moves on to the next test in this case) but often they crash or exhibit other undesirable behaviour.

Jones: system call abuse

Posted Nov 10, 2010 6:03 UTC (Wed) by MisterIO (guest, #36192) [Link] (2 responses)

If the function checks if a value is between 0 and 3, isn't it better to do rand() % 4 then?

Jones: system call abuse

Posted Nov 10, 2010 6:18 UTC (Wed) by wahern (subscriber, #37304) [Link] (1 responses)

Ha. Good catch. If it was meant between 0 and 3 inclusive then yes. And that's a fair assumption in this case, because if it was meant exclusive then rand() % 3 still wouldn't be right; you'd need (rand() % 2) + 1.
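One way to avoid this kind of off-by-one is to put the arithmetic in a single helper that takes both bounds. A minimal sketch; the name rand_inclusive is made up for the example, and it is only meant for small ranges:

```c
#include <assert.h>
#include <stdlib.h>

/* Return a pseudo-random integer in [lo, hi], both ends inclusive.
 * Uses rand() for brevity; the modulo introduces a slight bias and
 * the range width must not exceed RAND_MAX. */
int rand_inclusive(int lo, int hi)
{
    assert(lo <= hi);
    return lo + rand() % (hi - lo + 1);
}
```

With this, the inclusive case is rand_inclusive(0, 3) and the exclusive case is rand_inclusive(1, 2), so the modulus never has to be worked out by hand.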

You can take the programmer out of the python.

Posted Nov 10, 2010 13:42 UTC (Wed) by gmatht (subscriber, #58961) [Link]

But you can't take the python out of the programmer.

$ python
>>> range(0,3)
[0, 1, 2]

Perhaps he was testing the secret rewrite of Linux in Python? ;)

Automatically shrinking the recipe generated by fuzz testing.

Posted Nov 10, 2010 7:32 UTC (Wed) by gmatht (subscriber, #58961) [Link] (1 responses)

I fuzz tested the LyX project with a tool I call Keytest. This randomly generates key-presses and feeds them to the GUI under test until it gets a crash.

Once it gets a crash it discards key-presses not required to reproduce the bug, to produce a small recipe. This recipe can be sent to the developer and manually reproduced, or used to quickly run an automated bisect to pin down a regression. I would have thought that another fuzz testing tool would also refine the recipes it outputs, but I haven't found this in any published feature lists. Has anyone come across this feature elsewhere?

Also, would this feature be useful for testing the kernel, or is it usually the single last syscall that causes the problem?
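For readers unfamiliar with the idea, recipe shrinking can be sketched as a greedy loop that drops one event at a time and keeps the removal only if the failure still reproduces. This is not Keytest's actual algorithm, just a simple single-pass illustration; shrink_recipe, reproduces_fn and the toy predicate are names invented here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Caller-supplied check: does this event sequence still trigger the crash? */
typedef bool (*reproduces_fn)(const int *events, size_t n);

/* Greedily drop events one at a time, keeping each removal only if the
 * failure still reproduces. Shrinks `events` in place and returns the
 * new length. A single pass, so the result is small but not guaranteed
 * to be minimal. */
size_t shrink_recipe(int *events, size_t n, reproduces_fn still_fails)
{
    assert(still_fails(events, n));   /* the full recipe must reproduce */
    size_t i = 0;
    while (i < n) {
        int removed = events[i];
        /* Tentatively remove events[i] by shifting the tail left. */
        memmove(&events[i], &events[i + 1], (n - i - 1) * sizeof events[0]);
        if (still_fails(events, n - 1)) {
            n--;                      /* removal kept; retry same index */
        } else {
            /* Still needed: shift the tail back and restore the event. */
            memmove(&events[i + 1], &events[i], (n - i - 1) * sizeof events[0]);
            events[i] = removed;
            i++;
        }
    }
    return n;
}

/* Toy predicate for demonstration: "crashes" when a 7 is later followed by a 9. */
bool toy_crash(const int *events, size_t n)
{
    bool seen7 = false;
    for (size_t i = 0; i < n; i++) {
        if (events[i] == 7)
            seen7 = true;
        else if (events[i] == 9 && seen7)
            return true;
    }
    return false;
}
```

In a real harness the predicate would replay the events against the program under test, which makes each call expensive; that cost is the main reason fancier strategies remove chunks rather than single events.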

Automatically shrinking the recipe generated by fuzz testing.

Posted Nov 10, 2010 16:37 UTC (Wed) by JoeBuck (subscriber, #2330) [Link]

These kinds of approaches (constrained random simulation, constraint solving, extraction of test cases that produce failures) are standard in hardware verification.

Jones: system call abuse

Posted Nov 10, 2010 9:22 UTC (Wed) by ballombe (subscriber, #9523) [Link] (1 responses)

I have been using fuzzing to find bugs in a computer algebra system since 2001 (and still do) with much success. The trick is to generate input that is 90% "garden variety" and 10% "alien" so that it passes most sanity tests. This is very efficient for finding crashes, much less so for other kinds of wrong behaviour.
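The 90/10 mix could look something like the following for a single integer input. This is only a sketch under assumptions: the "alien" table, the 1..100 "garden variety" range and the function name are invented for illustration, and a computer algebra fuzzer would of course generate expressions rather than bare integers.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>

/* Boundary-style values assumed to count as "alien" for this example. */
static const long alien_values[] = { 0, -1, LONG_MAX, LONG_MIN };

/* Return an ordinary value most of the time and an "alien" one in
 * roughly `alien_percent` percent of calls. */
long mixed_value(int alien_percent)
{
    assert(alien_percent >= 0 && alien_percent <= 100);
    if (rand() % 100 < alien_percent) {
        size_t n = sizeof alien_values / sizeof alien_values[0];
        return alien_values[(size_t)rand() % n];
    }
    return 1 + rand() % 100;          /* "garden variety": 1..100 */
}
```

Calling mixed_value(10) for each generated input gives roughly the 90/10 split described above.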

Jones: system call abuse

Posted Nov 10, 2010 10:42 UTC (Wed) by njd27 (subscriber, #5770) [Link]

One of the "easy tasks" in the LibreOffice list is developing a fuzz tester:

http://wiki.documentfoundation.org/Easy_Hacks#Fuzz_XML_fi...

Jones: system call abuse

Posted Nov 10, 2010 11:28 UTC (Wed) by ms (subscriber, #41272) [Link] (2 responses)

I'm too lazy to read through Dave's posts, but is he aware of the body of work that's been done on this with things like QuickCheck, SmallCheck, LazySmallCheck &c?

Jones: system call abuse

Posted Nov 10, 2010 15:28 UTC (Wed) by adobriyan (subscriber, #30858) [Link] (1 responses)

How many kernel bugs have these foochecks found?

Jones: system call abuse

Posted Nov 10, 2010 15:35 UTC (Wed) by ms (subscriber, #41272) [Link]

Erm, well if you rewrote the kernel in Haskell, no doubt a vast number, but the kernel would be vastly less buggy anyway due to the existence of a decent type system that eliminates one large class of bugs.

(I'm _not_ suggesting that writing a kernel in Haskell is a good idea. I'm merely pointing out that randomised testing based on type analysis of the inputs to functions is a well studied area.)

The code looks pretty rough

Posted Nov 10, 2010 15:27 UTC (Wed) by Ross (guest, #4065) [Link] (2 responses)

I took a quick look at the way it was generating plausible file descriptors, PIDs, etc. and noticed lots of mishandling of rand output:

sanitise.c -> get_interesting_value:
i = rand() & 20;
...
switch (i) {
case 0: return 0x00000001;
...
case 20: return 0xffffffffffffffff;

So I think that should have been rand() % 21...

sanitise.c -> get_address:
i = rand() % 2
...
switch (i) {
case 0: return KERNEL_ADDR;
...
case 2: return get_interesting_value();

So that probably should be rand() % 3

That's not to mention how horrible rand() is at actually being random, especially
in the lower two bits of output. Together, those bugs really reduce the number
of addresses that are tried. I notice that random() is used in other places but
srandom() isn't ever called (though srand() is called twice).
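To spell out why the mask is wrong: 20 is binary 10100, so rand() & 20 can only ever produce 0, 4, 16 or 20, and rand() % 2 can never produce 2. One way to sidestep both problems is to keep the values in a table and derive the index from the table size. A rough sketch, not a patch against sanitise.c: only the first and last table entries come from the code quoted above, the middle one is made up, and the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Illustrative table; the real list is longer and is not reproduced here. */
static const uint64_t interesting[] = {
    0x00000001ULL,
    0x000000007fffffffULL,            /* assumed entry, for illustration */
    0xffffffffffffffffULL,
};

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Pick a table entry. The index is derived from the table size, so every
 * entry is reachable and no hand-written mask or modulus can drift out of
 * sync. Dividing instead of taking a remainder uses the high-order bits
 * of rand(), which are traditionally the better ones. */
uint64_t pick_interesting_value(void)
{
    int n = (int)ARRAY_SIZE(interesting);
    assert(n > 0 && n <= RAND_MAX);
    return interesting[rand() / (RAND_MAX / n + 1)];
}
```

Adding a new value is then a one-line change to the table, with no switch statement or constant to update.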

I also wonder if address space randomization makes this less useful -- how
often does it fail to reproduce the same crash or misbehavior because memory
has shifted around?

I suppose I should send a patch instead of complaining.

-Ross

The code looks pretty rough

Posted Nov 10, 2010 16:32 UTC (Wed) by MisterIO (guest, #36192) [Link]

Ah, so he's consistent with the error I reported above! I thought it was just a typo or a distraction.

The code looks pretty rough

Posted Nov 10, 2010 19:49 UTC (Wed) by chad.netzer (subscriber, #4257) [Link]

Ugh, mistyping AND(&) for MOD(%) is brutal.

Jones: system call abuse

Posted Nov 11, 2010 21:05 UTC (Thu) by PaXTeam (guest, #24616) [Link]

what's the CVE number for this bug?

Jones: system call abuse

Posted Nov 17, 2010 0:38 UTC (Wed) by Baylink (guest, #755) [Link]

This reminds me of my favorite IOCCC winner category: "Worst abuse of the rules".


Copyright © 2010, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds