LWN.net Logo

Nonsense. This is trivial stuff.

Nonsense. This is trivial stuff.

Posted Jul 17, 2009 19:47 UTC (Fri) by mikov (subscriber, #33179)
Parent article: Linux 2.6.30 exploit posted

Is this an April Fool's joke? Is everybody insane? This is a trivial bug that is immediately visible and would be discovered by any source code audit.

The linked post from Bojan Zdrnja is utter nonsense - he appears to be profoundly confused.

I don't see how this has generated such an amount of discussion. It is as simple as that:

1. Use pointer that could be NULL
2. Check if pointer is NULL

What is here to discuss? What compiler bugs?

The exploitation of the bug may be creative, but the bug itself is OBVIOUS!!! Really, this is the first time in ages when I am disappointed by LWN. Please, somebody should edit the story.


(Log in to post comments)

Nonsense. This is trivial stuff.

Posted Jul 17, 2009 20:03 UTC (Fri) by spender (subscriber, #23067) [Link]

If you don't understand the difference between a
harmless oops and arbitrary code execution in
kernel context, or why these bugs are important,
then apparently the joke's on you.

-Brad

Nonsense. This is trivial stuff.

Posted Jul 17, 2009 20:16 UTC (Fri) by mikov (subscriber, #33179) [Link]

If you don't understand why using a possibly NULL pointer _before_ checking it is an obvious bug, the joke is on you. And it isn't even a joke - it is sad.

The bug itself is extremely trivial and glaringly obvious. There is no need to involve the C standard to understand it. Suggesting a "compiler bugs" for this, I am sorry I have to say it, is quite silly.

As I already said, the exploitation of the bug however is creative and impressive - my hat is off for finding the bug and exploiting it to whoever did it.

Nonsense. This is trivial stuff.

Posted Jul 17, 2009 20:34 UTC (Fri) by corbet (editor, #1) [Link]

There is an interesting aspect to this that nobody has mentioned: I have seen reviewers advising the removal of NULL pointer checks with the reasoning that, should a NULL pointer ever come about, the resulting kernel oops is just as useful as a "this pointer was null!" message. In the mean time, the overhead of the check can be avoided.

The idea, clearly, is that the memory management hardware can be counted on to do the NULL check so there's no need to do it in software too. But if that address can be mapped, that reasoning clearly does not hold. I don't doubt there are other situations like this one in the kernel code; for most systems, disallowing mappings at zero seems like a reasonable step to take.

Removing NULL checks

Posted Jul 17, 2009 21:02 UTC (Fri) by corbet (editor, #1) [Link]

I had to dig a bit, but here's an example of developers being advised that some NULL checks are not needed. In retrospect, that doesn't seem like the best advice...

Removing NULL checks

Posted Jul 17, 2009 21:06 UTC (Fri) by stevenb (guest, #11536) [Link]

Anyway, the "it appears to be a GCC bug in the end." bit in the LWN story about this is unjustified IMHO.

Removing NULL checks

Posted Jul 17, 2009 22:39 UTC (Fri) by nix (subscriber, #2304) [Link]

It would be an excessive optimization, if it wasn't that GCC has a way to
turn it off... which the kernel is not using.

Removing NULL checks

Posted Jul 17, 2009 22:56 UTC (Fri) by mikov (subscriber, #33179) [Link]

I don't think that disabling this option for a kernel compile is reasonable. It is not reasonable to have to check every new optimization for every new version of a compiler, assuming the optimizations themselves are valid.

What is reasonable on the other hand is to know C and to write valid C code, which doesn't break when a butterfly flaps its wings. I still can't understand why everybody is making such a big deal over such a trivial bug. It is similar to relying on the value of "(a++) + (a++)" - basic C stuff that hopefully everybody knows by now.

If this was an user app nobody would think twice about it. So, what, the kernel is an excuse for sloppy coding?? I would hope that the opposite was true...

Removing NULL checks

Posted Jul 17, 2009 23:33 UTC (Fri) by stevenb (guest, #11536) [Link]

This optimization is not even new. It was added to GCC ten years ago, see
http://gcc.gnu.org/news/null.html

Removing NULL checks

Posted Jul 17, 2009 21:44 UTC (Fri) by mikov (subscriber, #33179) [Link]

I think there are three completely orthogonal issues here, which seems to be confusing people:

1. The kernel is (mostly) written in C and uses a general purpose C compiler (even more than one, if you count Intel). So the kernel should follow the C rules (and the C standard), even if on the surface it seems that these rules don't always make sense in the kernel context. For example, under DOS dereferencing a NULL pointer would not generate any kind of error. That doesn't mean that it is a good practice to remove NULL pointer checks for all reads. It is still undefined behavior according to the C standard and any C compiler for DOS is free to apply the same (correct!) optimization.

I think what people don't realize is that such optimizations are not only valid, but necessary. While there are the few exceptional cases like this one where they seem to get in the way, they are useful in the majority of cases.

2. Whether the _tun_ pointer should ever be expected to be NULL in this point. I have no idea, but if it is supposed to have been checked earlier, it is reasonable not to check again and this is a simple logical bug.

3. Whether the kernel should trap NULL pointer accesses in hardware for kernel code. Again I don't know, but personally I prefer that it did, unless there is good reason not to. Is there one?

Still relying on this can be risky, for example:
int * a = NULL;
x = a[10000];

The first page may be unmapped, but we are not accessing it.

Removing NULL checks

Posted Jul 17, 2009 23:24 UTC (Fri) by nowster (subscriber, #67) [Link]

This would never have happened if the kernel were written in some modern language more recent than C. I propose rewriting the kernel in INTERCAL.

Removing NULL checks

Posted Jul 19, 2009 11:13 UTC (Sun) by eupator (guest, #44581) [Link]

I suspect you may underestimate the modernism of C or the classicism of INTERCAL.

Removing NULL checks

Posted Jul 18, 2009 19:09 UTC (Sat) by darwish07 (subscriber, #49520) [Link]

You have a nice memory: I remember being advised to strip a BUG_ON line since the kernel OOPS mechanism will handle it - http://lkml.org/lkml/2008/2/13/71

I'm not sure if (or how) can this be exploited though ...

Removing NULL checks

Posted Jul 20, 2009 10:03 UTC (Mon) by makomk (guest, #51493) [Link]

Hopefully, that shouldn't be exploitable. I'm pretty sure the kernel
policy is that userspace shouldn't be able to trigger BUG_ONs
deliberately, so removing one in favour of just dereferencing the null
pointer ought to be safe assuming there isn't a bug elsewhere. (Of course,
if there is, it could turn it into a security vulnerability.)

Removing NULL checks

Posted Jul 20, 2009 12:46 UTC (Mon) by spender (subscriber, #23067) [Link]

Andrew Morton is wrong, a BUG_ON is in no way equivalent to just dereferencing a null. What if I have my page mapped at NULL?
If that author adds one more call to smack_netlabel and forgets to check for sock->sk == NULL before calling it, instead of just causing an OOPS he's now potentially created a vulnerability which can result in arbitrary code execution.

-Brad

Nonsense. This is trivial stuff.

Posted Jul 18, 2009 1:12 UTC (Sat) by spender (subscriber, #23067) [Link]

Bug that causes a trivial oops (only terminates the 'exploit' process) is not security relevant.
Bug that results in arbitrary code execution due to a gcc optimization is both clever and important.
Or are you stuck in the mindset that all bugs are equal?

People aren't involving the C standard or compiler optimizations to talk about a bug, they're talking about how something which by the appearance of the source is unexploitable can be turned exploitable.

-Brad

Nonsense. This is trivial stuff.

Posted Jul 18, 2009 2:02 UTC (Sat) by mikov (subscriber, #33179) [Link]

Bug that results in arbitrary code execution due to a gcc optimization is both clever and important.

I would not qualify a bug as clever. Perhaps you mean the exploit, and yes, it is indeed very clever.

The code optimization angle is not really that important. I am sure that there are many other exploits that depend on how GCC generates its code, so this is no different. For example exploits that depend on the layout of variables in the stack. It is exactly the same thing.

In this case however people are getting confused because the explanation for the exploit depends on parts of the C language which are not so well known.

People aren't involving the C standard or compiler optimizations to talk about a bug, they're talking about how something which by the appearance of the source is unexploitable can be turned exploitable.
Nonsense. Any bug is potentially exploitable - you simply never know. No experienced developer would decide that a bug is 100% un-exploitable, especially one with an invalid pointer dereference. Of course it is not obvious how to exploit it, but it never is.

All crashing bugs are very very important.

Nonsense. This is trivial stuff.

Posted Jul 19, 2009 0:55 UTC (Sun) by xilun (subscriber, #50638) [Link]

A good reviewer will not rule out possibility of exploitation, because: 1) it is undefined behavior, 2) NULL pointer dereferencing had been exploited before. A good reviewer will certainly _not_ think just by reading the source that this is unexploitable.

How might we eliminate undefined behaviour?

Posted Jul 20, 2009 0:41 UTC (Mon) by xoddam (subscriber, #2322) [Link]

Question for language lawyers:

If the behaviour of a program is undefined, we can't rule out the risk that it is exploitable.

So how might we move towards eliminating undefined behaviour from a given piece of software?

It would be nice if the compiler was able to produce meaningful warnings everywhere that the program might ever do anything undefined.

Of course this is impossible in the general case because the task of static analysis is NP-complete (mostly because the program's inputs are not known to the compiler).

However there is surely a way to produce *many* such meaningful warnings. In particular, it must be possible wherever the compiler is presently able to exploit the undefined-ness of an operation to make an optimisation.

Of course if we were to attempt this immediately we would start with a huge, huge number of false positives due to race conditions and the like which are for the most part not bothersome in practice.

If it were then possible to gradually reduce these warnings by meaningful annotations and the judicious introduction of memory barriers and relevant runtime checking, any progress would then be most reassuring.

How might we eliminate undefined behaviour?

Posted Jul 20, 2009 1:41 UTC (Mon) by mikov (subscriber, #33179) [Link]

A poster in The Register (Steve 105) gave a good example of why a warning in this case, while it does seem desirable at first sight, is probably not a good idea in general:

Surely the reason for the optimisation is (among other things) code like this:
inline char foo(char *p) { if (p == 0) return 0; else return *p; }

char bar(char *p) { *p = 2; return foo(p); }

int main() { char c = 0; return bar(&c); }
If foo gets inlined into bar, the compiler can spot that the null pointer check in the inlined code is unnecessary and remove it. This is a most excellent optimisation (granted, in this example foo and bar do so little work that other optimisations may render it unnecessary).

I think GCC would do good if it reported this a warning iff using the pointer and checking the pointer are in the same original function - that is the optimization didn't appear as a result of other optimizations like global inlining, etc.

That would have caught our case. I am not sure how difficult it would to implement this distinction though - I suspect quite.

How might we eliminate undefined behaviour?

Posted Jul 20, 2009 2:34 UTC (Mon) by xoddam (subscriber, #2322) [Link]

Point taken; there is some subltety involved :-)

But I do think that we can reasonably ask the compiler to distinguish between cases where an optimisation is *demonstrably* safe (ie. the result can be determined at compile time, as in your inlining case) and cases where the optimisation can result in totally different behaviour between naive and optimised object code.

The particular optimisation that resulted in this particular vulnerability does not follow from eager evaluation of known quantities but from an assumption on the part of the compiler that the program is so correct as to never step outside the language specification. Since the spec leaves so much undefined, and programmers are human, and input is untrusted, this assumption is less than completely valid -- and therefore such optimisations merit at the very least a BIG FAT WARNING.

How might we eliminate undefined behaviour?

Posted Jul 20, 2009 3:01 UTC (Mon) by MisterIO (guest, #36192) [Link]

I think you're mixing two different things. 1 is the warning or something similar that could be emitted in a clear bug like the one in the original code(and some static checkers should really find that kind of bug), 2 is the gcc's optimization, that has nothing to do with the bug. Sure, the optimization allows the exploitation of the bug, but the warning about the bug doesn't have anything to do with it and, even without the optimization, the bug is still there so the warning makes sense. I'm not saying that gcc should warn about it at all costs, but it's clearly something that a static checker should be able to do. For example it seems like cppcheck has accepted my suggestion to include it among its checks.

How might we eliminate undefined behaviour?

Posted Jul 20, 2009 9:33 UTC (Mon) by stevenb (guest, #11536) [Link]

Inlining is just one of the problems, but in general "good warnings" and "optimization" don't go well together.

It shouldn't be too hard to warn about this in GCC before inlining in recent GCCs. GCC internals could look like this, for the current trunk (r149803):
* put a new pass after pass_early_warn_uninitialized (which runs before inlining), say pass_null_pointer_check
* look for null-pointer checks in the function
* for every found null-pointer check, see if there is a dereference of the pointer that dominates the check.

Implementing the details is left as an exercise for the reader. This would only warn when optimizing.

The problem is that you could actually need optimizations to get a reliable warning. If you try to warn before doing any optimizations (including inlining) then you wouldn't warn about a snippet like this one:

void bar(char *p) {
char *q = p;
*q = 2;
if (p)
foo(p);
}

because you wouldn't find a dereference of p before inlining, but the compiler will copy propagate p into q to give:

void bar(char *p) {
char *q = p;
*p = 2;
if (p)
foo(p);
}

which would have given you the warning...

It shouldn't be very hard to construct cases where you get missed warnings or false positives depending on what optimizations you do before figuring out what to warn about.

If you warn after optimizing (including inlining perhaps) you may get lots of false positives. But if you warn before optimizations, you may not warn for cases that are obvious by inspection.

Nonsense. This is trivial stuff.

Posted Jul 20, 2009 9:38 UTC (Mon) by tialaramex (subscriber, #21167) [Link]

“Bug that causes a trivial oops (only terminates the 'exploit' process) is not security relevant.”

-sigh-

Suppose there is a process which is supposed to watch a log and take certain actions (maybe alter a firewall rule) depending on what it sees in the logs. Now suppose there's a bug in that program, all the bug enables me (the attacker) to do is force it to call the read syscall with the log fd, a buf value of my choosing (maybe from bytes in the log) and count of zero.

On the face of it, this seems useless for me as an attacker.

Suppose I find a way to get read() to Oops if the buf value is N-1 for a particular magic value N, regardless of the value of count.

On the face of it, this too seems useless. A real security "expert" from the Internet has just assured us that it's not security relevant. But..

Combine them, and I've disabled the firewall tweaking log file reader. This was defending against brute force attacks on a network service, which I promptly break into.

Nonsense. This is trivial stuff.

Posted Jul 20, 2009 12:23 UTC (Mon) by spender (subscriber, #23067) [Link]

Your post is the perfect reason why I'm the security expert and you're not.

Your entire "suppose I find a way" argument is based off a non-existent bug. I was talking about the vulnerability I exploited. The only way you're going to turn that into a security problem for this firewall app you've invented is if you can get arbitrary code execution in the firewall app to force it to open a device it never wanted to open in the first place, and then perform an additional poll on the device that it never wanted to do in the first place. If you have arbitrary code execution within the firewall app, you might as well just call kill (or just fail with your arbitrary code execution and crash the process)!

Try again.

-Brad

Nonsense. This is trivial stuff.

Posted Jul 18, 2009 0:54 UTC (Sat) by bojan (subscriber, #14302) [Link]

Yeah, that was what I thought exactly when I saw it:

struct sock *sk = tun->sk;  // initialize sk with tun->sk
…
if (!tun)
    return POLLERR;  // if tun is NULL return error

It is _way_ to late to check once tun has already been used. The rest is just other problems that may exist already in the kernel.

PS. I am not the author of that article, despite having the same first name.

Nonsense. This is trivial stuff.

Posted Jul 18, 2009 1:18 UTC (Sat) by spender (subscriber, #23067) [Link]

Way too late to prevent a trivial oops which only terminates the process causing the oops, with no side-effects on system stability? Yes.
Way too late to prevent arbitrary code execution if -fno-delete-null-pointer-checks was used? No.

Without gcc's optimization, the code would have been unexploitable for arbitrary code execution. That's why it's being discussed.

-Brad

Nonsense. This is trivial stuff.

Posted Jul 18, 2009 9:21 UTC (Sat) by Ross (subscriber, #4065) [Link]

You are saying that the optimization is what "changes" this.

But it doesn't even have to be an optimization -- the compiler can assume a pointer is valid anywhere in the execution as long as it has the same value and that value is, was, or is going to be dereferenced. And there is no "standard" behavior without said optimization unless you assume a lot about the way the compiler works (basically it would be assuming the compiler doesn't track which pointers are dereferenced). The program is telling something to the compiler which the compiler is likely to use, just like a compiler will almost always assume local variables keep the same value between assignments (unless their address is passed or they are marked volatile). You wouldn't complain about a compiler "skipping" reads of a stack slot. Dereferencing a pointer is shouting to the compiler: "hey! this pointer is good (obviously not NULL) because I'm using it!".

Nonsense. This is trivial stuff.

Posted Jul 18, 2009 10:05 UTC (Sat) by bojan (subscriber, #14302) [Link]

Yeah, I get that. I'm referring to the code itself, which on its face is trivially wrong. In other words, the check whether tun==NULL should be _before_ sk=tun->sk, not after it.

Nonsense. This is trivial stuff.

Posted Jul 20, 2009 8:07 UTC (Mon) by bojan (subscriber, #14302) [Link]

After reading some of the C spec (http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1336.pdf), section 6.3.2.3 has this:

> An integer constant expression with the value 0, or such an expression cast to type void *, is called a null pointer constant.57) If a null pointer constant is converted to a pointer type, the resulting pointer, called a null pointer, is guaranteed to compare unequal to a pointer to any object or function.

So, my guess is that GCC folks simply put two and two together and got four. In other words, if tun was zero, it would not be pointing to any valid object. Therefore, the fact that the programmer is using it must mean that the programmer _knows_ that it will be pointing to a valid object (i.e. is non-zero). Therefore, the check for it being zero later on is superfluous and can be removed. Ergo, no bug in GCC.

Nonsense. This is trivial stuff.

Posted Jul 18, 2009 15:59 UTC (Sat) by pflugstad (subscriber, #224) [Link]

I basically agree - the bug is obvious for anyone doing C coding. I actually would have expected the various static code analysis tools (Stanford Checker, Sparse, etc) to catch this kind of stuff.

But the other side of the discussion is that this bug is unexploitable if GCC had not optimized away the null check.

Nonsense. This is trivial stuff.

Posted Jul 18, 2009 16:14 UTC (Sat) by mikov (subscriber, #33179) [Link]

But the other side of the discussion is that this bug is unexploitable if GCC had not optimized away the null check.

Yes. I wonder what kind of process lead to finding this bug... Looking at the assembler output? Discovering it is very impressive. Brad really deserves a lot of praise.

But still, this is not different than any exploit which relies on knowledge of how a specific compiler works - e.g. how it places variables in the stack, and so on. It is exactly the same thing if you think about it. But nobody would claim that GCC has a bug because it placed the return address in the stack where it could be overwritten.

Brad keeps claiming that the important thing is that on the surface this bug doesn't present as security related. While that is true, nobody can predict how several bugs will interact. In my book any crashing bug is potentially security related until proven otherwise.

Nonsense. This is trivial stuff.

Posted Jul 19, 2009 3:27 UTC (Sun) by gmaxwell (subscriber, #30048) [Link]

Eh, if I were betting I'd suggest that this was found by running a simple static analysis tool that warns you of the check after the dereference. Many will.

No fancy examination of the assembly was required to find a bug which justified analysis for exploitability. The exact compiler behaviour was only needed for the creation of the exploit, which is pretty usual, not the finding of the bug.

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds