
C's notion of null is not really relevant


Posted Jul 17, 2009 15:45 UTC (Fri) by xoddam (subscriber, #2322)
In reply to: Linux 2.6.30 exploit posted by trasz
Parent article: Linux 2.6.30 exploit posted

It's normal, and could make sense, to succeed in dereferencing a pointer containing zero if you have readable memory mapped at address zero. Some older platforms used 'page zero' memory as cache -- it was faster (or took less object code) to reach.

So the C standard can say nothing about it. Where the C standard does mention null pointers is in saying that, when the constant zero is assigned to (or compared with) a pointer, a specific value must be used that means 'null'. The representation of this 'null' value need not be the address zero (i.e. it may be some special nonzero bit pattern in a particular pointer of a particular type). This allows for weird platforms where zero is a commonly used address value and there is some other special invalid address register value which can be efficiently detected or used to raise an exception.
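To make the distinction between the constant zero and the stored bit pattern concrete, here is a minimal sketch. The function names are made up for illustration. The first helper inspects the bytes of a null pointer, the second relies only on the comparison with the constant zero, which is the part the standard actually pins down.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if this platform stores a null
 * pointer to void as all-bits-zero, 0 otherwise. */
int null_is_all_bits_zero(void)
{
    void *p = NULL;
    unsigned char zeros[sizeof p];

    memset(zeros, 0, sizeof zeros);
    return memcmp(&p, zeros, sizeof p) == 0;
}

/* Hypothetical helper: comparing against the constant 0 tests for
 * "null" whatever bit pattern the platform uses for it. */
int is_null(const void *p)
{
    return p == 0;
}
```

On the mainstream platforms Linux runs on, the first function would be expected to return 1, but portable code should only depend on the second kind of test.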

OTOH Linux could say something about it, e.g. by not permitting anyone to map anything at address zero. This might break one or two native-code emulators of ancient zero-page-using OSes, but surely no-one will cry over those in 2009.

None of which is relevant to whether it's legal for gcc to remove code during optimisation that relies on common-but-not-universal behaviour. This is definitely a compiler bug.

It's possible in practice for the pointer to be zero, and for execution to continue after reading the contents of address zero, so the test should never have been removed.
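The pattern under discussion can be sketched roughly as follows. This is not the kernel code itself; `struct device` and both function names are invented for illustration. The first function dereferences before testing, which is what lets an optimiser assume the pointer is non-null; the second tests first.

```c
#include <stddef.h>

/* Hypothetical device structure, for illustration only. */
struct device {
    int flags;
};

/* Problematic order: the load from d->flags happens before the test.
 * A compiler may infer from the dereference that d cannot be null
 * and drop the test as unreachable. */
int poll_flags_unsafe(struct device *d)
{
    int flags = d->flags;   /* undefined behaviour if d is null */

    if (!d)                 /* may be optimised away */
        return -1;
    return flags;
}

/* Safer order: test first, dereference second. */
int poll_flags_checked(struct device *d)
{
    if (!d)
        return -1;
    return d->flags;
}
```

GCC documents a `-fno-delete-null-pointer-checks` option that turns off this inference, for code that has to tolerate something being mapped at address zero.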



C's notion of null _is_ the _only_ relevant

Posted Jul 19, 2009 0:38 UTC (Sun) by xilun (subscriber, #50638) [Link]

"C's notion of null is not really relevant" <-- yeeeehhhaaaaaa! Let's write a C compiler, but let's redefine the language, because notions defined by the standard are not relevant. What a joke. Create your own language if you want. Leave C and GCC alone.

Re-read C99 again. And again. And again. Maybe then you will understand. The null pointer is not just a pointer containing zero. The null pointer is something you should not dereference. Ever. Any decent C programmer knows that. It has also been known for a while now that such errors are sometimes exploitable.

A pointer containing zero is not a valid pointer supported by the C language if it happens that the representation of a null pointer is value zero, because it would then break a LOT of clauses of the standard. If you want to dereference a pointer containing zero, you must do that in assembly language. Any other way is asking for problems.

So in the name of what can this be a compiler bug when the compiler is absolutely compliant with the language standard on this point? You are misunderstanding what C is and what it is not. The most it could be is a feature you could request, but you do not even have to because this feature already exists (there is a flag to disable the optimisation, so _this_ particular point _is_ defined at least for the translation phase when you use the flag).

If the kernel supports mapping address 0, and given that Linux only supports systems where the representation of NULL is 0x0 AFAIK, they should use the flag. GCC maintainers for the C language just don't have to make the default behavior defined for every undefined behavior of the standard just because you want that, even if binaries are 5x slower. C is not Java; live with it.

C's notion of null _is_ the _only_ relevant

Posted Jul 19, 2009 11:19 UTC (Sun) by nix (subscriber, #2304) [Link]

The *attacker* mmap()ed address zero, not the kernel. I suppose there
should be an option to make NULL not all-bits-zero when inside the kernel,
but then you'd have to adjust pointers in transit from userspace and
comparisons to it would be slower and so on.
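For anyone wondering what "mmap()ed address zero" looks like from userspace, here is a small probe, not an exploit. The function name is hypothetical, and it only asks the kernel for one anonymous page at address zero and reports what happened. On many current Linux systems the `vm.mmap_min_addr` setting would be expected to refuse this for unprivileged processes, so treat the result as system-dependent.

```c
#define _DEFAULT_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical probe: try to map one anonymous page at address zero.
 * Returns 0 if the mapping succeeded (it is unmapped again before
 * returning), otherwise the errno value reported by mmap(). */
int try_map_page_zero(void)
{
    long page = sysconf(_SC_PAGESIZE);

    if (page <= 0)
        return EINVAL;

    /* MAP_FIXED asks for exactly address 0 rather than a hint. */
    void *p = mmap((void *)0, (size_t)page, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
    if (p == MAP_FAILED)
        return errno;

    munmap(p, (size_t)page);
    return 0;
}
```

If the call succeeds, a kernel-side dereference of a zero-valued pointer would read attacker-controlled memory instead of faulting, which is why the removed check mattered.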

C's notion of null _is_ the _only_ relevant

Posted Jul 20, 2009 0:16 UTC (Mon) by xoddam (subscriber, #2322) [Link]

Okay. My mistake. I re-read it.

The standard does indeed say you can't rely on successfully dereferencing null.

(So if you need to read the contents of memory at zero, you are doing something nonstandard and should explicitly tell the compiler if you want it to mean something, or use assembler as you propose).

OTOH if I as a C programmer don't *need* the contents of memory at zero (ie. I'm not writing wine or DOSEmu or equivalent) but a pointer may be invalid, I must do any necessary checking *before* dereferencing a pointer.

So I must grudgingly admit that the compiler is within its rights to make my program do something -- anything -- utterly unexpected once I've made such a stupid mistake as dereferencing null.

But in general programs that do utterly unexpected stuff in any circumstances are bad practice.

So I'd sure-as-hell like a big fat WARNING if the optimiser proposes to remove an entire if statement and thereby possibly make my OS kernel behave -- in practice, not in the meta-universe of the C standard -- in unexpected, exploitable ways.

If there's no such warning from gcc, that *is* a bug. Just IMO.

C's notion of null _is_ the _only_ relevant

Posted Jul 20, 2009 6:47 UTC (Mon) by nix (subscriber, #2304) [Link]

Some undefined behaviour does require a diagnostic, but the universe of
undefined behaviour is unbounded, and determining if some things are
undefined rams you right into Rice's theorem and the halting problem.

Spotting null dereferences in the general case certainly is one of those
things (although warning when the compiler *already* spots a null
dereference, as you propose, is not hard. Have you thought about the case
of NULL dereferences being brought into a function via inlining, though?
I'd try a GCC using this warning on a large template-heavy C++ codebase
before considering it: see how many false positives you get.)

C's notion of null _is_ the _only_ relevant

Posted Jul 21, 2009 6:22 UTC (Tue) by alankila (subscriber, #47141) [Link]

You may be underestimating how easy it is to generate dead code. Functions which are called with statically allocated objects passed via pointers will never get a NULL and thus any if (foo == NULL) check is unnecessary. A good compiler doesn't generate object code that spends time testing conditions that can't be true, so it is reasonable that it eliminates this test. Neither can it produce a warning without driving everyone crazy, because I stipulate that this is a very common situation in defensively written code.
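A minimal sketch of the situation described here, with invented names: a defensive helper that checks for NULL, called with the address of a statically allocated object. Once the helper is inlined into the caller, the compiler can see that the check cannot fire for that call and is free to drop it.

```c
#include <stddef.h>

/* Statically allocated, so &counter is never null. */
static int counter;

/* Hypothetical defensive helper: rejects null pointers. */
static inline int bump(int *p)
{
    if (p == NULL)
        return -1;
    return ++*p;
}

/* After inlining bump() here, the NULL test is provably dead for this
 * call site, and warning about its removal would only be noise. */
int bump_counter(void)
{
    return bump(&counter);
}
```

Nothing undefined happens in this case; the test is simply redundant at this call site, which is what makes a blanket warning on every removed NULL check hard to live with.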

C's notion of null _is_ the _only_ relevant

Posted Jul 21, 2009 7:09 UTC (Tue) by xoddam (subscriber, #2322) [Link]

Determining that code is dead is easy (and I heartily approve it) if the actual values can be computed at compile time. For the particular case you mention (all callers pass pointers which are known not to be null), you would probably need whole-program optimisation to determine it.

However, knowing that the program has already attempted to dereference a pointer is not quite the same as statically determining that the pointer is definitely non-NULL.

I submit that removing such a test when some possible sources of the value are not visible to the compiler is an excessive optimisation and warrants a warning.

People write defensive code for a reason.

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds