
Null pointers, one month later

Posted Aug 18, 2009 18:00 UTC (Tue) by xilun (guest, #50638)
Parent article: Null pointers, one month later

"This happens not because there is anything special about a pointer containing zero, but because the trick of not mapping valid memory at the bottom of the virtual address space has been known and used for decades."

This is only true from the point of view of the hardware. Linux, like every Unix-like system, is programmed in C, so this is _not_ true on targets where the representation of a NULL pointer is zero, which is the case for, oh well, just about every target Linux and GCC support...

Even after the compiler is instructed that NULL is less special than it thought and every single line of Linux is reviewed for this kind of problem, NULL pointers will stay special in the eyes of third-party tools. That's why it was a very very very bad idea in the first place to allow mapping a page at address zero, and I guess if this "feature" stays there will again be security issues in the future because of that. So it's still a very bad idea, even if less very bad now that at least (some) people are conscious of one more thing they have to worry about when they write or review Linux code.



Null pointers, one month later

Posted Aug 18, 2009 18:20 UTC (Tue) by patrick_g (subscriber, #44470) [Link]

>>> I guess if this "feature" stays there will again be security issues in the future because of that

Is it just for Wine or is there other softwares using the map at address zero?

Null pointers, one month later

Posted Aug 18, 2009 18:41 UTC (Tue) by drag (subscriber, #31333) [Link]

I think that DosBox does also.

Is this something that programmers of emulation machines (yes I know Wine isn't emulation, but in this case it seems to want to do emulation-ish things?) typically want to be able to do?

Would it make sense for the kernel to simply lie? Make it so that address zero from the application's VM perspective isn't really address zero from the kernel's or machine's perspective?

(I am struggling to understand everything going on here. It seems like it wouldn't be difficult to do.. I always understood the point of having virtual memory is so that applications can arbitrarily get their memory mapped to any section of memory.)

Null pointers, one month later

Posted Aug 18, 2009 19:56 UTC (Tue) by taviso (subscriber, #34037) [Link]

The reason they want to do this is to use an Intel hardware feature called v8086 mode, which maps the segmented real address space onto the first megabyte of the linear address space.

You could fake it, but then you wouldn't be using the "hardware accelerated" emulation that makes things like dosemu very fast despite being a relatively complex feat.
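
For readers wondering what "mapping a page at address zero" looks like from userspace, here is a minimal sketch. It assumes a Linux system and uses only standard `mmap()` flags; the helper name `try_map_page_zero` is made up for illustration. On kernels that enforce a minimum mapping address (the `vm.mmap_min_addr` sysctl), an unprivileged call like this is commonly refused, but the outcome depends on the system configuration, so the sketch only reports what happened.

```c
#define _DEFAULT_SOURCE
#include <errno.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: ask for one anonymous page at exactly address 0.
 * Returns 0 if the mapping succeeded (it is unmapped again right away),
 * otherwise the errno value reported by mmap().
 */
static int try_map_page_zero(void)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    void *p = mmap((void *)0, page, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);

    if (p == MAP_FAILED)
        return errno;       /* refused, e.g. because of a minimum-address policy */

    munmap(p, page);        /* do not leave page zero mapped */
    return 0;
}
```

A return value of 0 means the process briefly had valid memory at address zero, which is exactly the situation the article describes as dangerous.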

Uses of pages near zero

Posted Aug 18, 2009 20:55 UTC (Tue) by jreiser (subscriber, #11027) [Link]

> Is it just for Wine or [are] there other softwares using the map at address zero?

"All memory is equal, but the memory near address zero is more equal than others." On x86 (protected mode, both 32-bit and 64-bit) and PowerPC (both 32-bit and 64-bit) the hardware itself supports the low 64KiB or 32KiB better than any other region. Some forms of every branch instruction can access low memory always, in addition to the usual region near the program counter. On the PowerPC this is explicit: the AA bit (Absolute Addressing: the bit with positional value (1<<1)) in the instruction. On x86 it is implicit: the 0x66 prefix byte, which performs target_address &= 0xffff; just before branching, and the 0x67 prefix byte, which makes the 0xe9 (and 0xe8) opcodes take a 16-bit displacement instead of a 32-bit displacement. On PowerPC the benefit is a larger set of target addresses, including some targets that are universally accessible regardless of the current value of the program counter. On x86, another benefit also is smaller size: 2, 3, or 4 bytes for a branch instead of 5, or only 5 bytes for some universally-accessible targets on x86_64. Also, do not overlook the advantage of using just 16 bits for storing pointers to an important collection.

Most traditional static compilers such as gcc never use these features. However, there are other compilers, program processors, and runtime re-writers which take advantage of the hardware to offer otherwise-impossible features.

Null pointers, one month later

Posted Aug 20, 2009 10:35 UTC (Thu) by etienne_lorrain@yahoo.fr (guest, #38022) [Link]

> Even after the compiler is instructed that NULL is less special than it thought ...

In fact the C language doesn't know the identifier "NULL", it just knows that its value is zero because the preprocessor defines that.
You can ask the compiler not to optimise tests against zero by a compilation switch, like it is done in the latest Linux source.
Another solution is to define NULL as an external pointer, and let the linker set its value to zero (either linker command file or ld parameter).
Then, the compiler cannot optimise tests against a value it doesn't know, namely NULL - but it will still optimise away tests against zero, some of which are obvious.

The real problem is to tell the compiler not to optimise the NULL test of this function in the general case:
inline void fct (unsigned *cpt) { if (cpt != NULL) *cpt += 1; }
but to optimise it when it is called as:
static unsigned cpt1; // cpt1 address known not to be zero
void fct1 (void) { fct (&cpt1); }
or when called as:
void fct2 (void) {
unsigned cpt2; // cpt2 address known not to be zero
fct (&cpt2);
}
That is difficult to achieve.
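
The fragments above can be put together as one compilable unit. This is only a sketch of the situation being described, not a solution to it: the test in `fct` has to stay for unknown callers, while in `fct1` and `fct2` the compiler can see after inlining that the address cannot be null. Returning the counter from `fct2` and initialising it are additions made here so the effect is observable.

```c
#include <stddef.h>

/* General case: callers may pass NULL, so the test must be kept. */
static inline void fct(unsigned *cpt)
{
    if (cpt != NULL)
        *cpt += 1;
}

static unsigned cpt1;   /* static storage: its address is never null */

void fct1(void)
{
    fct(&cpt1);         /* after inlining, the test is provably true */
}

unsigned fct2(void)
{
    unsigned cpt2 = 0;  /* automatic storage: its address is never null */
    fct(&cpt2);
    return cpt2;        /* addition for this sketch: expose the result */
}
```

Comparing the generated assembly for `fct1` with and without the compiler's null-check optimisations is one way to see which tests survive.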

Null pointers, one month later

Posted Aug 20, 2009 11:47 UTC (Thu) by xilun (guest, #50638) [Link]

"In fact the C language doesn't know the identifier "NULL", it just knows that its value is zero because the preprocessor defines that."

Wrong:
The fact that the C compiler, preprocessor excluded, does not know about the symbol NULL is irrelevant. NULL is defined as the null pointer constant, and the C language, even preprocessor excluded, does know about the null pointer constant. And even if the null pointer constant can be literally written (in a strictly conforming program) as (void*)0, that does not mean that the representation of the null pointer constant must be zero.
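
The distinction between the null pointer constant and its in-memory representation can be probed directly. The sketch below, with the made-up helper name `null_is_all_bits_zero`, compares the bytes of a null pointer against all-zero bytes. On the mainstream targets Linux and GCC support the two coincide, but the C standard does not promise that.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: returns 1 if a null pointer is stored as
 * all-bits-zero on this platform, 0 otherwise.
 */
static int null_is_all_bits_zero(void)
{
    void *p = NULL;                 /* null pointer obtained via the constant */
    unsigned char zeros[sizeof p];

    memset(zeros, 0, sizeof zeros); /* raw all-zero bytes */
    return memcmp(&p, zeros, sizeof p) == 0;
}
```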

Null pointers, one month later

Posted Aug 20, 2009 12:34 UTC (Thu) by hppnq (guest, #14462) [Link]

Dereferencing the pointer that is supposed to never point at a valid object (the NULL pointer) is always going to be a problem -- but that problem is made bigger if there are actually objects living at exactly that part of memory ("zero").

C and C++ could have non_nullable pointers, easily

Posted Aug 20, 2009 18:25 UTC (Thu) by hummassa (subscriber, #307) [Link]

Put some pragma to deal with legacy code, etc...
int *nonnull a = NULL; // syntax error
int *b = NULL; // Ok
int f(int *nonnull c) { return *c; } // ok
int g(int *d) { return *d; } // syntax error
int h(int *e) {
  if( e ) {
    // here, "e" is of type "int *nonnull" b/c of the check
    return *e;
  } else {
    return 0;
  }
} // ok
f(b); // syntax error
h(b); // ok
if( b ) f(b); // ok

C and C++ could have non_nullable pointers, easily

Posted Aug 20, 2009 20:22 UTC (Thu) by nix (subscriber, #2304) [Link]

Great. Now what does malloc() return on error?

How will you *create* one of these pointers?

C and C++ could have non_nullable pointers, easily

Posted Aug 21, 2009 4:47 UTC (Fri) by njs (guest, #40338) [Link]

Malloc returns a maybe-NULL pointer, just like now. The type system requires you to check for errors before it will let you dereference this pointer. (xmalloc can return a non-NULL pointer, though, because it has done the check.)

So... no problem?
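
For reference, an `xmalloc` of the kind mentioned here is usually a thin wrapper around `malloc()` that never returns NULL. Many projects carry their own version; the one below is a minimal sketch rather than any particular project's implementation, and the choice to call `abort()` on failure is an assumption.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Sketch of an xmalloc: either returns usable memory or terminates.
 * Callers therefore never see a NULL result.
 */
static void *xmalloc(size_t size)
{
    /* malloc(0) may legally return NULL, so ask for at least one byte */
    void *p = malloc(size ? size : 1);

    if (p == NULL) {
        fprintf(stderr, "xmalloc: out of memory (%zu bytes)\n", size);
        abort();
    }
    return p;
}
```

In the proposed type system, this function is the place where a maybe-NULL pointer is checked once and handed on as a non-null one.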

C and C++ could have non_nullable pointers, easily

Posted Aug 21, 2009 7:27 UTC (Fri) by nix (subscriber, #2304) [Link]

No improvement, more like. All it does is automates away the null checks
everyone should already be doing anyway, and replaces it with something
which is sufficiently automated that I can't see how it could provide
helpful output at runtime (unless it did a longjmp() or EH got added to C
or something).

So at best it'd give you something like a dump of program state at the
time of the unintended NULL dereference: i.e., a core dump. The only
advantage is that the set of places you could get core dumps from might be
slightly smaller (at allocation, rather than at first dereference).

C and C++ could have non_nullable pointers, easily

Posted Aug 21, 2009 8:05 UTC (Fri) by njs (guest, #40338) [Link]

Well, xmalloc has all those effects, but that's the *point* of xmalloc, so I don't see what it has to do with type-distinguishing nullable and non-nullable pointers... the point of which is to force people to think about whether a pointer can be null every time they want to dereference it, in a relatively painless way.

(This is all relatively common in languages with real type systems.)

C and C++ could have non_nullable pointers, easily

Posted Aug 21, 2009 18:50 UTC (Fri) by bronson (subscriber, #4806) [Link]

> All it does is automates away the null checks everyone should already be doing anyway

More like it mandates the null checks that everybody is supposed to do but even the most skilled programmers can't get 100% correct. It should raise the quality of all C programs.

> at best it'd give you something like a dump of program state at the
> time of the unintended NULL dereference

Yes, that's better than dereferencing and getting rooted isn't it?

C and C++ could have non_nullable pointers, easily

Posted Aug 21, 2009 19:06 UTC (Fri) by nix (subscriber, #2304) [Link]

True indeed. However, for nearly all programs (i.e., everything other than
kernels and those very rare userspace programs that dereference things at
address zero or have structures whose sizeof() is in the multimegabyte
range), dereferencing null pointers doesn't lead to a root hole, but to a
crash. DoSes are bad enough, and it's still a bug...

So, yes, it's an improvement, but I'm not sure it's a large one. (I also
fear it would turn out like 'const' too often does: the semiclued majority
would just use nullable pointers everywhere because non-nullable ones
are 'too annoying'. But security-important software and software written
by clued people which can't use real languages like ocaml ;) would of
course benefit. And perhaps that's all we can hope for.)

C and C++ could have non_nullable pointers, easily

Posted Aug 27, 2009 19:30 UTC (Thu) by hummassa (subscriber, #307) [Link]

That's why, in my example, I stated that (sorry):

YOU CANNOT DEREFERENCE A NULLABLE POINTER

if you want to use the star, check whether it is null. People will start to use non-nullable pointers everywhere in their interfaces because they don't want to be checking for null all the time. :-D Cunning, eh?

C and C++ could have non_nullable pointers, easily

Posted Aug 27, 2009 19:31 UTC (Thu) by hummassa (subscriber, #307) [Link]

Forgot to explain: dereferencing a nullable pointer should be a syntax error. Uh, and no static_cast between nullable and non-nullable pointers, either... no cheating :-D

C and C++ could have non_nullable pointers, easily

Posted Aug 20, 2009 22:19 UTC (Thu) by hppnq (guest, #14462) [Link]

Well, yeah. For ages GCC has supported the nonnull function attribute, used to specify arguments of a function that should not be NULL so you can catch these at compile time.

The problem, however, is that in non-trivial programs you need to be able to dereference pointers that could be NULL even if they should not be NULL. The compiler may not be able to catch all of these situations for you.
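
A minimal sketch of the attribute in use, assuming GCC or Clang; the function name `read_counter` is invented for the example. With warnings enabled (`-Wnonnull`, which is part of `-Wall`), passing a literal NULL for the marked argument is diagnosed at compile time. A pointer that only turns out to be NULL at run time is not caught, which is the limitation described above.

```c
/*
 * GCC/Clang extension: argument 1 must not be a null pointer.
 * The compiler may warn about obvious violations and is allowed to
 * assume the argument is non-null inside the function.
 */
__attribute__((nonnull(1)))
static int read_counter(const int *counter)
{
    return *counter;
}
```

Calling `read_counter(NULL)` directly would draw a warning; calling it with a variable that happens to hold NULL would not.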

C and C++ could have non_nullable pointers, easily

Posted Aug 21, 2009 4:45 UTC (Fri) by njs (guest, #40338) [Link]

> The problem, however, is that in non-trivial programs you need to be able to dereference pointers that could be NULL even if they should not be NULL.

Yes, that's no problem. When you set up the sort of type system he or she describes, you include some sort of syntax that lets you convert a "nullable" pointer into a non-null pointer by checking that it is, in fact, non-NULL. Once it's a non-null pointer, it becomes legal to dereference. (In the OP's sketch they overload the 'if' operator for this, but you could add some sort of extra syntax instead if you want to make it clearer.)

It does mean you can't dereference a maybe-NULL pointer *that is actually NULL*, but... that's the point :-).

C and C++ could have non_nullable pointers, easily

Posted Aug 21, 2009 7:17 UTC (Fri) by nix (subscriber, #2304) [Link]

So... instead of getting null pointer dereferences, we get
cannot_convert_null exceptions from the pointer conversion?

I still don't see any robustness benefit here.

(Of course proving that pointers cannot be null at compile time is
impossible in the general case.)

C and C++ could have non_nullable pointers, easily

Posted Aug 21, 2009 7:48 UTC (Fri) by dgm (subscriber, #49227) [Link]

This would be useful if the type system _forced_ you to check before assigning a maybe-null-pointer to a never-null-pointer. And to be really useful, only never-null-pointers should be dereferenced, and the compiler would only allow pointer arithmetic on maybe-null-pointers.

The gotcha is that null pointers are just _one_ type of invalid pointer.

C and C++ could have non_nullable pointers, easily

Posted Aug 21, 2009 19:10 UTC (Fri) by nix (subscriber, #2304) [Link]

Yes, those changes would make the idea genuinely useful. They'd also break
compatibility with almost all previous code: this from a language so
conservative that by word-of-dmr the precedence of && and || was
intentionally set wrong so as to avoid breaking code running on three
sites :)

C and C++ could have non_nullable pointers, easily

Posted Aug 22, 2009 1:10 UTC (Sat) by njs (guest, #40338) [Link]

Yeah, it's more of a thought experiment, though one could enable it only for certain (new) compilation units, or treat them as annotations for a tool like sparse.

C and C++ could have non_nullable pointers, easily

Posted Aug 21, 2009 7:54 UTC (Fri) by farnz (subscriber, #17727) [Link]

It's also impossible to verify the C type system; this doesn't stop compilers from running. The trick is to go for a conservative assessment; you're not interested in the choices "will sometimes be null/will never be null", you're interested in "might or might not be null/will never be null". The second is tractable; imagine an "ifnull( <ptrexpression> ) { null-block } else { <ptrexpression is now nonnull> nonnull-block }". By requiring you to use ifnull to convert nullable pointers to nonnull pointers whenever you might encounter them, the compiler can force you to decide how you're going to handle unexpected nulls.

Whenever the compiler isn't sure that a pointer is nonnull, it gives a compile-time error message. So, examples:

int func1( int *pointer )
{
    return *pointer; // Compile error here - cannot dereference a nullable
}

int func2( int * nonnull pointer )
{
    return *pointer; // OK
}

int func3( int * pointer )
{
    return func2( pointer ); // Compile error here - even if pointer is actually non-null.
}

int func4( int * pointer )
{
    ifnull( pointer )
        return 0;
    else
        return func3( pointer ); // OK, but func3 still won't compile, as other callers might use a null pointer.
}

int func5( int * pointer )
{
    ifnull( pointer )
        return 0;
    else
        return func2( pointer ); // OK
}

This forces you to handle nulls sanely at some point, or fail to compile and link properly. Practical code handles nullness at boundary points, and then passes nonnull pointers around the place, to code which can assume that they're not null.
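
Something close to this boundary-checking discipline can be approximated in today's C with a wrapper type, at the cost of some verbosity. The sketch below is an illustration only: `nonnull_int` and `check_nonnull` are hypothetical names, and nothing stops a determined programmer from filling in the struct by hand, so the guarantee is by convention rather than enforced by the language.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical wrapper: by convention only check_nonnull() creates one. */
typedef struct {
    int *ptr;
} nonnull_int;

/* Plays the role of ifnull(): converts only after an explicit check. */
static bool check_nonnull(int *maybe_null, nonnull_int *out)
{
    if (maybe_null == NULL)
        return false;
    out->ptr = maybe_null;
    return true;
}

/* Takes the wrapper, so passing a plain int * is a compile-time type error. */
static int func2(nonnull_int pointer)
{
    return *pointer.ptr;
}

/* Boundary function: decides what an unexpected null means, then passes on. */
static int func5(int *pointer)
{
    nonnull_int checked;

    if (!check_nonnull(pointer, &checked))
        return 0;
    return func2(checked);
}
```

Here `func5(NULL)` takes the null branch and returns 0, while any attempt to call `func2` with an unchecked `int *` fails to compile.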

C and C++ could have non_nullable pointers, easily

Posted Aug 27, 2009 19:35 UTC (Thu) by hummassa (subscriber, #307) [Link]

No, you get a syntax error every time you try to dereference a nullable pointer. People will pepper the APIs with non-nullable pointers, and will prefer to pass them around instead of nullable pointers (that will still have their place as "optional object" references). But, if you want to use the star or the arrow, you will have to have a non-nullable pointer.

C and C++ could have non_nullable pointers, easily

Posted Aug 21, 2009 8:00 UTC (Fri) by hppnq (guest, #14462) [Link]

> Once it's a non-null pointer, it becomes legal to dereference.

This assumes that pointers do not change, which is only true in the trivial cases. If you want to be completely safe, your only option is to always check, right before using it, that a pointer is not NULL.

And then, by the way, you still have to worry about what will happen if it turns out to be pointing to 0x1. ;-)

C and C++ could have non_nullable pointers, easily

Posted Aug 21, 2009 22:53 UTC (Fri) by nix (subscriber, #2304) [Link]

I've maintained code that actually went so far as to do this:
struct blah *foo (...)
{
    if (error_1)
        return NULL;

    if (error_2)
        return (struct blah *)1;

    if (error_3)
        return (struct blah *)2;

    /* repeat for ten or so errors */

    return /* a real struct blah */;
}
After I'd finished being sick into the keyboard I got a new keyboard and fixed it so it didn't do that anymore.

C and C++ could have non_nullable pointers, easily

Posted Aug 21, 2009 23:40 UTC (Fri) by corbet (editor, #1) [Link]

Ever seen the kernel ERR_PTR() macro? :)

C and C++ could have non_nullable pointers, easily

Posted Aug 22, 2009 0:34 UTC (Sat) by nix (subscriber, #2304) [Link]

Ew. I wish you hadn't drawn my attention to that :)

... but ERR_PTR() has a somewhat comprehensible reason to exist. The thing
I'm discussing had only half a dozen callers, and half of them ignored the
fact that it might return an error and just blindly dereferenced anyway
(but for all I know the same is true of ERR_PTR()s users).

C and C++ could have non_nullable pointers, easily

Posted Sep 9, 2009 6:59 UTC (Wed) by cmccabe (guest, #60281) [Link]

I actually don't see what the big deal is with ERR_PTR and friends.

In higher level languages like OCaml, Java, etc., when you encounter an unrecoverable error in a function, you throw an exception. Then the function has no return value-- control just passes directly to the relevant catch() block.

ERR_PTR is the same thing. Normally, the function would return a foo pointer, but an unrecoverable error happened. So you get an error code instead. As a bonus, if you forget to check for the error code, you get a guaranteed crash (well, if some bonehead hasn't allowed the page starting at address 0 to be mapped). I say "bonus" because the alternative is usually a nondeterministic crash.
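
For readers who have not seen the idiom, here is a userspace sketch of the idea behind ERR_PTR(): a small negative errno value is smuggled through a pointer return. The `sketch_` names and the 4095 limit are chosen for this illustration and are not the kernel's own code. Note that a negative errno cast to a pointer lands in the topmost addresses of the address space rather than near zero.

```c
#include <errno.h>
#include <stdint.h>

#define SKETCH_MAX_ERRNO 4095   /* assumed upper bound on errno values */

/* Encode a negative errno value as a pointer. */
static inline void *sketch_err_ptr(long error)
{
    return (void *)(intptr_t)error;
}

/* Recover the errno value from an error pointer. */
static inline long sketch_ptr_err(const void *ptr)
{
    return (long)(intptr_t)ptr;
}

/* True if ptr lies in the top SKETCH_MAX_ERRNO addresses, i.e. is an error. */
static inline int sketch_is_err(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-SKETCH_MAX_ERRNO;
}
```

A caller would test the result with `sketch_is_err()` before dereferencing, in the same way it would otherwise test for NULL.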

C and C++ could have non_nullable pointers, easily

Posted Sep 9, 2009 6:35 UTC (Wed) by cmccabe (guest, #60281) [Link]

C++ does have non nullable pointers. They're called references.

References must point to valid objects.

C and C++ could have non_nullable pointers, easily

Posted Sep 9, 2009 14:28 UTC (Wed) by foom (subscriber, #14868) [Link]

The only problem is that:
Obj *p = 0;
Obj &r = *p;

is perfectly valid. So they don't make a very good non-nullable pointer.

C and C++ could have non_nullable pointers, easily

Posted Oct 18, 2009 22:52 UTC (Sun) by cmccabe (guest, #60281) [Link]

> The only problem is that:
> Obj *p = 0;
> Obj &r = *p;
>
> is perfectly valid. So they don't make a very good non-nullable pointer.

There shall be no references to references, no arrays of references, and no pointers to references. The declaration of a reference shall contain an initializer (8.5.3) except when the declaration contains an explicit extern specifier (7.1.1), is a class member (9.2) declaration within a class declaration, or is the declaration of a parameter or a return type (8.3.5); see 3.1. A reference shall be initialized to refer to a valid object or function. [Note: in particular, a null reference cannot exist in a well-defined program, because the only way to create such a reference would be to bind it to the “object” obtained by dereferencing a null pointer, which causes undefined behavior. As described in 9.6, a reference cannot be bound directly to a bitfield. ]

—ISO/IEC 14882:1998(E), the ISO C++ standard, in section 8.3.2 [dcl.ref]

C.

C and C++ could have non_nullable pointers, easily

Posted Oct 19, 2009 3:47 UTC (Mon) by foom (subscriber, #14868) [Link]

> a null reference cannot exist in a well-defined program

I stand corrected.

I had always considered the "dereference" that occurs during the initialization of a reference
variable as syntax, rather than an actual memory operation, and thus the value of the pointer is
irrelevant at that point. Clearly the standard says otherwise.


Copyright © 2017, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds