LWN.net Logo

RE: nullable pointers

RE: nullable pointers

Posted Mar 29, 2011 23:32 UTC (Tue) by neilbrown (subscriber, #359)
In reply to: RE: nullable pointers by nybble41
Parent article: GCC 4.6.0 released

I'm having trouble understanding the argument against nullable pointers. Can anyone point me at a good treatise on the subject?

I can see some value in allowing a pointer to be declared non-nullable - a bit like 'const' in C, it might catch a few errors and it explicitly documents an intention so there could be is value (though I tend to find 'const' more annoying than helpful).

But when you really want a nullable pointer, as you do in a linked list or any tree data structure, arbitrarily not allowing one seems pointless.

Or am I misunderstanding something?


(Log in to post comments)

RE: nullable pointers

Posted Mar 31, 2011 3:46 UTC (Thu) by deweerdt (subscriber, #18159) [Link]

I think that much of those seemingly unsubstantiated claims against nullable pointers come from this: http://www.infoq.com/presentations/Null-References-The-Bi...

RE: nullable pointers

Posted Mar 31, 2011 10:48 UTC (Thu) by neilbrown (subscriber, #359) [Link]

Thanks for that link!!

If I understand the argument correctly, it goes like this:

1/ A null pointer is a known address that doesn't actually point to an object of the expected type. It points to something else, but should never be dereferenced.
2/ dereferencing such a pointer is not only wrong, but it would lead to results that are not predictable from examining the original source code.
3/ therefore, every attempt to dereference a pointer must be preceded by a test to see if the pointer is "NULL", and to raise an exception if it is
4/ (I'm guessing here, this wasn't explicit in the talk but is the only way that the rest makes sense) Performing that test was too expensive on hardware at the time, so they didn't perform the test, so programs behaved in hard-to-understand ways which had a substantial cost either in debugging time or dealing with incorrect or harmful results.

If I am right, then:
A/ while it may have been a mistake then, it isn't relevant any more as any hardware with an MMU (which I think is everything these days) can easily check every dereference for 'NULL' and raise an exception, and
B/ The "billion dollars" is almost certainly an extreme over-estimate. The real problem is programmers writing bad code. Some other technique than NULL (e.g. explicit sentinal objects) might have made some of the problems easier to detect earlier and so might have saved something, but I really don't think it is justifiable to place that much blame on the 'NULL' pointer. Some maybe, but not much.

So I see absolutely no problem in a modern language allowing a NULL pointer, though certainly supporting non-NULLable pointers is very appropriate. And every runtime should check for dereferences of NULL (preferrably in hardware) and raise and exception when it happens. But I'm sure they do.

RE: nullable pointers

Posted Mar 31, 2011 11:16 UTC (Thu) by paulj (subscriber, #341) [Link]

Agree.

NULL pointers are a red-herring. If we're talking about C level languages, then a NULL pointer dereference is a *good* kind of mistake - it's a clear error and causes an immediate exception at runtime, as you say. Much, much more insidious in C-level-akin languages is stale pointers and code continuing to twiddle memory that no longer references the state it thinks it should. Give me an immediate segfault over bugs which only cause crashes later, in unrelated code, any day! Further, even in managed-runtimes and with non-nullable types, you *still* can have stale references that cause bugs (which can still be hard to debug, even if it manages to better protect state from being corrupted).

The problem is just so much more fundamental: The correct management of the lifetime of state, and the co-ordination of visibility of that state.

(Functional programmers would of course argue the problem is best solved by never sharing state, only ever copying and transforming it. Course, while that removes scope for many kinds of crash-causing bugs, it still can't remove higher-level, logical mistakes..).

RE: nullable pointers

Posted Mar 31, 2011 14:55 UTC (Thu) by deweerdt (subscriber, #18159) [Link]

Yes, I think some people extrapolated the title and created this internet rumor that simply switching to non-nullable pointers would automagically resolve all sorts of problems.

RE: nullable pointers

Posted Mar 31, 2011 19:56 UTC (Thu) by HelloWorld (guest, #56129) [Link]

I don't agree with your interpretation of Hoare's talk. In the beginning, he says that Algol, unlike other languages at the time, checked array subscripts at runtime, which he thought of as a good idea, even though it was slow. So clearly, efficiency wasn't his primary concern.

He explains later on why he added null, and what it boils down to is that he didn't know how to do better.

RE: nullable pointers

Posted Apr 1, 2011 3:58 UTC (Fri) by neilbrown (subscriber, #359) [Link]

It is very possible that my interpretation of Hoare's talk is flawed. It was a very unsatisfying talk, and I got the impression that he wasn't really expecting to give it and so wasn't really prepared. He said something about expecting to be part of a panel of people all sharing their "greatest mistakes" and went to some effort to get people in the audience to contribute - to argue for or against - to make it a more substantial discussion. So it would almost certainly be unfair to assess Tony Hoare's really thoughts based on that talk.

For the moment however that is all we have, and to my mind it came across a lot like the proverbial dot-com-boom business plan which you might remember as:
1 - give stuff away for free
2 - ....
3 - profit.

In this case it is

1 - Invent the NULL pointer
2 - ...
3 - Cause a billion dollars of damage.

In both cases there is no clear link from the first step and the third step.

I can happily agree that Tony Hoare invented NULL, and I can happy agree that programmer errors that manifested as dereferencing a NULL pointer have had a large cost, but I cannot see how the one leads to the other.
It would seem equivalent to blame Marie Currie for the current serious radiation problems in Japan. She coined the word "radioactivity" and radioactivity is causing serious problems and endangering lives. But it is hardly her fault.

RE: nullable pointers

Posted Mar 31, 2011 6:19 UTC (Thu) by wahern (subscriber, #37304) [Link]

In those circumstances you don't need a generic null reference, you just need a sentinel, which can be typed. A null reference is a conflation of several distinct notions. e.g., in poor layman's terms, uninitialized, missing, nothing-to-see-here, not-in-set, etc. A computational statistics professor once tried to explain to me the formal distinctions (including a new one he was proposing), but at the time I didn't grasp the importance and subtlety; all I remember now--a decade later--is that I wish I paid better attention.

I'm not exactly sure what is gained from removing generic null references entirely, though.

RE: nullable pointers

Posted Mar 31, 2011 11:00 UTC (Thu) by neilbrown (subscriber, #359) [Link]

This reminds me of an observation I read many years ago that a two-valued type could mean any one of true/false, on/off, yes/no, success/fail and many other more domain-specific alternatives.

Grouping them all under "Boolean" is thus confusing.

However this is no worse than the fact that '2' could mean inches or centimetres ... or kilograms or people or almost anything.

It is for this reason that Ada (and possibly other languages) allows "new integer" which defines a new type that behaves like integer but is not type-compatible with any other integer type. It is a concept that does have its place, but the extra book-keeping would probably drive most programmers crazy - even static typing seems to be out of favour with a lot of people these days (python? Perl?) and this extra type differentiation is like static typing on steroids.

You are correct of course that a generic NULL reference is not needed, but I fail to see that a typed NULL actually buys you anything at all...

I certainly agree that it is not clear what could be gained by removing generic null references.

RE: nullable pointers

Posted Mar 31, 2011 13:21 UTC (Thu) by HelloWorld (guest, #56129) [Link]

But when you really want a nullable pointer, as you do in a linked list or any tree data structure
You don't need nullable pointers for linked lists
abstract class List<T> { }
final class ListNil<T> { }
final class ListCons<T> { T elem; List<T> tail; }
or for trees
abstract class BinaryTree<T> { }
final class TreeNil<T> { }
final class TreeCons<T> { T elem; BinaryTree<T> left; BinaryTree<T> right; }
arbitrarily not allowing one seems pointless.
What is arbitrary is a null reference, not the lack of it. What could be more arbitrary than a magical value created out of thin air, which is a value of (almost) every type, yet supports none of the operations that all other values of these types support.

RE: nullable pointers

Posted Mar 31, 2011 15:45 UTC (Thu) by mister_m (guest, #72633) [Link]

Wouldn't you have bigger problems if you tried to access an element that was filled with garbage values?

RE: nullable pointers

Posted Mar 31, 2011 16:00 UTC (Thu) by HelloWorld (guest, #56129) [Link]

Of course, the compiler must ensure that a variable is never read before it is written. But this is fairly easy to do and does't require null.

RE: nullable pointers

Posted Mar 31, 2011 19:04 UTC (Thu) by mister_m (guest, #72633) [Link]

I am a bit of a novice. Would you mind explaining how I would enable the compiler to do that? I am missing some programming knowlege in this area I think = perhaps a suer defined type? In school, we always set unused pointers to null, and then check for null to see if we can dereference whatever is at the other end.

RE: nullable pointers

Posted Mar 31, 2011 20:36 UTC (Thu) by HelloWorld (guest, #56129) [Link]

> I am a bit of a novice. Would you mind explaining how I would enable the compiler to do that?
Unless you're the one writing the compiler, you can't. Of course, you could just use a language that does this by default, like Haskell or OCaml.

RE: nullable pointers

Posted Apr 12, 2011 16:04 UTC (Tue) by oelewapperke (guest, #74309) [Link]

The essence of the trick is that you simply make a type difference between "filled in" pointers and "null" pointers.

A tree would be defined like so :

abstract class TreeNode {}
abstract class EmptyTreeNode extends TreeNode {}
abstract class FilledTreeNode extends TreeNode {
TreeNode first;
TreeNode second;
}

If you combine this with runtime polymorphism you can do something like this (won't work in Java, but I'm using java code anyway)

void processTreeNode(EmptyTreeNode et) {}
void processTreeNode(FilledTreeNode ft) {
// you might want to do something useful here
processTreeNode(ft.first);
processTreeNode(ft.second);
// or here
}

then simply call processTreeNode(tree), and the compiler would figure out whether to call the first or the second function.

I would love for java to support this though.

RE: nullable pointers

Posted Apr 1, 2011 3:47 UTC (Fri) by neilbrown (subscriber, #359) [Link]

I'm not familiar with the language you are writing in (being primarily a C programmer), but I gather the a variable of type "List<T>" is allowed to refer to either an object of type "ListNil<T>" or an object of type "ListCons<T>". "ListNil<T>" fills exactly the same role a "NULL" so what you have created is exactly a NULLable pointer, only with a different name and more complex syntax.

A rose by any other name still has thorns, and a NULL de-reference by any other name still causes a run-time-exception or worse.

All values are "created out of thin air". They are abstractions which we create and use to try to model "real" things. Values that are useful we re-use. Values that are not, we forget. NULL turns out to be useful.

Your argument against NULL would seem to apply equally to the value '0'. It is an arbitrary magical value - that the Romans seemed to manage with out. It is a value of every (numerical) type, yet it doesn't support being the divisor in a division. We really should eradicate it - that would get rid of all divide-by-zero errors forever.

Of course, divide-by-zero errors are not caused by the availability of a 0 value, but by the misuse of that value. Similarly NULL pointer de-reference errors are not caused by the availability of a 'NULL' value, but by the misuse of that value.

RE: nullable pointers

Posted Apr 1, 2011 4:39 UTC (Fri) by HelloWorld (guest, #56129) [Link]

> I'm not familiar with the language you are writing in
It's Java (though I don't usually use that language).

> "ListNil<T>" fills exactly the same role a "NULL" so what you have created is exactly a NULLable pointer, only with a different name and more complex syntax.
No, it's not the same thing. As was pointed out multiple times by now, null references can show up everywhere, even in places where they don't make sense and where nobody ever intended or expected them to do. On the other hand, ListNil<T> objects are allowed only allowed where lists are allowed, and any sensible program that operates on lists expects those lists to end at some point (in eager languages at least).

> Your argument against NULL would seem to apply equally to the value '0'.
No, 0 isn't something that somebody just made up. If you don't have 0, you can't add two integers as the only sensible result of adding x and -x is 0. So yes, the fact that you can't divide by 0 is a problem, but getting rid of that is hard. No such argument can be made for null.
I also dislike the fact that null mixes up many different concepts that ought to be separate. null was used to represent optional values, empty lists, empty trees and probably all kinds of other things which don't have anything in common.

RE: nullable pointers

Posted Apr 1, 2011 6:06 UTC (Fri) by paulj (subscriber, #341) [Link]

Sure, but you still have a specialised nil type. Whether you use a null reference on a List type to denote the end of the list or a special EmptyList class, it still means your code has to take to check what exactly the reference is pointing to before going through it to avoid a runtime exception.

I don't see how getting a ClassCastException is any better than a NullPointerException. Even though you've created a new type, neither you nor the compiler is any better position...

RE: nullable pointers

Posted Apr 1, 2011 8:22 UTC (Fri) by nybble41 (subscriber, #55106) [Link]

The example was simply to show that you don't need all pointers to be nullable to have linked lists. The advantages for this case are minimal, since linked lists specifically depend on the 'next' pointer being nullable. However, linked lists are a hardly representative of pointers in general, particularly in a language like Java where all objects are accessed via pointers. The benefit of making pointers non-nullable by default is in the majority of cases where nulls are nothing but a source of runtime exceptions.

The advantages tend to be more pronounced in languages like Haskell, however, which have more advanced type systems which handle this sort of subtyping without any need for runtime casts. As you point out, in Java you would need to explicitly cast from List<T> to ListNil<T> or ListCons<T>, an operation which can fail in much the same manner as dereferencing a null pointer.

RE: nullable pointers

Posted Apr 1, 2011 13:33 UTC (Fri) by HelloWorld (guest, #56129) [Link]

> Sure, but you still have a specialised nil type. Whether you use a null reference on a List type to denote the end of the list or a special EmptyList class, it still means your code has to take to check what exactly the reference is pointing to before going through it to avoid a runtime exception.
How many times do you want me to repeat this? The fact that you need a way to represent the empty list does not mean that you need nullable pointers _everywhere_, even where they _don't_ have any purpose (unlike the list case).

RE: nullable pointers

Posted Apr 1, 2011 14:25 UTC (Fri) by paulj (subscriber, #341) [Link]

That isn't what you were arguing earlier though. You have been arguing that nullable types are *always* a bad thing (in a grandparent to this post) and there is *never* a need for them (to paraphrase several other comments of yours).

To my mind, having to go from checking the value of a reference to checking other properties of a composite object (the type, or a length attribute), and still being left open to runtime-exceptions (with a different name) if you forget, means you still are facing fundamentally the same problem.

Can the compiler maybe catch a couple more problems it wouldn't otherwise. Sure. Is it a silver bullet? Sadly, still no.

RE: nullable pointers

Posted Mar 31, 2011 13:29 UTC (Thu) by HelloWorld (guest, #56129) [Link]

> I'm having trouble understanding the argument against nullable pointers.
And I'm having trouble understanding why one would want a nullable pointer. I have yet to see a use case where they provide a substantial benefit. What I do see is that programs crash due to null pointer exceptions all the time.

RE: nullable pointers

Posted Apr 1, 2011 4:04 UTC (Fri) by neilbrown (subscriber, #359) [Link]

> What I do see is that programs crash due to null pointer exceptions all the time.

What do you think those programs would do there were no null pointers?
There are various possibilities: throw an 'invalid type' error instead, enter an infinite loop, produce an incorrect answer, kill your cat....

One cannot know for sure, but I am certain that simply disallowing NULL pointers will not make incorrect programs correct except in a tiny minority of cases (if at all).

The thing that would have made the talk much more interested would be a description of what could have been done at the time which could have made a difference.... but I don't think there is anything.

RE: nullable pointers

Posted Apr 1, 2011 4:19 UTC (Fri) by HelloWorld (guest, #56129) [Link]

> There are various possibilities: throw an 'invalid type' error instead, enter an infinite loop, produce an incorrect answer, kill your cat....
You forgot the one that matters: they may fail to compile. In languages like ML, a variable can't be "null". The closest thing to null that ML has is the Option datatype, and if you try to pass an expression of type t Option to a function expecting an argument of type t, your program won't compile.

RE: nullable pointers

Posted Apr 1, 2011 8:12 UTC (Fri) by neilbrown (subscriber, #359) [Link]

ahh... I get it now.

What you are saying (I think) is that not only should the language provide a 'non-nullable' pointer type (which the compiler assures will never be NULL), but that where a nullable pointer is used, the language should require that there be an explicit test for NULL before dereferencing the pointer or assigning it to a non-nullable pointer.

That makes sense.

So you will never get a NULL dereference, and the compiler/runtime doesn't have to do implicit tests because tests are required to be explicit.

RE: nullable pointers

Posted Apr 1, 2011 12:46 UTC (Fri) by HelloWorld (guest, #56129) [Link]

> What you are saying (I think) is that not only should the language provide a 'non-nullable' pointer type (which the compiler assures will never be NULL), but that where a nullable pointer is used, the language should require that there be an explicit test for NULL before dereferencing the pointer or assigning it to a non-nullable pointer.
Yes, this is exactly what I meant.

RE: nullable pointers

Posted Apr 1, 2011 8:48 UTC (Fri) by jezuch (subscriber, #52988) [Link]

> You forgot the one that matters: they may fail to compile.

Yup. You learn Haskell and/or ML (and/or ...) and the case against null pointers becomes so obvious you don't even know how to explain it.

RE: nullable pointers

Posted Apr 1, 2011 9:12 UTC (Fri) by paulj (subscriber, #341) [Link]

So in Haskell you never have to do things like have a separate case for an empty list?

If you specifically mean SEGVs, well that's very easy to explain: Haskell and functional languages generally do not have pointers.

RE: nullable pointers

Posted Apr 1, 2011 21:01 UTC (Fri) by mathstuf (subscriber, #69389) [Link]

You still have to case on the empty list (e.g., you can't get the first value from it, so if you need it, you need to do /something/ different for the empty list unless you want exceptions about incomplete pattern matchings). An example:

safeHead :: [a] -> Maybe a
safeHead [] = Nothing
safeHead (a:_) = Just a

When calling safeHead, the caller must* check to see if a value was actually returned before using it. This can be done with fromMaybe:

fromMaybe sensibleDefault $ safeHead someList

* The caller can just use fromJust which raises an exception on a Nothing value, but that's the caller's burden at that point.

RE: nullable pointers

Posted Apr 5, 2011 6:27 UTC (Tue) by cmccabe (guest, #60281) [Link]

I think academics tend to dislike NULL is because it makes it harder to formally prove code correct. I'm kind of surprised nobody brought this up yet.

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds