The return of None-aware operators for Python
The saga of the None-aware (or null-coalescing) operators for Python continues. We last looked in on the topic a little over a year ago and noted that either adoption or a clear rejection of the idea might help tamp down its regular recurrence. That has not happened, so, predictably, it was raised again—and does not look any closer to resolution this time around.
Back in October, Bora Savas posted a proposal for a "safe navigation operator" to the Python Ideas discussion forum. It would add an operator ("?.") that would only access an attribute if the object is not None:
    if obj is not None:
        print(obj.attr)

    # could (mostly) be replaced with

    print(obj?.attr)

Since "obj?.attr" would return None when obj is None, a value will always be printed, unlike with the if statement version, as was mentioned in the thread; one could add an "else: print(None)" for completeness. In any case, the operator would provide a way to short-circuit the attribute access, rather than require tests or raise AttributeError, which would make code more concise. As noted the other times that the idea has come up, several other languages (e.g. JavaScript) already have a similar operator.
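For comparison, the behavior described above can be approximated in current Python with a small helper. This is only a sketch: `maybe_attr` and the `Point` class are hypothetical names used for illustration, not part of the proposal or of any library.

```python
from dataclasses import dataclass


@dataclass
class Point:
    # Hypothetical example class; any object with attributes would do.
    x: int


def maybe_attr(obj, name):
    """Approximate obj?.name: None if obj is None, else a normal lookup."""
    if obj is None:
        return None
    # A missing attribute on a non-None object still raises AttributeError.
    return getattr(obj, name)


print(maybe_attr(Point(3), "x"))  # the attribute value
print(maybe_attr(None, "x"))      # None is printed, unlike the if version
```

The helper takes the attribute name as a string, which is noisier than the proposed operator; that difference in concision is the point advocates make.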
Chris Markiewicz immediately pointed Savas at PEP 505 ("None-aware operators"), which proposes a more extensive set of operators. It has the "?." operator for attribute access, as Savas proposed, but also the "??" binary operator (along with an augmented-assignment version "??=") and an indexing operator ("?[]"). Their operation is fairly straightforward; our previous article, and another from 2018, have several examples, as does the PEP, of course.
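As a rough illustration of what the other three operators would replace, the sketch below spells out approximate equivalents in today's Python. The helper names `coalesce` and `maybe_index` and the `timeout` variable are made up for this example; the PEP itself remains the authority on the exact semantics.

```python
def coalesce(value, default):
    """Roughly `value ?? default`; unlike an operator, `default` is always evaluated."""
    return default if value is None else value


def maybe_index(container, key):
    """Roughly `container?[key]`: None if container is None, else normal indexing."""
    return None if container is None else container[key]


timeout = None
# Roughly `timeout ??= 30`: assign only when the current value is None.
if timeout is None:
    timeout = 30

print(coalesce(0, 10))      # 0 is kept because it is not None, unlike `0 or 10`
print(coalesce(None, 10))
print(maybe_index(None, "key"))
print(maybe_index({"key": 1}, "key"))
print(timeout)
```

The first call shows why `or` is not a substitute: `0 or 10` evaluates to 10, because `or` tests truthiness rather than comparing against None.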
Savas thought that the PEP was a good starting point for reviving the idea, but its status is "Deferred". Chris Angelico, who has long been an advocate for the feature, cautioned that further progress will require that the objections leading to the deferral be addressed. Normally, if a PEP is rejected, the reasons for that are described in it, but PEP 505 never reached that state, so it is lacking for concrete information about those objections. Sebastian Rittau said that made it difficult to proceed:
The PEP lists no objections, the commit changing the status to "Deferred" lists no reasons. The PEP doesn't even have a "Discussions-To" header. After five years it seems appropriate to start a new discussion, instead of digging through three years worth of mailing list archives. (The time between the creation of the PEP and the change to "Deferred".)
Jean Abou Samra pointed out that there have been several different discussions about the feature in the Python discussion forum, so one may not have to go back as far as the mailing-list archives. Savas suggested that the scope of the PEP might be part of the problem with it: having four separate operators may be making it too complex to adopt at once. One of the underlying problems, though, as Angelico (and others) pointed out, is that the PEP authors are not interested in pursuing the idea any longer; in fact, Steve Dower, one of those authors, is no longer even in favor of the idea.
From there, the thread took a fairly predictable path, with advocates extolling the virtues of the feature, while others complained about various aspects of it. Whether it is readable is obviously in the eye of the beholder and there are plenty of beholders of both types commenting. "Laurie O" had a somewhat different take, though, about the style of programming that the feature would enable based on her JavaScript experience:
I find code with more nulls flying around less safe, with types more fluid. After years of use of both languages, JS/TS feel less reliable to me due to this, whereas Python I have more confidence in the type (or shape, when I'm duck-typing).

Making nulls more ubiquitous means every value must be considered as possibly null unless explicitly guarded. I find Python's current idiom where shapes are stronger to be safer.
A construct like:

    a = b?.c?.d?.e

effectively hides where the None value is when a is None; it could be b, c, d, or e. She said that has led her to write code to inspect each, "ending up with the very if-statements this proposals tries to replace". Dower agreed with her assessment:
[...] though in my case it's the extra years of C# experience that have helped sway me (C# already had these operators for years before the Python or JS proposals, and the movement since then has been towards non-nullable types that can never be null).

Ensuring that functions always return correctly shaped objects (or raise if they cannot) is a better idiom. With ?., API designers will assume that their callers are happy to deal with the occasional None and will design lazy/slack APIs that return them instead of properly avoiding it.
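The concern about a chain such as "b?.c?.d?.e" can be illustrated with current Python. In this sketch, `safe_chain` and `strict_chain` are hypothetical helpers written for the example: the first mimics the short-circuiting chain and returns a bare None, while the second follows the "raise if they cannot" idiom and names the link that was None.

```python
from types import SimpleNamespace


def safe_chain(obj, *names):
    """Mimic obj?.a?.b...: stop and return None at the first None."""
    for name in names:
        if obj is None:
            return None
        obj = getattr(obj, name)
    return obj


def strict_chain(obj, *names):
    """Follow the same path, but report which link was None."""
    path = "obj"
    for name in names:
        if obj is None:
            raise ValueError(f"{path} is None")
        obj = getattr(obj, name)
        path += f".{name}"
    return obj


# Hypothetical data: b.c exists, but b.c.d is None.
b = SimpleNamespace(c=SimpleNamespace(d=None))

print(safe_chain(b, "c", "d", "e"))  # None, with no hint about where it came from
try:
    strict_chain(b, "c", "d", "e")
except ValueError as exc:
    print(exc)  # names the link that was None
```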
There is a difference between Python and the languages that provide these operators, Beni Cherniavsky-Paskin said. Python raises exceptions, unless the programmer explicitly opts into using the APIs that can return a default (e.g. dict.get()):
It tends to exist in languages where common APIs/operations return such values a lot. JS obj.no_such_attr aka obj['no_such_attr'] fails by returning undefined; Ruby hash[no_such_key] fails by returning nil... Whereas Python raises exceptions — unless you specify explicit defaults to getattr() / dict.get(). For dictionaries, this pattern works great: (d or {}).get('e', {}).get('f', {})
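The dictionary pattern from the quote can be tried directly. The sketch below wraps it in a hypothetical `lookup` function and also shows one caveat that is easy to miss: the default passed to dict.get() only covers missing keys, not keys whose value is None.

```python
def lookup(d):
    # The pattern from the quote: missing keys fall back to empty dicts.
    return (d or {}).get("e", {}).get("f", {})


print(lookup({"e": {"f": 1}}))  # the nested value
print(lookup({}))               # {} because "e" is missing
print(lookup(None))             # {} because d itself is None

# A caveat: the default only applies to *missing* keys. A key that is
# present with the value None still breaks the chain.
try:
    lookup({"e": None})
except AttributeError as exc:
    print("AttributeError:", exc)
```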
Peter Lovett, who does a lot of Python teaching, pointed out that there are some real advantages to avoiding None in data structures. He agreed with Dower and said that "sometimes I've found the best solution is to stop the None's getting into the list (or whatever structure)", because it encourages better coding practices. He noted that Raymond Hettinger, who is a well-known Python instructor, offers similar advice: "It's easier to get the length of a list that doesn't have None's, than to find out how many non-None objects a list has."
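A tiny, made-up example shows the difference that advice points at; the `readings` list is hypothetical data, not something from the thread.

```python
readings = [3, None, 7, None, 5]  # hypothetical data with gaps recorded as None

# With None in the list, counting real values needs a filter everywhere.
count_with_nones = sum(1 for r in readings if r is not None)

# Keeping None out at the boundary lets later code use len() directly.
clean = [r for r in readings if r is not None]
print(count_with_nones, len(clean))
```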
But Angelico said that avoiding adding a feature because it can be misused is problematic; there are already lots of such features in the language. Instead, the feature should be judged on its benefits, as well. "I'm more interested in how something can be used well than in how it can be used badly."
There were, of course, ideas about alternate syntax or ways to accomplish the same goal within the existing language, along with fairly short discussions of them and their merits (or lack thereof). All of that is pretty standard fare for the Ideas forum. What did not appear, however, was any real interest in picking up the PEP and pushing it forward. That would require a core developer to step up as the sponsor of the PEP and either push it themselves or work with others to do so.
But these kinds of discussions are going to keep popping up periodically until either the feature gets added or it gets rejected. The limbo that PEP 505 (and more generally the None-aware operators feature) occupies simply leaves room to discuss it all again, and again. One gets the sense that the feature faces too much headwind to get added to Python, but having the steering council make that pronouncement and documenting the rejection in a PEP would be a kind of public service at this point. That could head off further discussions down the road, as could adding the feature, of course.
Index entries for this article | |
---|---|
Python | None |
Python | Python Enhancement Proposals (PEP)/PEP 505 |
Posted Jan 5, 2024 7:59 UTC (Fri) by NYKevin (subscriber, #129325)

Posted Jan 11, 2024 18:01 UTC (Thu) by smitty_one_each (subscriber, #28989)

But the general disdain toward fresh syntax is easy to support.

Posted Jan 5, 2024 8:51 UTC (Fri) by pbonzini (subscriber, #60935)

But when writing Python I have never really felt the need for ?. or ?[] operators and, in particular, ?[] would be less useful than a shortcut for ".get(k, None)" that never raises an exception.

If anything, a Rust-like ? operator that *returns* None from the function might be more useful...

Posted Jan 5, 2024 9:31 UTC (Fri) by kleptog (subscriber, #1183)

> But when writing Python I have never really felt the need for ?. or ?[] operators and, in particular, ?[] would be less useful than a shortcut for ".get(k, None)" that never raises an exception.

I was about to say that ?[] would actually be useful, but you're right. What we need is something that shortens:

    foo.get('bar', {}).get('baz', {}).get('blaat')

Which might be shortened to:

    foo['bar']?['baz']?['blaat']?

Since we care about the lookup, not whether foo is None or not. And you want it to work for array lookups, because there's no version of get() that works for arrays. This is mostly a problem when dealing with JSON input with structures with optional parts. Though, the most readable version would be something like:

    foo.getpath('bar.baz.blaat', None)

Every large project eventually grows a function that does something like that.

Posted Jan 5, 2024 9:17 UTC (Fri) by marcH (subscriber, #57642)

This is spot on; best and most accurate comment on the entire topic. Many times when writing code like "42 if x is None else x" I wondered "Wait, why can x be None in the first place?" and sometimes I did change the code so x could not be None any more.

Python's None is nowhere near as bad as NULL in C/C++ (the so-called "billion dollar mistake") but it's still not great. So even if the new shortcuts were great, intuitive and very readable (which is debatable and debated), they would still only offer a very slightly lazier way to hide what is often another design issue.

Posted Jan 5, 2024 13:10 UTC (Fri) by khim (subscriber, #9252)

> Python's None is nowhere near as bad as NULL in C/C++ (the so-called "billion dollar mistake") but it's still not great.

Huh? Python's None is worse than NULL. At least with NULL you know that you have to deal with pointers to hit that corner-case. None may be everywhere in Python.

Sure, but the only bulletproof solution is to drop dynamic typing and add mandatory types. And then it wouldn't be a Python anymore.

Posted Jan 5, 2024 16:11 UTC (Fri) by marcH (subscriber, #57642)

It's better because it gives you a clear stack trace instantly.

> Sure, but the only bulletproof solution is to drop dynamic typing and add mandatory types. And then it wouldn't be a Python anymore.

OK I should really have avoided this bad, pointless and distracting comparison with C...

Posted Jan 5, 2024 17:02 UTC (Fri) by NYKevin (subscriber, #129325)

This is a simplification. Python does provide support for static typing if you can be bothered to use it. If you run a FOSS project (or a closed-source project, for that matter), you can mandate that all code must pass a static type analysis (e.g. with mypy) before it may be merged. But that's up to you as a user of Python - the language won't force you to do it (just like it won't force you to write unit tests, run a linter, use a formatter, etc.).

Posted Jan 8, 2024 7:02 UTC (Mon) by LtWorf (subscriber, #124958)

And the typing cannot express some things, for example a function that takes T as a parameter and returns an instance of T as a result… there's been an issue open for years to support this, but in general it isn't supported.

Posted Jan 8, 2024 8:39 UTC (Mon) by gdiscry (subscriber, #91125)

> And the typing cannot express some things, for example a function that takes T as a parameter and returns an instance of T as a result… there's been an issue open for years to support this, but in general it isn't supported.

Do you mean something like this? (Python 3.12 syntax for brevity, but previous versions only require small tweaks)

    def create_instance[T](cls: type[T]) -> T:
        ...

If not, I'm curious about what you meant.

Nevertheless, it's true that not all APIs can be correctly annotated in Python and it's frustrating when that happens. But when it can, it's really nice to avoid defensive programming at runtime.

Posted Jan 10, 2024 15:44 UTC (Wed) by LtWorf (subscriber, #124958)

https://github.com/python/mypy/issues/9773
https://mail.python.org/archives/list/typing-sig@python.o...

Posted Jan 5, 2024 18:00 UTC (Fri) by marcH (subscriber, #57642)

But you have a point: Option must be explicit and cannot be everywhere.

OK, enough Apples to Oranges comparisons, I said I would stop...

Posted Jan 5, 2024 20:44 UTC (Fri) by NYKevin (subscriber, #129325)

Because of Python's laissez-faire multi-paradigm attitude, it's actually quite difficult to design a good implementation of None-aware operators. You can't really use the Rust solution, because Python is dynamically-typed and likes to signal errors with exceptions rather than sentinel values (i.e. you can't reasonably define it to propagate Result<V, E> or the like, since Result is not even a thing in Python). But the TypeError or ValueError that you tend to get from None is usually the wrong exception to throw, and propagating None as if it was NaN will make it difficult to get a traceback. The idiomatic behavior, in some cases, is to eagerly check for None and raise an exception.

Maybe this syntax could work?

    x = y ?? raise FooError('y should not be None.')

But that is going to be problematic. Raise is a statement, not an expression, so you'd need to make a special case to allow it in this one context, or you'd need to convert it into an expression. And then people will also want to write x = y ?? z, so you need to allow for that as well.

I have no idea how this is supposed to be extended for ?. and ?[], because where are you supposed to put the raise?

Posted Jan 7, 2024 14:21 UTC (Sun) by cpitrat (subscriber, #116459)

May I interest you in assert?

Posted Jan 8, 2024 22:06 UTC (Mon) by NYKevin (subscriber, #129325)

(Also, you can't specify the exception type, but that's small potatoes in comparison.)

Posted Jan 8, 2024 22:13 UTC (Mon) by khim (subscriber, #9252)

> assert is not properly used for any purpose other than as a "live comment" (i.e. a comment that actually gets executed). This is because the -O flag disables asserts, so the implication is that your code is expected to be correct even when asserts are not run.

I have just tested and that doesn't happen. What version of python are you using??? I have never observed that effect in Python, but there are many implementations, maybe one of them does that, but for me it's a reason not to use it rather than change use of assert.

Posted Jan 8, 2024 22:22 UTC (Mon) by mb (subscriber, #50428)

    $ cat t.py
    assert 2+2==5
    $ python3.11 t.py
    [...]
    AssertionError
    $ python3.11 -O t.py

Posted Jan 8, 2024 23:19 UTC (Mon) by NYKevin (subscriber, #129325)

Posted Jan 8, 2024 23:26 UTC (Mon) by NYKevin (subscriber, #129325)

As far as I can tell, 1.4 did not have an assert statement, so the assert statement has never been unconditional in any released version of Python.

Posted Jan 8, 2024 23:40 UTC (Mon) by ABCD (subscriber, #53650)

Posted Jan 8, 2024 23:36 UTC (Mon) by marcH (subscriber, #57642)

> so the assert is just for documentation purposes (to inform the next person who reads the code that the invariant exists and is important).

Small contradiction here.

I didn't know about the -O flag and I've always "run" asserts and every time one is hit it is massively more useful than a comment!

> so the implication is that your code is expected to be correct even when asserts are not run.

Agreed.

Posted Jan 8, 2024 23:37 UTC (Mon) by NYKevin (subscriber, #129325)

Sorry, my brain is smaller than yours, can you please elaborate?

Posted Jan 9, 2024 11:28 UTC (Tue) by cpitrat (subscriber, #116459)

But you can easily define your own raiseIfNone.

Posted Jan 9, 2024 15:11 UTC (Tue) by atnot (subscriber, #124910)

Posted Jan 9, 2024 15:28 UTC (Tue) by kleptog (subscriber, #1183)

Posted Jan 9, 2024 19:13 UTC (Tue) by NYKevin (subscriber, #129325)

Posted Jan 10, 2024 16:28 UTC (Wed) by laarmen (subscriber, #63948)

Posted Jan 6, 2024 0:40 UTC (Sat) by tialaramex (subscriber, #21167)

Let's make our own, custom user-defined generic sum type we'll call it Perhaps<T> and it'll have two values, Huh, and Well(T). Ours works exactly the same way and sure enough the Huh value of Perhaps<&T> will also be represented by an all-zeroes bit pattern.

This behaviour is guaranteed by the language (it is named the Guaranteed Niche Optimisation and is crucial to the language's design) but similar behaviours are delivered in practice for more exotic arrangements. For example Rust's char type needs 4 bytes, but it only handles Unicode "scalar values" (basically think codepoints unless you really care about the minutiae of Unicode) so there are a *lot* of unused values. Accordingly, a sum type with Tafkap, SimpsonsMeme and Other(char) will fit in the same 4 bytes as the char alone, because Rust will just squeeze Tafkap and SimpsonsMeme in as bit patterns which aren't valid for char. You aren't promised this will work, but the Rust compiler is going to do it anyway because it's faster and smaller and easy.

† Rust only "really" has one loop, the one introduced by the keyword loop. But for and while both work fine, because the compiler just transforms them into loops, this "de-sugaring" is actually spelled out in the documentation, and the de-sugaring of for (since it's a modern iterator for-each not a C-style for) needs all of IntoIterator, Iterator, Some and None - and so they have to be langitems, annotated core library features which must exist or the language won't work.

In Python things are very different, None is a completely different type than whatever you expected, just as the null pointer is a distinct type in C++.

0, null, unknown, invalid, (ad infinitum)

Posted Jan 5, 2024 12:39 UTC (Fri) by Wol (subscriber, #4433)

    If death-date is valid then age is invalid else
        if birthdate is valid then
            age is today() - birthdate
        else
            age is unknown
        end
    end

(And I haven't trapped the case where death-date precedes birthdate, which is probably (but not necessarily :-) dud data).

Here we have a number that can have at least two different non-numeric values that MUST be dealt with, and that's a simple example. SQL has at least two possible context-dependent meanings for NULL, probably more ...

Until the problem space is properly defined, every fix is just going to create a new problem ...

Cheers,
Wol

Posted Jan 5, 2024 13:52 UTC (Fri) by james (subscriber, #1325)

Or there's the 4GL approach, where you can write

    age = today - birthdate.
    if deathdate <> ? then age = ?.

with either birthdate or deathdate possibly being ? (= NULL). A ? value in a calculation returns ?.

Sometimes, that's what you want. Other times you have to put the checks in anyway. Occasionally you can get away with doing one check right at the end.

Posted Jan 5, 2024 14:49 UTC (Fri) by Wol (subscriber, #4433)

    LIST PEOPLE WITH AGE NE INVALID BY.DSND BIRTHDATE

will get everybody in your genealogy database who is or could be alive. Of course, if they were born in 1900 they're probably dead, but they've still got three years to go to beat the record.

Cheers,
Wol

Posted Jan 5, 2024 17:37 UTC (Fri) by NYKevin (subscriber, #129325)

You cannot expect to encode all possible information about a value into the type system. There will always be a certain amount of context-dependent interpretation of the value, usually spread throughout your application logic. Ideally, you also document this interpretation somewhere (e.g. in a comment, saying something like "NULL death_date could mean still alive or death date unknown, check the is_alive field to distinguish them"). What you cannot reasonably do is give the type system awareness of that kind of semantic information. Yes, yes, you can use some kind of ADT to deal with this specific simple case, when using a language that provides ADTs (i.e. not SQL or any database language that I'm aware of). In SQL you can use CHECK constraints to at least rule out invalid states (e.g. if death_date is not NULL, then is_alive should be FALSE), and maybe hide the "how do I compute age?" complexity behind a VIEW or something. But the broader problem of "sometimes a variable has a semantic dependency on another variable, and it's possible for poorly-written code to misinterpret the meaning of a given value" is not one you are going to solve with fancy type theory, because type theory is fundamentally syntactic, and cannot* be aware of the full semantics of the program as a whole.

* See Rice's theorem.

Posted Jan 10, 2024 13:17 UTC (Wed) by koh (subscriber, #101482)

Posted Jan 10, 2024 15:05 UTC (Wed) by Wol (subscriber, #4433)

But again, is_alive is tri-state, yes, no or unknown. So we can't declare an "obviously boolean" property as boolean.

This is the big problem, we want to know the difference between "there is no answer" and "the answer is unknown". It's all very well saying "add an extra column and comment it", but at the end of the day, if I can't find out how old someone is just by querying the "age" column, then there is a problem (and yes it's a hard problem) with the database.

I'm not claiming to have a solution. It's just blindingly obvious that the solutions we have are over-complex - we need to do better AND recognise that there is a problem with the current setup. We need some sort of quartean logic :-) Maybe this is where FORTRAN's "arithmetic if" should make a comeback :-)

Cheers,
Wol

Posted Jan 10, 2024 18:52 UTC (Wed) by mpr22 (subscriber, #60784)

"not(reported_deceased)" is an adequate value to know, and "reported_deceased" really can be stored as a non-nullable boolean or, indeed, extrapolated from whether the nullable timestamp value "demise_reported_date" is not null.

Posted Jan 10, 2024 22:43 UTC (Wed) by Wol (subscriber, #4433)

Like I said, if we don't have a valid birth or death, then we don't know age, and this hasn't changed it. Like I said, "every fix is going to create a new problem", and that's exactly what's happened here.

I wouldn't create a "demise reported date", I'd just add a "type of death report" so you have your "died" date, and you know whether it is died date, death reported date, reported missing, presumed missing, etc etc.

But like I said, if you don't get your problem space definition right, things just go from bad to worse ...

Cheers,
Wol

Posted Jan 6, 2024 17:29 UTC (Sat) by ceplm (subscriber, #41334)

Posted Jan 14, 2024 5:08 UTC (Sun) by DanilaBerezin (guest, #168271)

That's the problem, it should be explicitly tested for. This hides the operation of the program and allows tons of ways for bugs to be silently introduced into the code without really knowing anything is wrong.