From: Nick Coghlan <ncoghlan-Re5JQEeQqe8AvxtiuMwx3w-AT-public.gmane.org>
To: "M.-A. Lemburg" <mal-SVD0I98eSHvQT0dZR+AlfA-AT-public.gmane.org>
Subject: Re: Please reconsider the Boolean evaluation of midnight
Date: Thu, 6 Mar 2014 22:08:54 +1000
Cc: Antoine Pitrou <solipsis-xNDA5Wrcr86sTnJN9+BGXg-AT-public.gmane.org>, "python-ideas-+ZN9ApsXKcEdnm+yROfE0A-AT-public.gmane.org" <python-ideas-+ZN9ApsXKcEdnm+yROfE0A-AT-public.gmane.org>

On 6 March 2014 21:04, M.-A. Lemburg <mal-SVD0I98eSHvQT0dZR+AlfA@public.gmane.org> wrote:
> Wait. Let's be clear on this:
>
> Writing
>
> if x: print ('x is None')
>
> or
>
> if x == None: print ('x is None')

The case in question is essentially this one:

    if x:
        assert x is not None # Always valid!
        ....
    else:
        assert x is None # Valid for most user defined types

There is a learned intuition that people naturally acquire when learning Python:

- numbers may be false (it means zero)
- containers may be false (it means empty)
- everything else is always true (as that's the default for user defined classes)

Where datetime().time() is confusing is the fact that the expected behaviour changes based on *which* category you put the object in. The vast majority of Python's users will place structured date and time objects in the "arbitrary object" category, and expect them to always be true.

When using None as a sentinel for such types, writing "if x:" when you really mean "if x is not None:" is a harmless style error, with no practical ill-effects. When dealing with time objects, such users are unlikely to be crafting test cases to ensure that "midnight UTC" is handled correctly, so their "harmless style error" is in fact a subtle data driven bug waiting to bite them.

It is *really* hard for a static analyser to pick this up, because at point of use "if x:" gives no information about the type of "x", and hence any such alert would have an unacceptably high false positive rate.

Now, let's consider the time with the *best* possible claim to being false: timestamp zero. How does that behave?

Python 3.3:

    >>> import datetime as dt
    >>> bool(dt.datetime.fromtimestamp(0))
    True
    >>> bool(dt.datetime.fromtimestamp(0).date())
    True
    >>> bool(dt.datetime.fromtimestamp(0).time())
    True

Huh, if times are supposed to be valid as truth values, that looks rather weird. What's going on?

    >>> dt.datetime.fromtimestamp(0).time()
    datetime.time(10, 0)

Oh, I'm in Brisbane - *of course* the truthiness of timestamps should depend on my timezone! Clearly, what I really meant was timestamp -36000, or perhaps 50400:

    >>> bool(dt.datetime.fromtimestamp(-36000).time())
    False
    >>> bool(dt.datetime.fromtimestamp(50400).time())
    False

So, unless I happen to live in UTC, it's highly unlikely that I'm going to infer from Python's *behaviour* that datetime.time() (unlike datetime.date() and datetime.datetime()) belongs in the "number" category, rather than the "arbitrary object" category.

Perhaps it behaves like a number in other ways:

    >>> utcmidnight = dt.datetime.fromtimestamp(50400).time()
    >>> utcmidnight + 1
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: unsupported operand type(s) for +: 'datetime.time' and 'int'
    >>> utcmidnight * 1
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: unsupported operand type(s) for *: 'datetime.time' and 'int'
    >>> int(utcmidnight)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: int() argument must be a string or a number, not 'datetime.time'

Hmm, nope. And that last one *explicitly* tells me it's not a number!

There's a great saying in the usability world: "You can't document your way out of a usability problem". What it means is that if all the affordances of your application (or programming language!) push users towards a particular logical conclusion (in this case, "datetime.time values are not numbers"), having a caveat in your documentation isn't going to help, because people aren't even going to think to ask the question.

It doesn't matter if you originally had a good reason for the behaviour, you've ended up in a place where your behaviour is confusing and inconsistent, because there is one piece of behaviour that is out of line with an otherwise consistent mental model.

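The timezone dependence in the session above can be reproduced on any machine by passing an explicit zone instead of relying on the local one. The following is a minimal sketch rather than part of the original session: `BRISBANE` is an assumed fixed UTC+10 offset, and `wall_clock_time` is a hypothetical helper name.

```python
from datetime import datetime, time, timedelta, timezone

# Assumption: Brisbane modelled as a fixed UTC+10 offset (no DST).
BRISBANE = timezone(timedelta(hours=10))
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def wall_clock_time(timestamp, tz):
    """Return the naive wall-clock time shown for `timestamp` in zone `tz`.

    Hypothetical helper: computed from the epoch with timedelta arithmetic
    so that negative timestamps behave the same on every platform.
    """
    return (EPOCH + timedelta(seconds=timestamp)).astimezone(tz).time()


# Print each timestamp's wall-clock time in UTC and in the UTC+10 zone.
for ts in (0, -36000, 50400):
    print(ts, wall_clock_time(ts, timezone.utc), wall_clock_time(ts, BRISBANE))
```

Timestamp 0 reads as 10:00 in the UTC+10 zone, while -36000 and 50400 both read as 00:00 there, which is why the same timestamp could land in either truthiness bucket depending on where the code ran.
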
But perhaps I've been told "midnight is false in boolean context". But which midnight? There are three that apply to me:

    >>> naivemidnight
    datetime.time(0, 0)
    >>> utcmidnight
    datetime.time(0, 0, tzinfo=datetime.timezone.utc)
    >>> localmidnight
    datetime.time(0, 0, tzinfo=datetime.timezone(datetime.timedelta(0, 36000)))

Are they all False? No, no they're not (unless your local timezone is UTC):

    >>> bool(utcmidnight)
    False
    >>> bool(naivemidnight)
    False
    >>> bool(localmidnight)
    True

There's a phrase for APIs like this one: "expert friendly". Experts like a particular behaviour because it lets them do advanced things (like leave the door open for modular arithmetic on timestamp.time() values). However, that's not normally something we see as a virtue when designing APIs for Python - instead, we generally aim for layered complexity, where simple things are simple, and we provide power tools for advanced users that want them.

Now, suppose this boolean behaviour went away. How would I detect if a value was midnight or not? Well, first, I would need to clarify my question. Do I mean midnight UTC? Or do I mean midnight local time? It makes a difference for aware objects, after all. Or perhaps I'm not interested in aware objects at all, and only care about naive midnight.

Using appropriately named values like those I defined above, those questions are all very easy to express explicitly in ways that don't rely on readers understanding that "bool(x)" on a datetime.time object means the same thing as "x is naive midnight or UTC midnight":

    if x not in (naivemidnight, utcmidnight):
        # This is the current bool(x)
        ....
    if x != naivemidnight:
        # This has no current shorthand
        ....
    if x != utcmidnight:
        # This has no current shorthand
        ....
    if x != localmidnight:
        # This has no current shorthand
        ....

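For readers who want to try the explicit comparisons above, here is one way the three named values might be constructed. This is a sketch under assumptions: the fixed UTC+10 offset stands in for the local zone used in this message, and `matches_old_false_rule` is a hypothetical helper name that spells out the first comparison in the list. It relies on equality between naive and aware times returning False rather than raising, which is the behaviour in Python 3.3 and later.

```python
from datetime import time, timedelta, timezone

# Assumption: a fixed UTC+10 offset stands in for the local zone.
LOCAL_TZ = timezone(timedelta(hours=10))

naivemidnight = time(0, 0)
utcmidnight = time(0, 0, tzinfo=timezone.utc)
localmidnight = time(0, 0, tzinfo=LOCAL_TZ)


def matches_old_false_rule(x):
    """Hypothetical helper: True if x equals naive midnight or UTC midnight.

    Aware times are compared after adjusting for their UTC offsets, so an
    aware value in another zone can still equal utcmidnight.
    """
    return x in (naivemidnight, utcmidnight)


# Report, for each named midnight, whether it matches the explicit rule.
for name, value in [
    ("naivemidnight", naivemidnight),
    ("utcmidnight", utcmidnight),
    ("localmidnight", localmidnight),
]:
    print(name, matches_old_false_rule(value))
```

Naming the comparison target makes the reader's question ("which midnight?") part of the code, instead of leaving it implicit in a truth test.
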
And just to demonstrate that equivalence is accurate:

    >>> local_10am = localmidnight.replace(hour=10)
    >>> local_10am
    datetime.time(10, 0, tzinfo=datetime.timezone(datetime.timedelta(0, 36000)))
    >>> bool(local_10am)
    False
    >>> local_10am != utcmidnight
    False
    >>> local_10am not in (naivemidnight, utcmidnight)
    False

While it was originally the desire to reduce the impact of a very common bug that prompted me to reopen the issue, it is the improved consistency in the behaviour presented to users of Python 3.6+ that makes me believe this is actually worth fixing. We embarked on the whole Python 3 exercise in the name of making the language easier to learn by removing legacy features and fixing some problematic defaults and design warts - this is a tiny tweak by comparison, and will still go through the full normal deprecation cycle (warning in 3.5, behavioural change in 3.6).

Regards,
Nick.

_______________________________________________
Python-ideas mailing list
Python-ideas-+ZN9ApsXKcEdnm+yROfE0A@public.gmane.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/