Applying PEP 8
Two recent threads on the python-ideas mailing list have overlapped to a
certain extent; both referred to Python's style guide, but the discussion
indicates that the advice in it may have been stretched further than intended. PEP 8
("Style Guide for Python Code") is the longstanding set of
guidelines and suggestions for code that is going into the standard
library, but the "rules" in the PEP have been applied in settings and tools well outside of that
realm. There may be reasons to update the PEP—some unrelated work of that nature is
ongoing, in fact—but Pythonistas need to remember that the suggestions in
it are not carved in stone.
Emptiness
On August 21, Tim Hoffmann posted his idea for an explicit emptiness test (e.g. isempty()) in the language; classes would be able to define an __isempty__() special method to customize its behavior. Currently, PEP 8 recommends using the fact that empty sequences are false, rather than any other test for emptiness:
# Correct:
if not seq:
if seq:
# Wrong:
if len(seq):
if not len(seq):
But Hoffmann said that an isempty() test would be more explicit
and more readable, quoting entries from PEP 20
("The Zen of Python"). He also pointed to a video of a talk by Brandon
Rhodes, where Rhodes suggested that the second ("Wrong") version of the
test was more explicit, thus a better choice. Effectively Hoffmann wanted
to take that even further, but Steven D'Aprano said
that Python already has an explicit way to test collections for emptiness:
We do. It's spelled:
len(collection) == 0
You can't get more explicit than that.
He perhaps should have known that the last line would be too absolute for
other Python developers to resist; Serhiy Storchaka
and others came up with "more explicit" tests that D'Aprano laughingly
acknowledged.
But, perhaps more to the point, Chris Angelico wondered
what actual problems isempty() would solve. Testing a collection
in a boolean context (e.g. in an if statement or using bool()), as suggested in the PEP, works
for many types, he said; "Are there any situations
that couldn't be solved by either running a type checker, or by using
len instead of bool?"
But, as Thomas Grainger pointed out, both NumPy arrays and pandas DataFrames have a different idea about what constitutes emptiness; evaluating those types as booleans will not produce the results expected. NumPy and pandas are popular Python projects for use in scientific and data-analysis contexts, so their behavior is important to take into account. Grainger also mentioned the "false" nature of time objects set to midnight, which was addressed back in 2014, as another example.
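To make the failure mode concrete, here is a minimal sketch that uses a hypothetical StrictArray class as a stand-in for an array type. It is not NumPy; it is only loosely modeled on the way NumPy refuses to choose a single truth value for a multi-element array, and every name in it is an assumption made for illustration.

```python
class StrictArray:
    """Hypothetical stand-in for an array type; not a real NumPy class."""

    def __init__(self, values):
        self._values = list(values)

    def __len__(self):
        return len(self._values)

    def __bool__(self):
        # Loosely mimics array libraries that refuse to guess a single
        # truth value when there is more than one element.
        if len(self._values) > 1:
            raise ValueError("truth value of a multi-element array is ambiguous")
        return any(self._values)


def truthiness_check_fails(data):
    """Return True if evaluating 'not data' raises instead of answering."""
    try:
        not data
    except ValueError:
        return True
    return False


plain_list = [0, 0]
array_like = StrictArray([0, 0])

list_check_fails = truthiness_check_fails(plain_list)   # a list answers normally
array_check_fails = truthiness_check_fails(array_like)  # the stand-in raises
array_is_empty = len(array_like) == 0                   # len() still gives an answer
```

With a type like this, the PEP 8 idiom is unusable even though the object knows its own length, which is the situation Grainger was describing.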
While the wisdom of treating zero as false in Python in general was questioned
by Christopher Barker, Angelico said
that the real problem with the false midnight was in treating midnight as
zero (thus false). In any case, Hoffmann believes
that objects should be able to decide whether they are empty: "It's a
basic concept and like __bool__ and __len__ it should be upon the objects
to specify what empty means." In a later message, he conceded
that adding a new emptiness protocol (i.e. __isempty__())
may well be overkill, however.
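The two existing protocols that Hoffmann mentions can be sketched briefly. In the example below, Playlist and Sensor are hypothetical classes invented for illustration; the point is only that bool() falls back to __len__() when __bool__() is absent, and that a class defining __bool__() can give an answer that has nothing to do with emptiness.

```python
class Playlist:
    """Hypothetical container that defines only __len__."""

    def __init__(self, tracks):
        self.tracks = list(tracks)

    def __len__(self):
        return len(self.tracks)


class Sensor:
    """Hypothetical object whose truth value means 'online', not 'non-empty'."""

    def __init__(self, readings, online):
        self.readings = list(readings)
        self.online = online

    def __len__(self):
        return len(self.readings)

    def __bool__(self):
        # __bool__ takes precedence over __len__ in a boolean context.
        return self.online


empty_playlist_truth = bool(Playlist([]))        # falls back to __len__
full_playlist_truth = bool(Playlist(["intro"]))  # non-zero length is true

idle_sensor = Sensor([], online=True)
sensor_truth = bool(idle_sensor)       # answers "is it online?"
sensor_empty = len(idle_sensor) == 0   # answers "is it empty?"
```

For Sensor the two questions have different answers, so "if not obj" cannot serve as an emptiness test for it.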
Several commenters asked about use cases where emptiness-test problems manifest; Hoffmann said that SciPy and Matplotlib both have functions that can accept NumPy arrays or Python lists and need to decide if they are empty at times. Using len() works, but:
We often can return early in a function if there is no data, which is where the emptiness check comes in. We have to take extra care to not do the PEP-8 recommended emptiness check using `if not data`.
He suggested that having two different ways to test for emptiness depending
on the types of the expected data was "unsatisfactory";
"IMHO whatever the recommended syntax for
emptiness checking is, it should be the same for lists and arrays and
dataframes." But Paul Moore objected
to the rigid adherence to PEP 8:
You can write a local isempty() function in matplotlib, and add a requirement *in your own style guide* that all emptiness checks use this function.
Why do people think that they can't write project-specific style guides, and everything must be in PEP 8? That baffles me.
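A project-local helper of the kind Moore describes could be as small as the sketch below. The name isempty() comes from the thread, but the len()-based body is an assumption; a real project might need extra branches for types with their own notion of emptiness.

```python
def isempty(data):
    """Project-local emptiness check (hypothetical helper).

    Relies only on len(), so it never evaluates the object in a
    boolean context.
    """
    return len(data) == 0


def summarize(data):
    """Example caller: return early when there is no data."""
    if isempty(data):
        return None
    return max(data)


results = [summarize([]), summarize([0, 0]), summarize((3, 1, 2))]
```

Note that the second call returns 0 rather than None: a list of zeros is not empty, and the helper does not care whether its contents are false.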
But the inconsistency between using the object in a boolean context and
checking its len() led
Hoffmann to suggest
that PEP 8 needs changing, "because 'For sequences, (strings,
lists, tuples), use the fact that empty sequences are false:' is not a
universal solution". While Moore was not
opposed to changing the wording in PEP 8, he said that things are
not as clear cut as Hoffmann seems to think:
PEP 8 is a set of *guidelines* that people should use with judgement and thought, not a set of rules to be slavishly followed. And in fact, I'd argue that describing a numpy array or a Pandas dataframe as a "sequence" is pretty inaccurate anyway, so assuming that the statement "use the fact that empty sequences are false" applies is fairly naive anyway.
But if someone wants to alter PEP 8 to suggest using len() instead, I'm not going to argue, I *would* get cross, though, if the various PEP 8 inspired linters started complaining when I used "if seq" to test sequences for emptiness.
Hoffmann eventually decided not to pursue either a language change or one for PEP 8. There are some differences of opinion within the thread, but, by and large, the Python core developers do not see anything that requires much in the way of change. Meanwhile, PEP 8 popped up again right at the end of August.
is versus ==
Nick Parlante posted a lengthy message about a problem he has encountered when teaching a first course in programming using Python. Compared with other languages (e.g. Java), Python has a much simpler rule for how to do comparisons:
To teach comparisons in Python, I simply say "just use ==" - it works for ints, for strings, even for lists. Students are blown away by how nice and simple this is. This is how things should work. Python really gets this right.
The problem is that PEP 8 has an entry in the "Programming
Recommendations" section that says: "Comparisons to singletons like None
should always be done with is or is not, never the
equality operators." Singletons are
classes that only have one instance—all references to None in
Python are to the same object.
Parlante calls the entry in the PEP the "mandatory-is rule"
and said that it complicates teaching the language unnecessarily; tests
like "x == None" generally work perfectly well.
Students often first encounter is in a warning from code that
tests a variable for equality to None, Parlante said. Integrated development
environments (IDEs) will typically complain about violations of PEP 8,
he said, which is usually "very helpful". But there is an
exception: "Having taught thousands of introductory Python students, the one PEP8
rule that causes problems is this mandatory-is rule."
He suggested making the "rule" less ironclad by adding language about it
being optional to the PEP.
Angelico said
that the two operators are asking different questions, however; it is
important to eventually understand the difference, but "just use =="
is a fine place to start.
He also pointed to the specific language in the PEP and noted, again, that
"EVERYTHING in that document is optional for code that isn't part of
the Python standard library". He
suggested turning off the specific warning in the IDE if it was causing
problems. Ultimately, it is up to the instructor to determine the best
approach for their course—including the style guide.
Parlante pushed back a bit on the correctness of using is as specified in the PEP, but Angelico provided several examples of where the "x == None" test will not work. Perhaps unsurprisingly, NumPy was used in one of the examples; the point is that equality is not the right question to ask because some objects have odd views on what it means or, like NumPy, are unwilling to even attempt to decide. NumPy raises ValueError when the result of comparing a multi-element array using == is evaluated in a boolean context, for example.
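The following sketch shows how an equality test against None can go wrong. It does not reproduce Angelico's examples; AlwaysEqual and ElementwiseList are hypothetical classes invented here, the second being a rough stand-in for types that compare element by element.

```python
class AlwaysEqual:
    """Hypothetical object whose __eq__ claims equality with everything."""

    def __eq__(self, other):
        return True


class ElementwiseList:
    """Hypothetical stand-in for a type that compares element by element."""

    def __init__(self, values):
        self.values = list(values)

    def __eq__(self, other):
        # Returns one answer per element rather than a single bool.
        return [value == other for value in self.values]


odd = AlwaysEqual()
equal_to_none = (odd == None)      # the object decides, and says yes
identical_to_none = (odd is None)  # identity cannot be overridden

items = ElementwiseList([1, None, 3])
elementwise = (items == None)      # a list of per-element answers
items_is_none = (items is None)    # a single, unambiguous answer
```

In both cases the is test asks whether the name refers to the None object itself, so the class has no opportunity to change the answer.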
Barker noted that he also teaches Python to beginners, but that he does teach about the difference between is and == early on. There are benefits to that approach, he said:
I have seen all too many instances of code like:
if c is 0: ...
Which actually DOES work, for small integers, on cPython -- but it is a very bad idea. (though I think it raises a Warning now)
And your students are almost guaranteed to encounter an occasion where using == None causes problems at some point in their programming careers -- much better to be aware of it early on!
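Barker's parenthetical can be checked without relying on how any particular interpreter caches integers. The sketch below assumes CPython 3.8 or later, where compiling an is comparison against a literal emits a SyntaxWarning; the helper name compile_warnings() is made up for this example.

```python
import warnings


def compile_warnings(source):
    """Compile an expression and return the warning categories it triggered."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        compile(source, "<example>", "eval")
    return [warning.category for warning in caught]


# Assumes CPython 3.8+: "is" with a literal is flagged at compile time.
identity_warnings = compile_warnings("c is 0")
equality_warnings = compile_warnings("c == 0")
```

The expression is only compiled, never evaluated, so the undefined name c is not a problem.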
Barker suggested that Parlante leave the "mandatory is" warning turned on in the IDE,
but D'Aprano had a different
take. He is "not fond of linters that flag PEP 8
violations" and agreed with Angelico's configuration suggestion. As
a practical matter, D'Aprano said, changing PEP 8 in order to affect the IDEs is
likely to be a slow way to go about fixing this problem—if there even is
one.
But "PEP-8 zealots" (as D'Aprano called them) are actually
acting as a force for good, Parlante said. Students naturally pick up good
habits by seeing complaints from the IDE and fixing them, even though they come
from the completely optional guidelines in PEP 8. "I hope people
who care about PEP8 can have a moment of satisfaction, appreciating how
IDEs have become a sort of gamified instrument to bring PEP8 to the world
at low cost."
He has something of an ulterior motive to get to a more "== tolerant"
world, but few, if any, commenters see things his way; as "Todd" put
it:
Using "==" is a brittle approach that will only work if you are lucky and only get certain sort of data types. "is" works no matter what. The whole point of Python in general and pep8 in particular is to encourage good behavior and discourage bad behavior, so you have an uphill battle convincing people to remove a rule that does exactly that.
Furthermore, as David Mertz pointed out, there are some important concepts that may be getting swept under the rug:
Moreover, I would strongly discourage any instructor from papering over the difference between equality and Identity. I guess in a pure functional language there's no difference. But in Python it's of huge importance.
As noted REPEATEDLY, this isn't just about 'is None'. As soon as you see these, it is a CRUCIAL distinction:
a = b = []
c, d = [], []
Nary a None in sight, yet the distinction is key.
In the example, all four variables are assigned to an empty list, but a and b are assigned to the same list. So:
>>> a == c
True
>>> a is b
True
>>> a is c
False
>>> c is d
False
Adding elements to a will add them to b and vice versa,
which is decidedly not the case for the other two lists.
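The effect of that sharing can be seen by mutating the lists. This is a small sketch that extends Mertz's two assignments; the appended values are arbitrary.

```python
a = b = []      # one list object bound to two names
c, d = [], []   # two separate list objects

a.append("x")   # visible through b as well, since it is the same list
c.append("y")   # d is a different list and stays empty

shared = (a is b)
separate = (c is not d)
```

After the two append() calls, b holds one element while d is still empty, even though all four lists compared equal beforehand.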
D'Aprano thinks the dangers of using an equality test for None are a bit overblown, but using is is still beneficial:
There are a bunch of reasons, none of which on their own are definitive, but together settle the issue (in my opinion).
- Avoid rare bugs caused by weird objects.
- Slightly faster and more efficient.
- Expresses the programmer's intent.
- Common idiom, so the reader doesn't have to think about it.
It looks rather unlikely that we will see any changes to PEP 8 for either of the ideas raised in these two threads. It is important to recognize what PEP 8 is (and is not)—no matter what IDEs and linters do by default. Hopefully the PEP's goals and intent were reinforced in the discussions. Meanwhile, Barker has been working on changes to the PEP to remove Python-2-specific language from it.
Other communities might not appreciate this kind of discussion, some of which can question the foundations of the language at times. But Python (and the python-ideas mailing list in particular) seems to welcome it for the most part. Over the years, those sorts of discussions have led to PEPs of various kinds—some adopted, others not—and to a better understanding of the underpinnings of the language and its history.
