An "enum" for Python 3
Designing an enumeration type (i.e. "enum") for a language may seem like a straightforward exercise, but the recently "completed" discussions over Python's PEP 435 show that it has a few wrinkles. The discussion spanned several long threads in two mailing lists (python-ideas, python-dev) going back to January in this particular iteration, but the idea is far older than that. A different approach was suggested in PEP 354, which was proposed in 2005 but rejected at that time, largely due to lack of widespread interest. A 2010 discussion also led nowhere (at least in terms of the standard library), but the most recent discussions finally bore fruit: Guido van Rossum accepted PEP 435 on May 9.
The basic idea is to have a class that implements an enum, which, in Python, might look a lot like:
from enum import Enum
class Color(Enum):
    red = 1
    green = 2
    blue = 3
That would effectively allow Color.green (and the others) to be used as constants.
Not only would Color.blue have a value, but it would also have a
name ('blue') and an order (based on the declaration order). Enums can
also be iterated over, so that:
for color in Color:
    print(color, color.name, color.value)
gives:
Color.red red 1
Color.green green 2
Color.blue blue 3
Along the way, there were several different enum proposals made. Ethan Furman offered one that incorporated multiple types of enum, including ones for bit flags, string-valued enums, and automatically numbered sequences. Alex Stewart came up with a different syntax for defining enums to avoid the requirement to specify each numeric value. Neither made it to the PEP stage, though pieces of both were adopted into the first draft of PEP 435, which was authored by Eli Bendersky and Barry Warsaw.
There are a couple of fairly obvious motivations for adding enums, which were laid out in the PEP. An immutable set of related, constant values is a useful construct. Making them their own type, rather than just using sequences of some other basic type (like integer or string), means that error checking can be done (e.g. rejecting duplicates) and that nonsensical operations can raise errors (e.g. Color.blue * 42). Finally, it is convenient to be able to declare enum members once but still be able to get a string representation of the member name (i.e. without some kind of overt assignment like: green.name='green').
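Those motivations can be checked against the enum module as it ships in the standard library today. The following is a minimal sketch, not an excerpt from the PEP; the raises() helper and the Broken class are hypothetical names introduced only for illustration.

```python
from enum import Enum

class Color(Enum):
    red = 1
    green = 2
    blue = 3

# Each member knows its own name and value; no manual bookkeeping is needed.
name, value = Color.green.name, Color.green.value

def raises(exc_type, func):
    """Return True if calling func() raises exc_type (helper for this sketch)."""
    try:
        func()
    except exc_type:
        return True
    return False

# Arithmetic is not defined for plain Enum members.
multiply_fails = raises(TypeError, lambda: Color.blue * 42)

# Members cannot be rebound once the class exists.
rebind_fails = raises(AttributeError, lambda: setattr(Color, "red", 10))

def define_duplicate_name():
    class Broken(Enum):
        red = 1
        red = 2  # reusing a member name is rejected at class creation

duplicate_name_fails = raises(TypeError, define_duplicate_name)

print(name, value, multiply_fails, rebind_fails, duplicate_name_fails)
```

Note that only duplicate names are rejected here; duplicate values are a separate question that comes up later in the discussion.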
Some of the use cases mentioned early in the discussion of the PEP are for values like stdin and stdout, the flags for socket() or seek() calls, HTTP error codes, opcodes from the dis (Python bytecode disassembly) module, and so forth. One of the questions that was immediately raised about the original version of the PEP was its insistence that "Enums are not integers!", so ordered comparisons like:
Color.red < Color.green
would raise an exception, though equality tests would not:
print(Color.green == 2)
True
To some, that seemed to run directly counter to the whole idea of an enum
type, but allowing ordered comparisons (by making enum members integers) has some
unexpected consequences, as Warsaw described. Two different enums could be
compared with potentially nonsensical results:
print(MyAnimal.cat == YourAnimal.dog)
True
In general, the belief is that "named integers" is a small subset of the
use cases for enums, and that most uses do not need ordered comparisons.
However, the final accepted PEP does have an IntEnum variant
that provides the ordering desired by some. IntEnum is also a
subclass of int, so its members can be used to replace user-facing
constants in the
standard library that are already treated as integers (e.g. HTTP error codes,
socket() and seek() flags, etc.).
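The difference between the two variants can be sketched with the standard library's enum module as it exists today. HttpStatus is a hypothetical class made up for this example, not a set of constants from the standard library. Unlike the early draft described above, a plain Enum member in the shipped module does not compare equal to its integer value.

```python
from enum import Enum, IntEnum

class Color(Enum):
    red = 1
    green = 2

# HttpStatus is a hypothetical example class for this sketch.
class HttpStatus(IntEnum):
    ok = 200
    not_found = 404

# Ordered comparison of plain Enum members raises TypeError.
try:
    Color.red < Color.green
    plain_enum_is_ordered = True
except TypeError:
    plain_enum_is_ordered = False

# A plain Enum member is not equal to its underlying value.
plain_equals_int = (Color.green == 2)

# IntEnum members behave like the integers they wrap.
int_enum_is_ordered = HttpStatus.ok < HttpStatus.not_found
int_enum_equals_int = (HttpStatus.not_found == 404)
int_enum_is_int = isinstance(HttpStatus.ok, int)
sorted_names = [status.name for status in sorted(HttpStatus, reverse=True)]

print(plain_enum_is_ordered, plain_equals_int)
print(int_enum_is_ordered, int_enum_equals_int, int_enum_is_int, sorted_names)
```

Because IntEnum members are real integers, existing code that compares against or does arithmetic on the old integer constants should keep working when those constants are replaced.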
A second revision of the PEP was posted in April, after lengthy discussion both in python-dev and python-ideas. Furman offered up another proposal, this time as an unnumbered PEP with four separate classes for different types of enums. Two different views of enums arose in the discussion, as Furman summarized:
The critical aspect of using or not using an integer as the base type is: what happens when an enumerator from one class is compared to an enumerator from another class? If the base type is int and they both have the same value, they'll be equal -- so much for type safety; if the base type is object, they won't be equal, but then you lose your easy to use int aspect, your sorting, etc.
Worse, if you have the base type be an int, but check for enumeration membership such that Color.red == 1 == Fruit.apple, but Color.red != Fruit.apple, you open a big can of worms because you just broke equality transitivity (or whatever it's called). We don't want that.
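The trade-off Furman describes is visible in the two base types that ended up in the standard library. This is a small sketch using today's enum module; the four classes are hypothetical and exist only to put the two behaviors side by side.

```python
from enum import Enum, IntEnum

# Base type "object": members of different enums are never equal.
class Color(Enum):
    red = 1

class Fruit(Enum):
    apple = 1

# Base type int: members are equal whenever their values are equal.
class IntColor(IntEnum):
    red = 1

class IntFruit(IntEnum):
    apple = 1

plain_cross_equal = (Color.red == Fruit.apple)        # type-safe, no int behavior
plain_equals_value = (Color.red == 1)

int_cross_equal = (IntColor.red == IntFruit.apple)    # int behavior, no type safety
int_chain_equal = (IntColor.red == 1 == IntFruit.apple)

print(plain_cross_equal, plain_equals_value)
print(int_cross_equal, int_chain_equal)
```

Either choice keeps equality transitive: the plain Enum is equal to neither the integer nor the other enum, while the IntEnum is equal to both.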
Furman's proposal looked overly complex to Bendersky and others commenting on a fairly short python-ideas thread. Meanwhile, in python-dev, another monster thread was spinning up. The first objection to the revised PEP was to raising a NotImplementedError when doing ordered comparisons of enum members. That was quickly dispatched with a recognition that TypeError made far more sense. Other issues, such as the ordered comparison issue that was handled with IntEnum in the final version, did not resolve quite as quickly.
One question, originally raised by Antoine Pitrou, concerned the type of the enum members. The early PEP revisions considered Color.red to not be an instance of the Color class, and Warsaw strongly defended that view. At some level, that makes sense (since the members are actually attributes of the class), but it is confusing in other ways. In a sub-thread, Van Rossum, Warsaw, and others looked at the pros and cons of the types of enum members, as well as implementation details of various options. In the end, Van Rossum made some pronouncements on various features, including the question of member type, so:
isinstance(Color.blue, Color)
True
is now an official part of the specification.
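A short sketch of what that decision means in practice, using the enum module from today's standard library. The lookups by value and by name are shown only to illustrate that every route to a member yields the same instance of the class.

```python
from enum import Enum

class Color(Enum):
    red = 1
    green = 2
    blue = 3

# Each member is an instance of the enum class that defines it.
member_is_instance = isinstance(Color.blue, Color)
member_type_is_class = type(Color.blue) is Color

# Calling the class looks up an existing member by value rather than
# creating a new object; indexing looks one up by name.
by_value = Color(3)
by_name = Color["blue"]

print(member_is_instance, member_type_is_class, by_value is by_name)
```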
As Python's benevolent dictator for life (BDFL), which is Van Rossum's only-semi-joking title, he can put an end to arguments and/or "bikeshedding" about language features. In the same thread, he made some further pronouncements (along with a plea for a halt to the bikeshedding). It is a privilege that he exercises infrequently, but it is clearly useful to the project to have someone in that role. As with Linus Torvalds for the kernel, it can be quite helpful to have someone who can stop a seemingly endless thread.
Van Rossum's edicts came after Furman summarized the outstanding issues (after a summary request from Georg Brandl). That is a fairly common occurrence in long-running Python threads: someone will try to boil down the differences into a concise list of outstanding issues. Another nice feature of Python discussions is their tone, which is generally respectful and flame-free. Participants certainly disagree, sometimes strenuously, but the tone is refreshingly different from many other projects' mailing lists.
Not everyone is happy with the end result for enums, however. Andrew Cooke is particularly sad about the outcome. He points out that several expected behaviors for enums are not present in PEP 435:
class Color(Enum):
    red = 1
    green = 1
is not an error; Color.green is an alias for Color.red
(a dubious "feature", he noted with a bit of venom). In addition, there is a way to avoid having to assign values for each enum member (auto-numbering, essentially), but its syntax is clunky:
Color = Enum('Color', 'red green blue')
Beyond having to repeat the class name as a string (which violates the
"don't repeat yourself" (DRY) principle), it starts the numbering from one,
rather than zero. Nick Coghlan responded
to Cooke's complaints by more or less agreeing with the criticism. There
is still room for improvement in Python enums, but PEP 435 represents a
solid step forward, according to Coghlan.
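Both of Cooke's complaints can be reproduced with the enum module in today's standard library. The sketch below also uses the module's unique decorator, which is an opt-in way to reject duplicate values; Strict and Shade are hypothetical names chosen for this example.

```python
from enum import Enum, unique

class Color(Enum):
    red = 1
    green = 1  # same value as red, so green becomes an alias

alias_is_same_object = Color.green is Color.red
iterated = list(Color)                     # aliases are skipped in iteration
all_names = list(Color.__members__)        # but still listed here

# Duplicate values are only an error when explicitly requested.
try:
    @unique
    class Strict(Enum):
        red = 1
        green = 1
    unique_rejects_duplicates = False
except ValueError:
    unique_rejects_duplicates = True

# The functional API numbers members automatically, starting from one.
Shade = Enum("Shade", "red green blue")
auto_values = [member.value for member in Shade]

print(alias_is_same_object, iterated, all_names)
print(unique_rejects_duplicates, auto_values)
```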
It is instructive to watch the design of a language feature play out in public, as it does for Python (and other languages). Enums are something that the developers will have to live with for a long time, so it is not surprising that there would be lots of participation and examination of the feature from many different angles. While PEP 435 probably didn't completely satisfy anyone's full set of requirements, there is still room for more features, both in the standard library and elsewhere, as Coghlan pointed out. The story of enums in Python likely does not end here.
