Type hinting for Python
Python is a poster child for dynamically typed languages, but if Guido van Rossum gets his way, as he usually does as benevolent dictator for life (BDFL), the language will soon get optional support for static type-checking. The discussion and debate have played out since August (at least), but Van Rossum has just posted a proposal that targets Python 3.5, which is due in September 2015, for including this "type hinting" feature. Unlike in many languages (e.g. C, C++, Java), Python's static type-checking would be optional: programs can still be run even if the static checker has complaints.
Static vs. dynamic type checking is one of the big religious wars in computer language design. Statically typed languages rigidly enforce the rules on the types that can be assigned to variables or passed to functions, typically refusing to compile or run programs that violate those rules. Dynamically typed languages (some of which, like Python, are known as "duck typed"), on the other hand, allow an object of any type to be assigned to any variable or passed to any function, which adds a great deal of flexibility, but can also lead to runtime errors that could never occur in a statically typed program.
Both type systems have merits—and proponents—but they are usually mutually exclusive in a particular language. Either dynamic or static typing is inherent in the language's design, so it is rare to have both. What Van Rossum is proposing is to adopt the notation used by the mypy static type-checker for Python to, optionally, specify the types of function parameters and their return types, along with the types of variables. That would allow a static type-checker (either mypy or something else) to reason (and complain) about the wrong types being used.
For function argument and return types, the syntax to annotate functions is already present in Python 3, and has been since its first release. It is a little-used feature that was put into the language to see where it led. One outcome was mypy. Function annotation is fairly straightforward:
def foo(a: int, b: int, c: str) -> str:
    pass
That defines a do-nothing function that takes two integers and a string as
arguments and returns a string.
For those unfamiliar with Python, the "normal" definition of that function
would look like:
def foo(a, b, c):
    pass
As might be guessed, user-defined class names
can be used, as can various built-in aggregation types
(e.g. List[int] or Dict[str, int]). Python function
annotations are completely open-ended (just associating some value with
each function argument and its return value), so the conventions used to specify
type information are largely derived from mypy (which has good documentation
that includes the type conventions).
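To make the "open-ended" point concrete, here is a minimal sketch showing that annotations are simply stored on the function object and are not enforced when the program runs. It assumes the List and Dict names can be imported from a typing module, as they can in later Python releases; the count_words() function is a hypothetical example, not something from the proposal.

```python
from typing import Dict, List  # assumed home of the aggregate type names


def count_words(words: List[str], minimum: int) -> Dict[str, int]:
    """Count each word, keeping only words seen at least `minimum` times."""
    counts = {}
    for word in words:
        counts[word] = counts.get(word, 0) + 1
    return {w: n for w, n in counts.items() if n >= minimum}


# The annotations are just values attached to the function object.
print(count_words.__annotations__)

# Nothing is checked at run time: a tuple is accepted where a list was
# declared. Only a static checker would complain about this call.
print(count_words(("a", "b", "a"), 2))  # {'a': 2}
```

A static checker would flag the second call; the interpreter runs it without complaint, which is the sense in which the checking is optional.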
There is not yet a Python Enhancement Proposal (PEP) for type hinting, though there is work in progress on one. In addition, Van Rossum has put together a theory document that he referenced in his proposal. It describes a new relationship ("is-consistent-with") between types; if type t1 is a subclass of t2, t1 is consistent with t2 (but not vice versa). The Any type is a kind of wildcard: it is consistent with every type (even though Any is not a subclass of any other type), and every type is a subclass of Any (and thus consistent with it).
Those simple rules allow various kinds of reasoning about the types that
get assigned to variables and passed to functions. Van Rossum's ideas come
from a number of sources, but "gradual typing" is clearly central to his
thinking. His theory document refers those looking for "a longer
explanation and motivation" to a gradual typing blog post from Jeremy
Siek. It is, essentially, a blueprint for providing both static and
dynamic typing for a language.
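The is-consistent-with rules are small enough to sketch in a few lines. The code below is only an illustration of the three rules as described here, not how mypy or any other checker implements them; the ANY sentinel, the is_consistent_with() function, and the Person and Employee classes are all hypothetical names.

```python
# Hypothetical sentinel standing in for the proposal's Any type.
ANY = object()


def is_consistent_with(t1, t2):
    """Return True if a value of type t1 may be used where t2 is expected."""
    # Any is consistent with every type, and every type is consistent
    # with Any, so either side being ANY is enough.
    if t1 is ANY or t2 is ANY:
        return True
    # Otherwise a type is consistent only with itself and its base classes.
    return issubclass(t1, t2)


class Person:
    pass


class Employee(Person):
    pass


print(is_consistent_with(Employee, Person))  # True: subclass relationship
print(is_consistent_with(Person, Employee))  # False: not vice versa
print(is_consistent_with(ANY, Employee))     # True: Any acts as a wildcard
print(is_consistent_with(Person, ANY))       # True
```

The asymmetry in the first two calls is what lets a checker reject a Person where an Employee is required, while the ANY cases show how unannotated code can mix freely with annotated code.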
There is more to type hinting than just function annotations, however. The proposal does not add any syntax to the language for declaring variables to be a certain type, so comments with a specific format are used:
x = {} # type: Dict[str, str]
That would declare x to be a dictionary with strings as both keys
and values.
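Because the declaration lives in a comment, the interpreter never sees it. A small sketch of what that means in practice, assuming Dict is importable from a typing module so that a checker can resolve the name:

```python
from typing import Dict  # only the static checker needs this name

x = {}  # type: Dict[str, str]

x["language"] = "Python"  # matches the declared key and value types
x[3] = 5                  # a checker could flag this; the interpreter will not

print(x)
```

The second assignment is exactly the kind of mistake a static checker is meant to catch before the program runs.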
For situations where there are several options for types, the Union type can be used:
def feed(a: Union[Animal, Sequence[Animal]]) -> bool:
    pass
That allows the argument a to either be an instance of the
Animal class or a sequence of Animals (or, of course, any
type that is-consistent-with those). feed()
returns True or False. To simplify types in
declarations, aliases can be used:
point = Tuple[float, float]
def distance(a: point, b: point) -> float:
    pass
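For readers who want something runnable, here is one way the Union and alias examples might be fleshed out. It is a sketch under assumptions: the typing imports use the names found in later Python releases, and the Animal class and both function bodies are invented for illustration.

```python
import math
from typing import Sequence, Tuple, Union


class Animal:
    """Hypothetical class; the proposal does not define Animal."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.fed = False


def feed(a: Union[Animal, Sequence[Animal]]) -> bool:
    """Feed one animal or a sequence of them; report whether any were fed."""
    animals = [a] if isinstance(a, Animal) else list(a)
    for animal in animals:
        animal.fed = True
    return len(animals) > 0


# A type alias is an ordinary assignment.
point = Tuple[float, float]


def distance(a: point, b: point) -> float:
    """Euclidean distance between two points."""
    return math.hypot(b[0] - a[0], b[1] - a[1])


print(feed(Animal("cat")))                  # True
print(feed([]))                             # False
print(distance((0.0, 0.0), (3.0, 4.0)))     # 5.0
```

Note that each parameter carries its own annotation; writing "a, b: point" would annotate only b.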
Forward references are handled by strings with the type name that are to be evaluated later:
class C:
    def foo(self, other: 'C') -> int:
        pass
The class C is being referred to before its definition is complete, so
writing the name as a string (i.e. 'C') defers its evaluation and avoids
a NameError when the method is defined.
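A short sketch of how the string form behaves: the annotation is stored as a plain string and can be resolved to the real class later. It assumes the get_type_hints() helper from the typing module of later Python releases, and it gives foo() a trivial body so the example runs.

```python
from typing import get_type_hints


class C:
    def foo(self, other: 'C') -> int:
        return 0


# The raw annotation is still the string that was written.
print(C.foo.__annotations__['other'])  # C

# Once the class exists, the string can be resolved to the class itself.
hints = get_type_hints(C.foo)
print(hints['other'] is C)  # True
```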
There is more, of course, but that should give a reasonable flavor of the proposal. Van Rossum has chosen a fairly aggressive schedule for adding the feature, which he acknowledged in his message.
The reaction has been largely positive. Part of the reason is that the feature only affects those few who are already using function annotations in some other way (which won't break, at least yet) or who want some amount of static checking added to the language. While there are other ways to do so, Van Rossum's approach is minimally intrusive—it doesn't change the language at all—so those who want to can simply ignore it entirely.
It will be interesting to watch this feature play out over the coming months. Even more interesting, though, will be seeing what uses Python developers find for a standardized static type system. Certain kinds of large projects and organizations are likely to benefit from static typing, so some pieces that might have been written in other languages may stick with Python instead. In any case, it is yet more evidence that Python has come a long way since its genesis in the late 1980s.
