Inspecting and modifying Python types during type checking
Python has a unique approach to static typing. Python programs can contain type annotations, and even access those annotations at run time, but the annotations aren't evaluated by default. Instead, it is up to external programs to ascribe meaning to those annotations. The annotations themselves can be arbitrary Python expressions, but in practice usually involve helpers from the built-in typing module, the meanings of which external type checkers mostly agree upon. Yet the type system implicitly defined by the typing module and common type checkers is not powerful enough to model all of the kinds of dynamic metaprogramming found in real-world Python programs. PEP 827 ("Type Manipulation") aims to add capabilities to Python's type system to fix this, but reactions in the discussion of the PEP have been mixed.
The problem
Python decorators are functions that take in a function or class as an argument, and return a modified version. A commonly used example is the dataclasses.dataclass() function that takes a class definition and automatically adds a constructor, code to print out instances of the class in human-readable form, and so on.
from dataclasses import dataclass

@dataclass
class Dog:
    name: str
    size: float

print(Dog("Rufus", 9.0))
# Prints "Dog(name='Rufus', size=9.0)"
How can a type checker, which is external to Python, know how a decorator such as dataclass() will modify the code that it is attempting to check? In the specific case of dataclasses, PEP 681 ("Data Class Transforms") specifies a decorator that can be used to annotate decorators that behave in ways similar to dataclass(), so that type checkers can recognize them and take this into account.
from typing import dataclass_transform

# Tell type checkers that this is a decorator similar to @dataclass:
@dataclass_transform()
def my_custom_transformer(function):
    ...

# Now the type checker can understand a class using it:
@my_custom_transformer
class Cat:
    name: str
    coat_color: str
That solution is far from universal, though — it doesn't apply to other kinds of decorator, let alone Python's other metaprogramming facilities such as metaclasses or context managers.
Decorators that modify function definitions mostly don't run into this problem, since they can be declared to return a Callable[..., V] (that is, something with a __call__() method, which Python will treat like a function). The type checker can rely on the return type of the decorator (as instantiated with any generic types from the function being modified) to tell it how the resulting callable can be used. Decorators that modify classes, however, run into the problem that there is currently no way to specify a type that computes modifications to another type in Python.
For example, here is a decorator that removes declared int fields from a class, which cannot currently be given a correct type in Python:
def remove_int_members(clss):
    for name, annotation in list(clss.__annotations__.items()):
        if annotation is int:
            del clss.__annotations__[name]
            if hasattr(clss, name):
                delattr(clss, name)
    return clss
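Applied at run time, the decorator really does strip the int-annotated fields; only the static types are the problem. A self-contained demonstration (the Record class is illustrative):

```python
def remove_int_members(clss):
    # Same decorator as above: drop any member annotated as int.
    for name, annotation in list(clss.__annotations__.items()):
        if annotation is int:
            del clss.__annotations__[name]
            if hasattr(clss, name):
                delattr(clss, name)
    return clss

@remove_int_members
class Record:
    name: str
    size: int

print(Record.__annotations__)  # {'name': <class 'str'>}
```

A type checker, however, still believes that Record has a size member, because nothing in remove_int_members()'s signature can express the transformation.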
This is — despite all appearances — not a niche problem. There are plenty of useful Python libraries that automatically generate or adapt classes, such as object-relational mapping libraries like SQLAlchemy that use type annotations to indicate how fields correspond to database columns, or HTTP libraries like FastAPI that generate client code from an API definition. Currently, those libraries must use code generation (that adds an extra build step), go untyped (which makes them harder to use), or implement type-checker plugins (that require implementing one plugin per mutually-incompatible type checker that users of the library want to use). Even something like the contrived example above could be used to create separate database-facing and customer-facing types that remove sensitive fields, for example.
The solution
Michael Sullivan, Daniel Park, and Yury Selivanov proposed PEP 827 to address this perceived deficiency in Python's type system. It adds features to the typing module that let library authors write modified types using a set of type-level constructs inspired by TypeScript's type-level operators. These features would make it possible to correctly specify the type of a decorator that modifies a class, among other uses.
The most fundamental addition is a new type (IsAssignable[T, S]) that evaluates to a type that corresponds to True when an object of type T can be assigned to a variable of type S, and a type corresponding to False otherwise. The types that IsAssignable evaluates to are not the literal Python values True and False because the PEP authors wanted to avoid requiring type checkers to implement a full Python runtime. Instead, the specific types involved can be freely decided on by individual type checkers, as long as they conform to the interface provided in the PEP.
The True and False types, for example, must be usable in an if expression. A new Iter type would have to be usable in list comprehensions as well. The description of these types is spread throughout the PEP, but the core purpose is to bring control flow (conditionals and loops) into the type system in a way that does not require type checkers to reimplement all of Python's semantics in one go. Iter is essentially used as a signal that a tuple type should be looped over. The True and False types would let Python programmers write type annotations for functions that return different types depending on whether an input type is assignable to another type. For example, here is the type signature of a function that produces a string unless its argument is already duck-typed like a string (i.e., has an interface compatible with that of a string), in which case the argument is passed through unchanged:
def foo[T](input: T) -> T if IsAssignable[T, str] else str: ...
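Under today's type system, this particular signature can only be approximated, for example with typing.overload and a TypeVar bound to str. The following is a rough present-day sketch for comparison, not part of the PEP:

```python
from typing import TypeVar, overload

S = TypeVar("S", bound=str)

# First overload: string-like inputs are passed through with their
# precise type preserved; second overload: everything else becomes str.
@overload
def foo(input: S) -> S: ...
@overload
def foo(input: object) -> str: ...
def foo(input):
    return input if isinstance(input, str) else str(input)

print(foo("already a string"))
print(foo(42))
```

The overload list must enumerate cases by hand, though; IsAssignable would let the condition be written once, directly in the return-type expression.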
That ability becomes more useful when paired with the types introduced in the rest of the PEP. Members[T], for example, takes a class or typed dictionary T and evaluates to a tuple of types representing the class or dictionary's members. The NewProtocol[Ms] and NewTypedDict[Ms] types can then put a tuple of member types back together into a new protocol (Python's equivalent of an interface) or dictionary type. This allows a type annotation to destructure, modify, and reconstitute classes during type checking.
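At run time, the information that Members[T] would expose to the type checker is already accessible through typing.get_type_hints(); a rough analogue of what the type-level tuple would contain (the Point class is illustrative):

```python
from typing import get_type_hints

class Point:
    x: float
    y: float

# Roughly what Members[Point] would evaluate to during type checking:
# an ordered collection of the class's member names and types.
members = tuple(get_type_hints(Point).items())
print(members)  # (('x', <class 'float'>), ('y', <class 'float'>))
```

The PEP's contribution is making this kind of introspection available statically, so that a type checker can follow the destructure-modify-reconstitute cycle without running any code.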
Here is the type of the example remove_int_members() decorator from above using the PEP's new types:
type WithoutInts[T] = [
    Member
    for Member in Iter[Members[T]]
    if not IsAssignable[Member, int]
]

def remove_int_members[T](clss: Class[T]) -> NewProtocol[*WithoutInts[T]]: ...
A type checker that supported the types added in the PEP could evaluate the type of this decorator to correctly check uses of the modified class, even though the modified form of the class never appears in the actual Python source program.
Discussion
The PEP includes a fairly large number of new types, including types for raising errors at type-checking time, types for manipulating function arguments and results, types for handling unions of disjoint types, and more. On seeing this complexity, a natural question might be why the PEP needs to introduce special types that act like built-in Python values, mimicking their semantics, instead of allowing normal Python functions to be used to compute modifications to types. Cornelius Krupp thought that approach would be cleaner and would reduce the complexity of the implementation.
Selivanov disagreed, saying that requiring type checkers to implement a Python runtime in order to type check Python code would be highly non-trivial. Krupp's proposal "shifts the complexity and makes it someone else's problem, which in reality will mean that we're just not solving this problem at all".
Sullivan suggested that if type checkers were to take that approach, functions that compute types "wouldn't really be normal Python functions," since they would be interpreted by the type checker and not Python itself. This would lead to needless confusion between actual Python code and code that merely looks like Python code and is written in Python files, but which is actually executed by a separate program according to its own rules, he said.
Justine Krejcha worried that introducing this extra complexity to the type system would lead to slow type checking and cryptic error messages. She thought that judicious use of the Any type was a more reasonable approach for libraries that have highly dynamic behavior. Other participants expressed similar concerns, including the inevitable discussion of syntax.
The PEP did receive some support in its current form, however. Sebastián Ramírez said that the PEP would "enable so many features in things I've built or wanted to build."
"Philipp A." said: "The functionality in this PEP is something I've been reaching for again and again".
Jelle Zijlstra thought the scale of the PEP was "a bit scary", but that it could "make the type system radically more powerful." Zijlstra and Steve Dower both asked for the PEP to be implemented in at least one type checker for people to experiment with before trying to include it in the typing module in the standard library. Dower wasn't a fan of seeing big, complicated types added to Python code.
Selivanov was dubious about the possibility of getting real-world testing out of the proposal before adding it to the standard library. Today's users rely on integrated development environments (IDEs), and those IDEs rely on their own internal type checking; implementing the PEP's ideas in a single type checker "will not give you any actionable data," he said. Users would also not necessarily need to see complicated types directly, he pointed out. As with any existing code base, maintainers can keep the code tidy by factoring out complex expressions into their own definitions — something that is actually easier with more powerful abstractions.
At the time of writing, discussion of the PEP is still ongoing. There seems to be little danger of a consensus emerging any time soon, but there are several other tangentially related proposals that could make the complexity introduced by this PEP more palatable. A draft PEP would add syntactic sugar for typed dictionaries, for example, that would make creating and manipulating types using PEP 827 types somewhat more streamlined. The Python community has also discussed the viability of introducing more existing Python syntax into type annotations, including the use of tuples and operators.
If Python did adopt the ability to use regular Python functions in type annotations, that would give it an ability similar to Zig's, which lets users write functions that create new types at compile time. Even if Python doesn't go that far, however, its type system has consistently become more complex and flexible over time. It seems likely that, even if this particular PEP is not adopted as proposed, library authors will eventually have the flexibility to implement static types for complex operations if they think the complexity is worth it.
