|
|
Subscribe / Log in / New account

Super Python (part 1)

By Jake Edge
April 19, 2022

A mega-thread in the python-ideas mailing list is hardly surprising, of course; we have covered quite a few of them over the years. A recent example helps shine a light into a dark—or at least dim—corner of the Python language: the super() built-in function for use by methods in class hierarchies. There are some, perhaps surprising, aspects to super() along with wrinkles in how to properly use it. But it has been part of the language for a long time, so changes to its behavior, as was suggested in the thread, are pretty unlikely.

Inheritance and super()

Perhaps a bit of an introduction to Python inheritance is in order, at least for some readers. Python is generally written in an object-oriented fashion, with classes that can incorporate behavior from other classes via various mechanisms. One of those ways is via inheritance: one class directly references a parent class and gets some of its behavior from the parent. That can look like the following:

    class Shape: pass
    
    class Ellipse(Shape): pass

    class Circle(Ellipse): pass

The idea is that inheritance is a way to express an "is a" relationship; in the above, Circle is an Ellipse, which is a Shape, thus, transitively, Circle is also a Shape. Implicitly, Shape is an object, which is the base for all classes in Python. Behavior and attributes that are common to all shapes would be added to a parent class and those would be inherited by all of the descendant classes.

Perhaps this class hierarchy is being used in some sort of graphical display program, so all shapes might have a position and color as attributes. For behavior, shapes might need some basic operations, such as move() and draw(), but those are somewhat beyond the scope of what is needed to explain inheritance. If we modify Shape a bit, we can see this inheritance in action:

    class Shape:
	def __init__(self, x, y, color):
	    self.x = x
	    self.y = y
	    self.color = color

    c = Circle(x = 0, y = 0, color = 'blue')
    print(c.color)  # prints "blue"

The __init__() function initializes a shape with the given attributes and Circle inherits that initialization routine from Shape (by way of Ellipse); Python automatically follows the hierarchy when it cannot find a function and uses the first that it finds. The call to Circle() creates an instance of that type and initializes it using the __init__() from Shape; the result is a Circle object with three instance variables (x, y, and color).

But what if Circle needs to be initialized with additional information specific to that kind of shape? Maybe it needs a radius; a somewhat simplistic way to add that might be the following:

    class Circle(Ellipse):
	def __init__(self, x, y, color, r):
	    self.r = r
	    Ellipse.__init__(self, x, y, color)

Now, a Circle gets initialized with four parameters; it peels off one and calls its parent class's __init__() explicitly. Given the example so far, it could simply go straight to the __init__() method of Shape, but that is generally a bad practice; Ellipse may well have (or get) an __init__() of its own. But now, if the class hierarchy is refactored in some way and a class is inserted between Ellipse and Circle, or Circle gets some other parent, it needs to be changed in the class definition and every member function that specifically refers to Ellipse. The super() function is meant to help with that:

    class Circle(Ellipse):
	def __init__(self, x, y, color, r):
	    self.r = r
	    super().__init__(x, y, color)

super() is effectively a proxy for the parent class, without needing to name it directly. Should the hierarchy change, the new parent class just needs to be named once in the class definition. In a straightforward single-inheritance hierarchy like this, it is clear the path that Python will take to resolve super(); it simply walks the hierarchy upward until it finds a parent that implements the member function in question.

super() and multiple inheritance

But with multiple inheritance, things are a bit different. If we have a new class, that inherits from Circle and another class, there might be some question how methods get resolved—and in what order. If we torture our example a bit further, we might have:

    class Circle(Ellipse):
	def __init__(self, r, **kwargs):
	    self.r = r
	    super().__init__(**kwargs)

    class Shape3D(Shape):
	def __init__(self, z, **kwargs):
	    self.z = z
	    super().__init__(**kwargs)

    class Sphere(Circle, Shape3D):
	def __init__(self, **kwargs):
	    super().__init__(**kwargs)

So Sphere inherits from both Circle and the new Shape3D class, which in turn inherits from Shape. Since we are using keyword arguments to initialize the objects, using **kwargs allows the __init__() functions to just peel off the arguments of interest to them and pass the rest up the chain via super(). A Sphere could be created as follows:

    s = Sphere(x = 0, y = 0, z = -2, r = 2, color = 'red')

In order for that all to work, both parents of Sphere need to have their initialization routines called, and that is exactly what happens. Python has a method resolution order (MRO) that was adopted for Python 2.3 back in 2003. It uses C3 linearization to determine how to order the class hierarchy in a deterministic way that preserves the relationships between the classes; if no such ordering can be found, Python will raise an error when the classes are defined. The class.mro() method can be used to examine the MRO; for Sphere it looks like this:

    [<class '__main__.Sphere'>, <class '__main__.Circle'>,
    <class '__main__.Ellipse'>, <class '__main__.Shape3D'>,
    <class '__main__.Shape'>, <class 'object'>] 

That list shows the order in which the classes will be consulted when trying to resolve a method (or class attribute) for Sphere. It also indicates the path that super() will walk as it works its way through the hierarchy.

super()-thread

It is against that backdrop that Martin Milon ("malmiteria") posted to python-ideas. He said that the MRO is effectively masking problems that the programmer should be required to resolve manually:

in case of multiple [inheritance], resolving a child method from it's parent isn't an obvious task, and mro comes as a solution to that. However, i don't understand why we don't let the programmer solve it. I think this is similar to a merge conflict, and not letting the programmer resolve the conflict feels like silencing an error. This is especially infuriating when you realise that mro doesn't solve all possible scenarios, and then, simply refuses the opportunity to solve it to the programmer. Then, super relying on mro gives off some weird behaviors, mainly, it's possible for a child definition to affect what a call to super means in it's parent. This feels like a side effect (which is the 'too implicit' thing i refer to).

He had a proposed solution to that problem, which is further described in a GitHub project repository, that he calls "explicit method resolution" (EMR). It would make no changes for single inheritance, but would force developers to use a new __as_parent__() method to explicitly choose a particular implementation of a method from among multiple possibilities in the parents, grandparents, and so on of a class. If that is not done, an exception would be raised.

But explicitly choosing the method resolution is already available in Python, David Mertz said: "You don't need `super()` to call `SomeSpecificClass.method(self, other, args)`". Milon agreed, but said that "super acts as a proxy to a parent, and is by default the feature to refer a parent from within a class method". Christopher Barker pointed out that super() is actually a proxy for the parents, not just a parent; "even if the current class inherits from only one class, that superclass may inherit from more than one class".

Milon gave a simple example, which did not use super() at all, that shows how he thinks the MRO is causing problems: two classes, A and B both implement method() and class C inherits from both (class C(A, B)):

Today, a code such as ```C().method()``` works without any problems except of course when the method you wanted to refer to was the method from B. If class A and B both come from libraries you don't own, and for some reason have each a method named the same (named run, for example) the run method of C is silently ignoring the run method of B.

The C3 mechanism uses the order in which two or more parent classes are listed in a class definition to determine the MRO. But Milon is saying that having two methods in the parent classes constitutes a "conflict" that should be resolved by the programmer. As he said in another message: "My point is that : it is not correct to assume it is meaningful to give an order to multiple [inheritance]". Stephen J. Turnbull disagreed:

[...] of course it's *meaningful* to give an order to multiple inheritance. Sometimes that order is *not useful*. Other times it may be *confusing*. Pragmatically, C3 seems to give useful results frequently, unuseful or confusing results infrequently [...]

Steven D'Aprano had a more fundamental objection to the example: "Why are you inheriting from A if you don't want to inherit from A?" He noted that A and B were not designed for multiple inheritance, which is the basis of the problem.

Multiple [inheritance] in Python is **cooperative** -- all of the classes in question have to work together. If they don't, as A and B don't, bad things happen.

You can't just inherit from arbitrary classes that don't work together. "Uncooperative multiple inheritance" is an unsolvable problem, and is best refactored using composition instead.

In that earlier message, Milon also showed an example to better explain where he thinks super() is confusing:

super is not a proxy to parents, even plural, it's a proxy to the next in mro order.

in this case :

class Top:
    def method(self):
        print('Top')
class Left(Top):
    def method(self):
        print('Left')
        super().method()
class Right(Top):
    def method(self):
        print('Right')
        super().method()
class Bottom(Left, Right):
    def method(self):
        print('Bottom')
        super().method()
Bottom().super() would print "Bottom", "Left", "Right", "Top". super, in Left, [refers] to Right, which is not a parent of Left (and is completely absent of it's definition)

He presumably means Bottom().method() there toward the end. That result might be surprising to some—possibly even confusing—but the super() call in Left uses the MRO context of Bottom, which is where the original super() call was made. The MRO for Bottom is: Bottom, Left, Right, Top, object. In that context, Left.method() will call Right.method() via super(). So the result is what many, but probably not all, Python programmers would expect. super() allows for an automatic way to pass a method call up through a set of cooperating classes; as Barker put it: "If you didn't use super in the whole chain— it would stop when it found the method. If you do, then they are all [called], and each one only once." In order to use super(), programmers have to follow some rules; "And then it works predictably. Whether that's helpful or not depends on your use case."

Some of those "rules" are embodied in two classic blog posts, as D'Aprano pointed out. One is "a provocatively titled blog post called 'super considered harmful'" by James Knight, though the title has changed. D'Aprano said that after some discussion, Knight backed down a bit, but still said: "super is great, 'but you can't use it.'". D'Aprano disagreed: "Except of course thousands of people do use it. And it works." The other blog post is Raymond Hettinger's "Python's super() considered super!", which describes how to properly use the feature.

D'Aprano followed up with another group of links to multiple blog posts by Michele Simionato, who wrote the MRO documentation for Python 2.3 linked above. There are two series, one on super() and one on Mixins, and some other topics, including multiple inheritance in various forms. D'Aprano said:

Multiple inheritance in Python works the way it does because it is modelling cooperative MI and the MRO and super are the Right Way to handle cooperative MI.

That doesn't mean that cooperative MI doesn't have problems. Other languages forbid MI altogether because of those problems, or only allow it in a severely restricted version, or use a more primitive form of super. MI as defined by Python is perhaps the most powerful, but also the most potentially complicated, complex, convoluted and confusing.

The discussion, links, and such had given Milon some food for thought, which he appreciated. He planned to regroup and come back with a clearer proposal that incorporates elements from the discussion and tries to address a persistent call for "real life" examples, rather than the somewhat abstract examples he had been using. He posted the results on April 3, which led to the second "half" of the mega-thread. That will provide more opportunities to dig into super() and other Python inheritance topics, which is coming in part 2 of our tale.


Index entries for this article
PythonInheritance
Pythonsuper()


to post comments

Super Python (part 1)

Posted Apr 19, 2022 16:48 UTC (Tue) by NYKevin (subscriber, #129325) [Link] (5 responses)

In "part 2" of that megathread (which is very long and I haven't looked at the whole thing yet), I was rather surprised to see that nobody was objecting to the class C(A(B)): syntax on the grounds that this already means something (specifically, it means "make the parent class of C whatever A(B) returns") and would therefore break backwards-compatibility if anyone is dynamically evaluating their parent classes at runtime.

But OTOH the whole thread is a bit of a disaster, so I guess they had other objections which were more important.

Super Python (part 1)

Posted Apr 19, 2022 16:59 UTC (Tue) by NYKevin (subscriber, #129325) [Link]

Ah, I just needed to read a little further, they did get to that point eventually.

Super Python (part 1)

Posted Apr 19, 2022 17:01 UTC (Tue) by stop50 (subscriber, #154894) [Link] (3 responses)

Dynamic classes exist. Django uses them for their user classes.

Super Python (part 1)

Posted Apr 19, 2022 19:52 UTC (Tue) by NYKevin (subscriber, #129325) [Link] (2 responses)

Honestly, I'm starting to think that this person can get 80% of what they want by turning their mixins into functions like this:

T = typing.TypeVar('T', bound=SomeBaseClass)

def Mixin(Parent: type[T]) -> type[T]:
    class MixinClass(Parent):
        # Put mixin code here.
    return MixinClass

# Elsewhere in the codebase:

class SomeClass(SomeBaseClass):
    # Put the parent's code here.

class Derived(Mixin(SomeClass)):
    # Put the child's code here.

You don't need to futz with metaclasses to do that, you get the nice class Derived(Mixin(SomeClass)): syntax that this person seems to want, as well as an explicit, purely linear inheritance hierarchy, and (I think) you can even run this through a type-linter. The only downside is that pickling tends to dislike such dynamic classes, but frankly pickling is evil anyway.

If even that is too much typing, you can make a decorator that automagicks your mixin classes into Mixin()-like functions.

Super Python (part 1)

Posted Apr 19, 2022 21:48 UTC (Tue) by mathstuf (subscriber, #69389) [Link] (1 responses)

One issue here is that the `MixinClass` will be different from all other usages. If you use something like `isinstance` to detect if a mixin interface is available, you're kind of out of luck. But hey, this is Python, just call those methods and hope a duck quacks instead of a goose honking when you do that ;) .

Super Python (part 1)

Posted Apr 19, 2022 23:23 UTC (Tue) by NYKevin (subscriber, #129325) [Link]

Oh, I don't think this is actually a Good Idea. I just think the author can accomplish what they want without modifying the core language. Python is flexible, but that doesn't mean you should go around flexing it into whatever shape you like.

Real Examples?

Posted Apr 19, 2022 23:05 UTC (Tue) by tialaramex (subscriber, #21167) [Link] (3 responses)

What is Python's co-operative multiple inheritance actually used for?

Are there things I actually do in Python that in fact quietly depend upon multiple inheritance (and thus co-operative multiple inheritance)? Or is this purely for third party stuff, and so we would find all the co-operating classes provided presumably in a single third party component?

The examples presented here start with the usual woolly/ useless OOP stuff where we imagine a class hierarchy like shapes, which is not a real thing anybody should do, and online examples all immediately start naming classes A, B, C, D, E, F and so on suggesting they're purely about an abstract idea and have no connection to real problems.

Real Examples?

Posted Apr 19, 2022 23:17 UTC (Tue) by NYKevin (subscriber, #129325) [Link] (2 responses)

By far the most common use is mixin classes (i.e. classes that add extra functionality when you inherit from them, but which are not intended to stand alone as "complete" classes that you would normally instantiate or use as the sole parent class). See https://docs.djangoproject.com/en/4.0/topics/class-based-... for concrete examples of real code using mixins (in this case, the classes are designed to convert incoming HTTP requests into HTTP responses, for dynamic web content).

Real Examples?

Posted Apr 20, 2022 7:14 UTC (Wed) by tialaramex (subscriber, #21167) [Link] (1 responses)

From that page:

> Not all mixins can be used together, and not all generic class based views can be used with all other mixins. Here we present a few examples that do work

In the context provided by the article about the need for co-operation and necessity of every multiply inheritable class knowing about every other such class, I come away thinking that feature was a bad choice and Python would likely have been better off not providing it at all so as to oblige people who want... something here to figure out what it is they want and build that. I can't imagine that such an arrangement would be worse here, and likely it would be better, and at least much more coherently documented and probably also more extensible.

Real Examples?

Posted Apr 21, 2022 1:23 UTC (Thu) by NYKevin (subscriber, #129325) [Link]

Multiple inheritance, in its modern form, has existed continuously since Python 2.3 (albeit with minor syntax changes in 3.0). We are long past the point where we can say "this feature should not exist," or even "this feature should be less powerful."

Regardless, this is very much in line with Python's "consenting adults" philosophy. If you want to MI random classes together and make a mess, it's your problem to deal with, just like accessing private attributes would be your problem to deal with.

Super Python (part 1)

Posted Apr 20, 2022 3:07 UTC (Wed) by foom (subscriber, #14868) [Link] (1 responses)

Wow, what a blast from the past. I don't follow python much these days, but I thought this discussion was done almost 15 years ago!

Of course, it's still the case that nearly every call to super in real-world python code is incorrect -- it expects super() to simply be a shortcut for "my single declared parent type". If/when that turns out not to be the case at runtime (because super() actually means "next in MRO", which is controllable by subclasses), it's just busted.

Really unfortunate. Oh well.

Super Python (part 1)

Posted Apr 20, 2022 5:03 UTC (Wed) by NYKevin (subscriber, #129325) [Link]

IMHO super() is making the best of a bad situation. Specifically:

0. If you actually want a specific parent class, you can just write Parent.method(self), and it'll work exactly as you would expect, generally without any MRO shenanigans. Of course, if the parent class itself calls super(), then that will invoke the MRO and you are out of luck, but it is always the case that the parent class might decide to do something which you do not want it to do. Furthermore, this was standard before super() existed, so IMHO if you go out of your way to ask for the MRO to be involved, then you really should not be surprised that the MRO is in fact involved. The whole point of super() is that you don't necessarily know what to write for Parent, so you write super() instead.
1. Regardless of language or specific implementation, there is a general, OOP-wide expectation that inheritance hierarchies respect the Liskov substitution principle - that is, if super() calls into a different parent class from the one you are expecting, nothing should break, because both the method you wanted and the method you actually called should satisfy the same invariants and provide the same API. That's the same reason that most static type systems allow you to use runtime polymorphism - corresponding methods should satisfy the same API, or else your program will break. In practice, this is difficult to achieve for __init__() unless you use keyword-only arguments for everything (the recommended workaround for that problem), but most other methods are fine because they have fixed signatures. OTOH, sometimes people write code which is not Liskov-respecting, but there is little that Python can do to prevent that.
2. Python expects *multiple* inheritance to be cooperative. That is, it expects that all of the classes participating in the MI hierarchy are at least vaguely aware of each other and will not step all over each others' toes. If you inherit from random combinations of unrelated classes, then there's a good chance the resulting subclass will just randomly break because private attributes can clash, invariants may get violated, etc. Again, there's little that Python can do (though more widespread use of __private_name_mangling would probably help).


Copyright © 2022, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds