Inheritance versus composition
"Inheritance" is something that most students learn about early in their study of object-oriented programming (OOP). But one of the seminal books about OOP recommends favoring "composition" over inheritance. Ariel Ortiz came to PyCon in Cleveland, Ohio to describe the composition pattern and to explain the tradeoffs between using it and inheritance.
Ortiz is a full-time faculty member at Tecnológico de Monterrey, which is a private university in Mexico. He noted that the title of his talk, "The Perils of Inheritance", sounded like "clickbait"; he jokingly suggested that perhaps he should have named it: "4 dangers of inheritance; you won't believe number 3!". That elicited a good laugh, but he said that clickbait was not his intent.
He has been teaching computer science for more than 30 years, using many languages, including Python. He likes Python and uses it for several courses, including data structures, web development, and compiler construction. He started with Python 2.0 in 2001 or so.
Definitions
In order to proceed, Ortiz said, he needed to define the two concepts at hand. "Inheritance" in the OOP sense starts with one class, known as the base class or superclass, and another class that is related to it. That second class is known as the derived class or subclass. The derived class has various attributes, such as variables and methods, that have come directly from the base class. Those attributes can be overridden in the derived class; new attributes can be added as well.
"Composition", on the other hand, has the composite or wrapper class and a component that is being wrapped, sometimes known as the "wrapee". Inheritance embodies the idea of an "is a" relationship, while composition creates a "has a" relationship.
In Python terms, you might have a Vehicle class that is used as the base for a derived Bicycle class, which would be created as follows:

    class Bicycle(Vehicle):
        pass

So, a Bicycle "is a" Vehicle. An example of composition might be a class Engine that is used by another class:

    class Car:
        def __init__(self):
            self.engine = Engine()

So, a Car "has a" (an) Engine.
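The two relationships can be put side by side in a small runnable sketch. The article only names the classes; the wheels attribute and the describe() and start() methods here are assumptions made up for illustration.

```python
class Vehicle:
    """Hypothetical base class; only its name comes from the article."""

    def __init__(self, wheels):
        self.wheels = wheels

    def describe(self):
        return f"vehicle with {self.wheels} wheels"


class Bicycle(Vehicle):
    """A Bicycle "is a" Vehicle: it inherits from it and overrides describe()."""

    def __init__(self):
        super().__init__(wheels=2)

    def describe(self):
        # Extend the inherited behavior rather than replacing it outright.
        return "bicycle, a " + super().describe()


class Engine:
    """Hypothetical component class."""

    def start(self):
        return "engine started"


class Car:
    """A Car "has an" Engine: it holds one and delegates to it."""

    def __init__(self):
        self.engine = Engine()

    def start(self):
        return self.engine.start()


bike = Bicycle()
car = Car()
print(isinstance(bike, Vehicle))  # True: inheritance makes a subtype
print(isinstance(car, Engine))    # False: composition does not
print(bike.describe())
print(car.start())
```

Note that Car exposes only what it chooses to forward; nothing else about Engine leaks into Car's interface.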
The famous book Design Patterns: Elements of Reusable Object-Oriented Software suggests favoring composition over inheritance. At the time it was published, over 20 years ago, most OO programmers were favoring inheritance in languages like C++ and Java. But, even all these years later, inheritance is still the first and main tool that programmers reach for as part of their code reuse strategy.
Inheritance and composition are two mechanisms for code reuse. Inheritance is "white-box" (or even better, "crystal-box", Ortiz said) reuse. The derived class can see (too) many of the details of the base class. He prefers "black-box" reuse, which is what composition provides. The implementation of the component object is kept as a secret from the composite class. That secrecy is not because the object developer does not want people to know how it does its job, but because knowing those details is irrelevant. The component object can simply be used through its interface without needing to know anything further about its guts.
There are some advantages to inheritance, however. It is the easiest way to reuse code, he said; in Python you just list the base classes that you want to inherit from in the derived class definition. It is also easy to change the inherited implementation and to add to it.
Inheritance has disadvantages as well, of course. For one, the relationship between the base and derived classes is statically fixed; technically, that is not true for Python because of its dynamic nature, but it is for C++ and Java. Inheritance provides only weak encapsulation, so derived classes can use parts of the base class in ways that the designer did not intend; name collisions can also occur.
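As a rough illustration of why the "statically fixed" point does not strictly hold in Python, the sketch below picks a base class at run time using the built-in three-argument type(). The Walker, Swimmer, and make_animal_class() names are hypothetical and do not come from the talk.

```python
class Walker:
    def move(self):
        return "walking"


class Swimmer:
    def move(self):
        return "swimming"


def describe(self):
    # Relies on move(), which comes from whichever base was chosen.
    return f"animal that is {self.move()}"


def make_animal_class(base):
    # type(name, bases, namespace) builds a new class while the program runs,
    # so the base class is an ordinary run-time value.
    return type("Animal", (base,), {"describe": describe})


LandAnimal = make_animal_class(Walker)
SeaAnimal = make_animal_class(Swimmer)

print(LandAnimal().describe())
print(SeaAnimal().describe())
```

Once a class has been created, though, its instances are still tied to that base, so this is flexibility in how classes are built, not the per-object flexibility that composition offers.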
Beyond that, derived classes get everything from the base class, even things that they don't want. As the creator of Erlang, Joe Armstrong, put it: "You wanted a banana but what you got was a gorilla holding the banana and the entire jungle." And, finally, any changes in the base class impact all of the classes deriving from it. So, for example, adding a new method to the base class can cause name collisions all over the hierarchy.
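A name collision of that kind might look like the following sketch. All of the class and method names are hypothetical: BaseV1 and BaseV2 stand for two releases of the same library class, and the Report classes contain identical application code written against the first release.

```python
class BaseV1:
    """First release of a hypothetical library class."""

    def save(self):
        return "saved"


class BaseV2:
    """Later release: adds validate() and makes save() depend on it."""

    def validate(self):
        return True  # the new contract: return a bool

    def save(self):
        if not self.validate():
            raise ValueError("report failed validation")
        return "saved"


class ReportV1(BaseV1):
    def validate(self):
        # Application helper with its own contract: return a list of problems.
        return []


class ReportV2(BaseV2):
    def validate(self):
        # Same application code; it now silently overrides the new base method.
        return []


def try_save(obj):
    try:
        return obj.save()
    except ValueError as exc:
        return f"ValueError: {exc}"


print(try_save(ReportV1()))  # works against the first release
print(try_save(ReportV2()))  # the empty list is falsy, so save() raises
```

Nothing in the subclass changed between the two runs; the breakage comes entirely from the base class gaining a method with the same name.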
Example
In order to show some of the differences between inheritance and composition, he presented two implementations of a simple last-in-first-out (LIFO) linked list that can be seen in a GitHub repository (and in his slides or the YouTube video of the talk). The code implements LinkedList as one class along with an InsertCounter class that is meant to count all of the insertions done to the list (it ignores removals from the list so the value is not the same as the length).
The example is a bit contrived, perhaps, but it does show that changing the implementation of the insert_all() method (which takes an iterable and adds each element to the list) in the base class actually affects the derived class. It leads to the count being incorrect in the InsertCounter object.
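The sketch below shows one way such a miscount can arise; it is a reconstruction under assumptions, not the code from the talk's repository. LinkedListV1 and LinkedListV2 stand for the base class before and after an edit to insert_all(), and make_insert_counter() is a hypothetical helper that applies the same subclass code to either version.

```python
class _Node:
    def __init__(self, value, next_node):
        self.value = value
        self.next = next_node


class LinkedListV1:
    """Base class before the change."""

    def __init__(self):
        self._head = None
        self._size = 0

    def insert(self, value):
        self._head = _Node(value, self._head)
        self._size += 1

    def insert_all(self, iterable):
        # V1 links the nodes directly and never calls insert().
        for value in iterable:
            self._head = _Node(value, self._head)
            self._size += 1

    def remove(self):
        if self._head is None:
            raise IndexError("remove from an empty list")
        node = self._head
        self._head = node.next
        self._size -= 1
        return node.value

    def __len__(self):
        return self._size


class LinkedListV2(LinkedListV1):
    """Stands in for the edited base class (subclassed only to avoid repetition)."""

    def insert_all(self, iterable):
        # V2 "refactors" insert_all() to reuse insert().
        for value in iterable:
            self.insert(value)


def make_insert_counter(base):
    """Apply identical InsertCounter code to the given base class."""

    class InsertCounter(base):
        def __init__(self):
            super().__init__()
            self.counter = 0

        def insert(self, value):
            self.counter += 1
            super().insert(value)

        def insert_all(self, iterable):
            values = list(iterable)
            self.counter += len(values)
            super().insert_all(values)

    return InsertCounter


def run(base):
    counter = make_insert_counter(base)()
    counter.insert(42)
    counter.insert_all([1, 2, 3])
    counter.remove()
    return len(counter), counter.counter


print(run(LinkedListV1))  # (3, 4)
print(run(LinkedListV2))  # (3, 7): each bulk insert is counted twice
```

With the second version, the base class's insert_all() calls self.insert(), which dispatches to the overridden method in the derived class, so the three bulk insertions are counted once in insert_all() and again in insert().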
The composition version passes the LinkedList into the initialization of the InsertCounter:

    counter = InsertCounter(LinkedList())

After that, the counter object can be used in code that looks exactly the same as it was for the inheritance version:

    counter.insert(42)
    counter.insert_all([1, 2, 3])
    counter.remove()
    print(f'len = {len(counter)}, counter = {counter.counter}')

The final print statement would yield:

    len = 3, counter = 4
But, as Ortiz pointed out, there is a lot that InsertCounter gets for free when it uses inheritance that needs to be explicitly added to the class when it uses composition. Many methods (e.g. remove(), clear(), __len__(), etc.) that came with the LinkedList via inheritance needed to be implemented for the class when it uses composition. On the other hand, the composed InsertCounter was not affected by the changes to LinkedList. But, on the gripping hand, the implementation of insert_all() was, presumably deliberately, set up to be susceptible to this problem; a more realistic example would be hard to fit into the 30-minute slot for the talk.
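A composed InsertCounter along those lines might look like the sketch below. It is an assumption-laden reconstruction, not the code from the talk: the LinkedList here is a minimal stand-in whose nodes are plain tuples, and the _list attribute name is made up.

```python
class LinkedList:
    """Minimal LIFO linked list; a stand-in, not the code from the talk."""

    def __init__(self):
        self._head = None  # each node is a (value, next_node) tuple
        self._size = 0

    def insert(self, value):
        self._head = (value, self._head)
        self._size += 1

    def insert_all(self, iterable):
        # Calls insert() on the list itself; the wrapper never sees these calls.
        for value in iterable:
            self.insert(value)

    def remove(self):
        if self._head is None:
            raise IndexError("remove from an empty list")
        value, self._head = self._head
        self._size -= 1
        return value

    def clear(self):
        self._head = None
        self._size = 0

    def __len__(self):
        return self._size


class InsertCounter:
    """Wraps a list-like component and counts insertions."""

    def __init__(self, linked_list):
        self._list = linked_list
        self.counter = 0

    def insert(self, value):
        self.counter += 1
        self._list.insert(value)

    def insert_all(self, iterable):
        values = list(iterable)
        self.counter += len(values)
        self._list.insert_all(values)

    # Everything below came for free with inheritance and must now be
    # forwarded by hand.
    def remove(self):
        return self._list.remove()

    def clear(self):
        self._list.clear()

    def __len__(self):
        return len(self._list)


counter = InsertCounter(LinkedList())
counter.insert(42)
counter.insert_all([1, 2, 3])
counter.remove()
print(f'len = {len(counter)}, counter = {counter.counter}')  # len = 3, counter = 4
```

Because insert_all() in the component calls insert() on the component itself, the wrapper's count stays correct no matter how the component implements bulk insertion.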
Ortiz pointed to the forwardable module as a way to avoid having to implement all of that extra code for the composite object (InsertCounter). The module makes it easy to delegate functionality from the composite object to the component. His last change to his example code used forwardable in the InsertCounter.
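The general idea of delegation can be sketched with the standard library alone. The code below does not use or imitate the forwardable API; it swaps in Python's __getattr__() hook as a different technique, and SimpleList is a hypothetical list-backed component.

```python
class SimpleList:
    """Hypothetical list-backed LIFO component."""

    def __init__(self):
        self._items = []

    def insert(self, value):
        self._items.append(value)

    def remove(self):
        return self._items.pop()

    def clear(self):
        self._items.clear()

    def __len__(self):
        return len(self._items)


class InsertCounter:
    def __init__(self, component):
        self._component = component
        self.counter = 0

    def insert(self, value):
        self.counter += 1
        self._component.insert(value)

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails. Private names are
        # not forwarded, which also avoids recursion if _component is unset.
        if name.startswith("_"):
            raise AttributeError(name)
        return getattr(self._component, name)

    def __len__(self):
        # Implicit special-method lookups such as len() bypass __getattr__,
        # so dunder methods still need explicit forwarding.
        return len(self._component)


counter = InsertCounter(SimpleList())
counter.insert(42)
counter.insert(7)
removed = counter.remove()  # forwarded to the component by __getattr__
print(removed, len(counter), counter.counter)
```

The tradeoff is that a catch-all hook forwards every public attribute, whereas an explicit list of delegated names keeps the composite's interface deliberately narrow.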
There are several advantages to composition. The implementation of the component can be chosen at runtime because it can be passed in as part of the initialization of the composite object. Interface changes have a limited ripple effect. If the component interface changes, only the code in the composite object needs to change; users of the composite object are unaffected. With composition, one can create a class that has relationships with multiple components; this is a design pattern known as the façade pattern.
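Choosing the component at run time could look like the following sketch, which reuses the Car and Engine idea from earlier in the article. GasEngine, ElectricEngine, and build_car() are hypothetical names introduced only for this example.

```python
class GasEngine:
    def start(self):
        return "vroom"


class ElectricEngine:
    def start(self):
        return "hum"


class Car:
    def __init__(self, engine):
        # Any object with a start() method will do.
        self.engine = engine

    def start(self):
        return f"car: {self.engine.start()}"


def build_car(kind):
    # The concrete component is selected from a run-time value.
    engines = {"gas": GasEngine, "electric": ElectricEngine}
    return Car(engines[kind]())


car = build_car("gas")
print(car.start())

# The component can even be swapped on an existing object.
car.engine = ElectricEngine()
print(car.start())
```

Callers only ever use Car.start(), so replacing or changing the engine class does not ripple out to them.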
There are, of course, disadvantages to composition. It requires more code than inheritance, as we saw, Ortiz said. It is often more difficult to read the code using composition than it is to read code using inheritance.
His talk was not meant to demonize inheritance; there are good reasons to use it sometimes. In particular, when the base and derived classes are in the same package and are under the control of the same developers, inheritance may well be the right choice. "Inheritance is not wrong", he said; there are many circumstances where it makes sense, especially when there truly is an "is a" relationship.
He ended with a Spanish question that was evidently made popular in a US advertisement: "¿Por qué no los dos?", which means "Why not both?". Inheritance and composition are both useful tools; developers should understand when to use each.
[I would like to thank LWN's travel sponsor, the Linux Foundation, for travel assistance to Cleveland for PyCon.]
| Index entries for this article | |
|---|---|
| Conference | PyCon/2019 |
| Python | Inheritance |
