Multiple Mixins (and naming conflicts) in Python

Jun 28, 2020 · #programming #python

Mixins are quite useful. In case you're not familiar with the idea, they're a step in the direction of "composition over inheritance".

Say you're implementing a class and you want a bunch of functions to be available inside it. The idea behind mixins is to implement these functions in separate, independent classes and have your actual class inherit from them. Such utility classes are called "mixins".

In Python, for instance, this is possible due to multiple inheritance.

As an example, assume that we want to implement an Animal class whose objects can both "bark" and "meow". Putting biological concerns aside, this can be implemented by defining the "barking" functionality inside a Barker class and the "meowing" functionality inside a Meower class. Animal can then inherit from both Barker and Meower.

class Barker:
    def bark(self):
        print('Woof!')

class Meower:
    def meow(self):
        print('Meow!')

class Animal(Barker, Meower):
    def greet(self):
        self.bark()
        self.meow()

This has the advantage that if later on we want to add another Animal-ish class which wants the ability to "bark", we simply have to choose Barker as one of the base classes to have the bark() function available in the new class.
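As a small sketch of that reuse, here is a hypothetical Dog class (the name is only for illustration) that mixes in Barker alone:

```python
class Barker:
    def bark(self):
        print('Woof!')


# Hypothetical new class: it only needs barking, so Barker is its only mixin.
class Dog(Barker):
    def greet(self):
        self.bark()


Dog().greet()
# Woof!
```

Dog gets bark() for free and never acquires meow(), because it does not list Meower as a base class.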

While mixins come in super handy when solving real-world problems, they often come with their own set of issues. And some of these issues come from how the programming language you're using implements multiple inheritance.

In the Python world, one such problem is: what happens if two different mixins define the same function?

Let's take a real-world-ish problem to illustrate the issue.

Consider that we're writing a web application serving HTTP requests. At the end of every web request we would like to perform some cleanup related to the database and the cache. Let's also assume that our web framework of choice provides an on_finish hook on individual requests to place such cleanup code.

Implemented using mixins, this is roughly what it could look like.

class DatabaseCleaner:
    def on_finish(self):
        print('Cleaning the database')

class CacheCleaner:
    def on_finish(self):
        print('Cleaning the cache')

class Request(DatabaseCleaner, CacheCleaner):
    pass

Given this code, what do you think Request().on_finish() would print?

Python looks up attributes along the class's method resolution order (MRO), which for Request visits the base classes in the order they are listed. The first match wins, and in our example that is DatabaseCleaner.on_finish. This has the unfortunate side effect that CacheCleaner.on_finish never gets called.
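To see the lookup order concretely, one can inspect the class's `__mro__` attribute. The sketch below repeats the two mixins and adds a hypothetical ReversedRequest class (not part of the example above) to show that swapping the base classes flips which function wins:

```python
class DatabaseCleaner:
    def on_finish(self):
        print('Cleaning the database')


class CacheCleaner:
    def on_finish(self):
        print('Cleaning the cache')


class Request(DatabaseCleaner, CacheCleaner):
    pass


# The MRO is the order in which Python searches classes for an attribute.
print([cls.__name__ for cls in Request.__mro__])
# ['Request', 'DatabaseCleaner', 'CacheCleaner', 'object']

Request().on_finish()
# Cleaning the database


# Hypothetical variant: same mixins, opposite order.
class ReversedRequest(CacheCleaner, DatabaseCleaner):
    pass


ReversedRequest().on_finish()
# Cleaning the cache
```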

This is obviously not what we want. We want to print both "Cleaning the database" as well as "Cleaning the cache".

One solution to this problem is to name those on_finish functions differently, so that DatabaseCleaner implements on_finish_database and CacheCleaner implements on_finish_cache.

While that could work (with a little bit of extra effort), it sounds a little odd. What if those two classes are written by two completely different authors who don't know about each other? Not to mention that this puts the responsibility on the calling code to remember to call each of those on_finish_* functions, which defeats the purpose of using mixins in the first place.
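A rough sketch of what that workaround might look like, assuming the renamed functions described above, shows where the extra effort lands:

```python
class DatabaseCleaner:
    def on_finish_database(self):
        print('Cleaning the database')


class CacheCleaner:
    def on_finish_cache(self):
        print('Cleaning the cache')


class Request(DatabaseCleaner, CacheCleaner):
    # The class using the mixins has to wire up every hook by hand
    # and keep this list current whenever a mixin is added or removed.
    def on_finish(self):
        self.on_finish_database()
        self.on_finish_cache()


Request().on_finish()
# Cleaning the database
# Cleaning the cache
```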

Luckily, Python provides a way to call the "next in line" function: through super(), we can reach the function with the same name in the next class of the method resolution order. Let's rewrite the cleanup function in both mixins (named clean() in this snippet) along the following lines.

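class DatabaseCleaner:
    def clean(self):
        print('Cleaning the database')

        try:
            # Look up the next clean() in line without calling it yet.
            next_clean = super().clean
        except AttributeError:
            # No other class in line defines clean(); nothing more to do.
            pass
        else:
            next_clean()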

Here, inside the try/except/else block, we first try to find the function that's "next in line" by using super(). If one is found, we simply call it. If none is found (which most likely means that this is the last class in line that defines clean), the attribute lookup on super() raises an AttributeError, which we ignore before exiting cleanly.
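Put together, a minimal runnable sketch with both mixins might look like this. It reuses the class names from the earlier example. Note that it looks up `super().clean` without parentheses inside `try` and only calls the result in the `else` branch:

```python
class DatabaseCleaner:
    def clean(self):
        print('Cleaning the database')
        try:
            next_clean = super().clean  # lookup only, no call
        except AttributeError:
            pass
        else:
            next_clean()


class CacheCleaner:
    def clean(self):
        print('Cleaning the cache')
        try:
            next_clean = super().clean  # lookup only, no call
        except AttributeError:
            pass
        else:
            next_clean()


class Request(DatabaseCleaner, CacheCleaner):
    pass


Request().clean()
# Cleaning the database
# Cleaning the cache
```

Keeping the call outside the `try` body is a deliberate choice in this sketch: an AttributeError raised inside the next clean() is a genuine bug and should propagate, not be mistaken for "no next function". Each mixin also still works on its own, since the lookup falls through to `object`, which has no clean().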

While this does add 6 extra lines to the mixin code, I feel it's a small price to pay. As library authors, it's nice to act as good citizens and keep in mind that our users may combine our library with others, which can lead to name clashes. I feel it's a good idea to make our code resilient to such things.

I came across this problem when writing tornado-sqlalchemy, which is a Python package that provides SQLAlchemy integration for Tornado projects. This integration is provided using a DatabaseMixin which makes database Session objects available in the request classes.

When using this library in a work project which also had Sentry integration enabled through sentry-python, I noticed that both tornado-sqlalchemy and sentry-python provided (at least at that time) mixins to perform cleanup at the end of the request. tornado-sqlalchemy did the database cleanup while sentry-python did some cleanup related to Sentry.

But since Tornado provides only a single cleanup function called on_finish, both libraries were overriding the same function. As a result, if a user used both mixins in their application code, one of the two cleanup functions would be skipped completely. Which one was skipped depended on the order of the base classes in the class definition.
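When the framework's base class defines the hook itself, as Tornado's RequestHandler does with on_finish, cooperating mixins can chain with a plain super().on_finish() call. The sketch below illustrates that idea with hypothetical stand-in classes (FakeRequestHandler, DbCleanupMixin and ErrorReportingMixin are invented for this example). It is not the code of tornado-sqlalchemy or sentry-python.

```python
calls = []


class FakeRequestHandler:
    """Hypothetical stand-in for a framework base class that defines the hook."""

    def on_finish(self):
        calls.append('framework')


class DbCleanupMixin:
    def on_finish(self):
        calls.append('database')
        super().on_finish()  # hand over to the next class in the MRO


class ErrorReportingMixin:
    def on_finish(self):
        calls.append('error reporting')
        super().on_finish()  # hand over to the next class in the MRO


# Mixins go before the framework class so their hooks run first.
class Handler(DbCleanupMixin, ErrorReportingMixin, FakeRequestHandler):
    pass


Handler().on_finish()
print(calls)
# ['database', 'error reporting', 'framework']
```

The trade-off is that a bare super().on_finish() raises AttributeError if the mixin is ever used without such a base class, which the try/except variant guards against.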

I've since patched the DatabaseMixin provided by tornado-sqlalchemy so that this problem doesn't exist anymore. But it was still interesting to run into this issue and find the solution.