How Django Uses Deferred Imports To Scale

Django's Smart Use Of Python's importlib

This post aims to clear up some of the ambiguities Django introduces for people learning it. They initially perplexed me: I found them hard to grasp and articulate, and for my first years in Python they were akin to magic.

The idea of putting strings in Django settings and having them resolve to Python code seemed very indirect. How did that work? Was it a feature of Python itself? Or was it some Rails-like metaprogramming sorcery? Was The Zen of Python's "Explicit is better than implicit" being challenged?

The concept of resolving strings into importable code reaches beyond Django. Sphinx enables extensions this way, for instance. So do command line interfaces in standard Python, such as $ python -m (PEP 338) and unittest's CLI.

This article documents where import strings arise, so we know how to recognize them when we see them, especially in Django. Finally, we'll look at how they work under the hood in terms of the broader Python language, then build something useful with them and release it as open source.

"Import strings" have a lot of useful applications. I'd call them a necessity in a framework like Django, or else there'd be race conditions and circular dependencies. Django's loading of settings, applications and models is actually rather intricate, and in my opinion, well-executed.

We've all seen INSTALLED_APPS, where Python string literals declare applications that are loaded later. To make Django's extensive usage of import strings concrete, let's document some examples.

Django's string imports

There are also other settings in Django that load modules, classes, and functions via strings:

DJANGO_SETTINGS_MODULE

The first and most famous import string in Django is DJANGO_SETTINGS_MODULE. This is imported via import_module() in django/conf/__init__.py.

The string you use for it loads a Python module, which equates to a file. If the current directory is importable (on your Python path, for example via site-packages/) and your settings are at project/settings/local.py, then set DJANGO_SETTINGS_MODULE=project.settings.local.
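As a rough illustration of how the dotted string maps to a file on disk, here is a minimal sketch using only the standard library. The demo_project package, the DEMO_SETTINGS_MODULE variable, and the load_settings() helper are made up for this example; Django's own loading in django/conf/__init__.py does considerably more.

```python
import importlib
import os
import sys
import tempfile
from pathlib import Path


def load_settings(env_var="DEMO_SETTINGS_MODULE"):
    """Import the module named by an environment variable (hypothetical helper)."""
    module_path = os.environ[env_var]
    return importlib.import_module(module_path)


# Build a throwaway layout: demo_project/settings/local.py
root = Path(tempfile.mkdtemp())
package = root / "demo_project" / "settings"
package.mkdir(parents=True)
(root / "demo_project" / "__init__.py").write_text("")
(package / "__init__.py").write_text("")
(package / "local.py").write_text("DEBUG = True\n")

# Make the directory importable, then point the variable at the dotted path.
sys.path.insert(0, str(root))
importlib.invalidate_caches()
os.environ["DEMO_SETTINGS_MODULE"] = "demo_project.settings.local"

settings = load_settings()
print(settings.__name__, settings.DEBUG)
```

Each dot in the string corresponds to one directory level (or the final .py file) below a directory on the import path.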

Settings variables

These are accessible as attributes of django.conf.settings at runtime.

variable

example of import string

INSTALLED_APPS

INSTALLED_APPS = ['path.to.myapp']

ROOT_URLCONF

ROOT_URLCONF = 'myapp.urls'

WSGI_APPLICATION

WSGI_APPLICATION = 'develtech.wsgi.application'

STATICFILES_FINDERS

STATICFILES_FINDERS = ('django.contrib.staticfiles.finders.FileSystemFinder',)

AUTH_USER_MODEL

This isn't a "pure" import string. It is an app_label.ModelName reference, resolved via django.apps.apps.get_model().

AUTH_USER_MODEL = 'user.User'

MIDDLEWARE

MIDDLEWARE = (
   'django.contrib.sessions.middleware.SessionMiddleware',
)

TEMPLATES

Inside of BACKEND and OPTIONS['context_processors']

TEMPLATES = [
    {
        'BACKEND': 'django.template.backends.django.DjangoTemplates',
        'OPTIONS': {
            'context_processors': [
                'django.template.context_processors.request'
            ],
        },
    },
]

AUTHENTICATION_BACKENDS

AUTHENTICATION_BACKENDS = [
    'guardian.backends.ObjectPermissionBackend',
    'allauth.account.auth_backends.AuthenticationBackend',
]

STATICFILES_STORAGE

STATICFILES_STORAGE = 'django.core.files.storage.FileSystemStorage'

See source code of FileSystemStorage

EMAIL_BACKEND

EMAIL_BACKEND = 'django.core.mail.backends.smtp.EmailBackend'

URL Routes

You've probably seen that ROOT_URLCONF is itself an import string, but the code inside urls.py files also uses them.

First, let's do a real object example:

from django.contrib.auth.views import LogoutView
from django.urls import re_path

urlpatterns = [
    # A real view object is passed directly; no string lookup is involved.
    re_path(r'^logout/', LogoutView.as_view(next_page='/'), name='logout'),
]

Django's routing system also allows import strings via include(), which accepts a dotted path to a URLconf module (a Python file with urlpatterns inside it).

from django.urls import include, re_path

urlpatterns = [
    re_path(r'^accounts/', include('allauth.urls')),
]

Where allauth.urls is allauth/urls.py
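To make the string-or-module idea concrete, here is a toy sketch. toy_include() and the demo_urls module are invented for this example and only mimic the first step of what Django's include() does (the real function returns more than the pattern list).

```python
import importlib
import sys
import types


def toy_include(urlconf):
    """Accept a module object or a dotted path; return its urlpatterns."""
    if isinstance(urlconf, str):
        # The string is only turned into a module at this point.
        urlconf = importlib.import_module(urlconf)
    return urlconf.urlpatterns


# Register a fake URLconf module so the example is self-contained.
demo_urls = types.ModuleType("demo_urls")
demo_urls.urlpatterns = ["login/", "logout/"]
sys.modules["demo_urls"] = demo_urls

patterns_from_string = toy_include("demo_urls")
patterns_from_module = toy_include(demo_urls)
print(patterns_from_string == patterns_from_module)
```

Either form ends at the same module object; the string form simply postpones the import until the function runs.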

Import strings also appear when you declare error handlers in urls.py, such as django.conf.urls.handler404:

handler404 = 'based.django.views.errors.page_not_found'

Models

The next place you'll see string references to objects is in relational models, such as django.db.models.ForeignKey. Here is an excerpt taken directly from Django's documentation:

from django.db import models

class Car(models.Model):
    manufacturer = models.ForeignKey(
        'Manufacturer',
        on_delete=models.CASCADE,
    )

This establishes a relationship with the Manufacturer class, referenced by name rather than by importing it.
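Why does a string help here? A toy sketch, not Django's implementation: the ModelRegistry and ForeignKey classes below are simplified stand-ins written for this example. The point is that a name can be recorded before the class it refers to exists, and resolved once that class is registered.

```python
class ModelRegistry:
    """Toy registry: resolves string references once the target is registered."""

    def __init__(self):
        self._models = {}
        self._pending = {}  # model name -> callbacks waiting for it

    def register(self, cls):
        self._models[cls.__name__] = cls
        # Fulfil any references that were waiting on this name.
        for callback in self._pending.pop(cls.__name__, []):
            callback(cls)
        return cls

    def resolve_later(self, name, callback):
        if name in self._models:
            callback(self._models[name])
        else:
            self._pending.setdefault(name, []).append(callback)


registry = ModelRegistry()


class ForeignKey:
    def __init__(self, to):
        self.to = to  # stays a string until the target class is registered
        registry.resolve_later(to, self._set_target)

    def _set_target(self, cls):
        self.to = cls


@registry.register
class Car:
    manufacturer = ForeignKey("Manufacturer")  # Manufacturer is not defined yet


target_before = Car.manufacturer.to  # still the string at this point


@registry.register
class Manufacturer:
    pass


print(target_before, Car.manufacturer.to)
```

With a direct class reference, Car could not be defined before Manufacturer, and two models pointing at each other could not be defined at all.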

Template engine

Templates are probably the most intricate usage of import strings.

https://docs.djangoproject.com/en/1.11/ref/templates/api/#loader-types

Template tags

Here is an example taken from the django documentation (concatenated so you can see both import string cases):

Engine(
    libraries={
        'myapp_tags': 'path.to.myapp.tags',
        'admin.urls': 'django.contrib.admin.templatetags.admin_urls',
    },
    builtins=['myapp.builtins'],
)

This is what it looks like when a template engine is configured directly. The libraries entries, e.g. path.to.myapp.tags, become loadable under their key, while the builtins are available automatically in every template that engine renders. More often, though, engines are configured declaratively via a settings file:

TEMPLATES = [{
    'BACKEND': 'django.template.backends.django.DjangoTemplates',
    'OPTIONS': {
        'libraries': {
            'myapp_tags': 'path.to.myapp.tags',
            'admin.urls': 'django.contrib.admin.templatetags.admin_urls',
        },
        'builtins': ['myapp.builtins'],
    },
}]

Despite the opportunity to make custom tags built in like Django's default tags (added in django/template/engine.py), it's more common for Django developers to opt in to template tags via the {% load libraryname %} tag, which picks up files inside the templatetags/ directory of applications, or the key used in OPTIONS['libraries'].

Dig deeper into Django's template engine

To dig deeper into Django's templates, I recommend django/template/base.py

Check out the {% load %} template tag on GitHub. It adds the library to a registry, and further down the line the library is loaded via import_library.

Context processors

Context processors allow information to be added to the template context. For the Node.js programmers out there, these are sort of like passing contextual information through Express middleware.

In Django settings:

TEMPLATES = [{
    'BACKEND': 'django.template.backends.django.DjangoTemplates',
    'OPTIONS': {
        'context_processors': [
            'django.template.context_processors.request'
        ],
    },
}]
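A rough sketch of what happens with that list, assuming nothing beyond the standard library. build_context(), the local import_string() helper, and the demo_context_processors module are hypothetical stand-ins; Django's real machinery lives in its template backends. Each processor is a callable that takes the request and returns a dict, and the dicts are merged.

```python
import sys
import types
from importlib import import_module


def import_string(dotted_path):
    """Resolve 'package.module.attr' to the attribute (simplified helper)."""
    module_path, attr_name = dotted_path.rsplit(".", 1)
    return getattr(import_module(module_path), attr_name)


def build_context(request, processor_paths):
    """Call each processor with the request and merge the dicts it returns."""
    context = {}
    for path in processor_paths:
        processor = import_string(path)
        context.update(processor(request))
    return context


# Fake module holding two processors, so the example runs on its own.
demo = types.ModuleType("demo_context_processors")
demo.request = lambda request: {"request": request}
demo.site_name = lambda request: {"site_name": "example"}
sys.modules["demo_context_processors"] = demo

fake_request = {"path": "/"}
context = build_context(
    fake_request,
    ["demo_context_processors.request", "demo_context_processors.site_name"],
)
print(sorted(context))
```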

In Django plugins

One of the reasons import strings are used is that they make third-party extensions easier to implement.

Wherever an import string is used, Django settings can also point to third-party plugins that fit the interface/class. For a first one, EMAIL_BACKEND supports third-party extensions, like anymail/django-anymail (which I recommend!)

variable

project + example of import string

EMAIL_BACKEND

anymail/django-anymail

EMAIL_BACKEND = 'anymail.backends.mandrill.EmailBackend'

STATICFILES_STORAGE

jschneier/django-storages

STATICFILES_STORAGE = 'storages.backends.s3boto3.S3Boto3Storage'

TEMPLATES

nigma/django-easy-pjax (example taken from README):

TEMPLATES=[
    {
        "BACKEND": "django.template.backends.django.DjangoTemplates",
        "DIRS": [...],
        "APP_DIRS": True,
        "OPTIONS": {
            "builtins": [
                "easy_pjax.templatetags.pjax_tags"
            ],
            "context_processors": [
                "django.template.context_processors.request",
                "django.template.context_processors.static",
                # ...
            ]
        }
    }
]

In addition, third-party extensions have their own variables.

variable

project + example of import string

GUARDIAN_GET_INIT_ANONYMOUS_USER

django-guardian/django-guardian

GUARDIAN_GET_INIT_ANONYMOUS_USER = 'app.models.get_anonymous_user_instance'

ACCOUNT_FORMS

pennersr/django-allauth

ACCOUNT_FORMS = {
    'login': 'myapp.app.user.forms.account.LoginForm',
}

DEBUG_TOOLBAR_PANELS

jazzband/django-debug-toolbar (hey cool, django jazzband!)

DEBUG_TOOLBAR_PANELS = (
   'debug_toolbar.panels.versions.VersionsPanel',
   # .. and so on
)

You can even plug in dmclain/django-debug-toolbar-line-profiler:

if 'debug_toolbar_line_profiler' in INSTALLED_APPS:
    DEBUG_TOOLBAR_PANELS += (
        'debug_toolbar_line_profiler.panel.ProfilingPanel',
    )

Machinery behind string imports

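Django makes use of two general functions. The first is Django's own django.utils.module_loading.import_string; the second is the standard library's importlib.import_module(), which it wraps. See the source in django/utils/module_loading.py (shown here as of Django 1.11, which still supported Python 2 via six):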

def import_string(dotted_path):
    """
    Import a dotted module path and return the attribute/class designated by the
    last name in the path. Raise ImportError if the import failed.
    """
    try:
        module_path, class_name = dotted_path.rsplit('.', 1)
    except ValueError:
        msg = "%s doesn't look like a module path" % dotted_path
        six.reraise(ImportError, ImportError(msg), sys.exc_info()[2])

    module = import_module(module_path)

    try:
        return getattr(module, class_name)
    except AttributeError:
        msg = 'Module "%s" does not define a "%s" attribute/class' % (
            module_path, class_name)
        six.reraise(ImportError, ImportError(msg), sys.exc_info()[2])

In the end, it's a friendly wrapper around Python standard library's import_module() that handles errors better and allows accessing variables, functions, and classes in the module (file) loaded.
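On Python 3 the same idea needs no six. A minimal sketch using only the standard library; this is an adaptation for illustration, not a copy of Django's current source:

```python
from importlib import import_module


def import_string(dotted_path):
    """Import a dotted path and return the attribute named by its last part."""
    try:
        module_path, attr_name = dotted_path.rsplit(".", 1)
    except ValueError as err:
        # No dot at all: there is no module/attribute split to make.
        raise ImportError(
            "%s doesn't look like a module path" % dotted_path
        ) from err

    module = import_module(module_path)

    try:
        return getattr(module, attr_name)
    except AttributeError as err:
        raise ImportError(
            'Module "%s" does not define a "%s" attribute/class'
            % (module_path, attr_name)
        ) from err


# Usage: the string is only resolved when the function is called.
OrderedDict = import_string("collections.OrderedDict")
print(OrderedDict.__name__)
```

Converting both failure modes into ImportError means callers only need to handle one exception type, whether the module or the attribute is missing.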

For a deeper dive, take a look at Lib/importlib/__init__.py and the rest of Lib/importlib/.

More fun browsing source code

For that matter, why stop there? The source of the official Python implementation is available to read at python/cpython. Fun times!

Use tags to browse specific releases, such as v3.6.3.

Branches for different release streams of Python are available:

  • Plain old python/cpython is the next coming release (3.7 as of 2017-11-24)

  • Branch 2.7 is where Python 2 is being maintained, tentative end-of-life 2020-01-01 (the Python docs also mention this thread on the python-dev list).

  • And other branches like:

    • 3.4 (end-of-life 2019-03-16)

    • 3.5 (end-of-life 2020-09-13)

    • 3.6 (end-of-life 2021-12-23)

Outside of Django

Older examples

My earliest exposure to superb usage of string import was from my favorite Python programmers.

First, Armin Ronacher's usage of it in Flask before they switched from plain-old unittest to pytest (which is fine, because pytest is awesome). It's viewable in flask/testsuite/__init__.py of Flask 0.10. This would move through flask's test modules and collect the available test suites.

This next one took some digging to find:

In the early days of pypa/warehouse, before it switched from pallets/werkzeug to Pylons/pyramid, there was a great central Warehouse object by Donald Stufft that would scour and load up models. Remnants of it are in my fork at warehouse/application.py.

A lot of these were phased out one way or another by libraries that encourage more convention. So those days of clever Python sorcery, while fondly remembered, are more and more often being usurped by libraries like pytest over unittest, and Pyramid over plain-old Werkzeug.

More current examples

  • In modern flask: flask configurations, e.g.

    app.config.from_object('yourapplication.default_settings')
    
  • tensorflow/tensorflow uses delayed imports "to avoid pulling in large dependencies ... and allows [them] only to be loaded when they are used". Here is TensorFlow's LazyLoader class:

    class LazyLoader(types.ModuleType):
        """Lazily import a module, mainly to avoid pulling in large dependencies.
        `contrib`, and `ffmpeg` are examples of modules that are large and not always
        needed, and this allows them to only be loaded when they are used.
        """
    
        # The lint error here is incorrect.
        def __init__(self, local_name, parent_module_globals, name):  # pylint: disable=super-on-old-class
            self._local_name = local_name
            self._parent_module_globals = parent_module_globals
    
            super(LazyLoader, self).__init__(name)
    
        def _load(self):
            # Import the target module and insert it into the parent's namespace
            module = importlib.import_module(self.__name__)
            self._parent_module_globals[self._local_name] = module
    
            # Update this object's dict so that if someone keeps a reference to the
            #   LazyLoader, lookups are efficient (__getattr__ is only called on lookups
            #   that fail).
            self.__dict__.update(module.__dict__)
    
            return module
    
        def __getattr__(self, item):
            module = self._load()
            return getattr(module, item)
    
        def __dir__(self):
            module = self._load()
            return dir(module)
    

    And the implementation in the main tensorflow module:

    from tensorflow.python.util.lazy_loader import LazyLoader
    contrib = LazyLoader('contrib', globals(), 'tensorflow.contrib')
    del LazyLoader
    

    That is a clever way to make a friendly API that balances features while staying performant.

  • Sphinx has string-level module resolution peppered everywhere. For instance, when resolving a module or function with sphinx autodoc, there's a need to resolve the Noodle in .. autoclass:: Noodle .

    Another prime example in sphinx is the extensions variable in your conf.py:

    extensions = [
        'sphinx.ext.autodoc', 'sphinx.ext.doctest', 'sphinx.ext.intersphinx',
        'sphinx.ext.todo', 'sphinx.ext.viewcode', 'alagitpull'
    ]
    

    These strings end up being resolved in load_extension().
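A toy version of that extension-loading step, under stated assumptions: the App class, its registered list, and the demo_ext module are invented stand-ins, and the real load_extension() does more bookkeeping. What the sketch keeps is the convention that an extension is a module exposing a setup(app) function.

```python
import importlib
import sys
import types


class App:
    """Stand-in for the application object handed to each extension."""

    def __init__(self):
        self.extensions = {}  # extension name -> metadata returned by setup()
        self.registered = []  # things extensions have added (toy attribute)


def load_extension(app, extname):
    """Import the module named by the string and call its setup(app)."""
    if extname in app.extensions:
        return  # already loaded
    module = importlib.import_module(extname)
    setup = getattr(module, "setup", None)
    if setup is None:
        raise ImportError("extension %r has no setup() function" % extname)
    app.extensions[extname] = setup(app) or {}


# Fake extension module so the example is self-contained.
def _demo_setup(app):
    app.registered.append("noodle")
    return {"version": "0.1"}


demo_ext = types.ModuleType("demo_ext")
demo_ext.setup = _demo_setup
sys.modules["demo_ext"] = demo_ext

app = App()
for name in ["demo_ext"]:
    load_extension(app, name)
print(app.extensions)
```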

Putting it into practice

Finally, you can use import strings with your own libraries as a way to make your code more reusable. For instance, Django 1.11 has a slugify function. On every Django website I build, there are special cases where its default behavior is unsatisfactory. Often, the rules for how you'd handle slugification depend on the niche of the website.

On devel.tech, the default behavior for slugifying "C++" is to remove the plus signs. So it shows up as "c", which collides with the "C" programming language. "C#" is also trimmed down to "c". Model fields that autogenerate slugs will then append numbers to them: "c-2", "c-3".

What if we could make Django slugify "C++" as "cpp" and "C#" as "c-sharp"?

Term    django.utils.text.slugify    Better
C       c (correct)                  N/A
C++     c                            cpp
C#      c                            c-sharp

There are more generic cases, such as $ coming out blank with Django's stock django.utils.text.slugify. The right answer could depend on the region of the website, since many nations have their own dollar (e.g. USD, AUD, CAD). Should it be US$ to USD, AU$ to AUD, and so on?

Not just that, but when slugifying URLs, we are space sensitive and may prefer abbreviations/short names. For instance, New York City being nyc instead of new-york-city. What would a person on a smartphone type into Google?

Term                 django.utils.text.slugify    What you (may) want
New York City        new-york-city                nyc
Y Combinator         y-combinator                 yc
Portland             portland                     pdx
Texas                texas                        tx
$                    '' (empty)                   usd, aud, etc?
US$                  us                           usd
A$                   a                            aud
bitcoin              bitcoin                      btc
United States        united-states                usa
League of Legends    league-of-legends            league
Apple® iPod Touch    apple-ipod-touch             ipod-touch
GNU/Linux            gnulinux                     GNU/Linux [1]

So there are two problems. First, almost universally, the default slugify utilities in Django can lose valuable context information. Second, there's a need to handle custom cases depending on the needs of the website. One-size-fits-all solutions are possible to attempt, but an Australian website doesn't want to print $ as USD without asking. A gaming website may not want to slugify League of Legends as lol, which is ambiguous with Laugh Out Loud; league sums it up better.

So we know this isn't unique to me; it would apply to many Django developers. Yay, an opportunity to make an open source project!

So let's make the system that handles slugification into a list of filters. Remember context_processors? We can use import strings as a way to "plug-in" callback functions to handle slugification cases. In our settings:

SLUGIFY_PROCESSORS = [
     'myproject.myapp.slugify.slugify_programming_languages',
     'myproject.myapp.slugify.slugify_geo',
]

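Here's an example of what slugify_programming_languages could look like in myproject/myapp/slugify.py:

import re

def slugify_programming_languages(value):
    # Processors receive the raw value ("C++"), before Django lowercases it,
    # so match case-insensitively.
    return re.sub(r'c\+\+', 'cpp', value, flags=re.IGNORECASE)

def slugify_geo(value):
    value = value.replace('United States', 'us')
    return value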

Let's wire SLUGIFY_PROCESSORS into a customized slugify() function that falls back on Django's (1.11+) default behavior:

from django.conf import settings
from django.utils.module_loading import import_string
from django.utils.text import slugify as django_slugify

def slugify(value, allow_unicode=False):
    if hasattr(settings, 'SLUGIFY_PROCESSORS'):
        for slugify_fn_str in settings.SLUGIFY_PROCESSORS:
            slugify_fn_ = import_string(slugify_fn_str)
            value = slugify_fn_(value)

    return django_slugify(value, allow_unicode)
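To see the pipeline run end to end without a Django project, here is a standalone sketch. Everything in it is an assumption made for illustration: base_slugify() is only a rough stand-in for django.utils.text.slugify, the processors are passed as an argument instead of being read from settings, and the demo_slugify module is fake. It also resolves each list of import strings only once, via functools.lru_cache.

```python
import re
import sys
import types
from functools import lru_cache
from importlib import import_module


def import_string(dotted_path):
    """Resolve 'package.module.attr' to the attribute (simplified helper)."""
    module_path, attr_name = dotted_path.rsplit(".", 1)
    return getattr(import_module(module_path), attr_name)


@lru_cache(maxsize=None)
def load_processors(paths):
    """Resolve a tuple of import strings once and reuse the callables."""
    return tuple(import_string(path) for path in paths)


def base_slugify(value):
    """Rough stand-in for Django's slugify: drop punctuation, hyphenate spaces."""
    value = re.sub(r"[^\w\s-]", "", value.lower())
    return re.sub(r"[-\s]+", "-", value).strip("-_")


def slugify(value, processors=()):
    # Custom processors run first, then the default behavior cleans up.
    for processor in load_processors(tuple(processors)):
        value = processor(value)
    return base_slugify(value)


# Fake module with one processor, so the example is self-contained.
demo = types.ModuleType("demo_slugify")
demo.programming = lambda value: re.sub(
    r"c\+\+", "cpp", value, flags=re.IGNORECASE
)
sys.modules["demo_slugify"] = demo

print(slugify("C++"))                                 # c
print(slugify("C++", ["demo_slugify.programming"]))   # cpp
```

Repeated imports are already cheap because Python caches modules in sys.modules, so the cache here mostly saves the string splitting and attribute lookups on hot paths.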

This could be used as a custom slug function for django-extensions or django-autoslug. We can also make it available as a template filter:

from django import template
from django.template.defaultfilters import stringfilter

from ..text import slugify as _slugify

register = template.Library()


@register.filter(is_safe=True)
@stringfilter
def slugify(value):
    return _slugify(value)

To demonstrate the above code, I split off part of devel.tech's slugification code into develtech/django-slugify-processor (pypi). The README has instructions on how to configure and implement it.

Wrapping up

The delayed resolution of imports via strings plays an instrumental role in the Django framework, and in a host of other Python applications too. The machinery behind it is part of Python's standard library.

There is a balance to strike when using import strings in your project. If you value reusability and customization, have a recurring pattern that's package-worthy, and don't want to pull in the whole kitchen sink by default, import strings are the solution in the end, one way or another.

As Django's success in the field shows, import strings also let codebases grow large while avoiding the import-order problems caused by circular dependencies.

Finally, we closed with a homemade example of using import strings. You can check out the source code, package, and docs.

[1] Just an easter egg to see if anyone's reading :)