django-treebeard vs django-mptt (Draft)

Handling Hierarchical / Tree-like DB Schemas

My background

I forget whether I used django-treebeard or django-mptt first; either way, it was years ago. Early on, I didn't know much about the underlying mechanics of how either one worked. So this article will cover both plugins, the SQL techniques they employ (both in general and in depth), their public APIs, internal Python code, communities, and my own experience using them in production.

As for hierarchical data and SQL in general:

I mostly rely on ORMs. I do study plain-old SQL in my spare time. My favorite book for it is Joe Celko's SQL for Smarties.

Eventually, I wanted to develop a faceted search system for my ideas without having to pull in something heavy-duty (like Elasticsearch). This would require a system that could "drill down" deeper into categories.

I was very fortunate that Celko turned out to be one of the few at the forefront of handling hierarchical data in SQL. He is widely credited with popularizing the nested set model. He also wrote another book, Trees and Hierarchies in SQL for Smarties. It's probably the best book on nested data in SQL.

So for further reading on the background mechanics of tree data in SQL, check those out. In future updates I hope to also dig into new stuff like matthiask/django-cte-forest.

django-mptt

community / open source

disclaimer on numbers

I don't advise using contributor count, stars, releases, and commits as the sole factor. They help describe how entrenched a plugin is, but they don't account for whether a plugin lived outside GitHub for years, or for the quality and depth of its commits.

mptt, as of 2017-12-07:

  • 93 contributors

  • ~1600 stars

  • 28 releases

  • 994 commits

Methodology

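Modified Preorder Tree Traversal (MPTT) is a nested-set technique: every node stores left and right values assigned by a depth-first walk of the tree. django-mptt keeps those values alongside a parent foreign key, so in practice it combines Nested Sets with an Adjacency List. It does not store a Materialized Path.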
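To make the traversal concrete, here is a minimal sketch in plain Python rather than SQL or Django. The category names and the helper functions `number_tree` and `descendants` are made up for illustration; the field names `lft` and `rght` mirror the column names django-mptt uses, but nothing here is the library's own code.

```python
from typing import Dict, List, Tuple

# Hypothetical category tree, written as parent -> ordered children.
TREE: Dict[str, List[str]] = {
    "Books": ["Fiction", "Technical"],
    "Fiction": [],
    "Technical": ["SQL", "Python"],
    "SQL": [],
    "Python": [],
}


def number_tree(tree: Dict[str, List[str]], root: str) -> Dict[str, Tuple[int, int]]:
    """Walk the tree depth-first and give every node a (lft, rght) pair."""
    bounds: Dict[str, Tuple[int, int]] = {}
    counter = 1

    def visit(node: str) -> None:
        nonlocal counter
        lft = counter          # number handed out on the way down
        counter += 1
        for child in tree[node]:
            visit(child)
        bounds[node] = (lft, counter)  # number handed out on the way back up
        counter += 1

    visit(root)
    return bounds


def descendants(bounds: Dict[str, Tuple[int, int]], node: str) -> List[str]:
    """A node's descendants are the rows whose interval sits inside its own."""
    lft, rght = bounds[node]
    return sorted(n for n, (l, r) in bounds.items() if lft < l and r < rght)


BOUNDS = number_tree(TREE, "Books")
print(BOUNDS["Technical"])               # (4, 9)
print(descendants(BOUNDS, "Technical"))  # ['Python', 'SQL']
```

The payoff is on the read side: "everything under Technical" becomes a single range comparison instead of a recursive lookup, which is the kind of drill-down query a faceted category system needs.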

I don't know whether MPTT is "more" complicated to keep balanced, since rebalancing anything in SQL is already tedious, especially when nested sets are involved.
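A rough sketch of why writes get tedious with nested sets: adding a single leaf means shifting the left/right values of every row that sits "after" the insertion point. The tree, the numbers, and the `insert_child` helper below are hypothetical and only mimic what a pair of SQL UPDATE statements would do.

```python
from typing import Dict, Tuple

# Hypothetical nested-set rows: name -> (lft, rght).
bounds: Dict[str, Tuple[int, int]] = {
    "Books": (1, 10),
    "Fiction": (2, 3),
    "Technical": (4, 9),
    "SQL": (5, 6),
    "Python": (7, 8),
}


def insert_child(rows: Dict[str, Tuple[int, int]], parent: str, name: str) -> int:
    """Add `name` as the last child of `parent`.

    Returns how many existing rows had to be rewritten to make room.
    """
    new_lft = rows[parent][1]  # the new leaf takes over the parent's old rght
    touched = 0
    for node, (lft, rght) in list(rows.items()):
        shifted = (
            lft + 2 if lft >= new_lft else lft,
            rght + 2 if rght >= new_lft else rght,
        )
        if shifted != (lft, rght):
            rows[node] = shifted
            touched += 1
    rows[name] = (new_lft, new_lft + 1)
    return touched


touched = insert_child(bounds, "Fiction", "Poetry")
print(touched)           # 5: every pre-existing row changed
print(bounds["Poetry"])  # (3, 4)
```

In this small tree, inserting one leaf near the front rewrites all five existing rows. Inserts near the end of the numbering touch fewer rows, so the cost depends on where the new node lands.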

django-treebeard

community / open source

Treebeard, as of 2017-12-07:

  • 30 contributors

  • ~400 stars

  • 18 releases

  • 654 commits

While studying Django source code at scale, I saw divio/django-cms move from django-mptt to django-treebeard. In addition, wagtail/wagtail uses treebeard.

So, two industrial-sized Django CMS systems use treebeard.

Comparison

This is an example of why stars and commits alone, or even a "superior" technique in the innards, may not make a library better.

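django-mptt layers nested-set bookkeeping on top of a parent pointer, whereas treebeard has you pick one strategy per model (adjacency list, materialized path, or nested sets). The combination is supposed to keep reads fast across a wide range of situations.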
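For readers who haven't met the three strategies treebeard offers (adjacency list, materialized path, nested sets), here is a small side-by-side sketch in plain Python. The data, the four-character path segments, and the function names are assumptions chosen for illustration; they are not taken from either library's code.

```python
from typing import Dict, List, Optional, Tuple

# One hypothetical tree, three encodings:
#   Books
#   ├── Fiction
#   └── Technical
#       ├── SQL
#       └── Python

# 1. Adjacency list: each row stores only its parent.
PARENT: Dict[str, Optional[str]] = {
    "Books": None, "Fiction": "Books", "Technical": "Books",
    "SQL": "Technical", "Python": "Technical",
}

# 2. Materialized path: each row stores its whole path from the root,
#    here as fixed-width segments (the width of 4 is an assumption).
PATH: Dict[str, str] = {
    "Books": "0001", "Fiction": "00010001", "Technical": "00010002",
    "SQL": "000100020001", "Python": "000100020002",
}

# 3. Nested sets: each row stores a (lft, rght) interval.
BOUNDS: Dict[str, Tuple[int, int]] = {
    "Books": (1, 10), "Fiction": (2, 3), "Technical": (4, 9),
    "SQL": (5, 6), "Python": (7, 8),
}


def descendants_al(node: str) -> List[str]:
    """Adjacency list: one lookup per level, so depth drives the cost."""
    found: List[str] = []
    frontier = [node]
    while frontier:
        current = frontier.pop()
        children = [n for n, p in PARENT.items() if p == current]
        found.extend(children)
        frontier.extend(children)
    return sorted(found)


def descendants_mp(node: str) -> List[str]:
    """Materialized path: a single prefix match."""
    prefix = PATH[node]
    return sorted(n for n, p in PATH.items() if p.startswith(prefix) and p != prefix)


def descendants_ns(node: str) -> List[str]:
    """Nested sets: a single range comparison."""
    lft, rght = BOUNDS[node]
    return sorted(n for n, (l, r) in BOUNDS.items() if lft < l and r < rght)


print(descendants_al("Technical"))  # ['Python', 'SQL']
print(descendants_mp("Technical"))  # ['Python', 'SQL']
print(descendants_ns("Technical"))  # ['Python', 'SQL']
```

All three give the same answer. What differs is how much work a read takes and how much bookkeeping a write needs, which is the trade-off the rest of this comparison is about.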

The trouble is, if the problem being solved doesn't require that complication, the engineering overhead incurred isn't worth it.

The internals of django-mptt are heavy with metaprogramming. On top of an already complicated SQL technique, that makes the mptt source feel impenetrable.

django-mptt was also late for Django 2.0's release. Nobody was around. The GitHub organization is a username.

According to stars and contributors, treebeard feels like the runt of the litter.

More reading

First, reading the source code of treebeard and its documentation. In particular:

Also, two of Django's major CMS systems, divio/django-cms and wagtail/wagtail, use treebeard. Check them out.

I don't really recommend reading django-mptt's internals. They're very metaclass-heavy. I'll leave the link here for posterity, but it's very complicated: https://github.com/django-mptt/django-mptt/tree/master/mptt

Earlier on, I mentioned two books by Joe Celko. I prefer to stick to his since they cover the material well: SQL for Smarties for general SQL, and Trees and Hierarchies in SQL for Smarties for topics like adjacency lists, materialized paths, and nested sets.

Another author who is authoritative on hierarchical design patterns is Vadim Tropashko, author of SQL Design Patterns.