Models - Django Design Patterns and Best Practices (2015)

Chapter 3. Models

In this chapter, we will discuss the following topics:

· The importance of models

· Class diagrams

· Model structural patterns

· Model behavioral patterns

· Migrations

M is bigger than V and C

In Django, models are classes that provide an object-oriented way of dealing with databases. Typically, each class refers to a database table and each attribute refers to a database column. You can make queries to these tables using an automatically generated API.

Models can be the base for many other components. Once you have a model, you can rapidly derive model admins, model forms, and all kinds of generic views. In each case, you would need to write a line of code or two, just so that it does not seem too magical.

Also, models are used in more places than you would expect. This is because Django can be run in several ways. Some of the entry points of Django are as follows:

· The familiar web request-response flow

· Django interactive shell

· Management commands

· Test scripts

· Asynchronous task queues such as Celery

In almost all these cases, the model modules would get imported (as a part of django.setup()). Hence, it is best to keep your models free from any unnecessary dependencies and to avoid importing other Django components, such as views, into them.

In short, designing your models properly is quite important. Now let's get started with the SuperBook model design.

Note

The Brown Bag Lunch

Author's Note: The progress of the SuperBook project will appear in a box like this. You may skip the box but you will miss the insights, experiences, and drama of working in a web application project.

Steve's first week with his client, the SuperHero Intelligence and Monitoring or S.H.I.M. for short, was a mixed bag. The office was incredibly futuristic but getting anything done needed a hundred approvals and sign-offs.

Being the lead Django developer, Steve had finished setting up a mid-sized development server hosting four virtual machines over two days. The next morning, the machine itself had disappeared. A washing machine-sized robot nearby said that it had been taken to the forensic department due to unapproved software installations.

The CTO, Hart, was of great help, however. He asked for the machine to be returned within an hour with all the installations intact. He had also sent pre-approvals for the SuperBook project to avoid any such roadblocks in the future.

Later that afternoon, Steve was having a brown-bag lunch with him. Dressed in a beige blazer and light blue jeans, Hart arrived well in time. Despite being taller than most people and having a clean-shaven head, he seemed cool and approachable. He asked if Steve had checked out the previous attempt to build a superhero database in the sixties.

"Oh yes, the Sentinel project, right?" said Steve. "I did. The database seemed to be designed as an Entity-Attribute-Value model, something that I consider an anti-pattern. Perhaps they had very little idea about the attributes of a superhero those days." Hart almost winced at the last statement. In a slightly lowered voice, he said, "You are right, I didn't. Besides, they gave me only two days to design the whole thing. I believe there was literally a nuclear bomb ticking somewhere."

Steve's mouth was wide open and his sandwich had frozen at its entrance. Hart smiled. "Certainly not my best work. Once it crossed about a billion entries, it took us days to run any kind of analysis on that damn database. SuperBook would zip through that in mere seconds, right?"

Steve nodded weakly. He had never imagined that there would be around a billion superheroes in the first place.

The model hunt

Here is a first cut at identifying the models in SuperBook. As is typical of an early attempt, we have represented only the essential models and their relationships in the form of a class diagram:

[Figure: The model hunt (class diagram)]

Let's forget models for a moment and talk in terms of the objects we are modeling. Each user has a profile. A user can make several comments or several posts. A Like can be related to a single user/post combination.

Drawing a class diagram of your models like this is recommended. Some attributes might be missing at this stage but you can detail them later. Once the entire project is represented in the diagram, it makes separating the apps easier.

Here are some tips to create this representation:

· Boxes represent entities, which become models.

· Nouns in your write-up typically end up as entities.

· Arrows are bi-directional and represent one of the three types of relationships in Django: one-to-one, one-to-many (implemented with Foreign Keys), and many-to-many.

· The field denoting a one-to-many relationship is defined in the model on the "many" side of the relationship, as in an entity-relationship model (ER-model). In other words, the star is where the Foreign Key gets declared.

The class diagram can be mapped into the following Django code (which will be spread across several apps):

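from django.contrib.auth.models import User
from django.db import models

# on_delete is spelled out explicitly; it is required from Django 2.0 onwards


class Profile(models.Model):
    user = models.OneToOneField(User, on_delete=models.CASCADE)


class Post(models.Model):
    posted_by = models.ForeignKey(User, on_delete=models.CASCADE)


class Comment(models.Model):
    commented_by = models.ForeignKey(User, on_delete=models.CASCADE)
    for_post = models.ForeignKey(Post, on_delete=models.CASCADE)


class Like(models.Model):
    liked_by = models.ForeignKey(User, on_delete=models.CASCADE)
    post = models.ForeignKey(Post, on_delete=models.CASCADE)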

Later, we will not reference the User directly but use the more general settings.AUTH_USER_MODEL instead.

Splitting models.py into multiple files

Like most components of Django, a large models.py file can be split up into multiple files within a package. A package is implemented as a directory, which can contain multiple files, one of which must be a specially named file called __init__.py.

All definitions that are to be exposed at the package level must be defined in __init__.py with global scope. For example, if we split models.py into individual classes in corresponding files inside a models subdirectory, such as postable.py, post.py, and comment.py, then the __init__.py file will look like this:

# Explicit relative imports; the implicit form fails on Python 3
from .postable import Postable
from .post import Post
from .comment import Comment

Now you can import models.Post as before.

Any other code in the __init__.py file will be run when the package is imported. Hence, it is the ideal place for any package-level initialization code.

Structural patterns

This section contains several design patterns that can help you design and structure your models.

Pattern – normalized models

Problem: By design, model instances have duplicated data that causes data inconsistencies.

Solution: Break down your models into smaller models through normalization. Connect these models with logical relationships between them.

Problem details

Imagine if someone designed our Post table (omitting certain columns) in the following way:

| Superhero Name | Message | Posted on |
|---|---|---|
| Captain Temper | Has this posted yet? | 2012/07/07 07:15 |
| Professor English | It should be 'Is' not 'Has'. | 2012/07/07 07:17 |
| Captain Temper | Has this posted yet? | 2012/07/07 07:18 |
| Capt. Temper | Has this posted yet? | 2012/07/07 07:19 |

I hope you noticed the inconsistent superhero naming in the last row (and the captain's consistent lack of patience).

If we were to look at the first column, we are not sure which spelling is correct—Captain Temper or Capt. Temper. This is the kind of data redundancy we would like to eliminate through normalization.

Solution details

Before we take a look at the fully normalized solution, let's have a brief primer on database normalization in the context of Django models.

Three steps of normalization

Normalization helps you store data efficiently. Once your models are fully normalized, they will not have redundant data, and each model should contain only data that is logically related to it.

To give a quick example, if we were to normalize the Post table so that we can unambiguously refer to the superhero who posted that message, then we need to isolate the user details in a separate table. Django already creates the user table by default. So, you only need to refer to the ID of the user who posted the message in the first column, as shown in the following table:

| User ID | Message | Posted on |
|---|---|---|
| 12 | Has this posted yet? | 2012/07/07 07:15 |
| 8 | It should be 'Is' not 'Has'. | 2012/07/07 07:17 |
| 12 | Has this posted yet? | 2012/07/07 07:18 |
| 12 | Has this posted yet? | 2012/07/07 07:19 |

Now, it is not only clear that there were three messages posted by the same user (with an arbitrary user ID), but we can also find that user's correct name by looking up the user table.
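
The lookup described above can be sketched with the standard library's sqlite3 module. This is only an illustration under stated assumptions: the table names users and posts and their columns are made up for this example and are not what Django generates, and the names for user IDs 8 and 12 are inferred from the earlier table.

```python
import sqlite3

# In-memory database with hypothetical table names (not Django's own)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE posts ("
    " id INTEGER PRIMARY KEY,"
    " user_id INTEGER NOT NULL REFERENCES users(id),"
    " message TEXT NOT NULL,"
    " posted_on TEXT NOT NULL)"
)

# The name is stored exactly once, in the users table
conn.executemany(
    "INSERT INTO users (id, name) VALUES (?, ?)",
    [(8, "Professor English"), (12, "Captain Temper")],
)
conn.executemany(
    "INSERT INTO posts (user_id, message, posted_on) VALUES (?, ?, ?)",
    [
        (12, "Has this posted yet?", "2012/07/07 07:15"),
        (8, "It should be 'Is' not 'Has'.", "2012/07/07 07:17"),
        (12, "Has this posted yet?", "2012/07/07 07:18"),
        (12, "Has this posted yet?", "2012/07/07 07:19"),
    ],
)

# Join posts to users to recover the single, authoritative spelling
posts_per_user = conn.execute(
    "SELECT users.name, COUNT(*)"
    " FROM posts JOIN users ON users.id = posts.user_id"
    " GROUP BY users.id, users.name"
    " ORDER BY users.id"
).fetchall()
```

Because each post stores only the user ID, a misspelled name such as "Capt. Temper" has nowhere to creep in.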

Generally, you will design your models to be in their fully normalized form and then selectively denormalize them for performance reasons. In databases, Normal Forms are a set of guidelines that can be applied to a table to ensure that it is normalized. Commonly found normal forms are first, second, and third normal forms, although they could go up to the fifth normal form.

In the next example, we will normalize a table and create the corresponding Django models. Imagine a spreadsheet called 'Sightings' that lists the first time someone spots a superhero using a power or superhuman ability. Each entry mentions the known origins, super powers, and location of first sighting, including latitude and longitude.

| Name | Origin | Power | First Used At (Lat, Lon, Country, Time) |
|---|---|---|---|
| Blitz | Alien | Freeze / Flight | +40.75, -73.99; USA; 2014/07/03 23:12 / +34.05, -118.24; USA; 2013/03/12 11:30 |
| Hexa | Scientist | Telekinesis / Flight | +35.68, +139.73; Japan; 2010/02/17 20:15 / +31.23, +121.45; China; 2010/02/19 20:30 |
| Traveller | Billionaire | Time travel | +43.62, +1.45; France; 2010/11/10 08:20 |

The preceding geographic data has been extracted from http://www.golombek.com/locations.html.

First normal form (1NF)

To conform to the first normal form, a table must have:

· No attribute (cell) with multiple values

· A primary key defined as a single column or a set of columns (composite key)

Let's try to convert our spreadsheet into a database table. Evidently, our 'Power' column breaks the first rule.

The updated table here satisfies the first normal form. The primary key (marked with a *) is a combination of 'Name' and 'Power', which should be unique for each row.

| Name* | Origin | Power* | Latitude | Longitude | Country | Time |
|---|---|---|---|---|---|---|
| Blitz | Alien | Freeze | +40.75170 | -73.99420 | USA | 2014/07/03 23:12 |
| Blitz | Alien | Flight | +40.75170 | -73.99420 | USA | 2013/03/12 11:30 |
| Hexa | Scientist | Telekinesis | +35.68330 | +139.73330 | Japan | 2010/02/17 20:15 |
| Hexa | Scientist | Flight | +35.68330 | +139.73330 | Japan | 2010/02/19 20:30 |
| Traveller | Billionaire | Time travel | +43.61670 | +1.45000 | France | 2010/11/10 08:20 |
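
The step from multi-valued cells to one value per cell can be sketched in plain Python. This is a simplified illustration that keeps only the name, origin, and power columns; the dictionary layout and the function name to_first_normal_form are assumptions made for this example.

```python
# Spreadsheet rows where the "powers" cell holds several values
spreadsheet = [
    {"name": "Blitz", "origin": "Alien", "powers": ["Freeze", "Flight"]},
    {"name": "Hexa", "origin": "Scientist", "powers": ["Telekinesis", "Flight"]},
    {"name": "Traveller", "origin": "Billionaire", "powers": ["Time travel"]},
]


def to_first_normal_form(rows):
    """Emit one (name, origin, power) row per power, so each cell holds one value."""
    return [
        (row["name"], row["origin"], power)
        for row in rows
        for power in row["powers"]
    ]


flat_rows = to_first_normal_form(spreadsheet)

# The composite key (name, power) must be unique across the flattened rows
composite_keys = [(name, power) for name, _origin, power in flat_rows]
keys_are_unique = len(composite_keys) == len(set(composite_keys))
```

Each superhero now appears once per power, which is why the name alone can no longer serve as the primary key.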

Second normal form (2NF)

The second normal form must satisfy all the conditions of the first normal form. In addition, it must satisfy the condition that all non-primary key columns must be dependent on the entire primary key.

In the previous table, notice that 'Origin' depends only on the superhero, that is, 'Name'. It doesn't matter which Power we are talking about. So, Origin is not entirely dependent on the composite primary key—Name and Power.

Let's extract just the origin information into a separate table called 'Origins' as shown here:

| Name* | Origin |
|---|---|
| Blitz | Alien |
| Hexa | Scientist |
| Traveller | Billionaire |

Now our Sightings table, updated to comply with the second normal form, looks like this:

| Name* | Power* | Latitude | Longitude | Country | Time |
|---|---|---|---|---|---|
| Blitz | Freeze | +40.75170 | -73.99420 | USA | 2014/07/03 23:12 |
| Blitz | Flight | +40.75170 | -73.99420 | USA | 2013/03/12 11:30 |
| Hexa | Telekinesis | +35.68330 | +139.73330 | Japan | 2010/02/17 20:15 |
| Hexa | Flight | +35.68330 | +139.73330 | Japan | 2010/02/19 20:30 |
| Traveller | Time travel | +43.61670 | +1.45000 | France | 2010/11/10 08:20 |
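
The partial dependency that motivates the second normal form can be checked mechanically. The following plain-Python sketch is an illustration only; the helper name is_determined_by and the tuple layout (name, origin, power) are assumptions made for this example.

```python
# Rows from the 1NF table, reduced to (name, origin, power)
rows_1nf = [
    ("Blitz", "Alien", "Freeze"),
    ("Blitz", "Alien", "Flight"),
    ("Hexa", "Scientist", "Telekinesis"),
    ("Hexa", "Scientist", "Flight"),
    ("Traveller", "Billionaire", "Time travel"),
]


def is_determined_by(rows, key_index, value_index):
    """Return True if every key value maps to exactly one dependent value."""
    seen = {}
    for row in rows:
        key, value = row[key_index], row[value_index]
        # setdefault stores the first value seen for a key and returns it afterwards
        if seen.setdefault(key, value) != value:
            return False
    return True


origin_depends_on_name_alone = is_determined_by(rows_1nf, 0, 1)
power_depends_on_name_alone = is_determined_by(rows_1nf, 0, 2)

# Since Origin depends on Name alone, it can move to its own table
origins = sorted({(name, origin) for name, origin, _power in rows_1nf})
```

Origin is fixed by the name alone, which is only part of the composite key, so it is the column that moves into the Origins table.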

Third normal form (3NF)

In third normal form, the tables must satisfy the second normal form and should additionally satisfy the condition that all non-primary key columns must be directly dependent on the entire primary key and must be independent of each other.

Think about the Country column for a moment. Given the Latitude and Longitude, you can easily derive the Country column. Even though the country where a superpower was sighted is dependent on the Name-Power composite primary key, it is only indirectly dependent on it.

So, let's move the location details into a separate Countries table as follows:

| Location ID | Latitude* | Longitude* | Country |
|---|---|---|---|
| 1 | +40.75170 | -73.99420 | USA |
| 2 | +35.68330 | +139.73330 | Japan |
| 3 | +43.61670 | +1.45000 | France |

Now our Sightings table in its third normal form looks like this:

| User ID* | Power* | Location ID | Time |
|---|---|---|---|
| 2 | Freeze | 1 | 2014/07/03 23:12 |
| 2 | Flight | 1 | 2013/03/12 11:30 |
| 4 | Telekinesis | 2 | 2010/02/17 20:15 |
| 4 | Flight | 2 | 2010/02/19 20:30 |
| 7 | Time travel | 3 | 2010/11/10 08:20 |

As before, we have replaced the superhero's name with the corresponding User ID that can be used to reference the user table.
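
Nothing is lost by splitting the tables, because a join recovers the country for each sighting. Here is a minimal sketch using the standard library's sqlite3 module; the table names locations and sightings and the column names are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE locations ("
    " id INTEGER PRIMARY KEY, latitude REAL, longitude REAL, country TEXT)"
)
conn.execute(
    "CREATE TABLE sightings ("
    " user_id INTEGER, power TEXT,"
    " location_id INTEGER REFERENCES locations(id),"
    " sighted_on TEXT,"
    " PRIMARY KEY (user_id, power))"
)
conn.executemany(
    "INSERT INTO locations VALUES (?, ?, ?, ?)",
    [
        (1, 40.7517, -73.9942, "USA"),
        (2, 35.6833, 139.7333, "Japan"),
        (3, 43.6167, 1.45, "France"),
    ],
)
conn.executemany(
    "INSERT INTO sightings VALUES (?, ?, ?, ?)",
    [
        (2, "Freeze", 1, "2014/07/03 23:12"),
        (2, "Flight", 1, "2013/03/12 11:30"),
        (4, "Telekinesis", 2, "2010/02/17 20:15"),
        (4, "Flight", 2, "2010/02/19 20:30"),
        (7, "Time travel", 3, "2010/11/10 08:20"),
    ],
)

# The country is no longer stored per sighting; the join derives it
usa_sightings = conn.execute(
    "SELECT s.user_id, s.power, l.country"
    " FROM sightings AS s JOIN locations AS l ON l.id = s.location_id"
    " WHERE l.country = ?"
    " ORDER BY s.power",
    ("USA",),
).fetchall()
```

The price of removing the redundancy is the extra join, which is the tradeoff the denormalization discussion returns to.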

Django models

We can now take a look at how these normalized tables can be represented as Django models. Composite keys are not directly supported in Django. The solution used here is to rely on surrogate keys (the id field that Django adds automatically) and specify the unique_together property in the Meta class:

from django.conf import settings
from django.db import models


class Origin(models.Model):
    superhero = models.ForeignKey(settings.AUTH_USER_MODEL,
                                  on_delete=models.CASCADE)
    origin = models.CharField(max_length=100)


class Location(models.Model):
    latitude = models.FloatField()
    longitude = models.FloatField()
    country = models.CharField(max_length=100)

    class Meta:
        # Stands in for the composite key (latitude, longitude)
        unique_together = ("latitude", "longitude")


class Sighting(models.Model):
    superhero = models.ForeignKey(settings.AUTH_USER_MODEL,
                                  on_delete=models.CASCADE)
    power = models.CharField(max_length=100)
    location = models.ForeignKey(Location, on_delete=models.CASCADE)
    sighted_on = models.DateTimeField()

    class Meta:
        # Stands in for the composite key (superhero, power)
        unique_together = ("superhero", "power")
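
At the database level, the combination of a surrogate key and a uniqueness constraint looks roughly like the following sqlite3 sketch. The table definition is hand-written for illustration and is an assumption; it is not the exact SQL that Django generates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE location ("
    " id INTEGER PRIMARY KEY,"          # surrogate key
    " latitude REAL NOT NULL,"
    " longitude REAL NOT NULL,"
    " country TEXT NOT NULL,"
    " UNIQUE (latitude, longitude))"    # plays the role of unique_together
)

insert_sql = "INSERT INTO location (latitude, longitude, country) VALUES (?, ?, ?)"
conn.execute(insert_sql, (40.7517, -73.9942, "USA"))

# A second row with the same coordinates violates the UNIQUE constraint
try:
    conn.execute(insert_sql, (40.7517, -73.9942, "United States"))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM location").fetchone()[0]
```

The surrogate id gives other tables a single column to reference, while the constraint still guarantees what the composite key would have guaranteed.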

Performance and denormalization

Normalization can adversely affect performance. As the number of models increases, the number of joins needed to answer a query also increases. For instance, to find the number of superheroes with the Freeze capability in the USA, you will need to join four tables. Prior to normalization, any information could be found by querying a single table.

You should design your models to keep the data normalized. This will maintain data integrity. However, if your site faces scalability issues, then you can selectively derive data from those models to create denormalized data.

Tip

Best Practice

Normalize while designing but denormalize while optimizing.

For instance, if counting the sightings in a certain country is very common, then add the count as an additional field to the Location model. Because the count is now a regular field, you can use it in other queries through the Django object-relational mapper (ORM), which is not possible with a cached value.

However, you need to update this count each time you add or remove a sighting. You need to add this computation to the save method of Sighting, add a signal handler, or even compute using an asynchronous job.
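
A minimal sketch of the first option, keeping the count in step from save (and its counterpart, delete). The classes below are plain Python stand-ins rather than Django models, and the sighting_count attribute and the saved flag are illustrative assumptions.

```python
class Location:
    """Plain stand-in for the Location model."""

    def __init__(self, country):
        self.country = country
        self.sighting_count = 0  # denormalized value, derived from sightings


class Sighting:
    """Plain stand-in for the Sighting model."""

    def __init__(self, location, power):
        self.location = location
        self.power = power
        self.saved = False

    def save(self):
        # Count only the first save, so that updates do not inflate the total
        if not self.saved:
            self.location.sighting_count += 1
            self.saved = True

    def delete(self):
        if self.saved:
            self.location.sighting_count -= 1
            self.saved = False


usa = Location("USA")
freeze = Sighting(usa, "Freeze")
flight = Sighting(usa, "Flight")
freeze.save()
freeze.save()  # saving again must not double count
flight.save()
count_after_saves = usa.sighting_count
flight.delete()
count_after_delete = usa.sighting_count
```

Every code path that creates or removes a sighting has to go through these methods, which is the maintenance cost of a denormalized field.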

If you have a complex query spanning several tables, such as a count of superpowers by country, then you need to create a separate denormalized table. As before, we need to update this denormalized table every time the data in your normalized models changes.

Denormalization is surprisingly common in large websites because it is a tradeoff between speed and space. Today, space is cheap but speed is crucial to user experience. So, if your queries are taking too long to respond, then you might want to consider it.

Should we always normalize?

Too much normalization is not necessarily a good thing. Sometimes, it can introduce an unnecessary table that can complicate updates and lookups.

For example, your User model might have several fields for their home address. Strictly speaking, you can normalize these fields into an Address model. However, in many cases, it would be unnecessary to introduce an additional table to the database.

Rather than aiming for the most normalized design, carefully weigh each opportunity to normalize and consider the tradeoffs before refactoring.

Pattern – model mixins

Problem: Distinct models have the same fields and/or methods duplicated, violating the DRY principle.

Solution: Extract common fields and methods into various reusable model mixins.

Problem details

While designing models, you might find certain common attributes or behaviors shared across model classes. For example, the Post and Comment models both need to keep track of their created and modified dates. Manually copy-pasting the fields and their associated methods is not a very DRY approach.

Since Django models are classes, object-oriented approaches such as composition and inheritance are possible solutions. However, compositions (by having a property that contains an instance of the shared class) will need an additional level of indirection to access fields.

Inheritance can get tricky. We can use a common base class for Post and Comment. However, there are three kinds of inheritance in Django: concrete, abstract, and proxy.

Concrete inheritance works by deriving from the base class just like you normally would in Python classes. However, in Django, this base class will be mapped into a separate table. Each time you access base fields, an implicit join is needed, which can hurt performance.

Proxy inheritance can only add new behavior to the parent class. You cannot add new fields. Hence, it is not very useful for this situation.

Finally, we are left with abstract inheritance.

Solution details

Abstract base classes are elegant solutions used to share data and behavior among models. When you define an abstract class, it does not create any corresponding table in the database. Instead, these fields are created in the derived non-abstract classes.

Accessing abstract base class fields doesn't need a JOIN statement. The resulting tables are also self-contained with managed fields. Due to these advantages, most Django projects use abstract base classes to implement common fields or methods.

Limitations of abstract models are as follows:

· They cannot have a Foreign Key or many-to-many field from another model

· They cannot be instantiated or saved

· They cannot be directly used in a query since they do not have a manager

Here is how the post and comment classes can be initially designed with an abstract base class:

class Postable(models.Model):
    created = models.DateTimeField(auto_now_add=True)
    modified = models.DateTimeField(auto_now=True)
    message = models.TextField(max_length=500)

    class Meta:
        abstract = True


class Post(Postable):
    ...


class Comment(Postable):
    ...

To turn a model into an abstract base class, you will need to mention abstract = True in its inner Meta class. Here, Postable is an abstract base class. However, it is not very reusable.

In fact, if there was a class that had just the created and modified field, then we can reuse that timestamp functionality in nearly any model needing a timestamp. In such cases, we usually define a model mixin.

Model mixins

Model mixins are abstract classes that can be added as a parent class of a model. Python supports multiple inheritance, unlike other languages such as Java. Hence, you can list any number of parent classes for a model.

Mixins ought to be orthogonal and easily composable. Drop a mixin into the list of base classes and it should work. In this regard, they are more similar in behavior to composition than to inheritance.

Smaller mixins are better. Whenever a mixin becomes large and violates the Single Responsibility Principle, consider refactoring it into smaller classes. Let a mixin do one thing and do it well.
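
The composability described above can be illustrated outside Django with plain Python classes. In this sketch, the names TimestampMixin, LikeableMixin, and Message are hypothetical and chosen only for illustration; each mixin cooperates through super() and knows nothing about the others.

```python
import datetime


class TimestampMixin:
    """Does one thing: records when the object was created."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.created = datetime.datetime.now()


class LikeableMixin:
    """Does one thing: keeps a like counter."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.likes = 0

    def like(self):
        self.likes += 1


class Message:
    def __init__(self, text):
        self.text = text


# Dropping the mixins into the base class list is all that is needed
class Post(TimestampMixin, LikeableMixin, Message):
    pass


post = Post("Hello")
post.like()
```

Either mixin can be removed from the list of base classes without touching the other, which is the orthogonality the section asks for.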

In our previous example, the model mixin used to update the created and modified time can be easily factored out, as shown in the following code:

class TimeStampedModel(models.Model):
    created = models.DateTimeField(auto_now_add=True)
    modified = models.DateTimeField(auto_now=True)

    class Meta:
        abstract = True


class Postable(TimeStampedModel):
    message = models.TextField(max_length=500)
    ...

    class Meta:
        abstract = True


class Post(Postable):
    ...


class Comment(Postable):
    ...

We have two base classes now. However, the functionality is clearly separated. The mixin can be separated into its own module and reused in other contexts.

Pattern – user profiles

Problem: Every website stores a different set of user profile details. However, Django's built-in User model is meant for authentication details.

Solution: Create a user profile class with a one-to-one relation with the user model.

Problem details

Out of the box, Django provides a pretty decent User model. You can use it when you create a super user or log in to the admin interface. It has a few basic fields, such as first and last name, username, and e-mail.

However, most real-world projects keep a lot more information about users, such as their address, favorite movies, or their superpower abilities. From Django 1.5 onwards, the default User model can be extended or replaced. However, official docs strongly recommend storing only authentication data even in a custom user model (it belongs to the auth app, after all).

Certain projects need multiple types of users. For example, SuperBook can be used by superheroes and non-superheroes. There might be common fields and some distinctive fields based on the type of user.

Solution details

The officially recommended solution is to create a user profile model. It should have a one-to-one relation with your user model. All the additional user information is stored in this model:

class Profile(models.Model):
    user = models.OneToOneField(settings.AUTH_USER_MODEL,
                                on_delete=models.CASCADE,
                                primary_key=True)

It is recommended that you set the primary_key explicitly to True to prevent concurrency issues in some database backends such as PostgreSQL. The rest of the model can contain any other user details, such as birthdate, favorite color, and so on.

While designing the profile model, it is recommended that all the profile detail fields be nullable or contain default values. Intuitively, we can understand that users cannot fill out all their profile details while signing up. Additionally, we will ensure that the signal handler also doesn't pass any initial parameters while creating the profile instance.

Signals

Ideally, every time a user model instance is created, a corresponding user profile instance must be created as well. This is usually done using signals.

For example, we can listen for the post_save signal from the user model using the following signal handler:

# signals.py
from django.conf import settings
from django.db.models.signals import post_save
from django.dispatch import receiver

from . import models


@receiver(post_save, sender=settings.AUTH_USER_MODEL)
def create_profile_handler(sender, instance, created, **kwargs):
    if not created:
        return
    # Create the profile object only if the user is newly created
    profile = models.Profile(user=instance)
    profile.save()

Note that the profile model has passed no additional initial parameters except for the user instance.

Previously, there was no specific place for initializing the signal code. Typically, signal handlers were imported or implemented in models.py (which was unreliable). However, with the app-loading refactor in Django 1.7, the location of the application initialization code is well defined.

First, create an __init__.py file in your application's package that points to your app's ProfileConfig:

default_app_config = "profiles.apps.ProfileConfig"

Next, define ProfileConfig as a subclass of AppConfig in apps.py and import the signal handlers in its ready method:

# apps.py
from django.apps import AppConfig


class ProfileConfig(AppConfig):
    name = "profiles"
    verbose_name = "User Profiles"

    def ready(self):
        # Importing the module registers the @receiver handlers
        from . import signals

With your signals set up, accessing user.profile should return a Profile object for all users, even the newly created ones.

Admin

Now, a user's details will be in two different places within the admin: the authentication details in the usual user admin page and the same user's additional profile details in a separate profile admin page. This gets very cumbersome.

For convenience, the profile admin can be made inline to the default user admin by defining a custom UserAdmin as follows:

# admin.py
from django.contrib import admin
from django.contrib.auth.admin import UserAdmin as BaseUserAdmin
from django.contrib.auth.models import User

from .models import Profile


class UserProfileInline(admin.StackedInline):
    model = Profile


# UserAdmin lives in django.contrib.auth.admin, not in django.contrib.admin
class UserAdmin(BaseUserAdmin):
    inlines = [UserProfileInline]


admin.site.unregister(User)
admin.site.register(User, UserAdmin)

Multiple profile types

Assume that you need several kinds of user profiles in your application. There needs to be a field to track which type of profile the user has. The profile data itself needs to be stored in separate models or a unified model.

An aggregate profile approach is recommended since it gives the flexibility to change the profile types without loss of profile details and minimizes complexity. In this approach, the profile model contains a superset of all profile fields from all profile types.

For example, SuperBook will need a SuperHero type profile and an Ordinary (non-superhero) profile. It can be implemented using a single unified profile model as follows:

class BaseProfile(models.Model):
    USER_TYPES = (
        (0, 'Ordinary'),
        (1, 'SuperHero'),
    )
    user = models.OneToOneField(settings.AUTH_USER_MODEL,
                                on_delete=models.CASCADE,
                                primary_key=True)
    # max_length has no effect on an IntegerField, so it is omitted
    user_type = models.IntegerField(null=True, choices=USER_TYPES)
    bio = models.CharField(max_length=200, blank=True, null=True)

    def __str__(self):
        return "{}: {:.20}".format(self.user, self.bio or "")

    class Meta:
        abstract = True


class SuperHeroProfile(models.Model):
    origin = models.CharField(max_length=100, blank=True, null=True)

    class Meta:
        abstract = True


class OrdinaryProfile(models.Model):
    address = models.CharField(max_length=200, blank=True, null=True)

    class Meta:
        abstract = True


class Profile(SuperHeroProfile, OrdinaryProfile, BaseProfile):
    pass

We grouped the profile details into several abstract base classes to separate concerns. The BaseProfile class contains all the common profile details irrespective of the user type. It also has a user_type field that keeps track of the user's active profile.

The SuperHeroProfile class and OrdinaryProfile class contain the profile details specific to superhero and non-hero users respectively. Finally, the Profile class derives from all these base classes to create a superset of profile details.

Some details to take care of while using this approach are as follows:

· All profile fields that belong to the class or its abstract base classes must be nullable or have defaults.

· This approach might consume more database space per user but gives immense flexibility.

· The active and inactive fields for a profile type need to be managed outside the model. Say, a form to edit the profile must show the appropriate fields based on the currently active user type.
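
One way to manage the active fields outside the model is a simple lookup keyed by user type, which a form could consult when deciding what to display. This is a sketch under stated assumptions: the constant names and the active_profile_fields helper are hypothetical, and only the field names bio, origin, and address come from the models above.

```python
# Values mirror the USER_TYPES choices of the profile model
ORDINARY, SUPERHERO = 0, 1

# Fields from BaseProfile apply to every user type
COMMON_FIELDS = ["bio"]

# Fields contributed by each type-specific abstract base class
FIELDS_BY_USER_TYPE = {
    ORDINARY: ["address"],
    SUPERHERO: ["origin"],
}


def active_profile_fields(user_type):
    """Return the field names a profile form should show for this user type."""
    return COMMON_FIELDS + FIELDS_BY_USER_TYPE.get(user_type, [])


superhero_fields = active_profile_fields(SUPERHERO)
ordinary_fields = active_profile_fields(ORDINARY)
unset_type_fields = active_profile_fields(None)  # user_type is nullable
```

Switching a user's type then changes only which fields are shown; the values of the inactive fields remain stored in the same row.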

Pattern – service objects

Problem: Models can get large and unmanageable. Testing and maintenance get harder as a model does more than one thing.

Solution: Refactor out a set of related methods into a specialized Service object.

Problem details

Fat models, thin views is an adage commonly told to Django beginners. Ideally, your views should not contain anything other than presentation logic.

However, over time pieces of code that cannot be placed anywhere else tend to go into models. Soon, models become a dump yard for the code.

Some of the tell-tale signs that your model can use a Service object are as follows:

1. Interactions with external services, for example, checking whether the user is eligible to get a SuperHero profile with a web service.

2. Helper tasks that do not deal with the database, for example, generating a short URL or random captcha for a user.

3. Short-lived objects without a database state, for example, a JSON response created for an AJAX call.

4. Long-running tasks involving multiple instances, such as Celery tasks.

Models in Django follow the Active Record pattern. Ideally, they encapsulate both application logic and database access. However, keep the application logic minimal.

While testing, if we find ourselves unnecessarily mocking the database even while not using it, then we need to consider breaking up the model class. A Service object is recommended in such situations.

Solution details

Service objects are plain old Python objects (POPOs) that encapsulate a 'service' or interactions with a system. They are usually kept in a separate file named services.py or utils.py.

For example, checking a web service is sometimes dumped into a model method as follows:

class Profile(models.Model):
    ...

    def is_superhero(self):
        # webclient stands for an HTTP client that is not shown here
        url = "http://api.herocheck.com/?q={0}".format(
            self.user.username)
        return webclient.get(url)

This method can be refactored to use a service object as follows:

from .services import SuperHeroWebAPI


class Profile(models.Model):
    ...

    def is_superhero(self):
        return SuperHeroWebAPI.is_hero(self.user.username)

The service object can be now defined in services.py as follows:

API_URL = "http://api.herocheck.com/?q={0}"


class SuperHeroWebAPI:
    ...

    @staticmethod
    def is_hero(username):
        url = API_URL.format(username)
        return webclient.get(url)

In most cases, methods of a Service object are stateless, that is, they perform the action solely based on the function arguments without using any class properties. Hence, it is better to explicitly mark them as static methods (as we have done for is_hero).

Consider refactoring your business logic or domain logic out of models into service objects. This way, you can use them outside your Django application as well.

Imagine there is a business reason to blacklist certain users from becoming superhero types based on their username. Our service object can be easily modified to support this:

class SuperHeroWebAPI:
    ...

    @staticmethod
    def is_hero(username):
        blacklist = {"syndrome", "kcka$$", "superfake"}
        url = API_URL.format(username)
        # Short-circuits: blacklisted users never trigger a web request
        return username not in blacklist and webclient.get(url)

Ideally, service objects are self-contained. This makes them easy to test without mocking, say, the database. They can be also easily reused.
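
To show how such a service can be exercised without a database or a network, here is a sketch of a variation in which the HTTP call is passed in as an argument. The fetch parameter and the fake_fetch helper are assumptions introduced for this example; they stand in for webclient.get, which the original code does not define.

```python
API_URL = "http://api.herocheck.com/?q={0}"


class SuperHeroWebAPI:
    @staticmethod
    def is_hero(username, fetch):
        """Stateless check; fetch is any callable that takes a URL (hypothetical)."""
        blacklist = {"syndrome", "kcka$$", "superfake"}
        if username in blacklist:
            return False
        return bool(fetch(API_URL.format(username)))


# A fake in place of the real web client, recording the URLs it was asked for
requested_urls = []
known_heroes = {"blitz", "syndrome"}


def fake_fetch(url):
    requested_urls.append(url)
    return url.rsplit("=", 1)[-1] in known_heroes


blitz_is_hero = SuperHeroWebAPI.is_hero("blitz", fake_fetch)
syndrome_is_hero = SuperHeroWebAPI.is_hero("syndrome", fake_fetch)
nobody_is_hero = SuperHeroWebAPI.is_hero("nobody", fake_fetch)
```

The method stays a static method with no class state, and the test needs neither a Profile instance nor a database.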

In Django, time-consuming services are executed asynchronously using task queues such as Celery. Typically, the Service Object actions are run as Celery tasks. Such tasks can be run periodically or after a delay.

Retrieval patterns

This section contains design patterns that deal with accessing model properties or performing queries on them.

Pattern – property field

Problem: Models have attributes that are implemented as methods. However, these attributes should not be persisted to the database.

Solution: Use the property decorator on such methods.

Problem details

Model fields store per-instance attributes, such as first name, last name, birthday, and so on. They are also stored in the database. However, we also need to access some derived attributes, such as full name or age.

They can be easily calculated from the database fields, hence need not be stored separately. In some cases, they can just be a conditional check such as eligibility for offers based on age, membership points, and active status.

A straightforward way to implement this is to define functions, such as get_age similar to the following:

import datetime


class BaseProfile(models.Model):
    birthdate = models.DateField()
    #...

    def get_age(self):
        today = datetime.date.today()
        # Subtract one if this year's birthday has not occurred yet
        return (today.year - self.birthdate.year) - int(
            (today.month, today.day) <
            (self.birthdate.month, self.birthdate.day))

Calling profile.get_age() would return the user's age by calculating the difference in years, reduced by one if this year's birthday (month and day) has not yet occurred.

However, it is much more readable (and Pythonic) to call it profile.age.

Solution details

Python classes can treat a function as an attribute using the property decorator. Django models can use it as well. In the previous example, replace the function definition line with:

    @property
    def age(self):

Now, we can access the user's age with profile.age. Notice that the function's name is shortened as well.
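To see the property in action without a database, here is a minimal sketch that uses a plain Python class as a stand-in for the model. The Profile class and the age_on helper are assumptions made for illustration; age_on takes the reference date as an argument so that the results are repeatable.

```python
import datetime


class Profile:
    """Plain-Python stand-in for the BaseProfile model (no database involved)."""

    def __init__(self, birthdate):
        self.birthdate = birthdate

    def age_on(self, today):
        # Year difference, minus one if this year's birthday is still ahead.
        # The tuple comparison yields a bool, which int() turns into 0 or 1.
        return (today.year - self.birthdate.year) - int(
            (today.month, today.day) <
            (self.birthdate.month, self.birthdate.day))

    @property
    def age(self):
        # Read like an attribute: profile.age, not profile.age()
        return self.age_on(datetime.date.today())


profile = Profile(datetime.date(1990, 6, 15))
print(profile.age_on(datetime.date(2015, 6, 14)))  # 24: birthday is tomorrow
print(profile.age_on(datetime.date(2015, 6, 15)))  # 25: birthday is today
print(profile.age)  # depends on the current date
```

Note that profile.age is read without parentheses, exactly like a stored field.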

An important shortcoming of a property is that it is invisible to the ORM, just like model methods are. You cannot use it in a QuerySet object. For example, Profile.objects.exclude(age__lt=18) will not work, because age is not a database field.
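One possible workaround is to express the condition in terms of the stored field instead. The following sketch computes the birthdate that corresponds to a given age; the helper name birthdate_cutoff is hypothetical, and the ORM call in the final comment assumes a Profile model with a birthdate field.

```python
import datetime


def birthdate_cutoff(today, years=18):
    """Return the birthdate of someone turning `years` exactly on `today`."""
    try:
        return today.replace(year=today.year - years)
    except ValueError:
        # today is 29 February and the target year is not a leap year
        return today.replace(year=today.year - years, day=28)


cutoff = birthdate_cutoff(datetime.date(2015, 6, 15))
print(cutoff)  # 1997-06-15

# Hypothetical ORM usage, assuming a Profile model with a birthdate field:
# Profile.objects.exclude(birthdate__gt=cutoff)
```

Anyone born after the cutoff is younger than 18, so the query filters on a real column while the age property stays available for single instances.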

It might also be a good idea to define a property to hide the details of internal classes. This is formally known as the Law of Demeter. Simply put, the law states that you should only access your own direct members or "use only one dot".

For example, rather than accessing profile.birthdate.year, it is better to define a profile.birthyear property. It helps you hide the underlying structure of the birthdate field this way.
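As a rough sketch, such a wrapper property could look like the following. A plain Python class stands in for the model here, so the class itself is an assumption made for illustration.

```python
import datetime


class Profile:
    """Plain-Python stand-in for a profile model."""

    def __init__(self, birthdate):
        self.birthdate = birthdate

    @property
    def birthyear(self):
        # Callers no longer need to know that birthdate is a date object
        return self.birthdate.year


profile = Profile(datetime.date(1990, 6, 15))
print(profile.birthyear)  # 1990
```

If birthdate were later stored differently, only the body of birthyear would need to change.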

Tip

Best Practice

Follow the Law of Demeter, and use only one dot when accessing a property.

An undesirable side effect of this law is that it leads to the creation of several wrapper properties in the model. This could bloat up models and make them hard to maintain. Use the law to improve your model's API and reduce coupling wherever it makes sense.

Cached properties

Each time we access a property, its function is evaluated again. If it is an expensive calculation, we might want to cache the result. This way, the next time the property is accessed, the cached value is returned.

from django.utils.functional import cached_property

    #...

    @cached_property
    def full_name(self):
        # Expensive operation e.g. external service call
        return "{0} {1}".format(self.firstname, self.lastname)

The cached value will be saved as a part of the Python instance. As long as the instance exists, the same value will be returned.

As a failsafe mechanism, you might want to force the execution of the expensive operation to ensure that stale values are not returned. A property cannot accept arguments, so in such cases, delete the cached attribute from the instance (for example, del profile.full_name). The next access will then recompute and cache a fresh value.
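The caching behavior can be observed with a small sketch. To keep it runnable without Django, it uses functools.cached_property from the standard library (Python 3.8 and later) as a stand-in for Django's decorator; both store the computed value on the instance. The Profile class and the calls counter are assumptions made for illustration.

```python
from functools import cached_property  # stdlib analogue, Python 3.8+


class Profile:
    def __init__(self, firstname, lastname):
        self.firstname = firstname
        self.lastname = lastname
        self.calls = 0  # counts how often the expensive body runs

    @cached_property
    def full_name(self):
        self.calls += 1
        return "{0} {1}".format(self.firstname, self.lastname)


profile = Profile("Clark", "Kent")
profile.full_name
profile.full_name
print(profile.calls)  # 1: the second access used the cached value

profile.lastname = "Wayne"
print(profile.full_name)  # Clark Kent: stale until the cache is cleared

del profile.full_name  # drop the cached value from the instance
print(profile.full_name)  # Clark Wayne
print(profile.calls)  # 2
```

Deleting the attribute is one way to discard a stale value; the next access runs the function again and caches the new result.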

Pattern – custom model managers

Problem: Certain queries on models are defined and accessed repeatedly throughout the code, violating the DRY principle.

Solution: Define custom managers to give meaningful names to common queries.

Problem details

Every Django model has a default manager called objects. Invoking objects.all() will return all the entries for that model in the database. Usually, we are interested in only a subset of all entries.

We apply various filters to find out the set of entries we need. The criterion to select them is often our core business logic. For example, we can find the posts accessible to the public by the following code:

public = Post.objects.filter(privacy="public")

This criterion might change in the future. For instance, we might also want to exclude posts that are still marked as drafts. This change might look like this:

public = Post.objects.filter(privacy=POST_PRIVACY.Public,
                             draft=False)

However, this change needs to be made everywhere a public post is needed. This can get very frustrating. There needs to be only one place to define such commonly used queries without 'repeating oneself'.

Solution details

QuerySets are an extremely powerful abstraction. They are evaluated lazily, that is, only when their results are actually needed. Hence, building longer QuerySets by method chaining (a form of fluent interface) does not trigger any additional database queries.

In fact, as more filtering is applied, the result dataset shrinks. This usually reduces the memory consumption of the result.
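The idea of lazy, chainable filtering can be illustrated with a toy class. This is only a sketch of the concept and not how Django implements QuerySets; LazyQuery, its attributes, and the sample data are all invented for illustration.

```python
class LazyQuery:
    """Toy stand-in for a QuerySet; not how Django implements it."""

    def __init__(self, source, predicates=()):
        self.source = source          # stands in for the database table
        self.predicates = predicates  # accumulated filters, not yet applied
        self.evaluations = 0

    def filter(self, predicate):
        # Chaining returns a new object and touches no data
        return LazyQuery(self.source, self.predicates + (predicate,))

    def __iter__(self):
        # The "query" only runs when someone iterates over the result
        self.evaluations += 1
        for row in self.source:
            if all(p(row) for p in self.predicates):
                yield row


posts = [
    {"privacy": "public", "draft": False},
    {"privacy": "public", "draft": True},
    {"privacy": "private", "draft": False},
]
query = LazyQuery(posts).filter(lambda p: p["privacy"] == "public")
query = query.filter(lambda p: not p["draft"])
print(query.evaluations)  # 0: nothing has been evaluated yet
print(list(query))        # [{'privacy': 'public', 'draft': False}]
print(query.evaluations)  # 1
```

Each call to filter only records another condition; the data is scanned once, when the result is finally consumed.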

A model manager is a convenient interface for a model to get its QuerySet object. In other words, it helps you use Django's ORM to access the underlying database. In fact, managers are implemented as very thin wrappers around a QuerySet object. Notice the identical interface:

>>> Post.objects.filter(posted_by__username="a")

[<Post: a: Hello World>, <Post: a: This is Private!>]

>>> Post.objects.get_queryset().filter(posted_by__username="a")

[<Post: a: Hello World>, <Post: a: This is Private!>]

The default manager created by Django, objects, has several methods, such as all, filter, or exclude that return QuerySets. However, they only form a low-level API to your database.

Custom managers are used to create a domain-specific, higher-level API. This is not only more readable but also less affected by implementation details. Thus, you are able to work at a higher level of abstraction that is closely modeled on your domain.

Our previous example for public posts can be easily converted into a custom manager as follows:

# managers.py
from django.db.models.query import QuerySet


class PostQuerySet(QuerySet):
    def public_posts(self):
        return self.filter(privacy="public")


PostManager = PostQuerySet.as_manager

This convenient shortcut for creating a custom manager from a QuerySet object appeared in Django 1.7. Unlike earlier approaches, this PostManager object is chainable like the default objects manager.

It sometimes makes sense to replace the default objects manager with our custom manager, as shown in the following code:

from .managers import PostManager


class Post(Postable):
    ...
    objects = PostManager()

With this in place, the code needed to access public posts simplifies considerably:

public = Post.objects.public_posts()

Since the returned value is a QuerySet, it can be further filtered:

public_apology = Post.objects.public_posts().filter(
    message__startswith="Sorry")

QuerySets have several interesting properties. In the next few sections, we will take a look at some common patterns that involve combining QuerySets.

Set operations on QuerySets

True to their name (or the latter half of their name), QuerySets support a lot of (mathematical) set operations. For the sake of illustration, consider two QuerySets that contain the user objects:

>>> q1 = User.objects.filter(username__in=["a", "b", "c"])

[<User: a>, <User: b>, <User: c>]

>>> q2 = User.objects.filter(username__in=["c", "d"])

[<User: c>, <User: d>]

Some set operations that you can perform on them are as follows:

· Union: This combines and removes duplicates. Use q1 | q2 to get [<User: a>, <User: b>, <User: c>, <User: d>]

· Intersection: This finds common items. Use q1 & q2 to get [<User: c>]

· Difference: This removes the elements of the second set from the first. There is no operator for this. Instead, use q1.exclude(pk__in=q2) to get [<User: a>, <User: b>]
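The analogy with Python's built-in sets can be made concrete. The following sketch uses plain sets of usernames rather than QuerySets, so it runs without a database. Note that sets offer a - operator for difference, which QuerySets do not.

```python
q1 = {"a", "b", "c"}
q2 = {"c", "d"}

print(sorted(q1 | q2))  # ['a', 'b', 'c', 'd']
print(sorted(q1 & q2))  # ['c']
print(sorted(q1 - q2))  # ['a', 'b']

# Pitfall: the keyword `and` is not an intersection. It simply returns
# the second operand whenever the first one is truthy (non-empty).
print((q1 and q2) is q2)  # True
```

The same pitfall applies to QuerySets: the and keyword evaluates the first QuerySet for truthiness and then returns the second one unchanged, so the & operator is the one to use for an intersection.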

The same operations can be done using the Q objects:

from django.db.models import Q

# Union

>>> User.objects.filter(Q(username__in=["a", "b", "c"]) | Q(username__in=["c", "d"]))

[<User: a>, <User: b>, <User: c>, <User: d>]

# Intersection

>>> User.objects.filter(Q(username__in=["a", "b", "c"]) & Q(username__in=["c", "d"]))

[<User: c>]

# Difference

>>> User.objects.filter(Q(username__in=["a", "b", "c"]) & ~Q(username__in=["c", "d"]))

[<User: a>, <User: b>]

Note that the difference is implemented using & (AND) and ~ (Negation). The Q objects are very powerful and can be used to build very complex queries.

However, the Set analogy is not perfect. QuerySets, unlike mathematical sets, are ordered. So, they are closer to Python's list data structure in that respect.

Chaining multiple QuerySets

So far, we have been combining QuerySets of the same type belonging to the same base class. However, we might need to combine QuerySets from different models and perform operations on them.

For example, a user's activity timeline contains all their posts and comments in reverse chronological order. The previous methods of combining QuerySets won't work. A naïve solution would be to convert them to lists, concatenate, and sort them, like this:

>>> recent = list(posts) + list(comments)

>>> sorted(recent, key=lambda e: e.modified, reverse=True)[:3]

[<Post: user: Post1>, <Comment: user: Comment1>, <Post: user: Post0>]

Unfortunately, this operation evaluates the lazy QuerySet objects immediately. The combined memory usage of the two lists can be overwhelming. Besides, it can be quite slow to convert large QuerySets into lists.

A much better solution uses iterators to reduce the memory consumption. Use the itertools.chain function to combine multiple QuerySets without building the intermediate lists, as follows:

>>> from itertools import chain

>>> recent = chain(posts, comments)

>>> sorted(recent, key=lambda e: e.modified, reverse=True)[:3]

Evaluating a QuerySet means hitting the database, and that cost can be quite high. So, it is important to delay evaluation as long as possible by performing only operations that return unevaluated QuerySets.
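Keep in mind that sorted still consumes everything the chain yields. When each input is already ordered newest first, a lazy merge can pick out just the first few items. The following is a minimal sketch using heapq.merge from the standard library; the namedtuples are hypothetical stand-ins for model instances, and integers stand in for timestamps.

```python
from collections import namedtuple
from heapq import merge
from itertools import islice

# Hypothetical stand-ins for model instances with a `modified` timestamp
Post = namedtuple("Post", "title modified")
Comment = namedtuple("Comment", "title modified")

# Each input is assumed to be sorted newest first already, for example
# because the underlying query used order_by("-modified")
posts = [Post("Post1", 5), Post("Post0", 2)]
comments = [Comment("Comment1", 4), Comment("Comment0", 1)]

# merge() is lazy: it pulls items from each input only as needed
timeline = merge(posts, comments, key=lambda e: e.modified, reverse=True)
recent = list(islice(timeline, 3))
print([e.title for e in recent])  # ['Post1', 'Comment1', 'Post0']
```

With real QuerySets, it would probably also help to slice each one first (for example, posts[:3]), since at most three items from either source can appear in the final result.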

Tip

Keep QuerySets unevaluated as long as possible.

Migrations

Migrations help you to confidently make changes to your models. Introduced in Django 1.7, migrations are an essential and easy-to-use part of the development workflow.

The new workflow is essentially as follows:

1. The first time you define your model classes, you will need to run:

python manage.py makemigrations <app_label>

2. This will create migration scripts in the app/migrations folder.

3. Run the following command in the same (development) environment:

python manage.py migrate <app_label>

This will apply the model changes to the database. Sometimes, you will be asked questions about how to handle default values, renaming, and so on.

4. Propagate the migration scripts to other environments. Typically, your version control tool, for example Git, will take care of this. As the latest source is checked out, the new migration scripts will also appear.

5. Run the following command in these environments to apply the model changes:

python manage.py migrate <app_label>

6. Whenever you make changes to the model classes, repeat steps 1-5.

If you omit the app label in the commands, Django will find unapplied changes in every app and migrate them.

Summary

Model design is hard to get right. Yet, it is fundamental to Django development. In this chapter, we looked at several common patterns when working with models. In each case, we looked at the impact of the proposed solution and various tradeoffs.

In the next chapter, we will examine the common design patterns we encounter when working with views and URL configurations.