updated May 29, 2019
Django ORM performance tips
The article explains how to use QuerySet
API to produce less SQL queries or more effecient SQL so the database operations work faster.
0. How to check SQL produced by QuerySet
from django.db import connection users = User.objects.all() if users: user = users[0] print(len(connection.queries)) # output: 2
1. Avoid objects existance check before getting it
Bad code: 2 SQL queries:users = User.objects.all() if users: # 1st SQL execution user = users[0] # 2nd SQL executionGood code: one SQL with
LIMIT 1
:
user = User.objects.first()or
users = User.objects.all() try: user = users[0] except IndexError: pass
2. Store query set in variable if iterate several times over it
Bad code: each iteration executes SQL query:for user in User.objects.all(): # 1st SQL execution print(user) for user in User.objects.all(): # 2nd SQL execution print(user)Good code: cache query set in a variable:
users = User.objects.all() for user in users: # execute SQL print(user) for user in users: # iterate over cached objects print(user)
3. Prefer exists()
to count()
to check if the objects exist
Bad code: scans all the database rows that match QuerySet
criteria:
if User.objects.count() == 0: print("No users")Good code: SQL scans the 1st row only:
if not User.objects.exists(): print("No users")
4. Foreign key property vs foreign object id property
Bad code: makes extra SQL query to load the author object:author_id = article.author.idGood code: no extra SQL:
author_id = article.author_id
5. select_related()
: load foreign key objects in one SQL query instead N + 1
Bad code: makes extra SQL queries to load author properties: one query per article so the total queries count is 1+ len(articles)
:
for article in Article.objects.all(): print(article.author.name) # extra SQL query for each article to load the authorGood code: one SQL that joins the author table:
for article in Article.objects.select_related('author'): print(article.author.name) # no extra SQL query
6. prefetch_related()
: load many-to-many relations in 2 queries instead N + 1
Bad code: makes extra SQL query to load the comments for each article so the total queries count is 1 + len(articles)
:
for article in Article.objects.all(): for comment in article.comments.all(): print(comment.text)Good code: 2 SQL queries that load articles and comments:
for article in Article.objects.prefetch_related('comments'): for comment in article.comments.all(): print(comment.text)
7. Return object properties values tuples instead objects
Load only requested properties values without model object creation:names = User.objects.values_list('first_name', 'last_name') # (str, str) tuple list names = User.objects.values_list('first_name') # (str, ) tuple listUse
flat=True
to return values instead tuples with one value:
names = User.objects.values_list('first_name', flat=True) # str list user_ids = User.objects.filter(rating=5).values_list('id', flat=True)
8. Subquery vs extra SQL query
One SQL with subquery for users:users = User.objects.filter(rating=5) Comment.object.filter(author__in=users) # SQL with sub-query for usersTwo SQL queries but they are simpler, and their total execution time could be less than for the complex query above if the users count is small:
user_ids = list(User.objects.filter(rating=5).values_list('id', flat=True)) Comment.object.filter(author__id__in=user_ids)