What does @python_2_unicode_compatible() do?

If you followed the official Django tutorial or any other up-to-date tutorial, you might have seen a model definition like this:

@python_2_unicode_compatible
class Person(models.Model):
    first_name = models.CharField(max_length=255)
    last_name = models.CharField(max_length=255)

    def __str__(self):
        return text_type("{} {}").format(self.first_name, self.last_name)

You might have wondered “What is the purpose of @python_2_unicode_compatible? Do I really need it?” Let’s have a look at the docs:

A decorator that defines __unicode__ and __str__ methods under Python 2. Under Python 3 it does nothing.

To support Python 2 and 3 with a single code base, define a __str__ method returning text (use six.text_type() if you’re doing some casting) and apply this decorator to the class.

It does nothing under Python 3, so if you do not want to support Python 2, it is safe to omit it and stop reading here. Otherwise, carry on.

As you can tell from the at-sign, @python_2_unicode_compatible is a decorator. If you are not familiar with decorators, have a look at the Python documentation or check out this step-by-step introduction to Python decorators by Dan Bader. Simply put, adding a decorator to a function or, in this case, the Person class is the same as this line:

Person = python_2_unicode_compatible(Person)

The decorator is called with the thing it decorates, and the name is rebound to whatever the decorator returns.
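To make the equivalence concrete, here is a small sketch that does not need Django. The decorator add_greeting and both classes are made up for this illustration; they are not part of Django or six.

```python
def add_greeting(klass):
    """Hypothetical class decorator: attach a greet() method, return the class."""
    klass.greet = lambda self: "Hello, {}!".format(self.name)
    return klass


@add_greeting
class Decorated(object):
    def __init__(self, name):
        self.name = name


class Manual(object):
    def __init__(self, name):
        self.name = name


# Same effect as the @add_greeting line above, written out by hand.
Manual = add_greeting(Manual)

print(Decorated("Daniel").greet())
print(Manual("Daniel").greet())
```

Both classes end up with the same greet() method, which is all the at-sign syntax does for you.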

So we have a decorator that does nothing under Python 3 and, under Python 2, messes with the __str__ and __unicode__ methods. What do these methods do?

The method __str__ is called when you want to represent an object as a string: when you inspect it in the Python shell, display it in Django admin or render it in a template.

Let’s assume our model doesn’t have a __str__ method:

class Person(models.Model):
    first_name = models.CharField(max_length=255)
    last_name = models.CharField(max_length=255)

Let’s create a model instance in the Python shell with python manage.py shell.

>>> daniel = Person(first_name='Daniel', last_name='Hepper')
>>> print(daniel)
Person object
>>> daniel
<Person: Person object>

That is not very helpful. Let’s add a __str__ method like the one in the example at the beginning.

@python_2_unicode_compatible
class Person(models.Model):
    first_name = models.CharField(max_length=255)
    last_name = models.CharField(max_length=255)

    def __str__(self):
        return "{} {}".format(self.first_name, self.last_name)

Much better:

>>> daniel = Person(first_name='Daniel', last_name='Hepper')
>>> print(daniel)
Daniel Hepper
>>> daniel
<Person: Daniel Hepper>
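Why does the shell show the name wrapped in angle brackets? print() uses __str__, while evaluating a bare expression in the shell uses __repr__, and Django models build their __repr__ on top of __str__. The plain-Python class below imitates that behaviour without Django. FakePerson and its __repr__ are an approximation written for illustration, not Django's actual code.

```python
class FakePerson(object):
    def __init__(self, first_name, last_name):
        self.first_name = first_name
        self.last_name = last_name

    def __str__(self):
        return "{} {}".format(self.first_name, self.last_name)

    def __repr__(self):
        # Roughly what a Django model does: class name plus str(self).
        return "<{}: {}>".format(self.__class__.__name__, self)


daniel = FakePerson("Daniel", "Hepper")
print(daniel)        # goes through __str__
print(repr(daniel))  # what the shell displays for a bare expression
```

So a useful __str__ improves both outputs at once.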

But what happens when we have someone with a fancy character like “é” in their name? Under Python 3, it works just fine:

>>> amelie = Person(first_name='Amélie', last_name='Poulain')
>>> print(amelie)
Amélie Poulain
>>> amelie
<Person: Amélie Poulain>

However, under Python 2, something seems off.

$ python manage.py shell
Python 2.7.13 (default, Dec 30 2016, 21:31:00)
[GCC 4.2.1 Compatible Apple LLVM 8.0.0 (clang-800.0.42.1)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)
>>> from main.models import *
>>> amelie = Person(first_name='Amélie', last_name='Poulain')
>>> amelie
<Person: [Bad Unicode data]>
>>> print(amelie)
Amélie Poulain

Depending on your system, printing it might work like it did in my case, but it is a gamble. But why?

A big change between Python 2 and Python 3 was the handling of strings. I might talk about that in a future post, but for now you just need to know that:

In Python 2

  • there are strings (str) and unicode strings (unicode)
  • __str__ should return a string (str)
  • __unicode__ should return a unicode string (unicode)

In Python 3

  • all strings (str) are unicode strings, there is no separate unicode type
  • __str__ should return a string (str)
  • __unicode__ does not exist
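The difference is easiest to see with text and bytes side by side. The sketch below is meant for Python 3, where the two types are never mixed silently. Python 2, in contrast, converts between str and unicode implicitly using the ASCII codec, and that is the step that trips over characters like “é”. The explicit decode("ascii") call here reproduces that failure by hand; the variable names are only for illustration.

```python
name = "Am\u00e9lie"             # "Amélie" as text: str in Python 3
data = name.encode("utf-8")      # the same name as bytes, like a Python 2 str

print(len(name))   # number of characters
print(len(data))   # number of bytes; "é" needs two bytes in UTF-8

# Decoding with the right codec gives the text back.
assert data.decode("utf-8") == name

# Decoding as ASCII fails on the non-ASCII bytes.
try:
    data.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False
print(ascii_ok)
```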

That seems… incompatible. So how do you create a class that is compatible with both Python 2 and 3?

@python_2_unicode_compatible to the rescue! Here is a complete working example:

from django.db import models
from django.utils.six import python_2_unicode_compatible, text_type

@python_2_unicode_compatible
class Person(models.Model):
    first_name = models.CharField(max_length=255)
    last_name = models.CharField(max_length=255)

    def __str__(self):
        return text_type("{} {}").format(self.first_name, self.last_name)

When using the decorator, __str__ should always return a unicode string. Therefore, we cast the format string to text_type.

Under Python 3, text_type is just an alias for str, but under Python 2, it is an alias for unicode.
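To show what such an alias looks like, here is a minimal sketch of how text_type could be defined by hand. It follows the same idea as six but is not copied from it; PY2 and the version check are spelled out here only for illustration.

```python
import sys

# True only when running under a Python 2 interpreter.
PY2 = sys.version_info[0] == 2

if PY2:
    text_type = unicode  # noqa: F821 -- this name only exists under Python 2
else:
    text_type = str

label = text_type("{} {}").format("Daniel", "Hepper")
print(type(label).__name__)
```

Whichever interpreter runs this, label ends up being that version's text type, which is exactly what the decorator expects __str__ to return.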

The code

Finally, let’s have a look at the source code of our decorator:

def python_2_unicode_compatible(klass):
    if PY2:
        if '__str__' not in klass.__dict__:
            raise ValueError("@python_2_unicode_compatible cannot be applied "
                             "to %s because it doesn't define __str__()." %
                             klass.__name__)
        klass.__unicode__ = klass.__str__
        klass.__str__ = lambda self: self.__unicode__().encode('utf-8')
    return klass

Under Python 2, it checks if the class defines a __str__ method, and if it doesn’t, it raises an error. Otherwise, it assigns the __str__ method to the name __unicode__ and redefines __str__ as a lambda function that calls __unicode__ and encodes the result as UTF-8.

Under Python 3, it just returns the unmodified class.
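Since a Python 2 interpreter is not always at hand, the following sketch makes the Python 2 branch observable under Python 3. The function apply_py2_branch and the classes Name and NoStr are hypothetical helpers written for this post. They repeat the two assignments from the decorator but skip the PY2 check. Calling __str__() directly is deliberate: under Python 3, str(obj) would refuse the bytes result.

```python
def apply_py2_branch(klass):
    """Hypothetical helper: do what the decorator does under Python 2."""
    if '__str__' not in klass.__dict__:
        raise ValueError("%s doesn't define __str__()." % klass.__name__)
    klass.__unicode__ = klass.__str__
    klass.__str__ = lambda self: self.__unicode__().encode('utf-8')
    return klass


class Name(object):
    def __init__(self, value):
        self.value = value

    def __str__(self):
        return self.value


Name = apply_py2_branch(Name)
amelie = Name(u"Am\u00e9lie")  # "Amélie"

print(repr(amelie.__unicode__()))  # the original text
print(repr(amelie.__str__()))      # the same text encoded as UTF-8 bytes


class NoStr(object):
    pass


# A class without its own __str__ is rejected.
try:
    apply_py2_branch(NoStr)
    raised = False
except ValueError:
    raised = True
print(raised)
```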

Note that the decorator is also available in the six package, a collection of utilities for Python 2 and 3 compatibility. If you ever want to use it in a non-Django project, just use six.

@python_2_unicode_compatible Cheat Sheet

Phew, quite a bit is going on here! Don’t worry if you didn’t grasp everything on the first read. Wrapping your head around strings, Unicode and encodings and the different behaviors between Python 2 and 3 can take a while.

I’ll try to give a concise summary:

  • When your code should only run under Python 3, write a __str__ method and don’t use @python_2_unicode_compatible.
  • When your code should support Python 2 and Python 3, write a __str__ method, make sure it returns unicode under Python 2 and use @python_2_unicode_compatible.
  • When your code runs under Python 2 and will never run under Python 3, write a __unicode__ method, make sure it actually returns unicode and don’t use @python_2_unicode_compatible.
