What does @python_2_unicode_compatible() do?
If you followed the official Django tutorial or any other up-to-date tutorial, you might have seen a model definition like this:
    @python_2_unicode_compatible
    class Person(models.Model):
        first_name = models.CharField(max_length=255)
        last_name = models.CharField(max_length=255)

        def __str__(self):
            return text_type("{} {}").format(self.first_name, self.last_name)
You might have wondered “What is the purpose of @python_2_unicode_compatible? Do I really need it?” Let’s have a look at the docs:
A decorator that defines __unicode__ and __str__ methods under Python 2. Under Python 3 it does nothing.
To support Python 2 and 3 with a single code base, define a __str__ method returning text (use six.text_type() if you’re doing some casting) and apply this decorator to the class.
It does nothing under Python 3, so if you do not want to support Python 2, it is safe to omit it and stop reading here. Otherwise, carry on.
As you can tell from the at-sign, @python_2_unicode_compatible is a decorator. If you are not familiar with decorators, have a look at the Python documentation or check out this step-by-step introduction to Python decorators by Dan Bader. Simply put, adding a decorator to a function, or in this case to the Person class, is the same as this line:
    Person = python_2_unicode_compatible(Person)
The decorator is called with the thing it decorates, and the name is rebound to whatever the decorator returns.
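To make the equivalence concrete, here is a minimal sketch. The decorator `add_greeting` and both classes are hypothetical names made up for this illustration; they are not part of Django or six.

```python
def add_greeting(klass):
    """Hypothetical class decorator: attach a greet() method, return the class."""
    klass.greet = lambda self: "Hello, {}!".format(self.name)
    return klass


# Variant 1: decorator syntax
@add_greeting
class Decorated(object):
    def __init__(self, name):
        self.name = name


# Variant 2: the explicit call the decorator syntax stands for
class Explicit(object):
    def __init__(self, name):
        self.name = name


Explicit = add_greeting(Explicit)

print(Decorated("Daniel").greet())  # Hello, Daniel!
print(Explicit("Daniel").greet())   # Hello, Daniel!
```

Both classes end up with the same extra method, which is all the at-sign does.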
So we have a decorator that does nothing under Python 3 and, under Python 2, messes with the __str__ and __unicode__ methods. What do these methods do?
The method __str__ is called when you want to represent an object as a string: when you inspect it in the Python shell, display it in Django admin or render it in a template.
Let’s assume our model doesn’t have a __str__ method:
    class Person(models.Model):
        first_name = models.CharField(max_length=255)
        last_name = models.CharField(max_length=255)
Let’s create a model instance in the Python shell with python manage.py shell.
    >>> daniel = Person(first_name='Daniel', last_name='Hepper')
    >>> print(daniel)
    Person object
    >>> daniel
    <Person: Person object>
That is not very helpful. Let’s add a __str__ method like the example at the beginning.
    @python_2_unicode_compatible
    class Person(models.Model):
        first_name = models.CharField(max_length=255)
        last_name = models.CharField(max_length=255)

        def __str__(self):
            return "{} {}".format(self.first_name, self.last_name)
Much better:
    >>> daniel = Person(first_name='Daniel', last_name='Hepper')
    >>> print(daniel)
    Daniel Hepper
    >>> daniel
    <Person: Daniel Hepper>
But what happens when we have someone with a fancy character like “é” in their name? Under Python 3, it just works fine:
    >>> amelie = Person(first_name='Amélie', last_name='Poulain')
    >>> print(amelie)
    Amélie Poulain
    >>> amelie
    <Person: Amélie Poulain>
However, under Python 2, something seems off.
    $ python manage.py shell
    Python 2.7.13 (default, Dec 30 2016, 21:31:00)
    [GCC 4.2.1 Compatible Apple LLVM 8.0.0 (clang-800.0.42.1)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    (InteractiveConsole)
    >>> from main.models import *
    >>> amelie = Person(first_name='Amélie', last_name='Poulain')
    >>> amelie
    <Person: [Bad Unicode data]>
    >>> print(amelie)
    Amélie Poulain
Depending on your system, printing it might work like it did in my case, but it is a gamble. But why?
A big change between Python 2 and Python 3 was the handling of strings. I might talk about that in a future post, but for now you just need to know that:
In Python 2
- there are strings (str) and unicode strings (unicode)
- __str__ should return a string (str)
- __unicode__ should return a unicode string (unicode)
In Python 3
- all strings (str) are unicode strings, there is no unicode type
- __str__ should return a string (str)
- __unicode__ does not exist
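The two kinds of string are easier to tell apart in code. The following sketch runs under Python 3, where bytes plays roughly the role that str played in Python 2, and str plays the role of unicode. It only illustrates the difference between text and encoded bytes; it is not what Django does internally.

```python
name = "Am\u00e9lie"             # text: "Amélie" (str in Python 3, unicode in Python 2)
encoded = name.encode("utf-8")   # bytes: roughly what Python 2 called str

print(type(name).__name__)       # str
print(type(encoded).__name__)    # bytes
print(len(name), len(encoded))   # 6 7, because "é" takes two bytes in UTF-8

# Decoding the bytes with a codec that cannot represent "é" fails.
# Mixing the two string types under Python 2 can trigger this kind of error.
try:
    encoded.decode("ascii")
    decode_failed = False
except UnicodeDecodeError:
    decode_failed = True

print(decode_failed)             # True
```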
That seems… incompatible. So how do you create a class that is compatible with both Python 2 and 3?
@python_2_unicode_compatible to the rescue! Here is a complete working example:
    from django.db import models
    from django.utils.six import python_2_unicode_compatible, text_type


    @python_2_unicode_compatible
    class Person(models.Model):
        first_name = models.CharField(max_length=255)
        last_name = models.CharField(max_length=255)

        def __str__(self):
            return text_type("{} {}").format(self.first_name, self.last_name)
When using the decorator, __str__ should always return a unicode string. Therefore we cast the format string to text_type. Under Python 3, text_type is just an alias for str, but in Python 2, it is an alias for unicode.
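As a rough sketch of what that alias amounts to, the snippet below defines text_type by hand. This is a simplified stand-in written for illustration; in a real project you would import text_type from six instead.

```python
import sys

PY2 = sys.version_info[0] == 2

if PY2:
    text_type = unicode  # noqa: F821 (this name only exists on Python 2)
else:
    text_type = str

# Casting the format string first means the result is text on both versions.
full_name = text_type("{} {}").format("Daniel", "Hepper")

print(full_name)                          # Daniel Hepper
print(isinstance(full_name, text_type))   # True
```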
The code
Finally, let’s have a look at the source code of our decorator:
    def python_2_unicode_compatible(klass):
        if PY2:
            if '__str__' not in klass.__dict__:
                raise ValueError("@python_2_unicode_compatible cannot be applied "
                                 "to %s because it doesn't define __str__()." %
                                 klass.__name__)
            klass.__unicode__ = klass.__str__
            klass.__str__ = lambda self: self.__unicode__().encode('utf-8')
        return klass
Under Python 2, it checks if the class itself defines a __str__ method, and if it doesn’t, it raises an error. Otherwise, it assigns the __str__ method to the name __unicode__ and redefines __str__ as a lambda function that calls __unicode__ and encodes the result as UTF-8.
Under Python 3, it just returns the unmodified class.
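To watch the Python 2 branch at work without a Python 2 interpreter, the sketch below copies the decorator's logic into a hypothetical helper, force_py2_behaviour, that skips the PY2 check. The helper and the plain Person class (no Django model involved) are assumptions made for this illustration and are not part of Django or six.

```python
def force_py2_behaviour(klass):
    """Hypothetical helper: always apply the decorator's Python 2 branch."""
    if '__str__' not in klass.__dict__:
        raise ValueError("%s doesn't define __str__()." % klass.__name__)
    klass.__unicode__ = klass.__str__
    klass.__str__ = lambda self: self.__unicode__().encode('utf-8')
    return klass


@force_py2_behaviour
class Person(object):
    def __init__(self, first_name, last_name):
        self.first_name = first_name
        self.last_name = last_name

    def __str__(self):
        return u"{} {}".format(self.first_name, self.last_name)


amelie = Person(u"Am\u00e9lie", u"Poulain")
text = amelie.__unicode__()   # the method we wrote: returns text
raw = amelie.__str__()        # the replacement: returns UTF-8 encoded bytes

print(raw == text.encode('utf-8'))   # True

# A class that does not define __str__ itself is rejected.
try:
    @force_py2_behaviour
    class NoStr(object):
        pass
except ValueError as exc:
    print(exc)                       # NoStr doesn't define __str__().
```

Calling str(amelie) on this class under Python 3 would raise a TypeError, because __str__ now returns bytes. That is consistent with the real decorator leaving the class untouched under Python 3.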
Note that the decorator originally comes from the six package, which consists of multiple utilities to enable Python 2 and 3 compatibility. If you ever want to use it in a non-Django project, just use six.
@python_2_unicode_compatible Cheat Sheet
Phew, there is quite a bit going on here! Don’t worry if you didn’t grasp everything on the first read. Wrapping your head around strings, Unicode and encodings and the different behaviors between Python 2 and 3 can take a while.
I’ll try to give a concise summary:
- When your code should only run under Python 3, write a __str__ method and don’t use @python_2_unicode_compatible.
- When your code should support Python 2 and Python 3, write a __str__ method, make sure it returns unicode under Python 2 and use @python_2_unicode_compatible.
- When your code runs under Python 2 and will never run under Python 3, write a __unicode__ method, make sure it actually returns unicode and don’t use @python_2_unicode_compatible.