Django URLValidator with scheme deny list

If you want to check that some string is a valid URL in your Django application, you might reach for django.core.validators.URLValidator.

By default, it accepts URLs with the schemes http, https, ftp, ftps. If you want to accept different schemes, you can provide your own list of valid schemes.

But what if you want to accept all schemes, except some forbidden schemes, e.g. file and javascript?

A quick glance at the source of URLValidator reveals that it uses the in operator to check whether the scheme of the given URL is in the list of valid schemes.
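The check boils down to something like the following. This is a simplified paraphrase for illustration, not Django's actual code, and the function name scheme_is_valid is made up here.

```python
def scheme_is_valid(url, schemes):
    """Hypothetical helper that mirrors the idea of URLValidator's scheme check."""
    # Take everything before "://" and compare it case-insensitively.
    scheme = url.split("://")[0].lower()
    # Only membership is tested, so `schemes` just needs to support `in`.
    return scheme in schemes


default_schemes = ["http", "https", "ftp", "ftps"]
print(scheme_is_valid("HTTPS://example.com/", default_schemes))  # True
print(scheme_is_valid("wss://example.com/", default_schemes))    # False
```

Because the scheme is lowercased before the membership test, any entries you supply should be lowercase too.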

Thanks to duck typing, the list of valid schemes does not have to be a list. It can be anything that supports the in operator. If we know our dunder methods, we can create a container-like object that contains everything except the schemes we want to exclude.

Here is what it would look like:

class Everything:

    def __init__(self, *, excluding):
        self.excluding = excluding

    def __contains__(self, element):
        return element not in self.excluding

Two things worth noting:

  1. The argument excluding for the __init__ method is a keyword-only argument, which means you have to pass it by name. That gives you a nice, self-documenting class instantiation:
    valid_schemes = Everything(excluding=["file", "javascript"])
  2. Initially, I wanted to use except as the argument name, but except is a reserved keyword in Python.
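To see the keyword-only behaviour in action, here is a small self-contained check. It repeats the Everything class so that it runs on its own. The positional call at the end is expected to fail with a TypeError.

```python
class Everything:
    """Container-like object that 'contains' anything not explicitly excluded."""

    def __init__(self, *, excluding):
        self.excluding = excluding

    def __contains__(self, element):
        return element not in self.excluding


# The keyword form works as intended.
valid_schemes = Everything(excluding=["file", "javascript"])
print("https" in valid_schemes)  # True
print("file" in valid_schemes)   # False

# The positional form is rejected because of the bare * in the signature.
positional_error = None
try:
    Everything(["file", "javascript"])
except TypeError as exc:
    positional_error = exc
    print(f"TypeError: {exc}")
```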

This class can be used like this:

>>> valid_schemes = Everything(excluding=["file", "javascript"])
>>> validator = URLValidator(schemes=valid_schemes)
>>> validator("http://example.com/")
>>> validator("wss://example.com/")
>>> validator("file:///etc/passwd")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/danielhepper/.local/share/virtualenvs/django-beyond-ofndDysG/lib/python3.7/site-packages/django/core/validators.py", line 114, in __call__
    raise ValidationError(self.message, code=self.code)
django.core.exceptions.ValidationError: ['Enter a valid URL.']

Now we come to the somewhat disappointing conclusion of this article. When I came up with this hack for django.core.validators.URLValidator, I actually wanted to allow URLs like tel:+491234567890 or email:daniel@consideratecode.com.

As it turns out, the validation that Django’s URLValidator applies is too strict for this use case, so it is back to the drawing board for this one. Maybe the Everything class will find its purpose some other day.
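One possible direction, sketched here as an assumption and not as something this post settled on, is to check only the scheme with the standard library's urllib.parse.urlsplit. The function name scheme_not_denied and its default deny list are hypothetical. Note that this checks the scheme and nothing else, so it is far weaker than a full URL validation.

```python
from urllib.parse import urlsplit


def scheme_not_denied(url, denied=("file", "javascript")):
    """Hypothetical helper: True if the URL has a scheme that is not denied."""
    # urlsplit also recognises schemes that are not followed by "//".
    scheme = urlsplit(url).scheme.lower()
    # Reject strings without any scheme, then apply the deny list.
    return bool(scheme) and scheme not in denied


for candidate in (
    "tel:+491234567890",
    "email:daniel@consideratecode.com",
    "file:///etc/passwd",
    "javascript:alert(1)",
):
    print(candidate, scheme_not_denied(candidate))
```

The first two candidates pass the check and the last two are rejected.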


The more SEO-friendly title of this post would have been “Django URLValidator with scheme blacklist”, but the tech world is deprecating the terms whitelist and blacklist in favour of more inclusive terminology. (1) (2) (3)
