Django URLValidator with scheme deny list
If you want to check that a string is a valid URL in your Django application, you might reach for django.core.validators.URLValidator.
By default, it accepts URLs with the schemes http, https, ftp, and ftps. If you want to accept different schemes, you can provide your own list of valid schemes.
But what if you want to accept all schemes except a few forbidden ones, e.g. file and javascript?
A quick glance at the source of URLValidator reveals that it uses the in operator to check whether the scheme of the given URL is in the list of valid schemes.
Thanks to duck typing, the list of valid schemes does not have to be a list; it can be anything that supports the in operator. If we know our dunder methods, we can create a container-like object that contains everything except the schemes we want to exclude.
Here is what it would look like:
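class Everything:
    """Container that claims to contain everything except the excluded elements."""

    def __init__(self, *, excluding):
        self.excluding = excluding

    def __contains__(self, element):
        # Called for "element in obj": True unless the element is excluded.
        return element not in self.excluding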
Two things worth noting:
- The argument excluding for the __init__ method is a keyword-only argument, which means you have to pass it by name. That gives you a nice, self-documenting class instantiation: valid_schemes = Everything(excluding=["file", "javascript"])
- Initially, I wanted to use except as the argument name, but except is a reserved keyword.
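Both points can be tried out in plain Python, without Django. The following is a small sketch that repeats the class so it runs on its own; the variable name positional_rejected is only there for illustration.

```python
class Everything:
    """Container that claims to contain everything except the excluded elements."""

    def __init__(self, *, excluding):
        self.excluding = excluding

    def __contains__(self, element):
        return element not in self.excluding


valid_schemes = Everything(excluding=["file", "javascript"])

# The "in" operator dispatches to __contains__.
print("wss" in valid_schemes)
print("file" in valid_schemes)

# Because of the bare "*" in the signature, a positional argument is rejected.
positional_rejected = False
try:
    Everything(["file", "javascript"])
except TypeError:
    positional_rejected = True
print(positional_rejected)
```

The first check prints True, the second False, and the positional call raises a TypeError.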
This class can be used like this:
>>> from django.core.validators import URLValidator
>>> valid_schemes = Everything(excluding=["file", "javascript"])
>>> validator = URLValidator(schemes=valid_schemes)
>>> validator("http://example.com/")
>>> validator("wss://example.com/")
>>> validator("file:///etc/passwd")
Traceback (most recent call last):
  File "", line 1, in
  File "/Users/danielhepper/.local/share/virtualenvs/django-beyond-ofndDysG/lib/python3.7/site-packages/django/core/validators.py", line 114, in __call__
    raise ValidationError(self.message, code=self.code)
django.core.exceptions.ValidationError: ['Enter a valid URL.']
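To look at the scheme check in isolation, here is a rough stand-in that runs without Django. The function scheme_allowed is made up for this sketch, and the way it extracts the scheme (everything before "://", lowercased) is an assumption based on reading the Django source, so treat it as an approximation and not as the real implementation.

```python
class Everything:
    """Container that claims to contain everything except the excluded elements."""

    def __init__(self, *, excluding):
        self.excluding = excluding

    def __contains__(self, element):
        return element not in self.excluding


def scheme_allowed(url, schemes):
    """Hypothetical stand-in for the validator's scheme check."""
    # Assumption: the scheme is the part before "://", compared in lowercase.
    scheme = url.split("://")[0].lower()
    return scheme in schemes


valid_schemes = Everything(excluding=["file", "javascript"])

print(scheme_allowed("http://example.com/", valid_schemes))  # True
print(scheme_allowed("wss://example.com/", valid_schemes))   # True
print(scheme_allowed("FILE:///etc/passwd", valid_schemes))   # False
```

If the scheme is indeed lowercased before the comparison, the excluded schemes should be given in lowercase as well.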
Now we come to the somewhat disappointing conclusion of this article. When I came up with this hack for django.core.validators.URLValidator, I actually wanted to allow URLs like tel:+491234567890 or email:daniel@consideratecode.com.
As it turns out, the validation that Django’s URLValidator applies is too strict for this use case, so it is back to the drawing board for this one. Maybe the Everything class will find its purpose some other day.
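One possible direction, sketched with only the standard library: if the scheme deny list is all that matters, urllib.parse.urlsplit can pull the scheme out of URLs like the tel: example above. The helper has_allowed_scheme is hypothetical and not part of Django, and it deliberately checks nothing beyond the scheme.

```python
from urllib.parse import urlsplit


class Everything:
    """Container that claims to contain everything except the excluded elements."""

    def __init__(self, *, excluding):
        self.excluding = excluding

    def __contains__(self, element):
        return element not in self.excluding


def has_allowed_scheme(url, schemes):
    """Hypothetical helper: check only the URL's scheme, nothing else."""
    # urlsplit lowercases the scheme and returns "" when there is none.
    scheme = urlsplit(url).scheme
    return bool(scheme) and scheme in schemes


valid_schemes = Everything(excluding=["file", "javascript"])

for url in [
    "tel:+491234567890",
    "email:daniel@consideratecode.com",
    "file:///etc/passwd",
    "JavaScript:alert(1)",
    "example.com",
]:
    print(url, has_allowed_scheme(url, valid_schemes))
```

The first two URLs are accepted, the file and javascript URLs are rejected, and the last string is rejected because it has no scheme at all. Unlike URLValidator, this does not verify that the rest of the URL is well-formed.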
P.S.:
The more SEO-friendly title of this post would have been “Django URLValidator with scheme blacklist”, but the tech world is deprecating the terms whitelist and blacklist in favour of more inclusive terminology. (1) (2) (3)