Python descriptors have been around for a long time, but probably because of the lack of good documentation they are still not widely used nor understood.
Here is what the Python documentation says about descriptors:
In general, a descriptor is an object attribute with "binding behavior", one whose attribute access has been overridden by methods in the descriptor protocol. Those methods are
__get__()
,__set__()
, and__delete__()
. If any of those methods are defined for an object, it is said to be a descriptor.
This simply means that a class with the methods __get__
, __set__
and __delete__
can be bound to another class and these methods
will overwrite the attributes that class is bound to.
OK, this is still confusing. Let's write some code to explain all this. To explain how this thing works we are going to write a "validator" class, similar to the ones you are used to see in web frameworks like Django, or Flask.
We have a class describing a product at a hardware store. For now this class has the name of the product and the quantity in stock. We need to ensure that the quantity is an integer. One way to do that would be to use a property setter.
class Product(object):
@property
def quantity(self):
return self._quantity
@quantity.setter
def quantity(self, value):
if not isinstance(value, int):
raise ValueError('Only integer is allowed')
self._quantity = value
The problem with getters
and setters
is that you need one for
every property in every class in your project.
This is how you solve that problem using descriptors.
class IntValidator(object):
def __get__(self, instance, otype):
return self.value
def __set__(self, instance, value):
if not isinstance(value, int):
raise ValueError('Only integer is allowed')
self.value = value
class Product(object):
quantity = IntValidator()
instock = Product()
instock.name = 'Nails'
instock.quantity = 'twelve'
--------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-14-36dea9a2eacc> in <module>()
20 instock = Product()
21 instock.name = 'Nails'
---> 22 instock.quantity = 'twelve'
<ipython-input-14-36dea9a2eacc> in __set__(self, instance, value)
11 print self.__class__, 'set'
12 if not isinstance(value, int):
---> 13 raise ValueError('Only integer is allowed')
14 self.value = value
15
ValueError: Only integer is allowed
The attribute quantity
only accepts integers.
instock.quantity = 12
print instock.quantity
12
print type(instock.quantity)
int
This looks fantastic, but that version doesn't really work. There are
few gotchas. For python to automatically invoke the __get__
and
__set__
methods, descriptors need to be defined at the class
level. The problem is that all the instances of Product
share the
same instance of IntValidator
, leading to the following kind of
behavior.
class Product(object):
quantity = IntValidator()
instock = Product()
instock.quantity = 12
ordered = Product()
ordered.quantity = 42
print 'instock:', instock.quantity
print 'ordered:', ordered.quantity
instock: 42
ordered: 42
For this to work we need to some bookkeeping and track the data for each instance of the class using that particular descriptor.
The first argument of the descriptor's methods is the caller's instance. We can use a dictionary to save the data of each instance using that argument as key. Like in the following example:
from weakref import WeakKeyDictionary
class IntValidator(object):
def __init__(self, default=None):
self.values = WeakKeyDictionary()
def __get__(self, instance, otype):
return self.values[instance]
def __set__(self, instance, value):
if not isinstance(value, int):
raise ValueError('Only integer is allowed')
self.values[instance] = value
class Product(object):
quantity = IntValidator()
instock = Product()
instock.name = 'Bolts'
instock.quantity = 12
ordered = Product()
ordered.quantity = 42
print 'instock', instock.quantity
instock 12
print 'ordered', ordered.quantity
ordered 42
This will work for as long as the instance can be hashed. For instance
you cannot use the IntValidators
described here in a class that
subclasses a list
, a dict
or a set
.
To work around this problem you can use metaclasses to label each descriptor. It solves the problem of non hashable instances, but it does it by adding all the complexity of metaclasses. I won't cover the details of metaclasses in this article because this is a subject for a entire new blog post. This method covers more than 90% of the use cases you will encounter in your project.