Python: Protocols

Python interpreter recognizes special methods like __method__() and invokes them in hard-wired situations. A group of related such “dunder” or “double underscore” methods are called a protocol. An object that implements a protocol, can be used in idiomatic ways.

Object Protocol

About object creation, initialization, destruction, and representation.

  • __new__(cls [, *args [, **kwargs]])
  • __init__(self [, *args [, **kwargs]])
  • __del__(self)
  • __repr__(self)

x = SomeClass(args) is translated into:

x = SomeClass.__new__(SomeClass, args) # creation
if isinstance(x, SomeClass):
    x.__init__(args) # initialization

It is unusual to implement __new__(), but may be useful in bypassing __init__() or for creating singleton or for caching.

The __del__() method is invoked when an object is about to be garbage collected. Note, del x only decrements reference count, it does not necessarily invoke __del__().

The __repr__() method should be such that eval( __repr__(obj) ) == obj. If such representation isn’t possible (say file handle) the convention is to return a string like: <...message...>.

f = open('foo.txt')
a = repr(f)
# a = "<_io.TextIOWrapper name='foo.txt' mode='r' encoding='UTF-8'>"

Number Protocol

Object supporting mathematical operations.

  • __add__(self, other)
  • __sub__(self, other)
  • __mul__(self, other)
  • __truediv__(self, other)
  • __floordiv__(self, other)
  • __mod__(self, other)
  • __matmul__(self, other): self @ other
  • __divmod__(self, other)
  • __pow__(self, other [, modulo])
  • __lshift__(self, other)
  • __rshift__(self, other)
  • __and__(self, other): self & other
  • __or__(self, other): self | other
  • __xor__(self, other)
  • __neg__(self): -self
  • __pos__(self): +self
  • __invert__(self): ~self
  • __abs__(self)
  • __round__(self, n)
  • __ceil__(self): math.ceil(self)
  • __floor__(self): math.floor(self)
  • __trunc__(self): math.trunc(self)

For __add__() through to __xor__(), there are __radd__() and __iadd__() variants. If __add__(self, other) is invoked on self + other, then __radd__(self, other) is invoked on other + self. The __iadd__(self, other) is for in-place increment like self += other and may be more efficient than self = self + other.

For x + y, first x.__add__(y) is tried and if it fails with NotImplemented, then y.__radd__(x) is tried. If both fail, the entire operation fails. In the special case where y is a subtype of x, y.__radd__(x) is used directly.

>>> x = 314
>>> y = 2.718
>>> x.__add__(y)
NotImplemented
>>> x.__radd__(y)
NotImplemented
>>> y.__add__(x)
316.718
>>> y.__radd__(x)
316.718

Comparison Protocol

  • __bool__(self)
  • __eq__(self, other)
  • __ne__(self, other)
  • __lt__(self, other)
  • __le__(self, other)
  • __gt__(self, other)
  • __ge__(self, other)
  • __hash__(self)

The method __bool__() is used in expression like if (a). If a.__bool__() is not implemented, a.__len__() is used; if the later is not implemented, the object is considered True.

The method __eq__() is for == or != operators. By default __eq__() uses identity comparison. With __eq__() implemented, although not necessary, the method __ne__() can be implemented for specializing != operator.

To evaluate a < b, the Python interpreter first tries a.__lt__(b). If __lt__() isn’t implemented or if it returns NotImplemented (different from NotImplementedError), b.__gt__(a) is used. In one particular case, when b is a subtype of a, b.__gt__(a) is used directly.

To be able use in sort(), min(), or max(), the object must implement __lt__().

For a user-defined class @total_ordering class decorator from functools module can be handy. Given the class has implemented __eq__() + another like __lt__(), this decorator will generate the remaining comparison functions.

__hash__() is used when an object is placed in a set or is used as a key for a dict. It should return integer and the values must be equal for two instances that compare as equal. The method __eq__() must be implemented with __hash__(). For two different instances, if __hash__() gives the same value, __eq__() is used to resolve the conflict.

Conversion Protocol

To convert an object into built-in types like string or number.

  • __str__(self)
  • __bytes__(self)
  • __format__(self, format_spec)
  • __bool__(self)
  • __int__(self)
  • __float__(self)
  • __complex__(self)
  • __index__(self)

When the object is used to index a container like list, __index__() is called. If an object does not implement __int__(), in an expression int(my_object), my_object.__index__() is called.

Python does not perform implicit type conversion. Even if x implements __int__(), the expression x+3 still produces TypeError.

Container Protocol

For container objects.

  • __len__(self)
  • __getitem__(self, key)
  • __setitem__(self, key, value)
  • __delitem__(self, key)
  • __contains__(self, obj)

Iteration Protocol

An object supporting iteration, say via for loop.

  • __iter__(), __aiter__()
  • __next__(), __anext__()
  • __reversed__()

The method __iter__() should return an iterator which in turn should implement __next__(). The iterator should signal end of iteration raising StopIteration exception. The Python interpreter will translate for x in my_iterable_object as follows:

_iter = my_iterable_object.__iter__()
while True:
    try:
        x = _iter.__next__()
    except StopIteration:
        break
    # Execute statements in loop-body using x

The yield expression can be handy when implementing __iter__():

class MyAwesomeRange:
    def __init__(self, start, stop, step):
        self.start = start
        self.stop = stop
        self.step = step
    def __iter__(self):
        x = self.start
        while x < self.stop:
            yield x
            x += self.step

We can then use MyAwesomeRange as below:

numbers = MyAwesomeRange(0.0, 1.0, 0.1)
for x in numbers:
    print(round(x, 1)) # 0.0, 0.1, 0.2, ...

Note, yield makes the __iter__() a generator function and when a call to the generator function does not hit the yield, Python interpreter internally raises StopIteration exception to signal the end of iteration.

Attribute Protocol

  • __getattribute__(self, name)
  • __getattr__(self, name)
  • __setattr__(self, name)
  • __delattr__(self, name)

Function Protocol

Object emulating function.

  • __call__(self, *args)

type‘s implement __call__() to create new instance. Library function functools.partial() creates objects that implement __call__().

Context Manager Protocol

For simplified resource management.

with my_context_manager_object [as var]:
    statements
  • __enter__(self), __aenter__(...)
  • __exit__(self, type, value, traceback), __aexit__(...)

Operators without dunders

  • Logical: and, or, not
  • Identity: is, is not
  • Membership: in, not in
  • Walrus: :=
  • del

Leave a comment