Understanding Python Reflection: Build a Custom dataclass from Scratch
A simple search on YouTube will show you many videos titled "Data classes are amazing." However, this article is not about how amazing data classes are. Instead, it focuses on the power and elegance of reflection in Python and on how simple type annotations can help you create your own dataclass.
Reflection
Reflection is a powerful feature in programming that allows a program to inspect and interact with its own structure and behavior at runtime. In simpler terms, reflection allows your code to examine itself. It can look at the types, attributes, and methods of objects and classes dynamically, and even modify them.
Some common examples of reflection used in Python are setattr, getattr, callable, issubclass, and inspect.getsource.
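As a quick illustration, here is a minimal sketch of the first four of those built-ins in action. The Animal and Dog classes and the attribute names are made up for this example.

```python
class Animal:
    def speak(self):
        return "..."


class Dog(Animal):
    def speak(self):
        return "Woof"


dog = Dog()

# setattr / getattr: write and read an attribute whose name is given as a string
setattr(dog, "name", "Rex")
name = getattr(dog, "name")
nickname = getattr(dog, "nickname", "unknown")  # default used when the attribute is missing

# callable: check whether an object can be called like a function
speak_is_callable = callable(getattr(dog, "speak"))
name_is_callable = callable(name)

# issubclass: inspect the class hierarchy at runtime
dog_is_animal = issubclass(Dog, Animal)

print(name, nickname, speak_is_callable, name_is_callable, dog_is_animal)
# Output: Rex unknown True False True
```

Because the attribute names are plain strings here, they could just as well come from a dictionary, a config file, or a class's annotations.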
Type annotations
Type annotations have been part of Python for some time now. However, since Python is dynamically typed, these annotations aren't enforced by the interpreter. Instead, they are used by static type checkers like mypy and by editor tooling such as language servers (LSP). I believe that combining these type annotations with reflection can be a powerful way to build your own data validators. In fact, Pydantic is developed based on these concepts.
def add(no1: int, no2: int) -> int:
    return no1 + no2

add.__annotations__
# Output - {'no1': <class 'int'>, 'no2': <class 'int'>, 'return': <class 'int'>}
class Vehicle:
    mileage: float
    top_speed: float
    odometer_reading: int

Vehicle.__annotations__
# Output - {'mileage': <class 'float'>, 'top_speed': <class 'float'>, 'odometer_reading': <class 'int'>}
The examples above show that type annotations are stored in the __annotations__ dunder attribute of a function or class, as a dictionary that maps each name to its annotated type.
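To make the validator idea concrete, here is a minimal sketch that combines __annotations__ with getattr and isinstance. The validate helper is hypothetical and written only for this article. It assumes every annotation is a plain class such as int or float, so it would not handle generics like list[int] or annotations stored as strings.

```python
def validate(obj):
    """Return error messages for attributes that do not match the class annotations."""
    errors = []
    for name, expected in type(obj).__annotations__.items():
        value = getattr(obj, name, None)
        # Assumption: `expected` is a plain class, so isinstance() can check it
        if not isinstance(value, expected):
            errors.append(
                f"{name}: expected {expected.__name__}, got {type(value).__name__}"
            )
    return errors


class Vehicle:
    mileage: float
    top_speed: float
    odometer_reading: int


vehicle = Vehicle()
vehicle.mileage = 42.0
vehicle.top_speed = 120.0
vehicle.odometer_reading = "45000"  # wrong type on purpose

print(validate(vehicle))
# Output: ['odometer_reading: expected int, got str']
```

Note that isinstance is strict here: an int such as 42 would not pass a float annotation, which is one of many details a real validation library has to decide on.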
Dataclass
from dataclasses import dataclass

@dataclass
class Vehicle:
    mileage: float
    top_speed: float
    odometer_reading: int

vehicle = Vehicle(42, 120, 45000)
print(vehicle)
# Output: Vehicle(mileage=42, top_speed=120, odometer_reading=45000)

vehicle.__dict__
# Output: {'mileage': 42, 'top_speed': 120, 'odometer_reading': 45000}
The above code shows how a dataclass works. We can see that various special methods are generated, including a constructor that takes the annotated fields as parameters and stores them as instance variables.
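One way to confirm which methods were generated is to compare a decorated class with an undecorated one, again using reflection. This is only a sketch: PlainVehicle is a made-up class for the comparison, and the check is limited to three method names that the dataclass decorator adds by default.

```python
from dataclasses import dataclass, fields


class PlainVehicle:
    mileage: float
    top_speed: float


@dataclass
class Vehicle:
    mileage: float
    top_speed: float


# vars(cls) lists only what is defined on the class itself, not inherited from object
generated = sorted(
    name
    for name in ("__init__", "__repr__", "__eq__")
    if name in vars(Vehicle) and name not in vars(PlainVehicle)
)
print(generated)
# Output: ['__eq__', '__init__', '__repr__']

# dataclasses.fields() exposes the fields the decorator collected from the annotations
print([f.name for f in fields(Vehicle)])
# Output: ['mileage', 'top_speed']
```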
None of this is built into Python's class syntax; the dataclass decorator from the standard library's dataclasses module does all of this work. Now we will create our own minimal dataclass decorator that performs the same tasks.
Implementation
Now let's see what happens when we don't use any dataclass decorator.
class Vehicle:
    mileage: float
    top_speed: float
    odometer_reading: int

vehicle = Vehicle(42, 120, 45000)
# Output(error): TypeError: Vehicle() takes no arguments
Without the dataclass decorator, special methods like __init__ are not implemented. Using reflection and annotations, we will create these special methods ourselves.
For this, we will first create a decorator named datakilas.
def datakilas(cls):
    annotations = cls.__annotations__
    print(f"{annotations=}")
    # logic to dynamically assign __init__ method into cls
    return cls

@datakilas
class Vehicle:
    mileage: float
    top_speed: float

# Output: annotations={'mileage': <class 'float'>, 'top_speed': <class 'float'>}
Now that the decorator can see the annotations, the next task is to dynamically create the constructor (__init__) and string representation (__str__) methods for the decorated class.
def datakilas(cls):
    annotations = cls.__annotations__

    # Define the __init__ method for the class
    def init_func(self, **kwargs):
        # Set every annotated field; fields that are not passed default to None
        for key in annotations:
            setattr(self, key, kwargs.get(key))

    # Define the __str__ method
    def str_func(self):
        # Create a string representation of the class with its attributes
        attr_str = ', '.join(f'{key}={getattr(self, key)}' for key in annotations)
        return f'{cls.__name__}({attr_str})'

    # Add the __init__ and __str__ methods to the class attributes
    cls.__init__ = init_func
    cls.__str__ = str_func
    return cls
In the above program, we dynamically create the __init__ and __str__ dunder methods using the class annotations. Now, using the new datakilas decorator, we can easily create a simple dataclass.
@datakilas
class Vehicle:
    mileage: float
    top_speed: float

vehicle = Vehicle(mileage=15, top_speed=200)
print(vehicle)
# Output: Vehicle(mileage=15, top_speed=200)

print(vehicle.__dict__)
# Output: {'mileage': 15, 'top_speed': 200}
The current implementation expects keyword arguments in the class constructor, and any field that is not passed is set to None. To understand it better, you can update the implementation to accept positional arguments as well.
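If you want to compare notes after trying that exercise, here is one possible sketch. The name datakilas_positional is hypothetical, and the error messages only imitate the style of Python's own TypeError messages. Positional values are matched to fields in annotation order, and missing fields still default to None.

```python
def datakilas_positional(cls):
    # Field names in declaration order, taken from the class annotations
    field_names = list(cls.__annotations__)

    def init_func(self, *args, **kwargs):
        if len(args) > len(field_names):
            raise TypeError(
                f"{cls.__name__}() takes at most {len(field_names)} arguments "
                f"({len(args)} given)"
            )
        # Pair positional values with field names in order
        values = dict(zip(field_names, args))
        for key, value in kwargs.items():
            if key not in field_names:
                raise TypeError(
                    f"{cls.__name__}() got an unexpected keyword argument '{key}'"
                )
            if key in values:
                raise TypeError(
                    f"{cls.__name__}() got multiple values for argument '{key}'"
                )
            values[key] = value
        # Missing fields default to None, as in the keyword-only version
        for key in field_names:
            setattr(self, key, values.get(key))

    def str_func(self):
        attr_str = ", ".join(f"{key}={getattr(self, key)}" for key in field_names)
        return f"{cls.__name__}({attr_str})"

    cls.__init__ = init_func
    cls.__str__ = str_func
    return cls


@datakilas_positional
class Vehicle:
    mileage: float
    top_speed: float


print(Vehicle(15, 200))
# Output: Vehicle(mileage=15, top_speed=200)
print(Vehicle(15, top_speed=200))
# Output: Vehicle(mileage=15, top_speed=200)
```

The standard library's dataclass takes a different route and generates an __init__ with a real signature, which is why its error messages and editor hints are more precise than what *args and **kwargs can offer.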
Conclusion
This way, we can easily create a basic version of a complex feature like dataclass using Python's simple reflection abilities. We didn't need any specialized tooling such as the inspect module; built-ins like setattr and getattr, together with __annotations__, were enough to update a class's behavior at runtime and dynamically add a constructor and other methods to it. I think this serves as a good example of the power of reflection in Python, showing how little code it takes to express fairly sophisticated logic.