Metadata in Data Classes (Python version 3.7+)

Data Classes, provided by the dataclasses module, are a versatile way to create user-defined classes with type hints. Metadata can be attached to each field when the class is defined, which is useful for documentation, data dictionary creation and other use cases. If you want to learn more about Data Classes, there is a great talk by Raymond Hettinger. This blog post focuses on the metadata parameter of the field() function in the dataclasses module.

Create a simple class called Stock with the fields smbl, yr, opn, high, low and close:

#Create a Data Class using the @dataclass decorator
from dataclasses import dataclass, field
@dataclass
class Stock:
    smbl: str = field(metadata={'name': 'Stock Symbol', 'description': 'Symbol of US based stocks'})
    yr: int = field(metadata={'name': 'Year', 'description': 'Year in which the stock traded'})
    opn: float = field(metadata={'name': 'Opening Price', 'description': 'The opening price of the stock'})
    high: float
    low: float
    close: float

When defining the class, metadata can optionally be attached to each field using the field() function, as shown above. The metadata argument takes a mapping (typically a dict) or None; if nothing is provided, the field gets an empty read-only mapping by default. Both the metadata specified for smbl, yr and opn and the empty mappings of the remaining fields can be accessed using the fields() function (note the 's' in fields()).

from dataclasses import fields
#Create an instance of the Stock class
msft = Stock('MSFT', 2020, 158.78, 232.86, 132.52, 222.42)
#Pass the instance of the Stock class to the fields() function. A tuple of Field objects is returned
fields(msft)
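
As a small sketch of the points above: fields() also accepts the class itself, not only an instance, and a field declared without field() (such as high) carries an empty mapping as its metadata. The Stock class is repeated so that the snippet runs on its own; the variable names class_fields, field_names and high_metadata are only illustrative.

```python
from dataclasses import dataclass, field, fields

@dataclass
class Stock:
    smbl: str = field(metadata={'name': 'Stock Symbol', 'description': 'Symbol of US based stocks'})
    yr: int = field(metadata={'name': 'Year', 'description': 'Year in which the stock traded'})
    opn: float = field(metadata={'name': 'Opening Price', 'description': 'The opening price of the stock'})
    high: float
    low: float
    close: float

# fields() works on the class as well as on an instance
class_fields = fields(Stock)
field_names = [f.name for f in class_fields]

# high was declared without field(), so its metadata is an empty mapping
high_metadata = class_fields[3].metadata

print(field_names)
print(len(high_metadata))
```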

The fields() function returns a tuple of Field objects. The first element of the tuple is shown below:

#Output of fields(msft)[0]
Field(name='smbl',type=<class 'str'>,default=<dataclasses._MISSING_TYPE object at 0x0000022A15614D30>,default_factory=<dataclasses._MISSING_TYPE object at 0x0000022A15614D30>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({'name': 'Stock Symbol', 'description': 'Symbol of US based stocks'}),_field_type=_FIELD)

The metadata of the fields smbl, yr and opn can be retrieved through the metadata attribute of each Field object. As evident from the output above, this attribute is of type mappingproxy, which is a read-only view of the underlying dictionary. The metadata can therefore be retrieved by iterating over the Field objects and reading the metadata attribute of each one, as shown below:

#Iterate over the Field objects and print the metadata of each field that has any
for field_item in fields(msft):
    if field_item.metadata:
        print(field_item.metadata)

The output is shown below:

{'name': 'Stock Symbol', 'description': 'Symbol of US based stocks'}
{'name': 'Year', 'description': 'Year in which the stock traded'}
{'name': 'Opening Price', 'description': 'The opening price of the stock'}
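
One of the use cases mentioned at the start was data dictionary creation. The following is a minimal sketch of how that might look, not a definitive implementation: build_data_dictionary is a hypothetical helper written for this post, and a shortened version of the Stock class is used to keep the snippet self-contained. The last few lines also illustrate that the mappingproxy cannot be modified in place.

```python
from dataclasses import dataclass, field, fields

@dataclass
class Stock:
    smbl: str = field(metadata={'name': 'Stock Symbol', 'description': 'Symbol of US based stocks'})
    yr: int = field(metadata={'name': 'Year', 'description': 'Year in which the stock traded'})
    high: float

def build_data_dictionary(cls):
    """Hypothetical helper: map each field name to its type name plus any metadata."""
    return {
        f.name: {'type': getattr(f.type, '__name__', str(f.type)), **f.metadata}
        for f in fields(cls)
    }

data_dictionary = build_data_dictionary(Stock)
for name, entry in data_dictionary.items():
    print(name, entry)

# The metadata is exposed as a read-only mappingproxy,
# so assigning to a key raises a TypeError
try:
    fields(Stock)[0].metadata['name'] = 'Ticker'
    metadata_is_read_only = False
except TypeError:
    metadata_is_read_only = True
```

Fields without metadata, such as high, simply end up with only the type entry in the resulting dictionary.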

In addition to the metadata parameter, the field() function takes other parameters such as default, default_factory, init, repr, compare and hash. These parameters are described in the documentation of the dataclasses module.
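
As a rough illustration of how metadata can be combined with some of these parameters, consider the sketch below. The Trade class and its fields qty and tags are hypothetical and exist only for this example.

```python
from dataclasses import dataclass, field, fields

@dataclass
class Trade:  # hypothetical class, used only for illustration
    smbl: str = field(metadata={'name': 'Stock Symbol'})
    # default supplies a fixed default value
    qty: int = field(default=100, metadata={'name': 'Quantity'})
    # default_factory builds a fresh list per instance;
    # compare=False leaves this field out of equality checks
    tags: list = field(default_factory=list, compare=False, metadata={'name': 'Tags'})

t1 = Trade('MSFT')
t2 = Trade('MSFT', tags=['tech'])

print(t1.qty, t1.tags)
print(t1 == t2)  # tags is ignored in the comparison
print(fields(Trade)[1].metadata['name'])
```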

PS: The cover photo is Trillium Lake in Oregon.