Pydantic vs Dataclasses

Python dataclasses are fantastic. Pydantic is fantastic. It is a tough choice if we are indeed confronted with choosing one or the other. I would say that comparing these two great modules is like comparing pears with apples: similar in some regards, but different overall.

Pydantic’s arena is data parsing and sanitization, while a dataclass is a fast and memory-efficient (especially using slots, Python 3.10+) general-purpose data container.

However, sometimes a code dependency seems to force the choice on us. A great example is FastAPI, which is built on pydantic. If our app is simple, we may just as well write all the data models in pydantic and avoid any compatibility issues. Another example is when we inherit a codebase that already uses dataclasses. Why should we miss out on dataclasses’ performance when using FastAPI? How can we convert our dataclasses schema to pydantic when we need to implement a web API using FastAPI? By the way, FastAPI is fantastic too.

dataclass (no defaults) to pydantic

The problem:

from dataclasses import dataclass

import pydantic


@dataclass(slots=True)
class Person:
    name: str
    age: int


# convert into

class PydanticPerson(pydantic.BaseModel):
    name: str
    age: int

To solve this, we use pydantic’s utility for creating models dynamically, pydantic.create_model. We need to supply this function with a name for the dynamically created model. We also pass keyword arguments, one per attribute: each keyword is the attribute’s name in the new model, and its value is a tuple of the attribute’s type and default value (if there is no default value, use ... instead).

To get the information we need for the pydantic model, we use the fields utility in the dataclasses module; using fields, we can access the attributes’ properties.

from dataclasses import fields

import pydantic

# get attribute names and types from dataclass into pydantic format
field_kwargs = {_field.name: (_field.type, ...) for _field in fields(Person)}

# create pydantic model dynamically
DynamicPydanticPerson = pydantic.create_model(
    "DynamicPydanticPerson", **field_kwargs
)

Now we can create an instance of our dynamically generated pydantic model:

# instance dynamic pydantic
dynamic_pydantic_person = DynamicPydanticPerson(name="Diego", age=33)
print(dynamic_pydantic_person)
print(type(dynamic_pydantic_person))

Printing the model and its type gives:

name='Diego' age=33
<class 'pydantic.main.DynamicPydanticPerson'>

So we are golden here.
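
For reference, here is a small stdlib-only sketch of what the fields utility exposes and of the dictionary that ends up being unpacked into create_model. It stops before calling pydantic, so it only illustrates the intermediate structure; the Person class mirrors the one above without slots.

```python
from dataclasses import dataclass, fields


@dataclass
class Person:
    name: str
    age: int


# each Field object carries the attribute's name and annotated type
for f in fields(Person):
    print(f.name, f.type)

# the same information in the (type, default) shape described above,
# with ... marking "no default"
field_kwargs = {f.name: (f.type, ...) for f in fields(Person)}
print(field_kwargs)
```

The keys of field_kwargs become the attribute names of the dynamic model, which is why the dictionary is unpacked with ** in the call above.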

pydantic (no defaults) into dataclass

The problem:

from dataclasses import dataclass

import pydantic


class PydanticPerson(pydantic.BaseModel):
    name: str
    age: int


# convert into


@dataclass(slots=True)
class Person:
    name: str
    age: int

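We now use the dataclasses utility for creating classes dynamically, dataclasses.make_dataclass, which requires the name of the dynamically created class and a list of tuples, one per attribute; each tuple should contain the attribute’s name and type. The attribute information comes from the model’s __fields__ mapping (a pydantic v1 API, which the snippets here assume).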

from dataclasses import make_dataclass

# get attribute names and types from pydantic into dataclass format
dataclass_fields = [
    (_field.name, _field.type_)
    for _field in PydanticPerson.__fields__.values()
]

# create dataclass dynamically
DynamicPerson = make_dataclass("DynamicPerson", dataclass_fields, slots=True)

We can check if the class is, in fact, a dataclass with is_dataclass from the dataclasses module. So we do that and instantiate the class:

from dataclasses import is_dataclass

print(is_dataclass(DynamicPerson))

dynamic_person = DynamicPerson(name="Diego", age=33)
print(dynamic_person)

prints:

True
DynamicPerson(name='Diego', age=33)

That is the output we expected.

dataclass with defaults to pydantic

Now we turn to the case where our dataclass has default values, and we want to convert it to pydantic.

The problem:

from dataclasses import dataclass

import pydantic


@dataclass(slots=True)
class Person:
    age: int
    name: str = "Jane"


# convert into

class PydanticPerson(pydantic.BaseModel):
    age: int
    name: str = "Jane"

The idea is similar to the no-defaults case; however, now we need to replace ... with the default value. To check whether a dataclass field has no default value, we check if the field’s default is an instance of the _MISSING_TYPE class from the dataclasses module (kind of a hack here, but we need to get things done).

from dataclasses import fields, _MISSING_TYPE

import pydantic

# get attribute names and types from dataclass into pydantic format
pydantic_field_kwargs = dict()
for _field in fields(Person):
    # check if field has default value
    if isinstance(_field.default, _MISSING_TYPE):
        # no default
        default = ...
    else:
        default = _field.default

    pydantic_field_kwargs[_field.name] = (_field.type, default)


# create pydantic model dynamically
DynamicPydanticPerson = pydantic.create_model(
    "DynamicPydanticPerson", **pydantic_field_kwargs
)

We continue to test our DynamicPydanticPerson:

# instance dynamic pydantic default name
dynamic_pydantic_person = DynamicPydanticPerson(age=33)

# instance dynamic pydantic
dynamic_pydantic_person = DynamicPydanticPerson(age=33, name="Diego")

Printing the models and their types:

Using the default:

age=33 name='Jane'
<class 'pydantic.main.DynamicPydanticPerson'>

Overriding the default:

age=33 name='Diego'
<class 'pydantic.main.DynamicPydanticPerson'>

Exactly what we expected.
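
As a possible alternative to the private _MISSING_TYPE check, the dataclasses module also exposes a public MISSING sentinel that can be compared by identity. The sketch below uses only the standard library; describe_default and the tags attribute are hypothetical names introduced for illustration. It also looks at default_factory, which the snippets above do not handle.

```python
from dataclasses import MISSING, dataclass, field, fields


@dataclass
class Person:
    age: int
    name: str = "Jane"
    tags: list = field(default_factory=list)  # hypothetical extra attribute


def describe_default(f):
    """Classify a dataclass field as required, plain default, or factory."""
    if f.default is not MISSING:
        return ("default", f.default)
    if f.default_factory is not MISSING:
        # a factory has to be called once per instance, so keep the callable
        return ("factory", f.default_factory)
    return ("required", None)


for f in fields(Person):
    print(f.name, describe_default(f))
```

A "required" field would map to ... and a "default" field to its value, as before. A "factory" field would need to be handed to pydantic as a factory rather than as a plain value, for example through pydantic.Field(default_factory=...); that step is left out here.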

pydantic with defaults into dataclass

The problem:

from dataclasses import dataclass

import pydantic


class PydanticPerson(pydantic.BaseModel):
    age: int
    name: str = "John"


# convert into


@dataclass(slots=True)
class Person:
    age: int
    name: str = "John"

Notice the default values.

This case is very similar to the one where there are no defaults. If an attribute has a default value, the tuple per attribute has three elements instead of two, the third element being the default value (_field.required means no default):

from dataclasses import make_dataclass

# get attribute names and types from pydantic into dataclass format
dataclass_fields = []
for _field in PydanticPerson.__fields__.values():
    if _field.required:
        field_tuple = (_field.name, _field.type_)
    else:
        field_tuple = (_field.name, _field.type_, _field.default)
    
    dataclass_fields.append(field_tuple)

# create dataclass dynamically
DynamicPerson = make_dataclass("DynamicPerson", dataclass_fields, slots=True)

Testing that it is, in fact, a dataclass, and creating instances with the default and with the default overridden:

from dataclasses import is_dataclass


print(is_dataclass(DynamicPerson))

dynamic_person_default = DynamicPerson(age=33)
print(dynamic_person_default)

dynamic_person = DynamicPerson(name="Diego", age=33)
print(dynamic_person)

prints:

True
DynamicPerson(age=33, name='John')
DynamicPerson(age=33, name='Diego')

We are good to go.
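
The tuple shapes accepted by make_dataclass can also be tried out without pydantic. The following stdlib-only sketch uses hypothetical names (Example, Bad, tags) and leaves out slots so that it runs on Python versions before 3.10. It also shows two things worth keeping in mind when converting: a mutable default has to go through field(default_factory=...), and attributes without defaults must come before attributes with defaults.

```python
from dataclasses import field, is_dataclass, make_dataclass

Example = make_dataclass(
    "Example",
    [
        ("age", int),                                  # two elements: required
        ("name", str, "John"),                         # three elements: default
        ("tags", list, field(default_factory=list)),   # mutable default
    ],
)

print(is_dataclass(Example))
example = Example(age=33)
print(example)  # Example(age=33, name='John', tags=[])

# a required attribute after one with a default is rejected by dataclasses
try:
    make_dataclass("Bad", [("name", str, "John"), ("age", int)])
    ordering_rejected = False
except TypeError as exc:
    ordering_rejected = True
    print(exc)
```

Since a model may declare a required attribute after one with a default, a converter along these lines might need to reorder the tuples before calling make_dataclass.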

Building the tools: dataclass to pydantic

Our code snippets for converting dataclasses to pydantic models were successful, both with and without defaults. Now it is time to wrap them up in a more usable tool, a function that does the conversion for us:

from dataclasses import fields, _MISSING_TYPE
from typing import Any, Optional

import pydantic


def convert_flat_dataclass_to_pydantic(
    dcls: type, name: Optional[str] = None
) -> type[pydantic.BaseModel]:
    if name is None:
        name_ = f"Pydantic{dcls.__name__}"
    else:
        name_ = name
    return pydantic.create_model(  # type: ignore
        name_,
        **_get_pydantic_field_kwargs(dcls),
    )


def _get_pydantic_field_kwargs(dcls: type) -> dict[str, tuple[type, Any]]:
    # get attribute names and types from dataclass into pydantic format
    pydantic_field_kwargs = dict()
    for _field in fields(dcls):
        # check if field has default value
        if isinstance(_field.default, _MISSING_TYPE):
            # no default
            default = ...
        else:
            default = _field.default

        pydantic_field_kwargs[_field.name] = (_field.type, default)
    return pydantic_field_kwargs

Below is an example:

from dataclasses import dataclass

@dataclass(slots=True)
class Person:
    age: int
    name: str = "Jane"

# create class
PydanticPerson = convert_flat_dataclass_to_pydantic(Person)

# instantiate
pydantic_person = PydanticPerson(age=33)


print(PydanticPerson.__name__)

print(pydantic_person)

Building the tools: pydantic to dataclass

The function for converting pydantic to dataclasses:

from dataclasses import make_dataclass, Field
from collections.abc import Iterable
from typing import Any, Optional

import pydantic



def convert_flat_pydantic_to_dataclass(
    pydantic_cls: type[pydantic.BaseModel],
    name: Optional[str] = None,
    slots: bool = True,
) -> type:
    if name is None:
        name_ = f"DataClass{pydantic_cls.__name__}"
    else:
        name_ = name
    return make_dataclass(
        name_,
        _get_dataclass_fields(pydantic_cls),
        slots=slots,
    )


def _get_dataclass_fields(
    pydantic_cls: type[pydantic.BaseModel],
) -> Iterable[str | tuple[str, type] | tuple[str, type, Field[Any]]]:
    # get attribute names and types from pydantic into dataclass format
    dataclass_fields = []
    for _field in pydantic_cls.__fields__.values():
        if _field.required:
            field_tuple = (_field.name, _field.type_)
        else:
            field_tuple = (  # type: ignore
                _field.name,
                _field.type_,
                _field.default,
            )

        dataclass_fields.append(field_tuple)
    return dataclass_fields

The following code is an example of using the function:

import pydantic


class Person(pydantic.BaseModel):
    age: int
    name: str = "John"

# class
DataClassPerson = convert_flat_pydantic_to_dataclass(Person)

# instance
dataclass_person = DataClassPerson(age=33)


print(DataClassPerson.__name__)

print(dataclass_person)

FastAPI example

At last, an example using FastAPI. This simple example shows how easy it is to use a dataclass with FastAPI. To make this example work, save the code from the two previous sections into a file named “dataclass_to_pydantic.py” and place it in the same directory where you run the following example. The example itself should be saved as “fast_api_example.py”; the name matters because the script starts the uvicorn server by referring to "fast_api_example:app".

We create a dataclass FooQuery and then convert it to pydantic, which is passed as the type for the foo endpoint:

"""fast_api_example.py"""

from dataclasses import dataclass

import uvicorn  # pip install uvicorn
from fastapi import FastAPI

from dataclass_to_pydantic import convert_flat_dataclass_to_pydantic

app = FastAPI()


@dataclass
class FooQuery:
    foo: str
    bar: str


PydanticFooQuery = convert_flat_dataclass_to_pydantic(FooQuery)


@app.get("/foo")
async def get_foo(query: PydanticFooQuery) -> PydanticFooQuery:
    return query


# ----------------------------------------------------------------
if __name__ == "__main__":
    uvicorn.run(
        "fast_api_example:app", host="localhost", port=8080, reload=True
    )

To test it, we execute the script and use the requests module to do a simple test of the foo endpoint:

import requests

url = "http://localhost:8080/foo"

query = dict(foo="foo_query", bar="bar_query")
response = requests.get(url, json=query)
print(response.json())  # {'foo': 'foo_query', 'bar': 'bar_query'}

Everything works as expected.

Final Words

Now we know a simple way to convert flat pydantic models to dataclasses and vice versa. Such conversions may be handy in several situations, including using dataclasses with FastAPI.

However, I would suggest first thinking carefully about the architecture of your app and trying to avoid unnecessary conversions. Both pydantic and dataclasses have a lot of features and customizations, most of which would get lost in translation.

Finally, regarding nested models, their conversion is tricky. With the tools presented in this post, it is very straightforward to convert to an instance of the nested model. But to convert into a type (the uninstantiated class), we must first declare all of the nested models’ conversions. As I said, that part is tricky.
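
To make the "declare the nested conversions first" point a little more concrete, here is a rough stdlib-only sketch that works out an order in which nested dataclasses would have to be converted, innermost first. It does not perform any conversion; conversion_order, Address and the nested Person are hypothetical names, and only attributes annotated directly with a dataclass are followed (containers such as list[Address] and string annotations are ignored).

```python
from dataclasses import dataclass, fields, is_dataclass
from typing import List, Optional


@dataclass
class Address:
    city: str


@dataclass
class Person:
    name: str
    address: Address


def conversion_order(dcls: type, seen: Optional[List[type]] = None) -> List[type]:
    """Return dcls and the dataclasses nested in it, innermost first."""
    seen = [] if seen is None else seen
    for f in fields(dcls):
        nested = f.type
        # follow only attributes whose annotation is itself a dataclass type
        if isinstance(nested, type) and is_dataclass(nested) and nested not in seen:
            conversion_order(nested, seen)
    if dcls not in seen:
        seen.append(dcls)
    return seen


order = conversion_order(Person)
print([cls.__name__ for cls in order])  # ['Address', 'Person']
```

Each class in the resulting list could then be converted in turn, with the annotation of a nested attribute replaced by the already converted type; that replacement step is the tricky part and is not covered here.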

This post is licensed under CC BY 4.0 by the author.