
Reasons to use dataclasses over Pydantic BaseModel

6 min read
Serhii Hrekov
software engineer, creator, artist, programmer, projects founder

While Pydantic is the industry standard for external data (APIs, JSON parsing), Python's built-in dataclasses are often the better choice for internal data.

To answer your specific questions:

  1. Is it speed? YES. Dataclasses are significantly faster at creating objects (instantiation).
  2. Is it strict data typing? NO. Pydantic is stricter. Dataclasses do not validate types at runtime; they accept whatever you give them.

Here are the 4 most convincing reasons to use Dataclasses over Pydantic in a modern app, with examples for each.

1. Speed: The "Tight Loop" Argument

Pydantic runs a validation engine every time you create an object. Dataclasses just assign the values to attributes. In a "tight loop" (creating millions of objects), Pydantic can become a bottleneck.

The Benchmark Logic (rough orders of magnitude, not measured results):

  • Pydantic: Checks type, converts type, validates constraints. (~1000ns per object)
  • Dataclass: self.x = x. (~100ns per object)

Example:

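from dataclasses import dataclass
from pydantic import BaseModel
import timeit

# 1. The Dataclass
@dataclass(slots=True)  # slots=True (Python 3.10+) makes it even faster
class PointDC:
    x: int
    y: int

# 2. The Pydantic Model
class PointPM(BaseModel):
    x: int
    y: int

# Each call creates 1,000 objects
def loop_dc():
    return [PointDC(i, i) for i in range(1_000)]

def loop_pm():
    return [PointPM(x=i, y=i) for i in range(1_000)]

# Benchmarking 1 million creations: 1,000 calls x 1,000 objects each
print("dataclass:", timeit.timeit(loop_dc, number=1_000))
print("pydantic: ", timeit.timeit(loop_pm, number=1_000))

# Result: Dataclasses are typically much faster here (often cited as 10x - 20x;
# the exact ratio depends on your Pydantic version and hardware).
# Use Dataclasses when processing large lists of data internally.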

2. Predictability: No "Magic" Coercion

Pydantic tries to be helpful by "coercing" data. By default, if you pass the string "42" to an int field, Pydantic converts it to 42 (it offers a strict mode that disables this, but coercion is the default). Dataclasses are "dumb": they keep exactly what you passed. This is often safer for internal logic where silent type conversion could hide a bug.

Example:

from dataclasses import dataclass
from pydantic import BaseModel

@dataclass
class InventoryItemDC:
    name: str
    quantity: int

class InventoryItemPM(BaseModel):
    name: str
    quantity: int

# --- The "Magic" Difference ---

# Pydantic: SILENTLY converts the string "5" to integer 5
item_pm = InventoryItemPM(name="Apple", quantity="5")
print(type(item_pm.quantity))  # <class 'int'> (Magic happened)

# Dataclass: Preserves the string "5" (even though type hint says int)
item_dc = InventoryItemDC(name="Apple", quantity="5")
print(type(item_dc.quantity))  # <class 'str'> (No magic)

# Why use Dataclasses?
# If 'quantity' coming in as a string is a BUG in your code,
# Pydantic hides the bug. Dataclasses let the bug surface so you can fix the root cause.
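
To make "letting the bug surface" concrete, here is a minimal sketch that uses only the standard library. The `restock` helper is hypothetical and not part of the example above; it only illustrates that an un-coerced string fails at its first arithmetic use instead of silently behaving like a number.

```python
from dataclasses import dataclass


@dataclass
class InventoryItemDC:
    name: str
    quantity: int


def restock(item: InventoryItemDC, amount: int) -> int:
    """Hypothetical helper: return the new stock level."""
    return item.quantity + amount


good = InventoryItemDC(name="Apple", quantity=5)
bad = InventoryItemDC(name="Apple", quantity="5")  # wrong type, accepted as-is

new_level = restock(good, 3)

try:
    restock(bad, 3)
    error_message = None
except TypeError as exc:
    # str + int is not defined, so the bad value fails at first use
    error_message = str(exc)

print(new_level)
print(error_message)
```

Note that the error appears where the value is used, not where the object was created. A static type checker such as mypy can often flag the bad call earlier, without any runtime cost.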

3. Zero Dependencies (Library Authors)

If you are writing a library (like a client SDK or a utility tool) that other people will install, you want to keep your "dependency weight" low.

  • Dataclasses: Built into Python. Zero extra install size.
  • Pydantic: Version 2 depends on pydantic-core, a compiled binary extension (Rust), plus a few small helper packages. It adds weight to the installation.

Example: If you are building a simple CLI tool, using Pydantic might add roughly 10-20MB to your Docker image or virtual environment (the exact figure depends on version and platform). Using Dataclasses adds 0MB.
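
For a library, a dataclass plus the standard `json` module often covers basic serialization with no third-party install. The sketch below assumes a hypothetical `ClientConfig` class for an SDK; the class name, its fields, and the example.com URL are illustrative placeholders only.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class ClientConfig:
    base_url: str
    timeout_seconds: float = 10.0
    headers: dict[str, str] = field(default_factory=dict)


config = ClientConfig(
    base_url="https://api.example.com",  # placeholder URL
    headers={"Accept": "application/json"},
)

# asdict() recursively converts the dataclass into plain dicts and lists
serialized = json.dumps(asdict(config), sort_keys=True)

# Round trip: rebuild from parsed JSON (no validation happens here)
restored = ClientConfig(**json.loads(serialized))

print(serialized)
print(restored == config)
```

The trade-off is the one described earlier: nothing checks the parsed JSON, so this fits data your library produced itself rather than untrusted input.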

4. Application Startup Time

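Pydantic models are "expensive" to define. When Python first imports a file containing a BaseModel, Pydantic has to inspect fields, build validators, and compile schemas. Dataclasses are cheap by comparison: the decorator just generates a few methods such as __init__ and __repr__ once per class.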

Example: For a standard web server (Django/FastAPI), this doesn't matter. But for AWS Lambda or CLI tools (where the script runs for 1 second and dies), Pydantic's import overhead can noticeably slow down the "cold start" time.
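
One way to check whether import cost matters for your own script is to time the import directly. The sketch below is a rough helper built on the standard library (`time_import` is a hypothetical name); it times `dataclasses` and only attempts `pydantic` if it happens to be installed.

```python
import importlib
import importlib.util
import time


def time_import(module_name: str) -> float:
    """Return the seconds taken to import module_name in this process.

    Modules that are already imported return almost instantly because of
    the sys.modules cache, so use a fresh interpreter for real numbers.
    """
    start = time.perf_counter()
    importlib.import_module(module_name)
    return time.perf_counter() - start


timings = {"dataclasses": time_import("dataclasses")}

# Only time pydantic if it is installed in this environment
if importlib.util.find_spec("pydantic") is not None:
    timings["pydantic"] = time_import("pydantic")

for name, seconds in timings.items():
    print(f"{name}: {seconds * 1000:.2f} ms")
```

Pointing `time_import` at your own module that defines models also captures the model-building cost described above. For a per-module breakdown, CPython's `-X importtime` flag is the more detailed tool.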


🏁 Summary: The Decision Matrix

| Feature | Use Dataclasses When... | Use Pydantic When... |
| --- | --- | --- |
| Data Source | Trusted (Internal functions, DB results). | Untrusted (User JSON, API payloads). |
| Speed | You are creating 100k+ objects in a loop. | You are processing single requests. |
| Types | You want exact values (no auto-conversion). | You want smart conversion (str → int). |
| Environment | Library code, Lambdas, Scripts. | Web APIs (FastAPI), Config management. |

The Modern Hybrid Approach: Many production Python apps use both.

  • Use Pydantic at the "Edge" (API endpoints) to validate/clean incoming data.
  • Convert that data into Dataclasses for the "Core" (business logic) to pass it around quickly and cheaply.
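
The hand-off from "Edge" to "Core" can be sketched as follows. To keep the example runnable without third-party packages, the validated payload is a plain dict standing in for what a Pydantic v2 model's `model_dump()` would return; `OrderCore`, `to_core`, and the field names are hypothetical.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class OrderCore:
    """Internal representation passed around the business logic."""
    sku: str
    quantity: int


def to_core(validated: dict) -> OrderCore:
    """Build the core dataclass from already-validated edge data.

    Keys the core does not declare (e.g. request metadata) are dropped.
    """
    names = {f.name for f in fields(OrderCore)}
    return OrderCore(**{k: v for k, v in validated.items() if k in names})


# At the edge this dict would come from a Pydantic model, for example
# (assumption: an OrderIn BaseModel with these fields):
#   validated = OrderIn.model_validate(raw_payload).model_dump()
validated = {"sku": "A-1", "quantity": 5, "client_ip": "10.0.0.1"}

order = to_core(validated)
print(order)
```

Making the core class frozen is a design choice, not a requirement: it signals that validated data should not be mutated as it moves through internal code.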

📺 Relevant Video

Pydantic vs Dataclasses - ArjanCodes

This video provides a practical code walkthrough comparing the syntax and use-cases of both, reinforcing the "trusted vs untrusted" distinction.