Reasons to use dataclasses over Pydantic's BaseModel
While Pydantic is the industry standard for external data (APIs, JSON parsing), Python's built-in dataclasses are often the better choice for internal data.
To answer your specific questions:
- Is it speed? YES. Dataclasses are significantly faster at creating objects (instantiation).
- Is it stricter data typing? NO. Pydantic is stricter. Dataclasses do not validate types at runtime; they blindly accept whatever you give them.
Here are the 4 most convincing reasons to use Dataclasses over Pydantic in a modern app, with examples for each.
1. Speed: The "Tight Loop" Argument
Pydantic runs a validation engine every time you create an object. A dataclass's generated __init__ just assigns the arguments to attributes. In a "tight loop" (creating millions of objects), Pydantic can become a bottleneck.
The Benchmark Logic:
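- Pydantic: Checks type, converts type, validates constraints. (~1000ns per object)
- Dataclass: Plain attribute assignment (self.x = x). (~100ns per object)
Treat these figures as rough orders of magnitude rather than measurements; the real gap depends on your Pydantic version, field types, and hardware.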
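Example:

    from dataclasses import dataclass
    import timeit

    from pydantic import BaseModel

    # 1. The Dataclass
    @dataclass(slots=True)  # slots=True (Python 3.10+) usually makes it even faster
    class PointDC:
        x: int
        y: int

    # 2. The Pydantic Model
    class PointPM(BaseModel):
        x: int
        y: int

    # Each call builds 1,000 objects
    def loop_dc():
        return [PointDC(i, i) for i in range(1_000)]

    def loop_pm():
        return [PointPM(x=i, y=i) for i in range(1_000)]

    # Benchmarking 1 million creations: 1,000 calls x 1,000 objects each
    print("dataclass:", timeit.timeit(loop_dc, number=1_000))
    print("pydantic: ", timeit.timeit(loop_pm, number=1_000))

    # Result: Dataclasses are typically much faster here (10x - 20x is a rough
    # estimate; the ratio varies with Pydantic version and hardware).
    # Use Dataclasses when processing large lists of data internally.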
2. Predictability: No "Magic" Coercion
Pydantic tries to be helpful by "coercing" data. In its default (lax) mode, if you pass the string "42" to an int field, Pydantic converts it to 42.
Dataclasses are "dumb": they keep exactly what you passed. This is often safer for internal logic where silent type conversion could hide a bug.
Example:

    from dataclasses import dataclass

    from pydantic import BaseModel

    @dataclass
    class InventoryItemDC:
        name: str
        quantity: int

    class InventoryItemPM(BaseModel):
        name: str
        quantity: int

    # --- The "Magic" Difference ---

    # Pydantic (default lax mode): SILENTLY converts the string "5" to integer 5
    item_pm = InventoryItemPM(name="Apple", quantity="5")
    print(type(item_pm.quantity))  # <class 'int'> (Magic happened)

    # Dataclass: Preserves the string "5" (even though type hint says int)
    item_dc = InventoryItemDC(name="Apple", quantity="5")
    print(type(item_dc.quantity))  # <class 'str'> (No magic)

    # Why use Dataclasses?
    # If 'quantity' coming in as a string is a BUG in your code,
    # Pydantic hides the bug. A dataclass keeps the wrong value, so it shows up
    # (in a type checker or at first use) and you can fix the root cause.
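
If you would rather have the bug fail loudly at construction time instead of at first use, a dataclass can opt in to a check in __post_init__. This is only a sketch: the class name CheckedInventoryItem is hypothetical, and the isinstance check assumes every annotation is a plain class such as str or int (it will not work for generics like list[int], or when annotations are stored as strings via `from __future__ import annotations`).

```python
from dataclasses import dataclass, fields


@dataclass
class CheckedInventoryItem:  # hypothetical name for illustration
    name: str
    quantity: int

    def __post_init__(self) -> None:
        # Runs right after the generated __init__ has assigned the fields.
        # Assumption: each field's annotation is a plain class (str, int, ...).
        for f in fields(self):
            value = getattr(self, f.name)
            if not isinstance(value, f.type):
                raise TypeError(
                    f"{f.name} must be {f.type.__name__}, "
                    f"got {type(value).__name__}"
                )


ok = CheckedInventoryItem(name="Apple", quantity=5)
print(ok)

try:
    CheckedInventoryItem(name="Apple", quantity="5")  # no coercion: rejected
except TypeError as exc:
    print("rejected:", exc)
```

Note the difference from Pydantic's default behaviour: nothing is converted. The string "5" is rejected, not turned into 5, and the check costs a little of the speed advantage, so it makes most sense for objects you do not create in a tight loop.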
3. Zero Dependencies (Library Authors)
If you are writing a library (like a client SDK or a utility tool) that other people will install, you want to keep your "dependency weight" low.
- Dataclasses: Built into Python. Zero extra install size.
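- Pydantic: Version 2 depends on pydantic-core, a compiled binary extension written in Rust, plus a few smaller packages. It adds weight to the installation.
Example: If you are building a simple CLI tool, using Pydantic might add roughly 10-20MB to your Docker image or virtual environment (the exact size depends on version and platform). Using Dataclasses adds 0MB.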
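
To give a feel for how far the standard library gets you, here is a minimal sketch of a JSON round trip using only dataclasses and json. The ReleaseInfo class and the to_json/from_json helpers are hypothetical names, and from_json deliberately does no validation, so it is only suitable for data you already trust.

```python
from dataclasses import asdict, dataclass
import json


@dataclass
class ReleaseInfo:  # hypothetical payload type for illustration
    name: str
    version: str
    downloads: int


def to_json(info: ReleaseInfo) -> str:
    # asdict() converts the dataclass (recursively) into plain dicts.
    return json.dumps(asdict(info), sort_keys=True)


def from_json(text: str) -> ReleaseInfo:
    # No validation or coercion: unknown or missing keys raise TypeError,
    # and wrong value types are kept as-is.
    return ReleaseInfo(**json.loads(text))


original = ReleaseInfo(name="mytool", version="1.2.0", downloads=42)
encoded = to_json(original)
restored = from_json(encoded)
print(encoded)
print(restored == original)
```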
4. Application Startup Time
Pydantic models are "expensive" to define. When Python first imports a file containing a BaseModel, Pydantic has to inspect fields, build validators, and compile schemas.
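Dataclasses are cheap by comparison: the decorator only generates a few methods such as __init__, __repr__, and __eq__ when the class is defined.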
Example: For a standard web server (Django/FastAPI), this doesn't matter. But for AWS Lambda or CLI tools (where the script runs for 1 second and dies), Pydantic's import overhead can noticeably slow down the "cold start" time.
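
If cold start matters to you, measure it instead of guessing. The sketch below uses a hypothetical helper, cold_start_seconds, that launches a fresh interpreter and times how long a statement takes to run, startup included. It compares an empty script with one that imports dataclasses; if Pydantic is installed in your environment, you can pass "import pydantic" the same way.

```python
import subprocess
import sys
import time


def cold_start_seconds(statement: str, repeats: int = 3) -> float:
    """Best-of-N wall time for a fresh interpreter to run `statement`."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        # A new process each time, so nothing is cached in sys.modules.
        subprocess.run([sys.executable, "-c", statement], check=True)
        timings.append(time.perf_counter() - start)
    return min(timings)


baseline = cold_start_seconds("pass")
with_dataclasses = cold_start_seconds("import dataclasses")

print(f"bare interpreter:   {baseline * 1000:.1f} ms")
print(f"import dataclasses: {with_dataclasses * 1000:.1f} ms")
# Assumption: Pydantic is installed if you uncomment the next line.
# print(f"import pydantic:    {cold_start_seconds('import pydantic') * 1000:.1f} ms")
```

For a per-module breakdown, Python's built-in `-X importtime` option (for example `python -X importtime -c "import dataclasses"`) prints how long each imported module took.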
Summary: The Decision Matrix
| Feature | Use Dataclasses When... | Use Pydantic When... |
|---|---|---|
| Data Source | Trusted (Internal functions, DB results). | Untrusted (User JSON, API payloads). |
| Speed | You are creating 100k+ objects in a loop. | You are processing single requests. |
| Types | You want exact values (no auto-conversion). | You want smart conversion (str → int). |
| Environment | Library code, Lambdas, Scripts. | Web APIs (FastAPI), Config management. |
The Modern Hybrid Approach: Many production Python apps use both.
- Use Pydantic at the "Edge" (API endpoints) to validate/clean incoming data.
- Convert that data into Dataclasses for the "Core" (business logic) to pass it around quickly and cheaply.
Relevant Video
Pydantic vs Dataclasses - ArjanCodes
This video provides a practical code walkthrough comparing the syntax and use-cases of both, reinforcing the "trusted vs untrusted" distinction.
