Ever written a class where __init__ just assigns a bunch of attributes, then added __repr__ and __eq__ by hand? That’s boilerplate. The @dataclass decorator (Python 3.7+) generates all of it for us.
Before vs After
# Without dataclass: lots of boilerplate
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        return f"Point(x={self.x}, y={self.y})"

    def __eq__(self, other):
        # Return NotImplemented for non-Point objects instead of raising AttributeError
        if not isinstance(other, Point):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)

# With dataclass: same behavior in a handful of lines
from dataclasses import dataclass

@dataclass
class Point:
    x: float
    y: float
That’s it. We get __init__, __repr__, and __eq__ for free.
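To see the generated methods in action, here is a minimal sketch. The class name Point2D is made up for this illustration so it doesn't clash with the Point classes above.

```python
from dataclasses import dataclass

@dataclass
class Point2D:  # hypothetical name, used only for this illustration
    x: float
    y: float

p = Point2D(1.0, 2.0)       # generated __init__ accepts positional arguments...
q = Point2D(x=1.0, y=2.0)   # ...or keyword arguments

print(p)                 # Point2D(x=1.0, y=2.0)  (generated __repr__)
print(p == q)            # True: generated __eq__ compares field by field
print(p == (1.0, 2.0))   # False: an object of another type never compares equal
```

Note that the annotations are not enforced at runtime: Point2D(1, "two") would be accepted without complaint.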
Default Values
We can set defaults just like function arguments. The only rule: fields with defaults must come after fields without; otherwise the class definition fails with a TypeError.
@dataclass
class User:
    name: str
    email: str
    active: bool = True  # default value
    role: str = "viewer"

u = User("Manish", "manish@example.com")
print(u)  # User(name='Manish', email='manish@example.com', active=True, role='viewer')
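The ordering rule is checked when the class is created, not when an instance is built. A small sketch of what happens if we break it (BadUser and ordering_error are hypothetical names for this example):

```python
from dataclasses import dataclass

ordering_error = None

try:
    @dataclass
    class BadUser:  # hypothetical class that breaks the ordering rule
        name: str = "anonymous"
        email: str  # no default, but it follows a field that has one
except TypeError as exc:
    # The decorator raises while building __init__, so BadUser is never defined
    ordering_error = str(exc)
    print("TypeError:", ordering_error)
```

The reason is the same as for plain functions: the generated __init__ would need a required parameter after an optional one, which Python does not allow.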
field() and default_factory
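For mutable defaults (lists, dicts, sets), we can’t just write = []. A shared default would be the same gotcha as mutable function defaults, so @dataclass refuses it and raises ValueError when the class is defined. We use field(default_factory=...) instead, which calls the factory once for every new instance.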
from dataclasses import dataclass, field

@dataclass
class Team:
    name: str
    members: list = field(default_factory=list)  # new list for each instance
    metadata: dict = field(default_factory=dict)
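A quick way to convince ourselves that each instance really gets its own list is sketched below. Squad and BrokenSquad are hypothetical stand-ins for the Team class, and mutable_default_rejected is just a flag for the demo.

```python
from dataclasses import dataclass, field

@dataclass
class Squad:  # hypothetical stand-in for the Team class above
    name: str
    members: list = field(default_factory=list)

a = Squad("backend")
b = Squad("frontend")
a.members.append("Manish")

print(a.members)               # ['Manish']
print(b.members)               # []
print(a.members is b.members)  # False: two separate list objects

# A bare mutable default is rejected as soon as the class is defined
mutable_default_rejected = False
try:
    @dataclass
    class BrokenSquad:  # hypothetical class, never successfully created
        members: list = []
except ValueError as exc:
    mutable_default_rejected = True
    print("ValueError:", exc)
```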
The field() function also lets us exclude fields from repr or comparison:
@dataclass
class User:
    name: str
    password: str = field(repr=False)  # hidden in __repr__
    _internal: int = field(default=0, compare=False)  # ignored in __eq__
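Here is a small sketch of what those two options change in practice. The Account class and its login_count field are invented for this example.

```python
from dataclasses import dataclass, field

@dataclass
class Account:  # hypothetical class mirroring the User example
    name: str
    password: str = field(repr=False)                   # left out of __repr__
    login_count: int = field(default=0, compare=False)  # left out of __eq__

a = Account("manish", "s3cret", login_count=1)
b = Account("manish", "s3cret", login_count=99)

print(a)       # Account(name='manish', login_count=1)
print(a == b)  # True: login_count is skipped when comparing
```

Keep in mind that repr=False only hides the value from the printed representation; a.password is still a normal, readable attribute.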
Frozen Dataclasses (Immutable)
Adding frozen=True makes instances read-only. Any attempt to assign to or delete an attribute raises dataclasses.FrozenInstanceError.
@dataclass(frozen=True)
class Config:
    host: str
    port: int

c = Config("localhost", 8080)
c.port = 9090  # FrozenInstanceError!
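Because frozen=True is combined with the default eq=True, @dataclass also generates __hash__. That makes frozen instances usable in sets and as dict keys, as long as every field value is itself hashable.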
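The sketch below puts the frozen behavior together: catching the error, using instances in a set, and making a changed copy with dataclasses.replace(). The Endpoint class is a hypothetical name for this example.

```python
from dataclasses import dataclass, FrozenInstanceError, replace

@dataclass(frozen=True)
class Endpoint:  # hypothetical name, same shape as Config above
    host: str
    port: int

e = Endpoint("localhost", 8080)

try:
    e.port = 9090
except FrozenInstanceError:
    print("cannot assign to a frozen instance")

# Equal field values give equal hashes, so duplicates collapse in a set
seen = {e, Endpoint("localhost", 8080)}
print(len(seen))  # 1

# replace() builds a modified copy instead of mutating the original
e2 = replace(e, port=9090)
print(e2)      # Endpoint(host='localhost', port=9090)
print(e.port)  # 8080
```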
__post_init__
If we need custom logic after initialization, we define __post_init__. The auto-generated __init__ calls it as its last step, after all the fields have been assigned.
@dataclass
class Rectangle:
    width: float
    height: float
    area: float = field(init=False)  # not passed to __init__

    def __post_init__(self):
        self.area = self.width * self.height

r = Rectangle(3, 4)
print(r.area)  # 12 (ints stay ints; annotations don't convert values)
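Another common use for __post_init__ is validation. The following is only a sketch: the Box class and its "must be positive" rule are assumptions made up for this example.

```python
from dataclasses import dataclass, field

@dataclass
class Box:  # hypothetical class for illustration
    width: float
    height: float
    area: float = field(init=False)  # computed, not an __init__ parameter

    def __post_init__(self):
        # Assumed rule for this sketch: both sides must be positive
        if self.width <= 0 or self.height <= 0:
            raise ValueError("width and height must be positive")
        self.area = self.width * self.height

b = Box(3.0, 4.0)
print(b.area)  # 12.0

try:
    Box(-1.0, 4.0)
except ValueError as exc:
    print("rejected:", exc)  # rejected: width and height must be positive

# __post_init__ runs only once, so area is not updated on later changes
b.width = 10.0
print(b.area)  # 12.0
```

If the derived value has to stay in sync with later changes, a regular @property is usually a better fit than a stored field.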
slots=True (Python 3.10+)
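Adding slots=True generates __slots__, so instances no longer carry a per-instance __dict__. That typically reduces memory use and can make attribute access slightly faster. The trade-off is that we can’t add attributes that weren’t declared as fields.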
@dataclass(slots=True)
class Point:
    x: float
    y: float
Dataclass vs NamedTuple vs Dict
- dict: use when the structure is dynamic or we’re just passing data around loosely
- NamedTuple: immutable, lightweight, works as a tuple (can unpack, index). Great for simple records
- dataclass: mutable by default, with support for default factories, __post_init__ hooks, and inheritance. Best for structured objects
from typing import NamedTuple

class PointNT(NamedTuple):  # immutable, tuple-like
    x: float
    y: float

p = PointNT(1, 2)
print(p[0])  # 1, works like a tuple
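To make the differences concrete, here is a side-by-side sketch. PairNT and PairDC are hypothetical names for two classes with identical fields.

```python
from dataclasses import dataclass, astuple
from typing import NamedTuple

class PairNT(NamedTuple):  # hypothetical NamedTuple version
    x: float
    y: float

@dataclass
class PairDC:  # hypothetical dataclass version
    x: float
    y: float

nt = PairNT(1, 2)
dc = PairDC(1, 2)

x, y = nt               # a NamedTuple unpacks directly
x2, y2 = astuple(dc)    # a dataclass needs an explicit conversion

dc.x = 10               # dataclass instances are mutable by default
try:
    nt.x = 10           # NamedTuple fields are read-only
except AttributeError:
    print("NamedTuple fields are read-only")

print(nt == (1, 2))     # True: a NamedTuple is a real tuple
print(dc == (10, 2))    # False: a dataclass only equals its own type
```

The tuple equality cuts both ways: it is handy for quick records, but it also means two unrelated NamedTuple types with the same values compare equal.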
In short, @dataclass kills the boilerplate. We just declare the fields and Python generates the boring methods for us. It’s a good default whenever we’re building a class that’s mainly about holding data.