Gyaan

Dataclasses


Ever written a class where __init__ just assigns a bunch of attributes, then added __repr__ and __eq__ by hand? That’s boilerplate. The @dataclass decorator (Python 3.7+) generates all of it for us.

Before vs After

# Without dataclass — lots of boilerplate
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y
    def __repr__(self):
        return f"Point(x={self.x}, y={self.y})"
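    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented  # let Python handle comparison with other types
        return self.x == other.x and self.y == other.y

# With dataclass: same behavior, 3 lines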
from dataclasses import dataclass

@dataclass
class Point:
    x: float
    y: float

That’s it. We get __init__, __repr__, and __eq__ for free.
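
To see the generated methods in action, here is a minimal sketch. The sample values are arbitrary, picked only for illustration:

```python
from dataclasses import dataclass

@dataclass
class Point:
    x: float
    y: float

p1 = Point(1.0, 2.0)
p2 = Point(1.0, 2.0)

print(p1)        # Point(x=1.0, y=2.0)
print(p1 == p2)  # True: compared field by field
print(p1 is p2)  # False: still two separate objects
```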

Default Values

We can set defaults just like function arguments. The only rule: fields with defaults must come after fields without.

@dataclass
class User:
    name: str
    email: str
    active: bool = True  # default value
    role: str = "viewer"

u = User("Manish", "manish@example.com")
print(u)  # User(name='Manish', email='manish@example.com', active=True, role='viewer')
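
If we break the ordering rule, Python complains as soon as the class is defined, not when we create an instance. A small sketch, where BadUser is a hypothetical class that exists only to trigger the error:

```python
from dataclasses import dataclass

try:
    @dataclass
    class BadUser:  # hypothetical class, used only to trigger the error
        name: str
        active: bool = True
        email: str  # no default, but it follows a field that has one
except TypeError as exc:
    ordering_error = str(exc)
    print(ordering_error)  # the message names the offending field
```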

field() and default_factory

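For mutable defaults (lists, dicts, sets), we can’t just use = []. That would be one list shared by every instance, the same gotcha as mutable function defaults, so @dataclass refuses it and raises ValueError when the class is defined. We use field(default_factory=...) instead.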

from dataclasses import dataclass, field

@dataclass
class Team:
    name: str
    members: list = field(default_factory=list)  # new list for each instance
    metadata: dict = field(default_factory=dict)
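
A quick way to convince ourselves that each instance gets its own list, and that a bare = [] is rejected. This is only a sketch: the team names are made up, and BrokenTeam is a hypothetical class used to trigger the error:

```python
from dataclasses import dataclass, field

@dataclass
class Team:
    name: str
    members: list = field(default_factory=list)

a = Team("backend")
b = Team("frontend")
a.members.append("Manish")

print(a.members)  # ['Manish']
print(b.members)  # []: b got its own list

# A bare mutable default is rejected when the class is defined
try:
    @dataclass
    class BrokenTeam:  # hypothetical class, used only to trigger the error
        members: list = []
except ValueError as exc:
    mutable_default_error = str(exc)
    print(mutable_default_error)
```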

The field() function also lets us exclude fields from repr or comparison:

@dataclass
class User:
    name: str
    password: str = field(repr=False)  # hidden in __repr__
    _internal: int = field(default=0, compare=False)  # ignored in __eq__
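
Here is what those two options do in practice, as a short sketch with made-up values:

```python
from dataclasses import dataclass, field

@dataclass
class User:
    name: str
    password: str = field(repr=False)
    _internal: int = field(default=0, compare=False)

u1 = User("Manish", "s3cret", _internal=1)
u2 = User("Manish", "s3cret", _internal=2)

print(u1)        # User(name='Manish', _internal=1)
print(u1 == u2)  # True: _internal is skipped by __eq__
```

Note that repr=False only hides the field from the printed form. The value is still stored on the instance as a normal attribute.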

Frozen Dataclasses (Immutable)

Adding frozen=True makes instances read-only. Any attempt to change an attribute raises FrozenInstanceError.

@dataclass(frozen=True)
class Config:
    host: str
    port: int

c = Config("localhost", 8080)
c.port = 9090  # FrozenInstanceError!

Frozen dataclasses are also hashable by default (as long as every field is itself hashable), so we can use them in sets and as dict keys.
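
A small sketch of both properties together. The timeouts dict is a hypothetical lookup table, and replace() from the dataclasses module is one way to get a changed copy when we can't mutate:

```python
from dataclasses import dataclass, replace, FrozenInstanceError

@dataclass(frozen=True)
class Config:
    host: str
    port: int

c = Config("localhost", 8080)

try:
    c.port = 9090
except FrozenInstanceError:
    print("cannot assign to a frozen instance")

# Hashable, so it works as a dict key or set member
timeouts = {c: 30}  # hypothetical lookup table
print(timeouts[Config("localhost", 8080)])  # 30

# replace() builds a modified copy instead of mutating
c2 = replace(c, port=9090)
print(c2)  # Config(host='localhost', port=9090)
```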

__post_init__

If we need custom logic after initialization, we define __post_init__. Python calls it right after the auto-generated __init__.

@dataclass
class Rectangle:
    width: float
    height: float
    area: float = field(init=False)  # not passed to __init__

    def __post_init__(self):
        self.area = self.width * self.height

r = Rectangle(3, 4)
print(r.area)  # 12 (ints stay ints; annotations are not enforced at runtime)
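
Another common use of __post_init__ is validation. A minimal sketch, assuming we want to reject non-positive sizes (that rule is our own assumption, chosen just for illustration):

```python
from dataclasses import dataclass

@dataclass
class Rectangle:
    width: float
    height: float

    def __post_init__(self):
        # Runs after the generated __init__ has set width and height
        if self.width <= 0 or self.height <= 0:
            raise ValueError("width and height must be positive")

try:
    Rectangle(3, -4)
except ValueError as exc:
    validation_error = str(exc)
    print(validation_error)  # width and height must be positive
```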

slots=True (Python 3.10+)

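Adding slots=True generates __slots__, so instances skip the per-instance __dict__. That means less memory per instance and usually slightly faster attribute access. The trade-off: we can no longer attach attributes that aren't declared as fields.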

@dataclass(slots=True)
class Point:
    x: float
    y: float

Dataclass vs NamedTuple vs Dict

  • dict — use when the structure is dynamic or we’re just passing data around loosely
  • NamedTuple — immutable, lightweight, works as a tuple (can unpack, index). Great for simple records
  • dataclass — mutable by default, supports methods, default factories, inheritance. Best for structured objects
from typing import NamedTuple

class PointNT(NamedTuple):  # immutable, tuple-like
    x: float
    y: float

p = PointNT(1, 2)
print(p[0])  # 1 — works like a tuple
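
To make the comparison concrete, here is a side-by-side sketch. PointDC is a hypothetical name, chosen only so it can sit next to PointNT. The asdict() and astuple() helpers from the dataclasses module convert a dataclass when we do need a dict or tuple:

```python
from dataclasses import dataclass, asdict, astuple
from typing import NamedTuple

class PointNT(NamedTuple):
    x: float
    y: float

@dataclass
class PointDC:  # hypothetical name, to sit next to PointNT
    x: float
    y: float

nt = PointNT(1, 2)
dc = PointDC(1, 2)

x, y = nt            # tuple unpacking works directly
print(nt == (1, 2))  # True: a NamedTuple is a tuple

dc.x = 10            # allowed: dataclasses are mutable by default
print(asdict(dc))    # {'x': 10, 'y': 2}
print(astuple(dc))   # (10, 2)

try:
    nt.x = 10        # NamedTuple fields are read-only
except AttributeError:
    print("cannot modify a NamedTuple field")
```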

In simple language, @dataclass kills the boilerplate. We just declare the fields and Python generates the boring methods for us. Use it whenever we’re building a class that’s mainly about holding data.