Dataclasses & NamedTuple

Most classes we write just hold data. A User with a name and email. A Point with x and y. But every time, we end up writing the same boring __init__, __repr__, and __eq__ methods. Dataclasses fix this by generating all that boilerplate for us.

The Boilerplate Problem

Here’s a plain class that holds data:

class User:
    def __init__(self, name, email, age):
        self.name = name
        self.email = email
        self.age = age

    def __repr__(self):
        return f"User(name={self.name!r}, email={self.email!r}, age={self.age})"

    def __eq__(self, other):
        return isinstance(other, User) and (self.name, self.email, self.age) == (other.name, other.email, other.age)

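That’s nine lines of code just to store three values, and we’d need even more for __hash__, ordering, and so on. With @dataclass, five lines give us __init__, __repr__, and __eq__, and decorator options such as order=True and frozen=True cover ordering and hashing.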

@dataclass Decorator

from dataclasses import dataclass

@dataclass
class User:
    name: str
    email: str
    age: int

That’s it. Python auto-generates __init__, __repr__, and __eq__ for us. We just declare the fields with type annotations. The annotations are not checked at runtime; they only tell @dataclass which names are fields.

u1 = User("Manish", "manish@example.com", 25)
u2 = User("Manish", "manish@example.com", 25)
print(u1)        # User(name='Manish', email='manish@example.com', age=25)
print(u1 == u2)  # True — compares by value, not identity
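
Ordering is not part of the default set of generated methods; it is opt-in through order=True. Here is a minimal sketch, assuming a hypothetical Version class that is not part of the examples above:

```python
from dataclasses import dataclass

# Hypothetical class for illustration only.
# order=True adds __lt__, __le__, __gt__, and __ge__.
@dataclass(order=True)
class Version:
    major: int
    minor: int

versions = [Version(1, 4), Version(0, 9), Version(1, 0)]

# Instances compare field by field in declaration order, like tuples.
ordered = sorted(versions)
print(ordered[0])  # Version(major=0, minor=9)
```

Without order=True, sorted(versions) would raise TypeError because the class has no comparison methods beyond __eq__.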

Default Values and field()

We can set defaults just like function arguments, and as with function arguments, fields without a default must come before fields with one. Mutable defaults (lists, dicts, sets) need field(default_factory=...) to avoid the shared-mutable-default trap.

from dataclasses import dataclass, field

@dataclass
class Team:
    name: str
    members: list[str] = field(default_factory=list)  # each instance gets its own list
    max_size: int = 10  # simple default is fine

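If we tried members: list = [], @dataclass would refuse to build the class and raise ValueError, because a single list would otherwise be shared by every Team instance. default_factory calls list() for each new instance, so each one gets a fresh list.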
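
To see both behaviours side by side, here is a small sketch. The member name and the BrokenTeam class are made up for illustration:

```python
from dataclasses import dataclass, field

@dataclass
class Team:
    name: str
    members: list[str] = field(default_factory=list)

a = Team("red")
b = Team("blue")
a.members.append("Asha")  # hypothetical member name
print(a.members)  # ['Asha']
print(b.members)  # [] (b has its own list)

# A bare mutable default is rejected while the class is being defined.
error = None
try:
    @dataclass
    class BrokenTeam:  # hypothetical class, exists only to trigger the error
        name: str
        members: list = []
except ValueError as exc:
    error = str(exc)

print(error is not None)  # True
```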

__post_init__ for Computed Fields

Sometimes we need a field that’s derived from other fields. __post_init__ runs right after the auto-generated __init__.

@dataclass
class Rectangle:
    width: float
    height: float
    area: float = field(init=False)  # not in __init__, computed instead

    def __post_init__(self):
        self.area = self.width * self.height

r = Rectangle(4.0, 5.0)  # floats in, float out; Rectangle(4, 5) would print 20
print(r.area)  # 20.0

init=False tells the dataclass “don’t accept this in the constructor; I’ll set it myself.”
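
Two consequences are worth checking in a quick sketch: the constructor really does reject a third argument, and the computed value is set only once, so it does not follow later changes to width or height.

```python
from dataclasses import dataclass, field

@dataclass
class Rectangle:
    width: float
    height: float
    area: float = field(init=False)  # excluded from __init__

    def __post_init__(self):
        self.area = self.width * self.height

r = Rectangle(4.0, 5.0)
print(r)  # Rectangle(width=4.0, height=5.0, area=20.0)

# area is not a constructor parameter, so passing it is an error.
rejected = False
try:
    Rectangle(4.0, 5.0, 20.0)
except TypeError:
    rejected = True
print(rejected)  # True

# __post_init__ ran once at creation; area is now stale.
r.width = 10.0
print(r.area)  # 20.0
```

If the value has to track later changes, a @property that computes width * height on access is the usual alternative.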

frozen=True for Immutability

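Adding frozen=True makes the dataclass immutable: assigning to a field after creation raises FrozenInstanceError. Together with the default eq=True, it also makes instances hashable (usable as dict keys or in sets), provided every field value is itself hashable.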

@dataclass(frozen=True)
class Point:
    x: float
    y: float

p = Point(0.0, 0.0)
# p.x = 5.0  # FrozenInstanceError: can't modify a frozen instance
print({p: "origin"})  # works as a dict key because it's hashable
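
A slightly fuller sketch of what frozen=True gives us, including dataclasses.replace for making a modified copy. MutablePoint is a hypothetical class added only for contrast:

```python
from dataclasses import dataclass, FrozenInstanceError, replace

@dataclass(frozen=True)
class Point:
    x: float
    y: float

p = Point(1.0, 2.0)

blocked = False
try:
    p.x = 5.0
except FrozenInstanceError:
    blocked = True
print(blocked)  # True

# replace() builds a new instance; the original is untouched.
moved = replace(p, x=5.0)
print(moved)  # Point(x=5.0, y=2.0)
print(p)      # Point(x=1.0, y=2.0)

# Equal points hash equally, so the set keeps only two entries.
seen = {p, Point(1.0, 2.0), moved}
print(len(seen))  # 2

# Hypothetical contrast: a regular (non-frozen) dataclass is unhashable.
@dataclass
class MutablePoint:
    x: float
    y: float

unhashable = False
try:
    hash(MutablePoint(1.0, 2.0))
except TypeError:
    unhashable = True
print(unhashable)  # True
```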

slots=True and kw_only (Python 3.10+)

slots=True generates __slots__ instead of a per-instance __dict__, which saves memory and blocks attributes that were not declared as fields. kw_only=True makes every field a keyword-only argument in __init__.

@dataclass(slots=True, kw_only=True)
class Config:
    host: str
    port: int = 8080
    debug: bool = False

# Config("localhost")           # TypeError — must use keywords
c = Config(host="localhost")    # works
# c.__dict__                    # AttributeError — uses slots, no __dict__

Typed NamedTuple

NamedTuple is another way to create simple data-holding classes. The key difference: named tuples are tuples. They’re immutable, ordered, and support indexing.

from typing import NamedTuple

class Point(NamedTuple):
    x: float
    y: float

p = Point(1.0, 2.0)
print(p.x)      # 1.0 — access by name
print(p[0])     # 1.0 — access by index (it's a tuple!)
# p.x = 5.0     # AttributeError — tuples are immutable
x, y = p        # unpacking works
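
NamedTuple classes also accept defaults and inherit a few helper methods from collections.namedtuple. Here is a short sketch, with a default for y added purely for illustration:

```python
from typing import NamedTuple

class Point(NamedTuple):
    x: float
    y: float = 0.0  # default added for this example

p = Point(1.0)
print(p)                # Point(x=1.0, y=0.0)
print(p == (1.0, 0.0))  # True (it compares equal to a plain tuple)

# _replace returns a new tuple; the original stays unchanged.
q = p._replace(y=3.0)
print(q)  # Point(x=1.0, y=3.0)

# _asdict maps field names to values.
as_dict = p._asdict()
print(as_dict["x"])  # 1.0
```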

@dataclass vs NamedTuple vs Plain Class

Three Approaches Compared

@dataclass
  • Mutable by default
  • Has __dict__ (or slots)
  • Supports defaults, field(), __post_init__
  • Can be frozen
  • Best for most data classes

NamedTuple
  • Always immutable
  • It IS a tuple
  • Supports indexing and unpacking
  • Hashable by default
  • Best for simple records and interop with tuple APIs

Plain Class
  • Full control
  • Maximum boilerplate
  • Custom __init__ logic
  • Best when we need complex behavior beyond just holding data

Rule of thumb: start with @dataclass. Use NamedTuple for immutable records. Use a plain class only when we need full customization.

Quick Decision Guide

  • Need mutable data with nice defaults? @dataclass
  • Need an immutable, hashable record that works like a tuple? NamedTuple
  • Need complex initialization logic, custom __new__, or the class is more behavior than data? Plain class
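
The "works like a tuple" distinction in the guide above can be made concrete. In this sketch, DataPoint and TuplePoint are hypothetical names chosen to keep the two variants apart:

```python
from dataclasses import dataclass, astuple
from typing import NamedTuple

@dataclass(frozen=True)
class DataPoint:  # hypothetical name
    x: float
    y: float

class TuplePoint(NamedTuple):  # hypothetical name
    x: float
    y: float

d = DataPoint(1.0, 2.0)
t = TuplePoint(1.0, 2.0)

print(isinstance(t, tuple), isinstance(d, tuple))  # True False
print(t == (1.0, 2.0))  # True
print(d == (1.0, 2.0))  # False (dataclasses only equal their own class)

x, y = t             # a NamedTuple unpacks directly
x2, y2 = astuple(d)  # a dataclass needs an explicit conversion
print(x == x2 and y == y2)  # True
```

Both variants are immutable and hashable here; the choice mostly comes down to whether the surrounding code expects a real tuple.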

In simple language, dataclasses and named tuples eliminate the boilerplate of writing __init__, __repr__, and __eq__ by hand. @dataclass is the go-to for most data-holding classes, NamedTuple is great when we want immutable tuple-like records, and plain classes are for when we need full control.