Cache Strategies - Redis

When we add a cache in front of a database, we have to decide who talks to whom. Does the app go to the cache first, or to the DB? Who writes to the cache — the app, or the cache itself? These choices shape the four classic caching strategies.

In simple language — think of the cache as a fast intern and the DB as a slow expert. Do you ask the intern first, then go to the expert if they don’t know? Or do you have the intern handle everything and quietly check with the expert? That’s the difference.

1. Cache-Aside (Lazy Loading)

The most common pattern. The app manages everything — it checks the cache first, falls back to the DB on miss, and populates the cache.

Cache-Aside read flow

App ──GET key──▶ Cache
                   │
       ◀─── hit ───┤
                   │
       ◀── miss ───┘
       │
       ├── SELECT ──▶ DB ──▶ data
       │
       └── SET key ──▶ Cache  (so next read is a hit)

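import json

# Assumes `redis` is a connected redis-py client and `db` is your database handle.
def get_user(user_id):
    key = f"user:{user_id}"
    cached = redis.get(key)
    if cached is not None:  # only a missing key counts as a miss
        return json.loads(cached)
    # Miss: read from the DB, then populate the cache with a 5-minute TTL.
    user = db.query("SELECT * FROM users WHERE id = %s", user_id)
    redis.set(key, json.dumps(user), ex=300)
    return user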

Pros: simple, only caches what’s actually used, cache failure doesn’t take down the app. Cons: first read is always a miss (cold cache penalty), stale data unless you invalidate on writes.
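The claim that a cache failure doesn’t take down the app only holds if cache errors are caught and treated as misses. Here is a minimal sketch of that idea. `BrokenCache`, `CacheDown`, and `load_user_from_db` are hypothetical stand-ins so the example runs without a Redis server; with redis-py you would catch `redis.exceptions.RedisError` instead.

```python
import json


class CacheDown(Exception):
    """Stand-in for the connection error a real cache client would raise."""


class BrokenCache:
    """Hypothetical cache client whose every call fails, simulating an outage."""

    def get(self, key):
        raise CacheDown("cache unreachable")

    def set(self, key, value, ex=None):
        raise CacheDown("cache unreachable")


def load_user_from_db(user_id):
    # Hypothetical DB lookup; a real app would run a SELECT here.
    return {"id": user_id, "name": "Ada"}


def get_user(cache, user_id):
    key = f"user:{user_id}"
    try:
        cached = cache.get(key)
        if cached is not None:
            return json.loads(cached)
    except CacheDown:
        pass  # treat an unreachable cache as a miss

    user = load_user_from_db(user_id)
    try:
        cache.set(key, json.dumps(user), ex=300)
    except CacheDown:
        pass  # failing to populate the cache should not fail the read
    return user


user = get_user(BrokenCache(), 42)  # still answered, straight from the DB
```

The trade-off is that during an outage every read lands on the DB, so the DB has to be able to absorb that load.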

This is what most real-world apps use. It’s defensive and predictable.

2. Read-Through

The cache itself talks to the DB. The app only ever talks to the cache.

Read-Through

App ──GET key──▶ Cache ──miss──▶ DB
                   ◀───── data ──────┘
       ◀── data ───┘

This needs a cache layer smart enough to fetch from the DB on miss. Redis on its own doesn’t do this — you’d build it into a client library or use a system like RedisGears, or simply implement it as a wrapper around your cache calls. From the app’s perspective, the cache is the only thing it sees.

Pros: clean app code, cache logic centralized. Cons: the first read is still slow, and the cache layer has to know how to load data from your DB.

In practice, “cache-aside” wrapped in a helper function ends up looking like read-through anyway.
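To make that concrete, here is one possible shape for such a helper: a small wrapper that owns the miss-handling logic, so callers only ever call `get`. This is a sketch, not a standard API. `ReadThroughCache`, `InMemoryCache` (a stand-in for Redis so the example runs anywhere), and `load_user` are all hypothetical names.

```python
import json
import time


class InMemoryCache:
    """Hypothetical stand-in for Redis with get/set and TTL support."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and time.monotonic() >= expires_at:
            del self._data[key]
            return None
        return value

    def set(self, key, value, ex=None):
        expires_at = time.monotonic() + ex if ex is not None else None
        self._data[key] = (value, expires_at)


class ReadThroughCache:
    """Wraps a cache and a loader so callers only ever call get()."""

    def __init__(self, cache, loader, ttl=300):
        self._cache = cache
        self._loader = loader  # called with the key on a miss
        self._ttl = ttl

    def get(self, key):
        cached = self._cache.get(key)
        if cached is not None:
            return json.loads(cached)
        value = self._loader(key)
        self._cache.set(key, json.dumps(value), ex=self._ttl)
        return value


db_calls = []


def load_user(key):
    # Hypothetical loader; a real one would query the DB.
    db_calls.append(key)
    return {"id": int(key.split(":")[1]), "name": "Ada"}


users = ReadThroughCache(InMemoryCache(), load_user)
first = users.get("user:7")   # miss: the loader runs
second = users.get("user:7")  # hit: the loader is skipped
```

After both calls, `db_calls` contains a single entry, which shows the second read never reached the loader.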

3. Write-Through

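On every write, both the cache and the DB are updated synchronously, and the write is acknowledged only after both succeed. In the strict form the cache layer forwards the write to the DB, as in the diagram below; with plain Redis, the app usually performs both writes itself. Either way, the cache stays fresh.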

Write-Through

App ──write──▶ Cache ──▶ DB
                          ◀── ok ──┘
              ◀── ok ────┘
   ◀── ok ───┘

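def update_user(user_id, data):
    # Write the DB first so the cache never holds a value the DB rejected.
    db.execute("UPDATE users SET ... WHERE id = %s", user_id)
    # Then refresh the cached copy (same 5-minute TTL as the read path).
    redis.set(f"user:{user_id}", json.dumps(data), ex=300)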

Pros: cache and DB stay in sync as long as both writes succeed, so reads rarely see stale data. Cons: writes are slower (two hops), and you may cache data that’s never read (writing a key no one ever GETs).

Use this when reads vastly outnumber writes and freshness matters.
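If you want the write path centralized the way the diagram draws it, one option is a small store object that owns both writes. The sketch below is illustrative only. `WriteThroughStore` and `FailingDB` are hypothetical names, and plain dicts stand in for Redis and the database.

```python
import json


class WriteThroughStore:
    """Callers write here; the store updates the DB, then the cache, before returning."""

    def __init__(self, cache, db):
        self._cache = cache  # dict stands in for Redis
        self._db = db        # dict stands in for the database

    def put(self, key, value):
        # DB first: if this raises, the cache is never given unsaved data.
        self._db[key] = value
        self._cache[key] = json.dumps(value)

    def get(self, key):
        cached = self._cache.get(key)
        if cached is not None:
            return json.loads(cached)
        return self._db.get(key)


class FailingDB(dict):
    """Hypothetical DB whose writes always fail."""

    def __setitem__(self, key, value):
        raise IOError("db write failed")


store = WriteThroughStore(cache={}, db={})
store.put("user:1", {"name": "Ada"})
fresh = store.get("user:1")  # served from the cache, already up to date

failing_cache = {}
failing = WriteThroughStore(failing_cache, FailingDB())
put_failed = False
try:
    failing.put("user:2", {"name": "Bob"})
except IOError:
    put_failed = True  # the caller sees the error; the cache stays untouched
```

Writing the DB before the cache means a failed write leaves no phantom entry behind, at the cost of a brief window where the cache still holds the previous value.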

4. Write-Behind (Write-Back)

The app writes to the cache only. The cache asynchronously batches writes to the DB later.

Write-Behind

App ──write──▶ Cache  (instant ack)
                  │
                  └─ batch ─▶ DB  (async, later)

Pros: fastest writes possible from the app’s view, batching reduces DB load. Cons: if the cache dies before flushing, you lose writes. Complex to get right.

You’d use this for high-write workloads where you can tolerate small data loss — analytics counters, event logs, view counts. Don’t use it for orders, payments, anything that must survive a crash.
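The view-count case can be sketched as a counter that acknowledges increments immediately and persists them in batches. This is a simplified, single-process illustration: `WriteBehindCounter` is a hypothetical name, a dict stands in for the DB, and the flush is triggered by batch size rather than by a background worker or timer, which is what a real implementation would more likely use.

```python
from collections import defaultdict


class WriteBehindCounter:
    """Counts in memory and flushes accumulated increments to the DB in batches."""

    def __init__(self, db, batch_size=3):
        self._db = db                     # dict stands in for the database
        self._pending = defaultdict(int)  # increments not yet persisted
        self._pending_writes = 0
        self._batch_size = batch_size

    def incr(self, key):
        # Acknowledged immediately; nothing has touched the DB yet.
        self._pending[key] += 1
        self._pending_writes += 1
        if self._pending_writes >= self._batch_size:
            self.flush()

    def flush(self):
        # One DB update per key, however many increments it received.
        for key, delta in self._pending.items():
            self._db[key] = self._db.get(key, 0) + delta
        self._pending.clear()
        self._pending_writes = 0


db = {}
views = WriteBehindCounter(db, batch_size=3)
views.incr("post:1")
views.incr("post:1")
before_flush = dict(db)  # still empty: both writes are only buffered
views.incr("post:2")     # third write triggers the flush
after_flush = dict(db)
views.incr("post:1")     # buffered again; lost if the process dies now
```

The last line is the risk in miniature: that increment exists only in memory until the next flush.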

Side-by-side

Strategy        Write                       Read                    Notes
Cache-Aside     DB only                     cache → DB              simple, default
Read-Through    DB only                     cache (cache fetches)   centralized
Write-Through   cache + DB sync             cache                   always fresh
Write-Behind    cache only (async to DB)    cache                   fastest, riskiest

Invalidation — the hard part

For cache-aside, you have two options when data changes:

# option 1: delete the cache key (next read repopulates)
db.execute("UPDATE users SET ... WHERE id = %s", uid)
redis.delete(f"user:{uid}")

# option 2: update the cache key in place
db.execute("UPDATE users SET ... WHERE id = %s", uid)
redis.set(f"user:{uid}", json.dumps(new_data), ex=300)

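Option 1 (delete-on-write) is safer. With option 2, two concurrent writers can update the DB in one order and the cache in the other, leaving an old value cached until the TTL expires. Deleting forces the next reader to refetch from the DB. Delete-on-write still has a narrow race (a reader that loaded the old row just before the write can re-cache it just after the delete), which is one more reason to keep a TTL.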
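One way to see why updating in place is riskier is to replay a bad interleaving of two writers by hand. The sketch below is a deterministic simulation, not real concurrency. Plain dicts stand in for the DB and Redis, and the writer names A and B are illustrative.

```python
db = {}
cache = {}

# Two writers, A and B, both use "update the cache key in place" (option 2).
db["user:1"] = "A's value"     # A writes the DB
db["user:1"] = "B's value"     # B writes the DB; this is the final DB state
cache["user:1"] = "B's value"  # B updates the cache
cache["user:1"] = "A's value"  # A updates the cache late, overwriting B

update_in_place_is_stale = cache["user:1"] != db["user:1"]

# Same interleaving with delete-on-write (option 1): both writers only delete.
cache.pop("user:1", None)      # B deletes
cache.pop("user:1", None)      # A deletes late; there is nothing stale to leave behind
delete_leaves_no_stale_value = "user:1" not in cache
```

With deletes, the order in which the two writers reach the cache no longer matters, because neither of them puts a value there.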

There’s a classic saying: “There are only two hard things in computer science: cache invalidation and naming things.” This is exactly the cache invalidation part. There’s no perfect answer; pick the strategy that matches your read/write ratio and freshness needs.

For most apps: cache-aside with delete-on-write + a TTL safety net. The TTL catches anything you forget to invalidate.
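The recommended combination can be sketched end to end as follows. Everything here is an illustrative assumption: `TTLCache` is a hypothetical Redis stand-in with an injectable clock (so expiry can be shown without sleeping), a dict stands in for the DB, and `rename_user` is a made-up write path.

```python
import json


class TTLCache:
    """Hypothetical Redis stand-in; `clock` is injectable so expiry needs no sleeping."""

    def __init__(self, clock):
        self._clock = clock
        self._data = {}

    def get(self, key):
        entry = self._data.get(key)
        if entry is None or self._clock() >= entry[1]:
            self._data.pop(key, None)
            return None
        return entry[0]

    def set(self, key, value, ex):
        self._data[key] = (value, self._clock() + ex)

    def delete(self, key):
        self._data.pop(key, None)


now = [0]  # fake clock, in seconds
cache = TTLCache(clock=lambda: now[0])
db = {1: {"id": 1, "name": "Ada"}}


def get_user(uid):
    key = f"user:{uid}"
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)
    user = db[uid]
    cache.set(key, json.dumps(user), ex=300)
    return user


def rename_user(uid, name):
    db[uid] = {**db[uid], "name": name}  # write the source of truth first
    cache.delete(f"user:{uid}")          # then drop the cached copy


get_user(1)                           # populates the cache
rename_user(1, "Grace")
after_invalidate = get_user(1)        # miss, refetched from the DB

db[1] = {"id": 1, "name": "Edsger"}   # a write path that forgot to invalidate
stale = get_user(1)                   # still the cached "Grace"
now[0] += 300                         # the TTL elapses
after_ttl = get_user(1)               # expired, refetched from the DB
```

The forgotten invalidation is served stale for at most one TTL window, which is the "safety net" behaviour: the TTL bounds how long a missed delete can hurt.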