Caching is storing a copy of data in a faster location so we don’t have to fetch it from the slower source every time. Think of it like keeping a sticky note on our desk instead of walking to the filing cabinet every time we need a phone number.
For read-heavy systems, it’s often the single most impactful thing we can do for performance.
Where to Cache
The closer the cache is to the user, the faster the response. A browser cache hit needs no network request at all. A CDN hit avoids the trip to our data center. A Redis hit avoids the slow database query.
Cache Hit vs Cache Miss
- Cache hit — The data is in the cache. We return it immediately. Fast.
- Cache miss — The data is NOT in the cache. We fetch from the source, store it in the cache, then return it. Slow (but next time it’ll be a hit).
The hit ratio, meaning hits divided by total lookups (hits plus misses), tells us how effective our cache is. If 95% of requests are cache hits, our cache is doing great. As a rough rule of thumb, below 80% we should rethink our strategy.
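To see why the hit ratio matters so much, here is a small sketch that computes it and estimates the average response time. The function names are made up for illustration, and the 2 ms and 200 ms latencies are assumed round numbers, not measurements. The latency model is also simplified: a miss is charged only the cost of the source fetch.

```python
def hit_ratio(hits: int, misses: int) -> float:
    """Fraction of lookups that were served from the cache."""
    total = hits + misses
    return hits / total if total else 0.0


def average_latency_ms(ratio: float, cache_ms: float, source_ms: float) -> float:
    """Expected latency if hits cost cache_ms and misses cost source_ms."""
    return ratio * cache_ms + (1 - ratio) * source_ms


ratio = hit_ratio(hits=950, misses=50)
print(ratio)  # 0.95

# Assumed latencies: 2 ms for a cache hit, 200 ms for the slow source.
print(round(average_latency_ms(ratio, cache_ms=2, source_ms=200), 1))
print(round(average_latency_ms(0.80, cache_ms=2, source_ms=200), 1))
```

Under these assumptions a 95% hit ratio averages about 12 ms per request, while 80% averages about 42 ms, so a modest drop in hit ratio more than triples the average latency.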
Cache Eviction Policies
The cache has limited memory. When it’s full and new data comes in, we have to kick something out. The question is: what do we evict?
| Policy | How It Works | Use When |
|---|---|---|
| LRU (Least Recently Used) | Evict the item not accessed for the longest time | General purpose — most common choice |
| LFU (Least Frequently Used) | Evict the item accessed the fewest times | Hot items should stay (e.g., trending content) |
| FIFO (First In, First Out) | Evict the item that was inserted first, regardless of how often it is read | Simple, order-based access patterns |
| TTL (Time To Live) | Items expire after a fixed time | Data that goes stale (API responses, sessions) |
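LRU is the default choice in most system design interviews. It’s simple and works well for most access patterns. TTL is strictly an expiration rule rather than a capacity-based eviction policy, so it is commonly combined with one of the others instead of replacing them.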
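To make LRU concrete, here is a minimal sketch built on Python's `collections.OrderedDict`. The class name `LRUCache` and its `get`/`put` methods are illustrative choices, not a reference implementation, and the sketch ignores thread safety.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used key."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._items: OrderedDict = OrderedDict()

    def get(self, key, default=None):
        if key not in self._items:
            return default
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value) -> None:
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the least recently used


cache = LRUCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")         # "a" is now the most recently used key
cache.put("c", 3)      # cache is full, so "b" is evicted
print(cache.get("b"))  # None
print(cache.get("a"))  # 1
```

For memoizing the results of a pure function, the standard library's `functools.lru_cache` decorator applies the same policy without any hand-written bookkeeping.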
Cache Invalidation Strategies
The hardest problem with caching: keeping the cache in sync with the database. If the database changes but the cache still has the old value, users see stale data.
Cache-Aside (Lazy Loading)
The most common pattern. The application manages the cache directly:
- Read: Check cache first → on a hit, return it; on a miss, read from DB → write to cache → return
- Write: Write to DB → delete from cache (next read will re-populate it)
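Pros: Only caches what’s actually requested. Cache failure doesn’t break the system.
Cons: The first request for each key, and the first after every invalidation, is a miss and therefore slow. There is a brief window between the DB write and the cache delete in which readers can see stale data.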
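The read and write paths of cache-aside can be sketched in a few lines. This is a simplified illustration: two plain dicts stand in for the database and the cache, and the `read`/`write` function names and the `user:1` key are hypothetical.

```python
from typing import Optional

# Hypothetical stand-ins: one dict plays the database, another the cache.
database = {"user:1": "Ada"}
cache = {}


def read(key: str) -> Optional[str]:
    if key in cache:            # cache hit: return immediately
        return cache[key]
    value = database.get(key)   # cache miss: fall back to the source
    if value is not None:
        cache[key] = value      # populate the cache for the next reader
    return value


def write(key: str, value: str) -> None:
    database[key] = value       # update the source of truth first
    cache.pop(key, None)        # then invalidate; the next read re-populates


print(read("user:1"))     # Ada (a miss that fills the cache)
write("user:1", "Grace")
print("user:1" in cache)  # False
print(read("user:1"))     # Grace
```

Deleting the key on write, rather than updating it in place, keeps the write path simple and avoids caching values that nobody reads again.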
Write-Through
Every write goes to the cache AND the database as part of the same operation, and the write is acknowledged only after both have succeeded.
Pros: The cache is updated on every write, so reads don’t see stale values for data written through it.
Cons: Higher write latency (two writes per operation). Cache may fill with data that’s never read.
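A minimal write-through sketch, assuming a plain dict as a stand-in for the database; the `WriteThroughCache` class and its method names are hypothetical. It writes to the store first and the cache second, which is one possible ordering: if the store write raises, the cache is left untouched.

```python
class WriteThroughCache:
    """Every write updates the backing store and the cache before returning."""

    def __init__(self, store: dict) -> None:
        self._store = store  # stand-in for the database
        self._cache = {}

    def write(self, key, value) -> None:
        self._store[key] = value  # persist first
        self._cache[key] = value  # then refresh the cached copy

    def read(self, key):
        if key in self._cache:
            return self._cache[key]
        value = self._store.get(key)  # only reached for keys never written here
        if value is not None:
            self._cache[key] = value
        return value


db = {}
users = WriteThroughCache(db)
users.write("user:1", "Ada")
print(db["user:1"])          # Ada
print(users.read("user:1"))  # Ada
```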
Write-Behind (Write-Back)
Write to the cache first, then asynchronously write to the database later.
Pros: Super fast writes. Great for write-heavy workloads.
Cons: Risk of data loss if the cache crashes before persisting to DB.
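The sketch below illustrates write-behind under two simplifying assumptions: a dict stands in for the database, and an explicit `flush()` call stands in for the background job that would normally persist pending writes. The `WriteBehindCache` class name and the `views:42` key are hypothetical.

```python
class WriteBehindCache:
    """Writes land in the cache at once and reach the store on flush()."""

    def __init__(self, store: dict) -> None:
        self._store = store  # stand-in for the database
        self._cache = {}
        self._dirty = set()  # keys written but not yet persisted

    def write(self, key, value) -> None:
        self._cache[key] = value
        self._dirty.add(key)

    def read(self, key):
        if key in self._cache:
            return self._cache[key]
        return self._store.get(key)

    def flush(self) -> int:
        """Persist all pending writes and return how many keys were written."""
        flushed = 0
        for key in list(self._dirty):
            self._store[key] = self._cache[key]
            flushed += 1
        self._dirty.clear()
        return flushed


db = {}
cache = WriteBehindCache(db)
cache.write("views:42", 1)
cache.write("views:42", 2)  # overwrites the pending value for the same key
print(db)                   # {}
print(cache.flush())        # 1
print(db)                   # {'views:42': 2}
```

Two writes to the same key reach the store as one, which is part of why this pattern suits write-heavy workloads. Anything still marked dirty when the cache process dies is lost, which is the data-loss risk noted above.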
When NOT to Cache
Caching isn’t always the answer:
- Frequently changing data: If data changes every second, cached copies are stale almost as soon as they are stored.
- Low-traffic data: If it’s rarely accessed, most lookups are misses and we’re wasting memory on entries nobody reads.
- Write-heavy workloads: More writes than reads means entries are invalidated before they are ever read, unless the cache is there specifically to absorb writes, as in write-behind.
- Data that must be perfectly consistent: Like account balances. Stale cache = real problems.
Popular Caching Tools
- Redis: In-memory key-value store. Supports data structures (lists, sets, sorted sets). Most popular choice.
- Memcached: Simpler than Redis, pure key-value. Can be slightly faster for simple get/set caching.
- Varnish: HTTP reverse proxy cache. Great for caching entire HTTP responses.
Cache in System Design Interviews
When an interviewer asks “how would you improve performance?”, caching is almost always part of the answer. Common cache use cases:
- Cache database query results to reduce DB load
- Cache user session data for fast authentication
- Cache API responses from third-party services
- Cache computed results (like feed generation or recommendations)
Put simply, caching trades memory for speed. We store a copy of frequently accessed data in a fast place so we don’t keep hammering the slow place. It can be the difference between a 2 ms response (Redis) and a 200 ms response (a database query with joins), though the exact numbers depend on the workload.