This is probably the most classic system design interview question. We’re designing a URL shortening service like TinyURL or bit.ly — take a long URL, give back a short one, and when someone visits the short URL, redirect them to the original. Simple concept, but the details get interesting fast.
Let’s walk through it step by step, the way we’d do it in an actual interview.
Step 1: Requirements
Functional Requirements
- Given a long URL, generate a unique short URL
- When a user visits the short URL, redirect them to the original long URL
- Users can optionally set a custom short link
- Links expire after a default period (configurable)
- Analytics — track how many times a short URL was clicked
Non-Functional Requirements
- High availability — the redirect service can’t go down, or millions of links break
- Low latency — redirects should feel instant (< 50ms)
- Read-heavy — way more people click short links than create them (100:1 read/write ratio)
- Short URLs should be as short as possible
- URLs should not be predictable (so people can’t guess other URLs)
Step 2: Estimation
Let’s put some numbers on this.
Assumptions:
- 500M new URLs created per month
- 100:1 read to write ratio → 50B redirects per month
QPS:
Write QPS = 500M / (30 × 86,400 s) ≈ 193, rounded to ~200 URLs/sec
Read QPS = 200 × 100 = ~20,000 redirects/sec
Peak QPS = ~40,000 redirects/sec (2x average)
Storage (5 years):
Each URL record ≈ 500 bytes (short URL + long URL + metadata)
500M × 12 months × 5 years = 30 Billion URLs
30B × 500 bytes = 15 TB
Cache:
Following the 80/20 rule — 20% of URLs generate 80% of traffic.
Daily read requests = 50B / 30 ≈ 1.7B/day
Cache 20% of daily requests = 1.7B × 0.2 × 500 bytes ≈ 170 GB (an upper bound, since repeated requests for the same URL share one cache entry)
170 GB fits comfortably in a few Redis instances. Nice.
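To double-check the arithmetic, here is a small sketch that recomputes the estimates from the stated assumptions. The variable names are ours, and the 30-day month and 2x peak factor are the same simplifications used above. The unrounded inputs give roughly 193 writes/sec and about 167 GB of cache, which the text rounds to 200 and 170.

```python
# Back-of-the-envelope numbers, recomputed from the assumptions in this section.
SECONDS_PER_MONTH = 30 * 86_400        # 30-day month, as in the text
NEW_URLS_PER_MONTH = 500_000_000
READ_WRITE_RATIO = 100
RECORD_BYTES = 500                     # assumed average record size
YEARS = 5

write_qps = NEW_URLS_PER_MONTH / SECONDS_PER_MONTH
read_qps = write_qps * READ_WRITE_RATIO
peak_read_qps = read_qps * 2           # 2x average, as assumed above

total_urls = NEW_URLS_PER_MONTH * 12 * YEARS
storage_tb = total_urls * RECORD_BYTES / 1e12

daily_reads = NEW_URLS_PER_MONTH * READ_WRITE_RATIO / 30
cache_gb = daily_reads * 0.2 * RECORD_BYTES / 1e9   # 20% of daily requests

print(f"write QPS  ~ {write_qps:,.0f}")
print(f"read QPS   ~ {read_qps:,.0f} (peak ~ {peak_read_qps:,.0f})")
print(f"storage    ~ {storage_tb:.0f} TB over {YEARS} years")
print(f"cache      ~ {cache_gb:.0f} GB")
```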
Step 3: High-Level Design
How the redirect works:
- User visits short.url/abc123
- Load balancer routes to an app server
- App server checks Redis cache for abc123
- Cache hit → return the long URL. Cache miss → query DB, put in cache, return.
- Server responds with a 301 (permanent redirect) or 302 (temporary redirect)
301 vs 302 — which do we pick?
- 301 (Moved Permanently) — The browser caches the redirect. Next time, it goes directly to the long URL without hitting our server. Better for the user, but we lose analytics visibility.
- 302 (Found/Temporary) — The browser always comes back to our server first. We can track every click. More load on our server, but we keep full analytics.
If analytics matter (and for a URL shortener, they do), we go with 302.
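As a rough illustration of that choice, the sketch below builds the status code and headers for a redirect. The function name `build_redirect` and the `track_clicks` flag are hypothetical, not part of any framework; a real handler would hand these values to whatever web server we use.

```python
from http import HTTPStatus


def build_redirect(long_url: str, track_clicks: bool = True) -> tuple[int, dict[str, str]]:
    """Return (status_code, headers) for a short-link redirect.

    track_clicks is an illustrative flag: 302 keeps every click coming
    back to our servers, 301 lets browsers cache the redirect.
    """
    status = HTTPStatus.FOUND if track_clicks else HTTPStatus.MOVED_PERMANENTLY
    return status.value, {"Location": long_url}
```

With the default, `build_redirect("https://example.com/very/long/path")` yields a 302 with the long URL in the `Location` header.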
Step 4: API Design
POST /api/v1/shorten
Body: { "long_url": "https://example.com/very/long/path", "custom_alias": "my-link", "expires_at": "2026-12-31" }
Response: { "short_url": "https://short.url/abc123", "expires_at": "2026-12-31" }
GET /{short_url_key}
Response: HTTP 302 Redirect → Location: https://example.com/very/long/path
GET /api/v1/stats/{short_url_key}
Response: { "total_clicks": 15420, "created_at": "2025-01-15", "long_url": "..." }
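Here is a minimal sketch of what the shorten endpoint might do behind the scenes. Several things are assumptions on our part: the in-memory `store` standing in for the database, the one-year default expiry, the 201/400/409 status codes, and the `next_key` argument (a key supplied by whichever generation strategy we pick later).

```python
from datetime import date, timedelta
from urllib.parse import urlparse

DEFAULT_TTL_DAYS = 365            # assumption: the design does not fix a default
BASE_URL = "https://short.url"    # host used in the examples above

store: dict[str, dict] = {}       # stand-in for the urls table


def shorten(body: dict, next_key: str, today: date) -> tuple[int, dict]:
    """Handle POST /api/v1/shorten and return (status_code, response_body)."""
    long_url = body.get("long_url", "")
    parsed = urlparse(long_url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        return 400, {"error": "long_url must be an absolute http(s) URL"}

    # A custom alias wins over the generated key, but must not be taken.
    key = body.get("custom_alias") or next_key
    if key in store:
        return 409, {"error": "alias already taken"}

    default_expiry = (today + timedelta(days=DEFAULT_TTL_DAYS)).isoformat()
    expires_at = body.get("expires_at") or default_expiry
    store[key] = {"long_url": long_url, "expires_at": expires_at}
    return 201, {"short_url": f"{BASE_URL}/{key}", "expires_at": expires_at}
```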
Step 5: Data Model
-- Main URL table
CREATE TABLE urls (
    id BIGINT PRIMARY KEY AUTO_INCREMENT,
    short_key VARCHAR(7) UNIQUE NOT NULL, -- the "abc123" part; UNIQUE already creates the index used for lookups
    long_url TEXT NOT NULL,
    user_id BIGINT, -- nullable for anonymous users
    created_at TIMESTAMP DEFAULT NOW(),
    expires_at TIMESTAMP
);
-- Analytics table (append-only, write-heavy)
CREATE TABLE click_events (
    id BIGINT PRIMARY KEY AUTO_INCREMENT,
    short_key VARCHAR(7) NOT NULL,
    clicked_at TIMESTAMP DEFAULT NOW(),
    ip_address VARCHAR(45),
    user_agent TEXT,
    referrer TEXT,
    country VARCHAR(2),
    INDEX idx_short_key_time (short_key, clicked_at)
);
We keep the URL table lean for fast reads. The analytics table is append-only: we only ever insert into it, never update. That makes it a good candidate for a time-series store, or for asynchronous writes through a message queue.
Step 6: Deep Dives
Deep Dive 1: Short URL Generation Strategies
This is the heart of the problem. How do we turn a long URL into a short, unique key? We have three main approaches.
Approach A: Hash + Collision Check
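Take the long URL, hash it (MD5 or SHA-256), and take the first 7 characters of the digest. One caveat: 7 hex characters cover only 16^7 ≈ 268M keys, far fewer than the 30B URLs we estimated, so in practice we'd base62-encode the digest before truncating.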
MD5("https://example.com/long/path") = "a1b2c3d4e5f6..."
Short key = "a1b2c3d" (first 7 chars)
Problem: collisions. Two different URLs could produce the same first 7 characters. So we check the DB — if the key exists, we append a counter and rehash. This works but the collision checks add latency and complexity.
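A minimal sketch of the hash-and-retry loop, under a few assumptions of ours: SHA-256 as the hash, a `#counter` suffix as the "append a counter" step, a plain dict standing in for the DB lookup, and the digest base62-encoded so that 7 characters span 62^7 keys rather than 16^7.

```python
import hashlib
import string

ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase


def _base62(n: int) -> str:
    """Encode a non-negative integer in base62."""
    out = []
    while n:
        n, rem = divmod(n, 62)
        out.append(ALPHABET[rem])
    return "".join(reversed(out)) or "0"


def candidate_key(long_url: str, attempt: int, length: int = 7) -> str:
    """Hash the URL (plus a counter on retries) and keep `length` base62 chars."""
    data = long_url if attempt == 0 else f"{long_url}#{attempt}"
    digest = hashlib.sha256(data.encode("utf-8")).digest()
    encoded = _base62(int.from_bytes(digest, "big"))
    return encoded.rjust(length, "0")[-length:]   # low-order characters


def hash_key(long_url: str, existing: dict[str, str]) -> str:
    """Return a key for long_url, rehashing with a counter on collision.

    `existing` maps short key -> long URL and stands in for the DB check.
    """
    attempt = 0
    while True:
        key = candidate_key(long_url, attempt)
        owner = existing.get(key)
        if owner is None:
            existing[key] = long_url
            return key
        if owner == long_url:      # same URL shortened again: reuse its key
            return key
        attempt += 1               # genuine collision: append counter, rehash
```

Each retry costs another lookup, which is where the extra latency comes from.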
Approach B: Auto-Increment ID + Base62 Encoding
Use a database auto-increment ID and convert it to base62 (a-z, A-Z, 0-9 = 62 characters).
ID = 123456789
Base62 = "8M0kX" (123456789 in base 62)
With 7 characters, base62 gives us 62^7 = 3.5 trillion possible URLs. That’s plenty. The problem? URLs are predictable. If someone gets abc123, they know abc124 probably exists too. Also, the auto-increment ID becomes a single point of failure in a distributed system.
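The encoding itself is just repeated division by 62. The sketch below assumes the alphabet order digits, then uppercase, then lowercase, which is the order that reproduces the "8M0kX" example above; other orderings work equally well as long as encode and decode agree.

```python
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
BASE = len(ALPHABET)  # 62


def base62_encode(n: int) -> str:
    """Convert a non-negative integer ID to its base62 string."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n:
        n, rem = divmod(n, BASE)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))


def base62_decode(s: str) -> int:
    """Convert a base62 string back to the integer ID."""
    n = 0
    for ch in s:
        n = n * BASE + ALPHABET.index(ch)
    return n


print(base62_encode(123456789))   # 8M0kX
print(BASE ** 7)                  # 3521614606208 keys with 7 characters
```

Because decoding is just as easy as encoding, anyone holding one key can compute its neighbors, which is the predictability problem in concrete form.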
Approach C: Pre-Generated Key Service (KGS)
A separate service pre-generates millions of unique keys and stores them in a database. When an app server needs a key, it grabs one from the pool.
KGS Database:
┌─────────────┬──────────┐
│ key         │ used     │
├─────────────┼──────────┤
│ "a7Bx2q"    │ false    │
│ "k9Mw3r"    │ false    │
│ "p2Lz8n"    │ true     │ ← already assigned
└─────────────┴──────────┘
Each app server fetches a batch of keys (say 1000) and keeps them in memory. No collision checking needed. No coordination between servers. This is the most scalable approach and the one most interviewers love.
The winner: Pre-Generated Key Service. It’s clean, fast, and eliminates the collision problem entirely.
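To make the batching idea concrete, here is an in-memory sketch. The class names `KeyGenerationService` and `AppServer` are hypothetical, the set stands in for the KGS database, and the lock stands in for whatever atomic "mark as used" operation the real store provides. Keys are random, which also helps with the unpredictability requirement.

```python
import secrets
import string
import threading

ALPHABET = string.ascii_letters + string.digits  # 62 characters


class KeyGenerationService:
    """In-memory stand-in for the KGS and its key table."""

    def __init__(self, pool_size: int, key_length: int = 7) -> None:
        self._lock = threading.Lock()
        self._unused: set[str] = set()
        # Pre-generate random keys; the set guarantees uniqueness.
        while len(self._unused) < pool_size:
            key = "".join(secrets.choice(ALPHABET) for _ in range(key_length))
            self._unused.add(key)

    def take_batch(self, n: int) -> list[str]:
        """Hand out up to n keys; each key is handed out at most once."""
        with self._lock:
            count = min(n, len(self._unused))
            return [self._unused.pop() for _ in range(count)]


class AppServer:
    """Keeps a local batch of keys and refills it when it runs dry."""

    def __init__(self, kgs: KeyGenerationService, batch_size: int = 1000) -> None:
        self._kgs = kgs
        self._batch_size = batch_size
        self._keys: list[str] = []

    def next_key(self) -> str:
        if not self._keys:
            self._keys = self._kgs.take_batch(self._batch_size)
            if not self._keys:
                raise RuntimeError("key pool exhausted")
        return self._keys.pop()
```

One trade-off of this sketch: if a server crashes, the unused keys in its batch are simply lost. With 3.5 trillion possible keys, that's usually acceptable.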
Deep Dive 2: Caching Hot URLs
Our system is extremely read-heavy (100:1). Most traffic goes to a small percentage of popular URLs. This screams “cache me.”
We put Redis between the app servers and the database. The strategy:
- On a redirect request, check Redis first
- Cache hit → return immediately (sub-millisecond)
- Cache miss → query DB → store in Redis with a TTL → return
- Use LRU eviction — when cache is full, kick out the least recently used URL
With our 170 GB estimate, we can use a Redis cluster of 3-4 nodes with replication. The cache hit rate should be 90%+ since URL access follows a power law — a few URLs get the vast majority of clicks.
Cache invalidation: When a URL expires or gets deleted, we remove it from the cache. Simple because URLs are immutable — we never update a short URL to point to a different long URL.
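The read path described here is the cache-aside pattern. The sketch below uses a small in-process LRU cache with TTLs as a stand-in for Redis, and a dict as a stand-in for the database; the names `LRUCache` and `resolve` are ours. In Redis itself, the TTL is set per key and LRU eviction is configured through the `maxmemory-policy` setting (for example `allkeys-lru`).

```python
import time
from collections import OrderedDict
from typing import Callable, Optional


class LRUCache:
    """Tiny LRU cache with per-entry TTL (illustrative stand-in for Redis)."""

    def __init__(self, capacity: int, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._capacity = capacity
        self._ttl = ttl_seconds
        self._clock = clock
        self._data: "OrderedDict[str, tuple[str, float]]" = OrderedDict()

    def get(self, key: str) -> Optional[str]:
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:      # TTL elapsed: treat as a miss
            del self._data[key]
            return None
        self._data.move_to_end(key)          # mark as most recently used
        return value

    def put(self, key: str, value: str) -> None:
        self._data[key] = (value, self._clock() + self._ttl)
        self._data.move_to_end(key)
        if len(self._data) > self._capacity:
            self._data.popitem(last=False)   # evict least recently used

    def delete(self, key: str) -> None:
        """Invalidate an entry, e.g. when a URL expires or is deleted."""
        self._data.pop(key, None)


def resolve(short_key: str, cache: LRUCache, db: dict) -> Optional[str]:
    """Cache-aside lookup: cache first, then DB, then populate the cache."""
    long_url = cache.get(short_key)
    if long_url is None:
        long_url = db.get(short_key)
        if long_url is not None:
            cache.put(short_key, long_url)
    return long_url
```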
Deep Dive 3: Analytics and Click Tracking
Every redirect is a potential analytics event. But we can’t let analytics slow down the redirect. The redirect must be fast — analytics can be eventual.
The approach: async processing with a message queue.
User clicks → App Server sends 302 redirect immediately
→ App Server pushes click event to Kafka/SQS
→ Analytics workers consume from queue
→ Workers batch-insert into click_events table
This way, the user gets their redirect in milliseconds. The analytics data flows through a queue and gets processed in the background. If the analytics system falls behind, clicks queue up but the redirect service stays fast.
For the dashboard, we can pre-aggregate hourly/daily counts in a summary table instead of running expensive COUNT queries on billions of rows.
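Here is a small sketch of that pipeline with everything in memory: `queue.Queue` stands in for Kafka/SQS, a list for the click_events table, and a `Counter` for the hourly summary table. The function names and the batch size of 500 are assumptions for illustration.

```python
import queue
from collections import Counter
from datetime import datetime, timezone

click_queue: queue.Queue = queue.Queue()   # stand-in for Kafka/SQS
click_events: list[dict] = []              # stand-in for the click_events table
hourly_counts: Counter = Counter()         # stand-in for a summary table


def record_click(short_key: str, clicked_at: datetime) -> None:
    """Called on the redirect path: enqueue the event and return at once."""
    click_queue.put_nowait({"short_key": short_key, "clicked_at": clicked_at})


def drain_batch(max_batch: int = 500) -> int:
    """Worker step: pull up to max_batch events, insert them together,
    and bump the pre-aggregated hourly counts. Returns the batch size."""
    batch = []
    while len(batch) < max_batch:
        try:
            batch.append(click_queue.get_nowait())
        except queue.Empty:
            break
    click_events.extend(batch)             # one bulk insert per batch
    for event in batch:
        hour = event["clicked_at"].replace(minute=0, second=0, microsecond=0)
        hourly_counts[(event["short_key"], hour)] += 1
    return len(batch)
```

The dashboard would then read `hourly_counts` (the summary table) instead of scanning raw click rows.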
Step 7: Scaling
Database scaling:
- The URL table is read-heavy → add read replicas
- As data grows past a single DB → shard by short_key hash
- The click_events table grows fast → partition by time (monthly partitions) and archive old data
App server scaling:
- Stateless servers behind a load balancer → horizontally scale by adding more instances
- Each server holds a batch of pre-generated keys in memory → no coordination needed between servers
Cache scaling:
- Start with a single Redis instance, then move to Redis Cluster
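- Distribute keys across cache nodes by key hash: Redis Cluster does this with 16,384 hash slots, while client-side sharding typically uses consistent hashing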
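For client-side sharding across cache nodes, consistent hashing can be sketched as a sorted ring of hash points. The `HashRing` class, the node names, and the choice of 100 virtual nodes per server are all illustrative assumptions.

```python
import bisect
import hashlib


class HashRing:
    """Consistent-hash ring with virtual nodes."""

    def __init__(self, nodes: list[str], vnodes: int = 100) -> None:
        # Each node is placed on the ring at `vnodes` pseudo-random points.
        self._ring = sorted(
            (self._hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    @staticmethod
    def _hash(value: str) -> int:
        digest = hashlib.sha256(value.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def node_for(self, key: str) -> str:
        """Walk clockwise from the key's hash to the next node point."""
        idx = bisect.bisect_right(self._points, self._hash(key))
        return self._ring[idx % len(self._ring)][1]
```

The payoff is that adding a fourth node only moves the keys that now belong to that node (roughly a quarter of them), whereas a plain `hash(key) % num_nodes` scheme would reshuffle most keys and empty the cache.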
Global distribution:
- Deploy app servers in multiple regions
- Use GeoDNS to route users to the nearest region
- Replicate the database across regions (or use a globally distributed DB like CockroachDB)
Handling 40K QPS at peak:
- Load balancer distributes across app servers
- Redis handles ~90% of reads (36K QPS in cache = easy for Redis)
- Only ~4K QPS actually hits the database
- Each DB read is a simple key lookup by indexed short_key — blazing fast
In simple language, a URL shortener is a giant key-value store with a smart key generation strategy. The short key is the hard part — we use a pre-generated key service to avoid collisions. The redirect is the hot path — we cache it aggressively. And analytics go through a queue so they never slow down the user. That’s the whole system.