Design a URL Shortener (TinyURL)

This is probably the most classic system design interview question. We’re designing a URL shortening service like TinyURL or bit.ly — take a long URL, give back a short one, and when someone visits the short URL, redirect them to the original. Simple concept, but the details get interesting fast.

Let’s walk through it step by step, the way we’d do it in an actual interview.

Step 1: Requirements

Functional Requirements

  • Given a long URL, generate a unique short URL
  • When a user visits the short URL, redirect them to the original long URL
  • Users can optionally set a custom short link
  • Links expire after a default period (configurable)
  • Analytics — track how many times a short URL was clicked

Non-Functional Requirements

  • High availability — the redirect service can’t go down, or millions of links break
  • Low latency — redirects should feel instant (< 50ms)
  • Read-heavy — way more people click short links than create them (100:1 read/write ratio)
  • Short URLs should be as short as possible
  • URLs should not be predictable (so people can’t guess other URLs)

Step 2: Estimation

Let’s put some numbers on this.

Assumptions:

  • 500M new URLs created per month
  • 100:1 read to write ratio → 50B redirects per month

QPS:

Write QPS = 500M / (30 × 86,400) ≈ 193, rounded to ~200 URLs/sec
Read QPS  = 200 × 100 = ~20,000 redirects/sec
Peak QPS  = ~40,000 redirects/sec (2x average)

Storage (5 years):

Each URL record ≈ 500 bytes (short URL + long URL + metadata)
500M × 12 months × 5 years = 30 Billion URLs
30B × 500 bytes = 15 TB

Cache:

Following the 80/20 rule — 20% of URLs generate 80% of traffic.
Daily read requests = 50B / 30 ≈ 1.7B/day
Cache 20% of daily requests = 1.7B × 0.2 × 500 bytes ≈ 170 GB (an upper bound, since many requests hit the same URL)

170 GB fits comfortably in a few Redis instances. Nice.
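The arithmetic above is easy to double-check in a few lines. This is a quick sketch that uses only the assumptions already stated (500M URLs per month, 100:1 reads, 500 bytes per record, 30-day months); the variable names are ours. Exact values come out slightly below the rounded figures in the text.

```python
# Back-of-envelope numbers from the assumptions above (30-day month).
SECONDS_PER_MONTH = 30 * 86_400

new_urls_per_month = 500_000_000
read_write_ratio = 100
record_bytes = 500

# Traffic
write_qps = new_urls_per_month / SECONDS_PER_MONTH
read_qps = write_qps * read_write_ratio
peak_read_qps = 2 * read_qps  # assumed 2x peak factor, as in the text

# Storage over 5 years
total_urls_5y = new_urls_per_month * 12 * 5
storage_tb = total_urls_5y * record_bytes / 1e12

# Cache sized at 20% of one day's requests
daily_reads = new_urls_per_month * read_write_ratio / 30
cache_gb = daily_reads * 0.2 * record_bytes / 1e9

print(f"write QPS  ~ {write_qps:,.0f}")
print(f"read QPS   ~ {read_qps:,.0f} (peak ~ {peak_read_qps:,.0f})")
print(f"storage    ~ {storage_tb:,.0f} TB")
print(f"cache size ~ {cache_gb:,.0f} GB")
```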

Step 3: High-Level Design

[Figure: URL Shortener high-level architecture. Clients (Browser / Mobile) → Load Balancer → App Servers 1..N → Cache (Redis, hot URLs) and Database (all URLs)]

Write flow: Client → LB → App Server → DB (generate short URL)
Read flow: Client → LB → App Server → Cache (hit?) → DB (on miss) → 302 Redirect

How the redirect works:

  1. User visits short.url/abc123
  2. Load balancer routes to an app server
  3. App server checks Redis cache for abc123
  4. Cache hit → return the long URL. Cache miss → query DB, put in cache, return.
  5. Server responds with a 301 (permanent redirect) or 302 (temporary redirect)

301 vs 302 — which do we pick?

  • 301 (Moved Permanently) — The browser caches the redirect. Next time, it goes directly to the long URL without hitting our server. Better for the user, but we lose analytics visibility.
  • 302 (Found/Temporary) — The browser always comes back to our server first. We can track every click. More load on our server, but we keep full analytics.

If analytics matter (and for a URL shortener, they do), we go with 302.
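Here's a minimal sketch of the redirect steps above. Plain dicts stand in for Redis and the database, the function name `resolve` is ours, and returning 404 for an unknown key is an assumption (the flow above doesn't say what happens in that case).

```python
from http import HTTPStatus


def resolve(short_key, cache, db):
    """Return (status, location) for GET /{short_key}.

    `cache` and `db` are plain dicts standing in for Redis and the urls table.
    """
    long_url = cache.get(short_key)        # step 3: check the cache
    if long_url is None:
        long_url = db.get(short_key)       # step 4: miss, so query the DB
        if long_url is None:
            return HTTPStatus.NOT_FOUND, None  # assumed behavior for unknown keys
        cache[short_key] = long_url        # warm the cache for the next request
    return HTTPStatus.FOUND, long_url      # step 5: 302 keeps every click visible


db = {"abc123": "https://example.com/very/long/path"}
cache = {}
status, location = resolve("abc123", cache, db)
print(int(status), location)
```

Swapping `HTTPStatus.FOUND` for `HTTPStatus.MOVED_PERMANENTLY` is all it takes to switch to 301, which is why the choice is mostly a product decision rather than an engineering one.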

Step 4: API Design

POST /api/v1/shorten
Body: { "long_url": "https://example.com/very/long/path", "custom_alias": "my-link", "expires_at": "2026-12-31" }
Response: { "short_url": "https://short.url/abc123", "expires_at": "2026-12-31" }

GET /{short_url_key}
Response: HTTP 302 Redirect → Location: https://example.com/very/long/path

GET /api/v1/stats/{short_url_key}
Response: { "total_clicks": 15420, "created_at": "2025-01-15", "long_url": "..." }
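Before generating a key, the shorten endpoint would normally validate the request body. The sketch below shows one way to do that; the specific rules (http/https only, alias limited to 1-16 characters from letters, digits, "-" and "_") and the function name are assumptions for illustration, not part of the API above.

```python
import re
from urllib.parse import urlsplit

# Assumed alias rules: 1-16 characters, letters, digits, "-" and "_".
ALIAS_RE = re.compile(r"[A-Za-z0-9_-]{1,16}")


def validate_shorten_request(body):
    """Return a list of error strings; an empty list means the body is valid."""
    errors = []

    parts = urlsplit(body.get("long_url") or "")
    if parts.scheme not in ("http", "https") or not parts.netloc:
        errors.append("long_url must be an absolute http(s) URL")

    alias = body.get("custom_alias")
    if alias is not None and not ALIAS_RE.fullmatch(alias):
        errors.append("custom_alias has invalid characters or length")

    return errors


print(validate_shorten_request({
    "long_url": "https://example.com/very/long/path",
    "custom_alias": "my-link",
}))
```

Rejecting non-http schemes up front matters for a redirect service, since we don't want to hand out short links that point at things like `javascript:` URLs.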

Step 5: Data Model

-- Main URL table
CREATE TABLE urls (
    id          BIGINT PRIMARY KEY AUTO_INCREMENT,
    short_key   VARCHAR(7) UNIQUE NOT NULL,   -- the "abc123" part; UNIQUE already creates the index used for lookups
    long_url    TEXT NOT NULL,
    user_id     BIGINT,                        -- nullable for anonymous users
    created_at  TIMESTAMP DEFAULT NOW(),
    expires_at  TIMESTAMP
);

-- Analytics table (append-only, write-heavy)
CREATE TABLE click_events (
    id          BIGINT PRIMARY KEY AUTO_INCREMENT,
    short_key   VARCHAR(7) NOT NULL,
    clicked_at  TIMESTAMP DEFAULT NOW(),
    ip_address  VARCHAR(45),
    user_agent  TEXT,
    referrer    TEXT,
    country     VARCHAR(2),
    INDEX idx_short_key_time (short_key, clicked_at)
);

We keep the URL table lean for fast reads. The analytics table is append-only — we’re only ever inserting into it, never updating. This is a perfect candidate for a time-series approach or a message queue that writes asynchronously.

Step 6: Deep Dives

Deep Dive 1: Short URL Generation Strategies

This is the heart of the problem. How do we turn a long URL into a short, unique key? We have three main approaches.

Approach A: Hash + Collision Check

Take the long URL, hash it (MD5 or SHA-256), and take the first 7 characters.

MD5("https://example.com/long/path") = "a1b2c3d4e5f6..."
Short key = "a1b2c3d" (first 7 chars)

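Problem: collisions. Two different URLs could produce the same first 7 characters. So we check the DB, and if the key already belongs to a different URL, we append a counter and rehash. This works, but the collision checks add latency and complexity. Note also that 7 hex characters cover only 16^7 ≈ 268M keys, far fewer than the 30B URLs we estimated, so a real implementation would encode the digest in base62 before truncating.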
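A rough sketch of the hash-and-retry loop looks like this. A dict stands in for the DB, the `#counter` suffix is just one possible way to rehash, and the function names are ours. It keeps the hex prefix from the example above for readability, even though 7 hex characters give a small key space.

```python
import hashlib


def candidate_key(long_url, attempt=0, length=7):
    """Hash the URL (plus a counter on retries) and keep the first `length` hex chars."""
    salted = long_url if attempt == 0 else f"{long_url}#{attempt}"
    return hashlib.md5(salted.encode("utf-8")).hexdigest()[:length]


def shorten(long_url, store, max_attempts=10):
    """Claim a key for long_url in `store` (a dict standing in for the DB)."""
    for attempt in range(max_attempts):
        key = candidate_key(long_url, attempt)
        # setdefault plays the role of INSERT with a unique constraint:
        # it only writes if the key is free, and tells us who owns it.
        owner = store.setdefault(key, long_url)
        if owner == long_url:
            return key  # new key, or the same URL shortened again
    raise RuntimeError("too many collisions")


store = {}
key = shorten("https://example.com/long/path", store)
print(key, "->", store[key])
```

One side effect worth noting: shortening the same URL twice returns the same key, which may or may not be what we want once per-user links and expiry come into play.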

Approach B: Auto-Increment ID + Base62 Encoding

Use a database auto-increment ID and convert it to base62 (a-z, A-Z, 0-9 = 62 characters).

ID = 123456789
Base62 = "8M0kX"   (123456789 in base 62)

With 7 characters, base62 gives us 62^7 = 3.5 trillion possible URLs. That’s plenty. The problem? URLs are predictable. If someone gets abc123, they know abc124 probably exists too. Also, the auto-increment ID becomes a single point of failure in a distributed system.
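Base62 encoding is just repeated division by 62. Here's a small sketch; the alphabet order (digits, then uppercase, then lowercase) is an assumption chosen because it reproduces the "8M0kX" example above, and other orderings work equally well as long as encode and decode agree.

```python
# Assumed alphabet order: 0-9, A-Z, a-z.
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
BASE = len(ALPHABET)  # 62


def base62_encode(n):
    """Convert a non-negative integer ID to its base62 string."""
    if n < 0:
        raise ValueError("ID must be non-negative")
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n > 0:
        n, remainder = divmod(n, BASE)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))  # most significant digit first


def base62_decode(s):
    """Convert a base62 string back to the integer ID."""
    n = 0
    for ch in s:
        n = n * BASE + ALPHABET.index(ch)
    return n


print(base62_encode(123456789))  # 8M0kX
print(62 ** 7)                   # 3521614606208 keys with 7 characters
```

Running the IDs 1, 2, 3 through `base62_encode` makes the predictability problem obvious: consecutive IDs produce consecutive keys.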

Approach C: Pre-Generated Key Service (KGS)

A separate service pre-generates millions of unique keys and stores them in a database. When an app server needs a key, it grabs one from the pool.

KGS Database:
┌─────────────┬──────────┐
│ key         │ used     │
├─────────────┼──────────┤
│ "a7Bx2q"    │ false    │
│ "k9Mw3r"    │ false    │
│ "p2Lz8n"    │ true     │  ← already assigned
└─────────────┴──────────┘

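Each app server fetches a batch of keys (say 1000) and keeps them in memory. The KGS marks the batch as used in the same transaction that hands it out, so no two servers ever receive the same key. That means no collision checking at write time and no coordination between app servers. If a server crashes, its unused keys are simply lost, which is an acceptable cost given the size of the key space. Of the three approaches, this one scales most easily, and it's the one most interviewers like to hear.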
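To make the batch idea concrete, here's an in-memory sketch. The class names, the 6-character key length (matching the table above), and the use of a lock in place of a DB transaction are all assumptions for illustration; a real KGS would persist the pool and run replicated.

```python
import secrets
import string
import threading

ALPHABET = string.ascii_letters + string.digits  # 62 characters


class KeyGenerationService:
    """In-memory stand-in for the KGS and its key table."""

    def __init__(self, pool_size, key_length=6):
        self._lock = threading.Lock()
        self._unused = set()
        # Pre-generate random keys; the set silently drops duplicates.
        while len(self._unused) < pool_size:
            self._unused.add(
                "".join(secrets.choice(ALPHABET) for _ in range(key_length))
            )
        self._used = set()

    def take_batch(self, n):
        """Hand out up to n keys and mark them used in one atomic step."""
        with self._lock:
            count = min(n, len(self._unused))
            batch = [self._unused.pop() for _ in range(count)]
            self._used.update(batch)
            return batch


class AppServer:
    """Keeps a local batch of keys and refills it from the KGS when empty."""

    def __init__(self, kgs, batch_size=1000):
        self._kgs = kgs
        self._batch_size = batch_size
        self._keys = []

    def next_key(self):
        if not self._keys:
            self._keys = self._kgs.take_batch(self._batch_size)
            if not self._keys:
                raise RuntimeError("key pool exhausted")
        return self._keys.pop()


kgs = KeyGenerationService(pool_size=5000)
server_a, server_b = AppServer(kgs), AppServer(kgs)
print(server_a.next_key(), server_b.next_key())
```

Because the keys come from a random generator rather than a counter, this approach also covers the "not predictable" requirement from Step 1.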

The winner: Pre-Generated Key Service. It’s clean, fast, and eliminates the collision problem entirely.

Deep Dive 2: Caching Hot URLs

Our system is extremely read-heavy (100:1). Most traffic goes to a small percentage of popular URLs. This screams “cache me.”

We put Redis between the app servers and the database. The strategy:

  1. On a redirect request, check Redis first
  2. Cache hit → return immediately (sub-millisecond)
  3. Cache miss → query DB → store in Redis with a TTL → return
  4. Use LRU eviction — when cache is full, kick out the least recently used URL

With our 170 GB estimate, we can use a Redis cluster of 3-4 nodes with replication. The cache hit rate should be 90%+ since URL access follows a power law — a few URLs get the vast majority of clicks.

Cache invalidation: When a URL expires or gets deleted, we remove it from the cache. Simple because URLs are immutable — we never update a short URL to point to a different long URL.
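In production Redis handles TTLs and LRU eviction for us, but the behavior is easier to see in a tiny in-process model. The `UrlCache` class below is a hypothetical stand-in, not a Redis client; the injectable `clock` parameter exists only so the expiry logic can be exercised without sleeping.

```python
import time
from collections import OrderedDict


class UrlCache:
    """Toy cache with per-entry TTL and LRU eviction."""

    def __init__(self, capacity, ttl_seconds, clock=time.monotonic):
        self._capacity = capacity
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = OrderedDict()  # short_key -> (long_url, expires_at)

    def get(self, short_key):
        entry = self._entries.get(short_key)
        if entry is None:
            return None
        long_url, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[short_key]       # TTL passed: treat as a miss
            return None
        self._entries.move_to_end(short_key)   # mark as most recently used
        return long_url

    def put(self, short_key, long_url):
        self._entries[short_key] = (long_url, self._clock() + self._ttl)
        self._entries.move_to_end(short_key)
        if len(self._entries) > self._capacity:
            self._entries.popitem(last=False)  # evict the least recently used

    def invalidate(self, short_key):
        """Called when a URL expires or is deleted."""
        self._entries.pop(short_key, None)


cache = UrlCache(capacity=2, ttl_seconds=3600)
cache.put("abc123", "https://example.com/very/long/path")
print(cache.get("abc123"))
```

On a miss, the caller reads the database and calls `put`, which is the cache-aside pattern described in steps 1-3.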

Deep Dive 3: Analytics and Click Tracking

Every redirect is a potential analytics event. But we can’t let analytics slow down the redirect. The redirect must be fast — analytics can be eventual.

The approach: async processing with a message queue.

User clicks → App Server sends 302 redirect immediately
           → App Server pushes click event to Kafka/SQS
           → Analytics workers consume from queue
           → Workers batch-insert into click_events table

This way, the user gets their redirect in milliseconds. The analytics data flows through a queue and gets processed in the background. If the analytics system falls behind, clicks queue up but the redirect service stays fast.
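The flow above can be sketched with the standard library, using `queue.Queue` as a stand-in for Kafka/SQS and a list as a stand-in for the click_events table. The function names, the `STOP` sentinel, and the batch size and flush interval are all assumptions for illustration.

```python
import queue
import threading
import time

STOP = object()  # sentinel that tells the worker to flush and exit


def handle_redirect(click_queue, short_key, long_url):
    """Hot path: enqueue the click and return the redirect right away."""
    click_queue.put({"short_key": short_key, "clicked_at": time.time()})
    return 302, long_url


def analytics_worker(click_queue, table, batch_size=100, flush_interval=0.05):
    """Background path: drain the queue and insert clicks in batches."""
    batch = []
    while True:
        try:
            event = click_queue.get(timeout=flush_interval)
        except queue.Empty:
            event = None  # idle: flush whatever is buffered
        if event is not None and event is not STOP:
            batch.append(event)
        if batch and (event is None or event is STOP or len(batch) >= batch_size):
            table.extend(batch)  # stands in for one multi-row INSERT
            batch = []
        if event is STOP:
            return


clicks = queue.Queue()
click_events = []
worker = threading.Thread(target=analytics_worker, args=(clicks, click_events, 3))
worker.start()
for _ in range(7):
    handle_redirect(clicks, "abc123", "https://example.com/very/long/path")
clicks.put(STOP)
worker.join()
print(len(click_events), "clicks recorded")
```

The redirect handler never waits on the worker, so a slow or stalled analytics pipeline only makes the queue longer.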

For the dashboard, we can pre-aggregate hourly/daily counts in a summary table instead of running expensive COUNT queries on billions of rows.
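Pre-aggregation can be as simple as bucketing each click by the hour it happened in. The sketch below does this in memory; in practice the same grouping would run as a periodic job writing to the summary table, and the function name and event shape here are assumptions.

```python
from collections import Counter
from datetime import datetime, timezone


def hourly_counts(events):
    """Count clicks per (short_key, hour). `events` yields (short_key, clicked_at)."""
    counts = Counter()
    for short_key, clicked_at in events:
        # Truncate the timestamp to the start of its hour.
        bucket = clicked_at.replace(minute=0, second=0, microsecond=0)
        counts[(short_key, bucket)] += 1
    return counts


events = [
    ("abc123", datetime(2025, 1, 15, 10, 5, tzinfo=timezone.utc)),
    ("abc123", datetime(2025, 1, 15, 10, 59, tzinfo=timezone.utc)),
    ("abc123", datetime(2025, 1, 15, 11, 0, tzinfo=timezone.utc)),
    ("xyz789", datetime(2025, 1, 15, 10, 30, tzinfo=timezone.utc)),
]
for (short_key, hour), clicks in sorted(hourly_counts(events).items()):
    print(short_key, hour.isoformat(), clicks)
```

The stats endpoint can then sum a handful of summary rows instead of scanning raw click events.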

Step 7: Scaling

Database scaling:

  • The URL table is read-heavy → add read replicas
  • As data grows past a single DB → shard by short_key hash
  • The click_events table grows fast → partition by time (monthly partitions) and archive old data
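Sharding by short_key hash means every server must compute the same shard for the same key. A minimal sketch, assuming a fixed shard count and a function name of our choosing:

```python
import hashlib


def shard_for(short_key, num_shards):
    """Map a short key to a shard index in [0, num_shards)."""
    # Use a stable hash: Python's built-in hash() for strings is randomized
    # per process, so different servers would disagree on the shard.
    digest = hashlib.sha256(short_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


for key in ("a7Bx2q", "k9Mw3r", "p2Lz8n"):
    print(key, "-> shard", shard_for(key, 16))
```

The catch with plain modulo sharding is that changing `num_shards` remaps most keys, so the shard count is usually picked generously up front.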

App server scaling:

  • Stateless servers behind a load balancer → horizontally scale by adding more instances
  • Each server holds a batch of pre-generated keys in memory → no coordination needed between servers

Cache scaling:

  • Start with a single Redis instance, then move to Redis Cluster
  • Consistent hashing to distribute keys across cache nodes
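Consistent hashing places both nodes and keys on a ring, and each key belongs to the next node clockwise. Here's a compact sketch; the `HashRing` class, the node names, and the choice of 100 virtual nodes per server are assumptions for illustration (Redis Cluster itself uses fixed hash slots rather than this exact scheme).

```python
import bisect
import hashlib


class HashRing:
    """Consistent hash ring with virtual nodes."""

    def __init__(self, nodes, vnodes=100):
        # Each node appears at `vnodes` points to even out the distribution.
        self._ring = sorted(
            (self._hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    @staticmethod
    def _hash(value):
        digest = hashlib.md5(value.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def node_for(self, key):
        """Return the first node at or after the key's position, wrapping around."""
        idx = bisect.bisect_right(self._points, self._hash(key))
        return self._ring[idx % len(self._ring)][1]


before = HashRing(["cache-1", "cache-2", "cache-3"])
after = HashRing(["cache-1", "cache-2", "cache-3", "cache-4"])
keys = [f"key{i}" for i in range(2000)]
moved = sum(before.node_for(k) != after.node_for(k) for k in keys)
print(f"{moved} of {len(keys)} keys moved after adding a node")
```

Adding a fourth node moves only the keys that now belong to it, instead of reshuffling nearly everything the way `hash % N` would.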

Global distribution:

  • Deploy app servers in multiple regions
  • Use GeoDNS to route users to the nearest region
  • Replicate the database across regions (or use a globally distributed DB like CockroachDB)

Handling 40K QPS at peak:

  • Load balancer distributes across app servers
  • Redis handles ~90% of reads (36K QPS in cache = easy for Redis)
  • Only ~4K QPS actually hits the database
  • Each DB read is a simple key lookup by indexed short_key — blazing fast
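The split above depends heavily on the cache hit rate, so it's worth seeing how the database load moves when that number changes. A quick sketch, with the hit rates other than 90% being hypothetical:

```python
def db_qps(peak_qps, cache_hit_rate):
    """Reads per second that miss the cache and reach the database."""
    return peak_qps * (1.0 - cache_hit_rate)


for hit_rate in (0.80, 0.90, 0.95, 0.99):
    print(f"hit rate {hit_rate:.0%}: ~{db_qps(40_000, hit_rate):,.0f} DB reads/sec")
```

Dropping from a 90% to an 80% hit rate doubles the database read load, which is why cache sizing and eviction deserve attention.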

In short, a URL shortener is a giant key-value store with a smart key generation strategy. The short key is the hard part, so we use a pre-generated key service to avoid collisions. The redirect is the hot path, so we cache it aggressively. And analytics go through a queue so they never slow down the user. That’s the whole system.