High-Level Design — Quick Summary
Quick revision: every topic, key terms, and mnemonics for High-Level Design.
This is a quick revision doc covering all 45 topics in HLD. Open the linked notes if you want depth.
Foundations & Approach
What Is System Design?
What it is. Defining the architecture, components, and data flow of a system to meet requirements. Conversation, not test.
Key terms.
- Trade-offs — every choice has costs.
- Building blocks — load balancers, caches, queues, databases.
- Stages of growth — single server → separate DB → LB+multiple servers → distributed system.
Remember. Interviewers evaluate communication, problem breakdown, trade-off awareness, breadth, and depth — not perfect architecture.
How to Approach System Design
What it is. Repeatable framework for any 45-min interview.
Key terms (5-step framework).
- Step 1 (5 min) — Requirements & Scope.
- Step 2 (5 min) — Back-of-the-envelope estimation.
- Step 3 (15 min) — High-level design (boxes + APIs + data flow).
- Step 4 (15 min) — Deep dive (2-3 areas).
- Step 5 (5 min) — Wrap up + improvements.
Remember. Common mistakes: jumping to solution, over-engineering for Google-scale, silence, no trade-offs, ignoring interviewer hints.
Requirements Gathering
What it is. Turning vague prompts into concrete scope.
Key terms.
- Functional requirements — what the system does (features).
- Non-functional requirements — how well (scalability, availability, consistency, latency, durability).
- Availability nines — 99% (3.65 days/yr), 99.9% (8.7 hrs), 99.99% (52 min), 99.999% (5 min).
Remember. Always ask: How many users? Read/write ratio? Real-time vs batch? Consistency vs availability? When in conflict, state which one wins.
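The downtime figures behind the availability nines follow from simple arithmetic. A minimal sketch, assuming a 365-day year; the function name is illustrative:

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours; leap years ignored


def downtime_hours_per_year(availability_percent: float) -> float:
    """Allowed downtime per year, in hours, for a given availability target."""
    return HOURS_PER_YEAR * (1 - availability_percent / 100)


for target in (99.0, 99.9, 99.99, 99.999):
    hours = downtime_hours_per_year(target)
    print(f"{target}% -> {hours:.2f} h/yr ({hours * 60:.1f} min/yr)")
```

Each extra nine divides the downtime budget by ten, which is why every additional nine costs so much more to achieve.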
Back-of-the-Envelope Estimation
What it is. Quick math to feel the scale. Order-of-magnitude only.
Key terms.
- 86,400 seconds/day — round to 100K (10^5).
- Peak QPS ≈ 2-3x avg.
- Numbers to know — RAM 100ns, SSD 150μs, datacenter RTT 0.5ms, cross-continent ~150ms.
- Storage units — 2^10 ≈ 1K, 2^20 ≈ 1M, 2^30 ≈ 1B, 2^40 ≈ 1T.
- 80/20 rule — roughly 20% of the data serves 80% of the traffic, so size the cache for that hot 20%.
Remember. Round aggressively. State assumptions. Don’t spend more than 5 min.
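The estimation steps above can be sketched as a few lines of arithmetic. The user counts, request rates, and payload sizes below are made-up inputs, and the 2.5x peak factor is just the midpoint of the 2-3x rule of thumb:

```python
SECONDS_PER_DAY = 86_400


def estimate_qps(daily_users, requests_per_user, peak_factor=2.5):
    """Return (average QPS, peak QPS) for a daily request volume."""
    average = daily_users * requests_per_user / SECONDS_PER_DAY
    return average, average * peak_factor


def estimate_storage_tb(writes_per_day, bytes_per_write, years):
    """Raw storage in terabytes (10^12 bytes), before replication."""
    return writes_per_day * bytes_per_write * 365 * years / 1e12


avg_qps, peak_qps = estimate_qps(daily_users=10_000_000, requests_per_user=20)
print(f"avg ~{avg_qps:,.0f} QPS, peak ~{peak_qps:,.0f} QPS")
print(f"storage ~{estimate_storage_tb(2_000_000, 1_000, 5):.2f} TB over 5 years")
```

In an interview you would round these results further (about 2K average QPS, about 5K peak) rather than quote exact figures.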
Design Principles and Trade-offs
What it is. Every design choice is a trade-off.
Key terms.
- Scalability — vertical (bigger machine) vs horizontal (more machines).
- Latency vs Throughput — fast individual reqs vs many req/sec.
- SPOF — Single Point of Failure. Eliminate at every layer.
- Stateless vs Stateful — push state to external stores so app servers can scale.
- Consistency vs Availability — banking picks C, social media picks A.
- p50 / p95 / p99 — latency percentiles. p99 catches the slow tail.
Remember. No “best” architecture, only the right one for the requirements.
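To see why p99 catches the slow tail while p50 hides it, here is a small sketch using the nearest-rank definition of a percentile. The latency samples are invented for illustration:

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: smallest sample with at least p% of values at or below it."""
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


# Hypothetical request latencies in milliseconds: mostly fast, with a slow tail.
latencies_ms = [20] * 90 + [40] * 8 + [900] * 2

for p in (50, 95, 99):
    print(f"p{p} = {percentile(latencies_ms, p)} ms")
```

With this data the median is 20 ms while p99 is 900 ms, so a dashboard that only tracks the median would never show the slow requests.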
Core Building Blocks
DNS and How the Internet Works
What it is. Translates names → IPs. The first step of every web request.
Key terms.
- Hierarchy — Stub resolver → Recursive resolver → Root → TLD → Authoritative.
- Records — A (IPv4), AAAA (IPv6), CNAME (alias), NS, MX (mail).
- TTL — caching duration.
Remember. DNS is not just resolution — it does load distribution (round-robin), geo-routing, and failover.
Load Balancers
What it is. Distributes traffic across servers. Removes SPOF.
Key terms.
- L4 (transport) — IP+port. Fast, dumb. NLB.
- L7 (application) — reads HTTP. Path/host routing, sticky sessions, TLS termination. ALB, NGINX.
- Algorithms — Round Robin, Weighted Round Robin, Least Connections, IP Hash, Random.
- Health checks — periodic ping; remove dead servers.
- Active-passive vs Active-active redundancy.
Remember. Tools: NGINX, HAProxy, AWS ALB/NLB, Caddy. Always have redundant LBs (LB itself is a SPOF).
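Two of the algorithms listed above, Round Robin and Least Connections, can be sketched in a few lines. This is a single-process illustration with hypothetical class names, not how NGINX or HAProxy implement them:

```python
import itertools


class RoundRobin:
    """Hand out servers in a fixed rotation."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def pick(self):
        return next(self._cycle)


class LeastConnections:
    """Send each new request to the server with the fewest active connections."""

    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def acquire(self):
        # Ties go to the server listed first.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        self.active[server] -= 1
```

Round Robin suits requests of similar cost; Least Connections adapts better when some requests hold connections much longer than others.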
Caching
What it is. Storing data in faster location to avoid the slow source.
Key terms.
- Layers — Browser → CDN → App cache (Redis/Memcached) → DB cache.
- Hit / miss / hit ratio — aim for 95%+ on static, 80%+ on dynamic.
- Eviction — LRU (default), LFU, FIFO, TTL.
- Patterns — Cache-Aside (most common), Write-Through, Write-Behind.
Remember. Don’t cache: frequently-changing data, low-traffic data, write-heavy workloads, must-be-perfectly-consistent data.
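LRU eviction, the default policy named above, can be sketched with an ordered dictionary. This is a minimal in-process illustration; Redis and Memcached use their own approximations:

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used key."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key):
        if key not in self._items:
            return None  # cache miss
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the oldest entry
```

Both reads and writes refresh an entry's position, so only keys that nobody touches drift toward eviction.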
Content Delivery Networks (CDNs)
What it is. Geographically distributed cache servers near users.
Key terms.
- Edge / PoP — point of presence.
- Pull CDN (default) — fetches from origin on miss.
- Push CDN — we upload ahead.
- Cache invalidation — TTL, explicit purge, versioned URLs (best).
Remember. Use CDN for static content (images, JS, CSS, video). Don’t use for dynamic personalised content. Pair with object storage (S3+CloudFront).
Message Queues
What it is. Async buffer between producer and consumer. Decouples services.
Key terms.
- Point-to-Point — one message → one consumer.
- Pub/Sub — one message → all subscribers.
- DLQ — Dead Letter Queue for poison messages.
- Tools — Kafka (high throughput, retains messages, replay), RabbitMQ (complex routing), SQS (managed).
Remember. Use queues for: decoupling services, traffic spikes, retries, async heavy work (email, report, video encoding).
Proxies and Reverse Proxies
What it is. Intermediary servers. Forward = hides client. Reverse = hides server.
Key terms.
- Reverse proxy duties — TLS termination, load balancing, compression, caching, request routing, rate limiting, WAF.
- Tools — Nginx, Caddy (auto HTTPS), HAProxy, Traefik (k8s native).
- Reverse proxy vs LB vs API Gateway — overlapping; many tools do all three.
Remember. “Proxy to access blocked sites” = forward (VPN). “Nginx in front of my app” = reverse. CDN = distributed reverse proxy.
Database Deep Dive
SQL vs NoSQL
What it is. Relational vs non-relational. Pick based on data shape + scale + consistency.
Key terms.
- SQL — fixed schema, ACID, vertical scaling, JOINs. Postgres, MySQL.
- NoSQL types — Key-Value (Redis, DynamoDB), Document (MongoDB), Wide-column (Cassandra), Graph (Neo4j).
- ACID vs BASE — strict guarantees vs eventual consistency.
Remember. Real systems use both. PostgreSQL primary + Redis cache + Elasticsearch search is the classic combo.
Database Indexing
What it is. Separate sorted structure for fast lookups. Like book index.
Key terms.
- B-Tree — default, O(log n) lookups + range queries.
- Hash index — O(1) but no ranges.
- Composite index — leftmost prefix rule on (a, b, c).
- Covering index — INCLUDE all needed columns; index-only scan.
- Unique index — enforces uniqueness.
Remember. Indexes speed reads, slow writes, take disk space. Don’t index small tables, low-cardinality columns, write-heavy tables.
Database Replication
What it is. Copies of data on multiple servers. HA + read scaling + DR.
Key terms.
- Single-leader (master-slave) — one writer, many readers. Most common.
- Multi-leader — for multi-region writes; conflict resolution hard (LWW, CRDTs).
- Leaderless (Dynamo) — quorum-based: w + r > n.
- Sync vs Async — sync = safe + slow; async = fast + risky.
- Replication lag — read-after-write inconsistency, monotonic reads, causality.
Remember. Replication scales reads, not writes. Always plan for replication lag.
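The quorum rule w + r > n works because any set of w replicas that acknowledged a write must share at least one replica with any set of r replicas that serve a read. A brute-force sketch that checks this for small clusters (the function name is illustrative):

```python
from itertools import combinations


def quorums_always_overlap(n, w, r):
    """True if every possible write set of size w intersects every read set of size r."""
    replicas = range(n)
    return all(
        set(write_set) & set(read_set)
        for write_set in combinations(replicas, w)
        for read_set in combinations(replicas, r)
    )


for n, w, r in [(3, 2, 2), (3, 1, 2), (5, 3, 3), (5, 1, 5)]:
    print(f"n={n} w={w} r={r} -> overlap guaranteed: {quorums_always_overlap(n, w, r)}")
```

When the sets are guaranteed to overlap, at least one replica in every read has seen the latest acknowledged write.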
Database Sharding
What it is. Splitting data across multiple servers. Each shard = subset.
Key terms.
- Hash-based — even distribution, no range queries, painful resharding.
- Range-based — easy ranges, hot spots possible.
- Directory-based — flexible, lookup service is SPOF.
- Cross-shard JOINs — expensive; denormalize to avoid.
- Hot spots — celebrity user problem; mitigate with random suffix.
Remember. Sharding is a last resort. Try vertical scaling, replicas, caching, query optimization first.
Consistent Hashing
What it is. Hash ring that minimizes data movement when nodes change.
Key terms.
- The ring — positions 0 to 2^32 - 1, wrapping around. Servers + keys placed via hash.
- Walk clockwise — first server hit owns the key.
- Adding a node — only nearby keys move (~1/N).
- Virtual nodes (vnodes) — 100-200 placements per server for even distribution.
Remember. Used in Memcached, Cassandra, DynamoDB, Akamai CDN. Without it, hash % N remaps about N/(N+1) of all keys when going from N to N+1 nodes (~75% for 3 → 4).
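The ring, the clockwise walk, and virtual nodes can be sketched as follows. This is a minimal illustration, assuming MD5 truncated to 32 bits as the hash and hypothetical node names; production systems differ in hash choice and vnode placement:

```python
import bisect
import hashlib


def ring_hash(key: str) -> int:
    """Map a string to a position on the 0..2^32-1 ring (first 4 bytes of MD5)."""
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:4], "big")


class HashRing:
    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes
        self._ring = []  # sorted list of (position, node)
        for node in nodes:
            self.add(node)

    def add(self, node):
        # Each server is placed on the ring once per virtual node.
        for i in range(self.vnodes):
            bisect.insort(self._ring, (ring_hash(f"{node}#{i}"), node))

    def lookup(self, key):
        if not self._ring:
            raise LookupError("ring is empty")
        # First placement at or after the key's position, wrapping past the end.
        idx = bisect.bisect_left(self._ring, (ring_hash(key),))
        return self._ring[idx % len(self._ring)][1]


keys = [f"user:{i}" for i in range(2000)]
ring = HashRing(["node-a", "node-b", "node-c"])
before = {k: ring.lookup(k) for k in keys}
ring.add("node-d")
moved = sum(1 for k in keys if ring.lookup(k) != before[k])
print(f"consistent hashing moved {moved / len(keys):.0%} of keys")
```

Every key that moves lands on the new node; keys owned by the other servers stay where they were, which is the property that hash % N lacks.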
ACID and Transactions
What it is. Four guarantees for reliable transactions: Atomicity, Consistency, Isolation, Durability. (Covered in depth in the DBMS notes.)
Isolation Levels Cheatsheet.
| Level | Dirty | Non-Repeatable | Phantom |
|---|---|---|---|
| Read Uncommitted | possible | possible | possible |
| Read Committed (Postgres default) | prevented | possible | possible |
| Repeatable Read (MySQL default) | prevented | prevented | possible |
| Serializable | prevented | prevented | prevented |
Remember. ACID for money/inventory/auth. BASE for feeds/likes/views.
Scalability Patterns
Horizontal vs Vertical Scaling
What it is. Bigger machine vs more machines.
Key terms.
- Vertical (scale up) — simple, low complexity, hard ceiling, SPOF.
- Horizontal (scale out) — distributed complexity, no theoretical limit, fault tolerant.
Remember. Start vertical for simplicity. Design stateless so you can go horizontal later. Most large systems use a mix.
Microservices vs Monolith
What it is. One app vs many small apps.
Key terms.
- Monolith — simple, fast iteration, scales together. Good for small teams.
- Microservices — independent deploy/scale, fault isolation, polyglot. High operational overhead.
- Communication — Sync (HTTP/gRPC) vs Async (message queues). Most use both.
- Service discovery — Consul, Eureka, k8s DNS.
Remember. “Monolith first” (Fowler). Amazon, Netflix, Uber all started monolith. Extract services when pain points appear.
API Gateway
What it is. Single entry point for microservices.
Key terms.
- Duties — request routing, auth, rate limit, load balance, transformation, response aggregation, caching, logging.
- Response aggregation — gateway fans out to N services and merges. Big win for mobile.
- Tools — Kong, AWS API Gateway, Nginx, Traefik, Express Gateway.
- Difference from LB — LB distributes across copies of one service; gateway routes between services.
Remember. Without it, every client knows every service URL. With it, one door. Make it HA — it’s a SPOF if not.
Denormalization and Read-Write Separation
What it is. Trade duplication for fast reads. Split reads from writes.
Key terms.
- Denormalization — copy fields, cached counters, summary tables.
- Read replicas — primary handles writes; replicas handle reads.
- Replication lag — read-your-own-writes problem.
- CQRS Lite — separate read/write DB.
Remember. “Update name” requires updating all denormalized copies — that’s the price.
Blob Storage and Object Storage
What it is. Storage for files. Object = flat key-value blobs over HTTP.
Key terms.
- Block — raw disk (EBS).
- File — shared filesystem (NFS, EFS).
- Object — S3, GCS, Azure Blob. Virtually unlimited.
- Pre-signed URLs — let client upload/download directly to/from S3 without going through our server.
- Storage classes — Standard / IA / Archive (Glacier). Use lifecycle policies.
Remember. Object storage + CDN is the gold standard. S3 alone has 11 nines durability.
Reliability & Consistency
CAP Theorem
What it is. During a network partition, choose C or A. Also covered in the DBMS notes.
Key terms.
- C — Every read sees latest write.
- A — Every request gets a response.
- P — System works through partitions (not optional).
- CP — MongoDB, HBase, etcd, Zookeeper.
- AP — Cassandra, DynamoDB, CouchDB, Riak.
Remember. Trade-off only kicks in during partition. PACELC extends with Else (no partition) → Latency or Consistency.
Cheatsheet — CAP Triad
| Pick | Sacrifice | Examples |
|---|---|---|
| CP | Availability during partition | MongoDB, etcd |
| AP | Consistency during partition | Cassandra, DynamoDB |
| CA | Partition tolerance (single-node only) | Single Postgres |
Consistency Models
What it is. Spectrum from strong to eventual.
Key terms.
- Strong / Linearizable — every read = latest write. Spanner, CockroachDB.
- Sequential — global total order, slightly weaker than linearizable.
- Causal — preserves cause-and-effect order. MongoDB causal sessions.
- Read-Your-Writes — you see your own writes. Common UX fix.
- Monotonic Reads — never see a value older than what we already saw.
- Eventual — converges given time. DynamoDB, Cassandra default.
Remember. Mix and match per use case. Inventory → strong. Likes → eventual. Profile updates → read-your-writes.
Cheatsheet — Consistency Models
| Model | Strength | Latency | Example |
|---|---|---|---|
| Linearizable | Strongest | Highest | Spanner |
| Sequential | Strong | High | Some DBs |
| Causal | Medium | Medium | MongoDB causal |
| Read-your-writes | Weak+ | Low+ | Sticky sessions |
| Eventual | Weakest | Lowest | DynamoDB default |
Failover and Redundancy
What it is. Automatic switch from failed component to backup.
Key terms.
- Active-Passive — one works, one waits. Standby uses warm/hot replication.
- Active-Active — all serve traffic; survivors absorb load.
- Health checks (pull) vs Heartbeats (push).
- SLI / SLO / SLA — Indicator (metric) / Objective (internal target) / Agreement (customer promise).
- Standby types — cold (off, restore from backup), warm (replicating, idle), hot (synced, ready).
Remember. Always have redundant LBs. Going from 99.9% to 99.99% is exponentially harder. Aim for 3 nines on web apps, 4-5 for critical infra.
Circuit Breaker and Bulkhead Patterns
What it is. Prevent cascading failures.
Key terms.
- Circuit Breaker states — Closed (normal) → Open (fail fast) → Half-Open (test one request).
- Fallback — cached data, default value, queued, degraded.
- Bulkhead — separate thread/connection pools per service.
- Retry with exponential backoff + jitter.
- Don’t retry — 4xx errors, non-idempotent operations.
- Libraries — Resilience4j (Java), Polly (.NET), Hystrix (deprecated).
Remember. Three patterns combine: Bulkhead isolates → Circuit breaker fails fast → Retry handles transients. Together they prevent cascading collapse.
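The Closed → Open → Half-Open cycle can be sketched as a small wrapper around a call. This is a single-threaded illustration with hypothetical names and an injectable clock for testing; it does not limit Half-Open to a single in-flight trial request the way libraries such as Resilience4j can:

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker is open and the call is rejected without being tried."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout  # seconds to stay open before a trial call
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the breaker is closed

    @property
    def state(self):
        if self.opened_at is None:
            return "closed"
        if self.clock() - self.opened_at >= self.reset_timeout:
            return "half-open"
        return "open"

    def call(self, func, *args, **kwargs):
        if self.state == "open":
            raise CircuitOpenError("failing fast")
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            # A failed trial call, or too many failures while closed, opens the breaker.
            if self.state == "half-open" or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        # Any success closes the breaker and resets the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

Callers would typically catch CircuitOpenError and return one of the fallbacks listed above, such as cached data or a default value.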
Monitoring, Logging, and Alerting
What it is. Three pillars of observability.
Key terms.
- Metrics — numeric, aggregated. “What’s happening?” Prometheus, Datadog.
- Logs — discrete events, detailed. “What happened?” ELK, Loki.
- Traces — request journey across services. “Where is it slow?” Jaeger, Zipkin.
- Four Golden Signals (Google SRE) — Latency, Traffic, Errors, Saturation.
- Latency percentiles — p50, p95, p99; p99 matters most.
- Structured logs — JSON for searchability.
- Alert on symptoms, not causes. SLO breaches.
- Avoid alert fatigue — every alert must be actionable + have runbook.
Remember. Mention monitoring at end of system design — shows production maturity.
Communication Protocols
REST API Design
What it is. Resource-based architecture over HTTP. Stateless.
Key terms.
- Methods — GET (read), POST (create), PUT (replace), PATCH (partial), DELETE.
- Idempotency — GET/PUT/DELETE yes; POST no; PATCH not guaranteed (depends on the patch semantics).
- Status codes — 2xx success, 3xx redirect, 4xx client error, 5xx server error.
- URL design — nouns plural, nested resources for relationships, query params for filters.
- Pagination — cursor-based > offset-based.
- Versioning — /v1/, header, or query param. URL is most explicit.
Remember. “4xx = client’s fault. 5xx = our fault.” Always use ISO 8601 dates. HTTPS everywhere.
GraphQL
What it is. Client-specified queries to one endpoint. Solves over/under-fetching.
Key terms.
- Schema + Types — strongly typed contract.
- Queries — read.
- Mutations — write.
- Subscriptions — real-time over WebSocket.
- N+1 in resolvers — fix with DataLoader for batching.
- Caching is harder — single endpoint with POST. Use Apollo/urql client cache.
- Security — query depth limits + complexity analysis.
Remember. GraphQL shines when frontend needs lots of related data. REST simpler for CRUD, public APIs, file uploads.
Cheatsheet — REST vs GraphQL vs gRPC vs WebSocket
| Use case | Pick |
|---|---|
| Public CRUD API, browser-friendly | REST |
| Frontend with many related entities | GraphQL |
| Backend-to-backend, perf critical | gRPC |
| Bidirectional real-time | WebSocket |
| Server-only push | SSE |
WebSockets
What it is. Persistent bidirectional connection. Both sides send anytime.
Key terms.
- HTTP Upgrade handshake → 101 Switching Protocols.
- ws:// / wss:// (TLS).
- Use cases — chat, live notifications, dashboards, collab editing, gaming.
- Scaling — sticky sessions OR Redis Pub/Sub backbone.
- Heartbeats / ping-pong — detect dead connections.
Remember. Don’t use for occasional updates (use SSE) or simple CRUD. Each connection consumes server memory — plan for ~100K connections per server.
gRPC and Protocol Buffers
What it is. RPC framework using HTTP/2 + Protobuf binary serialization.
Key terms.
- Protobuf — schema-first, binary, 3-10x smaller than JSON.
- Field tags — int32 id = 1; never reuse a tag number.
- Four call types — Unary, Server streaming, Client streaming, Bidirectional streaming.
- HTTP/2 multiplexing — many calls per connection.
- gRPC-Web — needed for browsers (proxy).
Remember. gRPC inside services, REST outside for public APIs. Trade-off: faster + typed but harder to debug.
Polling, Long Polling, and Server-Sent Events
What it is. Server push without full WebSocket complexity.
Key terms.
- Short polling — fixed interval. Wasteful.
- Long polling — server holds until data ready. Server resource heavy.
- SSE — text/event-stream, one-way server→client, EventSource API auto-reconnects.
- Last-Event-ID — resume after disconnect.
Remember. Decision tree: real-time + bidirectional → WebSocket. Server-only push → SSE (underrated). Occasional updates → polling.
Advanced Patterns
Rate Limiting
What it is. Cap on requests per client per time window.
Key terms.
- Token Bucket — refills steadily, allows bursts. Most popular.
- Leaky Bucket — strict outflow rate, no bursts.
- Fixed Window Counter — boundary burst issue.
- Sliding Window Log — accurate but memory-heavy.
- Sliding Window Counter — weighted hybrid, memory-efficient.
- Headers — X-RateLimit-Limit / Remaining / Reset, 429 Too Many Requests, Retry-After.
- Distributed — Redis + Lua script for atomicity.
Remember. Fail open if rate limiter is down. Layer global + per-client + per-endpoint. Token Bucket is the safe interview answer.
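Token Bucket, the algorithm recommended above, can be sketched in-process as follows. The class name and the injectable clock are illustrative; a distributed limiter would keep the same state in Redis and update it atomically:

```python
import time


class TokenBucket:
    """Allow bursts up to `capacity`, refilled at `refill_rate` tokens per second."""

    def __init__(self, capacity, refill_rate, clock=time.monotonic):
        self.capacity = capacity
        self.refill_rate = refill_rate
        self.clock = clock
        self.tokens = float(capacity)  # start full
        self.last_refill = clock()

    def allow(self, cost=1):
        now = self.clock()
        # Add tokens for the elapsed time, never exceeding capacity.
        elapsed = now - self.last_refill
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_refill = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False  # caller should answer 429 Too Many Requests
```

Refilling lazily on each request avoids a background timer: the bucket only needs the token count and the time of the last update.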
Advanced Caching Patterns
What it is. Five caching strategies + handling stampedes.
Key terms.
- Cache-Aside — most common. App manages cache.
- Read-Through — cache loads from DB on miss.
- Write-Through — write to both, synchronously.
- Write-Behind — fast cache write, async DB. Risky.
- Refresh-Ahead — proactively refresh hot keys before TTL.
- Cache stampede / thundering herd — fix with: locking, staggered TTL (jitter), refresh-ahead, never-expire-with-bg-refresh.
Remember. Always add jitter to TTL. Pick cache-aside for safe default. Write-behind only if data loss is acceptable.
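Cache-Aside with jittered TTLs can be sketched like this. The in-memory dict stands in for Redis, `load_from_db` is a hypothetical loader supplied by the caller, and the ±10% jitter is an arbitrary choice:

```python
import random
import time


class CacheAside:
    def __init__(self, load_from_db, base_ttl=300.0, jitter=0.1, clock=time.monotonic):
        self.load_from_db = load_from_db
        self.base_ttl = base_ttl
        self.jitter = jitter
        self.clock = clock
        self._store = {}  # key -> (value, expires_at); stand-in for Redis

    def _ttl(self):
        # Spread expiry times so hot keys do not all expire in the same instant.
        return self.base_ttl * (1 + random.uniform(-self.jitter, self.jitter))

    def get(self, key):
        entry = self._store.get(key)
        if entry is not None and entry[1] > self.clock():
            return entry[0]  # cache hit
        value = self.load_from_db(key)  # miss: the app reads the source itself
        self._store[key] = (value, self.clock() + self._ttl())
        return value

    def invalidate(self, key):
        """Call after writing to the DB so the next read reloads fresh data."""
        self._store.pop(key, None)
```

This sketch does not protect against a stampede on a single hot key; that would need one of the fixes listed above, such as a per-key lock around the reload.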
Search and Indexing
What it is. Inverted indexes for fast text search. Elasticsearch.
Key terms.
- Inverted index — word → list of documents.
- Elasticsearch concepts — Index, Document, Shard, Replica.
- Analyzer — tokenize → normalize (lowercase, stem, remove stop words).
- BM25 — relevance scoring (TF + IDF + field length).
- Sync from DB — best via CDC (Debezium).
Remember. Don’t use SQL LIKE %x% for search. ES sits beside DB, not instead. Use it for full-text, autocomplete, faceted filtering, geosearch.
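An inverted index and its analyzer can be sketched in a few lines. This toy version lowercases, tokenizes, and drops a small hypothetical stop-word list; it skips stemming and BM25 scoring, which Elasticsearch provides:

```python
import re
from collections import defaultdict

STOP_WORDS = {"a", "an", "the", "of", "and", "for"}  # illustrative subset


def analyze(text):
    """Tokenize on alphanumerics, lowercase, and remove stop words."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOP_WORDS]


class InvertedIndex:
    def __init__(self):
        self.postings = defaultdict(set)  # term -> set of document ids

    def add(self, doc_id, text):
        for term in analyze(text):
            self.postings[term].add(doc_id)

    def search(self, query):
        """Return ids of documents containing every query term (AND semantics)."""
        terms = analyze(query)
        if not terms:
            return set()
        result = set(self.postings.get(terms[0], set()))
        for term in terms[1:]:
            result &= self.postings.get(term, set())
        return result
```

A lookup touches only the posting lists for the query terms, whereas LIKE %x% has to scan every row.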
Event Sourcing and CQRS
What it is. Store events not state. Separate read/write models.
Key terms.
- Event store — append-only log of immutable events.
- Projections — materialized read views built from events.
- CQRS — Command (write, validated, normalized) vs Query (read, denormalized, fast).
- Snapshots — periodic state dumps to avoid replaying everything.
Remember. Adds significant complexity. Use for audit trails (finance/healthcare/legal), complex domains, replay needs. Most CRUD apps don’t need it.
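The event store, projection, and snapshot ideas above can be sketched with a tiny bank-account example. The event kinds and class names are invented for illustration:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    account_id: str
    kind: str  # "deposited" or "withdrawn"
    amount: int


class EventStore:
    """Append-only log; events are never updated or deleted."""

    def __init__(self):
        self._log = []

    def append(self, event):
        self._log.append(event)

    def events(self, from_index=0):
        return list(self._log[from_index:])


def project_balances(events, snapshot=None):
    """Build a read view (balance per account), optionally starting from a snapshot."""
    balances = dict(snapshot or {})
    for event in events:
        delta = event.amount if event.kind == "deposited" else -event.amount
        balances[event.account_id] = balances.get(event.account_id, 0) + delta
    return balances
```

Replaying only the events recorded after a snapshot gives the same view as replaying the whole log, which is what keeps rebuilds cheap as the log grows.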
Distributed Consensus
What it is. Many nodes agreeing. Leader election. Raft.
Key terms.
- Raft states — Follower → Candidate → Leader.
- Term — election round.
- Heartbeats — leader signal to followers.
- Quorum — majority required (prevents split-brain).
- Odd cluster sizes — 3, 5, 7. Tolerate (N-1)/2 failures.
- Tools — etcd (k8s), ZooKeeper (older, ZAB protocol), Consul (HashiCorp).
Remember. Quorum prevents split-brain mathematically. Raft is the modern, understandable algorithm — Paxos is the OG.
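The arithmetic behind odd cluster sizes is short enough to write out. A sketch with illustrative function names:

```python
def majority(cluster_size: int) -> int:
    """Smallest number of nodes that forms a quorum."""
    return cluster_size // 2 + 1


def tolerated_failures(cluster_size: int) -> int:
    """Nodes that can fail while a majority is still reachable."""
    return cluster_size - majority(cluster_size)


for n in (3, 4, 5, 6, 7):
    print(f"N={n}: quorum={majority(n)}, tolerates {tolerated_failures(n)} failure(s)")
```

A 4-node cluster tolerates the same single failure as a 3-node cluster, so the extra even node adds cost without adding fault tolerance. Two disjoint groups can never both hold a majority, which is why quorum rules out split-brain.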
Real System Design Questions
Design a URL Shortener (TinyURL)
Core flow. POST long URL → generate short key → return. GET short key → 302 redirect to long URL.
Key terms.
- Short key generation — Hash + collision check / Auto-increment + Base62 / Pre-Generated Key Service (KGS).
- KGS — wins because no collisions, no coordination, scalable.
- 301 vs 302 — browsers cache a 301 (permanent), so repeat clicks skip our server; 302 (temporary) keeps analytics visibility because each click hits the server.
- Base62 (a-z, A-Z, 0-9) — 62^7 = 3.5 trillion URLs.
Architecture. Client → LB → App Server → Redis (hot keys) → DB. Click events → Kafka → analytics workers.
Remember. Read-heavy (100:1) → cache aggressively. Async analytics through Kafka so redirects stay fast.
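The auto-increment + Base62 option can be sketched as an encoder and decoder over a numeric ID. The alphabet order (digits, lowercase, uppercase) is an arbitrary choice; any fixed ordering of the 62 characters works:

```python
import string

ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase  # 62 chars


def encode_base62(number: int) -> str:
    """Convert a non-negative integer ID into a short Base62 key."""
    if number < 0:
        raise ValueError("number must be non-negative")
    if number == 0:
        return ALPHABET[0]
    chars = []
    while number:
        number, remainder = divmod(number, 62)
        chars.append(ALPHABET[remainder])
    return "".join(reversed(chars))


def decode_base62(key: str) -> int:
    """Convert a Base62 key back into its integer ID."""
    number = 0
    for ch in key:
        number = number * 62 + ALPHABET.index(ch)
    return number


print(encode_base62(62**7 - 1))  # largest ID that still fits in 7 characters
```

Sequential IDs produce guessable keys, which is one reason a pre-generated key service can be preferable.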
Design a Rate Limiter
Core flow. Every request → check counter → allow or 429.
Key terms.
- Algorithms — Token Bucket, Sliding Window Counter (best practical picks).
- Redis + Lua script for atomic INCR + EXPIRE.
- Identification — API key, user_id, IP, or combination.
- Layering — global + per-client + per-endpoint.
Remember. Fail open. Use Token Bucket for bursts. Multi-region: per-region limits are usually good enough.
Design a Chat System (WhatsApp)
Core flow. Persistent WebSocket → message → server routes to recipient (online) or push notification (offline) → persist.
Key terms.
- WebSocket for real-time delivery.
- Connection registry (Redis) — user_connection:{user_id} → chat-server-X.
- Kafka as backbone for routing across chat servers.
- Cassandra for messages — partition by conversation_id, cluster by message_id (TimeUUID).
- Statuses — sent, delivered, read.
- Per-conversation sequence numbers for ordering.
- Group fan-out — push for small groups (≤500), pull for huge channels.
- Presence — heartbeat in Redis with 60s TTL.
Remember. 5,000+ chat servers for 500M concurrent connections. Pub/sub bridges users on different servers. Push notifications via APNs/FCM for offline.
Design a Social Media Feed (Twitter/X)
Core flow. Tweet posted → fan out to followers’ feed caches (write-side) → user opens feed → fetch from cache + pull from celebrities → rank → return.
Key terms.
- Snowflake IDs — globally unique, time-ordered.
- Fan-out on write (push) — fast reads, expensive writes for celebs.
- Fan-out on read (pull) — fast writes, slow reads.
- Hybrid (Twitter approach) — push for normal users, pull for celebrities (>10K-50K followers).
- Feed cache — Redis sorted set per user.
- Ranking — recency + engagement + relationship signals.
- Real-time updates — WebSocket push for active users; new-tweet banner.
Remember. Pre-compute feeds when tweets created (write expense) so reads are cache lookups (read speed). Celebrity problem requires hybrid.
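Snowflake-style IDs pack a millisecond timestamp, a worker ID, and a per-millisecond sequence into one 64-bit integer. A sketch using the commonly described 41/10/12 bit layout; the custom epoch and class name are assumptions for illustration:

```python
import threading
import time

CUSTOM_EPOCH_MS = 1_600_000_000_000  # arbitrary fixed epoch (assumption)


class SnowflakeGenerator:
    """64-bit IDs: 41 bits of ms timestamp | 10 bits worker | 12 bits sequence."""

    def __init__(self, worker_id, clock_ms=None):
        if not 0 <= worker_id < 1024:
            raise ValueError("worker_id must fit in 10 bits")
        self.worker_id = worker_id
        self._clock_ms = clock_ms or (lambda: int(time.time() * 1000))
        self._last_ms = -1
        self._sequence = 0
        self._lock = threading.Lock()

    def next_id(self):
        with self._lock:
            now = self._clock_ms()
            if now < self._last_ms:
                raise RuntimeError("clock moved backwards")
            if now == self._last_ms:
                self._sequence = (self._sequence + 1) & 0xFFF
                if self._sequence == 0:
                    # 4096 IDs already issued this millisecond: wait for the next one.
                    while now <= self._last_ms:
                        now = self._clock_ms()
            else:
                self._sequence = 0
            self._last_ms = now
            return ((now - CUSTOM_EPOCH_MS) << 22) | (self.worker_id << 12) | self._sequence
```

Because the timestamp occupies the high bits, sorting by ID sorts by creation time, which suits a feed stored in a Redis sorted set.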
Design a Video Streaming Platform (YouTube)
Core flow. Upload → S3 → Kafka → transcode workers → multiple resolutions + chunks → S3 → CDN. Watch → manifest from CDN → fetch chunks adaptively.
Key terms.
- Pre-signed URL for direct upload to S3.
- Transcoding — generate 360p/480p/720p/1080p/4K, chunked into 2-10s segments.
- HLS (HTTP Live Streaming) — master.m3u8 lists qualities; player picks based on bandwidth.
- Adaptive bitrate — quality switches mid-stream segment-by-segment.
- CDN — 90%+ of traffic served from edge, not origin.
- Recommendations — candidate generation → ranking → re-ranking.
Remember. Upload and stream are completely separate paths. CDN does the heavy lifting. HLS works because each segment is just a regular HTTP file (cacheable).
Design a Ride-Sharing Service (Uber)
Core flow. Driver location every 4s → Redis GEO + Kafka. Rider request → Matching Service queries nearby drivers → ETA-based pick → notify driver → ride lifecycle.
Key terms.
- Redis GEO — GEOADD, GEORADIUS for nearby driver search (GEORADIUS is deprecated since Redis 6.2 in favor of GEOSEARCH).
- Geohashing — neighboring cells share prefix.
- Quadtree / H3 (Uber) — alternatives.
- Matching score = ETA + driver rating + acceptance rate + fairness.
- Trip state machine — matching → accepted → arriving → in_progress → completed.
- Surge pricing — supply/demand per hex zone.
- Sharding by city — handles 1.25M location updates/sec.
- Saga + idempotency keys for payment.
Remember. Location firehose (1M+ writes/sec) is the hardest part. Shard by city. Redis for current locations, Kafka for stream, time-series DB for history.
Design a File Storage Service (Dropbox)
Core flow. Client chunks file → hashes each chunk → uploads only new chunks → server tracks chunks per file. Sync notifications via WebSocket → other devices fetch new chunks.
Key terms.
- Chunking — split into 4 MB pieces, SHA-256 hash each.
- Content-addressable storage — same content → same hash → stored once.
- Deduplication — Dropbox saves ~75% on storage.
- Variable-length chunking (Rabin) — handles inserts better than fixed.
- Versioning — keep old versions; restore via chunk hash list.
- Conflict resolution — Last Writer Wins + save conflicting copy.
- Magic Pocket — Dropbox’s custom block storage (left S3 for cost).
Remember. Three pillars: chunking (transfer only changes), dedup (store once), sync notifications (push changes to devices). Metadata in PostgreSQL, blocks in S3.
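Chunking plus content-addressable storage can be sketched as follows. The in-memory dict stands in for the block store, the names are hypothetical, and fixed-size chunks are used for simplicity rather than Rabin chunking:

```python
import hashlib

DEFAULT_CHUNK_SIZE = 4 * 1024 * 1024  # 4 MB


class ChunkStore:
    """Content-addressable store: the SHA-256 of a chunk is its key."""

    def __init__(self):
        self.blobs = {}  # hex digest -> chunk bytes

    def put(self, chunk: bytes):
        digest = hashlib.sha256(chunk).hexdigest()
        is_new = digest not in self.blobs
        if is_new:
            self.blobs[digest] = chunk
        return digest, is_new


def upload(data: bytes, store: ChunkStore, chunk_size=DEFAULT_CHUNK_SIZE):
    """Return (manifest of chunk hashes, number of chunks actually stored)."""
    manifest, stored = [], 0
    for offset in range(0, len(data), chunk_size):
        digest, is_new = store.put(data[offset:offset + chunk_size])
        manifest.append(digest)
        stored += is_new
    return manifest, stored


def download(manifest, store: ChunkStore) -> bytes:
    """Rebuild a file version from its list of chunk hashes."""
    return b"".join(store.blobs[digest] for digest in manifest)
```

Keeping one manifest per version gives versioning almost for free: restoring an old version means reading the chunks its manifest lists.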
Design an E-Commerce Platform (Amazon)
Core flow. Product catalog (cached heavily) → cart (Redis + DB) → checkout → reserve inventory → charge payment → confirm → fulfill.
Key terms.
- Inventory — optimistic locking with version column. Reserved stock pattern (reserve at checkout, confirm at payment success, release on timeout).
- Saga pattern — order → reserve → pay → confirm. Rollbacks (release reservation, refund) on failure.
- Idempotency keys — non-negotiable for payments.
- Search — Elasticsearch synced via CDC.
- Recommendations — collaborative filtering (“bought together”), content-based, personalized.
- Price snapshot — store price_at_order in order_items.
Remember. Never oversell. Optimistic locking + reserved stock + version numbers. Sale events: queue-based checkout, pre-warm caches, feature flags to disable non-essentials. Read-heavy (100:1) so cache everything, shard by user_id.
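Optimistic locking with a version column can be sketched with SQLite. The table layout, SKU, and function names are assumptions for illustration; the same conditional UPDATE works in PostgreSQL or MySQL:

```python
import sqlite3


def make_inventory_db():
    """In-memory database with one product that has a single unit in stock."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE inventory ("
        "product_id TEXT PRIMARY KEY, stock INTEGER NOT NULL, version INTEGER NOT NULL)"
    )
    conn.execute("INSERT INTO inventory VALUES ('sku-1', 1, 0)")
    conn.commit()
    return conn


def read_inventory(conn, product_id):
    """Return (stock, version) as seen by this buyer."""
    return conn.execute(
        "SELECT stock, version FROM inventory WHERE product_id = ?", (product_id,)
    ).fetchone()


def try_reserve(conn, product_id, qty, seen_stock, seen_version):
    """Reserve stock only if nobody changed the row since we read it."""
    if seen_stock < qty:
        return False
    cursor = conn.execute(
        "UPDATE inventory SET stock = stock - ?, version = version + 1 "
        "WHERE product_id = ? AND version = ?",
        (qty, product_id, seen_version),
    )
    conn.commit()
    return cursor.rowcount == 1  # 0 rows means another buyer won the race
```

If two buyers read the same version, only the first UPDATE matches; the second affects zero rows and must re-read and retry, at which point it sees stock 0 and stops.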