Back-of-the-envelope estimation is quick, rough math to figure out the scale of our system. We’re not trying to be exact — we just need to know if we’re dealing with thousands or billions, megabytes or petabytes. That changes everything about our design.
Why Estimation Matters
If our system gets 10 requests per second, a single server is fine. If it gets 100,000 requests per second, we need load balancers, caching, sharding, and a whole different architecture. Estimation tells us which world we’re in.
Numbers Every Engineer Should Know
| What | How Much |
|---|---|
| L1 cache reference | 0.5 ns |
| L2 cache reference | 7 ns |
| RAM reference | 100 ns |
| SSD random read | 150 μs |
| HDD seek | 10 ms |
| Round trip within same datacenter | 0.5 ms |
| Round trip CA to Netherlands | 150 ms |
And for storage:
| Unit | Bytes | Practical Example |
|---|---|---|
| 1 KB | 1,000 | A short email |
| 1 MB | 1,000,000 | A high-res photo |
| 1 GB | 1,000,000,000 | A movie |
| 1 TB | 10^12 | 1,000 movies |
| 1 PB | 10^15 | 1 million movies |
Handy shortcut: There are 86,400 seconds in a day. For quick math, round to ~100,000 (10^5); this understates per-second rates by about 14%, which is well within the precision we need. A 30-day month is about 2.6 million seconds.
The Four Key Calculations
1. QPS (Queries Per Second)
QPS = Daily Active Users × Queries per User / 86,400
Peak QPS = QPS × 2 (or ×3 for spiky traffic)
Example: 10 million DAU, each user makes 5 requests/day.
QPS = 10M × 5 / 86,400 ≈ 580 QPS
Peak QPS ≈ 1,160 QPS
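The QPS arithmetic above can be sketched as a small helper. This is only an illustration: the function name `estimate_qps` and the default `peak_factor` of 2 are assumptions taken from the rule of thumb in the text, not a fixed standard.

```python
SECONDS_PER_DAY = 86_400


def estimate_qps(daily_active_users, queries_per_user, peak_factor=2.0):
    """Return (average QPS, peak QPS) for the given daily load."""
    average = daily_active_users * queries_per_user / SECONDS_PER_DAY
    return average, average * peak_factor


# Numbers from the example: 10 million DAU, 5 requests per user per day.
average, peak = estimate_qps(10_000_000, 5)
print(f"average ≈ {average:.0f} QPS, peak ≈ {peak:.0f} QPS")
```

The exact result is about 579 QPS on average, which the example rounds to 580. Passing `peak_factor=3` models the spiky-traffic case.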
2. Storage
Storage = Daily New Records × Record Size × Retention Period
Example: 100M new URLs per day, each URL record is 500 bytes, keep for 5 years.
Daily = 100M × 500 bytes = 50 GB/day
Yearly = 50 GB × 365 ≈ 18 TB/year
5 years = ~90 TB total
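The same storage arithmetic, written out as a short script so each step is visible. It assumes decimal units (1 GB = 10^9 bytes), matching the storage table earlier; the variable names are illustrative.

```python
GB = 10**9
TB = 10**12

# Inputs from the example above.
records_per_day = 100_000_000
record_size_bytes = 500
retention_years = 5

daily_bytes = records_per_day * record_size_bytes
yearly_bytes = daily_bytes * 365
total_bytes = yearly_bytes * retention_years

print(f"daily  ≈ {daily_bytes / GB:.0f} GB")
print(f"yearly ≈ {yearly_bytes / TB:.2f} TB")
print(f"total  ≈ {total_bytes / TB:.2f} TB")
```

The unrounded total is 91.25 TB; rounding it to ~90 TB loses nothing that would affect the design.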
3. Bandwidth
Incoming = QPS × Request Size
Outgoing = QPS × Response Size
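As a rough worked example of the two bandwidth formulas, the sketch below reuses the ~580 QPS figure from the QPS section. The request size (1 KB) and response size (10 KB) are made-up assumptions for illustration only; plug in whatever sizes fit the system being designed.

```python
def estimate_bandwidth(qps, request_bytes, response_bytes):
    """Return (incoming, outgoing) bandwidth in bytes per second."""
    return qps * request_bytes, qps * response_bytes


# 580 QPS comes from the QPS example; the sizes are hypothetical.
incoming, outgoing = estimate_bandwidth(qps=580, request_bytes=1_000, response_bytes=10_000)

print(f"incoming ≈ {incoming / 10**6:.2f} MB/s")
print(f"outgoing ≈ {outgoing / 10**6:.2f} MB/s")
# Network links are usually quoted in bits per second, so multiply by 8.
print(f"outgoing ≈ {outgoing * 8 / 10**6:.1f} Mbit/s")
```

For peak bandwidth, pass the peak QPS instead of the average.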
4. Memory for Cache
We usually cache the hot data, meaning the most frequently accessed items. A common rule of thumb is to size the cache for 20% of daily requests, following the 80/20 rule: roughly 20% of the data serves 80% of the traffic.
Cache Memory = Daily Requests × 0.2 × Average Response Size
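A minimal sketch of the cache formula, assuming the 50 million daily requests from the QPS example (10M DAU × 5) and a hypothetical average response size of 10 KB. The 10 KB figure is an assumption for illustration, not a number from this article.

```python
GB = 10**9


def estimate_cache_bytes(daily_requests, avg_response_bytes, hot_fraction=0.2):
    """Memory needed to cache the hot fraction of one day's responses."""
    return daily_requests * hot_fraction * avg_response_bytes


cache_bytes = estimate_cache_bytes(daily_requests=50_000_000, avg_response_bytes=10_000)
print(f"cache ≈ {cache_bytes / GB:.0f} GB")
```

Under these assumptions the cache needs roughly 100 GB, which suggests a small cluster of cache nodes rather than a single in-process cache.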
Full Example: Estimate Twitter’s Storage
Let’s say Twitter has:
- 300M monthly active users, 50% are daily → 150M DAU
- Each user posts 2 tweets/day on average
- Each tweet: 140 chars (~280 bytes) + metadata (~200 bytes) = ~500 bytes
- 10% of tweets have a photo (~500 KB average)
Tweet storage per day:
Text: 150M × 2 × 500 bytes = 150 GB/day
Photos: 150M × 2 × 0.10 × 500 KB = 15 TB/day
Per year:
Text: ~55 TB/year
Photos: ~5.5 PB/year
That tells us we need a serious storage strategy — object storage (like S3) for media, and sharded databases for tweet metadata.
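The whole Twitter estimate fits in a few lines of code, which makes it easy to change one assumption and see how the totals move. All inputs below are the assumed figures from the list above, and decimal units are used throughout.

```python
GB, TB, PB = 10**9, 10**12, 10**15

monthly_active_users = 300_000_000
daily_active_users = monthly_active_users // 2   # 50% of MAU are daily
tweets_per_day = daily_active_users * 2          # 2 tweets per user per day

text_bytes_per_day = tweets_per_day * 500                  # ~500 bytes per tweet
photo_bytes_per_day = tweets_per_day // 10 * 500_000       # 10% carry a ~500 KB photo

text_bytes_per_year = text_bytes_per_day * 365
photo_bytes_per_year = photo_bytes_per_day * 365

print(f"text:   {text_bytes_per_day / GB:.0f} GB/day, {text_bytes_per_year / TB:.2f} TB/year")
print(f"photos: {photo_bytes_per_day / TB:.0f} TB/day, {photo_bytes_per_year / PB:.3f} PB/year")
```

Photos outweigh text by a factor of 100 here, which is why the media path dominates the storage design.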
Powers of Two — Quick Reference
| Power | Value | Size |
|---|---|---|
| 2^10 | ~1 Thousand (1,024) | ~1 KB |
| 2^20 | ~1 Million | ~1 MB |
| 2^30 | ~1 Billion | ~1 GB |
| 2^40 | ~1 Trillion | ~1 TB |
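The "~" in the table hides a gap that grows with each step. The short sketch below measures how much larger each power of two is than the power of ten it is rounded to; the helper name `relative_gap` is just an illustrative choice.

```python
def relative_gap(binary_exp, decimal_exp):
    """Fraction by which 2**binary_exp exceeds 10**decimal_exp."""
    return 2**binary_exp / 10**decimal_exp - 1


# (power of two, power of ten, unit) for each row of the table.
rows = [(10, 3, "KB"), (20, 6, "MB"), (30, 9, "GB"), (40, 12, "TB")]

for binary_exp, decimal_exp, unit in rows:
    gap = relative_gap(binary_exp, decimal_exp)
    print(f"2^{binary_exp} vs 10^{decimal_exp} (~1 {unit}): {gap:.1%} larger")
```

The gap runs from 2.4% at the KB level to about 10% at the TB level, which is still far below the order-of-magnitude precision these estimates aim for.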
Tips for the Interview
- State assumptions clearly. “I’m assuming 100M DAU” — the interviewer can correct us.
- Round aggressively. Use 10^5 instead of 86,400. Nobody expects exact math.
- Focus on order of magnitude. The difference between 50 TB and 90 TB doesn’t change our design. The difference between 50 GB and 50 TB does.
- Don’t spend more than 5 minutes. Estimation supports the design; it’s not the main event.
In short, back-of-the-envelope estimation is about getting a feel for the scale. Are we building a bicycle or a spaceship? The math takes 5 minutes but saves us from designing something wildly over- or under-engineered.