Latency vs Bandwidth vs Throughput - Computer Networks

Latency, bandwidth, and throughput sound similar. They’re not. Mixing them up is the root cause of half the “why is my app slow” debates.

Definitions

Latency — how long it takes for one packet to travel from A to B. Measured in milliseconds.
Bandwidth — the theoretical maximum capacity of the link. Measured in bits per second.
Throughput — the actual rate of data we observe end to end. Always ≤ bandwidth.

The Highway Analogy

Imagine a highway between two cities.

Bandwidth = the number of lanes. More lanes = more cars can travel in parallel.
Latency = how long it takes one car to drive from city A to city B. Fixed by speed and distance.
Throughput = how many cars actually arrive per minute. Depends on lanes, speed limit, traffic jams, accidents.

A 16-lane highway (high bandwidth) with a 5-hour drive (high latency) still delivers a lot of cars per hour, because many cars are on the road at once. But if all the lanes are jammed, throughput drops to a trickle.

Why High Bandwidth + High Latency ≠ High Throughput

Throughput is bounded by bandwidth, by latency, and by protocol overhead. TCP, for example, requires acknowledgments: the sender can only have one “window” of unacknowledged data outstanding, then must wait for an ACK before sending more.

That window divided by the round-trip time gives us the maximum effective throughput, no matter how big the pipe is.

Throughput ≤ Window Size / RTT

If we have 1 Gbps of bandwidth but 200 ms RTT and a 64 KB window, throughput is capped around:

64 KB / 0.2 s = 320 KB/s ~= 2.6 Mbps

That cap holds even though the link can carry 1000 Mbps. This is the long fat pipe problem.
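The window limit above can be checked with a few lines of arithmetic. This is a minimal sketch, not a model of real TCP behavior: the function name is made up for illustration, 64 KB is taken as 64,000 bytes, and loss, slow start, and header overhead are ignored.

```python
def window_limited_throughput_bps(window_bytes: float, rtt_s: float, link_bps: float) -> float:
    """Upper bound on one connection's throughput, in bits per second."""
    if rtt_s <= 0:
        raise ValueError("rtt_s must be positive")
    # At most one full window can be delivered per round trip.
    window_bound_bps = window_bytes * 8 / rtt_s
    # The link capacity is still a hard ceiling.
    return min(window_bound_bps, link_bps)


# Example from the text: 64 KB window, 200 ms RTT, 1 Gbps link.
cap_bps = window_limited_throughput_bps(64_000, 0.2, 1e9)
print(f"{cap_bps / 1e6:.2f} Mbps")  # 2.56 Mbps
```

Shrinking the RTT or growing the window raises the cap; adding bandwidth alone does not.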

Bandwidth-Delay Product (BDP)

BDP = bandwidth × RTT. It tells us how much data must be “in flight” at any moment to keep the link full.

BDP = 1 Gbps * 100 ms = 1e9 bits/s * 0.1 s = 1e8 bits = 12.5 MB

If our TCP window is smaller than BDP, we can’t fill the pipe. Modern TCP uses window scaling (RFC 7323) to grow the window beyond the original 64 KB cap.
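As a rough illustration of the BDP calculation and of what window scaling has to achieve, the sketch below computes the BDP in bytes and the smallest scale shift that lets a 16-bit window field cover it. The helper names are hypothetical; the 65,535-byte unscaled maximum and the shift limit of 14 come from RFC 7323.

```python
def bdp_bytes(bandwidth_bps: float, rtt_s: float) -> float:
    """Bandwidth-delay product, converted from bits to bytes."""
    return bandwidth_bps * rtt_s / 8


def min_window_scale(target_window_bytes: float) -> int:
    """Smallest shift s with (65535 << s) >= target, capped at 14 per RFC 7323."""
    max_unscaled = 65_535  # largest value of the 16-bit window field
    shift = 0
    while (max_unscaled << shift) < target_window_bytes and shift < 14:
        shift += 1
    return shift


# Example from the text: 1 Gbps link, 100 ms RTT.
bdp = bdp_bytes(1e9, 0.1)
print(f"BDP = {bdp / 1e6:.1f} MB, window scale shift >= {min_window_scale(bdp)}")
```

For this link the window needs to be about 12.5 MB, which requires a scale shift of 8.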

RTT — Round-Trip Time

The time for a packet to go to a host and come back. Measured by ping.

# Measure RTT to a server
ping -c 5 google.com

# Continuous per-hop monitoring of loss, latency, and its variation
mtr google.com

Typical RTTs:

Same datacenter: < 1 ms
Same city: 5-10 ms
Cross-country (US): 50-80 ms
Cross-continent: 100-200 ms
Satellite (GEO): 600+ ms
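Much of the spread in these numbers comes from distance alone. The sketch below estimates the propagation-only floor on RTT. It assumes signals in fiber travel at about two thirds of the speed of light, and the 4,000 km cross-country path is an illustrative guess rather than a measured route. Real RTTs are higher because of routing detours, queuing, and processing.

```python
C_VACUUM_KM_S = 299_792.458           # speed of light in vacuum
C_FIBER_KM_S = C_VACUUM_KM_S * 2 / 3  # assumed typical speed in optical fiber
GEO_ALTITUDE_KM = 35_786              # geostationary orbit altitude


def min_rtt_ms(one_way_km: float, speed_km_s: float) -> float:
    """Propagation-only round-trip time in milliseconds."""
    return 2 * one_way_km / speed_km_s * 1000


# Assumed 4,000 km fiber path across the US.
print(f"cross-country floor: {min_rtt_ms(4_000, C_FIBER_KM_S):.0f} ms")

# Via a GEO satellite, one direction is ground -> satellite -> ground.
print(f"GEO floor: {min_rtt_ms(2 * GEO_ALTITUDE_KM, C_VACUUM_KM_S):.0f} ms")
```

The floors come out near 40 ms and 480 ms, which is consistent with the typical figures listed above being somewhat higher.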

Why Latency Matters Even When Bandwidth Is Huge

Loading a page often means dozens of small steps in series: a DNS lookup, a TLS handshake, then fetching the HTML, CSS, JS, and so on. Each one costs at least one RTT.

A 100 ms RTT × 10 sequential requests = 1 second of pure waiting, regardless of bandwidth. That’s why CDNs (low latency) often matter more than upgrading the connection (high bandwidth).

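This is also why HTTP/2 multiplexing and HTTP/3 (over QUIC) save time: multiplexing lets many requests share one connection and overlap their round trips, and QUIC needs fewer round trips to set up a connection.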
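The waiting-time arithmetic above can be sketched as a toy model. It assumes every request costs exactly one RTT, ignores transfer and server time, and treats the `concurrent` parameter (a made-up name) as the number of requests that can be in flight at once. Real pages also have dependencies, such as HTML before CSS, that keep some requests serial.

```python
import math


def waiting_time_s(num_requests: int, rtt_s: float, concurrent: int = 1) -> float:
    """Time spent purely waiting on round trips, in seconds."""
    if concurrent < 1:
        raise ValueError("concurrent must be at least 1")
    # Requests go out in batches; each batch costs one round trip.
    rounds = math.ceil(num_requests / concurrent)
    return rounds * rtt_s


serial = waiting_time_s(10, 0.1)                    # one request at a time
overlapped = waiting_time_s(10, 0.1, concurrent=10)  # all ten in flight together
print(f"serial: {serial:.1f} s, overlapped: {overlapped:.1f} s")
```

In this model, ten serial requests at 100 ms RTT wait a full second, while ten overlapped requests wait a single round trip.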

Latency Hierarchy (Approximate)

L1 cache          0.5 ns
L2 cache          5 ns
RAM               100 ns
SSD read          150 us
Network LAN       0.5 ms
HDD seek          10 ms
Mobile (4G/5G)    30-50 ms
Network WAN       50-150 ms

Notice the gap of roughly six orders of magnitude between RAM (100 ns) and a typical WAN round trip (50-150 ms). That’s why caching matters so much.
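To make the caching argument concrete, here is a small sketch that computes the size of the gap and the expected latency of a lookup served from RAM on a hit and over the WAN on a miss. The 100 ms WAN figure is an assumption taken from the middle of the range in the table, and the function names are illustrative.

```python
import math

RAM_S = 100e-9  # RAM access time from the table
WAN_S = 100e-3  # assumed WAN round trip, mid-range of 50-150 ms


def orders_of_magnitude(slow_s: float, fast_s: float) -> float:
    """How many powers of ten separate two latencies."""
    return math.log10(slow_s / fast_s)


def expected_latency_s(hit_rate: float, hit_s: float = RAM_S, miss_s: float = WAN_S) -> float:
    """Average latency when a fraction hit_rate of lookups hit the cache."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * hit_s + (1 - hit_rate) * miss_s


print(f"gap: {orders_of_magnitude(WAN_S, RAM_S):.1f} orders of magnitude")
for hit_rate in (0.0, 0.9, 0.99):
    print(f"hit rate {hit_rate:.0%}: {expected_latency_s(hit_rate) * 1000:.1f} ms on average")
```

Because the miss cost dominates, the average latency is set almost entirely by the miss rate: going from 90% to 99% hits cuts it by about a factor of ten.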

Measuring Throughput

# iperf3 - the standard tool for measuring real throughput
# On the server
iperf3 -s
# On the client
iperf3 -c server.example.com -t 30

iperf3 reports actual achievable throughput. Compare with the link’s advertised bandwidth — the gap is overhead, congestion, and protocol limits.
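The comparison can be automated if iperf3 is run with its `-J` (JSON output) flag. The sketch below assumes a TCP test whose report exposes the receiver-side rate under `end.sum_received.bits_per_second`; treat that path as an assumption to verify against your iperf3 version. The sample report is hand-written for illustration, not a real measurement.

```python
import json


def link_utilization(iperf3_json: str, advertised_bps: float) -> float:
    """Fraction of the advertised bandwidth that the test actually achieved."""
    report = json.loads(iperf3_json)
    # Assumed location of the receiver-side average rate in a TCP report.
    measured_bps = report["end"]["sum_received"]["bits_per_second"]
    return measured_bps / advertised_bps


# Hand-written, heavily trimmed sample; a real report has many more fields.
SAMPLE_REPORT = '{"end": {"sum_received": {"bits_per_second": 940000000.0}}}'

print(f"utilization: {link_utilization(SAMPLE_REPORT, 1e9):.0%}")
```

A real report could be captured with something like `iperf3 -c server.example.com -t 30 -J > report.json` and read from that file.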

Interview Tip

If asked “would you rather have 10x bandwidth or 1/10 latency?” — for most user-facing workloads (web pages, APIs, gaming), lower latency wins. Bandwidth helps with bulk transfers (video streaming, backups). Always tie the answer to the workload.
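One rough way to tie the answer to the workload is a two-term model: time spent on round trips plus time spent pushing bytes through the link. The numbers below (3 round trips, a 50 KB API response, a 10 GB backup, a 100 Mbps link with 100 ms RTT) are illustrative assumptions, and the model ignores the TCP window limit discussed earlier.

```python
def transfer_time_s(size_bytes: float, bandwidth_bps: float, rtt_s: float, round_trips: int = 3) -> float:
    """Round-trip waiting time plus serialization time, in seconds."""
    return round_trips * rtt_s + size_bytes * 8 / bandwidth_bps


BASE_BPS, BASE_RTT_S = 100e6, 0.1  # assumed baseline link

workloads = {"API response (50 KB)": 50_000, "backup (10 GB)": 10_000_000_000}
for name, size in workloads.items():
    base = transfer_time_s(size, BASE_BPS, BASE_RTT_S)
    more_bandwidth = transfer_time_s(size, BASE_BPS * 10, BASE_RTT_S)
    less_latency = transfer_time_s(size, BASE_BPS, BASE_RTT_S / 10)
    print(f"{name}: base {base:.3f} s, 10x bandwidth {more_bandwidth:.3f} s, "
          f"1/10 latency {less_latency:.3f} s")
```

Under these assumptions the small response is almost entirely round-trip time, so cutting latency helps far more, while the backup is almost entirely serialization time, so extra bandwidth wins.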