Latency vs Bandwidth vs Throughput

Latency, bandwidth, and throughput sound similar. They’re not. Mixing them up is the root cause of half the “why is my app slow” debates.

Definitions

  • Latency — how long it takes for one packet to travel from A to B. Measured in milliseconds.
  • Bandwidth — the theoretical maximum capacity of the link. Measured in bits per second.
  • Throughput — the actual rate of data we observe end to end. Always ≤ bandwidth.

The Highway Analogy

Imagine a highway between two cities.

  • Bandwidth = the number of lanes. More lanes = more cars can travel in parallel.
  • Latency = how long it takes one car to drive from city A to city B. Fixed by speed and distance.
  • Throughput = how many cars actually arrive per minute. Depends on lanes, speed limit, traffic jams, accidents.

A 16-lane highway (high bandwidth) with a 5-hour drive (high latency) still delivers a lot of cars per hour because there are so many lanes. But if all the lanes are jammed, throughput drops to a trickle.

Why High Bandwidth Alone ≠ High Throughput

Throughput is bounded by both bandwidth and latency, plus protocol overhead. TCP, for example, requires acknowledgments. The sender can only send a “window” of unacknowledged data, then must wait for an ACK before sending more.

That window divided by the round-trip time gives us the maximum effective throughput, no matter how big the pipe is.

Throughput ~= Window Size / RTT

If we have 1 Gbps of bandwidth but 200 ms RTT and a 64 KB window, throughput is capped around:

64 KB / 0.2 s = 320 KB/s ~= 2.6 Mbps

That cap holds even though the link can carry 1000 Mbps. This is the long fat pipe problem: high bandwidth combined with high latency leaves most of the link idle.
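
The window/RTT bound is easy to check numerically. The following is a minimal Python sketch, not a model of real TCP behavior: the function names are made up for illustration, and it assumes a fixed window of 64 KiB (65,536 bytes) with no loss or congestion control.

```python
def window_limited_throughput_bps(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound in bits/s when at most one window is sent per round trip."""
    if rtt_seconds <= 0:
        raise ValueError("rtt_seconds must be positive")
    return window_bytes * 8 / rtt_seconds


def effective_throughput_bps(window_bytes: int, rtt_seconds: float, link_bps: float) -> float:
    """Throughput is capped by the smaller of the window limit and the link itself."""
    return min(window_limited_throughput_bps(window_bytes, rtt_seconds), link_bps)


if __name__ == "__main__":
    window = 64 * 1024   # assumed fixed window: 64 KiB, in bytes
    rtt = 0.2            # 200 ms round trip
    link = 1e9           # 1 Gbps link
    cap = effective_throughput_bps(window, rtt, link)
    print(f"Effective cap: {cap / 1e6:.2f} Mbps on a {link / 1e6:.0f} Mbps link")
```

With these inputs the cap comes out at a few Mbps, in line with the estimate above. Shrinking the RTT to well under a millisecond makes the link, not the window, the limiting factor.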

Bandwidth-Delay Product (BDP)

BDP = bandwidth × RTT. It tells us how much data must be “in flight” (sent but not yet acknowledged) to keep the link full.

BDP = 1 Gbps * 100 ms = 1e9 bits/s * 0.1 s = 1e8 bits = 12.5 MB

If our TCP window is smaller than BDP, we can’t fill the pipe. Modern TCP uses window scaling (RFC 7323) to grow the window beyond the original 64 KB cap.
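
As a rough illustration, the sketch below computes the BDP and the smallest window-scale shift that would cover it. The helper names are hypothetical. The two constants reflect the 16-bit TCP window field and the maximum shift count of 14 from RFC 7323. Real stacks choose their scale factor from buffer sizes, so treat this as arithmetic, not as what an OS actually negotiates.

```python
MAX_UNSCALED_WINDOW = 65535  # largest value of the 16-bit TCP window field
MAX_WINDOW_SHIFT = 14        # largest shift count permitted by RFC 7323


def bdp_bytes(bandwidth_bps: float, rtt_seconds: float) -> float:
    """Bandwidth-delay product, converted from bits to bytes."""
    return bandwidth_bps * rtt_seconds / 8


def min_window_shift(target_window_bytes: float) -> int:
    """Smallest shift such that 65535 << shift covers the target window."""
    for shift in range(MAX_WINDOW_SHIFT + 1):
        if (MAX_UNSCALED_WINDOW << shift) >= target_window_bytes:
            return shift
    raise ValueError("target window exceeds the maximum scaled TCP window")


if __name__ == "__main__":
    bdp = bdp_bytes(1e9, 0.1)  # 1 Gbps link, 100 ms RTT
    shift = min_window_shift(bdp)
    print(f"BDP: {bdp / 1e6:.1f} MB, minimum window-scale shift: {shift}")
```

For the 1 Gbps, 100 ms example, the unscaled 64 KB window covers well under 1% of the BDP, which is why scaling is needed to fill such a link.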

RTT — Round-Trip Time

The time for a packet to go to a host and come back. Measured by ping.

# Measure RTT to a server
ping -c 5 google.com

# Continuous per-hop latency and loss (the StDev column indicates jitter)
mtr google.com

Typical RTTs:

  • Same datacenter: < 1 ms
  • Same city: 5-10 ms
  • Cross-country (US): 50-80 ms
  • Cross-continent: 100-200 ms
  • Satellite (GEO): 600+ ms
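
When ICMP is blocked or raw sockets are not available, timing a TCP handshake is a common stand-in for ping. The sketch below is one way to do that with Python's standard library. The function names are illustrative, and the result is only an approximation of RTT: it includes local connection setup overhead, and DNS resolution time too if a hostname is passed instead of an IP address.

```python
import socket
import time


def tcp_connect_rtt_ms(host: str, port: int, timeout: float = 2.0) -> float:
    """Time one TCP connection setup to (host, port), in milliseconds."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        elapsed = time.perf_counter() - start
    return elapsed * 1000.0


def sample_rtts_ms(host: str, port: int, samples: int = 5) -> list[float]:
    """Collect several handshake timings so outliers are visible."""
    return [tcp_connect_rtt_ms(host, port) for _ in range(samples)]
```

A call such as sample_rtts_ms("192.0.2.10", 443) (a placeholder address) returns a list of timings in milliseconds. Looking at the minimum and the spread is usually more informative than a single sample.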

Why Latency Matters Even When Bandwidth Is Huge

Loading a page often means dozens of small sequential steps: DNS lookup, TLS handshake, fetch HTML, fetch CSS, fetch JS, etc. Each one pays at least one RTT.

A 100 ms RTT × 10 sequential requests = 1 second of pure waiting, regardless of bandwidth. That’s why CDNs (low latency) often matter more than upgrading the connection (high bandwidth).

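This is also why HTTP/2 multiplexing and HTTP/3 (over QUIC) save time: many requests share one connection concurrently instead of queuing behind each other, and QUIC needs fewer round trips to set up a connection.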
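
To see how much parallelism helps, here is a deliberately idealized Python sketch. It assumes every request costs exactly one RTT, ignores transfer time and connection setup, and uses a hypothetical max_in_flight parameter to stand in for how many requests can be outstanding at once.

```python
import math


def waiting_time_seconds(num_requests: int, rtt_seconds: float, max_in_flight: int = 1) -> float:
    """Time spent waiting on round trips when requests go out in batches."""
    if max_in_flight < 1:
        raise ValueError("max_in_flight must be at least 1")
    rounds = math.ceil(num_requests / max_in_flight)
    return rounds * rtt_seconds


if __name__ == "__main__":
    rtt = 0.1  # 100 ms
    for in_flight in (1, 2, 10):
        wait = waiting_time_seconds(10, rtt, in_flight)
        print(f"{in_flight:>2} in flight: {wait:.1f} s waiting")
```

With one request in flight at a time, 10 requests cost the full second from the example above. With all 10 in flight together, the wait collapses to a single round trip.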

Latency Hierarchy (Approximate)

L1 cache          0.5 ns
L2 cache          5 ns
RAM               100 ns
SSD read          150 us
Network LAN       0.5 ms
HDD seek          10 ms
Mobile (4G/5G)    30-50 ms
Network WAN       50-150 ms

Notice the roughly 6 orders of magnitude between RAM (100 ns) and a typical WAN round trip (50-150 ms). That’s why caching matters so much.
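
The gap can be checked with a base-10 logarithm. This small Python sketch uses the approximate figures from the table above; the function name is just for illustration.

```python
import math


def orders_of_magnitude(slow_seconds: float, fast_seconds: float) -> float:
    """How many powers of ten separate two latencies."""
    return math.log10(slow_seconds / fast_seconds)


if __name__ == "__main__":
    ram = 100e-9  # 100 ns, from the table
    for wan in (50e-3, 150e-3):  # 50 ms and 150 ms WAN round trips
        gap = orders_of_magnitude(wan, ram)
        print(f"WAN {wan * 1e3:.0f} ms vs RAM 100 ns: {gap:.1f} orders of magnitude")
```

For the table's WAN range this gives about 5.7 to 6.2, so a round trip across the WAN costs on the order of a million RAM accesses.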

Measuring Throughput

# iperf3 - the standard tool for measuring real throughput
# On the server
iperf3 -s
# On the client
iperf3 -c server.example.com -t 30

iperf3 reports actual achievable throughput. Compare with the link’s advertised bandwidth — the gap is overhead, congestion, and protocol limits.

Interview Tip

If asked “would you rather have 10x bandwidth or 1/10 latency?” — for most user-facing workloads (web pages, APIs, gaming), lower latency wins. Bandwidth helps with bulk transfers (video streaming, backups). Always tie the answer to the workload.
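
One way to back that answer with numbers is a simple cost model: time spent waiting on round trips plus time spent pushing bytes. The Python sketch below is an idealization under stated assumptions. The workload sizes, round-trip counts, and the 100 ms / 100 Mbps baseline are made-up illustrative values, and the model ignores TCP window limits, loss, and server time.

```python
def transfer_time_seconds(size_bytes: float, round_trips: int,
                          rtt_seconds: float, bandwidth_bps: float) -> float:
    """Idealized total time: round-trip waiting plus raw transfer time."""
    return round_trips * rtt_seconds + size_bytes * 8 / bandwidth_bps


def compare_upgrades(size_bytes: float, round_trips: int,
                     rtt_seconds: float = 0.1, bandwidth_bps: float = 100e6) -> dict:
    """Total time for the baseline and for each of the two upgrade options."""
    return {
        "baseline": transfer_time_seconds(size_bytes, round_trips, rtt_seconds, bandwidth_bps),
        "10x bandwidth": transfer_time_seconds(size_bytes, round_trips, rtt_seconds, bandwidth_bps * 10),
        "1/10 latency": transfer_time_seconds(size_bytes, round_trips, rtt_seconds / 10, bandwidth_bps),
    }


if __name__ == "__main__":
    # Hypothetical workloads: (payload size in bytes, sequential round trips)
    workloads = {
        "web page (2 MB, 10 round trips)": (2_000_000, 10),
        "backup (10 GB, 2 round trips)": (10_000_000_000, 2),
    }
    for name, (size, trips) in workloads.items():
        print(name)
        for option, seconds in compare_upgrades(size, trips).items():
            print(f"  {option:>14}: {seconds:8.2f} s")
```

Under these assumptions the web page benefits far more from lower latency, while the backup benefits almost entirely from more bandwidth, which matches the workload-dependent answer above.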