Load Balancing (L4 vs L7)

A load balancer sits in front of multiple servers and decides which one handles each request. The two main types — L4 and L7 — operate at different layers of the OSI model and have very different superpowers.

L4 — Transport Layer Load Balancing

L4 means transport layer — TCP and UDP. The LB only sees IPs, ports, and TCP flags. It doesn’t read the bytes inside.

In simple language: it forwards packets like a smart router. “This TCP connection goes to backend B” — and from then on every packet of that connection goes to B.

  • Examples: AWS NLB, HAProxy in TCP mode, Linux IPVS (the kernel load balancer behind LVS).
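  • Speed: Very fast. Per-packet work is little more than a connection-table lookup, so a well-tuned node can handle millions of connections; exact numbers depend on hardware and implementation.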
  • Visibility: None into the application. Can’t route by URL or hostname.
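  • TLS: Usually pass-through. The LB forwards the encrypted bytes and the backend terminates TLS. Some L4 products (for example AWS NLB with a TLS listener) can terminate TLS, but they still don’t parse HTTP.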
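
To make the "decide once per connection" idea concrete, here is a minimal Python sketch of a connection table. It is a toy model, not how NLB or IPVS are implemented: the class name `L4Balancer`, the round-robin choice for new connections, and the addresses are all assumptions made for illustration.

```python
from itertools import cycle
from typing import Dict, Iterator, List, Tuple

# (protocol, src_ip, src_port, dst_ip, dst_port) identifies one connection.
FiveTuple = Tuple[str, str, int, str, int]


class L4Balancer:
    """Toy connection-tracking balancer: decides per connection, not per packet."""

    def __init__(self, backends: List[str]) -> None:
        self._next_backend: Iterator[str] = cycle(backends)
        self._conn_table: Dict[FiveTuple, str] = {}

    def forward(self, flow: FiveTuple) -> str:
        """Return the backend for a packet that belongs to `flow`."""
        if flow not in self._conn_table:
            # First packet of a new connection: pick a backend once.
            self._conn_table[flow] = next(self._next_backend)
        # Every later packet of the same connection reuses that choice.
        return self._conn_table[flow]

    def close(self, flow: FiveTuple) -> None:
        """Forget a finished connection (for example after FIN/RST or a timeout)."""
        self._conn_table.pop(flow, None)


lb = L4Balancer(["10.0.1.10:3000", "10.0.1.11:3000"])
conn_a = ("tcp", "203.0.113.7", 51000, "198.51.100.1", 443)
conn_b = ("tcp", "203.0.113.8", 51001, "198.51.100.1", 443)

first = lb.forward(conn_a)   # new connection, gets the first backend
second = lb.forward(conn_b)  # another new connection, gets the next backend
again = lb.forward(conn_a)   # same connection as before, same backend
```

Note that the lookup key contains only addresses, ports, and the protocol. Nothing in it depends on the payload, which is why this layer cannot route by URL or hostname.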

L7 — Application Layer Load Balancing

L7 means application layer — usually HTTP/HTTPS. The LB parses the request, reads headers, URL, cookies, body.

In simple language: it acts like a smart receptionist. “Requests to /api/v2/... go to the new service. Mobile user-agents go to the mobile pool.”

  • Examples: AWS ALB, NGINX, HAProxy in HTTP mode, Envoy, Traefik.
  • Speed: Slower than L4 (parsing has cost), but still tens of thousands of req/s easily.
  • Visibility: Full HTTP. Can route by path, host, header, cookie.
  • TLS: Terminates TLS at the LB. Backends can be plain HTTP inside the VPC.
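
The receptionist analogy can be sketched as a small routing function. This is only an illustration in Python: the function name `choose_pool` and the pool names are hypothetical, and a real L7 balancer expresses such rules in its own configuration language rather than in application code.

```python
from typing import Mapping


def choose_pool(path: str, headers: Mapping[str, str]) -> str:
    """Pick a backend pool from request data that only an L7 balancer can see."""
    # HTTP header names are case-insensitive, so normalise them before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}

    # Rule 1: requests to /api/v2/... go to the new service.
    if path.startswith("/api/v2/"):
        return "new-service"
    # Rule 2: mobile user-agents go to the mobile pool.
    if "mobile" in normalized.get("user-agent", "").lower():
        return "mobile-pool"
    # Everything else falls through to a default pool.
    return "default-pool"
```

Both rules depend on the parsed request (path and `User-Agent` header), which is exactly the information an L4 balancer never looks at.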

Side by Side

              L4 (Transport)                 L7 (Application)
Role          TCP / UDP forwarding           HTTP-aware routing
Sees          IPs, ports                     Headers, URL, cookies
Speed         Very fast                      Slower (still fast)
Routing       By IP/port only                Path, host, header
TLS           Pass-through                   Terminates here
Tools         NLB, HAProxy TCP               ALB, NGINX, Envoy
Use when      Raw TCP, high QPS, non-HTTP    Microservices, path routing, sticky sessions

L7 Superpowers

L7 unlocks features L4 simply cannot do:

  • Path-based routing: /api/* → api pool, /static/* → static pool.
  • Host-based routing: api.example.com → one pool, app.example.com → another.
  • Sticky sessions: the same user goes to the same backend (via a cookie).
  • Header rewrites: add X-Forwarded-For, normalize hostnames.
  • TLS termination: certificates are managed in one place.
  • Request retries: retry a 502 against a different backend (safest for idempotent requests).
  • Canary deployments: route 1% of traffic to a new version.
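
Two of the features above, canary deployments and sticky sessions, can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the cookie name `lb_backend`, the function names, and hashing a user ID into 100 buckets are all hypothetical choices, and production balancers typically store an opaque backend ID in the cookie rather than a raw address.

```python
import hashlib
from typing import List, Mapping, Optional, Tuple

STICKY_COOKIE = "lb_backend"  # hypothetical cookie name


def in_canary(user_id: str, percent: int) -> bool:
    """Deterministically place roughly `percent`% of users in the canary group."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100  # stable bucket 0..99
    return bucket < percent


def pick_sticky(
    cookies: Mapping[str, str], backends: List[str], fallback_index: int
) -> Tuple[str, Optional[str]]:
    """Return (backend, Set-Cookie value or None) for a request."""
    pinned = cookies.get(STICKY_COOKIE)
    if pinned in backends:
        # Returning user whose backend is still in the pool: keep them there.
        return pinned, None
    # New user, or the pinned backend is gone: choose one and pin it.
    chosen = backends[fallback_index % len(backends)]
    return chosen, f"{STICKY_COOKIE}={chosen}"
```

Hashing the user ID keeps each user consistently on the same side of the canary split, which is usually preferable to flipping a coin on every request.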

Algorithms

How does the LB pick which backend? Common algorithms:

  • Round Robin: backend 1, 2, 3, 1, 2, 3… Simple, but doesn’t account for load.
  • Least Connections: pick whichever backend has the fewest active connections. Good for long-lived connections.
  • IP Hash: hash the client IP so the same client always lands on the same backend. A naive sticky-session option; many clients behind one NAT all hash to the same backend.
  • Weighted: backend A is bigger, so give it 60%; B and C get 20% each. Good when servers are heterogeneous.
  • Least Response Time: pick the backend with the lowest observed latency. Load-aware, but it needs per-backend latency tracking; NGINX Plus offers it as least_time.
  • Power of Two Choices: pick two random backends and send to whichever has fewer active requests. Cheap, with near-optimal load distribution; Envoy’s least-request policy is based on this approach.
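
The selection rules above are short enough to sketch directly. The Python below is a minimal illustration, not any particular balancer's implementation: the function names are made up for this example, and `active` is assumed to be a mapping from backend name to its current connection count, which the balancer would maintain elsewhere.

```python
import hashlib
import random
from itertools import cycle
from typing import Dict, Iterator, List


def round_robin(backends: List[str]) -> Iterator[str]:
    """Yield backends in a fixed rotation, ignoring load."""
    return cycle(backends)


def least_connections(active: Dict[str, int]) -> str:
    """Pick the backend with the fewest active connections."""
    return min(active, key=active.get)


def ip_hash(client_ip: str, backends: List[str]) -> str:
    """Map a client IP to a backend; stable while the backend list is unchanged."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    return backends[int.from_bytes(digest[:4], "big") % len(backends)]


def weighted(weights: Dict[str, int], rng: random.Random) -> str:
    """Pick a backend with probability proportional to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


def power_of_two(active: Dict[str, int], rng: random.Random) -> str:
    """Sample two distinct backends and keep the less loaded one."""
    first, second = rng.sample(list(active), 2)
    return first if active[first] <= active[second] else second
```

A short usage example: `least_connections({"a": 5, "b": 2, "c": 9})` selects `"b"`, while `power_of_two` inspects only two of the counters per request, which is what makes it cheap when the pool is large.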

Health Checks

A load balancer is only useful if it stops sending traffic to dead backends. Both L4 and L7 do periodic health checks:

  • L4 — TCP connect: did the port answer?
  • L7 — HTTP GET /healthz: did we get a 200?

L7 health checks are stronger because a server can accept TCP but be returning 500s — L4 wouldn’t notice.
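
The difference between the two probes can be sketched in Python using only the standard library. The function names `tcp_check` and `http_check` are hypothetical, and the one-second timeout is an arbitrary assumption; real balancers also require several consecutive failures before marking a backend down.

```python
import http.client
import socket
import urllib.error
import urllib.request

# Health probes should hit the backend directly, so bypass any configured proxy.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def tcp_check(host: str, port: int, timeout: float = 1.0) -> bool:
    """L4-style probe: healthy if the TCP handshake succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def http_check(url: str, timeout: float = 1.0) -> bool:
    """L7-style probe: healthy only if the endpoint answers with HTTP 200."""
    try:
        with _opener.open(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Covers 4xx/5xx responses (HTTPError), refused connections, and timeouts.
        return False
```

A backend whose process is up but failing every request would pass `tcp_check` and fail `http_check`, which is the gap described above.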

Example: NGINX L7 Config

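upstream api_backend {
    least_conn;                          # prefer the backend with the fewest active connections
    server 10.0.1.10:3000 weight=2;      # receives roughly twice the share of an unweighted server
    server 10.0.1.11:3000;
    server 10.0.1.12:3000 backup;        # used only when the primary servers are unavailable
}

server {
    listen 443 ssl;
    server_name api.example.com;

    # TLS terminates here. These certificate paths are placeholders.
    ssl_certificate     /etc/nginx/tls/api.example.com.crt;
    ssl_certificate_key /etc/nginx/tls/api.example.com.key;

    location /v1/ {
        proxy_pass http://api_backend;   # plain HTTP to the backends
        proxy_set_header Host $host;
        # Append the client IP to any existing X-Forwarded-For header.
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}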

Real-World Stack: Use Both

Big architectures often layer them:

Client -> L4 (NLB, terminates nothing, fast) -> L7 (ALB/NGINX, routes by path, terminates TLS) -> Backends

The L4 layer absorbs raw connections at line rate. The L7 layer does the smart stuff.

Interview Tip

Don’t say “L4 is faster, so always use L4.” Say “L4 is faster but blind. Use L7 when we need HTTP-level features (path routing, sticky sessions, TLS termination); use L4 when we need raw TCP throughput or the protocol isn’t HTTP.” Bonus: mention that ALB is L7, NLB is L4 — comes up constantly in AWS interviews.