Consistent Hashing

intermediate 4-7 YOE consistent-hashing distributed-systems hash-ring scaling

We just learned about hash-based sharding: shard = hash(key) % N. The problem? When we add or remove a server (N changes), almost every key maps to a different server. We have to move nearly all the data. Consistent hashing fixes this.

The Problem with Regular Hashing

Say we have 3 servers and we use hash(key) % 3 to decide where data goes.

hash("user:1") % 3 = 1  → Server 1
hash("user:2") % 3 = 0  → Server 0
hash("user:3") % 3 = 2  → Server 2

Now we add a 4th server. Everything becomes hash(key) % 4:

hash("user:1") % 4 = 1  → Server 1 (same)
hash("user:2") % 4 = 2  → Server 2 (MOVED!)
hash("user:3") % 4 = 3  → Server 3 (MOVED!)

Most keys end up on different servers. If these are cache servers, we just invalidated almost our entire cache. If these are database shards, we need to migrate a huge amount of data. With a million keys, roughly (N-1)/N of them move. That’s brutal.

How Consistent Hashing Works

The core idea: instead of hash % N, we arrange everything on a ring (a circle from 0 to some max value).

  1. Hash each server onto the ring (using the server’s name or IP)
  2. Hash each key onto the same ring
  3. Walk clockwise from the key’s position — the first server we hit owns that key
Hash Ring
A B C k1 k2 k3 clockwise ↻
k1 → clockwise → B
k2 → clockwise → C
k3 → clockwise → A

The magic: when we add a new server D to the ring, only the keys between D and the next server counter-clockwise need to move to D. Everything else stays put. Instead of moving ~75% of keys (with modular hashing), we move roughly 1/N of keys. Same thing when a server is removed — only its keys redistribute to the next server clockwise.

Virtual Nodes

There’s a catch with basic consistent hashing. If servers aren’t evenly spaced on the ring, some servers end up with way more keys than others. And with just 3-4 servers, the distribution is often terrible.

The fix is virtual nodes (vnodes). Instead of placing each server once on the ring, we place it many times at different positions.

Virtual Nodes on the Ring
Without virtual nodes (uneven):
A (50%)
B (15%)
C (35%)
With virtual nodes (balanced):
A1
C1
B1
A2
B2
C2
Each server gets ~33% of keys with virtual nodes

Server A gets virtual nodes A-1, A-2, A-3, ... A-100 spread across the ring. More virtual nodes = more even distribution. In practice, systems use 100-200 virtual nodes per physical server.

Another cool benefit: if a server is more powerful, we give it more virtual nodes so it handles more keys. Simple and elegant.

Where Consistent Hashing Is Used

It’s everywhere in distributed systems:

  • Distributed caches (Memcached, Redis Cluster) — deciding which cache server holds which key
  • CDNs (Akamai) — routing content to the nearest/appropriate edge server
  • Load balancers — sticky sessions where the same user always hits the same backend
  • Distributed databases (Cassandra, DynamoDB) — partitioning data across nodes
  • Distributed file systems — deciding which node stores which chunk of a file

The Key Insight

With regular hashing, adding or removing a server reshuffles almost everything. With consistent hashing, only K/N keys move on average (where K is total keys and N is total servers). That’s the difference between a catastrophe and a minor adjustment.

In simple language, consistent hashing is like arranging servers in a circle and letting each key find its server by walking clockwise. When a server joins or leaves, only its immediate neighbors are affected. Everyone else doesn’t even notice.