Design Consistent Hashing

Listen to this article
Click ▶ to start
0%

Imagine you are the infrastructure engineer at a hot social media platform. Your system is humming along with 4 cache servers, each holding about 25% of your data. Life is good.

Then your platform goes viral overnight. You urgently add a 5th server. You restart everything. And suddenly, your database is on fire — every single cache server is getting a tsunami of cache misses. Users experience 10× slower page loads. The whole site is crawling.

What happened? You added one server and nearly everything broke.

This is the rehashing problem — one of the nastiest scaling traps in distributed systems. Consistent hashing is the elegant algorithm invented at MIT that makes this problem disappear. It powers Amazon DynamoDB, Apache Cassandra, Discord, Akamai CDN, and Google’s Maglev load balancer.

By the end of this article, you’ll understand exactly why the naive approach breaks, how the ring-based solution works, and how virtual nodes make it production-ready. Let’s go.


Part 1 — The Problem: Why Naive Hashing Breaks

The Naive Approach

When you have n cache servers and want to route a key to a server, the obvious approach is:

serverIndex = hash(key) % N

Where N is the total number of servers. Let’s walk through a concrete example with 4 servers and 8 keys:

Keyhash(key)hash(key) % 4Assigned Server
key0183586171server 1
key1261435840server 0
key2181311462server 2
key3358634960server 0
key4340858091server 1
key5275817033server 3
key6381649782server 2
key7225303513server 3

This works beautifully when N is fixed. Each server gets about 2 keys each. Balanced, clean, simple.

graph TD
    subgraph CLUSTER["4-Server Cache Cluster (N=4)"]
        S0["🖥️ Server 0\nkey1, key3"]
        S1["🖥️ Server 1\nkey0, key4"]
        S2["🖥️ Server 2\nkey2, key6"]
        S3["🖥️ Server 3\nkey5, key7"]
    end

    CLIENT["👤 Client\nhash(key) % 4"] --> S0 & S1 & S2 & S3

The Catastrophe: What Happens When N Changes

Now server 1 goes down. N changes from 4 to 3. We apply hash(key) % 3:

Keyhash(key)hash(key) % 3NEW ServerOLD ServerMoved?
key0183586170server 0server 1✅ YES
key1261435840server 0server 0❌ no
key2181311461server 1server 2✅ YES
key3358634962server 2server 0✅ YES
key4340858091server 1server 1✅ YES
key5275817030server 0server 3✅ YES
key6381649781server 1server 2✅ YES
key7225303510server 0server 3✅ YES

7 out of 8 keys moved to a different server. Only key1 stayed put.

flowchart LR
    subgraph BEFORE["Before: N=4"]
        direction TB
        B0["Server 0: key1, key3"]
        B1["Server 1: key0, key4"]
        B2["Server 2: key2, key6"]
        B3["Server 3: key5, key7"]
    end

    subgraph AFTER["After: N=3 — server 1 removed"]
        direction TB
        A0["Server 0: key0, key1, key5, key7"]
        A1["Server 1: key2, key4, key6"]
        A2["Server 2: key3"]
        A3["❌ Server 1: gone"]
    end

    subgraph IMPACT["Impact"]
        NOTE["😱 7 of 8 keys remapped to new servers\nEvery remapped key = cache miss\nEvery cache miss = database query\nDatabase overloaded instantly!"]
    end

    BEFORE -->|"N changes from 4 to 3"| AFTER
    AFTER --> IMPACT

This is devastating in production. Every remapped key is a cache miss. Every cache miss is a database query. With millions of keys remapping simultaneously, your database gets a thundering herd of requests it was never designed to handle. The very moment you scale — the moment your system is under stress — traditional hashing makes everything worse.

This is the problem consistent hashing solves.


Part 2 — Consistent Hashing: The Core Idea

From Wikipedia:

“Consistent hashing is a special kind of hashing such that when a hash table is re-sized and consistent hashing is used, only k/n keys need to be remapped on average, where k is the number of keys, and n is the number of slots.”

In plain English: instead of remapping nearly all keys when servers change, consistent hashing remaps only the minimum necessary fraction.

With 4 servers and 100 keys:

  • Traditional hashing: ~75 keys remapped when 1 server removed
  • Consistent hashing: ~25 keys remapped (only the ones that were on that server)

The ratio k/n means: with k keys and n servers, only k/n keys move — exactly the keys that were on the removed server. Everything else stays put.


Part 3 — The Hash Ring: How It Works

Step 1: Create the Hash Space

Using SHA-1 as our hash function, the output range goes from 0 to 2^160 - 1. That’s a number with 48 digits. We represent this as a linear space:

x0 ←──────────────────────────────────────────────────────────→ xn
0                                                          2^160-1

Step 2: Bend It Into a Ring

Now the magic: we bend this linear space into a circle by connecting the two ends. x0 and xn meet. We call this the hash ring (also called the hash circle or token ring).

graph LR
    A["Position 0\n(x0)"] -->|"Clockwise"| B["Position 2^40"]
    B --> C["Position 2^80"]
    C --> D["Position 2^120"]
    D --> E["Position 2^160-1\n(xn)"]
    E -->|"Wraps back\n(ring closes here)"| A

    style A fill:#6366f1,color:#fff
    style E fill:#6366f1,color:#fff

Think of it like a clock face — but instead of 0–12, the numbers go from 0 to 2^160. After 2^160-1, you wrap back to 0.

Step 3: Place Servers on the Ring

We apply the same hash function to each server’s IP address or name. Each server lands somewhere on the ring.

graph LR
    subgraph RING["Hash Ring — 4 Servers Placed"]
        S0["🟣 Server 0\n(hashed IP)"]
        S1["🔵 Server 1\n(hashed IP)"]
        S2["🟠 Server 2\n(hashed IP)"]
        S3["🟢 Server 3\n(hashed IP)"]

        S0 -->|"clockwise →"| S1
        S1 -->|"clockwise →"| S2
        S2 -->|"clockwise →"| S3
        S3 -->|"clockwise →\n(wraps around)"| S0
    end

Step 4: Place Keys on the Ring Too

We hash each data key with the exact same hash function. The key lands at some position on the ring.

graph LR
    subgraph RING["Hash Ring — Servers + Keys"]
        direction LR
        K0["⭕ key0\n(top)"]
        S0["🟣 Server 0"]
        K1["⭕ key1"]
        S1["🔵 Server 1"]
        K2["⭕ key2\n(bottom)"]
        S2["🟠 Server 2"]
        K3["⭕ key3\n(left)"]
        S3["🟢 Server 3"]

        K0 --> S0
        S0 --> K1
        K1 --> S1
        S1 --> K2
        K2 --> S2
        S2 --> K3
        K3 --> S3
        S3 -->|wraps| K0
    end

Step 5: The Lookup Rule — Go Clockwise

To find which server a key belongs to: start at the key’s position on the ring, walk clockwise, and stop at the first server you encounter.

sequenceDiagram
    participant Client
    participant Ring as "Hash Ring"
    participant Result

    Note over Ring: Servers at positions: S0(top-right), S1(right), S2(bottom), S3(left)
    Note over Ring: Keys at positions: k0(top), k1(right-ish), k2(bottom-ish), k3(left-ish)

    Client->>Ring: Where does key0 go?
    Ring->>Ring: Hash key0 → lands at top
    Ring->>Ring: Walk clockwise from top...
    Ring->>Ring: First server hit: Server 0
    Ring-->>Result: key0 → Server 0 ✅

    Client->>Ring: Where does key1 go?
    Ring->>Ring: Hash key1 → lands between S0 and S1
    Ring->>Ring: Walk clockwise...
    Ring->>Ring: First server hit: Server 1
    Ring-->>Result: key1 → Server 1 ✅

    Client->>Ring: Where does key2 go?
    Ring->>Ring: Hash key2 → lands between S1 and S2
    Ring->>Ring: Walk clockwise...
    Ring->>Ring: First server hit: Server 2
    Ring-->>Result: key2 → Server 2 ✅

    Client->>Ring: Where does key3 go?
    Ring->>Ring: Hash key3 → lands between S2 and S3
    Ring->>Ring: Walk clockwise...
    Ring->>Ring: First server hit: Server 3
    Ring-->>Result: key3 → Server 3 ✅

Simple rule: clockwise to the nearest server.


Part 4 — Adding and Removing Servers

This is where consistent hashing proves its genius. Watch how few keys move when the cluster changes.

Adding a Server

Let’s say we add Server 4 to the ring. It lands between Server 3 and Server 0.

flowchart TD
    subgraph BEFORE["Before: Adding Server 4"]
        direction LR
        bS3["🟢 Server 3"]
        bS0["🟣 Server 0"]
        bK0["⭕ key0\n(was going to Server 0)"]
        bS3 -->|clockwise| bK0 -->|clockwise| bS0
    end

    subgraph AFTER["After: Server 4 Added Between S3 and S0"]
        direction LR
        aS3["🟢 Server 3"]
        aS4["🟡 Server 4 ← NEW"]
        aS0["🟣 Server 0"]
        aK0["⭕ key0\n(now goes to Server 4!)"]
        aS3 -->|clockwise| aK0 -->|clockwise| aS4 -->|clockwise| aS0
    end

    BEFORE -->|"Server 4 joins"| AFTER

    RESULT["✅ Only key0 moved!\nkey1, key2, key3 stay on their original servers.\nAll other keys: zero disruption."]

Only key0 is affected — because Server 4 inserted itself between Server 3 and Server 0, intercepting only the keys that were previously in that arc.

Before Server 4: key0 walked clockwise and hit Server 0 first. After Server 4: key0 walks clockwise and hits Server 4 first. Server 4 is now responsible for key0.

The other keys (key1, key2, key3) are not in this arc — they continue routing to their original servers unchanged.

Removing a Server

Now let’s remove Server 1 from the ring.

flowchart LR
    subgraph BEFORE["Before: Server 1 exists"]
        direction TB
        bS0["🟣 Server 0"]
        bS1["🔵 Server 1\n(owns key1)"]
        bS2["🟠 Server 2"]
        bK1["⭕ key1"]
        bS0 --> bK1 --> bS1 --> bS2
    end

    subgraph AFTER["After: Server 1 removed"]
        direction TB
        aS0["🟣 Server 0"]
        aS1["❌ Server 1\n(gone)"]
        aS2["🟠 Server 2\n(now owns key1)"]
        aK1["⭕ key1\n(moved here)"]
        aS0 --> aK1 --> aS2
    end

    BEFORE -->|"Server 1 crashes"| AFTER

    RESULT["✅ Only key1 moved to Server 2!\nkey0, key2, key3 completely unaffected."]

Only key1 is affected. Since Server 1 is gone, key1 now walks clockwise and lands on Server 2 instead. Every other key was never in Server 1’s arc — they’re untouched.

The maths: With n servers and k keys, consistent hashing moves only k/n keys on average when a server joins or leaves. With 4 servers and 100 keys, that’s only 25 keys out of 100. Compare that to traditional hashing which would move ~75 keys.


Part 5 — Two Problems with the Basic Approach

The basic consistent hashing algorithm was introduced by Karger et al. at MIT. It’s brilliant, but it has two real-world problems that need solving before production use.

Problem 1: Unequal Partition Sizes 📐

A partition is the arc of hash space between two adjacent servers. In an ideal world, all partitions are the same size — each server owns the same fraction of the ring.

But in practice, because server positions are determined by hashing their IP addresses, they can land anywhere. If servers cluster together on one side of the ring, the resulting partitions are wildly unequal.

graph LR
    subgraph UNEQUAL["After Server 1 is removed — Unequal Partitions"]
        direction LR
        S0["🟣 Server 0\nlarge partition"]
        S2["🟠 Server 2\nsmall partition"]
        S3["🟢 Server 3\nmedium partition"]
        WARN["⚠️ Server 0 handles far more traffic\nthan Server 2 or Server 3.\nLoad imbalance!"]
        
        S0 -->|"Large arc — was S0 to S1 to S2"| S2
        S2 -->|"Small arc"| S3
        S3 -->|"Medium arc — wraps to S0"| S0
        S0 --- WARN
    end

When Server 1 is removed, Server 2’s partition grows to include Server 1’s old arc. Server 0’s partition is now twice as large as Server 3’s. Server 2’s partition is tiny. Unbalanced load, hot servers, slow responses.

Problem 2: Non-Uniform Key Distribution 🎯

Even without removing servers, if servers happen to hash to positions that are clustered together, most keys will land on one server while others sit nearly empty.

graph LR
    subgraph CLUSTERED["Servers Clustered Together — Key Distribution Skewed"]
        S0["🟣 Server 0"]
        S1["🔵 Server 1"]
        S2["🟠 Server 2\nhugest partition — most keys land here"]
        S3["🟢 Server 3"]
        WARN["⚠️ Server 2 handles 70% of keys\nServer 1 handles only 5%\nMassively unbalanced system!"]

        S3 -->|"tiny arc"| S0
        S0 -->|"tiny arc"| S1
        S1 -->|"tiny arc"| S2
        S2 -->|"HUGE arc"| S3
        S2 --- WARN
    end

The book gives a real, relatable example:

“Imagine data for Katy Perry, Justin Bieber, and Lady Gaga all end up on the same shard. Excessive access to a specific shard could cause server overload.”

If three of the world’s most-followed celebrities happen to hash to the same partition, that one server handles an insane share of traffic while others sit idle. This is called the hotspot problem.

Both of these problems have the same solution: Virtual Nodes.


Part 6 — Virtual Nodes: The Real-World Solution

A virtual node (also called a replica or vnode) is a phantom copy of a server on the ring. Instead of each physical server appearing once, it appears many times at different positions.

How Virtual Nodes Work

Instead of mapping server0 to one position, we create multiple virtual nodes:

  • server0_0, server0_1, server0_2 → each hashes to a different position
  • server1_0, server1_1, server1_2 → each hashes to a different position
graph LR
    subgraph VNODE_RING["Hash Ring with Virtual Nodes (3 vnodes each)"]
        direction LR
        S1_0["🔵 s1_0"]
        S0_0["🟣 s0_0"]
        S1_1["🔵 s1_1"]
        S0_1["🟣 s0_1"]
        S1_2["🔵 s1_2"]
        S0_2["🟣 s0_2"]

        S1_0 -->|cw| S0_0 -->|cw| S1_1 -->|cw| S0_1 -->|cw| S1_2 -->|cw| S0_2 -->|cw| S1_0
    end

    subgraph PHYSICAL["Physical Servers"]
        P0["🟣 Server 0\n(owns all s0_* nodes)"]
        P1["🔵 Server 1\n(owns all s1_* nodes)"]
    end

    S0_0 & S0_1 & S0_2 -.->|"all route to"| P0
    S1_0 & S1_1 & S1_2 -.->|"all route to"| P1

The lookup still works the same way: go clockwise from the key, find the first virtual node, then map that virtual node to its physical server.

Why This Fixes Both Problems

graph TD
    subgraph BALANCED["✅ Virtual Nodes = Balanced Distribution"]
        direction LR
        A["Ring with 3 vnodes per server\n\nServer 0 scattered at: 12°, 132°, 252°\nServer 1 scattered at: 60°, 180°, 300°\n\nResult: alternating, evenly spaced"]
        B["Each server owns\n~3 non-contiguous arcs\n\nTotal arc per server ≈ equal\nLoad is balanced!"]
        A --> B
    end

    subgraph HOTSPOT["✅ Hotspot Problem Solved"]
        direction LR
        C["Celebrity users hash to\ndifferent positions"]
        D["With many vnodes, they land on\ndifferent physical servers\n\nKaty Perry → Server 2\nJustin Bieber → Server 0\nLady Gaga → Server 3"]
        C --> D
    end

With more virtual nodes, the load distribution gets mathematically better:

Virtual Nodes per ServerStandard Deviation of Load
1 (no virtual nodes)Very high (unpredictable)
100 virtual nodes~10% of mean
200 virtual nodes~5% of mean
500+ virtual nodes~2% of mean

This comes from research cited in the book. With 100–200 virtual nodes per server, the load distribution is so balanced that the variation is negligible in practice.

Apache Cassandra defaults to 256 virtual nodes per physical node. That’s why Cassandra can add or remove nodes with almost zero impact on cluster performance.

Heterogeneous Servers: Proportional Allocation

Virtual nodes give you a powerful bonus: you can allocate proportionally based on server capacity.

graph TD
    subgraph HETERO["Heterogeneous Cluster — Proportional Virtual Nodes"]
        direction LR
        STRONG["💪 Strong Server\n32 cores / 256GB RAM\n300 virtual nodes\nhandles 3x traffic"]
        MEDIUM["🖥️ Medium Server\n16 cores / 128GB RAM\n150 virtual nodes\nhandles 1.5x traffic"]
        WEAK["📟 Weak Server\n8 cores / 32GB RAM\n50 virtual nodes\nhandles baseline traffic"]
        RULE["✅ No special routing logic.\nMore virtual nodes on ring\n= more keys assigned\n= proportional load automatically."]
        STRONG --- MEDIUM --- WEAK --- RULE
    end

No custom routing logic. No configuration files. The ring handles it automatically. More virtual nodes = more arcs = more keys = more load. It’s proportional by construction.


Part 7 — Finding Affected Keys When Servers Change

When a server joins or leaves, which specific keys need to be moved? The answer comes from a simple anticlockwise scan rule.

When a Server is Added

To find which keys move to the new server, start at the new server’s position and scan anticlockwise until you hit another server. All keys in that anticlockwise arc belong to the new server.

sequenceDiagram
    participant Admin as "👤 Admin"
    participant Ring as "Hash Ring"
    participant S3 as "Server 3 (previous owner)"
    participant S4 as "Server 4 (new)"

    Admin->>Ring: Add Server 4 between S3 and S0
    Ring->>Ring: Server 4 lands at position P4
    Ring->>Ring: Scan anticlockwise from P4...
    Ring->>Ring: Hit Server 3 at position P3
    Ring-->>Admin: Keys in arc [P3 → P4] must move to Server 4

    Note over S3,S4: S3 transfers keys in that arc to S4
    S3->>S4: Transfer key0, key_a, key_b (in the arc)

    Note over Ring: All other keys untouched ✅

When a Server is Removed

When a server fails, start at the removed server’s position and scan anticlockwise until you hit another server. All keys in that anticlockwise arc must be redistributed to the next server clockwise.

sequenceDiagram
    participant Ring as "Hash Ring"
    participant S0 as "Server 0"
    participant S1 as "Server 1 (failing)"
    participant S2 as "Server 2 (inheritor)"

    Note over S1: Server 1 goes down!
    Ring->>Ring: Scan anticlockwise from S1's position...
    Ring->>Ring: Hit Server 0 at position P0
    Ring-->>S2: Keys in arc [P0 → P1] must move to Server 2

    Note over S1,S2: S1's data must be read from replicas
    S2->>S2: Now owns key1 (and others in the arc)
    Note over Ring: key0, key2, key3 completely unaffected ✅

This is beautiful in its simplicity: only the adjacent arc is ever affected. The rest of the ring is immune.


Part 8 — Data Replication with Consistent Hashing

In a production system, consistent hashing isn’t just for routing — it’s also used for replication. You don’t store data on just one server; you store it on N servers (where N is the replication factor).

The replication strategy is elegant:

After a key is hashed to a position, walk clockwise and store copies on the next N unique physical servers.

graph LR
    subgraph REPLICATION["Replication Factor N=3"]
        K0["⭕ key0\n(hashed position)"]
        S0["🟤 Server 0"]
        S1["🟢 Server 1 ← 1st replica"]
        S2["🔵 Server 2 ← 2nd replica"]
        S3["🟣 Server 3 ← 3rd replica"]
        S4["🟠 Server 4"]

        K0 -->|"walk clockwise"| S0 -->|"skip same-physical"| S1 -->|"replica 1"| S2 -->|"replica 2"| S3
        STORED["✅ key0 stored on Server 1, Server 2, Server 3\nIf Server 1 goes down, Server 2 and 3 still have the data!"]
        S3 --- STORED
    end

Important: With virtual nodes, you must skip virtual nodes that belong to the same physical server. If s1_0 and s1_1 are both for Server 1, you skip the second one — otherwise the same physical machine stores two “replicas,” which defeats the purpose.

For maximum resilience, replicas are placed in different data centers — so a power outage, network failure, or natural disaster doesn’t take down all copies of your data simultaneously.


Part 9 — Where Consistent Hashing Lives in Production

Consistent hashing is not an academic exercise. It’s running in systems you use every single day.

mindmap
  root(("Consistent\nHashing\nin Production"))
    Amazon DynamoDB
      Partitions data across nodes
      Uses virtual nodes internally
      Handles millions of requests/sec
      Automatic scaling with zero downtime
    Apache Cassandra
      256 vnodes per node by default
      Data replicated across nodes
      Adding a node moves only 1/N of keys
      Used by Netflix Apple Discord
    Discord
      Chat message storage
      Scaled to 5 million concurrent users
      Consistent hashing for shard routing
    Akamai CDN
      194+ edge servers worldwide
      Routes content requests to nearest cache
      Consistent hashing minimises cache misses
      Handles 15-30% of all web traffic
    Google Maglev
      Software network load balancer
      Consistent hashing across backends
      Connection persistence across restarts
      Runs on every Google server
    Redis Cluster
      16384 fixed hash slots
      Similar principle to consistent hashing
      Slots assigned to nodes
      Resharding moves only affected slots

Deep Dive: How Cassandra Uses It

When a new node joins a Cassandra cluster, here’s what happens:

sequenceDiagram
    participant Admin as "👤 Admin"
    participant NewNode as "🆕 New Cassandra Node"
    participant Ring as "Token Ring"
    participant OldNode as "📦 Old Node (neighbour)"
    participant Client as "👤 Client"

    Admin->>NewNode: Join cluster
    NewNode->>Ring: Generate 256 virtual node positions
    Ring->>Ring: Insert all 256 tokens into ring
    Ring->>OldNode: "These key ranges now belong to NewNode"
    OldNode->>NewNode: Stream only the affected data
    Note over OldNode,NewNode: Only ~1/N of data moves\nN = total nodes in cluster
    NewNode-->>Ring: "Ready to serve traffic"
    Client->>NewNode: Can now route requests here ✅

    Note over Ring: Zero downtime, zero reconfiguration\nCluster just got bigger seamlessly

This is why Cassandra is famous for “zero downtime scaling.” You literally just plug in a new node and the cluster self-heals.


Part 10 — The Standard Deviation Math (Why More VNodes = Better)

This is worth understanding intuitively, even if you’re not a maths person.

Imagine 3 darts thrown randomly at a circular dartboard. You’ll likely get a lumpy distribution — some sections much larger than others. Now throw 300 darts. The sections become much more uniform. This is exactly the virtual node principle.

xychart-beta
    title "Load Imbalance vs Number of Virtual Nodes"
    x-axis ["1 vnode", "10 vnodes", "50 vnodes", "100 vnodes", "200 vnodes", "500 vnodes"]
    y-axis "Standard Deviation of Load (%)" 0 --> 45
    line [42, 28, 16, 10, 5, 2]

The tradeoff: More virtual nodes = more memory to store the ring metadata. Each virtual node entry requires about 100 bytes. With 200 vnodes and 100 servers, that’s 200 × 100 × 100 bytes = 2MB of ring metadata per server. Very manageable.

Production systems typically use 150–300 virtual nodes per server — the sweet spot between distribution quality and memory overhead.


Part 11 — Consistent Hashing vs. Other Approaches

How does consistent hashing compare to other ways of routing keys to servers?

graph TD
    subgraph NAIVE["❌ Naive Modular Hashing\nhash(key) % N"]
        N1["Pro: Dead simple to implement"]
        N2["Con: N changes → most keys remap"]
        N3["Con: Requires downtime to rescale"]
        N4["Con: Cache miss storm on any resize"]
    end

    subgraph RANGE["🟡 Range-Based Partitioning\nA-M → Server 1, N-Z → Server 2"]
        R1["Pro: Easy to reason about"]
        R2["Con: Hotspots if keys aren't uniform"]
        R3["Con: Manual rebalancing required"]
        R4["Used by: HBase, MongoDB (early)"]
    end

    subgraph CONSISTENT["✅ Consistent Hashing"]
        C1["Pro: Only k/n keys move on resize"]
        C2["Pro: Automatic load balancing via vnodes"]
        C3["Pro: Proportional capacity via vnode count"]
        C4["Pro: Zero-downtime scaling"]
        C5["Used by: Dynamo, Cassandra, Discord, Akamai"]
    end

    NAIVE --> CONSISTENT
    RANGE --> CONSISTENT

Part 12 — Benefits: The Full Picture

Let’s bring it all together. Why is consistent hashing so important?

graph LR
    CH["⭕ Consistent\nHashing"]

    CH --> B1["🔄 Minimal Key Redistribution\n\nOnly k/n keys move when\na server joins or leaves.\nWith 100 servers, adding 1\nmoves only 1% of keys."]

    CH --> B2["⚖️ Horizontal Scaling\n\nAdd servers at any time\nwithout downtime or\nmanual data migration.\nThe ring self-balances."]

    CH --> B3["🌡️ Hotspot Mitigation\n\nVirtual nodes spread\npopular keys across\nmany physical servers.\nNo more celebrity data\noverloading one shard."]

    CH --> B4["📊 Heterogeneous Clusters\n\nPowerful servers get more\nvirtual nodes and serve\nmore traffic naturally.\nNo custom routing code."]

    CH --> B5["🔁 Automatic Replication\n\nN-replica replication\nfollows the same ring.\nFailure recovery is\nautomatically bounded."]

    style CH fill:#6366f1,color:#fff
    style B1 fill:#dbeafe
    style B2 fill:#dcfce7
    style B3 fill:#fef9c3
    style B4 fill:#fce7f3
    style B5 fill:#ede9fe

The Algorithm in 6 Steps

Here’s the complete consistent hashing algorithm, distilled:

flowchart TD
    STEP1["① Choose a hash function\n(SHA-1, MD5, MurmurHash)\nOutput range = 0 to 2^160-1"]
    STEP2["② Bend the range into a ring\nConnect x0 and x_max"]
    STEP3["③ Place each server on the ring\nhash(server_ip) → position\nRepeat M times per server (vnodes)"]
    STEP4["④ Place each key on the ring\nhash(key) → position\n(no modular operation!)"]
    STEP5["⑤ Lookup: go clockwise from key\nFind first virtual node → physical server\nThat server owns the key"]
    STEP6["⑥ On change: only adjacent arc affected\nAdding server: anticlockwise scan for keys to claim\nRemoving server: clockwise neighbour inherits keys"]

    STEP1 --> STEP2 --> STEP3 --> STEP4 --> STEP5 --> STEP6

Interview Quick Reference

mindmap
  root(("Consistent\nHashing\nInterview"))
    The Problem
      hash key % N breaks on resize
      N changes → most keys remap
      Cache miss storm
      Thundering herd on DB
    The Solution
      Hash ring 0 to 2^160
      Servers placed on ring by IP hash
      Keys placed on ring by key hash
      Go clockwise → first server wins
    Key Properties
      Only k/n keys move on resize
      Adding server → only adjacent arc affected
      Removing server → only adjacent arc affected
    Virtual Nodes
      Each server has M positions on ring
      Solves unequal partition sizes
      Solves non-uniform key distribution
      More vnodes = better balance
      Cassandra uses 256 by default
    Replication
      Walk clockwise N servers for N replicas
      Skip vnodes of same physical server
      Place replicas in different data centres
    Real World
      DynamoDB partitioning
      Cassandra token ring
      Discord chat sharding
      Akamai CDN routing
      Google Maglev load balancer
    Tradeoffs
      More vnodes = more memory for ring metadata
      Sweet spot 150-300 vnodes per server
      Virtual nodes add implementation complexity

Summary: What We Learned

  1. Traditional hashing (hash(key) % N) is simple but catastrophic when N changes — it remaps ~75% of keys, causing a cache miss storm that overwhelms your database

  2. Consistent hashing bends the hash space into a ring. Servers and keys are both placed on this ring using the same hash function. A key routes to the first server clockwise from its position

  3. On server add/remove, only the adjacent arc of keys is affected. On average, only k/n keys move — the mathematical minimum

  4. Virtual nodes (multiple positions per server) solve the two core problems of the basic algorithm: unequal partition sizes and non-uniform key distribution. More vnodes = more even load distribution

  5. Replication uses the same ring: store copies on the next N unique physical servers clockwise from the key’s position

  6. The whole ecosystem runs on this: Amazon DynamoDB, Apache Cassandra, Discord, Akamai, Google Maglev. It’s one of the most impactful algorithms in modern distributed systems


What’s Next?

In Chapter 6, we’ll design a Key-Value Store from scratch — diving into CAP theorem (Consistency, Availability, Partition Tolerance), data replication strategies, consistency models, conflict resolution, and how DynamoDB and Cassandra make these tradeoffs. Consistent hashing will be one of the building blocks!

If this article helped you understand distributed systems better, share it with a fellow engineer. The more people understand these foundations, the better the systems we all build.