Design a Distributed Cache

Eviction Policies, Cache Strategies, Consistent Hashing, and Hot Key Mitigation

A distributed cache (e.g., Redis Cluster, Memcached) stores frequently accessed data in memory across multiple nodes to reduce database load and latency. The core challenges: partitioning keys across nodes with consistent hashing, choosing the right eviction policy when memory fills, handling cache invalidation without serving stale data, and mitigating thundering herd and hot key problems. At millions of QPS, a well-designed cache layer reduces p99 latency from 50ms (DB) to under 1ms.

Cache Hit/Miss Ratio Calculator

Model your cache performance. Adjust the hit ratio, request rate, and cache/DB latency to see the effective latency and cost savings.

Cache hit ratio: 95%

Requests per second: 100K

Cache latency (ms): 1

DB latency (ms): 50

Effective latency—

DB QPS saved—

DB QPS remaining—

Latency reduction—

95% Hit

5% Miss

Consistent Hashing Ring

Visualize how cache keys are distributed across nodes. Add/remove nodes to see minimal key redistribution — the fundamental property that makes consistent hashing ideal for caching.

Virtual nodes: 3

Eviction Policy Simulator

Compare LRU, LFU, and FIFO eviction policies. Watch what happens when the cache fills and new keys arrive.

Cache capacity: 5 slots

LRU Least Recently Used

Hits: 0 / Evictions: 0

LFU Least Frequently Used

Hits: 0 / Evictions: 0

FIFO First In, First Out

Hits: 0 / Evictions: 0

Capacity Estimation

Size your distributed cache cluster.

Unique cacheable keys (millions): 10

Avg value size (KB): 2

Memory per node (GB): 64

Replication factor: 2

Total data size—

With replication—

Nodes needed—

Usable per node—

Redis Cluster Architecture

Client

→

App Server

→

Cache Proxy

↓

Shard 1

Slots 0–5460

↓

Replica

↓

Shard 2

Slots 5461–10922

↓

Replica

↓

Shard 3

Slots 10923–16383

↓

Replica

Cache Miss ↓

Primary Database

Key Design Decisions

Cache Write Strategies Write-Through Write to cache + DB synchronously
Always consistent
Higher write latency
 
vs
 Write-Behind (Back) Write to cache, async flush to DB
Low write latency
Risk of data loss on crash
 

Cache-Aside vs Read-Through Cache-Aside App manages cache logic
Flexible — cache what you want
Risk of stale data on DB update
 
vs
 Read-Through Cache loads from DB on miss
Simpler app code
Cache library must understand DB
 

Thundering Herd

When a popular cache key expires, hundreds of concurrent requests hit the DB simultaneously. Mitigate with cache lock (only one request fetches, others wait), early expiration (refresh before TTL), or request coalescing (single-flight pattern). Redis uses a probabilistic early recomputation approach.

Hot Key Problem

A viral post or flash sale creates extreme load on one cache shard. Solutions: local in-process cache (L1 cache in app servers), key replication across multiple shards with random suffix (e.g., key:0–key:9), or dedicated hot key detection with automatic promotion.

Cache Invalidation

The two hardest problems in CS: naming, cache invalidation, and off-by-one errors. Strategies: TTL-based (simple, eventually consistent), event-driven (DB triggers or CDC stream invalidates keys), version tagging (key includes version, new version = new key). Most systems combine TTL + event-driven.

Eviction Policies

LRU: evict least recently accessed — great general-purpose policy. LFU: evict least frequently used — better for skewed access patterns. Random: surprisingly effective, O(1), used in Redis's allkeys-random. Redis default is noeviction — returns errors when full. Use volatile-lru to only evict keys with TTL set.

Cache Hit/Miss Ratio Calculator

Consistent Hashing Ring

Eviction Policy Simulator

LRU Least Recently Used

LFU Least Frequently Used

FIFO First In, First Out

Capacity Estimation

Redis Cluster Architecture

Key Design Decisions

Cache Write Strategies

Cache-Aside vs Read-Through

Thundering Herd

Hot Key Problem

Cache Invalidation

Eviction Policies

🔗 Related Topics