Design a Distributed Cache

Eviction Policies, Cache Strategies, Consistent Hashing, and Hot Key Mitigation

A distributed cache (e.g., Redis Cluster, Memcached) stores frequently accessed data in memory across multiple nodes to reduce database load and latency. The core challenges: partitioning keys across nodes with consistent hashing, choosing the right eviction policy when memory fills, handling cache invalidation without serving stale data, and mitigating thundering herd and hot key problems. At millions of QPS, a well-designed cache layer reduces p99 latency from 50ms (DB) to under 1ms.

Cache Hit/Miss Ratio Calculator

Model your cache performance. Adjust the hit ratio, request rate, and cache/DB latency to see the effective latency and cost savings.

Effective latencyβ€”
DB QPS savedβ€”
DB QPS remainingβ€”
Latency reductionβ€”
95% Hit
5% Miss

Consistent Hashing Ring

Visualize how cache keys are distributed across nodes. Add/remove nodes to see minimal key redistribution β€” the fundamental property that makes consistent hashing ideal for caching.

Eviction Policy Simulator

Compare LRU, LFU, and FIFO eviction policies. Watch what happens when the cache fills and new keys arrive.

LRU Least Recently Used

Hits: 0 / Evictions: 0

LFU Least Frequently Used

Hits: 0 / Evictions: 0

FIFO First In, First Out

Hits: 0 / Evictions: 0

Capacity Estimation

Size your distributed cache cluster.

Total data sizeβ€”
With replicationβ€”
Nodes neededβ€”
Usable per nodeβ€”

Redis Cluster Architecture

Client
β†’
App Server
β†’
Cache Proxy
↓
Shard 1
Slots 0–5460
↓
Replica
↓
Shard 2
Slots 5461–10922
↓
Replica
↓
Shard 3
Slots 10923–16383
↓
Replica
Cache Miss ↓
Primary Database

Key Design Decisions

Cache Write Strategies

Write-Through
  • Write to cache + DB synchronously
  • Always consistent
  • Higher write latency
vs
Write-Behind (Back)
  • Write to cache, async flush to DB
  • Low write latency
  • Risk of data loss on crash

Cache-Aside vs Read-Through

Cache-Aside
  • App manages cache logic
  • Flexible β€” cache what you want
  • Risk of stale data on DB update
vs
Read-Through
  • Cache loads from DB on miss
  • Simpler app code
  • Cache library must understand DB

Thundering Herd

When a popular cache key expires, hundreds of concurrent requests hit the DB simultaneously. Mitigate with cache lock (only one request fetches, others wait), early expiration (refresh before TTL), or request coalescing (single-flight pattern). Redis uses a probabilistic early recomputation approach.

Hot Key Problem

A viral post or flash sale creates extreme load on one cache shard. Solutions: local in-process cache (L1 cache in app servers), key replication across multiple shards with random suffix (e.g., key:0–key:9), or dedicated hot key detection with automatic promotion.

Cache Invalidation

The two hardest problems in CS: naming, cache invalidation, and off-by-one errors. Strategies: TTL-based (simple, eventually consistent), event-driven (DB triggers or CDC stream invalidates keys), version tagging (key includes version, new version = new key). Most systems combine TTL + event-driven.

Eviction Policies

LRU: evict least recently accessed β€” great general-purpose policy. LFU: evict least frequently used β€” better for skewed access patterns. Random: surprisingly effective, O(1), used in Redis's allkeys-random. Redis default is noeviction β€” returns errors when full. Use volatile-lru to only evict keys with TTL set.

Caching is the single most impactful optimization β€” master partitioning, eviction, and invalidation strategies.