Redis Expiration & Eviction

Two related but distinct mechanisms control how Redis frees space: expiration enforces user-set TTLs (EXPIRE key 60), and eviction kicks in when maxmemory is reached and Redis needs to make room. Expiration uses a hybrid lazy + active-sweep strategy that bounds memory waste without paying for per-key timers. Eviction picks victims via approximate LRU, approximate LFU, TTL-soonest, or random, depending on maxmemory-policy. This page details how each algorithm works, what the configuration knobs actually buy you, and why Redis chose probabilistic algorithms over exact ones.

Expiration Pipeline

Two parallel mechanisms keep memory bounded Lazy Expiration on every key access GET foo → lookup expires["foo"] if deadline < now → DEL foo, return nil otherwise serve normally cost: free for unaccessed expired keys but wastes memory until something asks (could be never) Active Expiration in serverCron, every 100ms ÷ hz sample 20 random keys from expires delete those whose deadline < now if >25% expired in batch → repeat CPU budget: 25% of cron tick max probabilistic — bounded staleness but not exact-time

Key Numbers

Active sample size
20 keys
Repeat threshold
>25% expired
Default hz
10
Cron tick
100 ms
CPU budget
25% of tick
LRU samples
5 (default)
LFU counter
8-bit log

Why Probabilistic Expiration

Per-key timers don't scale
A million keys with TTL would mean a million timer registrations. Even efficient timer wheels would burn cycles managing this. Sampling instead gives bounded effort regardless of key count.
Lazy alone wastes RAM
If users SET keys with TTL but never read them again (one-shot tokens, write-once metrics), pure lazy expiration would let them sit forever. The active sweep cleans up the orphans.
Active alone wastes CPU
Scanning the full expires dict every cron tick wastes work when nothing's expired. The probabilistic sample gives early termination if the fraction of expired keys is low.

The expires Dict

A separate hashtable per database mapping key → absolute deadline.

Each Redis database has two top-level dicts: the main keyspace dict (db->dict) mapping key → robj, and the expires dict (db->expires) mapping key → 64-bit absolute timestamp in milliseconds. EXPIRE key 60 stores now_ms + 60000 in expires; PERSIST key removes the entry; deleting the key from the main dict also removes it from expires.

The two-dict design has a quirk: a key without a TTL has no entry in expires, so iterating the expires dict is equivalent to iterating only keys with TTL. This is what makes the active expiration sample efficient — it samples from this restricted set, not from the entire keyspace. If only 10% of your keys have TTLs, active expiration only walks 10%.

Lazy Expiration

Check on every access. Free for unaccessed keys.

Every command that touches a key calls expireIfNeeded() first:

int expireIfNeeded(redisDb *db, robj *key) {
  long long when = getExpire(db, key);
  if (when < 0) return 0;          // no TTL set
  if (now_ms() < when) return 0;   // not yet expired
  // expired — delete it lazily
  if (server.lazyfree_lazy_expire)
    dbAsyncDelete(db, key);
  else
    dbSyncDelete(db, key);
  notifyKeyspaceEvent("expired", key);
  return 1;
}

A return of 1 tells the calling command to act as if the key didn't exist. The keyspace notification (__keyspace@0__:foo expired) gets published to subscribers, useful for cache-invalidation patterns where downstream systems need to know when a TTL fires.

The cost of lazy expiration is exactly: one hashtable lookup per command. The cost it doesn't pay: no work for keys nobody asks about. The downside: if you SET a key with TTL and never read it, it sits in memory forever from lazy alone — that's why active expiration exists.

Active Expiration (the Sampling Algorithm)

Probabilistic; runs in the cron callback up to its CPU budget.

serverCron fires every 1000/hz ms (default hz=10, so every 100 ms). One of its jobs is the active expiration cycle, implemented in activeExpireCycle():

while (cycle_time_used < 25% of tick) {
  for each db in dbs {
    repeat {
      sample 20 random keys from db->expires
      delete those that are expired
      stats: expired_in_batch / 20

      if (expired_in_batch > 5)         // >25% threshold
        repeat the loop on this db
      else
        move on to next db
    } until budget exhausted
  }
}

The "if more than 25% of the sample expired, do another sample" is the probabilistic cleverness: when the database has lots of expired keys, the loop spins fast and clears them quickly; when most keys are fresh, the first sample shows few expired and the loop moves on. The CPU budget cap (25% of the tick) prevents starvation of normal command processing.

Tuning hz: higher values (up to 500) make the cron run more frequently, so active expiration is more responsive but uses more CPU. The default of 10 is fine for most workloads. hz=100 may help if you have a large fraction of short-TTL keys (e.g. sub-second tokens) and need tighter eviction lag. dynamic-hz yes (default) lets Redis raise hz adaptively when many clients are connected.

The expected staleness bound: even in worst case, no more than ~1/4 of the expires dict should be stale, because the loop repeats while >25% are expired. This isn't a hard guarantee — for a few cron ticks under burst load it can drift higher — but in steady state it's tight.

Eviction When maxmemory Is Reached

Different mechanism, different policies, different goal.

Eviction is the action taken when a write would push memory over maxmemory and Redis must drop something to fit. The policy is set by maxmemory-policy:

noeviction          ← reject writes with OOM error (default)
allkeys-lru         ← approximate LRU across all keys
volatile-lru        ← approximate LRU among keys with TTL
allkeys-lfu         ← approximate LFU across all keys
volatile-lfu        ← approximate LFU among keys with TTL
allkeys-random      ← random across all keys
volatile-random     ← random among keys with TTL
volatile-ttl        ← key with soonest deadline among TTL keys

For caching workloads, allkeys-lru is the typical choice — drop the least-recently-used regardless of TTL. For workloads where some keys are designated cache (TTL set) and others designated state (no TTL), volatile-lru protects the state. noeviction is the right choice when Redis is the source of truth and dropping data is unacceptable.

Approximate LRU and LFU

Why Redis doesn't maintain a global LRU list.

A textbook LRU cache requires a doubly-linked list of all keys ordered by access time, with O(1) move-to-head on access. For a hashtable with millions of keys, this list adds 16 bytes of pointer overhead per key — significant memory cost. Redis instead stores a 24-bit "LRU clock" per object (in the robj header), updated on every access, and uses a sampling-based eviction:

when memory needs freeing {
  maxmemory_samples = 5  // configurable
  pool = empty priority queue of (key, lru_clock)

  for i in 0..maxmemory_samples {
    k = random key
    insert (k, k.lru_clock) into pool
  }
  evict the key in pool with the smallest lru_clock (oldest)
}

With samples=5, each eviction picks "old enough" but not perfectly oldest. The Redis docs include charts showing that with samples=10, the result is nearly indistinguishable from true LRU at much lower memory cost. Typical production: leave at default unless eviction quality matters more than CPU.

LFU is similar but stores a 24-bit field split into a 16-bit "last decremented" timestamp and an 8-bit logarithmic counter. Each access increments the counter probabilistically (with diminishing probability as the counter grows, so it caps around 255 even for very-hot keys). Periodically the counter decays based on the timestamp gap. The eviction sample picks the key with the lowest counter — i.e., least frequently used recently. LFU shines when scans shouldn't displace genuinely hot data; default LRU policies fail this case.

Tradeoffs Across Policies

PolicyBest forWorst case
noevictionsource-of-truth stateOOM error rejects writes
allkeys-lrugeneric cachescan poisons cache
allkeys-lfucache with scanscold-start admits trash
volatile-lrumixed state+cachestate grows past memory
volatile-ttluniform-TTL datathrashes if all TTLs equal
allkeys-randomuniform accessthrows away hot keys

FAQ

Why doesn't Redis expire keys exactly on time?

Because exact-time expiration would require either a timer per key (millions of timers, unmanageable) or a single sorted-by-deadline structure scanned every millisecond (too expensive). Redis instead uses lazy expiration (check on access, evict if expired) plus active expiration (sample 20 random keys with TTL every 100 ms; if more than 25% are expired, repeat). Expected staleness is bounded but not zero — set hz higher (default 10) to make active expiration run more often at the cost of more CPU.

What does maxmemory-policy actually evict?

It depends on the policy. allkeys-lru samples a few keys' approximate-LRU clocks and evicts the oldest. volatile-lru only considers keys with a TTL set. allkeys-lfu uses approximate-LFU (a 24-bit logarithmic counter per key, decayed over time). volatile-ttl picks the key with the soonest expiration. allkeys-random / volatile-random pick at random. The 'sample' size is maxmemory-samples (default 5); raise to 10 for better LRU approximation at the cost of slightly more CPU per eviction.

What's the difference between LRU and LFU in Redis?

LRU evicts what hasn't been accessed recently. LFU evicts what's accessed infrequently overall. The key difference: a key accessed once a minute, every minute, has poor LRU score (last access only seconds ago, so it stays) but good LFU score (regular access pattern). LFU is better for workloads with stable working sets and occasional scans (the LRU classic anti-pattern: a sequential scan churns the cache by recency, but most keys are still 'hotter' overall than the scan). Redis 4.0+ supports allkeys-lfu and volatile-lfu.

Does EXPIRE have any cost beyond storing a TTL?

Negligible. The TTL is stored in a per-database expires dict (a separate hashtable mapping key→absolute-deadline). EXPIRE adds an entry; PERSIST removes it. The lookup at command time is one hashtable lookup in expires. The cost shows up at scale: a database with 100M keys all having TTL has 100M entries in expires consuming ~80 bytes each. Active expiration sweeps spend more time finding expired keys when the expires dict is huge but most keys are not expired.

Can I expire fields within a hash?

Yes, since Redis 7.4: HEXPIRE, HEXPIREAT, HTTL, HPERSIST, etc. provide per-field TTLs within a hash. Before 7.4, the unit of expiration was the whole key. Field-level TTL is implemented by adding a side dict tracking per-field deadlines and integrating with the active-expiration scan. Useful for session-style data where multiple sessions live in one hash and each should age out independently.

How does eviction interact with replication?

Lazy expirations are propagated as DEL commands to replicas, so replicas converge — but only when the master serves a request that triggers them. Active expirations on the master also generate DEL stream entries. On the replica, keys reaching their TTL deadline don't auto-expire; they wait for the master's DEL. This is why replicas can serve a key the master would have considered expired (a small window). Set replica-lazy-expire to no to make replicas eagerly expire too, but this can introduce dataset divergence under master partition.