Redis Internal Encodings
Redis presents five high-level data types — strings, lists, hashes, sets, sorted sets — but
under the hood each one switches between several physical representations as it grows. A
hash with 10 entries is stored as a flat byte array; the same hash with 1000 entries is a
hashtable. A 12-byte string might be stored as an inline integer, an embstr, or a full SDS. The
OBJECT ENCODING command exposes the choice, and the configurable thresholds
let you tune each cutoff. This page walks through every encoding Redis ships, why each one
exists, and the concrete byte layouts.
Strings: int, embstr, raw
A 'string' is one of three things in Redis.
Redis strings are stored as one of three encodings:
SET counter 42                     → encoding: int
    (long stored inline in the robj.ptr field)
SET name "alice"                   → encoding: embstr
    (robj + SDS in one allocation, at most 64 bytes)
SET bio "...long markdown text..." → encoding: raw
    (robj points to separately-allocated SDS)

The robj struct is 16 bytes: 4-bit type, 4-bit encoding, 24-bit LRU, 32-bit refcount, 64-bit pointer. int stuffs the integer into the pointer field, so there is no SDS allocation at all. Lookup is just a cast. INCR/DECR work directly on this encoding without touching the heap.
embstr packs the SDS header plus payload into the same memory chunk as the robj. The header is 3 bytes (type byte + 1-byte length + 1-byte alloc) for the common short case. With a 16-byte robj, 3-byte SDS header, 1 null terminator, and the cache-line goal of 64 bytes total, the payload limit is 44 bytes. Any string ≤44 bytes lives in this single compact allocation; the entire object fits in one cache line, no second pointer dereference.
raw separates the SDS from the robj. Two allocations, one pointer indirection. Used for strings >44 bytes. APPEND on an embstr converts to raw because in-place mutation of embstr would require resizing in the same chunk — too fiddly. Once raw, it stays raw.
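The 44-byte arithmetic and the three-way choice above can be sketched in a few lines of Python. This is an illustration only: `predict_string_encoding` is a hypothetical helper, not a Redis API, and it assumes that only canonical decimal text fitting a signed 64-bit integer is stored as int. It also models a freshly SET value, not one that has been turned raw by APPEND.

```python
ROBJ_SIZE = 16        # type/encoding/LRU word + refcount + pointer
SDS_HEADER_SIZE = 3   # short SDS header: three 1-byte fields
NULL_TERMINATOR = 1
CACHE_LINE = 64

# 64 - 16 - 3 - 1 = 44 bytes left for the payload
EMBSTR_LIMIT = CACHE_LINE - ROBJ_SIZE - SDS_HEADER_SIZE - NULL_TERMINATOR


def predict_string_encoding(value: bytes) -> str:
    """Guess the encoding of a freshly SET value (hypothetical helper)."""
    # Assumption: "007", "+1" or " 42" stay strings; only canonical
    # decimal text within the signed 64-bit range becomes `int`.
    try:
        text = value.decode("ascii")
        number = int(text)
        if str(number) == text and -(2**63) <= number < 2**63:
            return "int"
    except (UnicodeDecodeError, ValueError):
        pass
    return "embstr" if len(value) <= EMBSTR_LIMIT else "raw"
```

Comparing the prediction against OBJECT ENCODING on a real server is the quickest way to find where this simplified model and your Redis version disagree.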
Listpack: The Universal Small-Collection Format
Replaced ziplist in Redis 7.0 for hashes, lists, and sorted sets; sets gained a listpack encoding in 7.2.
Listpack is a flat byte buffer: a small header, a series of self-describing entries, and an end byte. Each entry encodes both its length (forward) and its total size (backward) — that backward length is what enables O(1) reverse traversal without the cascading-update bug that plagued ziplist.
+-------+-------+--------------+--------------+--------------+-----+
| total | nelm  | entry 0      | entry 1      | entry N      | FF  |
| bytes | count | (var size)   | (var size)   | (var size)   | end |
+-------+-------+--------------+--------------+--------------+-----+
   4 B     2 B    each entry: 1B encoding + payload + 1-5B backward-length
Each entry's encoding byte tells you whether the payload is a small int (7-bit, 13-bit, 16-bit, 24-bit, 32-bit, or 64-bit) or a string of a given length. Then comes the payload itself, then a tail length encoding the entry's total size. Reading that tail length backward from the next entry's start makes each step of reverse iteration O(1).
The historical context: ziplist (the predecessor) had a "previous entry length" field at the start of each entry. Inserting a long entry could force the next entry's prev-length field to grow from 1 byte to 5 bytes, which could cascade if the entry after that also needed its prev-length grown — a worst-case O(n²) update. Listpack moves the length to the end of the entry, where its size depends only on the entry's own size, eliminating the cascade. Redis 7.0 swapped ziplist for listpack everywhere.
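To make the cascade concrete, here is a small Python model of the two length fields. It is a sketch under stated assumptions, not Redis code: the function names are made up, every existing ziplist entry is assumed to start with a 1-byte prev-length, and the listpack tail length is assumed to carry 7 bits per byte.

```python
def ziplist_prevlen_size(prev_entry_size: int) -> int:
    """Bytes ziplist needs to record the PREVIOUS entry's size."""
    return 1 if prev_entry_size < 254 else 5


def listpack_backlen_size(entry_size: int) -> int:
    """Bytes listpack needs for its OWN trailing length (1 to 5)."""
    size = 1
    while entry_size > 127 and size < 5:
        entry_size >>= 7   # assumed: 7 payload bits per length byte
        size += 1
    return size


def ziplist_cascade(entry_sizes: list[int], new_head_size: int) -> int:
    """Count entries that must grow after inserting a new head entry.

    Assumes each existing entry currently uses a 1-byte prev-length.
    """
    grown = 0
    prev_size = new_head_size
    for size in entry_sizes:
        if ziplist_prevlen_size(prev_size) == 1:
            break          # still fits in one byte: the cascade stops
        size += 4          # prev-length widens from 1 byte to 5
        grown += 1
        prev_size = size   # the grown entry may now push its neighbour
    return grown
```

With 100 entries of 253 bytes each, a 300-byte head insert makes every entry grow in turn, while entries of 100 bytes stop the cascade after the first. The listpack function never looks at a neighbour, which is the whole point.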
Quicklist: List as Linked List of Listpacks
A list is always a quicklist. The leaves are listpacks.
A list in Redis is implemented as a doubly-linked list of listpack chunks. Each chunk holds
multiple list elements; the configuration list-max-listpack-size -2 (default)
means each chunk can be up to 8 KB. Small lists (a few items, total under 8 KB) live in a
single quicklist node, behaving essentially like a contiguous array. Larger lists chain
multiple listpack nodes together with prev/next pointers between them.
quicklist
 ├─ head ──→ listpack (8 KB, ~500 small entries)
 │             ↕
 │           listpack (8 KB, ~500 small entries)
 │             ↕
 │           listpack (8 KB, ~500 small entries)
 │             ↕
 └─ tail ──→ listpack (8 KB, ~500 small entries)

LPUSH/RPUSH on a non-full head/tail chunk just append to that listpack. When the chunk hits the size limit, a new node is allocated and linked. LINDEX walks the list of nodes counting entries until it finds the node containing the target index, then scans that node's listpack.
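The node-walking behaviour of LINDEX can be sketched with a toy model. `QuicklistSketch` is a hypothetical class, and it caps nodes by entry count for simplicity, whereas the default setting described above caps them by byte size. Negative indexes are left out.

```python
from dataclasses import dataclass, field


@dataclass
class QuicklistSketch:
    """Toy quicklist: a list of nodes, each node a bounded Python list."""
    max_entries_per_node: int = 4          # assumption: count-based cap
    nodes: list = field(default_factory=list)

    def rpush(self, value) -> None:
        # Append into the tail node, or link a new node when it is full.
        if not self.nodes or len(self.nodes[-1]) >= self.max_entries_per_node:
            self.nodes.append([])
        self.nodes[-1].append(value)

    def lindex(self, index: int):
        if index < 0:
            return None                    # negative indexes not modelled
        # Skip whole nodes by their entry counts, then index into one node.
        for node in self.nodes:
            if index < len(node):
                return node[index]
            index -= len(node)
        return None                        # out of range
```

The cost of `lindex` is one counter comparison per skipped node plus a scan inside a single node, which is why bounded node size matters for random access.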
An optional optimization: list-compress-depth. Inner quicklist nodes (not head,
not tail) can be compressed with LZF on the fly, since LPOP/RPOP only touch the ends. For
cold queue workloads this can halve memory at the cost of CPU during random LINDEX.
Hash: Listpack vs Hashtable
Two thresholds, both must hold for listpack.
A hash starts as a listpack and converts to a hashtable when:
number of fields > hash-max-listpack-entries (128)
OR
length of any field name OR value > hash-max-listpack-value (64 bytes)
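The rule above is small enough to write as a predicate. `hash_encoding` is a hypothetical helper that takes the two thresholds as parameters with the defaults quoted above. It describes a hash built from scratch, so it does not model the fact that a converted hash stays a hashtable.

```python
def hash_encoding(fields: dict[bytes, bytes],
                  max_entries: int = 128,
                  max_value: int = 64) -> str:
    """Predict the encoding of a hash holding exactly these fields."""
    if len(fields) > max_entries:
        return "hashtable"                 # too many fields
    for name, value in fields.items():
        if len(name) > max_value or len(value) > max_value:
            return "hashtable"             # one oversized name or value
    return "listpack"
```

A single 65-byte value is enough to flip an otherwise tiny hash, which is the case that most often surprises people.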
In listpack form, a hash is just alternating field-value pairs in a single listpack:
[fname1, fvalue1, fname2, fvalue2, ...]. HGET is O(n) — scan from the start
until you match the field. HSET overwrites in place if the new value fits; otherwise it
reallocates the listpack to the new size and rewrites the entries that follow.
In hashtable form, it's a standard chained hashtable (colliding entries hang off the same bucket) using SipHash for the hash function (since Redis 4.0, replacing the older MurmurHash2, for HashDoS resistance). The hashtable keeps two tables during incremental rehashing: at any moment after a resize trigger, some buckets still live in the old table and the rest in the new, with each command doing a small batch of moves. This avoids the multi-millisecond stall of resizing a 1M-entry hashtable in one go.
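Incremental rehashing is easier to follow in code than in prose. The class below is a toy model, not the Redis dict: `IncrementalDict` is a made-up name, it uses Python's built-in `hash` rather than SipHash, it moves exactly one bucket per operation, and it starts a resize when the entry count reaches the table size (an assumed trigger).

```python
class IncrementalDict:
    """Toy chained hashtable that migrates one bucket per operation."""

    def __init__(self, size: int = 4):
        self.old = [[] for _ in range(size)]  # main table (drained on resize)
        self.new = None                       # target table while rehashing
        self.rehash_idx = 0                   # next old bucket to migrate
        self.count = 0

    def _tables(self):
        return [t for t in (self.old, self.new) if t is not None]

    def _step(self) -> None:
        # Move a single bucket; swap tables once the old one is drained.
        if self.new is None:
            return
        for key, value in self.old[self.rehash_idx]:
            self.new[hash(key) % len(self.new)].append((key, value))
        self.old[self.rehash_idx] = []
        self.rehash_idx += 1
        if self.rehash_idx == len(self.old):
            self.old, self.new, self.rehash_idx = self.new, None, 0

    def set(self, key, value) -> None:
        self._step()
        if self.new is None and self.count >= len(self.old):
            self.new = [[] for _ in range(len(self.old) * 2)]  # start resize
        for table in self._tables():          # overwrite if already present
            bucket = table[hash(key) % len(table)]
            for i, (k, _) in enumerate(bucket):
                if k == key:
                    bucket[i] = (key, value)
                    return
        # New keys always go to the new table while a rehash is running.
        target = self.new if self.new is not None else self.old
        target[hash(key) % len(target)].append((key, value))
        self.count += 1

    def get(self, key, default=None):
        self._step()
        for table in self._tables():          # a key may be in either table
            for k, v in table[hash(key) % len(table)]:
                if k == key:
                    return v
        return default
```

Every lookup during a rehash has to check both tables, which is the price paid for never stalling on a full-table copy.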
Set: Intset, Listpack, Hashtable
Three encodings for sets, picked by content.
Sets have three encodings:
all members are integers, ≤512 members → intset
small set, mixed types, ≤128 entries → listpack
otherwise → hashtable
Intset is a sorted array of integers, narrowed to the smallest fitting width (int16, int32, or int64). Lookups use binary search — O(log n). Memory is exactly N × element_width — about 8x denser than a hashtable. Adding a string to an intset converts to listpack; adding an integer outside the current width's range expands all elements to the wider type.
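A rough Python model of the intset shows both the binary search and the width upgrade. `IntSetSketch` is a hypothetical class that tracks only the element payload; any fixed header the real structure carries is ignored here.

```python
import bisect


class IntSetSketch:
    """Toy intset: sorted unique integers plus the narrowest fitting width."""

    WIDTHS = (2, 4, 8)   # bytes per element: int16, int32, int64

    def __init__(self):
        self.values = []  # kept sorted
        self.width = 2    # start at int16

    @classmethod
    def _width_for(cls, n: int) -> int:
        for width in cls.WIDTHS:
            bits = width * 8
            if -(1 << (bits - 1)) <= n < (1 << (bits - 1)):
                return width
        raise OverflowError("value does not fit in int64")

    def add(self, n: int) -> bool:
        # A value outside the current range widens every element.
        self.width = max(self.width, self._width_for(n))
        pos = bisect.bisect_left(self.values, n)
        if pos < len(self.values) and self.values[pos] == n:
            return False  # already present
        self.values.insert(pos, n)
        return True

    def contains(self, n: int) -> bool:
        pos = bisect.bisect_left(self.values, n)   # O(log n) binary search
        return pos < len(self.values) and self.values[pos] == n

    def payload_bytes(self) -> int:
        return len(self.values) * self.width
```

Adding one value such as 70000 to a set of small numbers doubles the payload of every existing element, so a single outlier can change the memory profile of the whole set.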
Listpack-encoded sets (added in Redis 7.2; before that, a small set with any non-integer member went straight to a hashtable) work like listpack hashes but with single values rather than KV pairs. SADD scans for membership before insert (O(n) for the small case), SISMEMBER is O(n). The hashtable encoding kicks in once size or value-length thresholds break.
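Putting the three set encodings into one decision function gives a sketch like the following. `set_encoding` is a hypothetical helper; the 512 and 128 defaults come from the rules above, while the 64-byte value limit is an assumption borrowed from the hash and sorted-set defaults.

```python
def set_encoding(members,
                 max_intset: int = 512,
                 max_listpack: int = 128,
                 max_value: int = 64) -> str:
    """Predict the encoding of a set holding exactly these members."""
    members = list(members)
    if all(isinstance(m, int) for m in members) and len(members) <= max_intset:
        return "intset"
    # Integers are measured by their decimal text length in this sketch.
    sizes = [len(str(m)) if isinstance(m, int) else len(m) for m in members]
    if len(members) <= max_listpack and all(s <= max_value for s in sizes):
        return "listpack"
    return "hashtable"
```

Like the hash predicate, this models a set built from scratch and ignores the history of conversions a live key may have gone through.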
Sorted Set: Listpack vs Skiplist+Hashtable
The only encoding that uses two parallel structures.
Small sorted sets (default thresholds: ≤128 entries and ≤64-byte values) use listpack with alternating member-score entries. Above that, a sorted set is stored as both a skiplist (for range queries by score) and a hashtable (for point lookups by member). Every operation maintains both structures.
The skiplist is a probabilistic balanced data structure: each node has a randomly chosen "level" (from 1 to 32) and contains forward pointers at each level. Higher levels skip more entries, giving O(log n) range and rank queries. ZADD inserts into both structures; ZSCORE goes to the hashtable; ZRANGEBYSCORE walks the skiplist; ZRANK and rank-based ZRANGE use span counters in the skiplist nodes.
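The random level choice is the only probabilistic part, and it fits in a few lines. This is a sketch: the 0.25 promotion probability is an assumption stated here, not a number taken from the text above, and `random_level` is an illustrative name.

```python
import random

MAX_LEVEL = 32
P = 0.25  # assumed chance of promoting a node one level higher


def random_level(rng: random.Random) -> int:
    """Pick a node level: 1 most of the time, higher levels ever more rarely."""
    level = 1
    while level < MAX_LEVEL and rng.random() < P:
        level += 1
    return level
```

Because each extra level is four times rarer than the one below it under this assumption, the upper levels stay sparse enough to act as express lanes over the bottom list.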
Why two structures? A pure skiplist would make ZSCORE (lookup by member name) O(n). A pure hashtable would make ZRANGEBYSCORE O(n log n). Maintaining both costs ~3x the memory of a hashtable but gives O(log n) on every operation. For sorted sets large enough to need range queries, the trade is right.
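The division of labour between the two structures can be shown with a toy model. Note the substitution: a sorted Python list with `bisect` stands in for the skiplist, so inserts here are O(n) rather than O(log n). `SortedSetSketch` and its method names are illustrative, not a Redis API.

```python
import bisect


class SortedSetSketch:
    """Toy sorted set: dict for member -> score, sorted list for ordering."""

    def __init__(self):
        self.scores = {}  # member -> score         (plays the hashtable)
        self.order = []   # sorted (score, member)  (plays the skiplist)

    def zadd(self, member: str, score: float) -> None:
        old = self.scores.get(member)
        if old is not None:
            # Drop the stale pair so both structures stay in sync.
            self.order.pop(bisect.bisect_left(self.order, (old, member)))
        self.scores[member] = score
        bisect.insort(self.order, (score, member))

    def zscore(self, member: str):
        return self.scores.get(member)        # point lookup: dict only

    def zrangebyscore(self, lo: float, hi: float) -> list:
        start = bisect.bisect_left(self.order, (lo, ""))
        out = []
        for score, member in self.order[start:]:
            if score > hi:
                break
            out.append(member)
        return out

    def zrank(self, member: str):
        score = self.scores.get(member)       # dict finds the score first
        if score is None:
            return None
        return bisect.bisect_left(self.order, (score, member))
```

Notice that `zrank` needs both halves: the dict turns a member into a score, and the ordered structure turns that score into a position.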
FAQ
What's the difference between embstr and raw strings?
Both are SDS (simple dynamic strings), but embstr embeds the SDS structure into the same allocation as the robj wrapper — one malloc, contiguous memory, fits in a 64-byte cache line for strings up to 44 bytes (the OBJ_ENCODING_EMBSTR_SIZE_LIMIT). Raw strings are two separate allocations: the robj points to a separately malloc'd SDS. Embstr is also immutable: any APPEND or SETRANGE that would mutate it converts to raw first. The 44-byte limit is calibrated against 64-byte cache lines minus the robj header and SDS header overhead.
When does a list switch from listpack to quicklist?
A list is always a quicklist, but its internal nodes are listpacks of bounded size. The list-max-listpack-size config (default -2, meaning 8 KB per node) controls when a single listpack node is full and a new node is added. So very small lists fit in one listpack node and look like an array; bigger lists are linked lists of listpack nodes. The previous design (until Redis 7.0) used ziplist nodes; ziplist had a quadratic update problem (cascading length-prefix rewrites) that listpack solved by having each entry record its own length at its tail instead of the previous entry's length at its head.
Why does HGETALL on a small hash return faster than expected?
Small hashes use the listpack encoding, which is just a flat byte array of alternating key-value entries. HGETALL on a listpack hash linearly scans the bytes — O(n) but extremely cache-friendly, often faster than the hashtable equivalent below ~128 entries. Above hash-max-listpack-entries (default 128) or hash-max-listpack-value (default 64 bytes), it converts to a real hashtable with O(1) lookups but more memory overhead. Tune these for your workload: small hashes everywhere → keep listpack; large hashes → keep defaults.
What is OBJECT ENCODING and when should I use it?
OBJECT ENCODING <key> returns the internal storage format, e.g. embstr, int, listpack, quicklist, skiplist, hashtable, intset. Use it to verify your data fits the encoding you expect. If a sorted set you assumed was tiny (under 128 entries, all values under 64 bytes) reports skiplist instead of listpack, you've blown the threshold and are paying full skiplist+hashtable overhead. The MEMORY USAGE <key> command pairs well with this for tuning per-data-type memory footprints.
What's an intset?
A compact array of sorted integers used by Redis Sets when every member is an integer and the set is below set-max-intset-entries (default 512). Lookups are binary search — O(log n) — and the storage is just N×8 bytes (or fewer if int16/int32 fits). Once you SADD a non-integer member, the set converts to a listpack or hashtable. Intset is dramatically more memory-efficient than a hashtable for integer-only sets: ~8 bytes per member vs ~80.
Are these encodings configurable per-database?
No, they're server-global. The relevant configs are hash-max-listpack-entries, hash-max-listpack-value, list-max-listpack-size, set-max-intset-entries, set-max-listpack-entries, zset-max-listpack-entries, zset-max-listpack-value. All take effect on subsequent operations — existing keys keep their current encoding until they're modified to a degree that triggers conversion. Lowering a threshold doesn't immediately recompact existing keys; you'd need to rewrite them.