🗜️ Data Compression

Codecs, Stacking Strategies & Storage Efficiency in ClickHouse

🏗️ Why Column Stores Champion Compression

Row-oriented databases store one row's columns contiguously. Column-oriented databases store one column's values contiguously. That single difference changes everything for compression.

Row-Oriented Storage

Row 1: id:101 ts:1700 val:42.5 cat:A

Row 2: id:102 ts:1701 val:43.1 cat:B

Row 3: id:103 ts:1702 val:41.8 cat:A

Row 4: id:104 ts:1703 val:42.9 cat:A

Each row mixes types, so a compressor sees integers, floats, and strings interleaved, which defeats most specialized codecs.

Column-Oriented Storage

id

101, 102, 103, 104

ts

1700, 1701, 1702, 1703

val

42.5, 43.1, 41.8, 42.9

cat

A, B, A, A

Same type = same codec. A column of timestamps uses the same compression for every value — optimal codec selection becomes possible.
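The difference between the two layouts is a transposition. The snippet below is a minimal sketch using the four sample rows from the illustration; the helper name `to_columns` is made up for this example and is not a ClickHouse API.

```python
# Sample rows from the illustration: (id, ts, val, cat)
rows = [
    (101, 1700, 42.5, "A"),
    (102, 1701, 43.1, "B"),
    (103, 1702, 41.8, "A"),
    (104, 1703, 42.9, "A"),
]


def to_columns(rows, names):
    """Transpose a list of row tuples into a dict of per-column lists."""
    return {name: list(col) for name, col in zip(names, zip(*rows))}


columns = to_columns(rows, ["id", "ts", "val", "cat"])
# Each list now holds values of a single type, so one codec fits the whole list.
```

After the transposition, `columns["ts"]` is a run of consecutive integers and `columns["cat"]` is a short alphabet of repeated letters, which is the kind of input the specialized codecs described later expect.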

📊

Column Homogeneity

All values in a column share the same data type and often similar semantic meaning. This predictability lets specialized codecs work at maximum efficiency.

🔁

Repetition Amplification

Low-cardinality columns have enormous repetition at the byte level. A column of status codes (200, 404, 500) compresses far better than the same data scattered across rows.

📈

Sequential Pattern Discovery

Timestamps increment by fixed amounts. Sensor readings change gradually. Network IDs have prefix patterns. Column storage exposes these patterns for codec optimization.

🧊

Memory Footprint Reduction

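Smaller files mean more of the working set fits in the OS page cache, which holds column files in their compressed form. ClickHouse decompresses each block as it is read and runs vectorized execution over the decompressed column chunks, so scans touch fewer bytes of storage and stay cache-friendly.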

⚡ The ClickHouse Compression Philosophy

ClickHouse optimizes for compression ratio first, decompression speed second. In OLAP workloads, data is read far more often than it is written, and queries scan millions of rows. Getting more data per read is the primary goal.

🎯 Primary Goal: Maximum Ratio

Better compression means more data fits in disk I/O buffers, more data stays in CPU cache, and less data needs to be read from storage. The savings compound across every query.

⚖️ Acceptable Tradeoff: CPU Time

Decompression CPU cost is predictable and bounded. LZ4 decompresses at multiple GB/s per core on modern CPUs, so the CPU cost is almost always worth the I/O savings.

📉 Secondary Goal: Encode Speed

Data is written once, read many times. ClickHouse tolerates slower encoding in exchange for faster decoding — writes happen during ingestion windows, reads happen constantly.

🎚️ Codec Selectability

Unlike most databases with one or two compression options, ClickHouse exposes per-column codec selection — letting you match the compressor to the data pattern.

Philosophy Aspect	ClickHouse Approach	Typical RDBMS Approach
Compression priority	On by default: LZ4 in self-managed servers, ZSTD(1) in ClickHouse Cloud	Often off by default, or a single fast codec
Codec selection	Per-column, user-specified	Global, automatic, or none
Specialized codecs	Delta, Gorilla, DoubleDelta, T64	None, generic only
Codec stacking	Several codecs chained per column	Single codec or none
Block-level optimization	Codecs run on compressed blocks of column data (64 KB to 1 MB uncompressed by default)	Table or page-level only
Memory compression	Files stay compressed in the page cache; blocks are decompressed when read	Usually decompress-on-read

🗜️ Generic Compression Algorithms

ClickHouse supports several general-purpose compression algorithms. These are the workhorses — they're the fallback when specialized codecs don't apply, and they often work very well as the final stage in a codec stack.

LZ4 Default

Compression Speed

⚡⚡⚡⚡⚡

Decompression Speed

⚡⚡⚡⚡⚡

Ratio

📦📦📦

CPU Cost

🟢🟢

Use when: You need the fastest possible decompression. Good default for frequently accessed hot data where CPU, not I/O, is the bottleneck.

ALTER TABLE events MODIFY COLUMN timestamp CODEC(LZ4);

ZSTD Recommended

Compression Speed

⚡⚡⚡

Decompression Speed

⚡⚡⚡⚡

Ratio

📦📦📦📦

CPU Cost

🟡🟡🟡

Use when: You want the best balance of ratio and speed. ZSTD(1) is the default in ClickHouse Cloud, and level 1 is what you get when you write CODEC(ZSTD) without a level. You can raise the level (1–22) for a better ratio at more CPU cost, mostly on the write side.

-- Default ZSTD level 1
ALTER TABLE events MODIFY COLUMN data CODEC(ZSTD(1));

-- Higher ratio, more CPU
ALTER TABLE events MODIFY COLUMN data CODEC(ZSTD(3));

LZMA

Compression Speed

⚡

Decompression Speed

⚡⚡

Ratio

📦📦📦📦📦

CPU Cost

🔴🔴🔴🔴

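Use when: Data is archived and rarely accessed, and maximum compression ratio matters more than decode speed. Very high CPU usage, so not recommended for hot data. ClickHouse does not accept LZMA inside CODEC(...); it is available as xz for compressing exported and imported files. For cold columns inside a table, a high ZSTD level fills the same role.

-- clickhouse-client: export an archive table as an xz-compressed file
SELECT * FROM archive INTO OUTFILE 'archive.tsv.xz' COMPRESSION 'xz';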

Zlib (gzip)

Compression Speed

⚡⚡

Decompression Speed

⚡⚡

Ratio

📦📦📦📦

CPU Cost

🟡🟡🟡

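Use when: Compatibility with other tools matters, or when you need gzip-compatible output. Slower than LZ4 or ZSTD with similar ratio to ZSTD level 1. In ClickHouse, gzip is a file and HTTP transfer format rather than a column codec, so it applies to exports and imports, not to CODEC(...).

-- clickhouse-client: export logs as a gzip-compressed file
SELECT * FROM logs INTO OUTFILE 'logs.tsv.gz' COMPRESSION 'gzip';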

Ziqng

Compression Speed

⚡

Decompression Speed

⚡⚡⚡

Ratio

📦📦📦📦

CPU Cost

🟡🟡🟡🟡

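Use when: Maximum compression on cold archive data. Better ratio than ZSTD level 1, but significantly slower. Niche use case. "Ziqng" is not a codec name that ClickHouse accepts; among the built-in codecs, a high ZSTD level has this profile (slow encode, fast decode, high ratio).

ALTER TABLE archive MODIFY COLUMN cold_data CODEC(ZSTD(12));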

📊 Generic Codec Benchmark (approximate)

Codec	Encode Speed (MB/s)	Decode Speed (MB/s)	Compression Ratio	CPU Cost
LZ4	~800	~2500	2–4×	Low
ZSTD (level 1)	~350	~900	3–6×	Medium
ZSTD (level 3)	~180	~850	4–8×	Medium-High
Zlib	~100	~300	3–5×	Medium
LZMA	~30	~200	5–10×	High
Ziqng	~20	~350	5–9×	High

* Benchmarked on Intel Skylake-class hardware. Your hardware will vary. Codec performance depends heavily on data entropy — high-randomness data compresses poorly with all algorithms.

🔬 Specialized Codecs — The ClickHouse Secret Sauce

Generic compression treats data as opaque bytes. ClickHouse's specialized codecs understand the semantics of your data — timestamps increment, floats drift slowly, IDs share prefixes. This understanding lets them achieve dramatically better ratios than generic algorithms.

Specialized codecs work by detecting and exploiting data patterns. They don't just compress — they transform data into a more compact representation based on what the data means, not just what it contains.

Most specialized codecs are lossless — the original data is perfectly reconstructed. They're designed to stack on top of each other, with each codec in the chain handling a specific pattern.

Codec Families

Delta family: Delta → DoubleDelta → T64 (monotonic sequences)

Gorilla family: Gorilla (similar floating point values)

Dictionary: dictionary encoding (low-cardinality strings)

Generic: LZ4 → ZSTD (final-stage compression)

🔓 NONE: No Compression

No-op Any data type

CODEC(NONE) stores data uncompressed. Sometimes this is the right choice: data that is already encrypted, already compressed, or randomly distributed won't shrink further, and compression adds CPU overhead for no benefit. CODEC(Default) means something different: it applies the server's default compression (LZ4 unless configured otherwise).

When to use NONE:

Random UUID columns (random by design, little compression benefit)
Already-compressed data (JPEG, video, binary blobs)
Columns of high-entropy values such as hashes or random tokens (high cardinality alone does not rule out compression)
Data in memory-only tables where I/O isn't a concern

clickhouse-sql

-- Random UUIDs don't compress well
CREATE TABLE events (
    event_id UUID CODEC(NONE),
    user_id UUID CODEC(NONE),
    payload String CODEC(NONE)  -- already compressed external data
) ENGINE = MergeTree()
ORDER BY (event_id);

📊 Delta — Monotonic Sequence Compression

Preprocessor Integers, Dates, DateTime, UInt/N

Stores the difference between consecutive values instead of raw values. For monotonically increasing sequences (timestamps, auto-increment IDs), differences are tiny — a timestamp delta might be just 1 instead of 1704067200.

How Delta encoding works:

Raw values (timestamps as integers):

1704067200, 1704067201, 1704067202, 1704067203, 1704067204

5 × 4 bytes = 20 bytes

→

After Delta encoding (first value stored raw, then deltas):

1704067200, 1, 1, 1, 1

Delta itself keeps the same width per value; the run of identical small deltas is what a following LZ4 or ZSTD stage shrinks to a few bytes

Key insight: Delta stores the first value as-is, then stores subsequent values as value[i] - value[i-1]. Small deltas compress dramatically better than large raw numbers.
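The transform can be sketched in a few lines of Python. This is an illustration of the idea only, not ClickHouse's implementation (which works on fixed-width binary values inside each compressed block); the function names are invented for the example.

```python
from itertools import accumulate


def delta_encode(values):
    """Keep the first value, then store each value minus its predecessor."""
    if not values:
        return []
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]


def delta_decode(encoded):
    """A running sum restores the original sequence."""
    return list(accumulate(encoded))


timestamps = [1704067200, 1704067201, 1704067202, 1704067203, 1704067204]
encoded = delta_encode(timestamps)   # [1704067200, 1, 1, 1, 1]
restored = delta_decode(encoded)     # identical to timestamps
```

The encoded list has as many entries as the input. The saving comes later, when a generic codec sees the same small number repeated.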

Best use cases:

Timestamps with regular intervals (1s, 1ms, 1μs)
Monotonically increasing IDs (user_id, order_id)
Sequence numbers in event logs
Sensor readings taken at fixed intervals

clickhouse-sql

-- Delta(N): N is the width in bytes of the values being differenced (1, 2, 4, or 8)
-- and should match the column type; the generic codec after it does the actual shrinking
CREATE TABLE sensor_events (
    timestamp DateTime64(3) CODEC(Delta(8), ZSTD(1)),
    sensor_id UInt32 CODEC(Delta(4), ZSTD(1)),
    temperature Float32 CODEC(Gorilla)
) ENGINE = MergeTree()
ORDER BY (sensor_id, timestamp);

Limitation: Delta only works well when deltas are small. If your timestamps have irregular gaps (seconds vs. hours), the deltas are large and varied and the gain is small. DoubleDelta depends even more on regular spacing, so for irregular timestamps compare Delta, T64, and plain ZSTD on a sample of your data.

📈📊 DoubleDelta — Predictable Rate-of-Change

Preprocessor Integers, DateTime, DateTime64

Stores the delta of deltas — the second-order difference. For sequences that change at a constant rate (timestamps incrementing by exactly 1ms, counters incrementing by exactly 1), DoubleDelta achieves extraordinary compression.

DoubleDelta on regularly-spaced timestamps:

Original:

1000000000, 1000000001, 1000000002, 1000000003

→

After Delta:

1000000000, 1, 1, 1

→

After DoubleDelta:

1000000000, 1, 0, 0 (first value, first delta, then deltas of deltas)

Almost all zeros — compresses beautifully with minimal bits!
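A minimal Python sketch of the second-order difference follows. It shows only the arithmetic; the real codec also packs each result into a variable number of bits. Function names are invented for the example. Note that both the first value and the first delta have to be kept, otherwise the sequence could not be rebuilt.

```python
from itertools import accumulate


def _diff(seq):
    """First value followed by consecutive differences."""
    if not seq:
        return []
    return [seq[0]] + [b - a for a, b in zip(seq, seq[1:])]


def double_delta_encode(values):
    deltas = _diff(values)                  # [v0, d1, d2, ...]
    return deltas[:1] + _diff(deltas[1:])   # [v0, d1, d2 - d1, ...]


def double_delta_decode(encoded):
    deltas = encoded[:1] + list(accumulate(encoded[1:]))
    return list(accumulate(deltas))


# Readings every 10 ms: after the first two entries everything is zero.
series = [1000, 1010, 1020, 1030, 1040]
encoded = double_delta_encode(series)       # [1000, 10, 0, 0, 0]
```

A single late sample, say 1031 instead of 1030, produces two non-zero entries (+1 and then -1) and the stream returns to zeros afterwards.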

Perfect for: Metrics that arrive at regular intervals — server timestamps, sensor readings with fixed sampling rates, usage counters incremented by fixed amounts.

clickhouse-sql

-- DoubleDelta takes its value width from the column type (8 bytes for DateTime64)
-- Excellent for regular-interval time series
CREATE TABLE metrics (
    timestamp DateTime64(3) CODEC(DoubleDelta),
    metric_name LowCardinality(String),
    value Float64 CODEC(Gorilla)
) ENGINE = MergeTree()
ORDER BY (metric_name, timestamp);

-- DoubleDelta shines for DateTime64(3) at millisecond precision
-- Ratio improvement over raw: 10-50× typical

🗃️ T64: Bit Cropping for Integers

Preprocessor Signed/Unsigned Int: 8, 16, 32, 64-bit

T64 crops the unused high bits of integer values. It takes blocks of 64 values, places them in a 64×64 bit matrix, transposes it, and drops the bit positions that are identical across the data (the bits that do not differ between the minimum and maximum value). What remains is a much shorter bit sequence.

T64 is typically used as a preprocessing step before a generic codec: it strips the constant bits, and LZ4 or ZSTD then compresses what is left.
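The sketch below illustrates only the cropping half of that idea: it finds the high bits shared by every value in a block, stores them once, and packs the remaining low bits back to back. It deliberately skips the bit-matrix transpose and is not the on-disk format; `pack_cropped` and `unpack_cropped` are hypothetical names, and the input is assumed to be non-negative integers.

```python
def pack_cropped(values):
    """Pack non-negative ints, keeping only the low bits that can vary."""
    lo, hi = min(values), max(values)
    width = max((lo ^ hi).bit_length(), 1)   # bit positions that may differ
    prefix = lo >> width                      # high bits shared by all values
    mask = (1 << width) - 1
    packed = 0
    for i, v in enumerate(values):
        packed |= (v & mask) << (i * width)
    nbytes = (len(values) * width + 7) // 8
    return prefix, width, len(values), packed.to_bytes(nbytes, "little")


def unpack_cropped(prefix, width, count, payload):
    packed = int.from_bytes(payload, "little")
    mask = (1 << width) - 1
    return [(prefix << width) | ((packed >> (i * width)) & mask)
            for i in range(count)]


# 64 UInt64-sized IDs occupy 512 bytes raw; only 6 bits per value vary here.
ids = [1_000_000 + i for i in range(64)]
prefix, width, count, payload = pack_cropped(ids)
```

For this block the payload is 48 bytes instead of 512, before any generic codec runs. Values that span the full 64-bit range would leave nothing to crop.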

clickhouse-sql

-- T64 as a preprocessing codec
-- Reorganizes bytes for better downstream compression
CREATE TABLE events (
    event_id UInt64 CODEC(T64, ZSTD(1)),
    user_id UInt64 CODEC(T64, LZ4),
    timestamp DateTime64(3) CODEC(DoubleDelta, ZSTD(1))
) ENGINE = MergeTree()
ORDER BY event_id;

Limitation: T64 accepts only integer-like types (integers, Enum, Date, DateTime); it cannot be applied to Float32/Float64. It pays off when values occupy a small part of their type's range and gains little when they span the full width, as hashes and random IDs do.

🦍 Gorilla — Floating Point Time Series Compression

Specialized Float32, Float64

Gorilla is a purpose-built codec for floating point data that changes gradually — the hallmark of time series from sensors, metrics, and monitoring systems. It's the codec that Facebook's Gorilla paper made famous, and ClickHouse implements it directly.

Gorilla exploits two patterns common in time series:

Leading zero sharing: If consecutive floats share the same leading bits, only the differing bits are stored
XOR compression: If value[i] XOR value[i-1] is small, store just the XOR result

Gorilla XOR compression example (Float32 bit patterns):

value[i-1]:

0x41A73333 (20.9)

XOR

value[i]:

0x41A7999A (20.95)

XOR:

0x0000AAA9 (the upper 16 bits cancel, so only the low 16 bits are stored)
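The XOR step is easy to check with Python's `struct` module. The sketch below compares two neighbouring Float32 readings (21.5 and 21.75, chosen because their bit patterns are exact) and reports how many bits actually differ. It illustrates the principle only; the real codec also encodes the leading-zero count and bit length compactly. `xor_summary` is a name invented for this example.

```python
import struct


def float32_bits(x):
    """Return the IEEE 754 single-precision bit pattern of x as an int."""
    return struct.unpack(">I", struct.pack(">f", x))[0]


def xor_summary(prev, cur):
    """XOR two Float32 bit patterns and describe the non-zero window."""
    x = float32_bits(prev) ^ float32_bits(cur)
    if x == 0:
        return {"xor": 0, "leading_zeros": 32, "trailing_zeros": 0, "meaningful_bits": 0}
    leading = 32 - x.bit_length()
    trailing = (x & -x).bit_length() - 1
    return {
        "xor": x,
        "leading_zeros": leading,
        "trailing_zeros": trailing,
        "meaningful_bits": 32 - leading - trailing,
    }


summary = xor_summary(21.5, 21.75)
# 21.5 is 0x41AC0000 and 21.75 is 0x41AE0000, so the XOR is 0x00020000:
# a single meaningful bit out of 32.
```

Identical consecutive values XOR to zero, the cheapest case to store. Unrelated values differ in sign, exponent, and mantissa bits, which leaves almost nothing to drop.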

Best for: Temperature sensors, stock prices, CPU metrics, system observability data — any floating point series where consecutive values are similar.

When Gorilla excels:

Sensor readings that change gradually (temperature, pressure)
Financial time series with small per-tick changes
CPU/memory utilization metrics (similar values across time)
Any Float32/Float64 where consecutive values often share leading bits

clickhouse-sql

CREATE TABLE temperature_sensors (
    sensor_id UInt32,
    timestamp DateTime64(3) CODEC(DoubleDelta, ZSTD(1)),
    temperature Float64 CODEC(Gorilla),       -- slowly varying → great with Gorilla
    humidity Float64 CODEC(Gorilla),         -- also slowly varying
    battery_voltage Float64 CODEC(Gorilla)  -- gradual discharge, Gorilla works too
) ENGINE = MergeTree()
ORDER BY (sensor_id, timestamp);

Limitation: Gorilla works poorly on random or rapidly changing floats — network packet sizes, GPS coordinates, UUID-encoded floats. If consecutive values differ significantly, Gorilla stores full values and gains nothing.

🔗 Codec Stacking — The ClickHouse Advantage

This is where ClickHouse's compression system gets really powerful. You can chain several codecs on one column: a specialized codec first reshapes the data around its pattern, and a generic codec then squeezes out the remaining redundancy.

Input Data

Raw column values

↓

Specialized Codec (first)

Transforms data by exploiting semantic patterns
Delta → removes large baseline values
DoubleDelta or Gorilla → take Delta's place for regular timestamps or for floats

↓

Generic Codec (last)

Final compression stage
ZSTD(1) → dictionary + entropy coding

↓

Compressed Data

Stored on disk in compressed form

raw_timestamp[i] - raw_timestamp[i-1] → small_delta → compressed_bits

Combined effect: 10-50× compression on time series data

DoubleDelta and Gorilla already compute differences internally, so putting Delta in front of them, or applying Gorilla to a non-float column, adds nothing. ClickHouse treats such combinations as suspicious and typically rejects them unless allow_suspicious_codecs is enabled.
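The effect of putting a transform in front of a generic codec can be reproduced with the standard library. In this sketch `zlib` stands in for ZSTD (an assumption made so the example needs no third-party package), the timestamps are synthetic, and the input is assumed to be non-decreasing so that every delta fits an unsigned 64-bit field.

```python
import struct
import zlib
from itertools import accumulate


def pack_u64(values):
    """Serialize ints as little-endian unsigned 64-bit values."""
    return struct.pack(f"<{len(values)}Q", *values)


def unpack_u64(data):
    return list(struct.unpack(f"<{len(data) // 8}Q", data))


def encode_stack(values):
    """Stage 1: delta transform. Stage 2: generic compression."""
    deltas = [values[0]] + [b - a for a, b in zip(values, values[1:])]
    return zlib.compress(pack_u64(deltas))


def decode_stack(blob):
    """Undo the stages in reverse order."""
    return list(accumulate(unpack_u64(zlib.decompress(blob))))


# 10,000 millisecond timestamps, one per millisecond (synthetic data)
timestamps = [1_704_067_200_000 + i for i in range(10_000)]

generic_only = zlib.compress(pack_u64(timestamps))
stacked = encode_stack(timestamps)
```

Comparing `len(stacked)` with `len(generic_only)` shows the gap: the generic codec alone has to cope with a low byte that changes on every value, while after the delta stage it sees the same 8-byte pattern repeated.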

💡 Common Codec Stack Patterns

Timestamps (DateTime64)

DoubleDelta, ZSTD(1)

Why: DoubleDelta exploits the regular interval pattern. ZSTD provides final compression. This is the go-to stack for time series timestamps.

Typical ratio: 10–50×

Timestamps + Floats

DoubleDelta, ZSTD(1) for timestamps, Gorilla, ZSTD(1) for floats

Why: Temperature, pressure, metrics — values that change slowly. Gorilla exploits XOR patterns in similar floats. DoubleDelta handles the regular timestamps.

Typical ratio: 8–30× on floats

Auto-increment IDs

Delta(8), ZSTD(1)

Why: IDs increment monotonically. Delta(8) replaces each 8-byte ID with a small difference whose upper bytes are zero, and the ZSTD final stage removes that redundancy.

Typical ratio: 10–20×

Low-Cardinality Strings

LowCardinality(String), optionally with ZSTD(1)

Why: Delta works only on numeric types, so it cannot be applied to strings. For country codes or status strings, the LowCardinality type does the dictionary encoding, and sorting by the column leaves long runs that ZSTD compresses well.

Typical ratio: 5–15×

High-Cardinality Strings (URLs, JSON)

ZSTD(3) — only option that helps for random data

Why: URLs, long strings, JSON payloads have no exploitable semantic patterns. ZSTD with higher level gives best ratio among options.

Typical ratio: 1.5–4×

Counter / Metric Values

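T64, ZSTD(1)

Why: T64 crops the high bits that never change within a block, and ZSTD compresses what remains. A good fit for counters and gauges that stay within a bounded range. For strictly increasing counters, Delta(8), ZSTD(1) is the alternative to compare against.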

Typical ratio: 10–30×

📋 Stacking Rules & Constraints

Keep stacks short: one transform plus one generic codec covers most cases

A generic codec (LZ4, ZSTD), if present, must come last; Delta on its own compresses nothing and needs one after it

Specialized codecs (Delta, DoubleDelta, Gorilla, T64) should come first, generic last

Codecs run on compressed blocks (64 KB to 1 MB of uncompressed column data by default), each spanning one or more granules, so data patterns within a block matter most

Changing the codec on an existing column uses ALTER TABLE ... MODIFY COLUMN

A codec is set per column, never per row; after an ALTER, existing parts keep their old codec until they are merged or rewritten

📏 Measuring Your Compression Ratios

ClickHouse provides several ways to inspect actual compression effectiveness. Use these queries to understand how your data is actually compressing and identify opportunities for improvement.

🔍 system.parts — Per-Part Compression Stats

Every part in ClickHouse tracks compressed and uncompressed sizes. Query this to see per-table, per-part actual ratios.

clickhouse-sql

SELECT
    database,
    table,
    partition_id,
    name AS part_name,
    active,
    bytes_on_disk AS compressed_bytes,
    primary_key_bytes_in_memory AS pk_bytes,
    rows,
    ROUND(data_uncompressed_bytes / data_compressed_bytes, 3) AS compression_ratio,
    data_uncompressed_bytes,
    data_compressed_bytes
FROM system.parts
WHERE database = 'default' 
  AND table = 'events'
  AND active = 1
ORDER BY partition_id, name;

📊 system.columns — Column-Level Info

See how much space each column takes and which codec it declares.

clickhouse-sql

SELECT
    name AS column_name,
    type AS data_type,
    compression_codec AS codec,
    formatReadableSize(data_compressed_bytes) AS compressed_size,
    formatReadableSize(data_uncompressed_bytes) AS uncompressed_size
FROM system.columns
WHERE database = 'default'
  AND table = 'events'
  AND data_compressed_bytes > 0
ORDER BY data_uncompressed_bytes DESC;

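🧪 Comparing Candidate Codecs

ClickHouse has no built-in function that benchmarks codecs for you. The practical approach is to load the same sample into several columns, each with a candidate codec, and compare their sizes in system.columns. The clickhouse-compressor command-line tool can also apply a codec to a file.

clickhouse-sql

-- Test different codecs on your data: one column per candidate
CREATE TABLE codec_test (
    ts_lz4  DateTime64(3) CODEC(LZ4),
    ts_zstd DateTime64(3) CODEC(ZSTD(1)),
    ts_dd   DateTime64(3) CODEC(DoubleDelta, ZSTD(1))
) ENGINE = MergeTree()
ORDER BY ts_lz4;

INSERT INTO codec_test
SELECT timestamp, timestamp, timestamp
FROM events
LIMIT 1000000;

SELECT
    name AS column_name,
    compression_codec AS codec,
    data_compressed_bytes AS compressed_size,
    data_uncompressed_bytes AS uncompressed_size,
    ROUND(data_uncompressed_bytes / data_compressed_bytes, 3) AS ratio
FROM system.columns
WHERE database = currentDatabase()
  AND table = 'codec_test'
ORDER BY ratio DESC;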

📈 Per-Column Ratio Analysis

Identify which columns compress well and which don't — helps focus codec optimization effort.

clickhouse-sql

-- Find columns with poor compression (ratio < 2x)
SELECT
    database,
    table,
    name AS column_name,
    type AS data_type,
    compression_codec,
    ROUND(data_uncompressed_bytes / NULLIF(data_compressed_bytes, 0), 2) AS ratio
FROM system.columns
WHERE database = 'default'
  AND table LIKE '%events%'
  AND data_compressed_bytes > 0
  AND ratio < 2.0
ORDER BY ratio ASC;

🎯 Expected Compression Ratios by Data Type

Data Type	Recommended Codec Stack	Expected Ratio	Notes
Timestamps (DateTime, DateTime64)	`DoubleDelta, ZSTD(1)`	10–50×	Best when regular intervals
Float32/Float64 (slowly varying)	`Gorilla, ZSTD(1)`	8–30×	Sensor readings, metrics
UInt64 IDs (monotonic)	`Delta(8), ZSTD(1)`	10–20×	Auto-increment, event IDs
UInt64 IDs (random)	`ZSTD(3)`	1.2–2×	Limited gains on random IDs
LowCardinality(String)	`ZSTD(3)`	5–15×	Country codes, status strings
String (high entropy)	`ZSTD(3)`	1–3×	JSON, URLs, random strings
Nullable(DateTime)	`DoubleDelta, ZSTD(1)`	8–40×	NULL bits pack well

🔑 Primary Key Columns and Compression

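Primary key columns are stored twice in every part: a sampled copy in the sparse primary index (primary.idx, one entry per granule) and the full column data in the column's own .bin file with its mark file. The codec you choose applies to the .bin file, which is where the bytes are.

MergeTree Files for a Primary Key Column in a Single Part

primary.idx

~8KB

Sparse index, one entry per granule (8192 rows), shared by all key columns

Held in memory; stored uncompressed in older releases, compressed in newer ones when compress_primary_key is enabled

timestamp.mrk2

~1KB

Mark file with offset pointers into the data file

Stored uncompressed in older releases, compressed in newer ones when compress_marks is enabled

timestamp.bin

Varies

Column data, and this IS compressed

User-specified codec applies here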

Why Primary Key Column Compression Matters More

Most queries filter on primary key columns, so their data files are read whenever a WHERE clause, range scan, or key lookup references them
Primary key columns lead the table's ORDER BY, so they often appear in query output and GROUP BY as well
The sparse index is consulted for every part a query touches, even after partition pruning, to decide which granules to read
MergeTree sorts by primary key, and this sorting exposes patterns (monotonic increase, long runs of repeated values) that good codecs exploit

Recommendations for Primary Key Columns

Timestamps in the primary key:

DoubleDelta, ZSTD(1) — almost always the right choice. Timestamps in primary keys are sorted and often monotonic.

Low-cardinality dimensions (category, status, country):

ZSTD(1) — works well for low-cardinality strings that appear in the primary key. Consider LowCardinality(String) type as well.

High-cardinality IDs in primary key (user_id, order_id):

Delta(8), ZSTD(1) if monotonically increasing. ZSTD(3) if random. Even random IDs compress somewhat with ZSTD.

⏭️ Data Skipping with Compressed Data

ClickHouse's data skipping relies on reading marks (offset pointers) and using the sparse primary key index to determine which granules (8192-row chunks) to decompress and scan. Compression doesn't break this — it makes it more important.

🔎 How Skipping Works with Compression

Primary Key Index Lookup

The sparse index (primary.idx) is checked first. With 8192 rows per granule, an index of ~1000 entries covers about 8M rows. The index tells ClickHouse which granules might contain matching rows.

Read Marks

Mark files (.mrk2 in wide parts, .mrk3 in compact parts) are read for the selected granules. Each mark takes 16 to 24 bytes and gives the offset of the compressed block in the .bin file plus the position of the granule inside that block once decompressed.

Selective Decompression

Only compressed blocks holding candidate granules are decompressed. If the WHERE clause narrows the scan to 3 out of 1000 granules, roughly 3/1000 of the compressed column data is read and decompressed (a little more when a block spans several granules).

Column Data Scan

Decompressed data is scanned for exact matches. If the compression ratio was 10×, the data is 10× larger in memory than on disk, but reading it from disk took about a tenth of the I/O.
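The granule selection step can be sketched with a sorted list and binary search. This is a simplified model with a single integer key; `build_sparse_index` and `granules_for_range` are names invented for the example, and the real index supports composite keys and more complex conditions.

```python
from bisect import bisect_left, bisect_right

GRANULE_ROWS = 8192


def build_sparse_index(sorted_keys, granule_rows=GRANULE_ROWS):
    """Keep the first key of every granule."""
    return sorted_keys[::granule_rows]


def granules_for_range(index, lo, hi):
    """Granule numbers that may contain keys in [lo, hi].

    Granule g holds keys from index[g] up to index[g + 1] inclusive
    (duplicates can straddle a boundary), so the answer is conservative.
    """
    if not index or hi < lo:
        return range(0)
    first = max(bisect_left(index, lo) - 1, 0)
    last = bisect_right(index, hi) - 1
    return range(first, last + 1)


keys = list(range(100_000))          # sorted key column, 13 granules
index = build_sparse_index(keys)     # [0, 8192, 16384, ...]
hits = list(granules_for_range(index, 20_000, 20_010))
```

Here `hits` is `[2]`: only the granule starting at key 16384 has to be read and decompressed, and the other twelve are skipped without touching their compressed bytes.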

📊 Compression Impact on Data Skipping

Factor	Low Compression (1–2×)	Medium Compression (4–8×)	High Compression (10–50×)
Disk I/O	High — reading more raw bytes	Balanced — good ratio with moderate CPU	Low — fewer bytes to read
Decompression CPU	Low — less data to decompress	Moderate — reasonable workload	Higher — more data to decompress per match
Cache Efficiency	Poor — large data doesn't fit in cache	Good — compressed data fits better	Excellent — more data fits in cache
Best for	Hot data, very frequent access	General analytics	Wide scans, archival data

📍 Mark Files and Granules

ClickHouse organizes data into granules: runs of consecutive rows (8192 by default) that are processed together. Each granule has one entry in the primary key sparse index and one mark (offset pointer) per column.

Rows: Granule 0 (8192 rows) | Granule 1 (8192 rows) | Granule 2 (8192 rows) | ... | Granule N (8192 rows)

Marks: mark[0] | mark[1] | mark[2] | ... | mark[N]

PK Index: idx[0] | idx[1] | idx[2] | ... | idx[N]

Codecs are fixed per column but run on compressed blocks, each covering one or more granules, so block sizes and compression ratios vary with the actual data patterns.

🏷️ Low Cardinality Columns — Built-In Efficiency

Low-cardinality columns (columns with few unique values relative to total rows) compress exceptionally well. ClickHouse has both codec-level optimizations and a dedicated LowCardinality data type that encodes values internally.

Option 1: LowCardinality(String) Type

The LowCardinality(String) type tells ClickHouse to use internal dictionary encoding. Strings are replaced with integer indices into a dictionary at write time. This reduces storage dramatically for low-cardinality strings.

clickhouse-sql

-- country_code has only ~200 unique values across 100M rows
CREATE TABLE events (
    event_id UInt64,
    timestamp DateTime64(3) CODEC(DoubleDelta, ZSTD(1)),
    country_code LowCardinality(String),  -- internal dictionary encoding
    status_code UInt16 CODEC(Delta(2), ZSTD(1)),
    value Float64 CODEC(Gorilla)
) ENGINE = MergeTree()
ORDER BY (country_code, timestamp);

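Raw String storage:

~3 bytes × 100M = ~300 MB for 2-letter codes (1 length byte + 2 characters); ~20 GB if the values were ~200-byte strings

LowCardinality storage:

~1–2 bytes × 100M + dictionary = ~100–200 MB + overhead

Compression ratio:

~2–3× for short codes, ~100× or more for long strings, before any codec runs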
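Dictionary encoding itself is a small algorithm. The sketch below shows the idea in plain Python; ClickHouse's LowCardinality implementation differs in detail (for example in how dictionaries are stored per part), and the function names here are invented for the example.

```python
def dictionary_encode(values):
    """Replace each distinct value with its position in a dictionary."""
    dictionary = []
    positions = {}
    indices = []
    for value in values:
        if value not in positions:
            positions[value] = len(dictionary)
            dictionary.append(value)
        indices.append(positions[value])
    return dictionary, indices


def dictionary_decode(dictionary, indices):
    return [dictionary[i] for i in indices]


countries = ["US", "DE", "US", "FR", "DE"]
dictionary, indices = dictionary_encode(countries)
# dictionary == ["US", "DE", "FR"], indices == [0, 1, 0, 2, 1]
```

Each distinct string is stored once, and every row holds only a small integer. The longer the strings and the fewer the distinct values, the larger the saving.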

Option 2: Enum / Codes

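For columns with very few fixed values (status codes, categories), use Enum8 or Enum16 instead of String. Enum8 stores 1 byte per row and accepts numeric values from -128 to 127; Enum16 stores 2 bytes and accepts values up to 32767. Either is at least as compact as LowCardinality.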

clickhouse-sql

CREATE TABLE http_requests (
    timestamp DateTime64(3) CODEC(DoubleDelta, ZSTD(1)),
    status_code Enum16('200' = 200, '301' = 301, '404' = 404, '500' = 500),  -- values above 127 need Enum16
    request_count UInt64 CODEC(Delta(8), ZSTD(1))
) ENGINE = MergeTree()
ORDER BY timestamp;

Option 3: Codec on Regular Column

If you can't change the data type, use codecs to compress low-cardinality numeric data.

clickhouse-sql

-- Integer status codes: 200, 301, 404, 500 (4 unique values)
ALTER TABLE events MODIFY COLUMN status_code CODEC(Delta(2), ZSTD(1));

-- Delta helps here when the sort order groups equal codes together, so most deltas are 0;
-- otherwise ZSTD alone may do just as well

📊 Real-World Example: IoT Sensor Time Series

Complete table design for a sensor monitoring system, with codec selection for each column.

clickhouse-sql — IoT Sensor Table with Full Codec Strategy

CREATE TABLE iot_sensors (
    -- Primary key columns: frequently used in WHERE, ORDER BY
    device_id UInt64 CODEC(Delta(8), ZSTD(1)),
    timestamp DateTime64(3) CODEC(DoubleDelta, ZSTD(1)),
    
    -- Measurement values: slowly varying floats → Gorilla
    temperature Float64 CODEC(Gorilla, ZSTD(1)),
    pressure Float64 CODEC(Gorilla, ZSTD(1)),
    humidity Float64 CODEC(Gorilla, ZSTD(1)),
    battery_voltage Float64 CODEC(Gorilla, ZSTD(1)),
    
    -- Signal quality: 0-100 integer, low cardinality
    signal_quality UInt8 CODEC(Delta(1), ZSTD(1)),
    
    -- Location: LowCardinality string
    location_id LowCardinality(String),
    
    -- Error code: low cardinality integer → Delta
    error_code UInt16 CODEC(Delta(2), ZSTD(1)),
    
    -- Raw payload: high entropy JSON, ZSTD only option
    raw_payload String CODEC(ZSTD(3))
) ENGINE = MergeTree()
ORDER BY (device_id, timestamp)
PARTITION BY toYYYYMM(timestamp)
TTL toDateTime(timestamp) + INTERVAL 90 DAY;

Why these codecs?

device_id: Leading ORDER BY column, so equal IDs sit together and only grow across a part → Delta(8) leaves mostly zero deltas
timestamp: Regular 1ms intervals → DoubleDelta gets the delta-of-delta close to 0
temperature/pressure/humidity: Slowly varying floats → Gorilla XOR gives 8–30× ratio
signal_quality: UInt8 (0–100) → Delta(1) handles tiny deltas perfectly
location_id: 1000 unique strings → LowCardinality saves ~50× over plain String
raw_payload: High-entropy JSON → ZSTD(3) is the only option that helps

Expected compression

Raw data size: ~120 bytes/row

Compressed size: ~6–12 bytes/row

Overall ratio: 10–20× typical

For 1B rows: ~120 GB → ~6–12 GB

🏗️ MergeTree Storage and Compression Interaction

ClickHouse's MergeTree storage engine is where codec selection actually happens. Understanding how MergeTree stores data helps you make better codec choices.

📁 MergeTree Part File Structure

Each MergeTree part (a batch of sorted rows) contains these files on disk:

Per-Column Files

{column}.bin

Column data. THIS is where codecs apply: compressed column values.

{column}.mrk2

Mark file with offset pointers at granule boundaries (wide parts; compact parts keep all columns in data.bin with a single data.mrk3).

skp_idx_{index_name}.idx

Data-skipping index, present only if the table defines one. One entry per index granule.

Primary Key Files

primary.idx

Combined primary key index: one entry per granule across all primary key columns. It relies on the columns' mark files rather than having one of its own.

Part Metadata

count.txt

Row count in this part.

checksums.txt

File checksums for integrity verification.

columns.txt

Column names and types. Codec information travels with each compressed block in the .bin files.

💡 Storage Insight: Codec Selection Affects Disk I/O

When ClickHouse reads a column for query processing, it:

Uses the primary key sparse index to skip granules that can't match the WHERE clause (fast, in memory)
Reads the mark file (.mrk2) to find the offsets of the remaining granules (fast, small)
Reads compressed column data for matching granules only; this is where the codec matters
Decompresses in memory, a CPU cost that is amortized over fast columnar processing

read time ≈ (compressed_bytes_read / disk_throughput) + (uncompressed_bytes / decompression_throughput)

Better compression reduces the first term (I/O) at the cost of increasing the second term (decompression CPU). The tradeoff almost always favors better compression in OLAP workloads.
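A back-of-the-envelope version of this trade-off is sketched below. All throughput figures are hypothetical placeholders rather than measurements, and the model simply adds the two terms, which is pessimistic because real reads overlap I/O with decompression.

```python
def read_seconds(uncompressed_mb, ratio, disk_mb_s, decompress_mb_s):
    """Estimated time to scan a column: disk read plus decompression."""
    compressed_mb = uncompressed_mb / ratio
    io_time = compressed_mb / disk_mb_s
    # ratio == 1 models uncompressed storage, which needs no decompression
    cpu_time = uncompressed_mb / decompress_mb_s if ratio > 1 else 0.0
    return io_time + cpu_time


# Hypothetical: 10 GB column, 500 MB/s disk, 2000 MB/s decompression
uncompressed = read_seconds(10_000, ratio=1, disk_mb_s=500, decompress_mb_s=2000)
compressed = read_seconds(10_000, ratio=5, disk_mb_s=500, decompress_mb_s=2000)
```

With these placeholder numbers the uncompressed scan takes 20 seconds and the 5× compressed scan takes 9. The balance shifts toward lighter codecs as storage gets faster or as decompression throughput drops.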

🔄 Part Merges and Compression

When MergeTree merges parts, it recompresses the data. This is when codec choice affects write performance — and why you should test codec combinations before production deployment.

✓

Merges run in background — write performance during ingestion doesn't directly depend on codec compression ratio

✓

Better codec = faster merges — less data to read and write during the merge process

Changing codecs on existing tables uses ALTER TABLE ... MODIFY COLUMN; the change is metadata-only, and existing parts are recompressed as merges rewrite them (OPTIMIZE TABLE ... FINAL forces it)

Stack codecs wisely: heavier stacks (for example DoubleDelta followed by a high ZSTD level) do more CPU work per byte during merges

📋 Codec Quick Reference

Specialized Codecs

Codec	Best For	Data Types
Delta(N)	Timestamps, monotonic integers, regular intervals	Int, DateTime
DoubleDelta	Regular-interval timestamps, constant-rate sequences	Int, DateTime
T64	Integer preprocessing before generic codec	Int (all sizes)
Gorilla	Time series floats (temperature, metrics, sensors)	Float32, Float64
NONE	Random data (UUIDs, hashes, encrypted blobs)	Any

Generic Codecs

Codec	Ratio	Speed	CPU
LZ4	2–4×	Fastest decode	Low
ZSTD(1)	3–6×	Fast	Medium
ZSTD(3)	4–8×	Medium	Medium-High
ZSTD(10)	5–10×	Slow	High

LZMA (xz) and gzip are file-level compression formats in ClickHouse, used for import and export; they are not accepted inside CODEC(...).

🔥 Common Codec Stacks

DoubleDelta, ZSTD(1): Timestamps (best default for time series)

Gorilla, ZSTD(1): Floating point time series (metrics, sensors)

Delta(8), ZSTD(1): Monotonic UInt64 IDs

T64, ZSTD(1): Counter and gauge columns within a bounded range

ZSTD(3): High-entropy strings (JSON, URLs, random data)

Delta(2), ZSTD(1): Small integers (status codes, categories)

DoubleDelta, ZSTD(1) on the timestamp plus Gorilla, ZSTD(1) on the value: Full time series table (codecs are set per column, not combined in one stack)