ClickHouse vs Apache Druid

Druid and ClickHouse are both real-time analytics databases built on columnar storage, and both grew out of the idea that logs and metrics need their own engine. Beyond that they diverge sharply: Druid typically pre-aggregates at ingestion ("rollup") and stores time-bucketed segments tuned for slice-and-dice OLAP queries; ClickHouse stores raw rows in MergeTree parts and lets the SQL engine aggregate at query time. Druid is a multi-process distributed system; ClickHouse is a single binary. Druid's sweet spot is the time-series dashboard with predictable query shapes; ClickHouse's is general-purpose OLAP plus ad-hoc SQL.

Architectural shapes

Druid is several cooperating services (Broker, Coordinator, Historical, MiddleManager, Overlord, Router) plus deep storage, a metadata DB, and ZooKeeper. ClickHouse is one server process per node, plus Keeper when tables are replicated.

Side by side

	ClickHouse	Druid
Process model	1 binary	5+ services
Storage unit	Parts within partitions	Time-bucketed segments
Aggregation	Lazy, query-time SQL	Rollup at ingestion (often)
Schema	SQL types	Dimensions + metrics + time
SQL	Full	Druid SQL (subset, planner-translated)
Joins	Hash / merge / direct	Broadcast hash joins (lookups, small subqueries)
Updates	Mutations + ReplacingMergeTree	Re-ingest segments
Ingest path	HTTP/native INSERT	Indexer task (Kafka → segment)
Failure model	Per-node, replicas	Segments live in deep storage

Rollup vs raw rows

Druid's defining feature is rollup: at ingestion time, rows with the same dimension values and time bucket are pre-aggregated into a single row whose metrics are combined according to the ingestion spec (sums, counts, sketches). A 1B-event raw stream might land as 100M rolled-up rows. This is what makes Druid fast on classic dashboard queries: most of the work is already done.

ClickHouse stores raw rows by default. To get the same effect you build a materialized view that writes into an AggregatingMergeTree table using state functions such as sumState/uniqState. The result is the same (pre-aggregated tiers), but you control when and how aggregation happens, and the raw rows remain queryable.

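-- ClickHouse equivalent of Druid rollup.
-- The target table events_5m_agg must exist before the view is created.
CREATE MATERIALIZED VIEW events_5m TO events_5m_agg AS
SELECT
    toStartOfFiveMinutes(ts) AS window,   -- 5-minute time bucket
    event,
    sumState(toUInt64(1)) AS hits         -- partial count state, read back with sumMerge
FROM events
GROUP BY window, event;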
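The view above writes into events_5m_agg and stores partial aggregate states rather than final numbers. A minimal sketch of the target table and a dashboard-style read follows. The column types are assumptions (the events schema is not shown here), and the alias total_hits is chosen only to avoid reusing the column name hits.

```sql
-- Target table for the events_5m view. Types are assumed:
-- ts is taken to be a DateTime and event a String in the source table.
CREATE TABLE events_5m_agg
(
    `window` DateTime,
    event    String,
    hits     AggregateFunction(sum, UInt64)   -- matches sumState(toUInt64(1))
)
ENGINE = AggregatingMergeTree
ORDER BY (event, `window`);

-- Reading the rollup: partial states are combined with the -Merge combinator.
-- GROUP BY is still required because background merges are not guaranteed
-- to have collapsed every key to a single row yet.
SELECT
    `window`,
    event,
    sumMerge(hits) AS total_hits
FROM events_5m_agg
WHERE `window` >= now() - INTERVAL 1 DAY
GROUP BY `window`, event
ORDER BY `window`, event;
```

The raw events table is untouched by any of this, so a query that the rollup cannot answer can still fall back to it.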

Real-time ingestion path

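Druid's archetype is "Kafka → MiddleManager (indexing task) → segment → handoff to Historical → query." An indexing task builds segments for the time buckets defined by segmentGranularity and publishes them to deep storage when it finishes (governed by the supervisor's taskDuration, one hour by default); the Coordinator then assigns them to Historical nodes. Until handoff, recent data is queryable but is served by the indexing tasks on MiddleManagers, which is typically slower than serving from Historicals.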

ClickHouse INSERTs land as parts in seconds and are queryable immediately. The Kafka engine (ENGINE = Kafka) plus a materialized view gives you the same "consume topic, aggregate, store" pipeline in pure SQL with no external indexer.
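As a rough sketch of that pipeline, the three statements below wire a topic into a MergeTree table. The broker address, topic, consumer group, message format, and column list are all placeholders for illustration, not values taken from a real deployment.

```sql
-- 1. Kafka consumer table: reads the topic, stores nothing itself.
CREATE TABLE events_queue
(
    ts    DateTime,
    event String
)
ENGINE = Kafka
SETTINGS
    kafka_broker_list = 'kafka:9092',          -- placeholder broker
    kafka_topic_list  = 'events',              -- placeholder topic
    kafka_group_name  = 'clickhouse_events',   -- placeholder consumer group
    kafka_format      = 'JSONEachRow';         -- assumed message format

-- 2. Storage table for the raw rows.
CREATE TABLE events
(
    ts    DateTime,
    event String
)
ENGINE = MergeTree
PARTITION BY toYYYYMM(ts)
ORDER BY (event, ts);

-- 3. Materialized view: moves each consumed batch from the queue into storage.
CREATE MATERIALIZED VIEW events_consumer TO events AS
SELECT ts, event
FROM events_queue;
```

A second materialized view reading from events can then feed an AggregatingMergeTree rollup, so raw and pre-aggregated tiers are filled from the same stream.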

Query latency

For pre-aggregated dashboard queries (group by dimensions, time bucket, single time range), Druid's segment layout typically delivers sub-second latency, because most of the aggregation was already done at ingestion. For queries shaped differently (joins, sub-queries, ad-hoc filters on dimensions that were not kept in the rollup), Druid is uncomfortable; SQL is translated into a more limited native query API.

ClickHouse is more uniform. Latency depends on partition pruning, primary-key alignment, and codec choice, not on whether the query matches a pre-aggregated, Druid-style shape.
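To make those three levers concrete, here is a small sketch. The table name, columns, and codec choices are hypothetical and only meant to show where each lever lives in the DDL.

```sql
-- Hypothetical table illustrating the three latency levers.
CREATE TABLE requests
(
    ts         DateTime CODEC(Delta, ZSTD),   -- codec choice: delta-encode timestamps
    service    LowCardinality(String),
    region     LowCardinality(String),
    latency_ms UInt32 CODEC(T64, ZSTD)
)
ENGINE = MergeTree
PARTITION BY toYYYYMM(ts)            -- partition pruning on time-range filters
ORDER BY (service, region, ts);      -- primary key aligned with common filters

-- Filters on the key prefix (service) and on ts let the engine skip
-- whole partitions and most granules.
SELECT
    region,
    quantile(0.95)(latency_ms) AS p95_ms
FROM requests
WHERE service = 'checkout'
  AND ts >= now() - INTERVAL 1 HOUR
GROUP BY region;

-- Shows which partitions and granules survive pruning.
EXPLAIN indexes = 1
SELECT count()
FROM requests
WHERE service = 'checkout'
  AND ts >= now() - INTERVAL 1 HOUR;
```

A query that filters only on latency_ms would still run, but it would scan every granule in the selected partitions, which is where most latency variance comes from.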

Operations

Druid: 5+ JVM services per cluster, deep storage (S3/HDFS), metadata DB (Postgres/MySQL), ZooKeeper. Tuning means understanding which service does what and how segment handoff works.

ClickHouse: one binary per node, plus Keeper for replication. Tuning means knowing MergeTree settings, codec choice, and partition strategy.

Schema model

Druid splits columns into three categories: timestamp (mandatory), dimensions (group-by columns), and metrics (aggregated values). The schema is set at ingestion and rollup is irreversible: if you drop a dimension, the rolled-up data merges across that dimension and the original detail is gone.

ClickHouse uses ordinary SQL types. Any column can be filtered, grouped, or aggregated; raw rows remain intact. The flexibility cost: you set storage policy explicitly via ORDER BY, codecs, and materialized views, where Druid's segment design "decides" for you.

-- Druid spec (excerpt, JSON)
{
  "dimensionsSpec": { "dimensions": ["service", "region"] },
  "metricsSpec":    [{"type":"longSum","name":"hits","fieldName":"count"}],
  "granularitySpec": {"queryGranularity":"MINUTE","rollup":true}
}

-- ClickHouse equivalent
ENGINE = AggregatingMergeTree
ORDER BY (service, region, toStartOfMinute(ts))
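Expanded into a complete statement, the ClickHouse side might look like the sketch below. The table name and column types are assumptions, and it stores a precomputed minute column instead of putting toStartOfMinute(ts) in the sorting key. SimpleAggregateFunction is used here because a plain sum needs no intermediate state, which mirrors Druid's longSum.

```sql
-- Hypothetical rollup table mirroring the Druid spec excerpt:
-- dimensions -> sorting key columns, longSum metric -> summed column.
CREATE TABLE hits_by_minute
(
    minute  DateTime,                               -- queryGranularity: MINUTE
    service LowCardinality(String),
    region  LowCardinality(String),
    hits    SimpleAggregateFunction(sum, UInt64)    -- summed during merges
)
ENGINE = AggregatingMergeTree
ORDER BY (service, region, minute);

-- Rows with equal keys collapse only when parts merge, so reads
-- still aggregate explicitly.
SELECT
    minute,
    service,
    sum(hits) AS total_hits
FROM hits_by_minute
GROUP BY minute, service
ORDER BY minute, service;
```

The read above drops region at query time. The stored rollup still holds it, and the raw table holds everything, which is the practical difference from Druid's irreversible ingestion-time rollup.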

Tradeoffs

+ ClickHouse: simpler ops, full SQL, raw rows queryable.
+ Druid: pre-rolled-up data is very fast (typically sub-second) for matching queries.
+ Druid: deep-storage isolation makes individual-node failure cheap.
− ClickHouse: rollup is opt-in via MVs; you assemble it.
− Druid: limited joins, no full SQL, multi-service operational model.
− Druid: ingestion handoff delay means recent rows are slower.

Failure model

Druid's deep storage isolates the durability question: even if every Historical node loses its disk, the segments are still in S3/HDFS and can be re-served by any other node. The Coordinator handles re-balancing automatically. The cost: Historicals serve queries from a local segment cache populated from deep storage, so a segment that is not yet cached must be downloaded before it can be queried, and cold reads can be slow.

ClickHouse stores parts on the local disks of each replica (or on S3 via an S3-backed disk). Replication via Keeper provides redundancy within a shard; loss of all replicas of a shard is data loss. With S3-backed storage the durability story converges with Druid's, at the cost of higher read latency.
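A replicated table is declared per node with the same Keeper path, roughly as sketched below. The table name and columns are hypothetical, and the {shard} and {replica} macros are assumed to be defined in each server's configuration.

```sql
-- Run on every replica of the shard. The first argument is the Keeper path
-- shared by all replicas; the second identifies this replica.
CREATE TABLE events_replicated
(
    ts    DateTime,
    event String
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/events_replicated', '{replica}')
PARTITION BY toYYYYMM(ts)
ORDER BY (event, ts);

-- Quick health check: a read-only replica or a growing queue means
-- this node is falling behind its peers.
SELECT
    database,
    table,
    is_readonly,
    queue_size,
    absolute_delay
FROM system.replicas
WHERE table = 'events_replicated';
```

Durability here is only as good as the number of live replicas per shard, which is the contrast with Druid's deep-storage model drawn above.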

For "lose a node, lose nothing, query continues unaffected," Druid's design has a clean answer. For "low-latency reads, simpler ops, manageable durability via replicas," ClickHouse's wins.

FAQ

Which is faster for dashboards?

If your query is a "group by dimensions over time", Druid wins on pre-rolled-up data. ClickHouse with an AggregatingMergeTree materialized view can reach comparable speed, but you have to build the rollup yourself.

Can Druid do joins?

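Yes, within limits. Druid's native joins are broadcast hash joins: the right-hand side must be a lookup, inline data, or a subquery result small enough to fit in memory, which suits small dimension tables. Shuffle (sort-merge) joins between large tables are available only through the multi-stage query engine. ClickHouse joins arbitrary tables, with the caveats covered in the JOINs page.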
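On the ClickHouse side, a general join needs no special setup. The sketch below uses hypothetical events and services tables and pins the algorithm explicitly only to show where that choice is made.

```sql
-- Hypothetical tables: events(service, ...) and services(name, team).
SELECT
    s.team,
    count() AS event_count
FROM events AS e
INNER JOIN services AS s ON e.service = s.name
GROUP BY s.team
ORDER BY event_count DESC
SETTINGS join_algorithm = 'hash';   -- other values include 'full_sorting_merge' and 'direct'
```

With 'hash', the right-hand table is loaded into memory, so the smaller table belongs on the right. 'direct' applies only when the right-hand side is a key-value source such as a dictionary.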

What about updates?

Druid replaces full segments — all-or-nothing per time bucket. ClickHouse has lightweight DELETE and ReplacingMergeTree for fine-grained updates.
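A minimal sketch of both ClickHouse mechanisms follows, using a hypothetical users table. Lightweight DELETE is assumed to be available without extra settings, which holds on recent releases but may require an experimental flag on older ones.

```sql
-- ReplacingMergeTree keeps the row with the highest version (updated_at)
-- per sorting key once parts are merged.
CREATE TABLE users
(
    user_id    UInt64,
    email      String,
    updated_at DateTime
)
ENGINE = ReplacingMergeTree(updated_at)
ORDER BY user_id;

-- An "update" is just a newer row for the same key.
INSERT INTO users VALUES (1, 'old@example.com', '2024-01-01 00:00:00');
INSERT INTO users VALUES (1, 'new@example.com', '2024-02-01 00:00:00');

-- FINAL deduplicates at read time, before background merges have run.
SELECT user_id, email, updated_at
FROM users FINAL
WHERE user_id = 1;

-- Lightweight delete: rows are masked immediately and physically
-- removed during later merges.
DELETE FROM users WHERE user_id = 1;
```

Neither path rewrites a whole time bucket, which is the contrast with Druid's segment replacement.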

Should I migrate from Druid?

If your query mix has grown beyond classic OLAP cubes (joins, ad-hoc, complex SQL), ClickHouse fits better. If you're pure-cube and happy with Druid's ops cost, stay.

What about Apache Pinot?

Pinot is closer to Druid in spirit — segments, deep storage, multi-service. It has better SQL than Druid but the same "classic OLAP" archetype.