Memory Management

How an In-Process Database Shares RAM Without Hogging It

DuckDB runs inside your process. It shares the same address space as your Python script, your web server, or your data pipeline. Unlike a standalone database server that owns all available RAM, DuckDB needs to be a good memory citizen — using what it needs, releasing what it does not, and gracefully handling situations where the data outgrows the available memory.

The key mechanism is the buffer manager: a pool of fixed-size memory blocks that tracks every allocation, enforces configurable limits, and spills to temporary disk files when pressure gets too high. Combined with DuckDB's streaming execution model — where most operators only need enough memory for a single vector of 2048 rows — this architecture lets DuckDB process datasets far larger than RAM on a single machine.

The Embedded Memory Challenge

A server database like PostgreSQL owns its memory. DuckDB borrows it.


Shared Address Space

DuckDB runs as a library inside your application. If it allocates 12GB on a 16GB machine, your application gets the remaining 4GB minus OS overhead. There is no isolation boundary — both compete for the same physical pages.

Unpredictable Workloads

A user might run a simple SELECT that touches 100 rows, then follow it with a 10-way join over 500 million rows. The memory system must handle both extremes without manual tuning between queries.


No OOM Crashes

If DuckDB is embedded in a production web server, an out-of-memory crash takes down the entire service. The memory manager must degrade gracefully — slowing down rather than terminating.

Buffer Manager Architecture

All DuckDB memory flows through the buffer manager. It allocates blocks, tracks usage, and decides what stays in RAM and what gets evicted.

Buffer Pool Usage (example): 1.6 GB used of a 4 GB limit. Normal operation: blocks allocated from the pool.
Block Size: 256 KB
The fixed-size allocation unit. Large enough to hold useful data, small enough for fine-grained eviction.
Eviction Policy: LRU-based
Least recently used unpinned blocks are evicted first. Pinned blocks (actively being read or written by an operator) are never evicted.
Temp Directory: Auto-managed
Evicted blocks are written to temporary files. DuckDB creates and cleans up these files automatically.

Memory Pool Visualization

Watch how blocks are allocated, pinned, and evicted as a query executes. When the pool fills up, unpinned blocks spill to disk so the query can continue.


Block Lifecycle

Every block in the buffer pool transitions through a defined set of states.

Allocated: Block carved from the pool and assigned to an operator.
Pinned: Actively in use; cannot be evicted. An operator is reading or writing this block.
Unpinned: The operator released its pin. The block is still in memory but eligible for eviction.
Evicted: Written to a temp file on disk and the memory freed for reuse. The block can be reloaded if needed.
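The lifecycle above can be sketched as a toy pool with pin counts and LRU eviction. This is an illustrative model, not DuckDB's actual implementation; the block IDs and contents are made up:

```python
from collections import OrderedDict

class BufferPool:
    """Toy buffer pool: fixed block budget, pin counts, and LRU eviction
    to a dict standing in for temp files on disk."""

    def __init__(self, max_blocks):
        self.max_blocks = max_blocks
        self.blocks = OrderedDict()   # block_id -> data, in LRU order (oldest first)
        self.pins = {}                # block_id -> pin count
        self.temp_file = {}           # evicted block_id -> data

    def allocate(self, block_id, data):
        self._make_room()
        self.blocks[block_id] = data
        self.pins[block_id] = 1       # newly allocated blocks start pinned

    def unpin(self, block_id):
        self.pins[block_id] -= 1      # now eligible for eviction

    def pin(self, block_id):
        if block_id not in self.blocks:        # reload an evicted block
            self._make_room()
            self.blocks[block_id] = self.temp_file.pop(block_id)
        self.blocks.move_to_end(block_id)      # mark as most recently used
        self.pins[block_id] = self.pins.get(block_id, 0) + 1
        return self.blocks[block_id]

    def _make_room(self):
        while len(self.blocks) >= self.max_blocks:
            # evict the least recently used *unpinned* block
            victim = next((b for b in self.blocks if self.pins.get(b, 0) == 0), None)
            if victim is None:
                raise MemoryError("all blocks pinned")
            self.temp_file[victim] = self.blocks.pop(victim)

pool = BufferPool(max_blocks=2)
pool.allocate("A", "hash table part 1"); pool.unpin("A")
pool.allocate("B", "sort run 1");        pool.unpin("B")
pool.allocate("C", "scan buffer")        # pool full: A (LRU, unpinned) spills
assert "A" in pool.temp_file
assert pool.pin("A") == "hash table part 1"  # reloading A evicts B in turn
```

Note how a pinned block can never be the eviction victim: if every block is pinned, the toy pool raises rather than corrupting data an operator is still reading.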

Spill-to-Disk Mechanism

When a query needs more memory than the configured limit, DuckDB does not fail. It writes intermediate results to temporary files and continues.

Large Sort

Sorting 100 million rows under a 2 GB memory limit. The sort operator fills its buffer, writes sorted runs to disk, then merges them in a final pass. This is external merge sort, the same algorithm databases have used for decades, integrated into DuckDB's pipeline-based execution.

Memory used: 2 GB (capped)
Temp disk written: ~8 GB (sorted runs)
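The sorted-runs strategy fits in a few lines. This is a generic external merge sort sketch under an assumed row budget, not DuckDB's code:

```python
import heapq, os, pickle, tempfile

def external_sort(rows, mem_limit_rows):
    """External merge sort: fill a bounded buffer, spill sorted runs to
    temp files, then stream a k-way merge over the runs."""
    run_files, buf = [], []

    def flush_run():
        buf.sort()
        f = tempfile.NamedTemporaryFile(delete=False)
        pickle.dump(buf, f); f.close()
        run_files.append(f.name)
        buf.clear()

    for row in rows:
        buf.append(row)
        if len(buf) >= mem_limit_rows:   # memory budget reached: spill a run
            flush_run()
    if buf:
        flush_run()

    # final pass: merge the sorted runs, then clean up the temp files
    runs = []
    for name in run_files:
        with open(name, "rb") as f:
            runs.append(pickle.load(f))
        os.unlink(name)
    return list(heapq.merge(*runs))

data = [5, 1, 4, 2, 8, 3, 7, 6]
assert external_sort(data, mem_limit_rows=3) == sorted(data)
```

With a budget of 3 rows, the 8-row input produces three sorted runs on disk before the merge pass, mirroring the capped-memory, extra-disk trade-off in the numbers above.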
Hash Join Build

Building a hash table on a 50-million-row table. When the hash table exceeds the memory limit, DuckDB partitions both sides of the join, spills partitions to disk, and processes them one at a time. Each partition fits in memory, so the join completes without the full hash table ever being resident.

Partitions spilled: ~16 (adaptive)
Overhead: 2-5x slower than fully in-memory
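The partition-and-retry idea is the classic grace hash join. A minimal sketch, with lists standing in for spilled partition files and an arbitrary partition count:

```python
def partitioned_hash_join(build_rows, probe_rows, max_in_memory, n_partitions=4):
    """Grace-hash-join sketch: if the build side fits the budget, do an
    ordinary in-memory hash join; otherwise hash-partition both sides and
    join one partition at a time. Illustrative, not DuckDB's implementation."""
    if len(build_rows) <= max_in_memory:          # fits: plain hash join
        table = {}
        for key, val in build_rows:
            table.setdefault(key, []).append(val)
        return [(k, bv, pv) for k, pv in probe_rows for bv in table.get(k, [])]

    # spill path: partition by hash so each piece (hopefully) fits in memory
    results = []
    for p in range(n_partitions):
        b = [(k, v) for k, v in build_rows if hash(k) % n_partitions == p]
        pr = [(k, v) for k, v in probe_rows if hash(k) % n_partitions == p]
        results.extend(partitioned_hash_join(b, pr, max_in_memory, n_partitions))
    return results

build = [(i % 5, f"b{i}") for i in range(20)]     # 20 rows, 5 distinct keys
probe = [(k, f"p{k}") for k in range(5)]
spilled  = partitioned_hash_join(build, probe, max_in_memory=8)
resident = partitioned_hash_join(build, probe, max_in_memory=100)
assert sorted(spilled) == sorted(resident)        # same results either way
```

Matching keys always hash to the same partition, so no join pairs are lost by splitting; that invariant is what makes the spill transparent to the query result.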
Aggregation

GROUP BY over a high-cardinality column (millions of groups). The hash aggregate operator tracks its own memory usage. As it approaches the limit, it partitions the groups, spills some partitions to disk, and keeps processing the rest in memory. Spilled partitions are loaded back one at a time in a second pass.

Strategy: partition + two-pass
Transparent: yes; same SQL, same results
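A two-pass spilling aggregate can be sketched for COUNT(*); the partition count, the budget, and the "keep partition 0 in memory" choice are all simplifications of the adaptive strategy described above, not DuckDB's code:

```python
from collections import defaultdict

def spilling_group_by_count(rows, max_groups_in_memory, n_partitions=4):
    """Two-pass GROUP BY COUNT(*): when the group table would exceed its
    budget, hash-partition the keys, spill all partitions except one, and
    aggregate the spilled partitions one at a time in a second pass."""
    counts = defaultdict(int)
    spilled = defaultdict(list)          # stand-in for per-partition temp files
    over_budget = False
    for key in rows:
        if not over_budget:
            counts[key] += 1
            if len(counts) > max_groups_in_memory:
                over_budget = True       # budget hit: keep only partition 0
                for k in [k for k in counts if hash(k) % n_partitions != 0]:
                    spilled[hash(k) % n_partitions].append((k, counts.pop(k)))
        elif hash(key) % n_partitions == 0:
            counts[key] += 1             # in-memory partition: aggregate now
        else:
            spilled[hash(key) % n_partitions].append((key, 1))

    result = dict(counts)
    for partition in spilled.values():   # second pass: one partition at a time
        partial = defaultdict(int)
        for key, c in partition:
            partial[key] += c
        result.update(partial)
    return result

rows = [i % 10 for i in range(1000)]     # 1000 rows, 10 groups
assert spilling_group_by_count(rows, max_groups_in_memory=3) == {k: 100 for k in range(10)}
```

Only one spilled partition's groups are ever resident during the second pass, which is why peak memory stays near the budget regardless of total group cardinality.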

Streaming Execution and Memory

Most operators in DuckDB never accumulate data. They process one vector at a time, keeping memory usage constant regardless of table size.

Table Scan (2048 rows) → Filter (2048 rows) → Projection (2048 rows) → Result (streamed out)
This entire pipeline uses constant memory: one vector (2048 rows x selected columns) per thread. A scan-filter-project over 10 billion rows uses the same memory as over 10 thousand rows.
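The constant-memory property is easy to see with generators playing the role of operators. A sketch of the scan-filter-project pipeline, with made-up filter and projection logic:

```python
VECTOR_SIZE = 2048

def table_scan(n_rows):
    """Yield the table one vector (batch of rows) at a time,
    never materializing the whole table."""
    for start in range(0, n_rows, VECTOR_SIZE):
        yield list(range(start, min(start + VECTOR_SIZE, n_rows)))

def filter_even(vectors):
    """Filter operator: pass through only even values, vector by vector."""
    for vec in vectors:
        yield [x for x in vec if x % 2 == 0]

def project_double(vectors):
    """Projection operator: compute x * 2 for each surviving row."""
    for vec in vectors:
        yield [x * 2 for x in vec]

# At any moment the pipeline holds at most one vector per operator, so
# memory use is the same whether the table has 10 thousand or 10 billion rows.
pipeline = project_double(filter_even(table_scan(10_000)))
total = sum(len(vec) for vec in pipeline)
assert total == 5_000   # half of the 10,000 rows pass the even filter
```

Swapping `table_scan(10_000)` for a much larger row count changes runtime, not peak memory: each generator pulls one batch, transforms it, and lets it go.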

Pipeline Breakers: Where Memory Accumulates

ORDER BY (Sort): Must see all rows before producing output. Buffers the entire input and spills sorted runs if needed.
Hash Join (Build Side): Builds a hash table on one input before probing with the other. The table must fit in memory or be partitioned to disk.
GROUP BY (Hash Aggregate): Accumulates groups. Low-cardinality groupings fit in memory easily; high-cardinality ones may require partition spillover.
DISTINCT: Tracks all unique values seen so far. Functionally equivalent to GROUP BY with no aggregate function.
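The DISTINCT/GROUP BY equivalence is worth making concrete. A tiny sketch: a hash aggregate with no aggregate functions attached, whose hash-table keys are themselves the result:

```python
def hash_aggregate_no_functions(rows):
    """A hash aggregate that computes nothing per group: the group keys
    ARE the output, which is exactly SELECT DISTINCT."""
    groups = {}
    for row in rows:
        groups.setdefault(row, None)   # track the group, aggregate nothing
    return list(groups)                # dicts preserve insertion order

rows = [3, 1, 3, 2, 1, 1]
assert hash_aggregate_no_functions(rows) == [3, 1, 2]
```

This is also why DISTINCT inherits GROUP BY's memory behavior: the same hash table grows with cardinality, and the same partition-spill machinery applies.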

Allocation Strategy

DuckDB uses a purpose-built allocator that balances performance, predictability, and fragmentation avoidance.

Block pool
DuckDB: Pre-allocated fixed-size blocks (256 KB), which avoids external fragmentation.
Typical server DB: Shared buffer pool with 8 KB pages (PostgreSQL) or larger pages (16 KB in MySQL InnoDB).

Large allocations
DuckDB: Oversized requests fall through to the system allocator but are still tracked against the memory limit.
Typical server DB: Typically served from the shared buffer pool or from dedicated memory arenas.

Thread safety
DuckDB: Each thread has a local allocation cache. Contention is minimized because DuckDB's morsel-driven model gives each thread independent work units.
Typical server DB: Centralized buffer pool protected by latches or lock-free structures.

Temp files
DuckDB: Automatically created in a configurable temp directory and cleaned up when the query finishes or the connection closes.
Typical server DB: Explicit tablespace configuration for temp tables; requires DBA setup.

Frequently Asked Questions

What happens when DuckDB exceeds the memory limit?

DuckDB does not crash or throw an out-of-memory error immediately. When the buffer manager reaches the configured memory limit, it evicts unpinned blocks from the buffer pool and writes them to temporary files on disk. Operators that need more memory than the limit — like large sorts or hash joins — transparently spill their intermediate data to these temp files and continue processing. The query completes, just slower than if everything fit in RAM.

How do I set a memory limit in DuckDB?

Use SET memory_limit='4GB'; (or any value) at the start of your session. You can also pass it as a configuration option when opening a connection. If you do not set a limit, DuckDB defaults to 80% of available system RAM. For embedded use cases where DuckDB shares memory with your application, setting an explicit limit prevents it from starving other components of memory.

Does DuckDB use memory-mapped files?

DuckDB uses a combination of standard heap allocation and its own buffer pool. It does not rely on mmap for general query processing the way some databases do. The buffer pool manages fixed-size blocks (typically 256KB) and handles eviction and spill-to-disk internally. This gives DuckDB predictable memory behavior and avoids the unpredictable page-fault patterns that mmap-based systems can suffer from under memory pressure.

How does streaming execution reduce memory usage?

DuckDB processes data in pipelines. A scan-filter-project pipeline materializes only one vector of 2048 rows at a time per thread. Each batch flows through the entire operator chain before the next one is fetched, so a simple filter query over a billion-row table needs only a fixed, vector-sized amount of working memory per thread. Only pipeline-breaking operators (sorts, hash join builds, and aggregations) need to accumulate data, and those are exactly the operators that spill to disk when necessary.

Can I monitor DuckDB's memory usage at runtime?

Yes. You can query the pragma_database_size() function to see current database size, and use CALL duckdb_temporary_files() to inspect spill files. The buffer manager tracks total allocated memory internally. In Python, you can also use standard process-level tools like psutil to monitor the RSS of the process DuckDB is embedded in, giving you a full picture of both DuckDB and application memory usage.