Producers

Batching, Compression, and the Knobs That Matter

A Kafka producer is not a thin client wrapping send(). It is a buffered, asynchronous, per-partition batcher with its own background I/O thread. Records are accumulated in RecordAccumulator by destination partition, compressed as a batch, and sent in configurable-size requests to the partition leader.

Tuning the producer well is mostly tuning four knobs: batch.size, linger.ms, compression.type, and acks. The defaults optimize for low latency on development workloads. Production throughput configurations look very different.
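
In the Java client these map onto ProducerConfig constants. A minimal sketch of setting the four knobs (the values shown are illustrative, not recommendations):

import java.util.Properties;
import org.apache.kafka.clients.producer.ProducerConfig;

Properties props = new Properties();
props.put(ProducerConfig.BATCH_SIZE_CONFIG, 131072);        // batch.size, in bytes
props.put(ProducerConfig.LINGER_MS_CONFIG, 20);             // linger.ms
props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "lz4");   // compression.type
props.put(ProducerConfig.ACKS_CONFIG, "all");               // acks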

Architecture

Application threads call send(); the Sender thread drains batches to brokers.

[Diagram: application threads call producer.send() into the RecordAccumulator, an in-memory buffer holding one batch per partition (e.g. orders-0 at 12.4 KB / 16 KB, orders-1 full at 16 KB, orders-2 at 3.1 KB). The Sender thread drains ready batches, groups them by destination broker, compresses each batch, sends ProduceRequests to the partition leaders, and handles acks and retries. A batch is ready when it is full or when linger.ms has elapsed; one ProduceRequest may carry many batches to one broker.]
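
A minimal sketch of that call pattern in the Java client (broker address, topic, and payload are placeholders): send() only appends the record to the accumulator and returns a Future immediately; the callback fires later, on the producer's I/O thread, once the broker responds.

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class AsyncSendSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // send() buffers the record in the RecordAccumulator and returns right away.
            producer.send(new ProducerRecord<>("orders", "order-42", "{\"total\": 19.99}"),
                (metadata, exception) -> {
                    // Runs on the Sender's I/O thread after the broker acks (or the send fails).
                    if (exception != null) {
                        System.err.println("delivery failed: " + exception);
                    } else {
                        System.out.printf("acked %s-%d@%d%n",
                            metadata.topic(), metadata.partition(), metadata.offset());
                    }
                });
            // flush() blocks until every buffered batch has been sent and acknowledged or failed.
            producer.flush();
        }
    }
}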

Key Numbers

  • batch.size default: 16 KB
  • linger.ms default: 0 ms (latency-optimized)
  • max.in.flight.requests.per.connection (with idempotence): 5
  • buffer.memory default (total accumulator): 32 MB
  • delivery.timeout.ms default: 2 min
  • typical lz4 compression ratio for JSON: ~3-5x
  • compression.type default: none

Batching: linger.ms and batch.size

A batch is sent when either it fills up to batch.size, or linger.ms passes since the first record was added.

# Latency-optimized (default)
linger.ms=0
batch.size=16384
# Effect: send immediately, batches are tiny under low load

# Throughput-optimized
linger.ms=20
batch.size=131072        # 128 KB
# Effect: wait up to 20ms for more records, batches up to 128 KB
# Adds <= 20ms latency, 5-10x throughput improvement

# Extreme throughput (bulk loaders)
linger.ms=100
batch.size=1048576       # 1 MB
compression.type=zstd
# Adds 100ms latency, often 50x throughput vs default

The interaction with compression is multiplicative: bigger batches compress much better because the LZ4/zstd dictionary has more redundancy to exploit. A 16 KB batch of similar JSON records typically compresses 2-3x; a 1 MB batch of the same records often hits 8-10x.
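
One way to check whether bigger batches are actually paying off is to read the producer's own metrics at runtime. A sketch, assuming an already-constructed KafkaProducer named producer; the metric names come from the client's producer-metrics group:

import java.util.Map;
import org.apache.kafka.common.Metric;
import org.apache.kafka.common.MetricName;

// batch-size-avg:        average bytes per batch actually sent
// compression-rate-avg:  compressed/uncompressed size, so lower means better compression
// record-queue-time-avg: ms a batch waited in the accumulator (roughly tracks linger.ms)
for (Map.Entry<MetricName, ? extends Metric> entry : producer.metrics().entrySet()) {
    String name = entry.getKey().name();
    if (name.equals("batch-size-avg") || name.equals("compression-rate-avg")
            || name.equals("record-queue-time-avg")) {
        System.out.printf("%s = %s%n", name, entry.getValue().metricValue());
    }
}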

Compression Codecs

Codec    Ratio (JSON)   Producer CPU      Notes
none     1.0x           0%                Default. Wastes network and disk.
snappy   3.0x           Low               Older codec, similar to lz4.
lz4      3.5x           Low               Best balance for most workloads.
gzip     5.0x           High (3-5x lz4)   Rarely worth the CPU.
zstd     4.5x           Medium            Best ratio at acceptable CPU. Common choice in new deployments.

Compression happens once, on the producer. With the default topic setting compression.type=producer, brokers store batches exactly as they arrive (the batch format records the codec) and consumers decompress on read. The broker pays no compression CPU, but it also never recompresses, so the producer's codec choice follows the data all the way to consumers unless you override compression.type on the topic and make the broker pay to recompress.

Acks Levels

# acks=0  — fire and forget
acks=0
# Producer does not wait for any ack. Lowest latency, can lose data on broker crash.
# Use only for metrics/telemetry where loss is acceptable.

# acks=1  — leader ack
acks=1
# Producer waits for leader to write to its log (page cache, not necessarily disk).
# Loses data if leader crashes before followers replicate.

# acks=all — full ISR ack (recommended)
acks=all
min.insync.replicas=2     # broker-side: configured per topic
replication.factor=3
# Leader waits for all in-sync followers to ack. With min.insync.replicas=2,
# the leader rejects writes if ISR drops below 2, surfacing the durability problem.

The subtle gotcha: acks=all alone is not enough. If two of your three replicas die and the leader is the only ISR member, acks=all degrades to acks=1 silently. min.insync.replicas=2 makes that case fail fast with NotEnoughReplicasException, which the producer can retry or surface as an alert.
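
A sketch of how that surfaces in application code (the helper is hypothetical; whether the callback sees NotEnoughReplicasException directly or a TimeoutException after the retry window depends on retries and delivery.timeout.ms):

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.errors.NotEnoughReplicasException;

public final class DurabilityAwareSend {
    // Send one record and make durability failures loud instead of silently dropped.
    static void sendOrAlert(KafkaProducer<String, String> producer,
                            String topic, String key, String value) {
        producer.send(new ProducerRecord<>(topic, key, value), (metadata, exception) -> {
            if (exception == null) {
                return;  // acked by the leader and the required in-sync replicas
            }
            if (exception instanceof NotEnoughReplicasException) {
                // ISR dropped below min.insync.replicas; the leader rejected the write.
                System.err.println("ALERT: ISR below min.insync.replicas for " + topic);
            }
            // In every failure case the record was NOT durably written:
            // log, alert, or divert to a dead-letter path here.
            System.err.println("send failed for " + topic + ": " + exception);
        });
    }
}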

max.in.flight.requests.per.connection

How many ProduceRequests can be outstanding to one broker at once. Affects ordering and throughput.

# Strict ordering (legacy non-idempotent)
max.in.flight.requests.per.connection=1
# Guarantees in-order delivery even on retry, but kills pipelining.
# Throughput drops 3-5x.

# Idempotent (recommended)
enable.idempotence=true
max.in.flight.requests.per.connection=5
# Broker dedupes by sequence number, so reordering is detected and corrected.
# Maintains per-key ordering with full pipelining.

# Non-idempotent, high throughput, no ordering needed
max.in.flight.requests.per.connection=100
acks=1
# Risky: a retry of an older batch can land after a newer batch.

Retries and Timeouts

# How many times to retry on retriable errors (default Integer.MAX_VALUE)
retries=2147483647

# Total time a record can spend in the producer before failing
delivery.timeout.ms=120000  # 2 min default

# Per-request timeout
request.timeout.ms=30000

# How long to wait between retry attempts
retry.backoff.ms=100

# Effect: producer keeps retrying until delivery.timeout.ms expires.
# Common retriable errors: NotEnoughReplicas, NotLeaderForPartition, RequestTimedOut.
# Non-retriable: SerializationException, RecordTooLargeException, AuthenticationException.
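
For callers that block on the result, the final outcome comes back through the returned Future once retries are exhausted. A sketch (the helper name is illustrative); note that delivery.timeout.ms must be at least linger.ms + request.timeout.ms, and the producer rejects an explicitly set value that is smaller.

import java.util.concurrent.ExecutionException;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;
import org.apache.kafka.common.errors.RetriableException;

public final class SyncSendSketch {
    // Blocking send: get() waits for the ack or for the final failure.
    static RecordMetadata sendSync(KafkaProducer<String, String> producer,
                                   ProducerRecord<String, String> record) throws Exception {
        try {
            return producer.send(record).get();  // blocks up to roughly delivery.timeout.ms
        } catch (ExecutionException e) {
            Throwable cause = e.getCause();
            if (cause instanceof RetriableException) {
                // The producer already retried internally; the retry budget simply ran out.
                throw new RuntimeException("gave up after delivery.timeout.ms", cause);
            }
            // Non-retriable, e.g. RecordTooLargeException or an authorization failure.
            throw new RuntimeException("non-retriable producer error", cause);
        }
    }
}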

Partitioner: Sticky vs Hash

# Default since 2.4: built-in sticky partitioner
# - If record has a key: hash(key) % numPartitions  (consistent per-key routing)
# - If record has no key: stick to one partition until batch is full, then switch

# Custom partitioner (rare)
import java.util.List;
import java.util.Map;
import org.apache.kafka.clients.producer.Partitioner;
import org.apache.kafka.common.Cluster;
import org.apache.kafka.common.PartitionInfo;

public class MyPartitioner implements Partitioner {
    @Override
    public int partition(String topic, Object key, byte[] keyBytes,
                         Object value, byte[] valueBytes, Cluster cluster) {
        List<PartitionInfo> parts = cluster.partitionsForTopic(topic);
        // Route premium tenants to partitions 0-3, everyone else to 4..N-1.
        // Assumes every record carries a non-null String key.
        if (((String) key).startsWith("premium-")) {
            return Math.abs(key.hashCode()) % 4;
        }
        return 4 + (Math.abs(key.hashCode()) % (parts.size() - 4));
    }

    @Override public void configure(Map<String, ?> configs) { }
    @Override public void close() { }
}
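
Wiring it in is a single config entry (a sketch; assumes the props object used to build the producer, and whatever package the class actually lives in):

props.put(ProducerConfig.PARTITIONER_CLASS_CONFIG, MyPartitioner.class.getName());
// equivalent properties form: partitioner.class=com.example.MyPartitioner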

The sticky partitioner improved tail latency dramatically when KIP-480 landed. With round-robin and no key, a producer spreading records across 100 partitions turned a burst of 100 records into 100 single-record batches, one per partition. With sticky, the same burst becomes one 100-record batch to a single partition, which compresses, sends, and acks ~10x faster, and because the sticky choice rotates whenever a batch completes, partition load still evens out over time.

Idempotent vs Transactional

Property       Idempotent only                                  Transactional
Config         enable.idempotence=true                          + transactional.id=...
Dedupe scope   Per (PID, partition), single producer instance   Spans producer restarts via fenced epoch
Atomicity      Per-record                                       Across multiple partitions + offset commits
Cost           ~5% throughput overhead                          10-30% overhead, plus consumer-side LSO buffering
Use when       Always — should be the default                   Read-process-write to Kafka
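
A skeleton of the transactional path, following the shape of the KafkaProducer javadoc example (broker address, transactional.id, and topic names are placeholders):

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.KafkaException;
import org.apache.kafka.common.errors.AuthorizationException;
import org.apache.kafka.common.errors.OutOfOrderSequenceException;
import org.apache.kafka.common.errors.ProducerFencedException;
import org.apache.kafka.common.serialization.StringSerializer;

public final class TransactionalSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.TRANSACTIONAL_ID_CONFIG, "orders-enricher-1"); // also enables idempotence
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        KafkaProducer<String, String> producer = new KafkaProducer<>(props);
        producer.initTransactions();  // registers the transactional.id and fences older instances
        try {
            producer.beginTransaction();
            producer.send(new ProducerRecord<>("orders-enriched", "order-42", "{...}"));
            // In a read-process-write loop you would also call
            // producer.sendOffsetsToTransaction(offsets, consumer.groupMetadata()) here.
            producer.commitTransaction();
        } catch (ProducerFencedException | OutOfOrderSequenceException | AuthorizationException e) {
            producer.close();             // fatal: a newer instance with this transactional.id took over
        } catch (KafkaException e) {
            producer.abortTransaction();  // abortable: roll back and retry the unit of work
        }
        producer.close();
    }
}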

Tradeoffs and When Not to Use Defaults

Production checklist

  • enable.idempotence=true (free correctness)
  • compression.type=lz4 or zstd
  • linger.ms >= 5 (huge throughput win)
  • acks=all + min.insync.replicas=2
  • delivery.timeout.ms tuned for your SLO
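
Pulled together as Java producer config, a sketch (broker addresses and exact values are placeholders to adapt):

import java.util.Properties;
import org.apache.kafka.clients.producer.ProducerConfig;

Properties props = new Properties();
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker-1:9092,broker-2:9092,broker-3:9092");
props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, true);
props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "lz4");
props.put(ProducerConfig.LINGER_MS_CONFIG, 10);
props.put(ProducerConfig.BATCH_SIZE_CONFIG, 131072);
props.put(ProducerConfig.ACKS_CONFIG, "all");                  // pair with topic-level min.insync.replicas=2
props.put(ProducerConfig.DELIVERY_TIMEOUT_MS_CONFIG, 120000);  // tune to your delivery SLO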

Common pitfalls

  • linger.ms=0 in production (tiny batches, poor compression)
  • acks=all without min.insync.replicas (silent durability loss)
  • max.in.flight=1 "for safety" (3-5x throughput hit, idempotence is the right answer)
  • Forgetting to call producer.flush() before shutdown (buffer loss)

Frequently Asked Questions

What is the difference between linger.ms and batch.size?

batch.size (default 16384 bytes) is the maximum bytes per partition batch. linger.ms (default 0) is the maximum time the producer waits before sending a batch even if it is not full. The producer sends as soon as either condition is met. Setting linger.ms=10 gives the producer 10ms to accumulate more records, which dramatically improves throughput at the cost of 10ms added latency.

Does acks=all guarantee no data loss?

Only when combined with min.insync.replicas >= 2 and replication.factor >= 3. acks=all means the leader waits for all members of the in-sync replica set (ISR) to acknowledge before responding. If the ISR shrinks to just the leader (because followers fell behind), acks=all degenerates to acks=1. min.insync.replicas=2 forces the leader to refuse writes when ISR is below 2, surfacing the problem instead of silently losing durability.

Why is max.in.flight.requests.per.connection limited to 5 with idempotence?

The broker caches metadata for the last 5 batches per (PID, partition); that window is what it uses to deduplicate retries and put retried batches back into sequence order. If more requests were allowed in flight, a retried batch could fall outside that window, and the broker could no longer tell a duplicate or out-of-sequence batch from a genuinely new one; the producer surfaces this as OutOfOrderSequenceException. Without idempotence you can technically go higher, but a retry of an older batch can then land after a newer one and the broker writes them in the wrong order, breaking per-key ordering.

What does the sticky partitioner actually do?

Before 2.4, the default partitioner round-robined records with no key across partitions. This created tiny batches per partition. The sticky partitioner picks one partition and sends all records there until the batch is full or linger.ms expires, then picks a new partition. This produces large batches that compress better and have less broker overhead. With the same throughput, the average partition still gets even load — it just gets it in bursts.

Which compression codec should I use?

lz4 or zstd in almost all cases. lz4 has the best compression-CPU tradeoff for typical JSON/Avro records. zstd compresses ~20% better at similar CPU. snappy is older and roughly equivalent to lz4. gzip compresses well but is much slower (3-5x) and rarely worth it. Compression happens once on the producer; the broker stores compressed batches as-is and the consumer decompresses, so producer-side codec choice dominates.