Trace Sampling Strategies
Head-Based vs Tail-Based · Probabilistic vs Deterministic · The Cost-Quality Tradeoff
A single production service at 100 RPS with 30 spans per request and ~1 KB per span produces about 3 MB/sec of telemetry. At 1 GB/day ingestion, a single observability backend can cost $50K/month. Sampling is the pressure valve, but it has a catch: the traces most worth keeping are the ones that look normal until they don't. A sampling strategy answers two questions: when the sampling decision is made, and what rule is applied.
The core split: head-based sampling decides before the trace is complete (simple, low overhead, but may discard the interesting tail). Tail-based sampling decides after the trace finishes (keeps exactly the right traces, but requires buffering and distributed coordination). Most production systems use a hybrid: head-based for the 99% of fast requests, tail-based for the 1% that are slow or errored.
Why Sample — The Bandwidth Math
[Interactive widget: sliders showing how fast trace volume grows.]
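The arithmetic behind that growth is plain multiplication, and a few lines of Python make it easy to try other inputs. This is a minimal sketch: the function names are illustrative, and the inputs are the figures quoted in the introduction (100 RPS, 30 spans per request, roughly 1 KB per span, taken here as 1,000 bytes).

```python
def telemetry_volume(rps: float, spans_per_trace: int, bytes_per_span: int) -> dict:
    """Raw trace volume for one service, before any sampling is applied."""
    bytes_per_sec = rps * spans_per_trace * bytes_per_span
    return {
        "mb_per_sec": bytes_per_sec / 1e6,
        "gb_per_day": bytes_per_sec * 86_400 / 1e9,  # 86,400 seconds in a day
    }


def sampled_gb_per_day(gb_per_day: float, sample_rate: float) -> float:
    """Volume that remains after uniform sampling at the given rate."""
    return gb_per_day * sample_rate


volume = telemetry_volume(rps=100, spans_per_trace=30, bytes_per_span=1_000)
print(f"{volume['mb_per_sec']:.1f} MB/s, {volume['gb_per_day']:.1f} GB/day unsampled")
print(f"{sampled_gb_per_day(volume['gb_per_day'], 0.01):.2f} GB/day at 1% sampling")
```

Volume scales linearly in every input, so doubling the request rate or the span count doubles the bill unless the sampling rate drops to compensate.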
Head-Based Sampling — Decision at Request Start
The sampler decides whether to keep a trace before seeing its outcome. Low overhead, distributed-friendly, but can't prioritize slow/error traces.
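As a rough illustration of why this is cheap, the decision can be a single random draw at the root of the trace, carried downstream as a flag. The sketch below is not the OpenTelemetry SDK API; `SpanContext`, `start_trace`, and `start_child` are hypothetical names used only to show the flow.

```python
import random
from dataclasses import dataclass


@dataclass
class SpanContext:
    trace_id: int
    sampled: bool  # decided once at the root, then carried downstream


def start_trace(sample_rate: float, rng: random.Random) -> SpanContext:
    """Root service: decide before any work is done, so the outcome is unknown."""
    trace_id = rng.getrandbits(128)
    return SpanContext(trace_id, rng.random() < sample_rate)


def start_child(parent: SpanContext) -> SpanContext:
    """Downstream service: inherit the parent's decision instead of re-rolling."""
    return SpanContext(parent.trace_id, parent.sampled)


rng = random.Random(42)  # fixed seed so the run is repeatable
roots = [start_trace(0.01, rng) for _ in range(100_000)]
kept = sum(ctx.sampled for ctx in roots)
print(f"kept {kept} of {len(roots)} traces")
```

Because children copy the flag, no service needs to buffer anything or talk to a coordinator. The cost is that latency and error status play no part in the decision.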
Tail-Based Sampling — Decision After the Trace
The collector buffers spans and decides to keep a trace only after it completes. Guarantees you keep the interesting 1% — slow requests, errors — but requires stateful buffering and memory.
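The buffering described above can be sketched as a small in-memory class. This is an illustrative model under simplifying assumptions, not the OTel Collector implementation: `TailSampler` and `Span` are hypothetical names, time is passed in explicitly, and "slow" is judged per span rather than on whole-trace duration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Span:
    trace_id: str
    duration_ms: float
    is_error: bool
    arrived_at: float  # seconds on an arbitrary clock


class TailSampler:
    def __init__(self, decision_wait_s: float, latency_threshold_ms: float):
        self.decision_wait_s = decision_wait_s
        self.latency_threshold_ms = latency_threshold_ms
        self.buffer: dict[str, list[Span]] = defaultdict(list)
        self.first_seen: dict[str, float] = {}

    def add(self, span: Span) -> None:
        """Hold the span in memory; no decision is made yet."""
        self.first_seen.setdefault(span.trace_id, span.arrived_at)
        self.buffer[span.trace_id].append(span)

    def flush(self, now: float) -> list[str]:
        """Decide every trace whose wait has elapsed; return the kept trace ids."""
        expired = [t for t, t0 in self.first_seen.items()
                   if now - t0 >= self.decision_wait_s]
        kept = []
        for trace_id in expired:
            spans = self.buffer.pop(trace_id)
            del self.first_seen[trace_id]
            if any(s.is_error or s.duration_ms >= self.latency_threshold_ms
                   for s in spans):
                kept.append(trace_id)
        return kept


sampler = TailSampler(decision_wait_s=10, latency_threshold_ms=1000)
sampler.add(Span("a", 40, False, arrived_at=0.0))    # fast, healthy
sampler.add(Span("b", 35, False, arrived_at=0.1))
sampler.add(Span("b", 1500, False, arrived_at=0.4))  # slow span arrives later
sampler.add(Span("c", 20, True, arrived_at=0.2))     # error
print(sampler.flush(now=5.0))   # [] because nothing has waited 10 s yet
print(sampler.flush(now=11.0))  # ['b', 'c']
```

Every span sits in `buffer` until its trace is decided, which is where the memory cost comes from.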
Trace Decision Simulator
[Interactive simulator: generates traces and shows each sampling strategy's keep/discard decisions, to compare which strategy keeps the most interesting traces.]
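A non-interactive version of the comparison can be approximated with synthetic traffic. Everything about the traffic here is an assumption made for illustration: log-normal latency with a median near 55 ms, a 0.5% error rate, and "interesting" defined as an error or a latency above 1 s. The tail sampler is idealized, meaning it keeps exactly the interesting traces and nothing else.

```python
import random


def simulate(n_traces: int, head_rate: float, seed: int = 7) -> dict:
    rng = random.Random(seed)
    interesting = head_kept = head_hits = 0
    for _ in range(n_traces):
        keep_at_head = rng.random() < head_rate    # decided before the outcome exists
        latency_ms = rng.lognormvariate(4.0, 1.0)  # assumed latency distribution
        is_error = rng.random() < 0.005            # assumed error rate
        is_interesting = is_error or latency_ms > 1000
        interesting += is_interesting
        head_kept += keep_at_head
        head_hits += keep_at_head and is_interesting
    return {
        "interesting": interesting,
        "head_kept": head_kept,
        "head_recall": head_hits / interesting,
        "tail_kept": interesting,  # idealized tail sampler keeps exactly these
        "tail_recall": 1.0,
    }


result = simulate(n_traces=200_000, head_rate=0.10)
print(f"interesting traces: {result['interesting']}")
print(f"head: kept {result['head_kept']}, recall {result['head_recall']:.1%}")
print(f"tail: kept {result['tail_kept']}, recall {result['tail_recall']:.0%}")
```

Under these assumptions the head sampler stores far more traces than the tail sampler yet recovers only about a tenth of the interesting ones, because its recall tracks its sampling rate.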
Hybrid Sampling — The Production Standard
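Most large-scale deployments use a two-stage approach: head-based sampling in the application (always collect a baseline, e.g. 1%) plus tail-based sampling in the collector, which keeps the interesting traces among those the application exports. The tail stage can only see what the head stage lets through, so services whose errors and slow requests must be retained need a higher or rule-based head rate.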
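The interaction between the two stages can be estimated with back-of-the-envelope arithmetic. The sketch below assumes the head stage samples uniformly at random and the tail stage keeps every interesting trace it sees plus a probabilistic baseline of the rest; `hybrid_outcome` and its parameters are illustrative names, and the input figures are made up.

```python
def hybrid_outcome(total_traces: float, interesting_fraction: float,
                   head_rate: float, tail_baseline_rate: float) -> dict:
    exported = total_traces * head_rate               # what reaches the collector
    interesting_seen = exported * interesting_fraction
    boring_seen = exported - interesting_seen
    stored = interesting_seen + boring_seen * tail_baseline_rate
    recall = interesting_seen / (total_traces * interesting_fraction)
    return {"exported": exported, "stored": stored, "recall": recall}


outcome = hybrid_outcome(total_traces=1_000_000, interesting_fraction=0.01,
                         head_rate=0.20, tail_baseline_rate=0.10)
print(f"exported to collector: {outcome['exported']:,.0f}")
print(f"stored in backend:     {outcome['stored']:,.0f}")
print(f"interesting recall:    {outcome['recall']:.0%}")
```

With a uniform head stage, recall of interesting traces equals the head rate no matter how good the tail policies are. Reaching high recall therefore depends on the head stage exporting everything from the services or endpoints that matter, which is an assumption this sketch does not model.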
OTel TailSamplingProcessor Configuration
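The OTel Collector tail_sampling processor evaluates policy rules against
buffered traces once decision_wait has elapsed. Every policy is evaluated, and a trace is kept if at least one policy matches it, so the order of the policies below does not determine the outcome.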
processors:
  tail_sampling:
    decision_wait: 10s        # Buffer spans this long before deciding
    num_traces: 100000        # Max traces in memory
    expected_new_traces_per_sec: 10000
    policies:
      # 1. Always keep errors
      - name: errors-policy
        type: status_code
        status_code: { status_codes: [ERROR] }
      # 2. Always keep slow traces (duration > 1 s)
      - name: slow-traces-policy
        type: latency
        latency: { threshold_ms: 1000 }
      # 3. Keep rare operations (low-frequency spans)
      - name: rare-ops-policy
        type: string_attribute
        string_attribute:
          key: operation.name
          values: ["/admin.*", "/debug.*"]   # rare regex patterns
          enabled_regex_matching: true       # required for values to be treated as regexes
      # 4. Keep traces with specific user segments
      - name: vip-users-policy
        type: string_attribute
        string_attribute:
          key: user.tier
          values: ["premium", "enterprise"]
      # 5. Probabilistic sampling on everything else (1 in 10)
      - name: probabilistic-policy
        type: probabilistic
        probabilistic: { sampling_percentage: 10 }

Memory sizing: num_traces × avg_trace_size ≈ collector RAM. At 100K traces × 100 KB average, that is 10 GB of RAM. Shorten decision_wait to reduce the buffer.

Sampling Strategy Comparison
| Strategy | Decision timing | Interesting trace recall | Memory overhead | Complexity | Best for |
|---|---|---|---|---|---|
| Probabilistic head | At request start | ≈ sampling rate (~10% of slow/error at a 10% rate) | None | Low | High-volume, uniform traffic |
| Fixed-rate head | At request start | ~10% of slow/error | None | Low | Predictable storage budgets |
| Rule-based head | At request start | 60–80% of slow/error | None | Medium | Known error patterns |
| Tail-based | After trace complete | 90–99% of slow/error | High (buffer) | High | Low-volume critical services |
| Hybrid (head+tail) | Both stages | 85–95% of slow/error | Medium | High | Large-scale production |
FAQ
Does head-based sampling lose the slow request that caused the incident?
Quite possibly. A head sampler decides before the request's latency is known, so a request that turns slow has the same 1% chance of being kept as any other; the odds are 99 to 1 that the specific request behind the incident was dropped. This is why most production deployments add a tail-based layer for slow and errored traces, or use rule-based head sampling to always sample requests from VIP users or specific error-prone endpoints.
Why does tail-based sampling require so much memory?
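Because the decision to keep a trace isn't made until its spans have arrived. If a trace takes 5 seconds to complete, the collector must hold all its spans in memory for at least 5 seconds, and the OTel Collector holds every trace for the full decision_wait regardless of how quickly it finished. At 10K traces/sec with avg 500ms duration and 10KB per trace, only about 5,000 traces (~50 MB) are in flight at any moment, but a 10-second decision_wait means 100,000 traces (~1 GB) are buffered, and 100KB traces push that to ~10 GB. This is why num_traces is a hard limit: when the buffer is full, traces are dropped without tail evaluation.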
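The buffer size follows from arrival rate × hold time × trace size, capped by the trace limit. The helper below is a sizing sketch only; `tail_buffer_bytes` is a hypothetical name, and it ignores per-span bookkeeping overhead in the collector.

```python
def tail_buffer_bytes(traces_per_sec: float, hold_seconds: float,
                      avg_trace_bytes: float, num_traces_cap: int) -> float:
    """Approximate memory held by traces that are waiting for a decision."""
    in_buffer = min(traces_per_sec * hold_seconds, num_traces_cap)
    return in_buffer * avg_trace_bytes


for hold in (0.5, 5.0, 10.0):
    gb = tail_buffer_bytes(traces_per_sec=10_000, hold_seconds=hold,
                           avg_trace_bytes=10_000, num_traces_cap=100_000) / 1e9
    print(f"hold {hold:>4} s -> {gb:.2f} GB buffered")
```

The hold time is the lever: halving it halves the buffer, at the risk of deciding before late spans have arrived.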
What is the "tail sampling budget"?
The maximum number of traces per second your observability backend can ingest. For Jaeger with an Elasticsearch backend, this might be 500 traces/sec. The tail sampler keeps at most that many, selected by policy priority. If only 50 traces match your policies, only 50 are kept, which leaves head-based sampling as the fallback for the remaining budget.
Can sampling cause me to miss rare bugs that only affect 1 in 10,000 requests?
Yes. At 1% probabilistic sampling, a bug affecting 0.01% of requests is both triggered and sampled in roughly 1 in 1,000,000 requests. At 100 RPS that is about one captured example every three hours, which is effectively invisible. For rare bugs, use tail-based sampling with a "keep everything from this endpoint" rule, or temporarily disable sampling for that endpoint during incident investigation.
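The odds of capturing at least one example can be worked out directly. This sketch assumes requests are independent and sampling is uniform; the 100 RPS figure is an assumed traffic level, and the function names are illustrative.

```python
def expected_captures(rps: float, bug_rate: float, sample_rate: float,
                      hours: float) -> float:
    """Expected number of buggy requests that are also sampled."""
    return rps * 3600 * hours * bug_rate * sample_rate


def prob_at_least_one(rps: float, bug_rate: float, sample_rate: float,
                      hours: float) -> float:
    """Chance that at least one buggy request lands in the sample."""
    n_requests = rps * 3600 * hours
    p_hit = bug_rate * sample_rate  # a request is buggy AND sampled
    return 1 - (1 - p_hit) ** n_requests


# 100 RPS, bug hits 1 in 10,000 requests, 1% sampling, one hour of traffic
print(f"expected captures: {expected_captures(100, 1e-4, 0.01, 1):.2f}")
print(f"P(at least one):   {prob_at_least_one(100, 1e-4, 0.01, 1):.2f}")
print(f"unsampled bug occurrences in that hour: {expected_captures(100, 1e-4, 1.0, 1):.0f}")
```

Under these assumptions the bug fires about 36 times in the hour, yet the chance of holding even one trace of it is roughly 30%.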
What does consistent sampling mean and why does it matter?
Consistent sampling means all spans from the same trace are either all kept or all dropped — you never sample a partial trace. This is achieved by generating the sampling decision from the trace_id hash (deterministic), so any collector processing any span from that trace reaches the same decision. This is critical for tail-based sampling where spans from the same trace may arrive at different times from different services.
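One way to derive a decision from the trace ID is to hash it into a number in [0, 1) and compare that against the sampling rate. The sketch below uses SHA-256 purely for illustration; it is not the algorithm OpenTelemetry SDKs use, and `keep_trace` is a hypothetical helper.

```python
import hashlib


def keep_trace(trace_id: str, sample_rate: float) -> bool:
    """Deterministic decision: the same trace_id always yields the same answer."""
    digest = hashlib.sha256(trace_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64  # uniform in [0, 1)
    return bucket < sample_rate


trace_ids = [f"trace-{i}" for i in range(100_000)]

# Two collectors that never communicate still agree on every trace.
collector_a = [keep_trace(t, 0.10) for t in trace_ids]
collector_b = [keep_trace(t, 0.10) for t in trace_ids]
print("collectors agree:", collector_a == collector_b)
print(f"fraction kept: {sum(collector_a) / len(trace_ids):.3f}")
```

Because the decision depends only on the trace ID and the rate, no shared state is needed. A side effect is that lowering the rate keeps a strict subset of the traces that the higher rate kept.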
How do exemplar links work with sampled traces?
Exemplars are single representative values from a histogram bucket that link to the actual trace that generated them. With head-based sampling, the exemplars in your metrics are from the 1% sampled traces — which may not include the worst outliers. With tail-based sampling, you can configure exemplars specifically for error and slow traces, making metric drill-down actually useful.