Distributed Tracing
Spans, traceparent, and the propagation that makes it all work
A trace is a tree of spans. Each span represents a unit of work — an HTTP handler, a database query, a goroutine — with a start time, a duration, a name, attributes, and a parent. The root span has no parent. Every other span has exactly one. Together they describe a single request's journey through every service it touched.
The hard part isn't drawing the tree. It's propagating context across
process boundaries so service B knows it's part of service A's trace. That happens
via HTTP headers (traceparent), gRPC metadata, message queue
attributes, and database session vars. The W3C Trace Context spec standardizes
the header format, and every modern instrumentation library implements it.
Anatomy of a Trace
A trace is identified by a 128-bit trace_id. Each span has a 64-bit span_id and an optional parent_span_id. The parent_span_id references link spans into a tree; start and end timestamps place them on a shared timeline.
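A minimal sketch of how a flat list of spans can be reassembled into that tree, assuming each span is a plain dict with hypothetical keys span_id, parent_span_id, and name (real backends use richer records and must also handle missing parents):

```python
from collections import defaultdict

def build_tree(spans):
    """Group spans by parent and return (root, children) for one trace."""
    children = defaultdict(list)
    root = None
    for span in spans:
        parent_id = span.get("parent_span_id")
        if parent_id is None:
            root = span                      # the root span has no parent
        else:
            children[parent_id].append(span)
    return root, children

def render(span, children, depth=0):
    """Return the tree as indented lines, one per span, depth-first."""
    lines = ["  " * depth + span["name"]]
    for child in children[span["span_id"]]:
        lines.extend(render(child, children, depth + 1))
    return lines

# Illustrative data only; the IDs are shortened for readability.
spans = [
    {"span_id": "a1", "parent_span_id": None, "name": "GET /checkout"},
    {"span_id": "b2", "parent_span_id": "a1", "name": "payment.charge"},
    {"span_id": "c3", "parent_span_id": "b2", "name": "INSERT payments"},
    {"span_id": "d4", "parent_span_id": "a1", "name": "cart.load"},
]
root, children = build_tree(spans)
print("\n".join(render(root, children)))
```

Spans can arrive in any order, which is why the sketch groups by parent first and only walks the tree once every span has been seen.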
The Span Model
A span carries ten or so essential fields. Everything else is metadata layered on top.
{`message Span {
  bytes trace_id = 1;                  // 128-bit trace identifier
  bytes span_id = 2;                   // 64-bit span identifier
  string trace_state = 3;              // tracestate header passthrough
  bytes parent_span_id = 4;            // 64-bit, empty for root
  string name = 5;                     // operation name, e.g. "GET /api/users"
  SpanKind kind = 6;                   // CLIENT, SERVER, INTERNAL, PRODUCER, CONSUMER
  fixed64 start_time_unix_nano = 7;
  fixed64 end_time_unix_nano = 8;
  repeated KeyValue attributes = 9;    // http.method, db.statement, ...
  repeated Event events = 11;          // exception, log-like markers
  repeated Link links = 13;            // pointers to related spans
  Status status = 15;                  // UNSET / OK / ERROR + message
}`}
Span kind matters for analytics. SERVER spans are entry points to a service. CLIENT spans are calls to other services. PRODUCER/CONSUMER are message queue ops. INTERNAL is everything else. Backends use kind to build service maps and to label spans as "incoming" vs "outgoing".
W3C traceparent & tracestate
The W3C Trace Context spec defines two HTTP headers that every modern tracer respects.
traceparent carries the trace_id, span_id, and sampling flag.
tracestate carries vendor-specific context.
{`traceparent: 00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01
             ^^ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^ ^^
             |  |                                |                |
             |  trace-id (16 bytes hex)          parent-id        flags
             version                                              (sampled bit)
tracestate: vendor1=value1,vendor2=value2,otel=k1:v1;k2:v2
# Receiving service:
# 1. Parse traceparent
# 2. Use trace-id and parent-id as the context for new spans
# 3. Honor the sampled flag (or apply local policy that overrides it)
# 4. Pass through tracestate, prepending its own vendor entry if any`}
The 00 at the start is the version. The trailing 01 is the
flags byte; bit 0 is the sampled flag. If the upstream caller sampled the trace,
every downstream service should respect that and emit spans (head-based consistent
sampling).
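To make the header layout concrete, here is a small parser sketch that uses only the standard library. The names TraceParent and parse_traceparent are illustrative, not part of any SDK. The validation rules follow the W3C format described above: version ff is invalid, all-zero IDs are invalid, and version 00 has exactly four fields.

```python
import re
from typing import NamedTuple, Optional

# version - trace-id - parent-id - flags, optionally followed by future fields
_TRACEPARENT_RE = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})(-.*)?$"
)

class TraceParent(NamedTuple):
    version: int
    trace_id: str    # 32 lowercase hex chars (16 bytes)
    parent_id: str   # 16 lowercase hex chars (8 bytes)
    sampled: bool    # bit 0 of the flags byte

def parse_traceparent(header: str) -> Optional[TraceParent]:
    """Return the parsed header, or None if it is malformed."""
    match = _TRACEPARENT_RE.match(header.strip())
    if match is None:
        return None
    version, trace_id, parent_id, flags, extra = match.groups()
    if version == "ff":
        return None                          # ff is reserved as invalid
    if version == "00" and extra:
        return None                          # version 00 has exactly four fields
    if trace_id == "0" * 32 or parent_id == "0" * 16:
        return None                          # all-zero IDs are invalid
    return TraceParent(int(version, 16), trace_id, parent_id,
                       bool(int(flags, 16) & 0x01))

parsed = parse_traceparent(
    "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
)
print(parsed)
```

A receiver that gets None back would typically start a fresh trace rather than fail the request.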
B3 (Zipkin) Headers
Older, still common in Zipkin-derived ecosystems. B3 splits the trace context across multiple headers (multi-header B3) or stuffs it into one (single-header B3).
{`# Multi-header B3
X-B3-TraceId: 4bf92f3577b34da6a3ce929d0e0e4736
X-B3-SpanId: 00f067aa0ba902b7
X-B3-ParentSpanId: 0020000000000001
X-B3-Sampled: 1
# Single-header B3 (preferred today)
b3: 4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-1-0020000000000001
# OTel propagators support both via composite:
# OTEL_PROPAGATORS=tracecontext,baggage,b3multi`}
Production tip: configure your services to extract both W3C and B3 (read either) but only inject W3C (write the standard). That handles legacy callers gracefully without spreading B3 to new services.
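The read-either, write-one policy can be sketched without any SDK. SpanContext, extract, and inject below are hypothetical helpers, not the OTel propagator API, and validation is deliberately minimal. Short 64-bit B3 trace IDs are left-padded to 128 bits, which is an assumption about how the two formats should be reconciled.

```python
from typing import Mapping, MutableMapping, NamedTuple, Optional

class SpanContext(NamedTuple):
    trace_id: str   # 32 hex chars
    span_id: str    # 16 hex chars
    sampled: bool

def _flag_sampled(flags: str) -> bool:
    """Bit 0 of the W3C flags byte; malformed flags count as not sampled."""
    try:
        return bool(int(flags, 16) & 0x01)
    except ValueError:
        return False

def extract(headers: Mapping[str, str]) -> Optional[SpanContext]:
    """Read W3C first, then single-header B3, then multi-header B3."""
    h = {key.lower(): value.strip() for key, value in headers.items()}

    parts = h.get("traceparent", "").split("-")
    if len(parts) >= 4:
        return SpanContext(parts[1], parts[2], _flag_sampled(parts[3]))

    parts = h.get("b3", "").split("-")
    if len(parts) >= 2:
        sampled = len(parts) >= 3 and parts[2] in ("1", "d")
        return SpanContext(parts[0].zfill(32), parts[1], sampled)

    trace_id, span_id = h.get("x-b3-traceid"), h.get("x-b3-spanid")
    if trace_id and span_id:
        return SpanContext(trace_id.zfill(32), span_id,
                           h.get("x-b3-sampled") == "1")
    return None

def inject(ctx: SpanContext, headers: MutableMapping[str, str]) -> None:
    """Write only the W3C header, never B3."""
    flags = "01" if ctx.sampled else "00"
    headers["traceparent"] = f"00-{ctx.trace_id}-{ctx.span_id}-{flags}"

incoming = {"b3": "4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-1"}
outgoing: dict = {}
ctx = extract(incoming)
if ctx is not None:
    inject(ctx, outgoing)
print(outgoing)
```

In a real service the outgoing span_id would be the ID of the new client span rather than the one received; the sketch reuses it only to keep the round trip visible.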
Propagation Across Boundaries
Every transport that crosses a service boundary needs an idiomatic way to carry trace context. The OTel propagator interface gives one API; each integration applies it differently.
HTTP (REST, GraphQL)
traceparent and tracestate headers. Server middleware
extracts on incoming, client middleware injects on outgoing. Every OTel HTTP
instrumentation does this automatically.
gRPC
gRPC metadata, the equivalent of HTTP headers. Same key names
(traceparent). The OTel gRPC interceptor handles both server and
client sides; the metadata travels with the request like a header.
Kafka / RabbitMQ / SQS
Per-message headers / properties. The producer attaches traceparent to the message, the consumer extracts it and either continues the same trace (PRODUCER → CONSUMER spans) or starts a new trace linked to the producing trace via a Link.
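As a rough sketch of the message-queue case, assume headers are a list of (str, bytes) tuples, the shape common Kafka clients for Python use. The helper names and the returned dict are hypothetical; a real consumer would hand the extracted context to its tracing SDK instead.

```python
from typing import Dict, List, Optional, Tuple

Headers = List[Tuple[str, bytes]]

def inject_traceparent(headers: Headers, traceparent: str) -> Headers:
    """Return a copy of headers carrying exactly one traceparent entry."""
    kept = [(k, v) for k, v in headers if k.lower() != "traceparent"]
    kept.append(("traceparent", traceparent.encode("ascii")))
    return kept

def extract_traceparent(headers: Headers) -> Optional[str]:
    """Return the propagated traceparent, or None if the producer sent none."""
    for key, value in headers:
        if key.lower() == "traceparent":
            return value.decode("ascii")
    return None

def plan_consumer_span(headers: Headers, continue_trace: bool = True) -> Dict:
    """Decide how the consumer span relates to the producer span."""
    upstream = extract_traceparent(headers)
    if upstream is None:
        return {"parent": None, "links": []}        # nothing propagated: new root
    if continue_trace:
        return {"parent": upstream, "links": []}    # same trace, child of PRODUCER
    return {"parent": None, "links": [upstream]}    # new trace, linked to producer

tp = "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
message_headers = inject_traceparent([("content-type", b"application/json")], tp)
print(plan_consumer_span(message_headers))
print(plan_consumer_span(message_headers, continue_trace=False))
```

Continuing the trace suits one-message-one-handler flows; the linked variant fits batch or long-delayed processing, where a single parent would misrepresent what happened.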
SQL databases
Either as a session or transaction-scoped variable (in PostgreSQL, for example,
SET LOCAL app.trace_id = '...'; custom settings need a dotted prefix)
or embedded in a SQL comment (/* traceparent='...' */). The latter
works without server-side support and shows up in slow query logs, which is useful
for connecting DB latency back to the originating trace.
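The comment approach is easy to sketch. The helper below, with_trace_comment, is a hypothetical name. It appends the comment to the statement and URL-encodes the value so that a stray quote or */ cannot terminate the comment early. Query parameters should still be passed separately through the driver.

```python
from urllib.parse import quote

def with_trace_comment(sql: str, traceparent: str) -> str:
    """Append a traceparent comment to a SQL statement."""
    body = sql.rstrip()
    terminator = ";" if body.endswith(";") else ""
    body = body.rstrip(";").rstrip()
    # Encode everything outside the unreserved set, including ' * and /
    value = quote(traceparent, safe="")
    return f"{body} /* traceparent='{value}' */{terminator}"

tagged = with_trace_comment(
    "SELECT id, total FROM orders WHERE id = %s;",
    "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01",
)
print(tagged)
```

Because each statement's text now differs per trace, prepared-statement caches that key on the raw text may see lower hit rates; that tradeoff is worth checking against the driver in use.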
OTel SDK Span Lifecycle
The path a span takes from creation to export.
{`# Pseudocode of OTel SDK internals
def start_span(name, kind, parent_context, attrs=None):
    # 1. Read parent context (from incoming traceparent or the current context)
    parent = parent_context.span_context

    # 2. Inherit the parent's trace_id, or generate one for a root span
    trace_id = parent.trace_id or new_random_128()

    # 3. Sampler decides keep/drop
    decision = sampler.should_sample(parent, trace_id, name, kind, attrs)
    if decision == DROP:
        return NonRecordingSpan()  # cheap noop

    # 4. Allocate span object
    span = Span(
        trace_id   = trace_id,
        span_id    = new_random_64(),
        parent_id  = parent.span_id,
        start_time = now_ns(),
        ...
    )
    return span

span = tracer.start_span("payment.charge", kind=CLIENT)
try:
    span.set_attribute("payment.amount_usd", 42.99)
    do_work()
except Exception as e:
    span.record_exception(e)  # adds an Event with type=exception
    span.set_status(StatusCode.ERROR)
    raise
finally:
    span.end()  # end_time = now_ns()
    # SpanProcessor receives the ended span
    # BatchSpanProcessor queues it; export thread sends OTLP every N seconds`}
Attributes vs Events vs Links
Three different ways to attach context to a span. Choosing the right one matters for both backend storage and queryability.
| Mechanism | Shape | Use for |
|---|---|---|
| Attributes | key/value pairs on the span itself | Span-scoped facts: http.method, db.statement, user.tier |
| Events | timestamped log-like markers within the span | Things that happened during the span: exceptions, retry attempts, cache misses |
| Links | references to other spans (different trace OK) | Many-to-one or causal relations: batch consumers from N producers |
The semantic conventions (semconv) standardize attribute names. Use
http.method, not method. Use db.system, not
db_type. Vendors and backends rely on these names for service maps and
built-in dashboards.
Sampling Decisions in the SDK
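The Sampler interface receives the parent context, the trace_id, and whatever attributes were supplied at span creation, and decides keep/drop before the span records anything. Attributes set later are invisible to it. Three implementations cover most needs.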
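The parent-based ratio decision can be sketched in a few lines of plain Python. Comparing the low 64 bits of the trace_id against a threshold is one common approach and is assumed here; the exact algorithm varies by SDK. The function names are illustrative.

```python
from typing import Optional

TRACE_ID_LIMIT = (1 << 64) - 1   # only the low 64 bits of the trace_id are compared

def ratio_keeps(trace_id: int, ratio: float) -> bool:
    """Deterministic keep/drop: the same trace_id gives the same answer everywhere."""
    bound = round(ratio * (TRACE_ID_LIMIT + 1))
    return (trace_id & TRACE_ID_LIMIT) < bound

def parent_based(parent_sampled: Optional[bool], trace_id: int, ratio: float) -> bool:
    """Follow the parent's decision if there is one; otherwise apply the ratio."""
    if parent_sampled is not None:
        return parent_sampled
    return ratio_keeps(trace_id, ratio)

trace_id = 0x4BF92F3577B34DA6A3CE929D0E0E4736
print(parent_based(None, trace_id, 0.01))    # root span: ratio decides
print(parent_based(True, trace_id, 0.01))    # child span: parent decides
```

Because the decision depends only on the trace_id, two services that both see a root request with the same ID reach the same verdict without coordinating.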
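{`# OTel SDK sampler types (Python example)
from opentelemetry.sdk.trace.sampling import (
    ALWAYS_OFF, ALWAYS_ON, Decision, ParentBased,
    Sampler, SamplingResult, TraceIdRatioBased,
)

sampler = ParentBased(root=TraceIdRatioBased(0.01))
# - If a parent context exists with sampled=true, sample the child
# - If a parent context exists with sampled=false, drop
# - If no parent (root span), apply 1% probabilistic on trace_id
sampler = ALWAYS_ON   # 100% - tail sampling at collector handles cost
sampler = ALWAYS_OFF  # 0% - useful for forks of a request

# Custom sampler that always samples /checkout traces
class HighValueRouteSampler(Sampler):
    def __init__(self, fallback):
        self.fallback = fallback

    def should_sample(self, parent_context, trace_id, name, kind=None,
                      attributes=None, links=None, trace_state=None):
        if (attributes or {}).get("http.route") == "/checkout":
            return SamplingResult(Decision.RECORD_AND_SAMPLE, attributes)
        return self.fallback.should_sample(
            parent_context, trace_id, name, kind, attributes, links, trace_state)

    def get_description(self):
        return "HighValueRouteSampler"`}
Tradeoffs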
More spans = better visibility
Wrapping every function call in a span gives perfect debuggability. It also produces 1000-span traces that nobody can read and storage bills nobody can pay.
Fewer spans = cheaper, less detail
Span only the entry points and outgoing calls. The internals show up as duration on the entry span. Less detail, but the service map stays clean and budgets stay sane.
Attribute richness = backend cost
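Attributes cost money in every backend. Jaeger on Elasticsearch or Cassandra indexes tags for search, and Tempo stores attributes in columnar blocks that it scans at query time. More attributes mean more storage and slower queries. Pick a budget of 5-10 attributes per span and stick to it.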
Manual vs auto-instrumentation
Auto wins on coverage (HTTP, gRPC, DB, cache, queues all instrumented automatically). Manual wins on business semantics (span events for "coupon applied"). Use both.
FAQ
Why is trace_id 128 bits when span_id is only 64?
The trace_id has to be globally unique across all your services, all your customers, all the time. Birthday-paradox math says 64 bits collides at scale; 128 bits is comfortably collision-free. span_id only has to be unique within a trace, where 64 bits is plenty.
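The birthday-paradox claim can be checked with the standard approximation p ≈ 1 - exp(-n² / 2^(bits+1)) for n uniformly random IDs. The sketch below assumes IDs are truly uniform, which real generators only approximate; the function names are illustrative.

```python
import math

def collision_probability(num_ids: float, bits: int) -> float:
    """Birthday-bound approximation of at least one collision among num_ids IDs."""
    # expm1 keeps precision when the probability is tiny
    return -math.expm1(-(num_ids ** 2) / 2.0 ** (bits + 1))

def ids_for_probability(p: float, bits: int) -> float:
    """Number of random IDs at which the collision probability reaches p."""
    return math.sqrt(-2.0 * math.log1p(-p) * 2.0 ** bits)

print(f"64-bit, 50% collision at ~{ids_for_probability(0.5, 64):.2e} IDs")
print(f"128-bit, 1e12 IDs: p ~ {collision_probability(1e12, 128):.1e}")
```

Under this approximation a 64-bit space reaches even odds of a collision at roughly 5 billion IDs, a volume a busy platform can generate quickly, while a trillion 128-bit IDs leave the probability on the order of 1e-15.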
Do I need to instrument my code manually?
For HTTP, gRPC, popular databases, and message queues: no. OTel auto-instrumentation libraries wrap them. For your business logic: yes, the spans that name your domain ("checkout", "fraud_check") have to be added by you because no library knows what they mean.
What if a service doesn't propagate context?
You get a "broken trace" — the receiving service starts a new trace_id and the connection is lost. Mitigate by adding OTel auto-instrumentation as early as possible in your stack (often as a sidecar or middleware), so context propagates even before code changes.
How do I trace through a load balancer?
Most LBs (Envoy, nginx, HAProxy) pass headers transparently. Some can also generate trace context themselves. The risk is custom proxies or API gateways that strip "unknown" headers; configure them to allowlist traceparent and tracestate.
Should I trace synchronous internal function calls?
Generally no. Spans for in-process function calls produce noisy traces and double-count time (parent span includes child). Use them sparingly for boundaries where you need timing data, not for every function.
What's a "span link" actually for?
Consider a batch consumer that processes 100 messages from Kafka. Each message comes from a different producing trace. The consumer span links to all 100 producing spans: one consumer trace, with provenance to N parents. Without links you'd lose the connection or fan out the trace tree absurdly.