Note: Company names, engineers, incidents, numbers, and scaling scenarios in this article are hypothetical — even when they resemble real ones. See the full disclaimer.
Log-to-trace correlation: trace IDs in logs
It is 21:07 IST. Karan, an SRE at a hypothetical Mumbai-based payments processor we will call PayWeave, is staring at a Loki tab. A merchant has just escalated: "Order ID 88324 failed checkout, customer charged but order missing — please trace it." Karan runs {service="checkout-api"} |= "88324" and finds three log lines. One says payment_init merchant=zappyfoods order=88324. The next, six seconds later, says payment_failed reason=upstream_timeout order=88324. The third, a minute later, says compensation_pending order=88324. He has the story. He does not have the spans — which downstream service timed out, how long the Redis lock was held, whether the NPCI hop returned an error or just blackholed the request. The trace exists in Tempo. It has 47 spans. He just cannot find it, because none of those three log lines contains a trace_id. He will spend the next 18 minutes grepping for the VPA and timestamp-correlating against Tempo to shortlist candidate traces — most of which will turn out to be wrong.
Then his colleague Aditi pushes a one-line patch: every log call now binds the active span's trace_id into the structured log record. Twenty hours later, the same merchant escalates a different order. This time, {service="checkout-api"} |= "88991" returns three lines, each carrying trace_id=4b7a9c2e3f8d1a06…. Karan clicks the trace_id in Grafana. Tempo opens. The 47-span tree shows a 4.8-second wait on npci-axis-bank-rail followed by an HTTP 504 from the bank simulator. Time-to-root-cause: 90 seconds. The single-line patch was the entire fix.
A trace_id is a 128-bit identifier assigned at the edge of a request and propagated through every span; logs become correlatable when the application binds that trace_id into every log record emitted while the span is active. The mechanism is small (one MDC-style context binding, one log-formatter field) but the wiring is fragile — wrong logger, wrong context, missing span, or wrong format and the trace_id is silently absent. This is the cheapest cross-pillar correlation edge in observability, and the one most teams ship broken on the first try.
Why a log line without a trace_id is a dead end
A log line is, structurally, a tuple of (timestamp, level, message, fields). In a structured-logging pipeline the fields are a JSON object — {"service": "checkout-api", "merchant": "zappyfoods", "order": "88324", "level": "ERROR", "msg": "payment_failed"}. The line is searchable: Loki's inverted index over labels and substring grep over the body lets you find every log that mentions order 88324. What it does not give you is the path — which other services touched this request, in what order, and where time was spent. The path lives in a trace, and the trace lives in Tempo (or Jaeger, or Grafana Cloud Traces, or Honeycomb, or whichever backend the team chose). The two stores are physically separate, indexed differently, and queryable through different APIs.
Without a join key between them, the correlation walk is forensic: you read the log timestamp, switch tabs to Tempo, run a TraceQL query like {service.name="checkout-api" && duration > 5s} for the same minute, and hope that exactly one trace matches. At Hotstar during the IPL final, the rate is 38,000 RPS and the duration filter narrows it to maybe 400 traces per minute. You will not pick the right one by eye. You will pick three or four candidates and compare span attributes against the log fields, which adds 5–10 minutes per investigation. With a trace_id in the log, the correlation walk is one click. The 90-second time-to-root-cause replaces the 18-minute scavenger hunt.
The trace_id is also the only field that survives the cross-process boundary unambiguously. A request enters at the edge gateway, a trace_id is assigned, the trace_id is written into the W3C traceparent header (00-<trace_id>-<span_id>-01), the next service reads the header, the trace_id flows. Every span emitted across that request — gateway, auth-service, checkout-api, payments-router, npci-rail, ledger-writer — carries the same trace_id. Every log line emitted by any of those services can carry that same trace_id if the wiring is right. The result is that grepping Loki for one trace_id returns logs from all eight services in chronological order, automatically reconstructing the cross-service narrative the trace itself only has as spans. Logs and traces become two views of the same request.
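To make the header shape concrete, here is a minimal sketch that splits a traceparent value into its four parts — the IDs reuse the worked example's values from later in this article, and this is illustrative rather than a full W3C parser:
# traceparent: <version>-<trace_id>-<parent_span_id>-<flags>
header = "00-4b7a9c2e3f8d1a06b1c9f4e2d8a73c10-9f4e2d8a73c10b1c-01"
version, trace_id, parent_span_id, flags = header.split("-")
assert len(trace_id) == 32 and len(parent_span_id) == 16  # 128-bit and 64-bit hex
# every span — and, with the wiring below, every log line — in the request carries this same trace_id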
Why the trace_id and not the request_id, the order_id, or the timestamp: the trace_id is the only identifier guaranteed to be unique across the entire distributed request, assigned exactly once at the edge, propagated through every transport (HTTP, gRPC, Kafka, SQS) by the OTel context-propagation contract, and present on every span in the trace. A request_id is application-level — different services may assign their own. An order_id is business-level — multiple distributed transactions may touch the same order over time. A timestamp is one-to-many — at 38,000 RPS many traces share any given second. Only the trace_id has the cross-process, cross-component, one-to-one property that makes it a clean join key. The W3C traceparent spec exists specifically to standardise this propagation; using anything else for cross-service correlation is rebuilding that spec from scratch with worse compatibility.
Binding the trace_id — context, MDC, and the formatter
The mechanical question is: how does the trace_id end up inside a log record? A naive answer is "pass it as a parameter to every logger.info() call". This works in a single function but does not survive across function boundaries — every helper would need to take a trace_id argument, every library you call (database driver, HTTP client, retry decorator) would need to carry it. The correct answer is context binding: the trace_id is stored in a thread-local (or async-local) context, the logger reads it from the context at format time, and every log line emitted while the context is active automatically carries it. The reader writes log calls as they always did; the context binding does the rest.
In Java this pattern is called MDC — Mapped Diagnostic Context — and is part of SLF4J / Logback. In Python the equivalent is contextvars.ContextVar (PEP 567), or loguru.bind(), or structlog.contextvars. In Go it is context.Context with a custom logger key. In Node it is AsyncLocalStorage. The pattern is universal: a context primitive that propagates with the call stack (synchronously or asynchronously), and a logger that reads from the primitive at format time.
The OpenTelemetry SDK does not directly write to your logger's MDC. Instead, it provides a hook: when a span becomes the active span (via with tracer.start_as_current_span(...) in Python, or Scope in Java), the OTel SDK pushes the span context — including trace_id and span_id — onto its own context. A small adapter library reads from the OTel context and writes to your logger's MDC. In Python this is opentelemetry-instrumentation-logging or the manual pattern of reading trace.get_current_span().get_span_context() inside a logging.Filter. In Java it is opentelemetry-logback-mdc-1.0 or opentelemetry-log4j-context-data-2.17. The bridge is a few lines but is the load-bearing wire — without it, OTel knows the trace_id and your logger does not, and they pass each other in the night.
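For teams that do not want the instrumentation package, the same bridge can be hand-rolled as a logging.Filter — a minimal sketch of the manual pattern just described; the class name and the otelTraceID/otelSpanID attribute names are conventions, not a fixed API:
# manual bridge: copy the active span's IDs onto every LogRecord
import logging
from opentelemetry import trace

class TraceContextFilter(logging.Filter):
    def filter(self, record):
        ctx = trace.get_current_span().get_span_context()
        # outside an active span the context is invalid; carry None rather than crash
        record.otelTraceID = format(ctx.trace_id, "032x") if ctx.is_valid else None
        record.otelSpanID = format(ctx.span_id, "016x") if ctx.is_valid else None
        return True  # enrich the record, never drop it

# attach with handler.addFilter(TraceContextFilter()) on the same handler as the JSON formatter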
Once the bridge is wired, the third piece is the format: the log record carries the trace_id, but does the formatter actually emit it? A JSON formatter usually does (it dumps every field on the record). A standard printf-style formatter does not unless you explicitly add %X{trace_id} (Logback) or %(trace_id)s (Python logging). Many "we wired OTel logging but trace_ids are not appearing" stories end at the formatter — the trace_id was on the record, the formatter just did not project it into the output. The diagnostic is to call logger.info(...) inside a span and immediately read the formatted output: if the trace_id is not in the line, the formatter is wrong; if it is in the line but Loki is not surfacing it, the Loki ingestion is dropping it (usually because the structured field name does not match the JSON parser stage's expected field).
# log_trace_bridge.py — Flask app emitting structured logs with trace_id binding
# pip install flask opentelemetry-api opentelemetry-sdk \
# opentelemetry-exporter-otlp opentelemetry-instrumentation-flask \
# opentelemetry-instrumentation-logging python-logging-loki loguru
import json, logging, random, time
from flask import Flask, jsonify
from opentelemetry import trace
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
from opentelemetry.instrumentation.flask import FlaskInstrumentor
from opentelemetry.instrumentation.logging import LoggingInstrumentor
# 1. configure OTel — every span carries service.name, instance.id, env
resource = Resource(attributes={
"service.name": "checkout-api",
"service.instance.id": "checkout-api-7d9f-xk2",
"deployment.environment": "production",
})
provider = TracerProvider(resource=resource)
provider.add_span_processor(BatchSpanProcessor(
OTLPSpanExporter(endpoint="http://localhost:4318/v1/traces")))
trace.set_tracer_provider(provider)
tracer = trace.get_tracer(__name__)
# 2. the bridge — auto-injects otelTraceID, otelSpanID, otelServiceName
# into every log record emitted inside an active span
LoggingInstrumentor().instrument(set_logging_format=False)
# 3. JSON formatter — projects the otel* fields into the JSON output
class JsonFormatter(logging.Formatter):
    converter = time.gmtime  # timestamps in UTC to match the trailing "Z"
    def format(self, r):
        rec = {
            # time.strftime has no %f, so append milliseconds from the record instead
            "ts": self.formatTime(r, "%Y-%m-%dT%H:%M:%S") + ".%03dZ" % r.msecs,
            "level": r.levelname, "service": "checkout-api",
            "msg": r.getMessage(),
            "trace_id": getattr(r, "otelTraceID", None),
            "span_id": getattr(r, "otelSpanID", None),
        }
        # bring in any business fields passed via extra={"biz_...": ...}
        for k, v in r.__dict__.items():
            if k.startswith("biz_"):
                rec[k[4:]] = v
        return json.dumps(rec)
handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
log = logging.getLogger("checkout"); log.setLevel(logging.INFO); log.addHandler(handler)
app = Flask(__name__)
FlaskInstrumentor().instrument_app(app) # auto-creates spans on every request
@app.route("/checkout/<order_id>")
def checkout(order_id):
log.info("payment_init", extra={"biz_order": order_id})
time.sleep(random.uniform(0.05, 0.2))
if random.random() < 0.1:
log.error("payment_failed", extra={"biz_order": order_id, "biz_reason": "upstream_504"})
return jsonify(ok=False), 502
log.info("payment_done", extra={"biz_order": order_id})
return jsonify(ok=True)
if __name__ == "__main__":
app.run(port=8080)
Sample output from curl http://localhost:8080/checkout/88991 then tailing the app's stdout:
{"ts": "2026-04-25T15:37:08.103Z", "level": "INFO", "service": "checkout-api",
"msg": "payment_init", "trace_id": "4b7a9c2e3f8d1a06b1c9f4e2d8a73c10",
"span_id": "9f4e2d8a73c10b1c", "order": "88991"}
{"ts": "2026-04-25T15:37:08.281Z", "level": "ERROR", "service": "checkout-api",
"msg": "payment_failed", "trace_id": "4b7a9c2e3f8d1a06b1c9f4e2d8a73c10",
"span_id": "9f4e2d8a73c10b1c", "order": "88991", "reason": "upstream_504"}
Walking the load-bearing lines: LoggingInstrumentor().instrument(set_logging_format=False) is the bridge — it monkey-patches the Python logging module so that every LogRecord created inside an active span gets otelTraceID, otelSpanID, and otelServiceName attached. The set_logging_format=False is critical: with True, OTel rewrites the formatter for you to a default printf-style line that does not produce JSON. With False, you keep your JSON formatter and it pulls the otel-* fields explicitly. FlaskInstrumentor().instrument_app(app) auto-creates a server span around every HTTP request — without it, there is no active span when the route handler runs, and the trace_id is None. getattr(r, "otelTraceID", None) in the formatter is defensive: if the log call happens outside an active span (a startup log, a background-thread log, a cron-job log), the field is missing and the JSON record carries null. Logging null is honest — the reader knows there was no active trace — and is far better than crashing with AttributeError. extra={"biz_order": order_id} is the standard Python pattern for adding business fields to a log record without colliding with the framework's own fields; the biz_ prefix is one convention, others use tags={...} or bind(...) (loguru).
Why JSON, not key-value or printf: structured logging backends (Loki, Elasticsearch, ClickHouse) all parse JSON natively. The trace_id field becomes a queryable label — {service="checkout-api"} | json | trace_id="4b7a9c..." in LogQL. With printf-style logs, the trace_id is buried in the message text and Loki can only substring-grep — far slower and far less precise (a substring match also hits log lines where the trace_id appears as part of another identifier, an extremely rare but real false-positive). The 30 extra bytes per line for JSON braces and field names are the cheapest correlation tax in observability; teams that compress them away with logfmt or plain text save 10% on ingestion bandwidth and pay 50% on time-to-root-cause. The math is bad and the discipline of "JSON first" is established practice across Razorpay, Hotstar, Swiggy, and Zerodha for exactly this reason.
Wiring it end-to-end — Loki, Grafana, Tempo
Emitting the trace_id into the log record is half the job. The other half is making the trace_id clickable — when an SRE looks at a log line in Grafana, the trace_id field should render as a link that opens Tempo with the trace pre-loaded. Without the link, the SRE copies the trace_id, switches to the Tempo tab, pastes it, runs a query — the same friction the bare correlation walk had, just slightly faster. The clickable link is the difference between "it's possible" and "it's instinctive".
The wiring is a Grafana data-source configuration on the Loki side. Loki's data source has a derivedFields array — each entry specifies a regex over the log content, a label name, and a target data-source UID with a query template. For trace_id correlation, the canonical entry matches "trace_id":"([a-f0-9]{32})" (or traceID, depending on your field name) and links to the Tempo data source with the captured group as ${__value.raw}. When Grafana renders a Loki log line, it scans for the regex, extracts the trace_id, and renders a button next to the field. One click and the Tempo panel opens.
# grafana/provisioning/datasources/loki.yml — derivedFields wires the click
apiVersion: 1
datasources:
- name: Loki
type: loki
url: http://loki:3100
jsonData:
derivedFields:
- name: TraceID
matcherType: label # match on a parsed JSON field, not regex
matcherRegex: trace_id # the field name in the log record
datasourceUid: tempo-uid
url: '$${__value.raw}' # Tempo accepts trace_id as URL value
- name: SpanID
matcherType: label
matcherRegex: span_id
datasourceUid: tempo-uid
url: '$${__value.raw}'
# grafana/provisioning/datasources/tempo.yml — tracesToLogs wires the reverse
apiVersion: 1
datasources:
- name: Tempo
type: tempo
uid: tempo-uid
url: http://tempo:3200
jsonData:
tracesToLogsV2:
datasourceUid: loki-uid
# use the trace's resource attributes to filter Loki
tags: [{ key: 'service.name', value: 'service' }]
# bound the time window to the span's duration ±2 minutes
spanStartTimeShift: '-2m'
spanEndTimeShift: '2m'
# the actual LogQL query — find logs in this trace via trace_id field
customQuery: true
        query: '{$${__tags}} | json | trace_id="$${__trace.traceId}"'
This pair makes the navigation graph undirected. From a log line, derivedFields opens the trace in Tempo. From a span in Tempo, tracesToLogsV2 opens Loki pre-filtered to that trace's trace_id and the matching service. The SRE clicks one direction or the other depending on which artefact they have in hand. The bidirectional bridge is the same shape as the metric↔trace bridge documented in /wiki/exemplars-metrics-traces — a sparse identifier pinned at observation time, joinable in either direction at query time.
When the wiring breaks — six failure modes and how to find them
In production, log-to-trace correlation breaks in characteristic ways. The diagnostic ladder for "I see the log line, the trace_id is missing or not clickable" walks them in order.
Break 1: the log call happens outside an active span. A startup log, a background-thread log, a cron-job log, a log inside an @app.before_request handler that runs before the framework's instrumentation creates the server span — none of these have an active span, so the OTel context-bridge attaches no trace_id. The log record's otelTraceID field is None. The fix depends on the case: for legitimate no-trace logs (startup, shutdown), accept that the field is absent. For logs that should have a trace_id (a request handler) but do not, check that the framework instrumentation ran first — the order of LoggingInstrumentor().instrument() and FlaskInstrumentor().instrument_app() matters; both should be called before the first request arrives.
Break 2: the OTel bridge sets the wrong format. LoggingInstrumentor().instrument() with the default set_logging_format=True rewrites the Python logger's format to a printf template like %(otelTraceID)s %(levelname)s %(message)s. If your service uses a JSON formatter, the OTel-installed printf formatter overwrites it, and your structured logs become flat text. The fix is to pass set_logging_format=False and rely on your JSON formatter to project otelTraceID explicitly — as the example code does.
Break 3: the JSON formatter does not project otelTraceID. The OTel bridge attaches the field to the LogRecord, but if your formatter only emits a hardcoded set of fields (timestamp, level, message), the trace_id never makes it into the output. The diagnostic is to print the log record's __dict__ and check for otelTraceID — if present, formatter is the problem; if absent, the bridge did not run. The fix is to add the field to the formatter, either explicitly or by iterating over r.__dict__.
Break 4: the log shipper renames or drops the field. Promtail, Vector, and Fluent Bit all have a JSON-parse stage in their pipeline. If the stage is configured to keep only certain fields (expressions: { level: level, msg: msg }), the trace_id is dropped at ingestion. If the stage renames otelTraceID → otel_trace_id, Loki indexes it under a different name and the Grafana derivedField regex (matcherRegex: trace_id) does not match. The fix is to configure the shipper's JSON stage to preserve the trace_id field name end-to-end.
Break 5: Loki ingests the field as content, not as a parsed label. Loki's storage model has two layers: stream labels (low-cardinality, indexed) and log content (high-cardinality, content-addressed). A trace_id has 100% cardinality and must live in the content. To query against it, the LogQL query must include | json | trace_id="..." to parse and filter — the trace_id is not a stream label and never should be (one new stream per trace_id would explode Loki's index instantly). If your Grafana derivedField is configured to match a stream label rather than a parsed JSON field, the regex will never fire because the trace_id never appears in the line as a label. The fix is to use matcherType: label with matcherRegex: trace_id (matches the parsed JSON field), or matcherType: regex with matcherRegex: '"trace_id":"([a-f0-9]{32})"' (matches the substring).
Break 6: the Grafana data source UIDs do not match. The Loki derivedField points at datasourceUid: tempo-uid. The Tempo data source must have uid: tempo-uid set in its provisioning YAML. If they mismatch, the click resolves to "data source not found" and Grafana silently does nothing. The fix is to align the UIDs in both YAML files — this is the one-character bug that costs many hours.
Why these breaks are silent rather than loud: at every stage of the pipeline, the trace_id is treated as an optional field — the OTel bridge attaches it if there is an active span, the formatter projects it if the field exists, the shipper preserves it if the parser keeps it, Loki indexes it if the JSON stage finds it, Grafana renders the link if the regex matches. None of these stages have a reason to error on absence: a log line without a trace_id is still a perfectly valid log line. The system is designed to degrade gracefully, which is correct behaviour for a non-critical field. The cost is that a wiring break never throws, never pages, never even logs a warning. The mitigation is the contract test described below — turn the optional field into a CI-enforced invariant for the production code paths where it must be present.
The six breaks compose. A team can have all six configured, then change the JSON formatter in a refactor, drop the trace_id from the projected fields (Break 3), and ship — the breakage is silent because the build passes, the tests pass, and only the SRE on the next 02:00 incident discovers that trace_ids are missing. The mitigation is a contract test in CI that emits a log line inside a span, parses the JSON, and asserts that trace_id is present and 32 hex characters. Twelve lines of test, prevents months of silent regression.
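A minimal pytest sketch of that contract test, assuming the example module above is importable as log_trace_bridge (importing it also wires the tracer provider and the logging bridge; the test and logger names are hypothetical):
# test_trace_id_contract.py — CI invariant: logs emitted inside a span carry a 32-hex trace_id
import json, logging, re
from opentelemetry import trace
from log_trace_bridge import JsonFormatter

def test_log_line_carries_trace_id(capsys):
    handler = logging.StreamHandler()          # defaults to sys.stderr, which capsys captures
    handler.setFormatter(JsonFormatter())
    log = logging.getLogger("contract-test")
    log.setLevel(logging.INFO)
    log.addHandler(handler)

    with trace.get_tracer(__name__).start_as_current_span("contract-span"):
        log.info("contract_check")

    line = capsys.readouterr().err.strip().splitlines()[-1]
    trace_id = json.loads(line)["trace_id"]
    assert trace_id and re.fullmatch(r"[0-9a-f]{32}", trace_id)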
Common confusions
- "trace_id and span_id give you the same correlation." The trace_id identifies the entire request; the span_id identifies one operation within it. Logs typically carry both — the trace_id for cross-service joining, the span_id for "which exact operation in the trace did this log come from". Filtering Loki by trace_id returns all logs for the request; filtering by span_id returns logs from one specific span (usually one service, one function). Most teams index on trace_id for correlation and use span_id as a secondary filter when the trace_id matches multiple suspect operations.
- "OpenTelemetry handles log correlation automatically." OpenTelemetry has three signals: traces, metrics, logs. The trace and metric signals are mature; the log signal (OTLP/logs) is real but adoption is partial. The correlation pattern this article describes is the bridge approach — your existing logger keeps emitting logs to your existing log shipper, and OTel just injects the trace_id into the records. The pure-OTel approach (logs as OTLP, ingested into a backend that joins natively) exists but most teams in 2026 are still on the bridge approach. Both work; the bridge is more common and more interoperable.
- "Putting trace_id in logs blows up cardinality." The trace_id is in the content of the log line, not in the stream labels. Loki's index sees the same handful of stream labels (
service,level,env); the trace_id is searchable via| json | trace_id=...but does not multiply the stream count. The cost is content storage (~32 bytes per log line, compressed by gzip to ~16) — negligible compared to message bodies. - "trace_id propagation is automatic across all transports." OTel context propagation is automatic for HTTP and gRPC if you use the OTel auto-instrumentation; it is not automatic for Kafka, SQS, RabbitMQ, Redis pub/sub, or your custom RPC. For those, you must explicitly inject the trace_id into the message (typically as a
traceparentheader) and extract it on the consumer side viapropagate.extract(carrier). Many "trace stops at the queue" mysteries are missing carriers on the producer side of an asynchronous transport. - "If I sample traces, I should sample logs the same way." Sampling decisions belong to traces; logs typically are not sampled the same way. A 1%-sampled trace pipeline still emits 100% of error logs (because logs are independently valuable for grep). The trace_id in the log will sometimes point at a trace Tempo dropped — the click leads to a 404. Some teams accept this; some run an "always-keep" rule for traces that have any ERROR-level log within their duration window, using Tempo's
metricsGeneratoror a tail-sampler with a log-tag rule. The asymmetry is a known tradeoff. - "trace_id in logs makes logs replaceable by traces." Spans are bounded — they record what the OTel SDK was told to record (durations, attributes, errors). Logs are unbounded — you can log any string from anywhere in your code. Logs catch what the trace did not (a startup error, a config-load decision, a feature-flag evaluation, a cron-job result). The two pillars are complementary; the trace_id makes them jointly queryable, not interchangeable.
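The asynchronous-transport point is easiest to see in code. A minimal sketch against the OTel propagate API — the send call and the message object are hypothetical stand-ins for your Kafka or SQS client, not a real client API:
# manual context propagation across a message queue
from opentelemetry import propagate, trace

tracer = trace.get_tracer(__name__)

def publish(producer, topic, payload):
    carrier = {}
    propagate.inject(carrier)                          # writes traceparent into the dict
    producer.send(topic, payload, headers=carrier)     # hypothetical: attach as message headers

def consume(message):
    ctx = propagate.extract(dict(message.headers))     # rebuild the producer's context
    with tracer.start_as_current_span("process-message", context=ctx):
        ...  # spans and log lines emitted here now carry the producer's trace_id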
Going deeper
The W3C traceparent spec and why the format matters
The trace_id you put in your logs must match the trace_id propagated across services. The W3C Trace Context specification defines traceparent as a four-part ASCII string: <version>-<trace_id>-<parent_span_id>-<flags>. The trace_id is exactly 32 lowercase hex characters (16 bytes / 128 bits). Formats that predate the W3C spec commonly used 64-bit trace_ids (B3 / Zipkin); some legacy systems still emit those. If your service receives a 16-character hex trace_id (B3 format) but logs a 32-character one (after OTel pads it with leading zeros), Loki queries for the original B3 format will not find the padded version. The fix is to normalize at ingest time — either always-pad B3 trace_ids to 32 hex on emission, or always-strip leading zeros at query time. The OTel SDK can be configured to accept and emit B3 headers via the propagators configuration, but the safer modern choice is W3C-only and a one-time migration of any B3-emitting clients.
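A sketch of that normalization — a hypothetical helper that pads legacy 64-bit B3 identifiers to the 32-hex W3C form before they are logged or queried:
def normalize_trace_id(raw: str) -> str:
    """Pad legacy 16-hex (64-bit) B3 trace ids to the 32-hex W3C form."""
    tid = raw.strip().lower()
    if len(tid) == 16:
        tid = tid.zfill(32)        # left-pad with zeros to 128 bits
    if len(tid) != 32:
        raise ValueError(f"unexpected trace_id length: {raw!r}")
    return tid

# normalize_trace_id("9f4e2d8a73c10b1c") == "00000000000000009f4e2d8a73c10b1c"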
Async context and the contextvars trap
Python's contextvars.ContextVar propagates correctly across async/await boundaries — every awaited coroutine inherits the parent's context. It does not propagate across thread-pool executors (run_in_executor) without explicit copy. A common bug: a Flask/FastAPI handler starts a span, calls await loop.run_in_executor(pool, slow_db_call), the executor thread runs logger.info(...), and the log emits with no trace_id because the executor thread had no context. The fix is to pass contextvars.copy_context() into the executor, or to use asyncio.to_thread() (Python 3.9+), which copies context automatically. Equivalent traps exist in Java's CompletableFuture.runAsync (use MDC.getCopyOfContextMap() then MDC.setContextMap() in the executor) and in Go's context.Context propagation through goroutines (always pass the context explicitly).
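A minimal asyncio sketch of the executor trap and both fixes; slow_db_call and the logger name are placeholders:
import asyncio, contextvars, logging

log = logging.getLogger("checkout")

def slow_db_call():
    log.info("db_lookup")   # carries a trace_id only if the caller's context reached this thread

async def handler(loop, pool):
    # broken: the executor thread runs with an empty context, so the log has no trace_id
    await loop.run_in_executor(pool, slow_db_call)
    # fixed: copy the current context and run the call inside it
    ctx = contextvars.copy_context()
    await loop.run_in_executor(pool, ctx.run, slow_db_call)
    # fixed (Python 3.9+): asyncio.to_thread copies the context automatically
    await asyncio.to_thread(slow_db_call)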
Production wiring at Razorpay-scale, hypothetically
At a hypothetical 38,000 RPS payments service, the trace-id-in-log machinery is operating on roughly 200,000 log lines per second across 80 services. The cost decomposes as: ~32 bytes per line for the trace_id field (6.4 MB/s raw, 3.2 MB/s after gzip ≈ 280 GB/day across the fleet); ~0% additional CPU for the OTel bridge (the field is a thread-local lookup, ~50ns); ~0.5% additional Loki storage (trace_id is a content field, gzip-compresses well because of repetition within a request). The operational benefit is harder to quantify but well-documented: time-to-root-cause for a triage-ladder incident drops from a typical 25–45 minutes (timestamp-correlation forensics) to 90–180 seconds (one click). For an on-call team handling 10–15 actionable incidents per week, the savings are 4–8 engineer-hours per week — far more than the storage cost. This is the rare observability primitive whose ROI is overwhelmingly obvious; the only reason teams fail to ship it is the wiring fragility documented above.
Alternatives to trace_id — request_id, correlation_id, and why they are weaker
Some teams roll their own request_id or correlation_id and propagate it through HTTP headers. This works for a single-service or a small mesh but breaks at scale: every team writes their own header name (X-Request-Id, X-Correlation-ID, X-Trace, ad infinitum), every team writes their own propagation logic, and the moment a request crosses a team boundary the propagation drops. The W3C traceparent exists specifically to standardise this, and the OTel SDK propagates it through every OTel-instrumented transport. A request_id is a useful secondary identifier (the customer-facing ticket reference, the order ID, the idempotency key) but should never be the primary cross-service correlation key. Use the trace_id for that and the request_id for business-level grep ("what happened to order 88991").
Sampling and the broken-link problem
If your trace pipeline samples — say, head-based at 1% — then 99 out of 100 log lines that carry a trace_id point at a trace Tempo never received. The click resolves to a 404. The fix space has three shapes: (1) keep all logs always (the default — logs survive sampling, and you accept that most happy-path trace_id links point at dropped traces); (2) tail-sample traces with an "always keep errored traces" rule, using the OTel collector's tail_sampling processor — a status_code policy on ERROR, or a string_attribute policy if the error is recorded as a span attribute; note the processor decides on span data, not on your log stream (sketched below); (3) keep error logs as exemplar candidates and ensure the sampler retains traces that any error log refers to (more complex; requires the logging pipeline and sampling pipeline to share state, typically via a shared collector). Production teams at scale generally pick (2) — the 1% sample for happy-path traces, 100% retention for traces that errored, and accept that some happy-path log lines have orphan trace_ids. This is documented in /wiki/log-sampling-head-based-tail-based and /wiki/trace-sampling-head-tail-adaptive.
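A sketch of option (2) as an OTel collector processor fragment — this variant uses the status_code policy (keep every trace containing an error span) alongside a 1% probabilistic policy; the policy names and values are illustrative:
# otel-collector.yaml (fragment) — tail sampling that keeps errored traces
processors:
  tail_sampling:
    decision_wait: 10s                 # how long to buffer a trace before deciding
    policies:
      - name: keep-errored-traces
        type: status_code
        status_code: {status_codes: [ERROR]}
      - name: sample-happy-path
        type: probabilistic
        probabilistic: {sampling_percentage: 1}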
Cross-team contracts — the trace_id field is API-shaped
In an 80-microservice mesh built by 14 product teams (which is roughly Hotstar / Swiggy / Flipkart team-count territory), the field name trace_id is itself an API contract. If team A emits trace_id, team B emits traceId, team C emits traceID, and team D emits otelTraceID, then the Grafana derivedField regex needs four entries to catch them all, the LogQL | json | trace_id="..." filter only matches a quarter of logs, and any cross-team incident triage requires the SRE to know which team uses which spelling. The fix is to publish the field name as a platform-team contract — typically trace_id (snake_case, matches the W3C spec's lowercase convention) — and ship it as a default in the shared logging library every team imports. The platform team owns the field, the product teams import the library, and the field name is consistent everywhere. This is the same discipline as standardising service.name as the OTel resource attribute — a small contract that pays off enormously when stitched across the fleet.
Where this leads next
Log-to-trace correlation is one edge of the broader cross-pillar correlation graph. The metric-to-trace edge is covered in /wiki/exemplars-metrics-traces — exemplars are the metric-side equivalent of trace_id-in-logs. The trace-to-metric edge (drilling from a span to its histogram contribution) is covered in /wiki/drill-down-and-correlation. Together with this article, those three describe the full undirected navigation graph that "single pane of glass" actually means in practice.
The propagation mechanism this article assumed — the trace_id arriving on the request and being available in the active span — is itself a system worth understanding. The W3C traceparent format, the OTel propagator chain, the per-transport extractors and injectors, and the cross-team interoperability story are covered in /wiki/b3-w3c-trace-context and /wiki/span-trace-context-the-data-model.
For the broader log-pillar story — how Loki indexes logs, why high-label-cardinality kills its query performance, how structured JSON differs from logfmt and plain text in production — see /wiki/structured-vs-unstructured-logging and /wiki/log-backends-elasticsearch-loki-clickhouse. The cross-curriculum thread is that every observability primitive worth its operational cost is a sparse pointer attached to an aggregate — exemplars to histograms, trace_ids to logs, hash-pointers to content-addressed storage — and the design discipline is the same in each case.
# Reproduce this on your laptop
docker run -d --name loki -p 3100:3100 grafana/loki:latest
docker run -d --name tempo -p 3200:3200 -p 4318:4318 grafana/tempo:latest   # Tempo usually needs a config file mounted; see the Tempo quick-start if the container exits
docker run -d --name grafana -p 3000:3000 grafana/grafana:latest
python3 -m venv .venv && source .venv/bin/activate
pip install flask opentelemetry-api opentelemetry-sdk \
opentelemetry-exporter-otlp opentelemetry-instrumentation-flask \
opentelemetry-instrumentation-logging python-logging-loki
python3 log_trace_bridge.py &
for i in $(seq 1 200); do curl -s http://localhost:8080/checkout/$i > /dev/null; done
# ship the app's stdout into Loki (e.g. promtail tailing a log file, or a
# python-logging-loki handler), configure the Loki and Tempo data sources
# in Grafana with the YAML above, then explore:
#   {service="checkout-api"} | json   — click any trace_id
References
- W3C Trace Context specification — the canonical definition of traceparent and tracestate; the format every OTel-compatible service must speak.
- OpenTelemetry — Logs Bridge API and Logging Instrumentation — defines the bridge approach this article uses; covers both the pure-OTel logs path and the trace_id injection path.
- Grafana — Loki derived fields and Tempo tracesToLogsV2 — the configuration that makes trace_ids in logs clickable; the canonical wiring reference.
- Charity Majors et al., Observability Engineering, Chapter 9 — the case for cross-pillar correlation as the foundation of debuggability; trace_id-in-log is one of its primary primitives.
- Cindy Sridharan, Distributed Systems Observability (O'Reilly, 2018) — the foundational text on correlation and debuggability across pillars.
- Python contextvars (PEP 567) — the async-safe context primitive that underlies modern Python trace-id propagation.
- /wiki/exemplars-metrics-traces — internal: the metric-side equivalent of this article's mechanism.
- /wiki/drill-down-and-correlation — internal: the navigation discipline this article's wiring enables.