Configuring Accurate RED Metrics Independent of Sampling with OpenTelemetry Collector spanmetrics
If your dashboard reports a p99 latency of 200 ms yet user complaint tickets keep pouring in, your sampling configuration may be the culprit. In a 10% tail sampling environment where the error rate appears to be 2%, what is the actual error rate? There is no way to know precisely: sampled traces represent only a fraction of total traffic, and metrics computed from that fraction distort reality.
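To make the distortion concrete, here is a toy calculation (all numbers are invented for illustration). With a 2% true error rate and a tail-sampling policy like the one configured later in this article — keep every error trace plus a 10% probabilistic sample of the rest — the error rate visible in the stored traces inflates to roughly 17%:

```python
# Toy numbers: 100,000 requests, 2% true error rate.
total = 100_000
errors = int(total * 0.02)   # 2,000 error traces
oks = total - errors         # 98,000 successful traces

# Tail sampling with an error policy keeps every error trace,
# plus a 10% probabilistic sample of the remaining traces.
stored_errors = errors
stored_oks = int(oks * 0.10)  # 9,800

true_rate = errors / total
observed = stored_errors / (stored_errors + stored_oks)
print(f"true error rate: {true_rate:.1%}")                 # → 2.0%
print(f"error rate in stored traces: {observed:.1%}")
```

The same effect runs in the other direction for a plain probabilistic policy: any metric derived from the sampled subset answers questions about the subset, not about your traffic.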
The OpenTelemetry Collector's spanmetrics connector consumes every span before sampling is applied and generates RED (Rate, Errors, Duration) metrics. Even if only 10% of traces are stored for cost savings, metrics are measured accurately based on 100% of traffic. This article walks through the complete pipeline — step by step — for building an always-accurate RED dashboard in Prometheus + Grafana using a combination of the spanmetrics connector and forward connector, regardless of sampling rate.
This article targets readers who are already using the OpenTelemetry Collector in their pipeline and have a Prometheus and Grafana environment in place. If you are new to the OTel Collector, it is recommended to read the official Getting Started documentation first. Cardinality explosion prevention, PromQL query writing, and common mistakes in production are also covered.
Core Concepts
OTel Collector Component Structure
The OTel Collector is composed of four types of components. Understanding the overall structure makes pipeline configuration read naturally.
```
[Receiver] → [Processor] → [Exporter]
                 ↕
            [Connector] ← connects two pipelines
                 ↕
[Receiver] → [Processor] → [Exporter]
```

| Component | Role |
|---|---|
| Receiver | Receives data from external sources (OTLP, Jaeger, etc.) |
| Processor | Transforms and filters data (transform, tail_sampling, etc.) |
| Exporter | Sends data to external systems (Prometheus, Tempo, etc.) |
| Connector | Acts as an exporter for one pipeline and a receiver for another |
The service.pipelines block combines these components to define the data flow.
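As a minimal sketch of that wiring (the `batch` processor and the component names here are illustrative placeholders, not this article's final configuration), `service.pipelines` references components by the names declared in the other top-level blocks:

```yaml
receivers:
  otlp:
    protocols:
      grpc:

processors:
  batch:

exporters:
  otlp/backend:
    endpoint: tempo:4317

service:
  pipelines:
    traces:
      receivers: [otlp]          # names declared above are referenced here
      processors: [batch]
      exporters: [otlp/backend]
```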
The Dual Role of the spanmetrics Connector
A connector is a component that acts as both an exporter for one pipeline and a receiver for another. It serves as a bridge connecting two pipelines.
The spanmetrics connector uses this dual role to convert trace data into metrics.
```
[traces pipeline]
otlp receiver → spanmetrics connector (exporter role)
                        ↓ consumes spans → generates RED metrics
[metrics pipeline]
spanmetrics connector (receiver role) → prometheus exporter
```

In the configuration file, `connectors.spanmetrics.namespace: traces` prepends a `traces_` prefix to metric names, and `exporters.prometheus.namespace: ""` is left empty so the Prometheus exporter does not add a prefix of its own. The resulting final metric names are as follows.
| Metric Name | Type | Purpose |
|---|---|---|
| `traces_spanmetrics_calls_total` | Counter | Rate and Error calculation |
| `traces_spanmetrics_duration_milliseconds_bucket` | Histogram | Duration (p99, etc.) calculation |
| `target_info` | Gauge | Resource attribute metadata |
What Is tail sampling?
Tail sampling is an approach where the decision to keep a trace is made only after the entire trace is complete. Unlike head sampling (decided immediately at ingestion time), it can selectively preserve traces that contain errors or are slow. The cost is that spans are buffered in memory for the `decision_wait` period, which increases memory usage.
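That buffering cost can be estimated with back-of-envelope arithmetic. The span rate and per-span size below are assumptions for illustration, not OTel internals — measure your own deployment before capacity planning:

```python
# Rough tail-sampling buffer estimate: spans held ≈ span rate × decision_wait.
spans_per_second = 5_000   # assumption: cluster-wide span rate
decision_wait_s = 10       # matches decision_wait: 10s used later in this article
avg_span_bytes = 1_024     # assumption: ~1 KiB per buffered span

buffered_spans = spans_per_second * decision_wait_s
buffer_mib = buffered_spans * avg_span_bytes / (1024 ** 2)
print(f"{buffered_spans:,} spans buffered ≈ {buffer_mib:.0f} MiB")
```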
The Relationship Between Sampling and Metrics: Pipeline Order Is Critical
```
[Wrong order]
otlp → tail_sampling(10%) → spanmetrics → prometheus
            ↑ 90% discarded here → metrics are generated from only 10%

[Correct order]
otlp → spanmetrics → metrics generated from 100%
     ↘ forward → tail_sampling(10%) → trace backend
```

Core principle: the spanmetrics connector must be positioned before the sampling processor. Because tail sampling is a stateful processor, branching within a single pipeline is not possible; using a `forward` connector to split the flow into two pipelines enforces this order explicitly.
Trade-off Summary Before Adoption
| Item | Details |
|---|---|
| What you gain | Accurate RED metrics independent of sampling rate |
| What you lose | Increased memory usage, higher horizontal scaling complexity |
| Prerequisites | span name normalization (cardinality management), single Collector instance or Gateway architecture |
Practical Application
Now that the concepts are clear, let's walk through the actual configuration step by step.
Configuration Step 1: Full OTel Collector Pipeline Configuration
Below is a complete configuration that uses a forward connector to generate metrics from 100% of spans, then applies tail sampling in a separate pipeline. It includes the final integrated configuration with span name normalization via the transform processor.
This example assumes a Docker Compose environment; `tempo:4317` uses the service name of the Tempo container.
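For context, a minimal Compose file for this setup might look like the sketch below (image tags, ports, and file paths are illustrative assumptions, not a tested manifest):

```yaml
# docker-compose.yml (minimal sketch)
services:
  otel-collector:
    image: otel/opentelemetry-collector-contrib:latest
    command: ["--config=/etc/otel-collector-config.yaml"]
    volumes:
      - ./otel-collector-config.yaml:/etc/otel-collector-config.yaml
    ports:
      - "4317:4317"   # OTLP gRPC in
      - "8889:8889"   # Prometheus scrape endpoint
  tempo:
    image: grafana/tempo:latest
  prometheus:
    image: prom/prometheus:latest
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
```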
```yaml
# otel-collector-config.yaml
connectors:
  spanmetrics:
    histogram:
      explicit:
        # Specified as Go duration strings
        buckets: ["5ms", "10ms", "25ms", "50ms", "100ms", "250ms", "500ms", "1s", "2.5s", "5s", "10s"]
    dimensions:
      - name: http.method
      - name: http.status_code
      - name: http.route
      - name: service.name
    metrics_flush_interval: 15s
    # This namespace prepends the "traces_" prefix to metric names.
    # Final metric name: traces_spanmetrics_calls_total
    namespace: traces
  forward: {} # forwards traces to the backend after tail sampling

exporters:
  prometheus:
    endpoint: "0.0.0.0:8889"
    # Left empty so the prometheus exporter adds no extra prefix
    # (no clash with the spanmetrics namespace "traces")
    namespace: ""
  otlp/backend:
    endpoint: tempo:4317 # Docker Compose service name

processors:
  # Normalizes variable paths in span names to prevent cardinality explosion.
  # Must run before spanmetrics.
  transform:
    trace_statements:
      - context: span
        statements:
          - replace_pattern(name, "^/users/\\d+", "/users/{id}")
          - replace_pattern(name, "^/orders/[a-f0-9\\-]{36}", "/orders/{id}")
  tail_sampling:
    decision_wait: 10s
    policies:
      - name: errors-policy
        type: status_code
        status_code: { status_codes: [ERROR] }
      - name: slow-traces-policy
        type: latency
        latency: { threshold_ms: 1000 }
      - name: probabilistic-policy
        type: probabilistic
        probabilistic: { sampling_percentage: 10 }

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317

service:
  pipelines:
    # Step 1: receive all spans → normalize span names → generate metrics + forward
    traces/all:
      receivers: [otlp]
      processors: [transform] # runs before spanmetrics
      exporters: [spanmetrics, forward]
    # Step 2: receive from forward, tail-sample, send to the backend
    traces/sampled:
      receivers: [forward]
      processors: [tail_sampling]
      exporters: [otlp/backend]
    # Metrics pipeline: metrics generated by spanmetrics → Prometheus
    metrics:
      receivers: [spanmetrics]
      exporters: [prometheus]
```

The role of each configuration block is summarized below.
| Configuration Item | Role |
|---|---|
| `traces/all` | Receives all spans and triggers 100%-based metric generation |
| `transform` processor | Normalizes span names before spanmetrics → controls cardinality |
| `spanmetrics` connector | Consumes spans → converts them to RED metrics |
| `forward` connector | Forwards spans to the next pipeline after metric generation |
| `traces/sampled` | Reduces storage costs by applying tail sampling |
| `metrics` | Exposes the generated metrics to Prometheus |
Configuration Step 2: Prometheus Scrape Configuration
Once the OTel Collector exposes metrics on 0.0.0.0:8889, Prometheus collects them periodically.
```yaml
# prometheus.yml
scrape_configs:
  - job_name: 'otel-spanmetrics'
    scrape_interval: 15s
    static_configs:
      - targets: ['otel-collector:8889']
```

It is recommended to set `scrape_interval` equal to or longer than `metrics_flush_interval`. If the spanmetrics connector refreshes metrics every 15 seconds but Prometheus scrapes every 5 seconds, the same values will be collected redundantly.
Configuration Step 3: Grafana RED Dashboard PromQL Queries
The following three queries are the core of the RED dashboard.
The span_kind="SPAN_KIND_SERVER" filter in the Rate and Error Rate queries aggregates only requests directly handled by the server. When service A calls service B, A generates a CLIENT span and B generates a SERVER span simultaneously. Without the filter, the same request would be counted twice.
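A toy count (spans invented for illustration) shows why the filter matters for the A → B call described above:

```python
# One user request: user → A, then A calls B.
# The A → B hop emits two spans for the same logical request:
# a CLIENT span on A and a SERVER span on B.
spans = [
    {"service": "a", "span_kind": "SPAN_KIND_SERVER"},  # user → A (A handles it)
    {"service": "a", "span_kind": "SPAN_KIND_CLIENT"},  # A → B, caller side
    {"service": "b", "span_kind": "SPAN_KIND_SERVER"},  # A → B, callee side
]

naive_count = len(spans)  # counts the A → B call twice
server_count = sum(s["span_kind"] == "SPAN_KIND_SERVER" for s in spans)

print(naive_count, server_count)  # → 3 2
```

Filtering to SERVER spans counts each request once, at the service that handled it.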
```promql
# Rate: requests per second by service (SERVER spans only to prevent double-counting)
sum(
  rate(traces_spanmetrics_calls_total{span_kind="SPAN_KIND_SERVER"}[5m])
) by (service_name)
```

```promql
# Error Rate: error percentage by service (%)
# Numerator: SERVER spans with error status; denominator: all SERVER spans,
# so the result reads as errors relative to total requests handled
sum(
  rate(traces_spanmetrics_calls_total{status_code="STATUS_CODE_ERROR",span_kind="SPAN_KIND_SERVER"}[5m])
) by (service_name)
/
sum(
  rate(traces_spanmetrics_calls_total{span_kind="SPAN_KIND_SERVER"}[5m])
) by (service_name)
* 100
```

```promql
# Duration: p99 latency by service (ms)
histogram_quantile(
  0.99,
  sum(
    rate(traces_spanmetrics_duration_milliseconds_bucket[5m])
  ) by (service_name, le)
)
```

Pros and Cons Analysis
Advantages
| Item | Details |
|---|---|
| Sampling-independent | Metrics generated from 100% of spans before sampling — full numerical accuracy guaranteed |
| Vendor-neutral | Fully compatible with open-source stacks such as Prometheus and Grafana |
| No additional instrumentation required | RED metrics generated automatically without modifying existing trace instrumentation code |
| Fine-grained dimension configuration | dimensions allows explicit control over Prometheus label combinations |
| Histogram bucket customization | Precise latency analysis by defining buckets aligned with SLO criteria |
Disadvantages and Caveats
| Item | Details | Mitigation |
|---|---|---|
| Cardinality explosion | Time series count explodes when span names contain variable values | Normalize span names with the transform processor |
| Stateful component | Spans for the same service.name must reach the same Collector instance to keep aggregation accurate | Route via a two-tier Agent → Gateway architecture |
| Scaling complexity | Horizontal scaling requires a two-tier Collector deployment | Use the loadbalancing exporter + Gateway Collector |
| Memory usage | Aggregation state + tail sampling buffer → memory pressure under high traffic | Shorten metrics_flush_interval, remove unnecessary dimensions |
| Flush delay | Metrics lag by several seconds depending on the metrics_flush_interval setting | Lower the interval when near-real-time visibility is required |
Memory caution when using tail sampling concurrently: tail sampling buffers spans in memory for the duration of `decision_wait`. Running it alongside spanmetrics means aggregation state and trace buffers share memory, which can significantly increase usage.
Most Common Mistakes in Production
1. Placing spanmetrics after tail_sampling — metrics end up being generated from sampled traffic, which greatly reduces accuracy. Export to spanmetrics in the `traces/all` pipeline, which contains no sampling processor.
2. Omitting span name normalization — this frequently occurs in services whose REST API paths embed resource IDs. Use the `transform` processor to replace variable paths with patterns before going to production, and register `transform` in the `traces/all` pipeline so it runs before the spanmetrics exporter.
3. Overlooking state distribution when horizontally scaling the Collector — if spans from the same service are distributed across different Collector instances, aggregation becomes fragmented. Configure hash-based routing by `service.name` with the loadbalancing exporter so that spans from the same service are always delivered to the same Gateway Collector.
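The hash-based routing mentioned above might look like the following loadbalancing exporter sketch on the Agent tier (hostnames are illustrative; `routing_key: service` routes spans by service name):

```yaml
exporters:
  loadbalancing:
    routing_key: "service"   # hash by service name, not trace ID
    protocol:
      otlp:
        tls:
          insecure: true     # assumption: plaintext inside the cluster
    resolver:
      static:
        hostnames:
          - gateway-1:4317   # illustrative Gateway Collector addresses
          - gateway-2:4317
```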
Closing
By combining the spanmetrics connector and the forward connector, you can always collect accurate RED metrics based on 100% of traffic in Prometheus and visualize them in Grafana, regardless of sampling rate.
Three steps you can start with right now:
1. Install OTel Collector Contrib and write the configuration: using the `otel/opentelemetry-collector-contrib:latest` image (e.g. via `docker run`), configure the `spanmetrics`, `forward`, and `tail_sampling` blocks based on the `otel-collector-config.yaml` in this article. After applying the configuration, verify that metrics are flowing with `curl localhost:8889/metrics | grep traces_spanmetrics`.
2. Add the Prometheus scrape configuration: add `job_name: 'otel-spanmetrics'` and `targets: ['otel-collector:8889']` to `prometheus.yml`, restart Prometheus, and confirm that the `traces_spanmetrics_calls_total` metric is being collected.
3. Add Grafana dashboard panels: adding the three Rate, Error Rate, and Duration PromQL queries from this article as individual Time series panels completes the basic RED dashboard. From there, you can drill down by endpoint by adding the `http.route` dimension to the `by` clause (`by (service_name, http_route)`).
This configuration is sufficient while operating the Collector as a single instance. When the time comes that traffic growth requires horizontal scaling, we will look at how to maintain tail sampling aggregation accuracy by introducing the two-tier Agent-Gateway architecture and loadbalancingexporter.
Next article: the two-tier Agent-Gateway architecture needed when horizontally scaling the Collector, and how to maintain tail sampling aggregation accuracy with the loadbalancing exporter.
References
- spanmetrics connector README | opentelemetry-collector-contrib
- How to Use the Span Metrics Connector to Generate RED Metrics from Trace Data | OneUptime
- How to Build a Grafana RED Metrics Dashboard from OpenTelemetry Span Metrics | OneUptime
- Convert OpenTelemetry Traces to Metrics with SpanMetrics | Last9
- otelcol.connector.spanmetrics | Grafana Alloy documentation
- Tail sampling | Grafana OpenTelemetry documentation
- Scale Alloy tail sampling | Grafana OpenTelemetry documentation
- Span Metrics connector | Splunk Observability Cloud
- Metrics from traces | Grafana Tempo documentation
- Converting Traces to Metrics Using OTel Collector for Grafana Dashboards | nsalexamy
- Connectors | Red Hat build of OpenTelemetry 3.8