Controlling Span Noise and Cardinality Explosion with filterprocessor and transformprocessor in the OpenTelemetry Collector (OTel Collector)
When teams first discover that health check endpoint spans account for 30–40% of their total trace volume, it's often a jarring realization. In Kubernetes environments, probe requests to /health, /readiness, and /liveness fire every few seconds, quietly consuming storage capacity in Jaeger or Grafana Tempo. Add attributes carrying per-request unique values like user.id or request.id, and you get a cardinality explosion that creates millions of unique time series, slowing down dashboard queries and causing observability costs to skyrocket.
The OpenTelemetry Collector (OTel Collector) provides two powerful processors that can address both problems simultaneously. By using filterprocessor to permanently remove low-value spans at the front of the pipeline and transformprocessor to clean up high-cardinality attributes, you can control both the quality and cost of data reaching your backend.
After reading this article, you'll be able to write YAML pipeline configurations that combine both processors to eliminate span noise and reduce cardinality. For those new to OTel Collector, we'll also walk through a complete configuration file example including receivers and exporters.
Table of Contents
- Core Concepts
- Practical Application
- Example 1: Removing All Health Check and Internal Probe Spans
- Example 2: Filtering After Cleaning High-Cardinality Attributes (transform → filter Combination)
- Example 3: Controlling Metric Cardinality Explosion
- Example 4: Controlling Span Metric Cardinality via span.name Normalization
- Complete Configuration File Example
- Pros and Cons Analysis
- Closing Thoughts
- References
Core Concepts
OTel Collector Pipeline Structure
The OTel Collector flows telemetry data through a three-stage pipeline: Receive → Process → Export. Processors are the middle layer that transform data as it flows through, acting as the last checkpoint before data reaches backends like Jaeger, Prometheus, or Grafana Tempo.
```
[App Service] → Receiver(OTLP) → [filterprocessor] → [transformprocessor] → [batch] → Exporter
```

Processors operate at the span level. Each condition expression evaluates the attributes of individual spans, and spans that satisfy a condition are dropped or transformed. The order of processing is determined by the order of the processors array in the service.pipelines block, and this order directly affects processing results.
filterprocessor: Condition-Based Permanent Deletion
The filterprocessor evaluates OTTL (OpenTelemetry Transformation Language) condition expressions and immediately removes matching telemetry from the pipeline.
```yaml
processors:
  filter:
    error_mode: ignore
    traces:
      span:
        - attributes["http.route"] == "/health"
        - attributes["http.route"] == "/readiness"
```

What is `error_mode: ignore`? When a span lacks a specific attribute, an error can occur during condition evaluation. Setting it to `ignore` suppresses the error and allows the pipeline to continue. This is almost always recommended in production environments.
Understanding OR logic vs. AND combinations is important. Multiple conditions listed in the traces.span: array are evaluated with OR logic — if any one of the listed conditions is true, that span is dropped. In contrast, using the and keyword within a single condition means both conditions must be satisfied for the span to be dropped.
```yaml
traces:
  span:
    - attributes["http.route"] == "/health"     # Condition A
    - attributes["http.route"] == "/readiness"  # Condition B — dropped if A or B (OR)
    - name == "foo" and IsMatch(...)            # AND within one item — dropped only when both are true
```

transformprocessor: Attribute Transformation, Deletion, and Masking
The transformprocessor executes a list of OTTL statements in order, performing various transformations including adding, deleting, replacing, and truncating attributes. It also supports conditional execution via where clauses.
```yaml
processors:
  transform:
    error_mode: ignore
    trace_statements:
      - context: span
        statements:
          - delete_key(attributes, "user.id")
          - truncate_all(attributes, 256)
          - set(name, "normalized") where IsMatch(name, "^/api/v[0-9]+/.*")
```

What is OTTL (OpenTelemetry Transformation Language)? It's the query and transformation language shared by both processors, with an expression syntax similar to SQL. It provides built-in functions such as `IsMatch()`, `delete_key()`, `truncate_all()`, `set()`, and `Concat()`.
YAML structure differences between the two processors: The `filterprocessor` directly lists OTTL condition expressions in the `traces.span:` array, while the `transformprocessor` uses a `trace_statements[].context: span` + `statements:` structure. This is because filter only performs condition evaluation, while transform processes an execution context and multiple statements in sequence — hence the different YAML designs. If you're new to both processors, keep this structural difference in mind as it can be confusing at first.
What Is Cardinality?
Cardinality is the number of unique combinations that attribute values in spans or metrics can take. For example, if the http.route attribute contains actual user IDs like /users/12345 or /users/67890, a new time series is created for each unique ID. With one million users, that means one million time series.
| Cardinality Level | Example Attribute Value | Number of Time Series |
|---|---|---|
| Low Cardinality (recommended) | `http.method = "GET"` | A few (5–10) |
| Medium Cardinality | `http.route = "/api/v1/users/{id}"` | Tens to hundreds |
| High Cardinality (dangerous) | `user.id = "uuid-xxxx"` | Millions or more |
http.status_code is low cardinality because it only takes a fixed set of a few dozen values. The key to medium cardinality is staying at the route pattern level (/api/v1/users/{id}); once actual dynamic values (/api/v1/users/12345) start appearing, it quickly transitions to high cardinality.
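The multiplication effect behind the table can be sketched in a few lines of Python. This is a hypothetical illustration (the attribute values are made up): each distinct attribute combination becomes its own time series, so attaching a per-user value multiplies the count by the number of users.

```python
from itertools import product

methods = ["GET", "POST"]                                # low cardinality
routes = ["/api/v1/users/{id}", "/api/v1/orders/{id}"]   # medium: template level
user_ids = [f"user-{i}" for i in range(1000)]            # high: one value per user

# Keeping only method + templated route: a handful of series
low = {(m, r) for m, r in product(methods, routes)}

# Adding a raw user.id attribute multiplies the series count
high = {(m, r, u) for m, r, u in product(methods, routes, user_ids)}

print(len(low))   # 4
print(len(high))  # 4000 — and it grows with every new user
```

Dropping or templating a single attribute is enough to collapse the second set back to the first, which is exactly what the processors below do.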
Division of Roles Between the Two Processors
| Purpose | Tool | Example |
|---|---|---|
| Remove entire unnecessary spans | filterprocessor | Drop health check and internal probe spans |
| Clean up high-cardinality attributes | transformprocessor | Delete or mask user.id, session.id |
| Selectively remove high-cardinality data points | filterprocessor | Drop metrics for specific route patterns |
| Normalize span names | transformprocessor | Consolidate dynamic paths into method + route pattern |
Practical Application
Example 1: Removing All Health Check and Internal Probe Spans
Health check endpoint spans often occupy a significant portion of total trace volume while providing no practical value beyond confirming service availability.
```yaml
processors:
  filter:
    error_mode: ignore
    traces:
      span:
        - attributes["http.route"] == "/health"
        - attributes["http.route"] == "/readiness"
        - attributes["http.route"] == "/metrics"
        - attributes["http.route"] == "/ping"
  batch: {}

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [filter, batch]
      exporters: [otlp]
```

| Configuration Item | Role |
|---|---|
| `error_mode: ignore` | Keeps the pipeline running when evaluation errors occur on spans missing attributes |
| Condition list (OR logic) | Immediately drops any span that matches at least one of the listed conditions |
| `batch` position | Placed after filter so only the reduced dataset is batch-processed |
Example 2: Filtering After Cleaning High-Cardinality Attributes (transform → filter Combination)
This pattern cleans attributes first, then filters: the transformprocessor removes high-cardinality attributes, and the filterprocessor then drops any remaining unwanted spans.
```yaml
processors:
  transform:
    error_mode: ignore
    trace_statements:
      - context: span
        statements:
          # Delete high-cardinality personally identifiable attributes (doubles as PII protection)
          - delete_key(attributes, "user.id")
          - delete_key(attributes, "session.id")
          - delete_key(attributes, "request.id")
          # Truncate all attribute values to a maximum of 256 characters
          - truncate_all(attributes, 256)
  filter:
    error_mode: ignore
    traces:
      span:
        # Remove non-HTTP spans that have no http attribute at all
        # IsPresent() is more explicit than == nil (clearly distinguishes between a nil value and a missing attribute)
        - not(IsPresent(attributes["http.request.method"]))
  batch: {}

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [transform, filter, batch]  # Order matters: transform first
      exporters: [otlp]
```

Why does pipeline order matter? With `transform → filter` ordering, filter conditions are evaluated after transform has already cleaned the attributes. Reversing the order means filter conditions are evaluated against the original, uncleaned attributes, which can produce unintended results.

| Configuration Item | Role |
|---|---|
| `delete_key()` | Completely removes an attribute key-value pair |
| `truncate_all(attributes, 256)` | Limits all attribute values to 256 characters to prevent large payloads |
| `not(IsPresent(...))` | Explicitly checks whether an attribute is completely absent |
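The two transform statements have simple semantics that can be mimicked in plain Python. This is a rough sketch — real OTTL operates on the Collector's internal span representation, not a dict, and the attribute values here are invented:

```python
# Hypothetical stand-in for one span's attribute map
attrs = {
    "http.request.method": "GET",
    "user.id": "uuid-1234-abcd",
    "db.statement": "SELECT * FROM orders WHERE " + "x" * 500,
}

# delete_key(attributes, "user.id") — remove the key-value pair entirely
attrs.pop("user.id", None)

# truncate_all(attributes, 256) — cap every string value at 256 characters
attrs = {k: (v[:256] if isinstance(v, str) else v) for k, v in attrs.items()}

print("user.id" in attrs)          # False
print(len(attrs["db.statement"]))  # 256
```

The key point is that `delete_key` eliminates cardinality at the source, while `truncate_all` only bounds payload size; a truncated unique ID is still a unique value.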
Example 3: Controlling Metric Cardinality Explosion
This example selectively removes only data points where the http.route attribute contains a user ID pattern. It preserves the metrics themselves while dropping specific patterns that cause cardinality issues.
This example uses the and keyword to combine two conditions. Narrowing the scope to drop only when both a specific metric name and a route pattern match prevents unintended deletions.
```yaml
processors:
  filter:
    error_mode: ignore
    metrics:
      datapoint:
        - metric.name == "http.server.request.duration" and IsMatch(attributes["http.route"], ".*/users/[0-9]+.*")
        - metric.name == "http.server.request.duration" and IsMatch(attributes["http.route"], ".*/orders/[0-9]+.*")
  batch: {}

service:
  pipelines:
    metrics:
      receivers: [otlp]
      processors: [filter, batch]
      exporters: [prometheus]
```

| Configuration Item | Role |
|---|---|
| `metric.name == "..."` | Applies conditions only to specific metrics |
| `IsMatch(..., regex)` | Matches high-cardinality route patterns using regex |
| `and` combination | Drops only when both conditions are met (prevents unintended deletions) |
Example 4: Controlling Span Metric Cardinality via span.name Normalization
When span names contain dynamic values like /api/v1/users/12345, the cardinality of metrics derived from those spans explodes. Combining the HTTP method with a normalized route into a consistent name can dramatically reduce cardinality.
```yaml
processors:
  transform:
    error_mode: ignore
    trace_statements:
      - context: span
        statements:
          # Generate a low-cardinality span name by combining HTTP method and normalized route
          # Result example: "GET /api/v1/users/{id}"
          # http.route is already set in template format per OTel Semantic Conventions
          - set(name, Concat([attributes["http.request.method"], " ", attributes["http.route"]], "")) where IsMatch(name, "^/api/v[0-9]+/.*")
  batch: {}

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [transform, batch]
      exporters: [otlp]
```

In OTel Semantic Conventions, http.route is defined as a template form (/api/v1/users/{id}) rather than the actual path. Therefore, combining method and route produces a span name like GET /api/v1/users/{id} that distinguishes different endpoints while maintaining low cardinality. Be careful — simply replacing with just the method (GET) would cause different endpoints to share the same span name, which actually harms observability.
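To see the cardinality effect of normalization, here is a hypothetical Python sketch that collapses dynamic span names into method + template form. Note this regex replacement is an illustration only; the configuration above does not use a regex rewrite but relies on http.route already being templated per the semantic conventions:

```python
import re

# 1000 distinct raw span names carrying dynamic IDs (made-up data)
raw_names = [f"/api/v1/users/{i}" for i in range(500)] + \
            [f"/api/v1/orders/{i}" for i in range(500)]

def normalize(name: str, method: str = "GET") -> str:
    # Replace purely numeric path segments with a {id} placeholder
    route = re.sub(r"/\d+", "/{id}", name)
    return f"{method} {route}"

normalized = {normalize(n) for n in raw_names}
print(len(set(raw_names)))  # 1000 unique span names before
print(len(normalized))      # 2 after normalization
```

Two span names ("GET /api/v1/users/{id}" and "GET /api/v1/orders/{id}") still distinguish the endpoints, which is exactly the balance the article recommends: low cardinality without losing observability.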
Complete Configuration File Example
For those new to OTel Collector, here is a minimal complete configuration file example including receivers and exporters. For instructions on running the Collector with Docker or Helm, refer to the official Getting Started guide.
```yaml
# config.yaml — complete filter + transform combination example
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318

processors:
  filter:
    error_mode: ignore
    traces:
      span:
        - attributes["http.route"] == "/health"
        - attributes["http.route"] == "/readiness"
        - attributes["http.route"] == "/ping"
  transform:
    error_mode: ignore
    trace_statements:
      - context: span
        statements:
          - delete_key(attributes, "user.id")
          - delete_key(attributes, "session.id")
          - truncate_all(attributes, 256)
  batch:
    send_batch_size: 1000
    timeout: 5s

exporters:
  otlp:
    endpoint: "http://jaeger:4317"
    tls:
      insecure: true

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [filter, transform, batch]
      exporters: [otlp]
```

Pros and Cons Analysis
Advantages
| Item | Details |
|---|---|
| Cost reduction | Directly reduces storage and query costs by eliminating unnecessary spans and metrics |
| Improved query performance | Lower cardinality means faster aggregation queries |
| Backend-agnostic | Processing at the Collector layer avoids lock-in to any specific backend |
| Real-time application | Changes take effect immediately by updating Collector configuration, without redeploying the app |
| Security and compliance | Deletes PII attributes before they reach the backend, preventing data leakage |
Disadvantages and Caveats
| Item | Details | Mitigation |
|---|---|---|
| Orphaned telemetry risk | Dropping a parent span leaves child spans without context, destroying trace integrity | Drop parent and child spans together, or combine with tail-sampling processor |
| OTTL performance cost | Regex patterns are CPU-intensive | Load testing is essential in high-traffic environments; prefer static conditions |
| Irreversibility of drops | Dropped data cannot be recovered | Use debug exporter to inspect what would be dropped before applying rules incrementally |
| Order sensitivity with samplers | Placing filterprocessor after the tail-sampling processor destroys trace integrity | See note below |
Combination pattern with tail-sampling processor: Placing `filterprocessor` before the tail-sampling processor to remove noisy spans before making sampling decisions has become the standard pattern. The recommended order is `filter → tail_sampling → batch`. Since tail-sampling decides whether to sample an entire trace only after it has been fully collected, the more noise the filter removes upfront, the more meaningful the sampling decisions become.
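As a sketch, the pipeline block for that ordering would look like the following (assuming a `tail_sampling` processor is defined elsewhere in the `processors` section; this fragment is illustrative, not a complete configuration):

```yaml
service:
  pipelines:
    traces:
      receivers: [otlp]
      # Noise is dropped first, so sampling decisions see only meaningful spans
      processors: [filter, tail_sampling, batch]
      exporters: [otlp]
```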
Most Common Mistakes in Practice
- Omitting `error_mode: ignore` — Condition evaluation errors on spans that lack certain attributes can halt the entire pipeline. It is recommended to set this on all processors in production.
- Overusing regex conditions — `IsMatch()` has high CPU cost. If a static string comparison (`==`) can handle the case, use static conditions instead of regex.
- Applying drop conditions without sufficient validation — Overly broad conditions cause unexpected data loss, and deleted data is unrecoverable. It is recommended to first verify the targeted data with the debug exporter, then incrementally narrow conditions using the `and` keyword.
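One way to do that validation is to temporarily fan out the filtered pipeline to the debug exporter alongside the real backend, and compare what survives against expectations. This is a hedged sketch — the `verbosity` setting and exporter availability may vary by Collector version:

```yaml
exporters:
  debug:
    verbosity: detailed  # print surviving spans to the Collector's stdout

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [filter, batch]
      exporters: [otlp, debug]  # inspect filter results without touching the backend path
```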
Closing Thoughts
Once you apply the two-stage combination of using filterprocessor to remove low-value spans at the front of the pipeline and transformprocessor to clean up high-cardinality attributes, the data reaching your backend becomes smaller, more queryable, and more meaningful.
Three steps you can start with right now:
1. Identify the spans that make up the largest share of your current pipeline. Enable the `debug` exporter in OTel Collector, or aggregate top endpoints by span volume in your backend using `http.route` or `span.name` as the grouping dimension.
2. Write a `filterprocessor` configuration to remove health check and internal probe spans first. Modify the YAML from Example 1 above to match your actual endpoint paths, then place it first in the `processors` array.
3. Identify high-cardinality attributes and delete or normalize them with `transformprocessor`. Remove attributes that carry unique values — like `user.id`, `session.id`, and `request.id` — using `delete_key()`, or search for OTTL Playground to pre-validate your transformation conditions before deploying.

Next article: A two-stage pipeline design combining OTel Collector's tail-sampling processor and `filterprocessor` to dynamically control normal trace sampling rates while preserving 100% of error and latency traces.
References
Essential
- filterprocessor official README | opentelemetry-collector-contrib
- transformprocessor official README | opentelemetry-collector-contrib
- Transforming telemetry | OpenTelemetry official docs
Advanced