Troubleshoot out-of-memory errors

Learn about out-of-memory (OOM) issues and how to troubleshoot them.

Set the max attribute size to help control out of memory errors

Tempo queriers can run out of memory when fetching traces that have spans with very large attributes. This issue has been observed when trying to fetch a single trace using the tracebyID endpoint.

To avoid these out-of-memory crashes, use max_span_attr_byte to limit the maximum allowable size of any individual attribute. Any key or values that exceed the configured limit are truncated before storing.

Use the tempo_distributor_attributes_truncated_total metric to track how many attributes are truncated.

   # Optional
    # Configures the max size an attribute can be. Any key or value that exceeds this limit will be truncated before storing
    # Setting this parameter to '0' would disable this check against attribute size
    [max_span_attr_byte: <int> | default = '2048']

Refer to the configuration for distributors documentation for more information.

Max trace size

Traces which are long-running (minutes or hours) or large (100K - 1M spans) will spike the memory usage of each component when it is encountered. This is because Tempo treats traces as single units, and keeps all data for a trace together to enable features like structural queries and analysis.

When reading a large trace, it can spike the memory usage of the read components:

query-frontend
querier
ingester
metrics-generator

When writing a large trace, it can spike the memory usage of the write components:

ingester
compactor
metrics-generator

Start with a smaller trace size limit of 15MB, and increase it as needed. With an average span size of 300 bytes, this allows for 50K spans per trace.

Always ensure that the limit is configured, and the largest recommended limit is 60 MB.

Configure the limit in the per-tenant overrides:

overrides:
    'tenant123':
        max_bytes_per_trace: 1.5e+07

Refer to the [Overrides](# https://grafana.com/docs/tempo/<TEMPO_VERSION>/configuration/#standard-overrides) documentation for more information.

Large attributes

Very large attributes, 10KB or longer, can spike the memory usage of each component when they are encountered. Tempo’s Parquet format uses dictionary-encoded columns, which works well for repeated values. However, for very large and high cardinality attributes, this can require a large amount of memory.

A common source of large attributes is auto-instrumentation in these areas:

HTTP
- Request or response bodies
- Large headers
  - http.request.header.<key>
- Large URLs
  - http.url
  - url.full
Databases
- Full query statements
- db.statement
- db.query.text
Queues
- Message bodies

When reading these attributes, they can spike the memory usage of the read components:

query-frontend
querier
ingester
metrics-generator

When writing these attributes, they can spike the memory usage of the write components:

ingester
compactor
metrics-generator

You can automatically limit attribute sizes using [max_span_attr_byte]((https://grafana.com/docs/tempo/<TEMPO_VERSION>/configuration/#set-max-attribute-size-to-help-control-out-of-memory-errors). You can also use these options: