Metrics-generator in Grafana Cloud Traces
The Tempo metrics-generator can derive metrics from traces as they are ingested. When used in Grafana Cloud, the metrics-generator writes metrics directly to the hosted Prometheus instance in the same stack.
For more information about the metrics-generator and the metrics it creates, refer to the metrics-generator documentation.
Note
The metrics-generator is in active development. Currently, in very rare cases, data available for TraceQL search is not recorded in span or service graph metrics. Tempo is making architectural changes to give span, service graph, and TraceQL metrics the same durability guarantees as TraceQL search. We look forward to rolling this out as soon as possible!
Enable metrics-generator
Metrics-generation is disabled by default. You can enable it with the Application Observability defaults from within Application Observability, or contact Grafana Support to enable metrics-generation for your organization with custom settings.
Enabling metrics-generator using Application Observability has limitations.
By default, Application Observability configures the metrics-generator to only generate metrics for the SERVER and CONSUMER span kinds.
Application Observability can be further configured to also generate metrics for the CLIENT and PRODUCER span kinds. Refer to Include web applications and mobile devices for more details.
If you need to generate span metrics for the INTERNAL span kind, contact Support.
Constraints and good to know
- Active series sent to the hosted Prometheus instance are billed like regular metrics.
- Metrics can only be sent to a hosted Prometheus instance in the same region.
- If traces are down-sampled before reaching Tempo, the generated metrics only reflect the sampled traces and undercount the actual traffic.
- All generated metrics are aggregated by default.
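To illustrate the down-sampling constraint above, the following sketch scales a rate observed in generated metrics back up by the sampling probability. It assumes uniform probabilistic sampling, where every trace is kept with the same probability. That is an assumption for illustration only, and the function name `estimate_actual_rate` is hypothetical. Grafana Cloud Traces doesn't apply this correction for you.

```python
def estimate_actual_rate(observed_rate: float, sampling_probability: float) -> float:
    """Estimate the pre-sampling rate from a rate derived from sampled traces.

    Assumes each trace is kept independently with the same probability.
    Tail-based or rule-based sampling breaks this assumption.
    """
    if not 0.0 < sampling_probability <= 1.0:
        raise ValueError("sampling_probability must be in (0, 1]")
    return observed_rate / sampling_probability


# 50 requests/s seen in the generated metrics while keeping 1 in 4 traces
print(estimate_actual_rate(50.0, 0.25))  # 200.0
```

The estimate is only as good as the sampling assumption, so treat it as an approximation and not as a replacement for unsampled metrics.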
Aggregated metrics
Grafana Cloud uses Adaptive Metrics to aggregate away operational labels added by the open source Tempo metrics-generator. This reduces the number of time series produced by the metrics-generator, and therefore reduces the cost of enabling metrics generation for Grafana Cloud users.
In most cases, this aggregation should be completely unnoticeable to users.
If you require unaggregated metrics generated by Grafana Cloud Traces, contact Grafana Support for help removing the aggregation rules from Adaptive Metrics.
Monitor the metrics-generator
The grafanacloud-usage data source exposes several metrics about the metrics-generator.
Number of active series:
grafanacloud_traces_instance_metrics_generator_active_series{}
Number of series dropped per second because the active series limit was reached:
grafanacloud_traces_instance_metrics_generator_series_dropped_per_second{}
Number of spans discarded per second by the metrics-generator before the spans are processed:
grafanacloud_traces_instance_metrics_generator_discarded_spans_per_second
This metric has a reason label:
outside_metrics_ingestion_slack: The time between the creation of the span and when it was ingested was too large, so the span is deemed outdated. Processing this span and including it in a current metrics sample would skew the data.
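The slack check described above can be sketched as a simple time comparison. This is a minimal illustration, not Tempo's implementation. The function name, the `span_time` parameter, and the 30-second value are assumptions made for the example. The actual slack is part of the metrics-generator configuration and may differ.

```python
from datetime import datetime, timedelta, timezone

# Assumed value for illustration only; the configured slack may differ.
ASSUMED_SLACK = timedelta(seconds=30)


def is_outside_ingestion_slack(
    span_time: datetime,
    ingested_at: datetime,
    slack: timedelta = ASSUMED_SLACK,
) -> bool:
    """Return True when the span is too old to include in current samples."""
    return ingested_at - span_time > slack


now = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)

# A span ingested 10 seconds after it was created is processed.
print(is_outside_ingestion_slack(now - timedelta(seconds=10), now))  # False

# A span ingested 5 minutes late would be discarded with the
# outside_metrics_ingestion_slack reason.
print(is_outside_ingestion_slack(now - timedelta(minutes=5), now))  # True
```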
How this works
When the number of active series in Tempo reaches a configurable limit, no new active series are added. Grafana Cloud Traces keeps updating the existing series, and any new series that would exceed the limit are dropped.
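The limiting behavior described above can be sketched as a small in-memory registry. This is a simplified model under stated assumptions, not the metrics-generator's actual code. The `SeriesRegistry` class and its method names are hypothetical, and the sketch ignores series expiry.

```python
from typing import Dict, Tuple

SeriesKey = Tuple[str, Tuple[Tuple[str, str], ...]]


class SeriesRegistry:
    """Toy registry that stops admitting new series once a limit is reached."""

    def __init__(self, max_active_series: int) -> None:
        self.max_active_series = max_active_series
        self.series: Dict[SeriesKey, float] = {}
        # Counts rejected updates, not distinct rejected series.
        self.dropped = 0

    def increment(self, name: str, labels: Dict[str, str], value: float = 1.0) -> bool:
        """Add value to a series; return False if the update was dropped."""
        key: SeriesKey = (name, tuple(sorted(labels.items())))
        if key in self.series:
            # Existing series keep being updated, even at the limit.
            self.series[key] += value
            return True
        if len(self.series) >= self.max_active_series:
            # A new series would exceed the limit, so it is dropped.
            self.dropped += 1
            return False
        self.series[key] = value
        return True


registry = SeriesRegistry(max_active_series=2)
registry.increment("calls_total", {"service": "a"})
registry.increment("calls_total", {"service": "b"})
print(registry.increment("calls_total", {"service": "c"}))  # False
print(registry.increment("calls_total", {"service": "a"}))  # True
print(registry.dropped)  # 1
```

In this model, service "c" never gets a series while the limit is reached, but services "a" and "b" continue to receive updates.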
Configuration options
You can configure the following settings for metrics-generator in Grafana Cloud Traces. Contact Grafana Support to modify any of these settings.
Configuration | Description |
---|---|
Enabled processor | The metrics processors to enable; options include service graphs and/or span metrics. |
Max active series | The maximum number of active series. |
Collection interval | How often samples are collected from the active series. Defaults to every 60s, which is 1 data point per minute (DPM). |
Histogram buckets | The buckets used for the histograms generated by the metrics-generator. This can be configured per processor. |
Dimensions | Additional dimensions to add to the generated metrics. If a configured dimension is present in the span attributes, it’s included as a label in the metrics. This can be configured per processor. |
Note
Characters that aren’t valid in Prometheus label names are sanitized. For example, the trace attribute k8s.namespace becomes the Prometheus label k8s_namespace.
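The sanitization in the note above can be approximated with a regular expression. This is a minimal sketch, not Tempo's actual sanitization code. It assumes that every character outside letters, digits, and underscores is replaced with an underscore. The function name `sanitize_label_name` and the handling of a leading digit are illustrative assumptions.

```python
import re

# Prometheus label names may only contain letters, digits, and underscores.
_INVALID_LABEL_CHARS = re.compile(r"[^a-zA-Z0-9_]")


def sanitize_label_name(attribute: str) -> str:
    """Replace characters that aren't valid in a Prometheus label name."""
    sanitized = _INVALID_LABEL_CHARS.sub("_", attribute)
    # Label names can't start with a digit. Prefixing an underscore is one
    # possible convention, assumed here for illustration only.
    if sanitized and sanitized[0].isdigit():
        sanitized = "_" + sanitized
    return sanitized


print(sanitize_label_name("k8s.namespace"))  # k8s_namespace
```

When you query a custom dimension, use the sanitized form of the attribute name as the label.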