OTLP: OpenTelemetry Protocol format considerations
OpenTelemetry doesn’t specify a data storage approach and leaves it to the observability backend to implement. Grafana Cloud converts metrics, logs, and traces sent to the Grafana Cloud OTLP endpoint to be compatible with Grafana databases.
The Grafana Cloud OTLP endpoint supports OTLP/HTTP with binary Protocol Buffers encoding.
Metrics
Grafana Cloud stores metrics in Mimir, a Prometheus compatible database, and maps OpenTelemetry metrics following the official specification for OTLP Metric points to Prometheus.
Exponential histograms
By default, Grafana Cloud doesn’t accept OpenTelemetry exponential histograms.
To accept exponential histograms in Grafana Cloud, follow the send Prometheus native histograms documentation to enable Prometheus native histogram metrics for your environment.
When enabled, Grafana Cloud converts OpenTelemetry exponential histograms into Prometheus native histograms following the official specifications for exponential histograms.
Metric and label name conversion
Prometheus metric and label names need to adhere to the regular expression [a-zA-Z_:][a-zA-Z0-9_:]*, which allows alphanumeric, underscore, and colon characters. Grafana converts periods (.) and dashes (-) in OpenTelemetry metric or label names to underscores (_) to store them in Prometheus.
Example
If you send the following OTLP data:
requests.duration{http.response.status_code=500, http.route="/api/orders"}
Grafana converts it for Prometheus storage:
requests_duration{http_response_status_code="500", http_route="/api/orders"}
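The renaming in this example can be approximated with a short Python sketch. The function names to_prometheus_name and convert_labels are hypothetical, and the code covers only the two substitutions described here; it is an illustration, not Grafana Cloud's actual implementation.

```python
def to_prometheus_name(otel_name: str) -> str:
    """Replace periods and dashes with underscores, as described above."""
    return otel_name.replace(".", "_").replace("-", "_")


def convert_labels(attributes: dict) -> dict:
    """Rename every attribute key and render every value as a string."""
    return {to_prometheus_name(key): str(value) for key, value in attributes.items()}


metric = to_prometheus_name("requests.duration")
labels = convert_labels({"http.response.status_code": 500, "http.route": "/api/orders"})
print(metric, labels)
```

Prometheus label values are always strings, which is why the integer status code appears quoted after conversion.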
Resource attributes added to the target_info metric
As described in the Resource Attributes specification, Grafana Cloud adds resource attributes to a special target_info metric, with the exception of service.instance.id, service.name, and service.namespace.
The value <service.namespace>/<service.name> (or <service.name> if the namespace is empty) is added as the job label, and service.instance.id is added as the instance label, to every metric as well as to the target_info metric.
The job and instance labels are sometimes called identifying labels.
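As a rough illustration of how the identifying labels and the target_info labels relate to the resource attributes, consider the following Python sketch. The function names and the dictionary shape of the resource are assumptions made for this example, and edge cases such as a missing service.name are not handled.

```python
# Resource attributes that become job/instance instead of target_info labels.
IDENTIFYING_ATTRIBUTES = {"service.namespace", "service.name", "service.instance.id"}


def identifying_labels(resource: dict) -> dict:
    """Build the job and instance labels from the service.* resource attributes."""
    namespace = resource.get("service.namespace", "")
    name = resource.get("service.name", "")
    labels = {"job": f"{namespace}/{name}" if namespace else name}
    instance = resource.get("service.instance.id")
    if instance:
        labels["instance"] = instance
    return labels


def target_info_labels(resource: dict) -> dict:
    """Identifying labels plus every other resource attribute, renamed for Prometheus."""
    labels = identifying_labels(resource)
    for key, value in resource.items():
        if key not in IDENTIFYING_ATTRIBUTES:
            labels[key.replace(".", "_").replace("-", "_")] = str(value)
    return labels


resource = {
    "service.namespace": "shop",
    "service.name": "checkout",
    "service.instance.id": "pod-1",
    "service.version": "1.2.3",
}
print(identifying_labels(resource))
print(target_info_labels(resource))
```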
Use the on(job, instance) matching modifier to join any metric with the target_info metric and retrieve the resource attributes for that metric.
For example, to get the service.version attribute for a metric called requests_total, use the following PromQL:
requests_total * on(job, instance) group_left(service_version) target_info
Alternatively, you can promote (copy) service.instance.id, service.name, service.namespace, or any other resource attribute to metric labels, as shown, for example, in A practical guide to data collection with OpenTelemetry and Prometheus.
Suffixes added to metric names
By default, Prometheus adds suffixes to metric names to comply with the Prometheus naming conventions for metric types. Grafana Cloud follows the official specification for OTLP Metric points to Prometheus.
The suffix generation procedure is as follows:
- Split the OTel unit into two parts, before and after any / character.
- Add _<UNIT> to the suffix if the metric name doesn’t contain the first unit part.
- Add _per_<UNIT> to the suffix if there’s a second unit part and the metric name doesn’t contain it.
- Add _total to the suffix if the OTel metric type is monotonic sum.
- Add _ratio to the suffix if the OTel metric type is gauge and the unit is 1.
Note
Grafana Labs recommends you keep the default Prometheus metric name suffix generation enabled. If you need it disabled, contact support.
Example
Grafana Cloud converts the OpenTelemetry monotonic sum system.io with unit By into the Prometheus name system_io_bytes_total, where _bytes_total is the generated suffix.
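The suffix generation procedure can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the function name prometheus_name is hypothetical, the two unit tables are small illustrative subsets rather than the full mapping, and units missing from the tables are ignored here.

```python
# Illustrative subsets only: OTel unit abbreviation -> word used in the suffix.
UNIT_WORDS = {"By": "bytes", "s": "seconds", "ms": "milliseconds"}
PER_UNIT_WORDS = {"s": "second", "m": "minute", "h": "hour"}


def prometheus_name(name: str, unit: str, metric_type: str, monotonic: bool = False) -> str:
    """Return the Prometheus metric name, including the generated suffix."""
    base = name.replace(".", "_").replace("-", "_")
    # Split the unit into the parts before and after "/".
    first, _, second = unit.partition("/")
    suffix = ""
    first_word = UNIT_WORDS.get(first, "")  # unknown units are skipped in this sketch
    if first_word and first_word not in base:
        suffix += "_" + first_word
    second_word = PER_UNIT_WORDS.get(second, "")
    if second_word and second_word not in base:
        suffix += "_per_" + second_word
    if metric_type == "sum" and monotonic:
        suffix += "_total"
    if metric_type == "gauge" and unit == "1":
        suffix += "_ratio"
    return base + suffix


print(prometheus_name("system.io", "By", "sum", monotonic=True))
```

With the inputs from the example, the sketch yields system_io_bytes_total. A gauge with unit By/s, such as a hypothetical disk.throughput metric, would become disk_throughput_bytes_per_second.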
Metrics ingestion limits
Limit | Value | When you exceed the limit |
---|---|---|
The maximum length of a metric name. | 2048 bytes | Names that exceed the limit cause an ingestion exception. |
The maximum number of resource attributes. | 40 | Grafana Cloud drops attributes above the resource attribute limit. |
The maximum number of metric attributes. | 40 | Grafana Cloud drops attributes above the metric attribute limit. |
The maximum length for resource attribute and metric attribute names. | 2048 bytes | Names that exceed the limit cause an ingestion exception. |
The maximum length for resource attribute and metric attribute values. | 2048 bytes | Values that exceed the limit cause an ingestion exception. |
When you send batch metrics, Prometheus ingests only the metrics that don’t exceed limits and returns exceptions for the metrics that exceed limits.
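To catch limit violations before sending, a client-side check along these lines may help. It is only a sketch: the function name check_metric is hypothetical, the limits are copied from the table above, and measuring lengths as UTF-8 bytes is an assumption.

```python
MAX_NAME_BYTES = 2048   # metric names and attribute names
MAX_VALUE_BYTES = 2048  # attribute values
MAX_ATTRIBUTES = 40     # per resource and per metric


def byte_length(text: str) -> int:
    """Length in bytes, assuming UTF-8 encoding."""
    return len(text.encode("utf-8"))


def check_metric(name: str, resource_attrs: dict, metric_attrs: dict) -> list[str]:
    """Return a description of every limit from the table that this metric exceeds."""
    problems = []
    if byte_length(name) > MAX_NAME_BYTES:
        problems.append(f"metric name exceeds {MAX_NAME_BYTES} bytes")
    for kind, attrs in (("resource", resource_attrs), ("metric", metric_attrs)):
        if len(attrs) > MAX_ATTRIBUTES:
            problems.append(f"more than {MAX_ATTRIBUTES} {kind} attributes")
        for key, value in attrs.items():
            if byte_length(key) > MAX_NAME_BYTES:
                problems.append(f"{kind} attribute name exceeds {MAX_NAME_BYTES} bytes")
            if byte_length(str(value)) > MAX_VALUE_BYTES:
                problems.append(f"{kind} attribute value exceeds {MAX_VALUE_BYTES} bytes")
    return problems


print(check_metric("requests.duration", {"service.name": "checkout"}, {"http.route": "/api/orders"}))
```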
Logs
Grafana Cloud converts OpenTelemetry logs and stores them in the Loki V3 format to leverage structured metadata and labels.
The conversion process is as follows:
OpenTelemetry log record field | Loki field |
---|---|
Timestamp | timestamp |
ObservedTimestamp | metadata[observed_timestamp] |
TraceId | metadata[trace_id] |
SpanId | metadata[span_id] |
TraceFlags | metadata[flags] |
SeverityText | metadata[severity_text]; the detected_level label is also available |
SeverityNumber | metadata[severity_number] |
Body | The Loki log message, available as __line__ in LogQL functions such as line_format. |
InstrumentationScope | metadata[scope_name] |
Attributes | metadata[xyz], where xyz is the OpenTelemetry attribute name with periods replaced by underscores. For example, the OpenTelemetry attribute thread.name becomes the Loki metadata key thread_name. |
Resource | Loki promotes these resource attributes to Loki labels and persists others as Loki message metadata: cloud.availability_zone, cloud.region, container.name, deployment.environment, k8s.cluster.name, k8s.container.name, k8s.cronjob.name, k8s.daemonset.name, k8s.deployment.name, k8s.job.name, k8s.namespace.name, k8s.pod.name, k8s.replicaset.name, k8s.statefulset.name, service.instance.id, service.name, service.namespace. |
Note
To configure the resource attributes that Loki promotes as labels, contact support.
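The mapping in the table can be illustrated with a small Python sketch. The dictionary shapes for the log record and the resource, the function name to_loki_entry, and the renaming of promoted label names with underscores are assumptions made for this example; the promoted attribute list and metadata keys come from the table above.

```python
PROMOTED_TO_LABELS = {
    "cloud.availability_zone", "cloud.region", "container.name",
    "deployment.environment", "k8s.cluster.name", "k8s.container.name",
    "k8s.cronjob.name", "k8s.daemonset.name", "k8s.deployment.name",
    "k8s.job.name", "k8s.namespace.name", "k8s.pod.name",
    "k8s.replicaset.name", "k8s.statefulset.name", "service.instance.id",
    "service.name", "service.namespace",
}

# Hypothetical record key -> Loki structured metadata key from the table.
FIELD_TO_METADATA = {
    "observed_timestamp": "observed_timestamp",
    "trace_id": "trace_id",
    "span_id": "span_id",
    "trace_flags": "flags",
    "severity_text": "severity_text",
    "severity_number": "severity_number",
    "scope_name": "scope_name",
}


def to_loki_entry(record: dict, resource: dict) -> dict:
    """Split an OpenTelemetry log record into Loki labels, metadata, and line."""
    labels, metadata = {}, {}
    for key, value in resource.items():
        target = labels if key in PROMOTED_TO_LABELS else metadata
        target[key.replace(".", "_")] = str(value)
    for field, meta_key in FIELD_TO_METADATA.items():
        if field in record:
            metadata[meta_key] = str(record[field])
    for key, value in record.get("attributes", {}).items():
        metadata[key.replace(".", "_")] = str(value)
    return {
        "timestamp": record["timestamp"],
        "line": record["body"],
        "labels": labels,
        "metadata": metadata,
    }


record = {"timestamp": 1700000000, "body": "order placed", "severity_text": "INFO",
          "attributes": {"thread.name": "main"}}
print(to_loki_entry(record, {"service.name": "checkout", "host.name": "node-1"}))
```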
Logs ingestion limits
Limit | Value | When you exceed the limit |
---|---|---|
The maximum length of a log line. | 256 Kilobytes (KB) | Log lines that exceed the limit cause an ingestion exception. |
The maximum number of resource attributes promoted to Loki labels. | 15 | Log messages that exceed the limit cause an ingestion exception. |
The maximum number of resource attributes and log attributes stored in Loki structured metadata. | 128 attributes, with an overall maximum of 64 KB | Log messages that exceed the limit cause an ingestion exception. |
The maximum length of a resource attribute value when promoted to Loki labels. | 2048 bytes | Values that exceed the limit cause an ingestion exception. |
When you send batch logs, Loki ingests only the logs that don’t exceed limits and returns exceptions for the logs that exceed limits.
Loki versions before V3
Before Loki V3 introduced structured metadata, OpenTelemetry logs were converted in Alloy otelcol.exporter.loki or in the OpenTelemetry Collector Loki Exporter and stored as JSON in the message of log lines.
OpenTelemetry log record field | Loki field |
---|---|
Timestamp | timestamp |
ObservedTimestamp | Not available. |
TraceId | traceid field of the Loki JSON log message. |
SpanId | spanid field of the Loki JSON log message. |
TraceFlags | Not available. |
SeverityText | Simultaneously mapped to the severity field of the JSON log message and the level and detected_level label fields of the Loki log record. |
SeverityNumber | Not available. |
Body | body field of the Loki JSON log message. |
InstrumentationScope | instrumentation_scope_name field of the JSON log message. |
Attributes | JSON fields of the Loki log message |
Resource | Loki stores resource attributes as JSON fields of the Loki log message with the prefix resources_, for example resources_k8s_namespace_name, and promotes service.name, service.namespace, and service.instance.id to labels as follows: job=[${service.namespace}/]${service.name} and instance=${service.instance.id}. Loki adds a static label exporter=OTLP. |
To promote more resource attributes and log attributes to Loki labels, use the hints loki.resource.labels and loki.attribute.labels as documented for the OpenTelemetry Collector Loki Exporter and Alloy otelcol.exporter.loki.
Traces
Traces limits
Limit | Value | When you exceed the limit |
---|---|---|
The maximum volume of traces. | 5 megabytes (MB) per trace, ingest rate of 15 MB/s and bursts of 20 MB/s. | Spans that exceed the limit cause an ingestion exception. |
The maximum number of resource attributes. | Limits are on the size of the traces. | |
The maximum number of span attributes. | Limits are on the size of the traces. | |
The maximum length of resource attribute and span attribute names. | 1024 bytes | Grafana Cloud truncates attribute names that exceed the limit. |
The maximum length of resource attribute and span attribute values. | 2048 bytes | Grafana Cloud truncates attribute values that exceed the limit. |
The maximum size of spans in a batch. | Limits are on the size of the traces and the volume of bytes per second | Grafana Cloud rejects spans that exceed the trace size limit and throws a partial exception. |