Collect logs through Kubernetes stdout with the OpenTelemetry Collector
Although the best practice for sending logs with OpenTelemetry is to use the OTLP protocol, some use cases prevent this pattern and require outputting logs to files or stdout. Common use cases that prevent using the OTLP protocol include:
- Lack of stable OTLP support for logs in the OpenTelemetry SDK: the OpenTelemetry SDKs for Go, Python, Ruby, JavaScript/Node.js, and PHP don’t provide a stable implementation of OTLP for logs
- Organizational constraints, often related to reliability practices, that require the use of files for logs
You can collect file-based logs with the OpenTelemetry Collector. The following example collects logs emitted through Kubernetes stdout, but you can apply the same pattern to logs written to files.
Architecture to collect logs through Kubernetes stdout with the OpenTelemetry Collector
For proper correlation with traces and metrics, you should contextualize logs with the same resource attributes and with the trace and span IDs, which means:
- Enrich logs with the same identifying resource attributes, for example `service.name`, `service.namespace`, `service.instance.id`, and `deployment.environment`, and with `trace_id` and `span_id`
- Go through the same metadata enrichment pipeline in the OpenTelemetry Collector, for example the Kubernetes Attributes Processor or the Resource Detection Processor
While this common enrichment is provided out of the box when exporting logs through OTLP, you must add these attributes to the log lines yourself when collecting them through files or stdout.
Pros and cons of JSON and unstructured text for enriching logs with contextualization metadata
To carry the resource attributes over in the log lines, use one of the following patterns:
Export unstructured logs and parse them with regular expressions, for example:
2024-09-17T11:29:54 INFO [nio-8080-exec-1] c.e.OrderController : Order completed - service.name=order-processor, service.instance.id=i-123456, span_id=1d5f8ca3f9366fac...
Export structured format logs like JSON logs and parse them with native parsers of the chosen format, for example:
{"timestamp": "2024-09-17T11:29:54", "level": "INFO", "body":"Order completed", "logger": "c.e.OrderController", "service_name": "order-processor", "service_instance_id": "i-123456", "span_id":"1d5f8ca3f9366fac"...}
Both patterns have pros and cons:
|  | JSON logs | Unstructured logs |
| --- | --- | --- |
| Correlation | +++ | +++ |
| Human readability | The verbosity of JSON can seriously erode readability | Contextualization attributes can be appended at the end of the log line, preserving readability |
| Reliability of the parsing | It’s simple to define robust JSON parsing rules | Parsing unstructured text with regular expressions is fragile, particularly due to multi-line log messages like stack traces, to the point where it requires monitoring parsing failures |
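To make the trade-off concrete, here is a minimal sketch of both parsing approaches as filelog receiver operators in an OpenTelemetry Collector configuration. The include path and the regular expression are illustrative assumptions matching the example log lines above, not part of a reference setup:

```yaml
receivers:
  # Option 1: unstructured logs parsed with a regular expression.
  # Fragile: a format change or a multi-line stack trace breaks the match.
  filelog/unstructured:
    include: [/var/log/pods/*/*/*.log]
    operators:
      - type: regex_parser
        regex: '^(?P<timestamp>\S+)\s+(?P<level>\w+)\s+\[(?P<thread>[^\]]+)\]\s+(?P<logger>\S+)\s+:\s+(?P<message>.*)$'

  # Option 2: JSON logs parsed with the native JSON parser.
  # Robust as long as the application emits one JSON object per line.
  filelog/json:
    include: [/var/log/pods/*/*/*.log]
    operators:
      - type: json_parser
```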
Emit contextualized JSON logs with Java
Most popular logging frameworks, such as Log4j and SLF4J/Logback in Java, support emitting JSON-formatted logs.
Integrating OpenTelemetry with these logging libraries requires specifying which resource attributes to include in the log line.
For the sake of readability and to limit verbosity, we recommend adding only the attributes required to filter and correlate logs, such as `service.name`, `service.namespace`, `deployment.environment`, and `service.instance.id`.
The following example uses Spring Boot to enrich logs with OpenTelemetry attributes and emit them formatted in JSON through stdout.
Note
You can find the full code of the example in the Docker LGTM repository.
- Create a Spring Boot application (3.3.0 or newer) using the default logging instrumentation with the Logback library and the stdout output
- Instrument the Spring Boot application with the OpenTelemetry Java Agent following the setup flow defined by the Grafana Cloud integration for Java:
- Open the Grafana Cloud home page in a web browser
- Navigate to Connections > Add new Connection
- Select Java OpenTelemetry and follow the setup instructions
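The setup flow above essentially attaches the OpenTelemetry Java Agent and declares the identifying resource attributes. The following is a minimal sketch under stated assumptions: the agent path and jar name are placeholders, and the attribute values (chosen to match the example output later in this guide) stand in for the values the Grafana Cloud integration generates for you:

```shell
# Placeholder values; the Grafana Cloud integration generates the real
# endpoint and credentials for exporting traces and metrics
export OTEL_RESOURCE_ATTRIBUTES="service.namespace=shop,service.name=dice,service.version=1.1,deployment.environment=staging"

# Attach the OpenTelemetry Java Agent to the Spring Boot application (paths are illustrative)
java -javaagent:./opentelemetry-javaagent.jar -jar ./build/libs/dice.jar
```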
- Enrich Logback logs with OpenTelemetry resource attributes and log attributes using:
export OTEL_INSTRUMENTATION_COMMON_MDC_RESOURCE_ATTRIBUTES=service.namespace,service.name,service.instance.id,service.version,deployment.environment
- Change the Logback Spring Boot configuration to emit JSON logs with OpenTelemetry contextualization by adding the following `logback-spring.xml` under `src/main/resources`:
<!-- tested with Logback 1.5.0 and Spring Boot 3.3.0 -->
<configuration>
  <include resource="org/springframework/boot/logging/logback/defaults.xml"/>
  <include resource="org/springframework/boot/logging/logback/console-appender.xml"/>

  <root level="INFO">
    <appender-ref ref="CONSOLE_JSON"/>
  </root>

  <!-- Emit JSON-formatted logs to stdout using Logback's built-in JsonEncoder -->
  <appender name="CONSOLE_JSON" class="ch.qos.logback.core.ConsoleAppender">
    <encoder class="ch.qos.logback.classic.encoder.JsonEncoder">
      <!-- Keep the formatted message; drop fields that add verbosity without value -->
      <withFormattedMessage>true</withFormattedMessage>
      <withMessage>false</withMessage>
      <withArguments>false</withArguments>
      <withSequenceNumber>false</withSequenceNumber>
      <withNanoseconds>false</withNanoseconds>
    </encoder>
  </appender>
</configuration>
Deploy the Spring Boot application on Kubernetes and verify that the application logs are written to the container stdout stream with JSON formatting and OpenTelemetry contextualization
Example of `kubectl logs my_pod_name | jq`:

{
  "timestamp": 1727346005788,
  "level": "INFO",
  "threadName": "http-nio-8080-exec-5",
  "loggerName": "com.grafana.example.RollController",
  "context": {
    "name": "default",
    "birthdate": 1727345887787,
    "properties": {}
  },
  "mdc": {
    "trace_id": "97a39974ba2dfc9275e4d31dc2730ee4",
    "trace_flags": "01",
    "service.name": "dice",
    "service.instance.id": "0fb18318-06c0-4893-9a8b-353c00b45227",
    "service.version": "1.1",
    "span_id": "08d6d83d645e3ad2",
    "service.namespace": "shop",
    "deployment.environment": "staging"
  },
  "formattedMessage": "Anonymous player is rolling the dice: 6",
  "throwable": null
}
Configure the OpenTelemetry Collector instance with the filelog receiver:
- Add a filelog receiver
- Add the container parser operator
- Add the json_parser operator
  - Map `body` to `attributes.formattedMessage`
  - Map `timestamp.parse_from` to `attributes.timestamp`
  - Map `severity.parse_from` to `attributes.level`
  - Map `trace.trace_id.parse_from` to `attributes.mdc.trace_id`
  - Map `trace.span_id.parse_from` to `attributes.mdc.span_id`
  - Map `trace.trace_flags.parse_from` to `attributes.mdc.trace_flags`
- Add move operators
  - Move top-level fields, such as `threadName`, to their corresponding OTel attributes, such as `thread.name`
  - Move entries from the `mdc` field to the resource of the log record
- Remove all unnecessary fields, for example with the remove operator
- You can find the full configuration in the reference OTel collector configuration
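Putting these steps together, the following is a minimal sketch of such a filelog receiver configuration. It is not the reference configuration: the include path, the exact set of `mdc` entries moved to the resource, and the use of remove operators for cleanup are assumptions for illustration:

```yaml
receivers:
  filelog:
    include: [/var/log/pods/*/*/*.log]
    operators:
      # Reassemble and parse the Kubernetes container log format
      - type: container
      # Parse the JSON document emitted by Logback's JsonEncoder
      - type: json_parser
        timestamp:
          parse_from: attributes.timestamp
          layout_type: epoch
          layout: ms
        severity:
          parse_from: attributes.level
        trace:
          trace_id:
            parse_from: attributes.mdc.trace_id
          span_id:
            parse_from: attributes.mdc.span_id
          trace_flags:
            parse_from: attributes.mdc.trace_flags
      # Use the formatted message as the log body
      - type: move
        from: attributes.formattedMessage
        to: body
      # Rename top-level fields to their OTel semantic convention equivalents
      - type: move
        from: attributes.threadName
        to: attributes["thread.name"]
      # Move identifying attributes from the MDC to the resource (repeat for
      # each attribute listed in OTEL_INSTRUMENTATION_COMMON_MDC_RESOURCE_ATTRIBUTES)
      - type: move
        from: attributes.mdc["service.name"]
        to: resource["service.name"]
      - type: move
        from: attributes.mdc["service.namespace"]
        to: resource["service.namespace"]
      # Drop fields that are no longer needed
      - type: remove
        field: attributes.mdc
      - type: remove
        field: attributes.context
```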
Note
You can find the full code of the example in the Docker LGTM repository.