Configure Kafka exporter to generate Prometheus metrics
You can configure the Kafka server to generate Prometheus metrics using either an external Kafka exporter or the JMX exporter.
To configure Kafka exporter to generate Prometheus metrics, complete the following steps:
Select one of the following methods:
To set up the Kafka exporter, refer to Exporter.
To set up JMX Exporter, refer to JMX Exporter.
If Kafka is running, you can use the following script as an alternative way to configure the exporter:
KAFKA_OPTS="$KAFKA_OPTS -javaagent:./jmx_prometheus_javaagent-0.16.1.jar=8080:./kafka-2_0_0.yml" kafka-server-start /usr/local/etc/kafka/server.properties
To confirm that you configured the Kafka exporter correctly, ensure that the following metrics are available in Prometheus:
kafka_topic_partitions{topic="__consumer_offsets"}
kafka_topic_partition_current_offset{partition="0",topic="__consumer_offsets"}
To confirm that you configured the JMX exporter correctly, ensure that the following metrics are available in Prometheus:
kafka_producer_topic_record_send_total
kafka_producer_record_send_total
kafka_consumer_records_consumed_total_records_total
kafka_consumer_fetch_manager_bytes_consumed_total
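The check above can also be run against raw exporter output instead of the Prometheus UI. The following Python sketch scans Prometheus text-format output for the metric names listed above. The helper names (`metric_names`, `missing_metrics`) and the sample payload are illustrative assumptions; they are not part of the Kafka exporter, the JMX exporter, or Asserts.

```python
import re

# Metric names copied from the two lists above.
KAFKA_EXPORTER_METRICS = [
    "kafka_topic_partitions",
    "kafka_topic_partition_current_offset",
]
JMX_EXPORTER_METRICS = [
    "kafka_producer_topic_record_send_total",
    "kafka_producer_record_send_total",
    "kafka_consumer_records_consumed_total_records_total",
    "kafka_consumer_fetch_manager_bytes_consumed_total",
]

# Prometheus metric names start with a letter, underscore, or colon.
_NAME = re.compile(r"[a-zA-Z_:][a-zA-Z0-9_:]*")


def metric_names(exposition_text):
    """Return the set of metric names found in Prometheus text-format output."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = _NAME.match(line)
        if match:
            names.add(match.group(0))
    return names


def missing_metrics(exposition_text, required):
    """Return the required metric names that do not appear in the output."""
    present = metric_names(exposition_text)
    return [name for name in required if name not in present]


# Hypothetical sample of what a Kafka exporter endpoint might return.
SAMPLE = """\
# TYPE kafka_topic_partitions gauge
kafka_topic_partitions{topic="__consumer_offsets"} 50
# TYPE kafka_topic_partition_current_offset gauge
kafka_topic_partition_current_offset{partition="0",topic="__consumer_offsets"} 0
"""

print(missing_metrics(SAMPLE, KAFKA_EXPORTER_METRICS))  # []
```

In practice you would replace `SAMPLE` with the text fetched from your own exporter endpoint; an empty list means every expected metric name was found.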
RED metric KPIs
This section lists RED metrics KPIs.
Request rate
Asserts automatically tracks the following KPIs for your RED metrics.
- Kafka JMX RED metrics KPIs
- Producer requests
rate(kafka_server_brokertopicmetrics_totalproducerequests_total[5m])
- Producer records
rate(kafka_server_brokertopicmetrics_messagesin_total[5m])
- Consumer requests
rate(kafka_server_brokertopicmetrics_totalfetchrequestspersec_count{topic!=""}[5m])
- Kafka Exporter RED metrics KPIs
- Produced messages
avg_over_time(((delta(kafka_topic_partition_current_offset{topic!=""}[1m]) > 0 or delta(kafka_topic_partition_current_offset{topic!=""}[1m]) * 0) / 60)[5m:])
- Consumed messages
avg_over_time(((delta(kafka_consumergroup_current_offset{topic!=""}[1m]) > 0 or delta(kafka_consumergroup_current_offset{topic!=""}[1m]) * 0) / 60)[5m:])
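The Kafka Exporter expressions above keep positive one-minute offset deltas, replace the rest with zero (`> 0 or ... * 0`), divide by 60 to get a per-second value, and then average the result over five minutes. The following Python sketch reproduces that arithmetic on made-up numbers. The function name and the sample deltas are assumptions for illustration only.

```python
def clamped_per_second_rate(deltas_per_minute):
    """Average per-second rate from one-minute offset deltas.

    Mirrors the PromQL pattern `delta > 0 or delta * 0`: positive deltas
    are kept, everything else (for example, an offset reset) counts as 0.
    """
    if not deltas_per_minute:
        raise ValueError("at least one delta is required")
    clamped = [d if d > 0 else 0.0 for d in deltas_per_minute]
    per_second = [d / 60 for d in clamped]      # messages per second
    return sum(per_second) / len(per_second)    # like avg_over_time


# Hypothetical one-minute deltas over a five-minute window.
# The -300 stands in for an offset that moved backwards.
deltas = [120.0, 60.0, -300.0, 0.0, 180.0]
print(clamped_per_second_rate(deltas))  # 1.2
```

Without the clamp, the negative delta would pull the average below zero, which is not a meaningful message rate.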
Error ratios
- Producer errors
rate(kafka_server_brokertopicmetrics_failedproducerequests_total{topic!=""}[5m])/rate(kafka_server_brokertopicmetrics_totalproducerequests_total[5m])
- Consumer errors
rate(kafka_server_brokertopicmetrics_total_failedfetchrequestspersec_count{topic!=""}[5m])/ rate(kafka_server_brokertopicmetrics_totalfetchrequestspersec_count{topic!=""}[5m])
Latency
- P99 - Consumer Request
kafka_network_requestmetrics_totaltimems{request="Fetch", quantile="0.99"} / 1000
- P99 - Consumer Group
kafka_network_requestmetrics_totaltimems{request=~".*Group", quantile="0.99"} / 1000
- P99 - Producer Request
kafka_network_requestmetrics_totaltimems{request="Produce",quantile="0.99"} / 1000
- P99 - Broker Request
kafka_controller_controllerchannelmanager_requestrateandqueuetimems{quantile="0.99"} / 1000
RED metrics alerts
Asserts automatically tracks the short-term and long-term trends for request rate and latency for anomaly detection. Similarly, you can set thresholds for latency averages and P99 to record breaches. Asserts tracks error ratios against availability goals (default: 99.9%) and breach thresholds (default: 10%).
KPI | Alerts |
---|---|
Request Rate | RequestRateAnomaly |
Error Ratio | ErrorRatioBreach, ErrorBuildup (availability goal 99.9%) |
Latency P99 | LatencyP99ErrorBuildup |
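To make the default numbers concrete, the following Python sketch computes an error ratio and compares it with the 99.9% availability goal and the 10% breach threshold mentioned above. How Asserts evaluates these alerts internally is not described here, so the comparison logic, the function names, and the labels returned are all assumptions; treat this as one plausible reading of the defaults, not as the product's behavior.

```python
AVAILABILITY_GOAL = 0.999  # default availability goal from the text (99.9%)
BREACH_THRESHOLD = 0.10    # default breach threshold from the text (10%)


def error_ratio(failed_per_sec, total_per_sec):
    """Failed requests divided by total requests, or None when there is no traffic."""
    if total_per_sec <= 0:
        return None
    return failed_per_sec / total_per_sec


def classify(ratio):
    """Label an error ratio using the assumed interpretation of the defaults."""
    if ratio is None:
        return "no traffic"
    if ratio > BREACH_THRESHOLD:
        return "breach"
    if ratio > 1 - AVAILABILITY_GOAL:
        return "over error budget"
    return "ok"


# Hypothetical per-second rates, as the rate(...[5m]) expressions would return.
print(classify(error_ratio(0.5, 2.0)))    # breach
print(classify(error_ratio(1.0, 200.0)))  # over error budget
print(classify(error_ratio(0.0, 100.0)))  # ok
```

The explicit `None` for zero traffic avoids the division by zero that the equivalent PromQL ratio would turn into a `NaN` sample.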
Failure alerts
- KafkaTopicsUnderReplicatedPartitions
kafka_topic_partition_under_replicated_partition > 0
- KafkaOfflinePartitions
kafka_controller_kafkacontroller_offlinepartitionscount > 0
- KafkaActiveController
kafka_controller_kafkacontroller_activecontrollercount != 1
- KafkaUnderMinIsrPartitions
kafka_cluster_partition_underminisr > 0
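The four failure conditions above can be expressed as a small lookup table. The following Python sketch evaluates them against a dictionary of metric values. It assumes a simplified input of one number per metric; the `firing_alerts` helper and the sample values are hypothetical and are not an Asserts API.

```python
import operator

# Alert name -> (metric, comparison, threshold), copied from the list above.
FAILURE_ALERTS = {
    "KafkaTopicsUnderReplicatedPartitions": (
        "kafka_topic_partition_under_replicated_partition", operator.gt, 0),
    "KafkaOfflinePartitions": (
        "kafka_controller_kafkacontroller_offlinepartitionscount", operator.gt, 0),
    "KafkaActiveController": (
        "kafka_controller_kafkacontroller_activecontrollercount", operator.ne, 1),
    "KafkaUnderMinIsrPartitions": (
        "kafka_cluster_partition_underminisr", operator.gt, 0),
}


def firing_alerts(samples):
    """Return the names of alerts whose condition holds for the given values."""
    firing = []
    for alert, (metric, compare, threshold) in FAILURE_ALERTS.items():
        value = samples.get(metric)
        # A metric with no sample is skipped rather than treated as failing.
        if value is not None and compare(value, threshold):
            firing.append(alert)
    return firing


# Hypothetical snapshot: one offline partition and no active controller.
snapshot = {
    "kafka_topic_partition_under_replicated_partition": 0,
    "kafka_controller_kafkacontroller_offlinepartitionscount": 1,
    "kafka_controller_kafkacontroller_activecontrollercount": 0,
    "kafka_cluster_partition_underminisr": 0,
}
print(firing_alerts(snapshot))  # ['KafkaOfflinePartitions', 'KafkaActiveController']
```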
Dashboards
The following dashboard shows information about Kafka server metrics:
- Messages Produced
- Messages Consumed
- Lag by Consumer
- Partitions for Topics
RED metrics - Producer
This section lists Producer RED metrics.
Requests
- Producer Record
rate(kafka_producer_record_send_total[5m])
- Producer Requests
rate(kafka_producer_request_total[5m])
Error ratio
- Producer Record
rate(kafka_producer_record_error_total[5m])
/
rate(kafka_producer_record_send_total[5m])
Latency
- Average
max without(asserts_request_context)(kafka_producer_request_latency_avg/1000)
RED metrics - Consumer
This section lists Consumer RED metrics.
Requests
- Consumer Record
rate(kafka_consumer_records_consumed_total_records_total[5m])
- Consumer Requests
rate(kafka_consumer_fetch_total_requests_total[5m])
- Consumer Fetch Requests
rate(kafka_consumer_fetch_manager_fetch_total[5m])
- Consumer Fetch Record
rate(kafka_consumer_fetch_manager_records_consumed_total[5m])
Latency
- Average
max without(asserts_request_context) (kafka_producer_request_latency_avg/1000)
Alerts
KPI | Alerts |
---|---|
Request Rate | RequestRateAnomaly |
Error Ratio | ErrorRatioAnomaly |
Latency Average | LatencyAverageBreach, LatencyAverageAnomaly |
Dashboards
The following dashboard captures information about both the producer and the consumer sides of a Kafka client:
- Topics connected to producer/consumer
- Producer records
- Producer requests
- Producer latency
- Consumer records
- Consumer lag