Collecting metrics and logs from Grafana Mimir
You can collect logs and metrics from a Mimir or GEM cluster. To set up dashboards and alerts, see Installing Grafana Mimir dashboards and alerts or Grafana Cloud: Self-hosted Grafana Mimir integration.
Monitoring a cluster is easiest if it was installed via the Grafana Mimir Helm chart, but you can also use this integration if Mimir was deployed another way. For more information, see Collect metrics and logs without the Helm chart.
Collect metrics and logs from the Helm chart
To set up the collection of metrics and logs, follow the steps for the version of the Helm chart that you deployed:
- For a stable release:
- >= 3.x.x: See Collect metrics and logs via the Helm chart
- < 3.x.x: See Collect metrics and logs via Grafana Agent
- For non-Helm installations or installations of the deprecated enterprise-metrics Helm chart, see Collect metrics and logs without the Helm chart.
Collect metrics and logs via the Helm chart
Starting with version 3.0.0, the Helm chart sends metrics to a Prometheus-compatible server and logs to a Loki cluster. The chart can also scrape additional metrics from kube-state-metrics, kubelet, and cAdvisor.
The Helm chart does not collect node_exporter metrics. For more information about node_exporter, see Additional resources metrics.
This section guides you through the process for setting up metrics and logs collection via the Grafana Agent operator. The Mimir Helm chart can install and use the Grafana Agent operator. Due to how Helm works, before the chart can use the operator, you need to manually install the Custom Resource Definitions (CRDs) for the Agent operator.
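As a sketch, assuming you install the CRDs from the Grafana Agent repository, the step could look like the following; check the Grafana Agent Operator documentation for the exact path and the version that matches your chart:
# Install the Grafana Agent Operator CRDs before installing or upgrading the Mimir Helm chart.
git clone https://github.com/grafana/agent.git
kubectl apply -f agent/production/operator/crds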
Using the Agent operator for metrics and logs collection is our recommended approach. However, if you prefer not to use the Agent operator or already have an existing Grafana Agent you’d like to use for metrics and logs collection, follow the instructions for collecting metrics and logs via Grafana Agent instead.
Credentials
If Prometheus and Loki are running without authentication, then you can skip this section. Metamonitoring supports multiple authentication methods for metrics and logs. If you are using a secret such as an API key to authenticate with Prometheus or Loki, then you need to create a Kubernetes Secret that contains it.
This is an example secret:
apiVersion: v1
kind: Secret
metadata:
  name: metamonitoring-credentials
data:
  prometheus-api-key: FAKEACCESSKEY
  loki-api-key: FAKESECRETKEY
For information about how to create a Kubernetes secret, see Creating a Secret.
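For example, assuming the key names shown above, you can create the same Secret with kubectl, which base64-encodes the literal values for you; replace the placeholder values and namespace with your own:
# Create the metamonitoring credentials Secret in the namespace where Mimir runs.
kubectl create secret generic metamonitoring-credentials \
  --from-literal=prometheus-api-key=FAKEACCESSKEY \
  --from-literal=loki-api-key=FAKESECRETKEY \
  --namespace <mimir-namespace>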
Helm chart values
Finally, merge the following YAML configuration into your Helm values file, and replace the values for url, username, passwordSecretName, and passwordSecretKey with the details of the Prometheus and Loki clusters and the secret that you created. If your Prometheus and Loki servers are running without authentication, then remove the auth blocks from the YAML below.
If you already have the Agent operator installed in your Kubernetes cluster, then set installOperator: false.
metaMonitoring:
  serviceMonitor:
    enabled: true
  grafanaAgent:
    enabled: true
    installOperator: true
    logs:
      remote:
        url: "https://example.com/loki/api/v1/push"
        auth:
          username: "12345"
          passwordSecretName: "metamonitoring-credentials"
          passwordSecretKey: "loki-api-key"
    metrics:
      remote:
        url: "https://example.com/api/v1/push"
        auth:
          username: "54321"
          passwordSecretName: "metamonitoring-credentials"
          passwordSecretKey: "prometheus-api-key"
    scrapeK8s:
      enabled: true
      kubeStateMetrics:
        namespace: kube-system
        labelSelectors:
          app.kubernetes.io/name: kube-state-metrics
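After merging these values, apply them with helm upgrade. The release name, namespace, and values file name below are placeholders; substitute your own:
# Apply the updated values file to an existing mimir-distributed release.
helm upgrade mimir grafana/mimir-distributed \
  --namespace mimir \
  --values custom-values.yaml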
Send metrics back into Mimir or GEM
You can also send the collected metamonitoring metrics to the installation of Mimir or GEM.
When you leave the metamonitoring.grafanaAgent.metrics.remote.url field empty, the chart automatically fills in the address of the GEM gateway Service or the Mimir NGINX Service.
If you have deployed Mimir and metamonitoring.grafanaAgent.metrics.remote.url is not set, then the metamonitoring metrics are sent to the Mimir cluster. You can query these metrics using the HTTP header X-Scope-OrgID: metamonitoring.
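For example, assuming the chart release is named mimir and runs in the mimir namespace, so that the NGINX gateway Service is named mimir-nginx and exposes port 80 (adjust these for your installation), you could query the metamonitoring tenant like this:
# Port-forward the Mimir NGINX gateway locally and query the metamonitoring tenant.
kubectl port-forward --namespace mimir svc/mimir-nginx 8080:80 &
curl -H 'X-Scope-OrgID: metamonitoring' \
  'http://localhost:8080/prometheus/api/v1/query?query=up'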
If you have deployed GEM, then there are two alternatives:
- If you are using the trust authentication type (mimir.structuredConfig.auth.type=trust), then the same instructions apply as for Mimir.
- If you are using the enterprise authentication type (mimir.structuredConfig.auth.type=enterprise, which is also the default when enterprise.enabled=true), then you also need to provide a Secret with the authentication token for the tenant. The token should belong to an access policy with the metrics:write scope. To set up the Secret, refer to Credentials.
Assuming you are using the GEM authentication model, the Helm chart values should look like the following example.
metaMonitoring:
  serviceMonitor:
    enabled: true
  grafanaAgent:
    enabled: true
    installOperator: true
    metrics:
      remote:
        auth:
          username: metamonitoring
          passwordSecretName: gem-tokens
          passwordSecretKey: metamonitoring
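As a sketch, assuming the Secret and key names used above, you can create the token Secret as follows; the token value is a placeholder for a real GEM access policy token:
# Store the GEM tenant token for the metamonitoring access policy in a Secret.
kubectl create secret generic gem-tokens \
  --from-literal=metamonitoring=<gem-access-policy-token> \
  --namespace <mimir-namespace>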
Collect metrics and logs via Grafana Agent
Older versions of the Helm chart need to be manually instrumented. This means that you need to set up a Grafana Agent that collects logs and metrics from Mimir or GEM. To set up Grafana Agent, see Set up Grafana Agent. Once your Agent is deployed, use the example Agent configuration to configure the Agent to scrape Mimir or GEM.
Caveats
Managing your own Agent comes with some caveats:
- You will have to keep the Agent configuration up to date manually as you update the Mimir Helm chart. While we will try to keep this article up to date, we cannot guarantee that the example Agent configuration will always work.
- The static configuration makes some assumptions about the naming of the chart, such as that you have not overridden the fullnameOverride in the Helm chart.
- The static configuration cannot be selective in the PersistentVolume metrics it collects from the kubelet, so it scrapes metrics for all PersistentVolumes.
- The static configuration hardcodes the value of the cluster label on all metrics and logs. This means that the configuration cannot account for multiple installations of the Helm chart.
If possible, upgrade the Mimir Helm chart to version 3.0 or higher and use the built-in Grafana Agent operator. Using the Agent operator allows the chart to automatically configure the Agent, eliminating the aforementioned caveats.
Example Agent configuration
In the following example Grafana Agent configuration file for collecting logs and metrics, replace url, password, and username in the logs and metrics blocks with the details of your Prometheus and Loki clusters.
logs:
configs:
- clients:
- basic_auth:
password: xxx
username: xxx
url: https://example.com/loki/api/v1/push
name: integrations
positions:
filename: /tmp/positions.yaml
scrape_configs:
- job_name: integrations/grafana-mimir-logs
kubernetes_sd_configs:
- role: pod
pipeline_stages:
- cri: {}
relabel_configs:
- action: keep
regex: mimir-distributed-.*
source_labels:
- __meta_kubernetes_pod_label_helm_sh_chart
- source_labels:
- __meta_kubernetes_pod_node_name
target_label: __host__
- action: replace
replacement: $1
separator: /
source_labels:
- __meta_kubernetes_namespace
- __meta_kubernetes_pod_container_name
target_label: job
- action: replace
regex: ""
replacement: k8s-cluster
separator: ""
source_labels:
- cluster
target_label: cluster
- action: replace
source_labels:
- __meta_kubernetes_namespace
target_label: namespace
- action: replace
source_labels:
- __meta_kubernetes_pod_name
target_label: pod
- action: replace
source_labels:
- __meta_kubernetes_pod_container_name
target_label: name
- action: replace
source_labels:
- __meta_kubernetes_pod_container_name
target_label: container
- replacement: /var/log/pods/*$1/*.log
separator: /
source_labels:
- __meta_kubernetes_pod_uid
- __meta_kubernetes_pod_container_name
target_label: __path__
target_config:
sync_period: 10s
metrics:
configs:
- name: integrations
remote_write:
- basic_auth:
password: xxx
username: xxx
url: https://example.com/api/prom/push
scrape_configs:
- job_name: integrations/grafana-mimir/kube-state-metrics
kubernetes_sd_configs:
- role: pod
metric_relabel_configs:
- action: keep
              regex: (.*-mimir-)?alertmanager.*|(.*-mimir-)?compactor.*|(.*-mimir-)?distributor.*|(.*-mimir-)?(gateway|cortex-gw|cortex-gw-internal).*|(.*-mimir-)?ingester.*|(.*-mimir-)?querier.*|(.*-mimir-)?query-frontend.*|(.*-mimir-)?query-scheduler.*|(.*-mimir-)?ruler.*|(.*-mimir-)?store-gateway.*
separator: ""
source_labels:
- deployment
- statefulset
- pod
relabel_configs:
- action: keep
regex: kube-state-metrics
source_labels:
- __meta_kubernetes_pod_label_app_kubernetes_io_name
- action: replace
regex: ""
replacement: k8s-cluster
separator: ""
source_labels:
- cluster
target_label: cluster
- bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
job_name: integrations/grafana-mimir/kubelet
kubernetes_sd_configs:
- role: node
metric_relabel_configs:
- action: keep
regex: kubelet_volume_stats.*
source_labels:
- __name__
relabel_configs:
- replacement: kubernetes.default.svc.cluster.local:443
target_label: __address__
- regex: (.+)
replacement: /api/v1/nodes/${1}/proxy/metrics
source_labels:
- __meta_kubernetes_node_name
target_label: __metrics_path__
- action: replace
regex: ""
replacement: k8s-cluster
separator: ""
source_labels:
- cluster
target_label: cluster
scheme: https
tls_config:
ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
insecure_skip_verify: false
server_name: kubernetes
- bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
job_name: integrations/grafana-mimir/cadvisor
kubernetes_sd_configs:
- role: node
metric_relabel_configs:
- action: keep
              regex: (.*-mimir-)?alertmanager.*|(.*-mimir-)?compactor.*|(.*-mimir-)?distributor.*|(.*-mimir-)?(gateway|cortex-gw|cortex-gw-internal).*|(.*-mimir-)?ingester.*|(.*-mimir-)?querier.*|(.*-mimir-)?query-frontend.*|(.*-mimir-)?query-scheduler.*|(.*-mimir-)?ruler.*|(.*-mimir-)?store-gateway.*
source_labels:
- pod
relabel_configs:
- replacement: kubernetes.default.svc.cluster.local:443
target_label: __address__
- regex: (.+)
replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor
source_labels:
- __meta_kubernetes_node_name
target_label: __metrics_path__
- action: replace
regex: ""
replacement: k8s-cluster
separator: ""
source_labels:
- cluster
target_label: cluster
scheme: https
tls_config:
ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
insecure_skip_verify: false
server_name: kubernetes
- job_name: integrations/grafana-mimir/metrics
kubernetes_sd_configs:
- role: pod
relabel_configs:
- action: keep
regex: .*metrics
source_labels:
- __meta_kubernetes_pod_container_port_name
- action: keep
regex: mimir-distributed-.*
source_labels:
- __meta_kubernetes_pod_label_helm_sh_chart
- action: replace
regex: ""
replacement: k8s-cluster
separator: ""
source_labels:
- cluster
target_label: cluster
- action: replace
source_labels:
- __meta_kubernetes_namespace
target_label: namespace
- action: replace
source_labels:
- __meta_kubernetes_pod_name
target_label: pod
- action: replace
source_labels:
- __meta_kubernetes_pod_container_name
target_label: container
- action: replace
separator: ""
source_labels:
- __meta_kubernetes_pod_label_name
- __meta_kubernetes_pod_label_app_kubernetes_io_component
target_label: __tmp_component_name
- action: replace
separator: /
source_labels:
- __meta_kubernetes_namespace
- __tmp_component_name
target_label: job
- action: replace
source_labels:
- __meta_kubernetes_pod_node_name
target_label: instance
global:
scrape_interval: 15s
wal_directory: /tmp/grafana-agent-wal
Collect metrics and logs without the Helm chart
You can still use the dashboards and rules in the monitoring-mixin, even if Mimir or GEM is not deployed via the Helm chart or if you are using the deprecated enterprise-metrics Helm chart for GEM. As a starting point, use the Agent configuration from Collect metrics and logs via Grafana Agent. You might need to modify it. For more information, see dashboards and alerts requirements.
Service discovery
The Agent configuration relies on Kubernetes service discovery and pod labels to constrain the collected metrics and logs to the ones that are strictly related to the Helm chart. If you are deploying Grafana Mimir on something other than Kubernetes, then replace the kubernetes_sd_configs block with an Agent service discovery mechanism that can discover the Mimir processes.
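As a minimal sketch, assuming two Mimir processes running outside Kubernetes, the kubernetes_sd_configs and relabel_configs of a scrape job could be replaced with a static_configs block such as the following; the targets and the cluster label value are placeholders, and you still need to satisfy the label requirements of the dashboards and alerts:
metrics:
  configs:
    - name: integrations
      scrape_configs:
        - job_name: integrations/grafana-mimir/metrics
          static_configs:
            # Replace the targets with the addresses of your Mimir processes.
            - targets:
                - mimir-1.example.com:8080
                - mimir-2.example.com:8080
              labels:
                cluster: my-mimir-cluster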