Help build the future of open source observability software Open positions

Check out the open source projects we support Downloads

We cannot remember your choice unless you click the consent notice at the bottom.

How to monitor Kubernetes clusters with the Prometheus Operator

How to monitor Kubernetes clusters with the Prometheus Operator

2023-01-19 11 min

Kubernetes has become the preferred tool for DevOps engineers to deploy and manage containerized applications on one or multiple servers. These compute nodes are also known as clusters, and their performance is crucial to the success of an application. 

If a Kubernetes cluster isn’t performing optimally, the application’s availability and performance will suffer, leading to unhappy users and even revenue loss. Fortunately, several tools in the Kubernetes ecosystem can help you effectively monitor your Kubernetes cluster, including Prometheus.

In this guide, you’ll learn about the benefits of using Kubernetes operators. You’ll also learn how to deploy and use the Prometheus Operator to configure and manage Prometheus instances in your Kubernetes cluster. And finally, you’ll discover how to deploy Grafana in your Kubernetes cluster to help analyze and visualize the health of your Kubernetes clusters.

An illustration featuring the Kubernetes logo

Why monitor Kubernetes clusters with Prometheus?

Prometheus is an open source tool that provides monitoring and alerting for cloud native environments. It collects metrics data from these environments and stores them as time series data, recording information with a timestamp in memory or local disk.

You can install and configure Prometheus on your Kubernetes cluster using a Kubernetes operator like Prometheus Operator.

What are Kubernetes operators?

Kubernetes operators are software extensions that extend the functionalities of the Kubernetes API in configuring and managing other Kubernetes applications. They make use of custom resources to manage these applications and their components.

Kubernetes controllers

In Kubernetes, there are control loops called Kubernetes controllers that monitor the state of the Kubernetes cluster to make sure it’s similar to or equal to the desired state. For instance, if you want to deploy a new containerized application, you create a deployment object in the cluster with the necessary details using the kubectl client or the Kubernetes dashboard. The Kubernetes controller receives information about this new object and then performs the required action to deploy your application to the cluster. Likewise, when you edit the object, the controller receives the update and performs the necessary action.

Kubernetes operators

Kubernetes operators use this Kubernetes controller pattern in a slightly different way. They’re application-specific, meaning different applications can have their own operators (which can be found on OperatorHub.io). The operators monitor the applications in the cluster and perform the required tasks to ensure the application functions in the desired state, which you configure in your custom resource definition (CRD).

Deploying Prometheus instances manually without using the Prometheus Operator or a Prometheus Helm chart can be hectic and time-consuming. You’d need to configure certain Kubernetes objects like ConfigMap, secrets, deployments, and services, which the Prometheus Operator abstracts on your behalf. The Prometheus Operator also helps you manage dynamic updates of resources, such as Prometheus rules with zero downtime, among other benefits.

Benefits of using Kubernetes operators

Operators provide multiple benefits to the Kubernetes ecosystem.

1. Packaging, deploying, and managing applications

Kubernetes operators provide smoother application deployment and management. For instance, cloud native applications with complex installation procedures and maintenance steps can be handled more easily, because you can create custom resources to automate and simplify complex installation procedures. Operators also improve the management of stateful applications like databases, so it’s easier to perform database backups and configure database applications. Using operators enables you to automate these and other complex processes.

2. Performing basic operations using kubectl

Since Kubernetes operators extend the functionality of the Kubernetes API server, they also allow you to use the kubectl client to run basic commands for custom resources such as GET and LIST, letting you manage your custom resources with commands you are already familiar with in the terminal.

3. Facilitating resource monitoring

Because Kubernetes operators perform custom tasks on an application based on the CRDs, they continuously monitor the application and cluster to check for anything that is out of place. If there is a misalignment in the application, like if the application isn’t in its expected state, the operators correct it automatically. This saves the Kubernetes administrator the trouble of identifying and correcting problems on the cluster.

How to deploy and configure the Prometheus Operator in Kubernetes

As noted earlier, Prometheus works well for monitoring your application and resources in your Kubernetes cluster. Below, we’ll show you how to use the Prometheus Operator to deploy and configure the Prometheus instance in your Kubernetes cluster.

Prerequisites

You need the following for this guide:

  1. A running Kubernetes cluster. You can use minikube, which supports Linux, Mac, and Windows, to set up a local cluster on your computer.
  2. A kubectl client installed on your computer.

Deploying the Prometheus Operator

You’re going to deploy Prometheus Operator in your cluster, then configure permission for the deployed operator to scrape targets in your cluster.

You can deploy by applying the bundle.yaml file from the Prometheus Operator GitHub repository:

kubectl apply -f https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/main/bundle.yaml

If you run into the error The CustomResourceDefinition "prometheuses.monitoring.coreos.com" is invalid: metadata.annotations: Too long: must have at most 262144 bytes, then run the following command:

kubectl apply -f https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/main/bundle.yaml --force-conflicts=true --server-side=true

This command allows you to deploy Prometheus Operator CRD without Kubernetes altering the deployment when it encounters any conflicts while installing the operator in your cluster. In this scenario your cluster complains of a long metadata annotation that is beyond the limit provided by Kubernetes. However, --force-conflicts=true allows the installation to continue without the warning stopping it. Additionally, --force-conflicts only works with --server-side. For more information on –server-side and --force-conflicts click here.

Once you’ve entered the command, you should get an output similar to this:

customresourcedefinition.apiextensions.k8s.io/alertmanagerconfigs.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/alertmanagers.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/podmonitors.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/probes.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/prometheuses.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/prometheusrules.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/servicemonitors.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/thanosrulers.monitoring.coreos.com created
clusterrolebinding.rbac.authorization.k8s.io/prometheus-operator created
clusterrole.rbac.authorization.k8s.io/prometheus-operator created
deployment.apps/prometheus-operator created
serviceaccount/prometheus-operator created
service/prometheus-operator created

When you run kubectl get deployments, you can see that the Prometheus Operator is deployed:

NAME                  READY   UP-TO-DATE   AVAILABLE   AGE
prometheus-operator   1/1     1            1           6m1s

Next, you’ll install role-based access control (RBAC) permissions to allow the Prometheus server to access the Kubernetes API to scrape targets and gain access to the Alertmanager cluster. To achieve this, you’ll deploy a ServiceAccount. Create a file named prometheus_rbac.yaml and paste the following content into it:

yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus
rules:
- apiGroups: [""]
  resources:
  - nodes
  - nodes/metrics
  - services
  - endpoints
  - pods
  verbs: ["get", "list", "watch"]
- apiGroups: [""]
  resources:
  - configmaps
  verbs: ["get"]
- apiGroups:
  - networking.k8s.io
  resources:
  - ingresses
  verbs: ["get", "list", "watch"]
- nonResourceURLs: ["/metrics"]
  verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus
subjects:
- kind: ServiceAccount
  name: prometheus
  namespace: default

Then, apply the file to your cluster:

kubectl apply -f prometheus_rbac.yaml

You should get the following response:

serviceaccount/prometheus created
clusterrole.rbac.authorization.k8s.io/prometheus created
clusterrolebinding.rbac.authorization.k8s.io/prometheus created

Deploying Prometheus

Now that you have deployed the Prometheus Operator into your cluster, you’re going to create a Prometheus instance using a Prometheus CRD defined in a YAML file. Create a file named prometheus_instance.yaml and paste the following:

yaml
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prometheus
spec:
  serviceAccountName: prometheus
  serviceMonitorSelector: {}
  resources:
    requests:
      memory: 300Mi

Then, run the following:

kubectl apply -f prometheus_instance.yaml

In the YAML file above, the serviceAccountName references the ServiceAccount you created earlier for the ClusterRoleBinding. serviceMonitorSelector points to a ServiceMonitor that allows Prometheus to monitor specific services in your cluster. If you wanted Prometheus to select all existing ServiceMonitors, you’d assign {} to the serviceMonitorSelector key; if you didn’t want it to select any ServiceMonitor, you wouldn’t include it in the Prometheus CRD.

To select a particular ServiceMonitor, use something like this:

serviceMonitorSelector:
    matchLabels:
      app: frontend

After you apply the prometheus_instance.yaml file, a prometheus-operated service is created, which you can see when you run kubectl get svc.

Now, access the server by forwarding a local port to the Prometheus service:

kubectl port-forward svc/prometheus-operated 9090:9090

Open the URL http://localhost:9090 in your browser to see a page similar to the image below:

Prometheus UI

Configuring Prometheus to monitor services and applications

In a cluster with many different applications running simultaneously, you might not want a single Prometheus instance scraping metrics from all of them. You likely have specific applications in mind that you want to scrape metrics from, and you can specify them with the ServiceMonitor CRD.

The ServiceMonitor CRD enables you to customize how you want Prometheus to operate in terms of scraping metrics. You can configure Prometheus to scrape metrics from all services that match the label Frontend or Staging. You can even configure the interval rate at which Prometheus scrapes metrics from a set of services. The ServiceMonitor CRD gives you enormous power over how Prometheus monitors services in your cluster.

Create a ServiceMonitor CRD

Next, you’re going to create an example ServiceMonitor CRD. Create a file named service_monitor.yaml and paste the following:

yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    name: prometheus
  name: prometheus
spec:
  endpoints:
  - interval: 30s
    targetPort: 9090
    path: /metrics
  namespaceSelector:
    any: true
  selector:
    matchLabels:
      operated-prometheus: "true"

The code tells Prometheus to scrape services with the operated-prometheus: "true" label every thirty seconds, which you can access on the /metrics endpoint. Additionally, this means that Prometheus will scrape its own data and monitor its own health. It is also important to note that you should change the key-values of the matchLabels which is operated-prometheus: “true” to the service labels you want to monitor. The matchLabels key-value is just an example for this article.

Apply it to your Kubernetes cluster by running the following command:

kubectl apply -f service_monitor.yaml

Configuring Prometheus Operator using CRD objects

The Prometheus Operator provides different types of CRDs you can apply in your Kubernetes cluster to manage your Prometheus application. Some already mentioned in this guide include the Prometheus CRD and the ServiceMonitor CRD. Other options include PrometheusRules, Alertmanager, and PodMonitor. You can read more on the other CRD objects in the Github documentation.

Deploying Grafana in your Kubernetes cluster

Next, let’s review how to deploy Grafana in Kubernetes. Grafana allows you to query, visualize, alert on, and understand your metrics no matter where they are stored.

In your terminal, run the following command:

kubectl create deployment grafana --image=docker.io/grafana/grafana:latest 

You can run kubectl get deployments to confirm that Grafana has been deployed:

NAME                  READY   UP-TO-DATE   AVAILABLE   AGE
grafana               1/1     1            1           7m27s

Next, you’ll create a service for the Grafana deployment:

kubectl expose deployment grafana --port 3000

Forward the port 3000 to the service by running the command below:

kubectl port-forward svc/grafana 3000:3000

Open http://localhost:3000 in your browser to access your Grafana dashboard. Log in with admin as the username and password. This redirects you to a page where you’ll be asked to change your password; afterward, you’ll see the Grafana homepage:

A screenshot of the Grafana user homepage

Click Data Sources, then select Prometheus and fill in your configuration details, such as your Prometheus Server URL, access, auth type, and scrape intervals.

You can’t use http://localhost:9090 as your HTTP URL because Grafana won’t have access to it. You must expose prometheus using a NodePort or LoadBalancer.

Create a file named expose_prometheus.yaml and paste the following:

yaml
apiVersion: v1
kind: Service
metadata:
  name: prometheus
spec:
  type: NodePort
  ports:
  - name: web
    nodePort: 30900
    port: 9090
    protocol: TCP
    targetPort: web
  selector:
    prometheus: prometheus

Run kubectl apply -f expose_prometheus.yaml. Grafana will be able to pull the metrics from http://<node_ip>:30900. To view the <node_ip>, run kubectl get nodes -o wide.

Enter http://<node_ip>:30900 in the URL box, then click Save & Test:

A screenshot showing the addition of Prometheus as a data source in Grafana

Creating a Grafana dashboard to monitor Kubernetes events

Let’s create a dashboard that shows a graph for the total number of Kubernetes events handled by a Prometheus pod. Hover over the panel on the left of the screen and select Dashboards > New dashboard, then select Add a new panel.

Select Prometheus as your Data source, choose prometheus_sd_kubernetes_events_total for the metric and prometheus-prometheus-0 in the labels input or you can choose any metric and labels you want to monitor. Then, click Run queries:

A screenshot showing the addition of metrics to monitor in Grafana

On the right-hand side of the dashboard panel configuration, you can configure details such as graph styles, the position of legends, and the title of the graph:

A screenshot showing how to configure a graph in Grafana

Click Apply once you’re done. You’ll be redirected to a new page containing your dashboard. You can add more dashboard panels by clicking the Add panel button or finish creating your dashboard by clicking the Save dashboard button:

A Grafana dashboard panel displays data related to monitoring pod events.

Click the Save button to save your dashboard with a custom name:

A screenshot shows how to save your dashboard in Grafana.

Now that your dashboard is configured, you can add as many panels as you want to monitor different metrics scraped from your Prometheus instance in your cluster:

A screenshot shows multiple metrics monitored in Grafana.

Conclusion

Kubernetes administrators and users often struggle to deploy and manage complex applications, but Kubernetes operators help. Application-specific operators simplify the deployment and management processes from the Kubernetes administrator to the Kubernetes API. The Prometheus Operator in particular makes managing Prometheus easier, with its CRDs that enable you to more easily scrape cluster and application metrics for monitoring.

While you could view an operator as yet another abstract layer to maintain in an already complex system, the efficiency it offers is not to be lightly dismissed. Automating configuration and operational processes frees up time that can be put to better, more innovative use.

If you’re interested in monitoring your Kubernetes clusters but don’t want to do it all on your own, we offer Kubernetes Monitoring in Grafana Cloud — the full solution for all levels of Kubernetes usage that gives you out-of-the-box access to your Kubernetes infrastructure’s metrics, logs, and Kubernetes events as well as prebuilt dashboards and alerts. Kubernetes Monitoring is available to all Grafana Cloud users, including those in our generous free tier. If you don’t already have a Grafana Cloud account, you can sign up for a free account today!