Introduction to Kubernetes Monitoring

Kubernetes has many dynamic components, which makes it hard for any team to troubleshoot problems and proactively manage a fleet.

Reactive problem solving means:

  • Quick identification of issues
  • Prioritization of problem severity
  • Streamlined root cause analysis
  • An efficient workflow

Proactive management requires:

  • Preventing issues that cause performance problems
  • Efficient use of resources and managing their costs
  • Insight into future usage and cost

Kubernetes Monitoring provides the tools for both reactive and proactive strategies.

Reactive response benefits

Quick issue identification, alerts, data correlation, and other features are built into Kubernetes Monitoring to streamline troubleshooting.

Priority issues at forefront

The Kubernetes Overview page provides a high-level look at counts for Kubernetes objects, CPU and memory usage by Cluster, and firing alerts for containers and Pods. You can filter this view by Clusters and namespaces, then identify issues that require attention to begin your problem solving.

Snapshot of counts, Cluster CPU and memory usage, deployed container images, and firing container alerts

Real-time alerts

Real-time alerts inform you as soon as problems begin. You can jump from alert to runbook for a quick solution, create your own alerts, and copy a built-in alert to customize it.

Logs and metrics correlation

While Kubernetes doesn’t provide a native storage solution for logs, Kubernetes Monitoring uses Grafana Loki as its log aggregator. Since Loki and Prometheus share labels, you can correlate metrics and logs to identify root causes faster without configuring and using multiple technologies.
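
To make the shared-label idea concrete, here is a minimal Python sketch that builds one label selector and reuses it in a PromQL query and a LogQL query. The label names (`namespace`, `pod`), their values, and the helper function names are illustrative assumptions. `container_cpu_usage_seconds_total` is a standard cAdvisor metric, but the labels actually present in your stack depend on how collection is configured.

```python
def label_selector(labels: dict[str, str]) -> str:
    """Render labels as a {key="value", ...} selector.

    The same selector syntax is accepted by PromQL and LogQL.
    This sketch does not escape quotes inside label values.
    """
    parts = [f'{key}="{value}"' for key, value in sorted(labels.items())]
    return "{" + ", ".join(parts) + "}"


def cpu_query(labels: dict[str, str]) -> str:
    """PromQL: per-second CPU usage over 5 minutes for the selected containers."""
    return f"rate(container_cpu_usage_seconds_total{label_selector(labels)}[5m])"


def error_logs_query(labels: dict[str, str]) -> str:
    """LogQL: log lines from the same workloads that contain the text "error"."""
    return f'{label_selector(labels)} |= "error"'


# Hypothetical workload labels, used for both signals.
labels = {"namespace": "checkout", "pod": "cart-7d9f"}
print(cpu_query(labels))
print(error_logs_query(labels))
```

Because both queries are built from the same selector, a spike seen in the metric query can be followed straight to the log lines of the same Pod.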

Proactive management benefits

The features available in Kubernetes Monitoring enable you to create and implement a proactive strategy with a data-driven approach.

Early error detection

You can use built-in alerting for anomalies such as CPU throttling to learn which settings need fine-tuning. Network bandwidth and saturation are available by object. The time range selector in Kubernetes Monitoring provides a look into the history of an object, which reveals patterns such as spikes. Outlier Pod detection can uncover Pods with CPU usage differences that may lead to issues.
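
As an illustration of what outlier Pod detection involves, the following Python sketch flags Pods whose CPU usage sits far from the median of their peers. It uses a modified z-score based on the median absolute deviation, which is a common statistical technique chosen here for illustration. It is not a description of how Kubernetes Monitoring detects outliers, and the Pod names, usage values, and the 3.5 threshold are assumptions.

```python
from statistics import median


def outlier_pods(cpu_by_pod: dict[str, float], threshold: float = 3.5) -> list[str]:
    """Return the names of Pods whose CPU usage deviates strongly from the median.

    cpu_by_pod maps a Pod name to its CPU usage in cores (hypothetical input).
    """
    values = list(cpu_by_pod.values())
    if len(values) < 3:
        return []  # too few Pods to say what "typical" usage is

    center = median(values)
    # Median absolute deviation: a spread measure that outliers barely affect.
    mad = median(abs(v - center) for v in values)
    if mad == 0:
        return []  # at least half the Pods share one value; score is undefined

    # 0.6745 scales the MAD so the score is comparable to a standard z-score.
    return sorted(
        pod
        for pod, v in cpu_by_pod.items()
        if 0.6745 * abs(v - center) / mad > threshold
    )


# Hypothetical CPU usage (cores) for the Pods of one workload.
usage = {"api-0": 0.20, "api-1": 0.22, "api-2": 0.19, "api-3": 0.21, "api-4": 0.95}
print(outlier_pods(usage))
```

A median-based score is used instead of mean and standard deviation so that a single runaway Pod does not distort the baseline it is compared against.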

Cost visibility and management

Nodes, load balancers, and Persistent Volumes usually incur a separate cost from your provider, making it important to keep track of them. Auto-scaling architectures let you adapt in real time to changing demand, but can lead to rapidly spiraling costs. Kubernetes Monitoring provides visibility into these costs so you can identify where they can be reduced. With cost prediction, you can view potential future costs.

Resource efficiency management

You can mitigate the threat of an unstable infrastructure by monitoring resource usage to:

  • Ensure that there are enough allocated resources and decrease the risk of Pod eviction, as well as prevent performance degradation of your microservices and applications.
  • Eliminate unused or stranded resources.

Then you can make scheduling adjustments, such as setting affinities and anti-affinities, to enhance performance and reliability.
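
The two goals above can be sketched as a comparison of observed usage against the requested amount. The Python example below is a simplified illustration: the 30% and 90% thresholds, the category names, and the sample figures are assumptions, not values used by Kubernetes Monitoring.

```python
def classify(usage_cores: float, request_cores: float,
             low: float = 0.3, high: float = 0.9) -> str:
    """Classify a container by how much of its CPU request it actually uses."""
    if request_cores <= 0:
        return "no-request"  # nothing reserved, so the scheduler cannot plan for it
    ratio = usage_cores / request_cores
    if ratio < low:
        return "over-provisioned"   # reserved but mostly idle: stranded resources
    if ratio > high:
        return "under-provisioned"  # close to or above its request: at risk under pressure
    return "ok"


# Hypothetical (usage, request) pairs in CPU cores.
containers = {
    "cart": (0.10, 1.00),
    "search": (0.95, 1.00),
    "auth": (0.50, 1.00),
}
for name, (usage, request) in containers.items():
    print(name, classify(usage, request))
```

The same comparison applies to memory, where sustained usage above the request raises the risk of eviction when a node runs short.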

Resource usage forecasts

By looking at predicted resource usage, you can plan more accurately for a project or activity.
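
To show what a usage forecast means in its simplest form, the Python sketch below fits a straight line to evenly spaced usage samples and extrapolates it. This is an ordinary least-squares trend swapped in for illustration only. The prediction method that Kubernetes Monitoring uses is not described here, and the function name and sample data are hypothetical.

```python
def linear_forecast(samples: list[float], steps_ahead: int) -> float:
    """Extrapolate evenly spaced samples with a least-squares straight line."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")

    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(samples) / n

    # Slope and intercept of the best-fit line y = slope * x + intercept.
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, samples))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x

    # The last sample is at x = n - 1, so the forecast point is steps_ahead beyond it.
    return slope * (n - 1 + steps_ahead) + intercept


# Hypothetical daily memory usage in GiB for one namespace.
daily_usage_gib = [2.0, 4.0, 6.0, 8.0]
print(linear_forecast(daily_usage_gib, steps_ahead=2))
```

A straight line ignores seasonality and sudden changes in demand, so treat a result like this as a rough planning aid.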

What is out of the box

The out-of-the-box features that are part of Kubernetes Monitoring include:

Get started

Get started by using a streamlined configuration process with the Grafana Kubernetes Monitoring Helm chart. When you configure with the Helm chart, there’s no manual setup, and the chart includes automatic updates for all components that it installs.

Other configuration methods

Other methods are also available to configure Kubernetes Monitoring for your infrastructure data.

To configure data about an application running in Kubernetes, refer to Application metrics.