Ask us anything: Should I run Prometheus in a container?

2019-05-07 2 min

At Grafana Labs, we field questions about best practices from customers all the time. One company recently asked whether it should run a containerized Prometheus environment rather than a VM-based one. We thought we’d share our answer here too.

So: Should you run Prometheus in a container?

If you’re monitoring services in Kubernetes, you probably want to run Prometheus in Kubernetes, and therefore as a container. This is because Prometheus needs to be able to directly connect to every target to scrape metrics. Kubernetes gives every Pod a unique IP address, but typically these are only accessible within the Kubernetes cluster.
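
As a rough illustration, here is a minimal scrape configuration that uses Kubernetes service discovery to find Pods and connect to their Pod IPs directly. The job name and the opt-in annotation are a common convention, not something Prometheus requires:

```yaml
# Minimal sketch of a scrape config using Kubernetes service discovery.
# Prometheus connects straight to each Pod's IP; the opt-in annotation
# below is a common convention, not a Prometheus built-in.
scrape_configs:
  - job_name: "kubernetes-pods"
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      # Keep only Pods annotated with prometheus.io/scrape: "true".
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: "true"
```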

In general, you want your Prometheus servers to run as close to your services as possible. If you have multiple private networks that can’t talk to each other, you will most likely need multiple Prometheus servers: one per private network, or two per network if you want high availability (HA).
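
For the HA case, the usual pattern is simply two identically configured Prometheus servers scraping the same targets. A minimal sketch, assuming you tell them apart with an external replica label (the label names and values here are illustrative):

```yaml
# Sketch: both HA replicas load the same scrape config; only the
# external "replica" label differs between the two servers.
global:
  external_labels:
    cluster: private-net-a
    replica: "1"   # set to "2" on the second server
```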

Beyond that, Prometheus is a single binary with no dependencies. It runs equally well inside or outside a container. Prometheus scales “up” very well, and for large Kubernetes clusters, it’s common to dedicate an entire node to Prometheus – even if it’s still a container.
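
If you do dedicate a node to Prometheus, that usually comes down to ordinary Kubernetes scheduling. A hypothetical sketch using a node label and generous resource requests; the label, image tag, and sizes are assumptions, not recommendations:

```yaml
# Hypothetical sketch: pin Prometheus to a dedicated node via a node
# label and reserve most of its resources. Persistent storage is
# omitted for brevity.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: prometheus
spec:
  serviceName: prometheus
  replicas: 1
  selector:
    matchLabels:
      app: prometheus
  template:
    metadata:
      labels:
        app: prometheus
    spec:
      nodeSelector:
        dedicated: prometheus   # assumes the node carries this label
      containers:
        - name: prometheus
          image: prom/prometheus:v2.45.0
          resources:
            requests:
              cpu: "4"
              memory: 16Gi
```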

Some of the other components in the system, such as node_exporter (for node-level metrics), are a little trickier to run in containers. They need direct access to lots of kernel interfaces. So in those cases, running directly on the host may be preferable. We run them as a DaemonSet on our Kubernetes cluster, with some special config to map through the right interfaces.
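
The gist of that special config is giving the container host-level access. Here is a sketch of a node_exporter DaemonSet along those lines; the flags and mount paths follow common practice rather than our exact manifest:

```yaml
# Sketch of a node_exporter DaemonSet with host access.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-exporter
spec:
  selector:
    matchLabels:
      app: node-exporter
  template:
    metadata:
      labels:
        app: node-exporter
    spec:
      hostNetwork: true   # expose metrics on the node's own IP
      hostPID: true       # let node_exporter see host processes
      containers:
        - name: node-exporter
          image: prom/node-exporter:v1.6.1
          args:
            - --path.procfs=/host/proc
            - --path.sysfs=/host/sys
          volumeMounts:
            - name: proc
              mountPath: /host/proc
              readOnly: true
            - name: sys
              mountPath: /host/sys
              readOnly: true
      volumes:
        - name: proc
          hostPath:
            path: /proc
        - name: sys
          hostPath:
            path: /sys
```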

Now comes the question: Who monitors the monitor? What if the Kubernetes cluster is having issues and Prometheus can’t notify you about them? In that case, you should make Prometheus emit an alert that always fires (a.k.a. a Dead Man’s Switch) and have an external service notify you if that alert stops arriving. This confirms that Prometheus and the entire alerting pipeline are working correctly.
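
A minimal sketch of such an always-firing rule; the alert name and labels are illustrative, and you still need an external service outside the cluster watching for it:

```yaml
# Sketch of a "Dead Man's Switch" alerting rule. vector(1) always
# evaluates to a value, so the alert fires continuously; an external
# service should page you if it ever stops arriving.
groups:
  - name: meta
    rules:
      - alert: DeadMansSwitch
        expr: vector(1)
        labels:
          severity: none
        annotations:
          summary: "Always-firing alert used to verify the alerting pipeline."
```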

Got a question for us about monitoring best practices? Email us at help@grafana.com.

To learn more, see all of our Prometheus-related blog posts.