Note

Fleet Management is currently in public preview. Grafana Labs offers limited support, and breaking changes might occur prior to the feature being made generally available. For bug reports or questions, fill out our feedback form.

Check the status of your fleet

Grafana Fleet Management provides a health status indicator so you can see at a glance if your collectors are healthy. You can find a collector’s health status in the Status column on the Inventory tab in the Fleet Management interface.

The health status indicator reflects the current state of the collector:

  • Green (Healthy) means the collector is healthy.
  • Yellow (Warning) means the collector is potentially unhealthy.
  • Red (Error) means the collector is unhealthy.
  • Gray (Unknown) means the collector is not reporting data.

To filter your fleet by status, use the Status dropdown on the Inventory tab.

How health status is determined

The health status is controlled by three factors:

  • Has the collector made a GetConfig API request in the last 30 minutes?
  • Is the collector reporting an up metric with the correct collector_id label?
  • Does the collector have any active alerts?

Note

The health status does not check for configuration errors.

The Fleet Management service fetches active alerts from the Grafana Prometheus instance. Alerts must exist in your stack’s Prometheus Alertmanager to be discoverable by the health status check. Refer to Create new alerts for guidance on labeling alerts, and to the Alertmanager documentation for tips on using Mimirtool to configure Alertmanager.

Green status

A green health status indicates that the collector is operational. At minimum, an operational collector:

  • Has no active alerts.
  • Made a GetConfig API request in the last 30 minutes or reported an up metric.

Note

A false-positive healthy status can result if the collector_id label in an alert does not match the id argument in the remotecfg block of the collector. If the labels do not match, the alerts cannot be attributed to the collector. When Fleet Management checks the state of a mismatched collector, the service finds no active alerts.
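
The attribution problem described in this note can be illustrated with a short Python sketch. It is not the Fleet Management implementation. It assumes alerts are represented as plain dictionaries of labels, and the function name, alert names, and collector IDs are hypothetical.

```python
from typing import Dict, List


def alerts_for_collector(
    collector_id: str, alerts: List[Dict[str, str]]
) -> List[Dict[str, str]]:
    """Return the alerts whose collector_id label exactly matches the collector's id.

    Assumption: matching is an exact string comparison on the collector_id label.
    """
    return [a for a in alerts if a.get("collector_id") == collector_id]


# Hypothetical active alert, labeled with a collector_id that has a typo.
active_alerts = [
    {"alertname": "HighMemory", "severity": "critical", "collector_id": "prod-host-1"},
]

# The collector's remotecfg id is "prod-host-01", so nothing is attributed to it
# and a status check based on these alerts would find no active alerts.
mismatched = alerts_for_collector("prod-host-01", active_alerts)

# With a matching id, the alert is attributed to the collector as expected.
matched = alerts_for_collector("prod-host-1", active_alerts)
```

In this sketch, `mismatched` is empty even though a critical alert is firing, which is how a collector could appear green while it has an unattributed alert.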

Yellow status

A yellow health status warns that there might be an issue with the collector. It can have two causes:

  • The collector has an active, non-critical alert; or
  • The collector has not made a GetConfig request in the last 30 minutes and it is not reporting an up metric.

Note

If your collector is not self-reporting its own metrics with the collector_id label, you might see a yellow health status even if the collector is healthy. Collectors in your Fleet Management Inventory should automatically be assigned the self_monitoring_metrics pipeline. If you see a yellow health status, make sure the pipeline is active.

Red status

A red health status indicates that the collector has an active, critical alert.

Gray status

A gray health status indicates that the collector has not reported telemetry. It has never had a heartbeat, doesn’t have an up metric within the data retention period, and has no active alerts.
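
The rules for the four statuses can be summarized as a small decision function. The following Python sketch is one reading of the descriptions above, not the service's actual logic. The `CollectorState` type, its field names, the `"critical"` severity string, and the order in which the rules are evaluated are all assumptions made for illustration. It also assumes that "never had a heartbeat" means the collector has never made a GetConfig request.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List, Optional

# The 30-minute window comes from the text above.
HEARTBEAT_WINDOW = timedelta(minutes=30)


@dataclass
class CollectorState:
    """Hypothetical snapshot of what is known about one collector."""
    last_get_config: Optional[datetime] = None  # None: never made a GetConfig request
    reporting_up: bool = False                  # currently reporting an up metric
    up_within_retention: bool = False           # any up metric in the retention period
    alert_severities: List[str] = field(default_factory=list)  # active alerts only


def health_status(state: CollectorState, now: datetime) -> str:
    """Map a collector snapshot to green, yellow, red, or gray."""
    # Red: at least one active, critical alert.
    if "critical" in state.alert_severities:
        return "red"
    # Yellow: active alerts, none of them critical.
    if state.alert_severities:
        return "yellow"
    recent_heartbeat = (
        state.last_get_config is not None
        and now - state.last_get_config <= HEARTBEAT_WINDOW
    )
    # Green: no active alerts, plus a recent GetConfig request or an up metric.
    if recent_heartbeat or state.reporting_up:
        return "green"
    # Gray: no heartbeat ever, no up metric in retention, and no alerts.
    if state.last_get_config is None and not state.up_within_retention:
        return "gray"
    # Yellow: the collector was seen before but is silent now.
    return "yellow"
```

For example, under these assumptions a collector that last requested its configuration two hours ago and is not reporting an up metric evaluates to yellow, while a collector that has never been seen at all evaluates to gray.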

Next steps

If your collector is unhealthy or potentially unhealthy, refer to Troubleshoot an unhealthy collector for help diagnosing the problem.