Investigate incidents with Sift
Sift is a diagnostic assistant, powered by Grafana Machine Learning, that runs automated investigations on your infrastructure telemetry to help you identify critical details during incidents.
This topic explains how to use Sift to accelerate your incident investigation process in Grafana IRM.
Before you begin
To use Sift with Grafana IRM, you need:
- Access to an active incident in Grafana IRM
- A Grafana Cloud stack with Grafana Machine Learning enabled
- Kubernetes telemetry in your Grafana Cloud stack (currently required for most Sift capabilities)
About Sift investigations
Sift analyzes your telemetry data to identify potential issues related to your infrastructure and applications. When used during an incident, Sift can:
- Detect anomalies in your metrics data
- Identify suspicious patterns in logs and traces
- Suggest possible root causes for observed issues
- Recommend next steps for investigation
For more details about how Sift works and what checks are performed, refer to the Sift Machine Learning documentation.
Start a Sift investigation
Note
Sift investigations are currently focused on Kubernetes-centered stacks and require a cluster and namespace to perform checks. Future versions will support additional monitoring environments.
You can use Sift in Grafana IRM in two ways:
Run a Sift investigation from an incident
To initiate a Sift investigation directly from an incident:
- Navigate to the incident details page.
- Locate the Suggestions panel in the right sidebar of the incident timeline.
- Click Start Sift investigation.
- Enter the required information:
  - Cluster: The Kubernetes cluster to investigate
  - Namespace: The Kubernetes namespace to investigate
- Click Start investigation.
Note
When a Sift investigation is triggered from an incident, the time range is automatically set from the incident start time to the current time.
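To make the inputs described above concrete, the following Python sketch gathers the cluster, the namespace, and the automatically derived time range into one structure. The `InvestigationInputs` class and the `inputs_from_incident` function are hypothetical names used only for illustration; they are not part of any Sift or Grafana IRM API, and the cluster and namespace values are placeholders.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class InvestigationInputs:
    """Hypothetical container; not a Sift or Grafana IRM type."""
    cluster: str
    namespace: str
    start: datetime  # incident start time
    end: datetime    # time the investigation is triggered


def inputs_from_incident(
    cluster: str,
    namespace: str,
    incident_start: datetime,
    now: Optional[datetime] = None,
) -> InvestigationInputs:
    """Build inputs whose time range runs from the incident start to now."""
    if not cluster or not namespace:
        raise ValueError("cluster and namespace are both required")
    end = now if now is not None else datetime.now(timezone.utc)
    if incident_start.tzinfo is None or end.tzinfo is None:
        raise ValueError("timestamps must be timezone-aware")
    if end < incident_start:
        raise ValueError("incident start time is in the future")
    return InvestigationInputs(cluster, namespace, incident_start, end)


if __name__ == "__main__":
    # Placeholder values for illustration only.
    started = datetime(2024, 5, 1, 9, 0, tzinfo=timezone.utc)
    inputs = inputs_from_incident("prod-eu-1", "checkout", started)
    print(f"Investigating {inputs.cluster}/{inputs.namespace} "
          f"from {inputs.start.isoformat()} to {inputs.end.isoformat()}")
```

The sketch rejects an empty cluster or namespace because both are required before an investigation can start, and it uses timezone-aware timestamps so the range is unambiguous.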
Add dashboards to the incident timeline
When you add dashboards to an incident timeline, Sift can extract context from them:
- Add a dashboard to the incident timeline following the timeline documentation.
- Ensure the dashboard includes cluster and namespace variables or references.
- Sift uses this information to perform relevant investigations tied to the incident.
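Before adding a dashboard to the timeline, you may want to confirm that it defines the expected variables. The following Python sketch inspects an exported dashboard JSON document, where template variables are listed under `templating.list`, and reports which of `cluster` and `namespace` are missing. The function name `missing_sift_variables` is hypothetical, and how Sift itself reads dashboards is not described here; this check covers only variables, not other references to a cluster or namespace inside panel queries.

```python
import json

# Variable names this page says Sift looks for.
REQUIRED_VARIABLES = {"cluster", "namespace"}


def missing_sift_variables(dashboard: dict) -> set:
    """Return the required variable names the dashboard does not define."""
    templating = dashboard.get("templating") or {}
    defined = {
        variable.get("name")
        for variable in templating.get("list") or []
        if isinstance(variable, dict)
    }
    return REQUIRED_VARIABLES - defined


if __name__ == "__main__":
    # A trimmed-down dashboard export used only as sample input.
    exported = json.loads("""
    {
      "title": "Checkout overview",
      "templating": {
        "list": [
          {"name": "cluster", "type": "query"},
          {"name": "pod", "type": "query"}
        ]
      }
    }
    """)
    print(sorted(missing_sift_variables(exported)))  # prints ['namespace']
```

An empty result means both variables are present.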
Manage Sift suggestions
After Sift completes its checks, the results appear in the Suggestions panel in the right sidebar of the incident timeline.
View Sift suggestions
To review Sift’s findings:
- Navigate to the incident details page.
- Locate the Suggestions panel in the right sidebar.
- Click the view details icon (eye symbol) on a suggestion to see the complete analysis.
Add suggestions to the timeline
To share important Sift findings with other responders:
- Locate the suggestion you want to share in the Suggestions panel.
- Click the + icon next to the suggestion.
- The suggestion is added to the incident timeline, where all participants can see it.
Adding suggestions to the timeline helps provide context and valuable information to all stakeholders involved in resolving the incident.
Remove suggestions
If a suggestion isn’t relevant to the current incident:
- Locate the suggestion you want to remove in the Suggestions panel.
- Click the trash can icon next to the suggestion.
- The suggestion is removed from the list.