Grafana Incident

Automate, document, and learn from your incident management, all in Grafana Cloud

Incident timeline UI in Grafana Incident

Grafana Incident, which is part of Grafana IRM, is an incident management tool that automates routine tasks so your team can stay focused on resolving incidents, and learning from them, faster.

All the info you need, documented for you

Learn from comprehensive incident timelines to mitigate future problems. No matter when or where you declare an incident, you’ll capture all the crucial information, from start to finish.

Your single source of truth

No more scrambling to find the right information scattered across platforms. Centralize communication and avoid confusion during incidents.

Focus on what matters most

Automate manual administrative work so engineers can stay on task and remediate issues faster.

Declare incidents quickly and easily

You can trigger an incident from any Grafana visualization, the moment you find a concerning issue.

  • Embed the right context, such as relevant dashboards and metrics, for responders.
  • Assign appropriate severity levels when declaring an incident to prioritize response efforts, and invite participants to collaborate.
  • Go from monitoring to active incident response within the Grafana ecosystem, and create incident channels and collaboration spaces directly from your visualizations.
Declare incident UI in Grafana Incident

Document everything that matters

Automatically generate a single source of truth for each incident so you can better track ongoing events and improve future responses.

  • Maintain a centralized record for each incident so all team members have access to consistent, accurate information.
  • Capture key decisions and updates throughout the incident lifecycle, including relevant Grafana panels and critical Slack notifications.
  • Automatically organize events, communications, and actions into a clear, chronological timeline to track the incident’s progression and response efforts.
  • Convert the incident timeline into a structured post-incident review (PIR) document, and streamline the review process with pre-populated incident data and key events.
Incident response management timeline UI in Grafana Incident

Analyze and improve your incident management operations

Get a comprehensive overview of your incident ops performance so you can continuously improve your processes with data-driven insights.

  • Get a high-level view of your incident management workflows, and gain valuable insights into the trends and patterns in your incident frequency and resolutions.
  • Track key metrics such as mean time to resolution (MTTR) and mean time to detection (MTTD), identify bottlenecks, and analyze performance by various dimensions, including label, severity, or status.
  • Gather data on incident frequency and types to optimize your observability and response strategies.
Incidents insights in Grafana Incident
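
To make the metrics above concrete, here is a minimal sketch of how MTTD and MTTR can be computed from a list of incidents and broken down by severity. The record fields (`started`, `detected`, `resolved`, `severity`) and the sample data are hypothetical and are not the Grafana Incident data model. The sketch also assumes both metrics are measured from the moment the problem started; some teams measure MTTR from detection instead.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical incident records for illustration only.
incidents = [
    {"started": "2024-05-01T10:00:00", "detected": "2024-05-01T10:05:00",
     "resolved": "2024-05-01T11:00:00", "severity": "critical"},
    {"started": "2024-05-02T14:00:00", "detected": "2024-05-02T14:15:00",
     "resolved": "2024-05-02T14:45:00", "severity": "minor"},
    {"started": "2024-05-03T09:00:00", "detected": "2024-05-03T09:10:00",
     "resolved": "2024-05-03T10:15:00", "severity": "critical"},
]


def mean_delta(records, start_key, end_key):
    """Average the elapsed time between two timestamps across records."""
    deltas = [
        datetime.fromisoformat(r[end_key]) - datetime.fromisoformat(r[start_key])
        for r in records
    ]
    return sum(deltas, timedelta()) / len(deltas)


# Mean time to detection: problem start -> detection (assumed definition).
mttd = mean_delta(incidents, "started", "detected")

# Mean time to resolution: problem start -> resolution (assumed definition).
mttr = mean_delta(incidents, "started", "resolved")

# Break MTTR down by one dimension, here severity.
by_severity = defaultdict(list)
for record in incidents:
    by_severity[record["severity"]].append(record)

mttr_by_severity = {
    severity: mean_delta(records, "started", "resolved")
    for severity, records in by_severity.items()
}

print("MTTD:", mttd)
print("MTTR:", mttr)
for severity, value in mttr_by_severity.items():
    print(f"MTTR ({severity}):", value)
```

Grouping by label or status instead of severity works the same way: change the key used to build `by_severity`.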

Reduce MTTR with machine learning

Surface problems to reduce MTTR and MTTD with Sift, a powerful diagnostic assistant in Grafana Cloud.

  • Automatically scan metrics and logs to provide a holistic view of your system’s health, and correlate data from multiple sources to identify complex, interconnected issues.
  • Begin an incident with a Sift Check and automatically gather relevant context and data to jumpstart the incident response process.
  • Sift continuously improves its detection capabilities based on feedback and outcomes and adapts to evolving system behaviors and new types of issues over time.
Grafana Incident UI showing error pattern logs

Incident response and management on the go

With the IRM mobile app, you can handle critical situations from anywhere.

Personalized notifications:

  • Receive push notifications tailored to your personal preferences.
  • Override “do not disturb” settings for critical emergencies.

On-call schedules at your fingertips:

  • Review on-call rotation details anytime, anywhere.
  • Quickly check upcoming shifts and team availability.
  • Easily request shift swaps with your team.

Incident details on demand:

  • Acknowledge, respond to, or escalate incidents directly from your mobile device.
  • Access comprehensive incident information to make informed decisions.
Grafana OnCall app alert groups

Get started with incident response and management in Grafana Cloud

2. Set up integrations to your favorite apps, such as Slack, where you can add the Grafana Incident chatbot to the relevant channel.

3. Configure notifications

Decide how each user will receive notifications and create escalations.

4. Set up on-call schedules and start declaring incidents

Establish on-call schedules within the UI and declare your first drill incident.

For full implementation details and best practices,

Get Grafana IRM in Grafana Cloud

Detect, respond, and learn. Grafana IRM simplifies the incident workflow to help you focus on managing incidents, not your tools.

Cloud Free

No payment. Ever.
Best suited for early stage and small teams with up to 3 active IRM users per month.
Easiest way to get started

Cloud Pro

Pay as you go
Best suited for growing teams that need to scale above 3 active IRM users and unlock 8x5 support.

Cloud Advanced

Premium bundle
Best suited for teams looking to scale above 3 active IRM users and unlock 24x7 support.

Easily connect to more Grafana Cloud tools

Grafana Alerting

Unify alert management across your entire stack with powerful, flexible rules and notifications.

Grafana SLO

Define service level objectives and create error-budget alerts to catch issues before your customers do.
