Create and manage Asserts SLOs

Asserts Service Level Objectives (SLOs) provide a framework for measuring the quality of service you provide to users. Use SLOs to collect data on the reliability of your systems over time. This data helps engineering teams reduce alert fatigue, focus on reliability, and provide better service to customers. For more information about SLOs, refer to Introduction to Grafana SLO.

This topic describes the types of SLOs you can create, and how to create and manage Asserts SLOs.

Tip

The SLO builder is interactive and displays performance data while you are creating the SLO. Take advantage of the interactivity to fine-tune your SLO by experimenting with different time ranges, targets, and thresholds.

SLO types

You can choose between two types of Asserts SLOs:

  • Availability: Measures the percentage of time that a service is available. The system calculates availability from the results of the All Events Query and the Bad Events Query as ((all events - bad events) / all events) * 100, and compares this value against the target percentage that you define to determine whether you are meeting your SLO.
  • Latency/Occurrence: Measures how responsive your system is. This type of SLO is based on a single measurement rather than a ratio of measurements. Latency/occurrence SLOs notify you when the time it takes to complete transactions is greater than the threshold you set.
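
As a rough illustration of the availability check, the following Python sketch computes availability from two event counts and compares it to a target. It assumes that the All Events Query returns the total number of requests and that availability is the share of requests that are not bad. The function names and sample counts are hypothetical and are not part of Asserts.

```python
def availability_percent(all_events: float, bad_events: float) -> float:
    """Return the percentage of events that were not bad."""
    if all_events <= 0:
        # Assumption: with no traffic, treat the service as fully available.
        return 100.0
    return (all_events - bad_events) / all_events * 100


def meets_target(all_events: float, bad_events: float, target_percent: float) -> bool:
    """Compare the measured availability against the SLO target."""
    return availability_percent(all_events, bad_events) >= target_percent


if __name__ == "__main__":
    # Hypothetical counts: 10,000 requests, 50 of which failed.
    print(f"availability: {availability_percent(10_000, 50):.2f}%")
    print("meets 99% target:", meets_target(10_000, 50, target_percent=99.0))
```

In practice, the two counts come from the PromQL queries behind the SLO rather than from hard-coded numbers.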

Simple and advanced SLOs

You can define a simple or an advanced SLO.

  • Simple: Asserts builds the PromQL expression that defines the SLO automatically as you select values from the drop-down menus.
  • Advanced: You manually define the PromQL expression, which can include metrics other than Asserts metrics.

Before you begin

To create an Asserts SLO, you need to:

  • Configure Asserts and have metrics flowing into Grafana Cloud
  • Be familiar with defining SLOs, SLIs, SLAs, and error budgets
  • Have an understanding of PromQL

Create a simple SLO

To create a simple SLO, complete the following steps.

  1. Sign in to Grafana and select Asserts > Asserts SLO.

  2. Click Add SLO.

  3. In the Basics section, select Availability or Latency/Occurrence.

  4. Enter an SLO name.

  5. Use the following table to complete the fields in the Service and APIs section.

    | Field | Description |
    | --- | --- |
    | Job | Select one or more services for which you are creating the SLO. |
    | Request Type | The Request Type field lists values based on the selected job. |
    | Request Context | Select one or more API calls. |
    | Error Type | Required only for availability SLOs. |

  6. In the RCA workbench section, enter a search expression.

    The search expression defines which entities populate the RCA workbench when you troubleshoot an SLO.

    This value automatically populates based on the job you selected. You can override this value by selecting another search expression.

  7. Use the following table to complete the fields in the Target Objectives section.

    | Field | Description |
    | --- | --- |
    | Compliance Window (days) | The time period over which the system burns down the error budget before the budget resets. For example, if the compliance window is 7 days and the target is 99%, your error budget is 1%: you are committing to the service missing its objective for no more than 1% of the seven-day period. |
    | Target (%) | The percent of time that you are committing to a service being available. |
    | Threshold (seconds) | Required only for latency/occurrence SLOs. The maximum time that transactions between services can take before you are notified. For example, if the target is 99% and the threshold is 0.10, the system notifies you when transactions complete within 0.10 seconds less than 99% of the time. |

  8. Click Add new SLO.
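
To make the relationship between the compliance window, the target, and the error budget concrete, here is a minimal Python sketch. It assumes the budget is counted in minutes of the compliance window. The function names and sample values are hypothetical and do not reflect how Asserts stores or evaluates the budget internally.

```python
MINUTES_PER_DAY = 24 * 60


def error_budget_minutes(compliance_window_days: int, target_percent: float) -> float:
    """Minutes in the window that may be bad before the target is missed."""
    window_minutes = compliance_window_days * MINUTES_PER_DAY
    return window_minutes * (100 - target_percent) / 100


def budget_remaining_percent(bad_minutes_so_far: float,
                             compliance_window_days: int,
                             target_percent: float) -> float:
    """Share of the error budget that is still unspent, clamped at zero."""
    budget = error_budget_minutes(compliance_window_days, target_percent)
    if budget <= 0:
        # A 100% target leaves no budget to spend.
        return 0.0
    return max(0.0, (budget - bad_minutes_so_far) / budget * 100)


if __name__ == "__main__":
    # Hypothetical SLO: 7-day window, 99% target, 25 bad minutes so far.
    print(f"budget: {error_budget_minutes(7, 99.0):.1f} minutes")
    print(f"remaining: {budget_remaining_percent(25, 7, 99.0):.1f}%")
```

Trying a few targets this way can help when you fine-tune the values in the interactive SLO builder.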

Create an advanced availability SLO

To create an advanced availability SLO, complete the following steps.

  1. Sign in to Grafana and select Asserts > Asserts SLO.

  2. Click Add SLO.

  3. Click the Advanced tab.

  4. Click Availability.

  5. Enter an SLO name.

  6. Use the following table to complete the fields in the Service and APIs section.

    | Field | Description |
    | --- | --- |
    | All Events Query | Manually enter the all events query that queries for successful transactions. |
    | Bad Events Query | Manually enter the bad events query that queries for error transactions. |

  7. In the RCA workbench section, enter a search expression.

    The search expression defines which entities populate the RCA workbench when you troubleshoot an SLO.

    This value automatically populates based on the job identified in the query. You can override this value by selecting another search expression.

  8. Use the following table to complete the fields in the Target Objectives section.

    | Field | Description |
    | --- | --- |
    | Compliance Window (days) | The time period over which the system burns down the error budget before the budget resets. For example, if the compliance window is 7 days and the target is 99%, your error budget is 1%: you are committing to the service missing its objective for no more than 1% of the seven-day period. |
    | Target (%) | The percent of time that you are committing to a service being available. |

  9. Click Add new SLO.

Create an advanced latency/occurrence SLO

To create an advanced latency/occurrence SLO, complete the following steps.

  1. Sign in to Grafana and select Asserts > Asserts SLO.

  2. Click Add SLO.

  3. Click the Advanced tab.

  4. Click Latency/Occurrence.

  5. Enter an SLO name.

  6. In the SLI Details section, enter a measurement query.

    This query defines the service that you want to track.

  7. In the RCA workbench section, enter a search expression.

    The search expression defines which entities populate the RCA workbench when you troubleshoot an SLO.

    This value automatically populates based on the job identified in the query. You can override this value by selecting another search expression.

  8. Use the following table to complete the fields in the Target Objectives section.

    | Field | Description |
    | --- | --- |
    | Compliance Window (days) | The time period over which the system burns down the error budget before the budget resets. For example, if the compliance window is 7 days and the target is 99%, your error budget is 1%: you are committing to no more than 1% bad minutes over the seven-day period. |
    | Target (%) | The target number of good minutes in the compliance window, expressed as a percentage. |
    | Threshold (seconds) | The maximum time that transactions between services can take before you are notified. For example, if the target is 99% and the threshold is 0.10, the system notifies you when transactions complete within 0.10 seconds less than 99% of the time. |

  9. Click Add new SLO.

    The system calculates the P99 latency for each minute. For example, if the threshold is 100 milliseconds, each minute in which the P99 latency is within the threshold is a good minute, and each minute in which the P99 latency is above the threshold is a bad minute.
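
The good-minute and bad-minute classification can be sketched in a few lines of Python. This is an illustration only: it assumes one P99 sample per minute, expressed in seconds, and treats a value exactly equal to the threshold as within it. The function names and sample data are hypothetical.

```python
from typing import Sequence, Tuple


def classify_minutes(p99_per_minute: Sequence[float],
                     threshold_seconds: float) -> Tuple[int, int]:
    """Return (good_minutes, bad_minutes) for one P99 sample per minute."""
    good = sum(1 for p99 in p99_per_minute if p99 <= threshold_seconds)
    bad = len(p99_per_minute) - good
    return good, bad


def good_minutes_percent(p99_per_minute: Sequence[float],
                         threshold_seconds: float) -> float:
    """Percentage of minutes whose P99 stayed within the threshold."""
    good, bad = classify_minutes(p99_per_minute, threshold_seconds)
    total = good + bad
    # Assumption: with no samples, report full compliance.
    return 100.0 if total == 0 else good / total * 100


if __name__ == "__main__":
    # Hypothetical P99 latencies in seconds, one per minute.
    samples = [0.08, 0.09, 0.12, 0.10, 0.25]
    good, bad = classify_minutes(samples, threshold_seconds=0.10)
    print(f"good={good} bad={bad}")
    print(f"{good_minutes_percent(samples, 0.10):.2f}% good minutes")
```

The resulting percentage is the figure that the Target (%) field is compared against over the compliance window.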

    The following table summarizes the SLO in terms of tolerated bad minutes or expected good minutes in a day.

    | Bad minutes per day | Good minutes per day | SLO (%) | 30-day error budget (minutes) | Fast burn (2% in 1 hour) | Slow burn (5% in 6 hours) | Fast burn alert behavior |
    | --- | --- | --- | --- | --- | --- | --- |
    | 1 | 1439 | 99.93 | 30 | 36 seconds | 1 minute 30 seconds | Triggers for every bad minute. Each alert lasts for 5 minutes. |
    | 5 | 1435 | 99.65 | 150 | 3 minutes | 7 minutes 30 seconds | Triggers on the 4th and every subsequent bad minute in a 1 hour window. Each alert lasts for 5 minutes. |
    | 10 | 1430 | 99.31 | 300 | 6 minutes | 15 minutes | Triggers on the 7th and every subsequent bad minute in a 1 hour window. Each alert lasts for 5 minutes. |
    | 15 | 1425 | 98.95 | 450 | 9 minutes | 22 minutes 30 seconds | Triggers on the 10th and every subsequent bad minute in a 1 hour window. Each alert lasts for 5 minutes. |
    | 30 | 1410 | 97.91 | 900 | 18 minutes | 45 minutes | Triggers when there are 2 bad minutes in the last 5 minutes and 19 bad minutes in the last 1 hour. Each alert lasts for 5 minutes. |
    | 36 | 1404 | 97.50 | 1080 | 21 minutes 36 seconds | 54 minutes | Triggers when there are 2 bad minutes in the last 5 minutes and 22 bad minutes in the last 1 hour. Each alert lasts for 5 minutes. |
    | 60 | 1380 | 95.83 | 1800 | 36 minutes | 90 minutes | Triggers when there are 2 bad minutes in the last 5 minutes and 37 bad minutes in the last 1 hour. Each alert lasts for 5 minutes. |
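
The arithmetic behind the table can be reproduced with a short Python sketch. It assumes a 1,440-minute day and a 30-day window, and takes the fast-burn and slow-burn allowances as 2% and 5% of the 30-day budget. The trigger count is inferred from the pattern in the table (the first bad minute in an hour that exceeds the fast-burn allowance) and is not a documented Asserts alert rule. All names in the code are hypothetical.

```python
MINUTES_PER_DAY = 24 * 60
WINDOW_DAYS = 30


def format_duration(seconds: int) -> str:
    """Render whole seconds as 'M min S s', omitting zero parts."""
    minutes, secs = divmod(seconds, 60)
    parts = []
    if minutes:
        parts.append(f"{minutes} min")
    if secs or not parts:
        parts.append(f"{secs} s")
    return " ".join(parts)


def burn_summary(bad_minutes_per_day: int) -> dict:
    """Derive one table row from the tolerated bad minutes per day."""
    good = MINUTES_PER_DAY - bad_minutes_per_day
    budget_minutes = bad_minutes_per_day * WINDOW_DAYS
    budget_seconds = budget_minutes * 60
    # Integer arithmetic keeps the 2% and 5% allowances exact.
    fast_burn_seconds = budget_seconds * 2 // 100
    slow_burn_seconds = budget_seconds * 5 // 100
    return {
        "good_minutes": good,
        "slo_percent": good / MINUTES_PER_DAY * 100,
        "budget_minutes": budget_minutes,
        "fast_burn_seconds": fast_burn_seconds,
        "slow_burn_seconds": slow_burn_seconds,
        # Inferred: the first bad minute that exceeds the fast-burn allowance.
        "fast_burn_trigger_minute": fast_burn_seconds // 60 + 1,
    }


if __name__ == "__main__":
    row = burn_summary(5)
    print(f"SLO: {row['slo_percent']:.2f}%")
    print("30-day budget:", row["budget_minutes"], "minutes")
    print("fast burn:", format_duration(row["fast_burn_seconds"]))
    print("slow burn:", format_duration(row["slow_burn_seconds"]))
    print("fast burn triggers on bad minute:", row["fast_burn_trigger_minute"])
```

The sketch does not model the additional "2 bad minutes in the last 5 minutes" condition that the table lists for larger budgets.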

View and edit SLOs

You can edit an SLO at any time.

To view and edit an SLO, complete the following steps:

  1. Sign in to Grafana and select Asserts > Asserts SLO.

    You can see SLO performance on the SLO page.

  2. To edit an SLO, click the edit icon next to the SLO.

    The edit page opens, where you can make your changes.

  3. Click Update SLO.