Create and manage Asserts SLOs
Asserts Service Level Objectives (SLOs) provide a framework for measuring the quality of service you provide to users. Use SLOs to collect data on the reliability of your systems over time and, as a result, help engineering teams reduce alert fatigue, focus on reliability, and provide better service to your customers. For more information about SLOs, refer to Introduction to Grafana SLO.
This topic describes the types of SLOs you can create, and how to create and manage Asserts SLOs.
Tip
The SLO builder is interactive and displays performance data while you are creating the SLO. Take advantage of the interactivity to fine-tune your SLO by experimenting with different time ranges, targets, and thresholds.
SLO types
You can choose between two types of Asserts SLOs:
- Availability: Measures the percentage of time that a service is available. The system calculates availability as:
  ((Number of All Events - Number of Bad Events) / Number of All Events) * 100
  Here, the event counts come from the All Events Query and the Bad Events Query. This value is then compared against the target percentage that you define to determine whether you are meeting your SLO.
- Latency/Occurrence: Measures how responsive your system is. This type of SLO is based on a single measurement rather than a ratio of measurements. Latency/occurrence SLOs notify you when the time it takes to complete transactions is greater than the threshold you set.
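The comparison against the target can be sketched in a few lines of Python. This is only an illustration, assuming that availability means the share of all events that are not bad events; the function names are hypothetical and are not part of Asserts.

```python
def availability_percent(all_events: float, bad_events: float) -> float:
    """Return the percentage of events that were not bad (assumed definition)."""
    if all_events <= 0:
        raise ValueError("all_events must be positive")
    if bad_events < 0 or bad_events > all_events:
        raise ValueError("bad_events must be between 0 and all_events")
    return (all_events - bad_events) / all_events * 100


def meets_target(all_events: float, bad_events: float, target_percent: float) -> bool:
    """Compare the measured availability with the target you defined."""
    return availability_percent(all_events, bad_events) >= target_percent


if __name__ == "__main__":
    # Hypothetical sample: 10,000 requests, 50 of which failed.
    print(availability_percent(10_000, 50))
    print(meets_target(10_000, 50, target_percent=99.0))
```

With these sample numbers the measured availability is above a 99% target, so the SLO is met. Raising the target to 99.9% would flip the result.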
Simple and advanced SLOs
You can define a simple or an advanced SLO.
- Simple: The PromQL expression that defines the SLO is built automatically as you select values from the drop-down menus.
- Advanced: You manually define a PromQL expression, which can include metrics other than Asserts metrics.
Before you begin
To create an Asserts SLO, you need to:
- Configure Asserts and have metrics flowing into Grafana Cloud
- Be familiar with defining SLOs, SLIs, SLAs, and error budgets
- Have an understanding of PromQL
Create a simple SLO
To create a simple SLO, complete the following steps.
Sign in to Grafana and select Asserts > Asserts SLO.
Click Add SLO.
In the Basics section, select Availability or Latency/Occurrence.
Enter an SLO name.
Use the following table to complete the fields in the Service and APIs section.
| Field | Description |
| --- | --- |
| Job | Select one or more services for which you are creating the SLO. |
| Request Type | The Request Type field lists values based on the selected job. |
| Request Context | Select one or more API calls. |
| Error Type | Required only for availability SLOs. |

In the RCA workbench section, enter a search expression.
The search expression defines which entities populate the RCA workbench when you troubleshoot an SLO.
This value automatically populates based on the job you selected. You can override this value by selecting another search expression.
Use the following table to complete the fields in the Target Objectives section.
| Field | Description |
| --- | --- |
| Compliance Window (days) | The time period over which the system burns down the error budget until the budget resets. For example, if the compliance window is 7 days and the target is 99%, then you are committing to an error budget of no more than 1% over a seven-day period. |
| Target (%) | The percentage of time that you are committing to a service being available. |
| Threshold (seconds) | Required only for latency/occurrence SLOs. The threshold value defines the maximum amount of time that transactions between services can take before you are notified. For example, if the target is 99% and the threshold is 0.10, the system notifies you when transactions complete within 0.10 seconds less than 99% of the time. |

Click Add new SLO.
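To get a feel for how the compliance window and target interact, the error budget can be expressed as an amount of time. The following sketch assumes the budget is measured in minutes of the compliance window, which is an interpretation chosen for illustration; the helper name is hypothetical.

```python
MINUTES_PER_DAY = 24 * 60


def error_budget_minutes(window_days: int, target_percent: float) -> float:
    """Minutes in the compliance window that may miss the objective.

    Assumes the budget is the share of the window not covered by the target.
    """
    if window_days <= 0:
        raise ValueError("window_days must be positive")
    if not 0 < target_percent < 100:
        raise ValueError("target_percent must be between 0 and 100")
    total_minutes = window_days * MINUTES_PER_DAY
    return total_minutes * (100 - target_percent) / 100


if __name__ == "__main__":
    # A 7-day window with a 99% target leaves 1% of 10,080 minutes.
    print(error_budget_minutes(7, 99.0))
```

Trying a few window and target combinations this way can help when you experiment with values in the SLO builder.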
Create an advanced availability SLO
To create an advanced availability SLO, complete the following steps.
Sign in to Grafana and select Asserts > Asserts SLO.
Click Add SLO.
Click the Advanced tab.
Click Availability.
Enter an SLO name.
Use the following table to complete the fields in the Service and APIs section.
| Field | Description |
| --- | --- |
| All Events Query | Manually enter the all events query, which queries for all transactions. |
| Bad Events Query | Manually enter the bad events query, which queries for error transactions. |

In the RCA workbench section, enter a search expression.
The search expression defines which entities populate the RCA workbench when you troubleshoot an SLO.
This value automatically populates based on the job identified in the query. You can override this value by selecting another search expression.
Use the following table to complete the fields in the Target Objectives section.
| Field | Description |
| --- | --- |
| Compliance Window (days) | The time period over which the system burns down the error budget until the budget resets. For example, if the compliance window is 7 days and the target is 99%, then you are committing to an error budget of no more than 1% over a seven-day period. |
| Target (%) | The percentage of time that you are committing to a service being available. |

Click Add new SLO.
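As a rough illustration of what a pair of manually written queries might look like, the sketch below assembles two PromQL strings in Python. Everything specific here is an assumption: the metric name `http_requests_total`, the `job` and `status` labels, and the 5-minute rate window are hypothetical placeholders, and the form that Asserts expects for these queries is not stated in this topic. The sketch also assumes that the all events query counts every request and the bad events query counts only errors.

```python
from typing import Tuple


def build_event_queries(job: str, window: str = "5m") -> Tuple[str, str]:
    """Return (all_events_query, bad_events_query) as PromQL strings.

    The metric and label names are hypothetical; substitute your own.
    """
    selector = f'job="{job}"'
    # Every request handled by the job.
    all_events = f"sum(rate(http_requests_total{{{selector}}}[{window}]))"
    # Only requests that ended with a 5xx status code.
    bad_events = (
        f'sum(rate(http_requests_total{{{selector}, status=~"5.."}}[{window}]))'
    )
    return all_events, bad_events


if __name__ == "__main__":
    all_q, bad_q = build_event_queries("checkout")
    print(all_q)
    print(bad_q)
```

Keeping both queries on the same metric and label selector, differing only in the error filter, makes the ratio between them easier to reason about.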
Create an advanced latency/occurrence SLO
To create an advanced latency/occurrence SLO, complete the following steps.
Sign in to Grafana and select Asserts > Asserts SLO.
Click Add SLO.
Click the Advanced tab.
Click Latency/Occurrence.
Enter an SLO name.
In the SLI Details section, enter a measurement query.
This query defines the service that you want to track.
In the RCA workbench section, enter a search expression.
The search expression defines which entities populate the RCA workbench when you troubleshoot an SLO.
This value automatically populates based on the job identified in the query. You can override this value by selecting another search expression.
Use the following table to complete the fields in the Target Objectives section.
| Field | Description |
| --- | --- |
| Compliance Window (days) | The time period over which the system burns down the error budget until the budget resets. For example, if the compliance window is 7 days and the target is 99%, then you are committing to an error budget of no more than 1% over a seven-day period. |
| Target (%) | The target number of good minutes in the compliance window, expressed as a percentage. |
| Threshold (seconds) | Defines the maximum amount of time that transactions between services can take before you are notified. For example, if the target is 99% and the threshold is 0.10, the system notifies you when transactions complete within 0.10 seconds less than 99% of the time. |

Click Add new SLO.
The P99 latency is calculated for each minute. For example, if the threshold is 100 milliseconds, each minute in which the P99 falls within the threshold is a good minute, and each minute in which the P99 is above the threshold is a bad minute.
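The good-minute and bad-minute classification can be sketched as follows. This is a simplified illustration, assuming you already have one P99 latency value per minute; the function names and sample values are hypothetical, and a P99 exactly equal to the threshold is treated as within it.

```python
from typing import List, Tuple


def classify_minutes(p99_seconds: List[float], threshold_seconds: float) -> Tuple[int, int]:
    """Return (good_minutes, bad_minutes) for a series of per-minute P99 values."""
    good = sum(1 for value in p99_seconds if value <= threshold_seconds)
    bad = len(p99_seconds) - good
    return good, bad


def good_minute_percent(p99_seconds: List[float], threshold_seconds: float) -> float:
    """Percentage of minutes whose P99 stayed within the threshold."""
    if not p99_seconds:
        raise ValueError("at least one minute of data is required")
    good, _ = classify_minutes(p99_seconds, threshold_seconds)
    return good / len(p99_seconds) * 100


if __name__ == "__main__":
    # Five minutes of hypothetical P99 latencies, in seconds.
    samples = [0.08, 0.10, 0.12, 0.09, 0.25]
    print(classify_minutes(samples, threshold_seconds=0.10))
    print(good_minute_percent(samples, threshold_seconds=0.10))
```

The resulting percentage of good minutes is the figure that would be compared against the Target (%) value.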
The following table summarizes the SLO in terms of tolerated bad minutes or expected good minutes in a day.
| Bad minutes per day | Good minutes per day | SLO (%) | 30-day error budget (minutes) | Fast burn (2% in 1 hour) | Slow burn (5% in 6 hours) | Fast burn alert behavior |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | 1439 | 99.93 | 30 | 36 seconds | 1 minute 30 seconds | Triggers for every bad minute. Each alert lasts for 5 minutes. |
| 5 | 1435 | 99.65 | 150 | 3 minutes | 7 minutes 30 seconds | Triggers on the 4th and every subsequent bad minute in a 1 hour window. Each alert lasts for 5 minutes. |
| 10 | 1430 | 99.31 | 300 | 6 minutes | 15 minutes | Triggers on the 7th and every subsequent bad minute in a 1 hour window. Each alert lasts for 5 minutes. |
| 15 | 1425 | 98.95 | 450 | 9 minutes | 22 minutes 30 seconds | Triggers on the 10th and every subsequent bad minute in a 1 hour window. Each alert lasts for 5 minutes. |
| 30 | 1410 | 97.91 | 900 | 18 minutes | 45 minutes | Triggers when there are 2 bad minutes in the last 5 minutes and 19 bad minutes in the last 1 hour. Each alert lasts for 5 minutes. |
| 36 | 1404 | 97.50 | 1080 | 21 minutes 36 seconds | 54 minutes | Triggers when there are 2 bad minutes in the last 5 minutes and 22 bad minutes in the last 1 hour. Each alert lasts for 5 minutes. |
| 60 | 1380 | 95.83 | 1800 | 36 minutes | 90 minutes | Triggers when there are 2 bad minutes in the last 5 minutes and 37 bad minutes in the last 1 hour. Each alert lasts for 5 minutes. |
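The arithmetic behind the budget columns can be sketched in Python. This assumes, based on the column headings, that the 30-day error budget is the tolerated bad minutes per day multiplied by 30, and that the fast and slow burn amounts are 2% and 5% of that budget. The helper names are hypothetical, and the alert behavior column is not modeled.

```python
from typing import Dict

MINUTES_PER_DAY = 1440


def format_duration(total_seconds: int) -> str:
    """Render whole seconds as text such as '21 minutes 36 seconds'."""
    minutes, seconds = divmod(total_seconds, 60)
    parts = []
    if minutes:
        parts.append(f"{minutes} minute" + ("" if minutes == 1 else "s"))
    if seconds or not parts:
        parts.append(f"{seconds} second" + ("" if seconds == 1 else "s"))
    return " ".join(parts)


def budget_row(bad_minutes_per_day: int) -> Dict[str, float]:
    """Derive the budget figures for a tolerated number of bad minutes per day."""
    good = MINUTES_PER_DAY - bad_minutes_per_day
    budget_minutes = bad_minutes_per_day * 30  # assumed 30-day budget
    budget_seconds = budget_minutes * 60
    return {
        "good_minutes": good,
        "slo_percent": good / MINUTES_PER_DAY * 100,
        "budget_minutes_30d": budget_minutes,
        "fast_burn_seconds": budget_seconds * 2 // 100,  # 2% of the budget
        "slow_burn_seconds": budget_seconds * 5 // 100,  # 5% of the budget
    }


if __name__ == "__main__":
    for bad in (1, 5, 10, 15, 30, 36, 60):
        row = budget_row(bad)
        print(
            bad,
            row["good_minutes"],
            row["budget_minutes_30d"],
            format_duration(row["fast_burn_seconds"]),
            format_duration(row["slow_burn_seconds"]),
        )
```

Working in whole seconds keeps the 2% and 5% figures exact for the values shown here and avoids floating-point rounding in the durations.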
View and edit SLOs
You can edit an SLO at any time.
To view and edit an SLO, complete the following steps:
Sign in to Grafana and select Asserts > Asserts SLO.
You can see SLO performance on the SLO page.
To edit an SLO, click the edit icon next to the SLO.
The edit page opens, where you can make your changes.
Click Update SLO.