Introducing Grafana Machine Learning for Grafana Cloud, with metrics forecasting

2021-10-13 6 min

At GrafanaCONline in June, we talked about the future of machine learning at Grafana Labs. Four months later, we are excited to introduce Grafana Machine Learning for Grafana Cloud, with our metrics forecasting capability. It’s available now to all customers on Pro or Advanced plans. If you’re not already using Grafana Cloud, you can sign up for a free 14-day trial of Grafana Cloud Pro here.

In this blog post, we’ll go over some use cases and real-world examples for Grafana Machine Learning.

Using machine learning to solve real-world problems

Adaptive alerts

It’s hard to create useful alerts that stay useful over time. Static thresholds that made sense in the past no longer do. They can’t adapt to the context like expected busy or quiet periods.

Imagine, for example, a food delivery app that has lots of usage at lunch and dinner times, but is pretty quiet in the early hours of the morning. The same threshold doesn’t work well for both scenarios, and could lead to missing incidents and/or noisy alerts.

What if we could learn from our metrics in the past, and create alerts that adapt to our data and context over time?

Grafana Machine Learning lets you train a model to learn the patterns within your systems, and use it to make confident predictions into the future.
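To make the food delivery example concrete, here is a minimal sketch of a threshold that depends on the hour of day. It is an illustration only and not how Grafana Machine Learning builds its models: the function names, the `width` parameter, and the sample numbers are all made up, and the "model" is just a per-hour mean plus or minus a few standard deviations.

```python
from collections import defaultdict
from statistics import mean, stdev


def hourly_bands(history, width=3.0):
    """Build a (low, high) band per hour of day from (hour, value) samples."""
    by_hour = defaultdict(list)
    for hour, value in history:
        by_hour[hour].append(value)

    bands = {}
    for hour, values in by_hour.items():
        center = mean(values)
        # stdev needs at least two samples; fall back to a zero-width band.
        spread = stdev(values) if len(values) > 1 else 0.0
        bands[hour] = (center - width * spread, center + width * spread)
    return bands


def is_unusual(bands, hour, value):
    """True when the value falls outside the band learned for that hour."""
    low, high = bands[hour]
    return value < low or value > high


# Made-up orders-per-minute samples: quiet at 03:00, busy at 12:00.
history = [(3, 4), (3, 5), (3, 6), (12, 190), (12, 200), (12, 210)]
bands = hourly_bands(history)

print(is_unusual(bands, 3, 100))   # unusually busy for 03:00
print(is_unusual(bands, 12, 100))  # unusually quiet for 12:00
print(is_unusual(bands, 12, 195))  # a normal lunch rush
```

The same reading of 100 orders per minute is suspicious at both times of day, but for opposite reasons. A single static threshold cannot express that.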

Capacity planning

Most capacity planning is reactive; it spins up resources to meet demand. Having predictions into the future lets you plan ahead. This can be particularly useful if preparing the resources is expensive or takes a significant amount of time.
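As a rough sketch of what planning ahead can look like, the snippet below takes a forecast and works out the latest step at which provisioning has to start. The function name, the `lead_time` parameter, and the numbers are hypothetical and chosen only for illustration.

```python
def latest_start_step(forecast, capacity, lead_time):
    """Return the latest step at which provisioning must begin.

    forecast:  predicted demand per step, index 0 being the next step
    capacity:  demand the current resources can serve
    lead_time: number of steps it takes to bring new resources online

    Returns None when the forecast never exceeds capacity.
    """
    for step, demand in enumerate(forecast):
        if demand > capacity:
            # Never return a negative step: if the lead time is longer than
            # the time remaining, provisioning should start right away.
            return max(step - lead_time, 0)
    return None


# Made-up forecast: demand first exceeds a capacity of 100 at step 4.
forecast = [60, 70, 85, 95, 110, 120]
print(latest_start_step(forecast, capacity=100, lead_time=3))
```

With a three-step lead time, waiting until demand actually exceeds capacity would already be too late. The forecast is what makes the earlier start possible.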

Detect the unexpected

When you know what is likely to happen, you can infer when things fall outside of these expectations. Detecting anomalies early can let you get ahead of potential problems so they don’t take you by surprise.

The screenshot above shows a real example of Grafana Machine Learning in action. The green line is the actual data; the blue line represents the predicted values into the future.

The shaded blue areas show the confidence levels of the model. As you can see, in this case it becomes less confident as time goes on.
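One simple way to turn a forecast like this into anomaly detection is to flag any actual value that lands outside the shaded band. The sketch below assumes each forecast point comes with a lower and an upper bound; the `ForecastPoint` type and the sample values are hypothetical and not part of any Grafana API.

```python
from dataclasses import dataclass


@dataclass
class ForecastPoint:
    predicted: float
    lower: float  # bottom of the confidence band
    upper: float  # top of the confidence band


def find_anomalies(forecast, actuals):
    """Return the indices where the actual value falls outside the band."""
    return [
        i
        for i, (point, actual) in enumerate(zip(forecast, actuals))
        if not (point.lower <= actual <= point.upper)
    ]


# The band widens as the forecast reaches further into the future.
forecast = [
    ForecastPoint(100, 95, 105),
    ForecastPoint(100, 90, 110),
    ForecastPoint(100, 80, 120),
]
actuals = [112, 112, 112]

print(find_anomalies(forecast, actuals))  # [0, 1]
```

Because the band widens over time, the same deviation counts as an anomaly close to the present but not further out, where the model is less sure of itself.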

How MediaKind is leveraging Grafana Machine Learning

MediaKind uses Grafana Cloud to provide the observability needed to ensure its systems are consistently up and running. The scale of the operations means there are a multitude of metrics on a plethora of dashboards. As the company’s operations grow, so does the data, which ultimately makes it increasingly difficult to manage. In the words of Principal Systems Architect Richard Chin, the problem is “too many graphs that need scanning by skilled people on a daily basis.” To meet this challenge head on, and to deliver on MediaKind’s unwavering commitment to quality of service, Chin and his team are constantly seeking out new ways to elevate service standards to their customers.

After receiving early access to Grafana Machine Learning as part of the beta tester program, the team at MediaKind have leveraged it to train ML models to rapidly identify network packet loss as the root cause of downstream video errors. The models have learned the unique characteristics of each channel, and were able to alert users to unusual activity.

Chin told us that the models spotted anomalies that would be time-consuming for a human to notice, helping the team to reduce noise in existing dashboards and spotlight issues in a way not previously possible.

“Grafana Machine Learning was very easy to understand and set up,” he said. “Machine Learning can be a complex area with many parameters to tune, but conceptually and practically Grafana Machine Learning was easy to understand by all our engineers who worked with it, even those with no previous Machine Learning experience. Plus, since it was provided by Grafana Cloud, we did not have to worry about setting up the service or scaling it. We see this as a very useful tool for intelligent anomaly detection, and it will certainly become one of the tools that our SREs will use to increase their productivity and reduce their daily toil.”

About MediaKind

MediaKind is a global change leader of media technology and services. Its mission is to deliver transformation by building a continuously better media universe alongside its customers and partners. Drawing on a pioneering industry heritage and fueled by innovation, MediaKind embraces and champions new standards, methodologies, and next-generation, immersive live and on-demand media experiences worldwide. Its end-to-end media solutions portfolio includes Emmy award-winning video compression for contribution and direct-to-consumer distribution, advertising and content personalization, high-efficiency cloud DVR, and TV and video delivery platforms. For more information, please visit: www.mediakind.com.

A continuously learning algorithm, fully managed in Grafana Cloud

We know things don’t stay the same for long, especially when you’re growing. The ML models periodically retrain on a sliding window of data. This allows them to remain “open-minded” and evolve along with your system rather than get trapped in the past.
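The idea of retraining on a sliding window can be sketched in a few lines. This is a toy illustration under stated assumptions, not the Grafana Machine Learning implementation: the class name and its parameters are made up, and the "model" is simply the mean of the most recent values.

```python
from collections import deque
from statistics import mean


class SlidingWindowModel:
    """Toy model that periodically refits on only the most recent samples."""

    def __init__(self, window_size, retrain_every):
        # A deque with maxlen silently drops the oldest sample when full.
        self.window = deque(maxlen=window_size)
        self.retrain_every = retrain_every
        self.seen = 0
        self.level = None  # the fitted "model": here just the window mean

    def observe(self, value):
        self.window.append(value)
        self.seen += 1
        if self.seen % self.retrain_every == 0:
            self.level = mean(self.window)  # retrain on the current window

    def predict(self):
        return self.level


model = SlidingWindowModel(window_size=4, retrain_every=2)
for value in [10, 10, 10, 10]:
    model.observe(value)
print(model.predict())  # fitted to the old behaviour

for value in [50, 50, 50, 50]:
    model.observe(value)
print(model.predict())  # old samples have left the window
```

Once the system settles at a new level, the old samples fall out of the window and the next retrain reflects only the new behaviour.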

With Grafana Machine Learning, you bring the data you already have and use the tool you already use, and we take care of the rest. This way you can easily add forecasts to your metrics, while we handle the infrastructure to crunch the numbers, generate predictions, and keep everything up-to-date.

Frequently asked questions

How much does it cost?

There’s nothing more to pay if you keep within the (pretty generous) free quota. For customers who really want to scale things up, we’re ready to have that conversation. Please contact us or ask your account executive, support engineer, or technical account manager.

What if I’m on the free Grafana Cloud plan?

To experiment with the ML capabilities, you need to upgrade your plan to Pro. (You can do this in the Cloud Portal Subscription page.) You can always downgrade again later if you wish.

How do I get started?

The best way to see metric forecasting in action is to start forecasting those metrics!

Head over to your instance of Grafana Cloud and look for the Machine Learning icon in the left nav to get started.

Let’s chat

If you get stuck, or have any questions for the team, please get in touch. We’d also love to hear about your use cases and examples of where ML has helped you.

The ML team usually hangs out in the #machine-learning channel on the Grafana Labs Community Slack. You can get your invitation from https://slack.grafana.com/.