What is Adaptive Telemetry, and how can it reduce MTTR, noise, and cost?
As your applications scale, so too does the flood of logs, metrics, profiles, and traces—along with the costs to store and manage them. Collecting everything might feel like the safest bet, but it often leaves you buried in noise and struggling to find the signals that matter, all while costs spiral out of control.
Thankfully, there’s a better way with Adaptive Telemetry, a range of features in Grafana Cloud that optimizes which signals get stored and ensures you only keep the most valuable telemetry. Through classification and prioritization, Adaptive Telemetry (which includes Adaptive Metrics and Adaptive Logs) helps you lower costs without sacrificing the insights you need to keep your systems running smoothly.
In this blog post, we’ll look closer at how Adaptive Telemetry works, share real-world examples of how organizations are using it today, and show how you can use it to achieve cost-efficient, effective observability.
The problem with the ‘collect everything’ mindset
Before we dig deeper into Adaptive Telemetry, let’s first look at the problem we’re trying to solve.
Telemetry is essential for understanding system performance. But when you collect everything indiscriminately, problems emerge. Instead of clarity, you get chaos. Instead of insights, you get overwhelmed.
This “collect everything” approach often does more harm than good.
- You drown in data. Unfiltered telemetry creates data overload. Your team spends hours wading through endless metrics, struggling to identify what actually matters. Instead of driving informed decisions, the data paralyzes progress.
- Costs surge. Storing and processing excessive telemetry isn’t free. It drains budgets, inflates operational costs, and steals resources from critical projects. Worse, the expense often outweighs the value the data provides.
- Signals get buried in noise. Collecting everything makes it harder to pinpoint what’s relevant. High-value metrics get lost in the shuffle, and your team’s focus is diluted. Instead of empowering decisions, the data becomes a distraction.
This approach is unsustainable, especially as you scale.
How Adaptive Telemetry works
Adaptive Telemetry fundamentally changes how you manage observability data, automating classification and optimization so you can regain control and focus on telemetry that actually delivers value. It also gives you the flexibility your team and systems need, whether you want to be hands-off or completely in control.
Here’s how it works.
Classification based on use
Adaptive Telemetry classifies incoming data based on how it’s used. Is it powering an alert? Displayed on a dashboard? Queried ad hoc? Not all of that data is valuable, so signals that support critical operations are prioritized, while those with limited utility are identified as low value. This precision allows your team to focus on actionable data.
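To make the idea concrete, here is a minimal sketch of usage-based classification. It is an illustration only, not how Grafana Cloud implements it: the `Usage` fields, the tier names, and the rule ordering (alerts over dashboards over ad hoc queries) are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Usage:
    """Hypothetical usage counters for one signal over a lookback window."""
    alert_rules: int = 0
    dashboard_panels: int = 0
    adhoc_queries: int = 0


def classify(usage: Usage) -> str:
    """Return an assumed value tier based on where the signal is used."""
    if usage.alert_rules > 0:
        return "critical"   # powers at least one alert
    if usage.dashboard_panels > 0:
        return "high"       # shown on at least one dashboard
    if usage.adhoc_queries > 0:
        return "medium"     # only queried ad hoc
    return "low"            # not used anywhere in the window


# Example metric names are made up for illustration.
usage_by_metric = {
    "http_requests_total": Usage(alert_rules=2, dashboard_panels=5),
    "queue_depth": Usage(dashboard_panels=1),
    "gc_pause_seconds": Usage(adhoc_queries=3),
    "debug_cache_evictions": Usage(),
}
tiers = {name: classify(usage) for name, usage in usage_by_metric.items()}
```

In this sketch, only `debug_cache_evictions` ends up in the "low" tier, which makes it the natural candidate for optimization.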
Automated optimization
Once classified, low-value data is automatically aggregated, sampled, dropped, or reduced.
This process keeps your observability efficient, aligning costs with actual value. The system’s feedback loop continuously learns from your usage patterns, adjusting its behavior to adapt to changing needs.
As your environment evolves, Adaptive Telemetry evolves with it—without manual intervention.
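The feedback loop can be pictured as recomputing recommendations every time usage changes. The sketch below assumes a deliberately simple rule set (keep, aggregate, or drop) and a hypothetical 30-day query count. The real service works from richer signals, so treat this as a mental model rather than its actual logic.

```python
def recommend(queries_last_30d: int, used_in_alert_or_dashboard: bool) -> str:
    """Map observed usage to one hypothetical action: keep, aggregate, or drop."""
    if used_in_alert_or_dashboard:
        return "keep"
    if queries_last_30d > 0:
        return "aggregate"
    return "drop"


def refresh(usage: dict[str, tuple[int, bool]]) -> dict[str, str]:
    """Recompute every recommendation from the latest usage snapshot."""
    return {name: recommend(count, used) for name, (count, used) in usage.items()}


# First pass: nobody looks at the cache metric, so it is a drop candidate.
week_1 = refresh({
    "cache_evictions_total": (0, False),
    "http_errors_total": (12, True),
})

# Later pass: a new dashboard panel references the metric, so it flips to keep.
week_2 = refresh({
    "cache_evictions_total": (4, True),
    "http_errors_total": (12, True),
})
```

Because `refresh` always starts from current usage, a metric that becomes important later is no longer a candidate for reduction on the next pass.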
Flexibility for your needs
Every organization has unique requirements, and Adaptive Telemetry is built accordingly.
Want to ensure specific signals are always preserved? Use Exemptions to override recommendations and retain critical data. Prefer a more hands-on approach? Review and approve optimization suggestions before they’re applied.
Get the most from your logs and metrics
Adaptive Telemetry currently comes in two forms: Adaptive Metrics and Adaptive Logs. They operate slightly differently, but the net benefit is the same: delivering just the telemetry you need and removing the clutter and its associated costs.
Adaptive Metrics
Adaptive Metrics was our first foray into this space. Introduced in 2023, it aggregates unused and partially used metrics into lower cardinality versions of themselves to reduce costs. To date, Adaptive Metrics has delivered a 35% reduction in metrics costs on average for more than 1,500 organizations.
As you can see from the workflow above, Adaptive Metrics analyzes how metrics are used—across dashboards, alerts, and queries—and classifies them to optimize data retention, ensuring you store only what matters.
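If “lower cardinality versions” sounds abstract, the following sketch shows the general idea of aggregating series by removing a label nobody queries. The series, label names, and the choice of summation are assumptions for illustration. This is not the Adaptive Metrics rule format or engine.

```python
from collections import defaultdict


def aggregate_away(series, drop_labels):
    """Sum sample values across series that differ only in the dropped labels."""
    totals = defaultdict(float)
    for labels, value in series:
        # The remaining labels, sorted into a tuple, identify the aggregated series.
        kept = tuple(sorted((k, v) for k, v in labels.items() if k not in drop_labels))
        totals[kept] += value
    return dict(totals)


# Three hypothetical series for one metric, one per pod.
series = [
    ({"service": "checkout", "pod": "a"}, 10.0),
    ({"service": "checkout", "pod": "b"}, 5.0),
    ({"service": "cart", "pod": "c"}, 7.0),
]

# Assume usage analysis found that no query ever references the "pod" label.
aggregated = aggregate_away(series, drop_labels={"pod"})
```

Here three stored series collapse into two, one per service, while any query that groups by `service` still returns the same totals. Summation suits counter-style values; other metric types would need a different aggregation.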
Adaptive Logs
Introduced at ObservabilityCON 2024, Adaptive Logs uses AI/ML techniques to analyze observability data at a scale that wouldn’t be feasible with manual processes. It identifies commonly ingested log patterns and creates a set of customized recommendations for dropping unused telemetry.
The mechanics of Adaptive Logs and Adaptive Metrics are fairly similar. They both analyze usage and create recommendations that translate to reduced ingestion and cost savings. However, Adaptive Logs differs slightly in that it groups logs into patterns responsible for high log volumes. It also drops a percentage of logs, rather than aggregating them like Adaptive Metrics does.
Each of the lines in the chart above represents a hypothetical pattern generated by Adaptive Logs. In the middle example, you can see the logs are queried frequently, so the recommendation is to keep them all. For patterns that are queried infrequently or not at all, Adaptive Logs recommends dropping a certain percentage of those logs.
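A rough way to picture pattern-based dropping is shown below. The pattern extraction here is a naive stand-in (it only masks runs of digits), and the drop percentages are made-up values; Adaptive Logs derives both with its own techniques. Hashing each line gives a deterministic decision, which is one assumed way to drop an approximate share of a pattern.

```python
import re
import zlib


def pattern_of(line: str) -> str:
    """Collapse variable tokens (here, only runs of digits) into a placeholder."""
    return re.sub(r"\d+", "<_>", line)


def keep(line: str, drop_percent: int) -> bool:
    """Keep a line unless its hash bucket (0-99) falls inside the dropped share."""
    bucket = zlib.crc32(line.encode("utf-8")) % 100
    return bucket >= drop_percent


def apply_drop_rates(lines, drop_percent_by_pattern):
    """Filter lines by the drop rate of their pattern; unknown patterns are kept."""
    return [
        line
        for line in lines
        if keep(line, drop_percent_by_pattern.get(pattern_of(line), 0))
    ]


# Hypothetical recommendation: this health-check pattern is never queried.
drop_rates = {"GET /healthz <_> in <_>ms": 100}

logs = [
    "GET /healthz 200 in 3ms",
    "GET /healthz 200 in 4ms",
    "payment failed for order 981",
]
kept = apply_drop_rates(logs, drop_rates)
```

With a 100% drop rate on the health-check pattern, only the payment error survives. A rate such as 50 would keep roughly half of the matching lines, depending on how their hashes fall.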
Worried about losing data? We’ve got you covered
It’s only natural to worry about losing critical telemetry data, especially when your systems depend on accurate and reliable observability during incidents.
Imagine this: it’s 3 a.m. and your systems are in chaos. Alerts are firing, dashboards are blank, and your team is scrambling to understand what’s happening. But you can’t find the signal you need because it was lost in a cost-cutting effort that indiscriminately aggregated low-value data.
In those critical moments, every byte of telemetry matters—and losing the wrong data could cost your business downtime, trust, and revenue. Thankfully, this nightmare scenario doesn’t have to happen—this is exactly why Exemptions exist in Adaptive Telemetry.
Exemptions let you identify and preserve the signals you can’t afford to lose—whether it’s specific metrics tied to SLAs or critical labels needed for detailed investigations. You have full control to ensure that, even as you optimize costs, your most important data is always available when you need it.
With Exemptions, you don’t have to choose between saving money and safeguarding your telemetry.
Whether it’s protecting vital metrics or retaining key labels, Exemptions give you the confidence to trust your observability—even in the most critical scenarios.
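Conceptually, an exemption acts as a filter applied to recommendations before anything is changed. The sketch below assumes a simple shape for recommendations (metric name mapped to the labels proposed for removal) and two kinds of exemption, whole metrics and individual labels. The function and data names are hypothetical and do not mirror the Grafana Cloud API.

```python
def apply_exemptions(recommendations, exempt_metrics, exempt_labels):
    """Remove exempted metrics and labels from a set of proposed label drops."""
    result = {}
    for metric, drop_labels in recommendations.items():
        if metric in exempt_metrics:
            continue  # the whole metric is protected, so skip its recommendation
        remaining = set(drop_labels) - set(exempt_labels)
        if remaining:
            result[metric] = remaining
    return result


# Hypothetical recommendations: labels that usage analysis proposes removing.
recommendations = {
    "sla_request_latency_seconds": {"pod", "instance"},
    "http_requests_total": {"pod", "customer_id"},
    "cache_hits_total": {"customer_id"},
}

effective = apply_exemptions(
    recommendations,
    exempt_metrics={"sla_request_latency_seconds"},  # tied to an SLA, never touch
    exempt_labels={"customer_id"},                   # needed for investigations
)
```

After the exemptions are applied, only the `pod` label on `http_requests_total` is still proposed for removal. The SLA metric and every `customer_id` label stay exactly as they were ingested.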
Adaptive Telemetry user stories
Adaptive Telemetry is already delivering results for organizations that manage high telemetry volumes and high-cardinality environments. Here are some key outcomes achieved by engineering teams using it.
- Lower costs, uncompromised coverage. At TeleTracking, an integrated healthcare operations platform provider, engineers overcame initial fears about losing critical data, and the company reduced its observability costs by 50% without losing visibility.
- Scaling observability without scaling costs. Identity security provider SailPoint reduced its metrics volume by 33% and used the savings to expand its observability stack. SailPoint reinvested in Grafana Cloud, adopting tools like Frontend Observability to enhance monitoring capabilities.
- Empowering teams to focus on what matters. Dell Technologies, which faced overwhelming alert fatigue and excessive data from its legacy observability tools, used Adaptive Telemetry to eliminate unused metrics, reduce noise, and improve productivity so its engineers could make quicker, smarter decisions.
These real-world outcomes showcase how Adaptive Telemetry helps organizations optimize costs, scale observability, and empower their teams. Whether you’re aiming to reduce expenses, improve team efficiency, or enhance system insights, Adaptive Telemetry delivers value that scales with your needs.
The future of observability is Adaptive
Adaptive Telemetry is just getting started.
While it already helps you manage metrics and logs effectively, the future holds even more. Soon, Adaptive Telemetry will extend to traces and profiles, giving you the tools to ensure that every piece of telemetry you store is worth the investment.
This evolution reflects our mission to help you maximize the value of your observability data.
Ready to take control of your telemetry? Start using Adaptive Telemetry in Grafana Cloud today and see the difference for yourself.