Webinar

How LATAM Airlines’ observability strategy took flight with Grafana Cloud


Company: LATAM Airlines
Industry: Travel & Transportation

LATAM Airlines is the largest airline group in Latin America, serving over 150 destinations across multiple countries and operating approximately 1,400 flights daily. The company transports 75 to 80 million passengers annually, with half of its operations centered in Brazil. Over the past several years, LATAM has undergone significant digital transformation to enhance customer experiences and operational efficiency.

Challenge

LATAM faced significant hurdles in managing its complex digital ecosystem, which included over 500 interconnected services. During its transition to a new backend digital experience, the airline struggled with delayed issue detection, frequent customer complaints about minor failures, and inefficient prioritization of fixes. Incidents took a long time to detect, and teams were overwhelmed by noisy alerts, often hearing about problems from customers before internal systems flagged them. Furthermore, the organization needed a way to map user interface interactions to backend service failures in real time to improve reliability and customer satisfaction.

Solution

From an organizational standpoint, LATAM merged its commercial and IT teams into a single e-business unit to foster collaboration and shared goals across diverse teams. As a first technical step, LATAM introduced the Failed Customer Interaction (FCI) tool to track UI interactions and map them to service failures. This approach provided a one-to-one mapping of UI actions to backend services, allowing teams to pinpoint issues more effectively. However, the data available to the teams was multiple hours old, so they couldn’t address issues in real time.

As a result, LATAM turned to Grafana Cloud Metrics and Grafana Cloud Logs to enable real-time monitoring and analysis of its FCIs. By incorporating Grafana SLO for 600+ UI interactions, the airline established clear reliability goals and automated workflows to streamline telemetry data collection, CI/CD processes, and observability practices. With the addition of Grafana IRM, teams also reduced alert noise and could focus on the signals that truly matter.
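
To make the idea of a reliability goal per UI interaction concrete, here is a minimal sketch of how a success-rate indicator and its remaining error budget could be computed from failed-interaction counts. It is not Grafana SLO's API or LATAM's implementation: the `InteractionWindow` type, the `checkout.pay` interaction name, the 99.9% target, and the counts are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InteractionWindow:
    """Counts for one UI interaction over an SLO window (hypothetical shape)."""
    name: str
    total: int   # all customer interactions observed in the window
    failed: int  # interactions counted as failed customer interactions (FCIs)


def sli(window: InteractionWindow) -> float:
    """Fraction of interactions that succeeded; 1.0 when there was no traffic."""
    if window.total == 0:
        return 1.0
    return (window.total - window.failed) / window.total


def error_budget_remaining(window: InteractionWindow, target: float) -> float:
    """Share of the error budget still unspent (negative once the SLO is breached)."""
    allowed_failures = (1.0 - target) * window.total
    if allowed_failures == 0:
        # No failures are allowed (100% target or no traffic).
        return 1.0 if window.failed == 0 else float("-inf")
    return 1.0 - window.failed / allowed_failures


# Illustrative numbers only, not LATAM data.
checkout = InteractionWindow(name="checkout.pay", total=20_000, failed=10)
print(
    f"{checkout.name}: SLI={sli(checkout):.4f}, "
    f"budget left={error_budget_remaining(checkout, target=0.999):.2f}"
)
```

Tracking the remaining budget rather than the raw failure count is one way to prioritize fixes: an interaction that has spent most of its budget deserves attention before one that fails occasionally but stays well within its target.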

Impact

LATAM Airlines achieved substantial improvements in operational efficiency and customer experience:

  • 87% reduction in mean time to detect (MTTD) for severity 1 and 2 incidents.
  • 61% decrease in noisy alerts, reducing disruptions for on-call teams.
  • 23% reduction in mean time to repair (MTTR).

Beyond the hard metrics, LATAM’s adoption of Grafana Cloud, and especially its reliance on Grafana IRM and Grafana SLO, has aligned teams around shared goals and improved developer workflows. The result is reduced downtime, improved system reliability, higher customer satisfaction, streamlined internal operations, and a new, proactive observability culture.

Conclusion

Looking ahead, LATAM plans to enhance telemetry correlation, simplify its developer workflow, and adopt observability-driven development practices across its teams. These efforts aim to further reduce incident response times – with the goal of reaching a 50% reduction in MTTR – and embed observability as a core principle in the design and deployment of applications, ensuring continued innovation and reliability in serving millions of passengers annually.


Your guide

Carlos Hernandez Saavedra
Head of Cloud & SRE
LATAM Airlines