ObservabilityCON on the Road Bay Area 2023

That’s a wrap on ObservabilityCON on the Road Bay Area

Be the first to hear about future ObservabilityCON events

The annual open source observability conference from Grafana Labs

Open source observability updates

Hear what’s new in the open source observability landscape and connect with community members

Technical tips and tricks

See how recent developments in the LGTM stack can help you optimize telemetry costs, shift left with performance testing, and streamline incident response

Expert advice

Get your technical questions about Prometheus, OpenTelemetry, and other open source observability topics answered by Grafana Labs experts

Community success stories

Learn from users who are transforming their approach to observability, managing exploding telemetry data volumes, and slashing MTTR

Agenda

9:00 AM - 10:00 AM
Registration + Expo
Enjoy a light breakfast while networking with attendees and exploring Grafana demos in the Expo hall
10:00 AM - 10:45 AM
ObservabilityCON on the Road Keynote
Live in the Bay Area, it’s ObservabilityCON on the Road! Join Jen Villa for a look at the latest trends across the observability ecosystem and developments in the open and composable LGTM (Loki-Grafana-Tempo-Mimir) observability stack.
- Jen Villa, Director of Product, Grafana Labs
- Tom Wilkie, Chief Technology Officer, Grafana Labs
10:45 AM - 11:15 AM
How to reduce observability costs by controlling metrics growth
The shift to cloud native architectures and increased developer autonomy to instrument applications has caused explosive growth in metrics data and cardinality. But more data does not mean better observability: It can be difficult to balance scale while managing costs. Learn how Grafana Cloud can help lower your observability bills. By regulating metrics growth through governance and customized aggregations, you can optimize your costs and pay closer to what you actually use.
- David Ryder, Principal Solutions Engineer, Grafana Labs
11:15 AM - 11:45 AM
From Grafana 2.1.0 to today: the long, winding, and successful monitoring journey at Slack
With a mission of ensuring that thousands of companies and millions of users can continuously and seamlessly connect, the monitoring team at Slack has used Grafana in production since v2.1, beginning in December 2015. Join George Luong, Engineering Manager in charge of Slack’s Monitoring team, as he walks through the company’s monitoring journey from the before times, when they deployed unintuitive dashboarding systems, to their current Grafana Enterprise-powered system, which provides dashboards for everyone from front-line developers to C-suite executives. Along the way, see how they’ve turned to Grafana Alerting to power 50% of all alerts at Slack, adopted a dashboards-as-code ethos, and even built out a Global Health Dashboard so they can address any issue before it hits customers.
Watch this session.
- George Luong, Engineering Manager, Slack
11:45 AM - 12:45 PM
Lunch
12:45 PM - 1:15 PM
How to manage the tradeoff between cost and coverage for observability logs
Logs have always been, and continue to be, a critical part of the troubleshooting workflow for observability and SRE teams. But legacy solutions have a reputation for being expensive in both budget and resources, for storing as well as querying, so teams make tradeoffs about which services and applications to instrument with logging. This creates observability blind spots.
Learn how Grafana Loki can help you optimize multiple stages of your logging lifecycle – from supporting multiple log formats and smaller indexing, to blazing-fast querying and long-term retention. Uniquely built on Prometheus architecture, Grafana Loki is a cost-effective logging solution purpose-built for observability.
- Christine Wang, Solutions Engineer Manager, Grafana Labs
- Jordan Rushing, Senior Software Engineer, Grafana Labs
1:15 PM - 1:45 PM
The journey to unified and intelligent observability infrastructure at Roblox
With roughly 100 million global active users, Roblox understands that its success relies on healthy and scalable infrastructure. And that means having a healthy and scalable observability solution.
Over the past two years, the Roblox team has made significant changes to its observability platform – including telemetry metrics, logging, and tracing – resulting in memorable migration stories, some hard-earned lessons, and a great vision for the future. Join Director of Engineering Xiaofeng Han and Principal Engineer Ying Dai as they cover the journey of evolving observability and engineering practices at Roblox, collaborating with Grafana Labs while adopting Grafana for visualizing metrics and logs and Grafana Tempo for traces, and changing the culture at Roblox for the better.
Watch this session.
- Xiaofeng Han, Director of Engineering, Roblox
- Ying Dai, Principal Engineer, Roblox
1:45 PM - 2:15 PM
How to avoid the most common Kubernetes monitoring mistakes
Which metrics should you collect? What dashboards are best suited to effectively monitor Kubernetes clusters? How do you measure resource utilization for capacity planning? Often, this is all a game of trial and error – and one that business-critical services cannot afford.
Learn how Grafana Cloud’s K8s monitoring solution was built so you can avoid the guessing game and kickstart your K8s observability strategy in minutes. In this session, we break down the most common Kubernetes monitoring mistakes and share best practices on how to set up Kubernetes monitoring the optimal way.
- Éamon Ryan, Senior Principal Field Engineer, Grafana Labs
2:15 PM - 2:45 PM
Afternoon break
2:45 PM - 3:15 PM
How to prioritize critical resources through SLO-driven Incident Response & Management
Prioritizing performance issues based on business impact is often challenging because it is hard to quantify the right SLO/SLI levels.
Learn how you can build an SLO framework that unifies data from multiple disparate tools and is integrated into a seamless incident management workflow with Grafana OnCall and Grafana Incident. You can then prioritize responding to the most critical error budget burndowns, allowing your developers to focus on innovation rather than firefighting performance issues.
- Bob Cotton, Distinguished Engineer, Grafana Labs
3:15 PM - 3:45 PM
How to prevent issues from hitting customers by using load testing and tracing together
With the increased demand to ship new features, developers can overlook the importance of catching and resolving performance issues before they hit end users in production. Learn how Grafana’s unique solution of k6 performance testing correlated with Grafana Tempo distributed tracing can help developers catch performance problems before they reach customers.
- Wei Li, Product Marketing Manager, Grafana Labs
- Zach Leslie, Software Engineer, Grafana Labs
3:45 PM - 4:45 PM
Reception + Expo
Join us for appetizers and beverages while networking with attendees and meeting Grafana experts in the Expo hall