Dashboards & Dragons: Crafting SLOs To Tame the AI Platform Chaos at Scale

Alexa Griffith, Ankita Chaudhari

KubeCon + CloudNativeCon Europe 2025 · Session

In "Dashboards & Dragons: Crafting SLOs To Tame the AI Platform Chaos at Scale," Alexa Griffith and Ankita Chaudhari of Bloomberg explain how Service Level Objectives (SLOs) help manage the complexity and distinctive demands of modern Artificial Intelligence (AI) platforms, particularly those serving Generative AI (Gen AI) at enterprise scale. The talk addresses the challenges platform teams face in maintaining reliability, performance, and user trust as these environments evolve rapidly.

AI review

This KubeCon talk from Bloomberg provides a robust, real-world blueprint for taming the "GenAI Hydra" through rigorous application of Service Level Objectives (SLOs) and advanced observability. It cuts through the typical AI hype by focusing on concrete, user-centric metrics like Time to First Token (TTFT) and Inter Token Latency (ITL), alongside sophisticated burn rate alerting, to manage the complex reliability challenges of enterprise-scale AI platforms. The speakers, clearly operating in the trenches, demonstrate how to shift from reactive monitoring to a proactive, "observability as a…
