Dashboards & Dragons: Crafting SLOs To Tame the AI Platform Cha... Alexa Griffith & Ankita Chaudhari
Alexa Griffith, Ankita Chaudhari
KubeCon + CloudNativeCon Europe 2025 · Session
In "Dashboards & Dragons: Crafting SLOs To Tame the AI Platform Chaos at Scale," Alexa Griffith and Ankita Chaudhari of Bloomberg examine how Service Level Objectives (SLOs) help manage the complexity and distinctive demands of modern Artificial Intelligence (AI) platforms, particularly those serving Generative AI (Gen AI) at enterprise scale. The talk addresses the growing challenges that platform teams face in maintaining reliability, performance, and user trust in these rapidly evolving environments.
AI review
This KubeCon talk from Bloomberg provides a robust, real-world blueprint for taming the "GenAI Hydra" through rigorous application of Service Level Objectives (SLOs) and advanced observability. It cuts through the typical AI hype by focusing on concrete, user-centric metrics like Time to First Token (TTFT) and Inter Token Latency (ITL), alongside sophisticated burn rate alerting, to manage the complex reliability challenges of enterprise-scale AI platforms. The speakers, clearly operating in the trenches, demonstrate how to shift from reactive monitoring to a proactive, "observability as a…
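For readers unfamiliar with the metrics named in the review, the following is a minimal Python sketch of how Time to First Token, Inter Token Latency, and an SLO burn rate are commonly computed. It is not taken from the talk: the function names, the timestamp inputs, and the 14.4 paging threshold (a convention from Google's SRE Workbook for a 1-hour window against a 30-day budget) are all assumptions made for illustration.

```python
from typing import Sequence


def ttft(request_sent: float, token_times: Sequence[float]) -> float:
    """Time to First Token: delay from sending the request to the first streamed token."""
    if not token_times:
        raise ValueError("no tokens were streamed")
    return token_times[0] - request_sent


def mean_itl(token_times: Sequence[float]) -> float:
    """Inter Token Latency: mean gap between consecutive streamed tokens."""
    if len(token_times) < 2:
        raise ValueError("need at least two tokens to measure a gap")
    return (token_times[-1] - token_times[0]) / (len(token_times) - 1)


def burn_rate(bad_events: int, total_events: int, slo_target: float) -> float:
    """Observed error rate divided by the error rate the SLO allows.

    A value of 1.0 spends the error budget exactly over the SLO period;
    higher values exhaust it proportionally faster.
    """
    if total_events == 0:
        return 0.0
    error_budget = 1.0 - slo_target
    return (bad_events / total_events) / error_budget


def should_page(long_window_burn: float, short_window_burn: float,
                threshold: float = 14.4) -> bool:
    """Multi-window check: page only if both windows exceed the threshold.

    The 14.4 default is an assumed convention, not a value from the talk.
    """
    return long_window_burn > threshold and short_window_burn > threshold


if __name__ == "__main__":
    # Hypothetical timestamps (seconds) for one streamed response.
    times = [0.25, 0.5, 0.75, 1.0]
    print("TTFT:", ttft(0.0, times))
    print("mean ITL:", mean_itl(times))
    print("burn rate:", burn_rate(bad_events=5, total_events=100, slo_target=0.99))
```

Requiring both a long and a short window to exceed the threshold is one common way to avoid paging on an incident that has already recovered. The talk may use different windows or thresholds.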