Unlocking How To Efficiently, Flexibly, Manage and Schedule Seven AI Chi... Xiao Zhang & Mengxuan Li
Xiao Zhang, Mengxuan Li
KubeCon + CloudNativeCon Europe 2025 · Session
This session, presented by Xiao Zhang and Mengxuan Li of Dynamia.ai, examines two challenges facing organizations that run AI workloads on Kubernetes: persistently low GPU utilization and the complexity of managing diverse, heterogeneous AI hardware. The core problem is that traditional Kubernetes deployments often treat GPUs as non-shareable: each GPU is allocated whole to a single container, which wastes a significant share of expensive computing resources. The problem is compounded in environments that use AI chips from multiple vendors, each requiring its own specialized management solution.
AI review
This session introduces HAMi, a CNCF Sandbox project, as a critically important and technically ingenious solution to the pervasive problems of low GPU utilization and complex heterogeneous AI chip management in Kubernetes. By transparently virtualizing and sharing GPU resources through a novel in-container library (HAMi-core) and offering a unified management plane for diverse hardware, HAMi directly addresses significant economic and operational inefficiencies. The talk details its advanced scheduling capabilities and broad compatibility, making it a must-see for anyone serious about…