LAVA: Lifetime-Aware VM Allocation with Learned Distributions and Adaptation to Mispredictions

Jianheng Ling, Pratik Worah, Yawen Wang, Martin Maas, Kathryn S. McKinley

Conference on Machine Learning and Systems 2025 · Day 4 · Session 12: Edge and Cloud Systems

This talk introduces **LAVA** (Lifetime-Aware VM Allocation), an approach to virtual machine (VM) scheduling in large-scale cloud environments that targets the challenges faced by Google's Borg Prime scheduler. Presented by a team from Google that includes Jianheng Ling and Kathryn S. McKinley, the work uses VM lifetime predictions to improve resource utilization, reduce operational costs, and enhance system performance. The core problem is inefficient allocation of VMs to hosts: the vast majority of VMs are short-lived, yet together they consume only a small fraction of total computing resources, so most resources are held by the comparatively few longer-lived VMs.

AI review

Solid systems ML paper from Google on lifetime-aware VM scheduling, with real production numbers and a sensible core insight. The work is clearly genuine: they built it, shipped it, and have a year of production data. But the writeup pads a relatively compact set of ideas, and the engineering specifics stay frustratingly high-level. Engineers curious about ML-in-the-scheduler design will get something here, but it won't change how most of them build.