CUDA de Grâce: Owning AI Cloud Infrastructure with GPU exploits
Valentina Palmiotti, Samuel Lovejoy
Hexacon 2025 · Day 2 · Main Stage
In an era defined by the explosive growth of Artificial Intelligence (AI) and Machine Learning (ML), the underlying compute infrastructure, particularly Graphics Processing Units (GPUs), has become a critical yet often overlooked security frontier. "CUDA de Grâce," delivered by Valentina Palmiotti and Samuel Lovejoy of IBM X-Force Offensive Research (XOR), examines a significant vulnerability in NVIDIA's proprietary CUDA driver stack. Their research demonstrates how a kernel exploit targeting NVIDIA GPU drivers can lead to a complete compromise of AI cloud infrastructure, enabling container escape and cross-tenant attacks on multi-tenant serverless compute nodes.
AI review
Palmiotti and Lovejoy delivered exactly the kind of talk this industry desperately needs: original, technically rigorous kernel exploitation research targeting a massive, under-audited attack surface with direct real-world impact. They found a genuine double-free caused by a race condition in the NVIDIA GPU driver, built a reliable exploitation chain against a live Azure AI cloud environment, escaped a container, and owned the host, all in a space where essentially nobody else has done serious public binary exploitation work. This isn't threat-modeling theater or a vendor deck with a CVE number stapled…