Shared-GPU Security Learnings from Fly.io

Matthew Braun

fwd:cloudsec North America 2025 · Day 2 · Track 1 - Crystal

Matthew Braun, a security practitioner at **Fly.io**, presented a rare behind-the-scenes look at the security challenges of offering shared GPU compute to customers on a public cloud built on bare-metal infrastructure. Unlike hyperscalers, which design custom hardware and silicon for GPU isolation, Fly.io had to solve GPU multi-tenancy security using commodity data center hardware, open-source hypervisors, and careful attention to PCIe, IOMMU, and firmware-level attack surfaces. The talk covered the threat model of giving untrusted users direct access to data-center-grade GPUs, the hypervisor selection journey from Firecracker to Cloud Hypervisor, the decision to use VFIO pass-through instead of NVIDIA's MIG or vGPU, and the deep hardware security considerations around PCIe peer-to-peer attacks, ACS/ATS configurations, NVLink, and GPU firmware persistence.
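To make the IOMMU angle concrete: on Linux, VFIO assigns devices to a guest at the granularity of an IOMMU group, so a GPU that shares a group with other devices cannot be isolated on its own. The sketch below is illustrative only and is not taken from the talk or from Fly.io's tooling. It assumes the standard sysfs layout `/sys/kernel/iommu_groups/<group>/devices/<pci-address>`; the function names `iommu_groups` and `shared_groups` are hypothetical.

```python
import os


def iommu_groups(root="/sys/kernel/iommu_groups"):
    """Map each IOMMU group name to the sorted PCI addresses it contains.

    `root` defaults to the usual Linux sysfs location; it is a parameter
    so the function can be pointed at a test directory.
    """
    groups = {}
    if not os.path.isdir(root):
        # No IOMMU groups exposed (IOMMU disabled, or not a Linux host).
        return groups
    for group in os.listdir(root):
        devices_dir = os.path.join(root, group, "devices")
        if os.path.isdir(devices_dir):
            groups[group] = sorted(os.listdir(devices_dir))
    return groups


def shared_groups(groups):
    """Return only the groups that hold more than one device."""
    return {g: devs for g, devs in groups.items() if len(devs) > 1}


if __name__ == "__main__":
    for group, devices in sorted(shared_groups(iommu_groups()).items()):
        print(f"group {group}: {', '.join(devices)}")
```

Run on a host, this prints any group containing multiple devices. Some sharing is expected (for example, a GPU and its own audio function), so the output is a starting point for review rather than a verdict.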

AI review

This talk goes places most cloud security talks never touch: IOMMU groups, PCIe transaction layer packets, ACS bypass, GPU firmware persistence, and NVLink as a lateral movement vector. The operational reality of discovering unauthorized NVLink cables installed by remote hands is the kind of physical-layer security finding that reminds you hardware still matters. Genuinely unique content in the cloud security space.
