Getting Started with Distributed Multi-GPU Libraries for Scalable AI and HPC | NVIDIA GTC 2025

Mads Kristensen

NVIDIA GTC 2025 · Session

In this talk from NVIDIA GTC, Mads Kristensen, a Senior Software Engineer at NVIDIA, demystifies the landscape of multi-GPU programming for scalable AI and High-Performance Computing (HPC). The presentation addresses the growing necessity of leveraging multiple GPUs, driven primarily by data sizes that outpace the memory capacity of a single GPU and by the need to shorten computation time. Kristensen surveys the pathways developers can take to move existing sequential projects to a multi-GPU environment, highlighting the trade-offs between ease of use, generality, and fine-grained control.

AI review

A competent survey of the multi-GPU programming landscape from someone who clearly lives in this space, but the article (and likely the talk) stays mostly at the taxonomy level — here are your options, here are the trade-offs — without going deep enough on any single path to help an engineer actually make a decision or reproduce the work. The pitfalls section is the most useful part, and the benchmark methodology is explicitly disclaimed as "not fair," which at least is honest.
