Getting Started with CUDA and Parallel Programming | NVIDIA GTC 2025 Session

NVIDIA CUDA Team

NVIDIA GTC 2025 · Session

In this NVIDIA GTC session, Stephen Jones, a CUDA Architect at NVIDIA, gave a talk titled "Getting Started with CUDA and Parallel Programming." Rather than a traditional "how-to" guide for writing low-level GPU kernels, the presentation reframed how developers should approach parallel programming on NVIDIA GPUs. His core message, delivered with humor and pragmatism, was that most developers should *avoid* writing parallel code directly wherever possible and instead rely on the extensive and increasingly sophisticated **CUDA** software stack.

AI review

Stephen Jones is clearly the real deal: a CUDA architect who has been inside the stack since 2008 and has the mental models to prove it. The talk does a good job of explaining the layered abstraction philosophy and introducing cuTile as a meaningful new programming model. But the article summary reads more like a well-structured recap than evidence of a talk with real engineering teeth: it has no runnable code and no architectural details of cuTile itself, and the one concrete result (Llama 8B within 10% of cuDNN) arrives without enough scaffolding to evaluate. Good foundational content for engineers new…
