Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift

Sergey Ioffe, Christian Szegedy

International Conference on Machine Learning 2025 · Test of Time Award

This article delves into the transformative impact of **Batch Normalization (BN)**, a technique that earned Sergey Ioffe and Christian Szegedy the prestigious ICML 2025 Test of Time Award. Presented by Ioffe, this talk offers a retrospective on the paper "Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift," published at ICML 2015. The presentation not only revisits the original motivations and immediate successes of BN but also provides a refined understanding of its mechanisms, incorporating insights gained over the past decade. The committee recognized BN for its profound and widespread influence, noting that it or its derivatives have become an indispensable component in nearly every deep learning system, from early convolutional neural networks to contemporary architectures.
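For readers unfamiliar with the mechanism being honored, the core of BN is simple: normalize each feature over the mini-batch, then rescale with learned parameters. The following is a minimal NumPy sketch of the training-time forward pass (the function name, `eps` default, and the toy inputs are illustrative, not from the original paper's code):

```python
import numpy as np

def batch_norm_forward(x, gamma, beta, eps=1e-5):
    """Training-time batch-norm forward pass for a (batch, features) matrix.

    Per feature: normalize by the mini-batch mean and variance, then apply
    the learned scale (gamma) and shift (beta) so the layer keeps its
    representational capacity.
    """
    mu = x.mean(axis=0)                    # per-feature batch mean
    var = x.var(axis=0)                    # per-feature batch variance
    x_hat = (x - mu) / np.sqrt(var + eps)  # approximately zero-mean, unit-variance
    return gamma * x_hat + beta

# Toy usage: with gamma=1, beta=0 the output is just the normalized batch.
x = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
y = batch_norm_forward(x, gamma=np.ones(2), beta=np.zeros(2))
```

At inference time the batch statistics are replaced by running (population) estimates accumulated during training, a detail central to the Batch Renormalization follow-up the talk touches on.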

AI review

This is a Test of Time retrospective on Batch Normalization — a technique whose practical significance is unambiguous and whose selection for this award is defensible on impact grounds alone. The talk is honest about what the original paper got wrong (the ICS framing), incorporates Santurkar et al.'s landscape-smoothing reinterpretation, and gestures toward Batch Renormalization and unpublished work on unbiased population gradients. Evaluated as a 2025 research contribution, however, the talk does not clear the bar for new theoretical or empirical results. It consolidates, clarifies, and…