Theoretical Limitations of Ensembles in the Age of Overparameterization
Niclas Dern, John Cunningham, Geoff Pleiss
International Conference on Machine Learning 2025 · Oral
This talk, presented by Geoff Pleiss, explores the theoretical underpinnings of **deep ensembles** in the contemporary landscape of **overparameterized models**. The research, driven primarily by Niclas Dern during an internship at the Vector Institute, with contributions from John Cunningham at Columbia, addresses a central question in machine learning: under a fixed computational or parameter budget, how does an ensemble of very large models compare to a single, equivalently large model, both in predictive performance and in the nature of its uncertainty estimates? The motivation stems from empirical observations suggesting a surprising equivalence between the two approaches. This question matters because **ensembles of neural networks** are widely adopted in practice for their purported benefits: boosting accuracy, enhancing robustness, and providing **uncertainty estimates**, particularly in safety-critical applications. By rigorously investigating the theoretical limits of ensembles in the overparameterized regime, the authors challenge conventional wisdom and offer a new perspective on what ensembles actually do.
AI review
A clean theoretical contribution that gives a principled explanation for a real empirical puzzle: why do large ensembles and large single models behave so similarly? The core result, that an infinite ensemble of overparameterized fixed-width random feature regressors (RFRs) converges to the same kernel ridgeless regressor as a single infinitely wide RFR, is stated precisely, holds under weak distributional assumptions, and provides genuine mechanism rather than post-hoc narrative. The recharacterization of ensemble variance as a capacity-difference signal rather than Bayesian uncertainty is the sharpest single…
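The convergence the review describes is easy to observe numerically. The sketch below (not the authors' code; the task, widths, and ReLU feature map are illustrative assumptions) fits an ensemble of overparameterized fixed-width RFRs via minimum-norm ("ridgeless") least squares and compares its averaged prediction to that of a single RFR with the same total width.

```python
# Hedged sketch: ensemble of overparameterized random feature regressors (RFRs)
# vs. a single, equivalently wide RFR. All sizes here are illustrative.
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D regression task: n training points, smooth target.
n, d = 20, 1
X = np.linspace(-3, 3, n).reshape(n, d)
y = np.sin(X[:, 0])
X_test = np.linspace(-2.5, 2.5, 50).reshape(-1, d)

def rfr_predict(X_tr, y_tr, X_te, width, rng):
    """Minimum-norm (ridgeless) least squares on random ReLU features."""
    W = rng.standard_normal((d, width))
    b = rng.standard_normal(width)
    phi = lambda Z: np.maximum(Z @ W + b, 0.0) / np.sqrt(width)
    # Overparameterized (width > n): pinv yields the min-norm interpolator.
    w_hat = np.linalg.pinv(phi(X_tr)) @ y_tr
    return phi(X_te) @ w_hat

# Ensemble of K fixed-width members vs. one model of width K * D.
K, D = 64, 256
ens = np.mean([rfr_predict(X, y, X_test, D, rng) for _ in range(K)], axis=0)
wide = rfr_predict(X, y, X_test, K * D, rng)

# Both approximate the same kernel ridgeless regressor,
# so their predictions nearly coincide on in-range test points.
print(np.max(np.abs(ens - wide)))
```

Both predictors approach the ridgeless regressor for the expected kernel of the random feature map, which is why the gap shrinks as K and the single model's width grow.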