All-Purpose Mean Estimation over R: Optimal Sub-Gaussianity with Outlier Robustness and Low Moments Performance

Jasper Lee, Walter McKelvie, Maoyuan Song, Paul Valiant

International Conference on Machine Learning 2025 · Oral

This talk, presented by Jasper Lee from UC Davis, addresses the foundational problem of **mean estimation** over the real numbers, a seemingly simple task that is remarkably subtle in practice. Co-authored with Walter McKelvie, Maoyuan Song, and Paul Valiant, the presentation challenges conventional wisdom surrounding the ubiquitous **sample mean** and highlights the shortcomings of traditional robust estimators such as **median-of-means**. The core message is that, although mean estimation is widely regarded as a solved problem, achieving robust, statistically optimal guarantees in finite-sample settings, especially in the presence of outliers or heavy-tailed data, remains a significant challenge.
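For readers unfamiliar with the baseline the talk contrasts against, below is a minimal sketch of the classical median-of-means estimator: split the samples into k groups, average each group, and output the median of the group means. The function name, grouping scheme, and fixed seeds are illustrative choices for this sketch, not code from the talk or paper.

```python
import numpy as np

def median_of_means(samples, k):
    """Classical median-of-means baseline: partition the data into k roughly
    equal groups, average each group, and return the median of the group
    averages. Robust to heavy tails and outliers, but its constants are not
    sub-Gaussian-sharp, which is part of the gap the talk focuses on."""
    x = np.random.default_rng(0).permutation(np.asarray(samples, dtype=float))
    groups = np.array_split(x, k)
    return float(np.median([g.mean() for g in groups]))

# Example: heavy-tailed data, where the raw sample mean is easily dragged by outliers.
data = np.random.default_rng(1).standard_t(df=2, size=10_000)
print("sample mean:", np.mean(data), "median-of-means:", median_of_means(data, k=50))
```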

AI review

Lee et al. establish that the LV21 estimator — already known to achieve finite-sample sub-Gaussian error with sharp constants — simultaneously achieves near-optimal robustness to adversarial corruption, optimal rates under finite z-th moments for z in (1,2), instance-by-instance optimality, and asymptotic efficiency, all without requiring knowledge of any of these parameters. This is a genuine theoretical contribution: a unification result showing that one carefully designed estimator dominates across previously distinct regimes. The work is rigorous in spirit, the claims are non-trivial…
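For context, the sub-Gaussian benchmark that LV21 is already known to attain with sharp constants can be written as follows; this is the standard form of the guarantee (with μ and σ the true mean and standard deviation, n the sample size, and δ the failure probability), not a verbatim statement from the paper.

```latex
% Sub-Gaussian benchmark for a mean estimate \hat{\mu} from n i.i.d. samples:
% with probability at least 1 - \delta,
\[
  |\hat{\mu} - \mu| \;\le\; (1 + o(1))\,\sigma \sqrt{\frac{2 \ln(1/\delta)}{n}},
\]
% i.e., the error matches what the sample mean would deliver on Gaussian data,
% up to a lower-order term, even though the underlying distribution need not
% be Gaussian.
```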