Flowing Datasets with Wasserstein over Wasserstein Gradient Flows

Clément Bonet, Christophe Vauthier, Anna Korba

International Conference on Machine Learning 2025 · Oral

This talk introduces a computationally efficient framework for comparing and transforming labeled datasets, a central challenge in modern machine learning. Presented by Clément Bonet, Christophe Vauthier, and Anna Korba at ICML 2025, the work proposes representing a labeled dataset as a **probability distribution over probability distributions** (one empirical measure per class), a hierarchical structure that captures both individual sample characteristics and inter-class relationships. To enable meaningful comparisons and transformations in this space, the authors introduce the **Wasserstein over Wasserstein (WoW) distance**, an optimal transport metric that endows it with a formal Riemannian structure. This structure in turn allows the definition of **WoW gradient flows**, enabling principled optimization for tasks such as dataset distillation and data augmentation.
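To make the hierarchical representation concrete, here is a minimal NumPy sketch of the two ingredients the summary describes: viewing a labeled dataset as a collection of per-class empirical measures, and comparing two such measures with the Sliced Wasserstein distance (the 1D building block used at the inner level). The function names and the Monte Carlo estimator below are illustrative choices, not the paper's implementation.

```python
import numpy as np

def sliced_wasserstein(X, Y, n_projections=100, p=2, seed=None):
    """Monte Carlo estimate of the p-Sliced Wasserstein distance between two
    equally weighted empirical measures X, Y in R^d.
    Assumes X and Y contain the same number of samples."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    # Draw random directions on the unit sphere.
    theta = rng.normal(size=(n_projections, d))
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)
    # Project to 1D; sorting yields the optimal 1D coupling in O(n log n).
    proj_X = np.sort(X @ theta.T, axis=0)
    proj_Y = np.sort(Y @ theta.T, axis=0)
    return np.mean(np.abs(proj_X - proj_Y) ** p) ** (1.0 / p)

def to_class_measures(X, y):
    """A labeled dataset as a 'distribution over distributions':
    one empirical measure (sample array) per class label."""
    return [X[y == c] for c in np.unique(y)]
```

With this representation, the outer (Wasserstein-over-Wasserstein) level compares two datasets by optimally matching their class measures, using the inner distance above as the ground cost.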

AI review

Bonet, Vauthier, and Korba present a mathematically coherent framework for treating labeled datasets as elements of P(P(R^d)) — distributions over distributions — equipped with a Wasserstein-over-Wasserstein metric that admits a formal Riemannian structure. The central theoretical contribution is well-motivated: lifting the optimal transport geometry one level up to handle the hierarchical structure of labeled data, then deriving gradient flows in that space using MMD with Sliced Wasserstein kernels. The complexity improvement over OTDD (O(c^2 n log n) vs. O(n^2 c^2)) is real and practically…
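The complexity claim above can be illustrated with a small sketch of an MMD between two datasets, each viewed as a uniform mixture of its per-class measures, using a kernel built on the Sliced Wasserstein distance. The Gaussian kernel form and equal per-class sample counts are assumptions for illustration; the point is the cost structure: O(c²) kernel evaluations, each O(n log n) from sorting projections, versus the O(n² c²) pairwise couplings OTDD-style distances require.

```python
import numpy as np

def sw2(X, Y, theta):
    """Squared Sliced Wasserstein-2 between equal-size empirical measures,
    reusing shared projection directions theta of shape (n_proj, d)."""
    pX = np.sort(X @ theta.T, axis=0)   # O(n log n) per measure
    pY = np.sort(Y @ theta.T, axis=0)
    return np.mean((pX - pY) ** 2)

def mmd2_sw_gaussian(classes_A, classes_B, sigma=1.0, n_proj=50, seed=0):
    """MMD^2 between two datasets represented as lists of per-class sample
    arrays, with a Gaussian kernel on SW_2 (an illustrative kernel choice,
    not necessarily the exact one used in the paper)."""
    rng = np.random.default_rng(seed)
    d = classes_A[0].shape[1]
    theta = rng.normal(size=(n_proj, d))
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)

    def k(X, Y):
        return np.exp(-sw2(X, Y, theta) / (2.0 * sigma**2))

    # O(c^2) kernel evaluations, each costing O(n log n): the review's
    # O(c^2 n log n) total, instead of O(n^2 c^2) pairwise transport costs.
    KAA = np.mean([[k(X, Y) for Y in classes_A] for X in classes_A])
    KBB = np.mean([[k(X, Y) for Y in classes_B] for X in classes_B])
    KAB = np.mean([[k(X, Y) for Y in classes_B] for X in classes_A])
    return KAA + KBB - 2.0 * KAB
```

A gradient flow in the paper's framework would then move the particles of one dataset to decrease such a discrepancy; this sketch only shows the static comparison.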