From Principle to Practice: Vertical Data Minimization for Machine Learning
Robin Staab, Nikola Jovanovic, Mislav Balunovic, Martin Vechev
IEEE Symposium on Security and Privacy 2024 · Day 3 · Continental Ballroom 6
This talk, presented by Robin Staab with co-authors from the SRI Lab at ETH Zurich, introduces **Vertical Data Minimization (VDM)**, an approach to data privacy for machine learning. VDM reduces the granularity of collected personal data attributes by generalizing them (for example, recording an age range instead of an exact age), rather than collecting fewer data points (horizontal minimization). It responds to the data minimization principle in privacy regulation, stated in GDPR Article 5(1)(c) as the requirement that personal data be "adequate, relevant and limited to what is necessary" and echoed in the non-binding US Blueprint for an AI Bill of Rights. The work targets the untrusted collector setting: it focuses on data collected during model deployment and makes minimal assumptions about client capabilities.
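To make attribute generalization concrete, here is a minimal Python sketch of a client coarsening a record before sending it to an untrusted collector. It is not the authors' algorithm: the attribute names, the age bucket boundaries, and the number of postal-code digits kept are all hypothetical and hand-picked for illustration, whereas a practical system would choose generalizations that balance model utility against privacy risk.

```python
from bisect import bisect_right

# Hypothetical, hand-picked generalization rules (assumptions for illustration).
AGE_EDGES = [18, 30, 45, 65]                          # bucket boundaries
AGE_LABELS = ["<18", "18-29", "30-44", "45-64", "65+"]


def generalize_age(age: int) -> str:
    """Map an exact age to a coarse age bucket."""
    return AGE_LABELS[bisect_right(AGE_EDGES, age)]


def generalize_zip(zip_code: str, keep: int = 2) -> str:
    """Keep the first `keep` characters of a postal code and mask the rest."""
    return zip_code[:keep] + "*" * (len(zip_code) - keep)


def minimize(record: dict) -> dict:
    """Generalize a record on the client before it leaves the device."""
    return {
        "age": generalize_age(record["age"]),
        "zip": generalize_zip(record["zip"]),
    }


if __name__ == "__main__":
    # The collector only ever sees the generalized values.
    print(minimize({"age": 37, "zip": "8092"}))
```

Because the mapping is a fixed lookup, it can run on the client with very little compute, which fits the minimal-assumptions setting described above. The collector's model then has to operate on the coarse values rather than the exact ones.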
AI review
This research on Vertical Data Minimization (VDM) for ML inference proposes an approach to data minimization that is distinct from existing privacy-enhancing technologies (PETs). It contributes a state-of-the-art algorithm (PAT) and an adversarial evaluation framework. The work addresses current regulatory and privacy requirements and is applicable to real-world deployments.