DP-fy your DATA: How to (and why) synthesize Differentially Private Synthetic Data: Practical components for DP synthetic data system
Natalia Ponomareva, Sergei Vassilvitskii, Peter Kairouz, Alex Bie
International Conference on Machine Learning 2025 · Tutorial
This talk, part of a broader tutorial on differentially private (DP) synthetic data, shifts focus from algorithmic foundations to the system-level considerations required to deploy DP in practice. Presented by Peter Kairouz, with contributions from Natalia Ponomareva, Sergei Vassilvitskii, and Alex Bie, the session covers the engineering challenges and auditing mechanisms needed to ensure that synthetic data generated with differential privacy actually upholds its privacy guarantees in real-world applications. Its central point is that simply "flipping on the DP switch" is insufficient: a robust system architecture, a carefully defined privacy unit, and continuous auditing are all required.
AI review
A competent and well-organized tutorial on the systems engineering of differentially private synthetic data, with some genuinely useful empirical demonstrations and a candid acknowledgment of open problems. The talk is upfront about what it is (practical guidance, not theory), and the auditing methodology, which uses canary data and frames membership inference in terms of ROC curves, is clearly presented. However, this is not a research contribution in any formal sense: the core ideas (user-level DP-SGD, per-user gradient clipping, membership inference auditing via likelihood ratio tests) are prior work, and…