Overconfidence is a Dangerous Thing: Mitigating Membership Inference Attacks by Enforcing Less Confident Prediction

Zitao Chen

Network and Distributed System Security (NDSS) Symposium 2024 · Day 3 · Privacy & ML

Machine learning models, increasingly pervasive in sensitive domains from healthcare diagnostics to financial services, routinely process vast amounts of private user data. This widespread deployment comes with a critical privacy vulnerability: the unintentional leakage of training data through **Membership Inference Attacks (MIAs)**. In an MIA, an adversary queries a trained model to determine whether a specific individual's record was part of its training set, potentially exposing highly personal attributes such as health status or financial history. Existing defenses force a difficult trade-off: approaches with strong theoretical privacy guarantees, such as differential privacy, come at the cost of significant model accuracy degradation, while empirical defenses either offer limited protection or rely on additional public datasets that are often unavailable in practice.
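The "overconfidence" in the title points at the simplest form such attacks take: a model tends to assign unusually high confidence to samples it was trained on, so an attacker can guess membership by thresholding the model's prediction confidence. The sketch below illustrates this classic confidence-thresholding attack from the MIA literature; the threshold value and the simulated model outputs are illustrative stand-ins, not taken from the paper.

```python
# Minimal sketch of a confidence-thresholding membership inference attack.
# The threshold and the simulated "model" outputs below are hypothetical.
import numpy as np

def confidence_mia(softmax_probs: np.ndarray, threshold: float = 0.9) -> np.ndarray:
    """Guess membership from prediction confidence.

    softmax_probs: (N, num_classes) array of a target model's softmax
    outputs on N query samples.
    Returns a boolean array: True = guessed "member" (seen in training).
    """
    # Overconfident predictions (high max probability) are treated as
    # evidence that the sample was memorized during training.
    return softmax_probs.max(axis=1) >= threshold

# Toy demonstration with simulated confidences: training samples tend to
# receive sharper (more confident) predictions than held-out samples.
rng = np.random.default_rng(0)
member_conf = rng.uniform(0.90, 1.00, size=5)      # samples seen in training
nonmember_conf = rng.uniform(0.50, 0.95, size=5)   # held-out samples

for name, conf in [("members", member_conf), ("non-members", nonmember_conf)]:
    probs = np.stack([conf, 1.0 - conf], axis=1)   # fake 2-class softmax
    print(name, confidence_mia(probs))
```

This gap between member and non-member confidence is exactly the signal the paper's defense targets: if the model is forced to predict less confidently, the thresholding attack above loses its discriminating power.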