It's Simplex! Disaggregating Measures to Improve Certified Robustness
Andrew C. Cullen, Paul Montague, Shijie Liu, Sarah M. Erfani, Benjamin I.P. Rubinstein
IEEE Symposium on Security and Privacy 2024 · Day 2 · Continental Ballroom 5
In machine learning security, **adversarial examples** pose a significant threat: subtle perturbations of input data crafted to mislead classification models. While numerous reactive defenses have emerged, they often fall short against novel attack vectors and provide no generalizable security guarantees. This gap motivates **certified robustness**, a proactive approach that mathematically guarantees a lower bound on the distance to any potential adversarial example, thereby establishing a measurable radius of security around a given input.
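To make the idea of a certified radius concrete, here is a minimal sketch of the standard randomized-smoothing certificate (Cohen et al. style), which converts the top-two class probabilities of a smoothed classifier into a guaranteed L2 radius. This is an illustrative example of the certification schemes the paper builds on, not the Simplex method itself; the function name and parameters are our own.

```python
from statistics import NormalDist

def certified_radius(p_a: float, p_b: float, sigma: float) -> float:
    """Certified L2 radius for a classifier smoothed with Gaussian noise
    of scale `sigma`, given (lower/upper bounds on) the probabilities of
    the top class `p_a` and runner-up class `p_b`.

    Within this radius of the input, the smoothed classifier's top
    prediction is guaranteed not to change.
    """
    phi_inv = NormalDist().inv_cdf  # inverse standard-normal CDF
    return 0.5 * sigma * (phi_inv(p_a) - phi_inv(p_b))

# A more confident prediction (larger gap between p_a and p_b)
# yields a larger certified radius.
r = certified_radius(p_a=0.9, p_b=0.1, sigma=0.5)
```

Aggregate benchmarks typically report only the average of such radii; the talk's Simplex of Possible Outputs disaggregates this view to expose per-input robustness structure.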
AI review
This research delivers a significant leap in certified robustness for ML, introducing a new, more efficient certification scheme and an ensemble approach that leverages existing techniques for best-in-class performance. The novel Simplex of Possible Outputs provides an invaluable diagnostic tool, moving beyond aggregate metrics to offer granular insights into model robustness. This is essential work for anyone building provably secure AI systems.