Text-CRS: A Generalized Certified Robustness Framework against Textual Adversarial Attacks

Xinyu Zhang, Hanbin Hong, Yuan Hong, Peng Huang, Binghui Wang, Zhongjie Ba

IEEE Symposium on Security and Privacy 2024 · Day 2 · Continental Ballroom 5

This talk introduces Text-CRS, a framework that provides **certified robustness** against **textual adversarial attacks** on deep language models. Presented by Xinyu Zhang and co-authors, Text-CRS addresses a critical vulnerability in modern AI: models can be made to misclassify by subtle, human-imperceptible alterations to their input. Empirical defenses exist, but they often fail against adaptive or previously unseen attacks, which motivates defenses with provable guarantees.

AI review

Text-CRS presents a first-of-its-kind framework for certified robustness against common word-level textual adversarial attacks. By applying randomized smoothing in the word-embedding space and deriving new theoretical guarantees, the work moves NLP defenses from empirical heuristics toward provable security, a notable advance for AI security.
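To make the core idea concrete, below is a minimal sketch of randomized smoothing over an embedding vector, in the style of Cohen et al.'s Gaussian smoothing certificate. This is not the Text-CRS method itself: the linear classifier, the noise level `sigma`, the sample count `n`, and the crude clipping of the vote estimate are all illustrative assumptions (a real certificate uses a confidence lower bound such as Clopper-Pearson).

```python
import numpy as np
from statistics import NormalDist

# Hypothetical toy setup: a fixed linear classifier over a 4-dim "embedding".
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 4))  # 3 classes, 4 embedding dimensions

def classify(x):
    """Base classifier f: embedding -> class index."""
    return int(np.argmax(W @ x))

def smoothed_predict(x, sigma=0.5, n=2000):
    """Monte-Carlo estimate of the smoothed classifier
    g(x) = argmax_c P[f(x + z) = c],  z ~ N(0, sigma^2 I),
    plus an L2 certified radius sigma * Phi^{-1}(p_top)."""
    noise = rng.normal(scale=sigma, size=(n, x.shape[0]))
    votes = np.bincount([classify(x + z) for z in noise],
                        minlength=W.shape[0])
    top = int(np.argmax(votes))
    p_top = votes[top] / n
    # Crude finite-sample guard; a real certificate lower-bounds p_top.
    p_bar = min(p_top, 1.0 - 1.0 / n)
    radius = sigma * NormalDist().inv_cdf(p_bar) if p_bar > 0.5 else 0.0
    return top, p_top, radius

x = rng.normal(size=4)           # one toy embedding
pred, p_top, radius = smoothed_predict(x)
```

If the majority class wins clearly (`p_top > 0.5`), the prediction is certifiably stable for any L2 embedding perturbation smaller than `radius`; Text-CRS extends this style of guarantee to word-level operations beyond additive noise.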
