Text-CRS: A Generalized Certified Robustness Framework against Textual Adversarial Attacks
Xinyu Zhang, Hanbin Hong, Yuan Hong, Peng Huang, Binghui Wang, Zhongjie Ba
IEEE Symposium on Security and Privacy 2024 · Day 2 · Continental Ballroom 5
The talk "Text-CRS: A Generalized Certified Robustness Framework against Textual Adversarial Attacks", presented by Xinyu Zhang and co-authored by Hanbin Hong, Yuan Hong, Peng Huang, Binghui Wang, and Zhongjie Ba, introduces a new approach to hardening natural language processing (NLP) models. The work tackles the pervasive problem of adversarial attacks, which subtly manipulate text inputs to trick deep learning models into incorrect classifications. Unlike prior empirical defenses, which often fail against adaptive or unseen attacks, Text-CRS provides a **certified robustness** guarantee: model predictions are provably stable under a defined range of adversarial perturbations.
AI review
This work delivers the first generalized certified robustness framework for word-level textual adversarial attacks, a critical advancement for NLP security. It cleverly adapts randomized smoothing to discrete text, offering provable guarantees against substitution, reordering, insertion, and deletion. This is a genuinely novel and impactful solution to a long-standing vulnerability.
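To make the randomized-smoothing idea concrete, here is a minimal toy sketch of how smoothing can be applied to discrete text for the word-substitution case: classify many randomly perturbed copies of the input and take a majority vote. This is an illustrative assumption, not the Text-CRS algorithm itself; the classifier, synonym table, and all function names (`base_classifier`, `perturb`, `smoothed_classify`) are hypothetical.

```python
import random
from collections import Counter

# Toy synonym table (hypothetical; Text-CRS uses learned embeddings
# and a far richer perturbation model).
SYNONYMS = {
    "good": ["great", "fine", "nice"],
    "bad": ["poor", "awful", "terrible"],
    "movie": ["film", "picture"],
}

def base_classifier(words):
    """Toy base classifier: positive if any 'positive' keyword appears."""
    positive = {"good", "great", "fine", "nice"}
    return "pos" if any(w in positive for w in words) else "neg"

def perturb(words, p=0.5, rng=random):
    """Randomly substitute each word with a synonym with probability p."""
    out = []
    for w in words:
        if w in SYNONYMS and rng.random() < p:
            out.append(rng.choice(SYNONYMS[w]))
        else:
            out.append(w)
    return out

def smoothed_classify(words, n_samples=200, seed=0):
    """Majority vote over classifications of randomly perturbed copies.

    The vote margin is what a certification procedure would turn into
    a provable robustness radius for the smoothed classifier.
    """
    rng = random.Random(seed)
    votes = Counter(
        base_classifier(perturb(words, rng=rng)) for _ in range(n_samples)
    )
    label, count = votes.most_common(1)[0]
    return label, count / n_samples

label, margin = smoothed_classify("a good movie".split())
```

The certified guarantee comes from the margin: if the top label wins by a large enough fraction of the votes, no bounded perturbation can flip the smoothed prediction. Extending this from substitution to reordering, insertion, and deletion, each with its own noise distribution, is the core contribution the review highlights.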