LMSanitator: Defending Prompt-Tuning Against Task-Agnostic Backdoors

Chengkun Wei

Network and Distributed System Security (NDSS) Symposium 2024 · Day 3 · LLM Security

In the rapidly evolving landscape of large language models (LLMs), **prompt-tuning** has emerged as a parameter-efficient paradigm for adapting powerful pretrained models to diverse downstream tasks: the pretrained backbone stays frozen while only a small set of soft-prompt parameters is trained. This talk, "LMSanitator: Defending Prompt-Tuning Against Task-Agnostic Backdoors," delivered by Chengkun Wei, examines a critical security vulnerability of this approach: its susceptibility to **task-agnostic backdoors**. Because these backdoors are implanted in the pretrained language model (PLM) itself rather than in any task-specific component, they survive prompt-tuning untouched and can compromise *any* downstream task, regardless of how the model is subsequently tuned.
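To make the threat model concrete, here is a minimal prompt-tuning sketch (an assumed setup for illustration, not code from the talk): the model name, prompt length, and helper function are all hypothetical. It shows why a backdoor baked into the backbone persists, since the only trainable parameters are the soft-prompt embeddings and the PLM's weights are never updated.

```python
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

MODEL_NAME = "roberta-base"   # assumed backbone; any PLM works the same way
N_PROMPT_TOKENS = 20          # assumed soft-prompt length

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
backbone = AutoModel.from_pretrained(MODEL_NAME)

# Freeze every backbone parameter: prompt-tuning never updates the PLM,
# which is why a task-agnostic backdoor inside it survives tuning.
for p in backbone.parameters():
    p.requires_grad = False

# The only trainable parameters: a small matrix of soft-prompt embeddings.
hidden = backbone.config.hidden_size
soft_prompt = nn.Parameter(torch.randn(N_PROMPT_TOKENS, hidden) * 0.02)

def forward_with_prompt(texts):
    """Prepend the soft prompt to the input embeddings, run the frozen PLM."""
    enc = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
    tok_emb = backbone.get_input_embeddings()(enc["input_ids"])  # (B, T, H)
    batch = tok_emb.size(0)
    prompt = soft_prompt.unsqueeze(0).expand(batch, -1, -1)      # (B, P, H)
    inputs_embeds = torch.cat([prompt, tok_emb], dim=1)          # (B, P+T, H)
    prompt_mask = torch.ones(
        batch, N_PROMPT_TOKENS, dtype=enc["attention_mask"].dtype
    )
    attention_mask = torch.cat([prompt_mask, enc["attention_mask"]], dim=1)
    return backbone(inputs_embeds=inputs_embeds, attention_mask=attention_mask)

out = forward_with_prompt(["an example sentence"])
print(out.last_hidden_state.shape)  # (1, N_PROMPT_TOKENS + seq_len, hidden)
```

An optimizer over `[soft_prompt]` alone would then fit the downstream task; no gradient ever touches the backbone, so any trigger behavior implanted upstream is carried into every task built this way.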