JBShield: Defending Large Language Models from Jailbreak Attacks through Activated Concept Analysis and Manipulation

Shenyi Zhang

34th USENIX Security Symposium (USENIX Security '25) · Day 3 · Vulnerabilities in LLMs: Privacy, Safety, and Defense