THEMIS: Regulating Textual Inversion for Personalized Concept Censorship

Yutong Wu

Network and Distributed System Security (NDSS) Symposium 2025 · Day 1 · AI Safety

This talk introduces THEMIS, a framework for injecting concept censorship capabilities into embeddings produced by **Textual Inversion**, a popular technique for personalizing image generation models. Presented by Shiao from Jojan University on behalf of the paper's authors, the research addresses an emerging challenge: malicious users can exploit the flexibility and widespread availability of personalized generative AI models to create harmful or undesirable content. THEMIS lets creators of Textual Inversion embeddings proactively "backdoor" them, so that specific blacklisted concepts are not generated when the embedding is combined with certain prompts.

AI review

THEMIS is a competent, clearly scoped piece of ML security research that addresses a real problem — malicious exploitation of Textual Inversion embeddings — with a technically coherent backdoor-injection mechanism. The work is publishable and defensible, but it sits in a crowded neighborhood of backdoor-as-defense literature and doesn't land a punch hard enough to be memorable at a top security venue.
