Model Immunization from a Condition Number Perspective

Amber Yijia Zheng, Cedar Site Bai, Brian Bullins, Raymond A. Yeh

International Conference on Machine Learning 2025 · Oral

In an era defined by the rapid advancement and widespread accessibility of generative AI, the responsible deployment of powerful open-weight models like **Stable Diffusion** and **DeepFloyd** presents a significant challenge. While these models democratize content creation, their flexibility also invites malicious adaptation: intentional fine-tuning to generate harmful, unethical, or copyrighted content. This talk by Cedar Site Bai of Purdue University, with collaborators Amber Yijia Zheng, Brian Bullins, and Raymond A. Yeh, introduces a framework for **model immunization** designed to counteract this threat.

AI review

A competent and technically honest contribution that applies condition number analysis to the problem of model immunization against malicious fine-tuning. The mathematical framing is clean and the regularizers are well-motivated, but the work is significantly constrained by the linear probing assumption, the L2 regression lens through which the theory is derived, and an unresolved tension with modern adaptive optimizers that substantially weakens the practical claim. The core idea — differentially manipulating condition numbers across task Hessians — is principled and opens a reasonable…
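To make the condition-number framing concrete, here is a minimal illustrative sketch, not the paper's implementation: under a linear probe with an L2 regression loss, the probe objective's Hessian on a task is H = XᵀX / n, so each task's feature matrix determines how fast gradient-based fine-tuning converges on it. The `benign`/`harmful` data below are synthetic stand-ins chosen only to show a well-conditioned versus ill-conditioned Hessian.

```python
import numpy as np

# Illustrative sketch (assumed setup, not the paper's code): with a linear
# probe and L2 regression loss, the probe Hessian is H = X^T X / n, so a
# task's conditioning is governed entirely by its features X.
def condition_number(features: np.ndarray) -> float:
    """Condition number lambda_max / lambda_min of H = X^T X / n."""
    hessian = features.T @ features / features.shape[0]
    eigvals = np.linalg.eigvalsh(hessian)  # ascending; real since H is symmetric
    return float(eigvals[-1] / max(eigvals[0], 1e-12))

rng = np.random.default_rng(0)
benign = rng.normal(size=(200, 8))               # roughly isotropic features
harmful = benign * np.array([1.0] * 7 + [1e-3])  # one near-degenerate direction

# Immunization in this view: push the harmful task's Hessian toward
# ill-conditioning (gradient descent converges slowly there) while keeping
# the benign task's Hessian well-conditioned so normal adaptation still works.
print(condition_number(benign), condition_number(harmful))
```

This also illustrates the review's caveat about adaptive optimizers: methods like Adam approximately precondition away such curvature gaps, which is why slowing fine-tuning via ill-conditioning alone is a weaker guarantee in practice.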