GAP-Diff: Protecting JPEG-Compressed Images from Diffusion-based Facial Customization

Haotian Zhu

Network and Distributed System Security (NDSS) Symposium 2025 · Day 1 · AI Safety

**Text-to-image diffusion models** let users generate highly customized, realistic images from simple text prompts. The same capability creates a serious security and privacy risk. In **facial customization**, a malicious actor can generate convincing deepfakes or altered images from as few as three to five identity images of a target individual. These fabricated images can place the person in arbitrary scenarios, seasons, or contexts, threatening their privacy, reputation, and security, as BBC reports on the technology's misuse have highlighted.

AI review

GAP-Diff tackles a real and underappreciated failure mode, namely adversarial protective noise collapsing under JPEG compression, and its GAN-based, frequency-aware solution is technically coherent. Solid academic work, but it's narrow in scope, the arms-race dynamics are largely unaddressed, and the presentation itself reads as a transcript of a talk delivered by a proxy, which dulls the impact.
