Transferable Multimodal Attack on Vision-Language Pre-training Models
Haodi Wang, Kai Dong, Zhilei Zhu, Haotong Qin, Aishan Liu, Xiaolin Fang
IEEE Symposium on Security and Privacy 2024 · Day 2 · Continental Ballroom 5
This talk introduces a novel framework for generating highly transferable adversarial examples against **Vision-Language Pre-training Models (VLPMs)**, a critical class of deep learning models that combine computer vision and natural language processing. Adversarial attacks make subtle, often imperceptible modifications to input data in order to induce incorrect predictions from machine learning models, exposing their vulnerabilities. *Transferability* makes such attacks particularly potent: an adversarial example crafted against one model can also deceive other models, even those with different architectures or training data, enabling black-box attacks in which the adversary has no access to the target model's internals.
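For readers unfamiliar with the mechanics, the sketch below illustrates the generic idea of crafting an adversarial example on one model and checking whether it transfers to another. It is a minimal FGSM-style example with hypothetical toy classifiers and random weights, not the TMM attack or the VLPMs discussed in the talk; the model names, `epsilon`, and input shapes are illustrative assumptions.

```python
import torch
import torch.nn as nn

# Two small toy CNN classifiers standing in for a surrogate (white-box)
# model and an unseen target (black-box) model. These are illustrative
# placeholders, not the vision-language models attacked in the talk.
def make_toy_classifier(num_classes: int = 10) -> nn.Module:
    return nn.Sequential(
        nn.Conv2d(3, 16, kernel_size=3, padding=1),
        nn.ReLU(),
        nn.AdaptiveAvgPool2d(1),
        nn.Flatten(),
        nn.Linear(16, num_classes),
    )

surrogate = make_toy_classifier()
target = make_toy_classifier()  # independently initialized "different model"

image = torch.rand(1, 3, 224, 224)   # a single RGB input image
label = torch.tensor([3])            # its (assumed) true class
epsilon = 8 / 255                    # L_inf perturbation budget

# Craft the adversarial example on the surrogate with one FGSM step:
# nudge the image in the direction that increases the surrogate's loss.
image.requires_grad_(True)
loss = nn.functional.cross_entropy(surrogate(image), label)
loss.backward()
adv_image = (image + epsilon * image.grad.sign()).clamp(0, 1).detach()

# Transferability check: does the example crafted on the surrogate also
# change the prediction of the independent target model?
with torch.no_grad():
    clean_pred = target(image).argmax(dim=1)
    adv_pred = target(adv_image).argmax(dim=1)
print(f"target prediction: clean={clean_pred.item()}, adversarial={adv_pred.item()}")
```

With these randomly initialized toy models the predictions are meaningless, but the same recipe applied to trained models is how transfer-based black-box attacks are typically evaluated: success is measured by how often examples crafted on the surrogate flip the target's output.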
AI review
This work introduces TMM, a novel attack framework that demonstrates highly transferable adversarial examples against Vision-Language Pre-training Models, including large generative vision-language models. By strategically targeting both modality-consistent and modality-discrepancy features, the research exposes a fundamental security weakness in multimodal AI. It's a wake-up call for anyone building or deploying these models, highlighting the urgent need for new defense strategies.