Bridging the Task Gap: Multi-Task Adversarial Transferability in CLIP and Its Derivatives
Kuanrong Liu 1,2, Siyuan Liang 2, Cheng Qian 3, Ming Zhang 3, Xiaochun Cao 1
2 National University of Singapore
3 National Key Laboratory of Science and Technology on Information System Security
Published on arXiv (arXiv:2509.23917)
Input Manipulation Attack
OWASP ML Top 10 — ML01
Key Finding
MT-AdvCLIP improves the average adversarial transfer success rate across multiple tasks by over 39% against various CLIP-derived models without increasing the perturbation budget.
MT-AdvCLIP
Novel technique introduced
As a general-purpose vision-language pretraining model, CLIP demonstrates strong generalization in image-text alignment and has been widely adopted in downstream applications such as image classification and image-text retrieval. However, it struggles with fine-grained tasks such as object detection and semantic segmentation. While many variants aim to improve CLIP on these tasks, its robustness to adversarial perturbations remains underexplored. Understanding how adversarial examples transfer across tasks is key to assessing CLIP's generalization limits and security risks. In this work, we conduct a systematic empirical analysis of the cross-task transfer behavior of CLIP-based models on image-text retrieval, object detection, and semantic segmentation under adversarial perturbations. We find that adversarial examples generated from fine-grained tasks (e.g., object detection and semantic segmentation) often exhibit stronger transfer potential than those from coarse-grained tasks, enabling more effective attacks against the original CLIP model. Motivated by this observation, we propose a novel framework, Multi-Task Adversarial CLIP (MT-AdvCLIP), which introduces a task-aware feature aggregation loss and generates perturbations with enhanced cross-task generalization. This design strengthens the attack effectiveness of fine-grained task models on the shared CLIP backbone. Experimental results on multiple public datasets show that MT-AdvCLIP significantly improves the adversarial transfer success rate (the average attack success rate across multiple tasks improves by over 39%) against various CLIP-derived models, without increasing the perturbation budget. This study reveals the transfer mechanism of adversarial examples in multi-task CLIP models, offering new insights into multi-task robustness evaluation and adversarial example design.
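To make the idea of a task-aware aggregated attack objective concrete, here is a minimal NumPy sketch of PGD-style perturbation crafting against a shared backbone. It is not the paper's implementation: the linear "backbone", the per-task target embeddings, the task weights, and the squared-error task losses are all illustrative assumptions standing in for CLIP's encoder and the real task heads.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy shared "backbone": a linear feature extractor (stand-in for CLIP's encoder).
W = rng.standard_normal((8, 16))  # feature dim 8, input dim 16

# Hypothetical task heads: each task pulls features toward its own target embedding.
task_targets = [rng.standard_normal(8) for _ in range(3)]  # e.g. retrieval, detection, segmentation
task_weights = np.array([0.2, 0.4, 0.4])  # fine-grained tasks weighted higher (assumption)

def features(x):
    return W @ x

def task_loss(x, target):
    # Squared distance between backbone features and the task's target embedding.
    d = features(x) - target
    return 0.5 * d @ d

def aggregated_loss(x):
    # Task-aware aggregation: a weighted sum of per-task losses on the shared features.
    return sum(w * task_loss(x, t) for w, t in zip(task_weights, task_targets))

def aggregated_grad(x):
    # Analytic gradient of the aggregated loss w.r.t. the input.
    g = np.zeros_like(x)
    for w, t in zip(task_weights, task_targets):
        g += w * (W.T @ (features(x) - t))
    return g

def pgd_attack(x, eps=0.05, alpha=0.01, steps=20):
    """Sign-gradient ascent on the aggregated loss, projected to an L-inf eps-ball."""
    x_adv = x.copy()
    for _ in range(steps):
        x_adv = x_adv + alpha * np.sign(aggregated_grad(x_adv))
        x_adv = x + np.clip(x_adv - x, -eps, eps)  # keep the perturbation budget fixed
    return x_adv

x = rng.standard_normal(16)
x_adv = pgd_attack(x)
```

The key design point mirrored here is that the perturbation budget (`eps`) is held fixed while the objective aggregates several task losses, so any gain in cross-task effectiveness comes from the loss design rather than a larger perturbation.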
Key Contributions
- Systematic empirical analysis revealing that adversarial examples from fine-grained tasks (detection, segmentation) exhibit stronger cross-task transfer potential than those from coarse-grained tasks against the base CLIP model.
- MT-AdvCLIP framework with a task-aware feature aggregation loss that generates adversarial perturbations with enhanced cross-task generalization without increasing the perturbation budget.
- Demonstrated an improvement of over 39% in average adversarial transfer success rate across multiple tasks and CLIP-derived models, compared to prior methods such as Co-Attack.
🛡️ Threat Analysis
The core contribution is a novel adversarial attack framework (MT-AdvCLIP) that generates gradient-based input perturbations causing misclassification and erroneous outputs at inference time across multiple tasks (image-text retrieval, object detection, semantic segmentation). The paper studies cross-task adversarial transferability and proposes a task-aware feature aggregation loss to improve transfer success — this is squarely an input manipulation / adversarial example attack.
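The cross-task transferability being measured can be sketched in a few lines: craft a perturbation against one task head and check whether it flips the prediction of a different head sharing the same backbone. This is a toy NumPy illustration, not the paper's evaluation protocol; the linear backbone, the two classifier heads, and the FGSM-style single-step attack are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Shared backbone (stand-in for a CLIP encoder) and two toy task heads.
W = rng.standard_normal((8, 16))
head_a = rng.standard_normal((2, 8))  # "source" task classifier (e.g. segmentation)
head_b = rng.standard_normal((2, 8))  # "target" task classifier (e.g. retrieval)

def logits(head, x):
    return head @ (W @ x)

def fgsm_on_task(head, x, y, eps):
    # Single-step sign-gradient attack on a cross-entropy loss over this head's logits.
    z = logits(head, x)
    p = np.exp(z - z.max()); p /= p.sum()
    grad_z = p.copy(); grad_z[y] -= 1.0   # d(cross-entropy)/d(logits)
    grad_x = W.T @ (head.T @ grad_z)      # chain rule back through the shared backbone
    return x + eps * np.sign(grad_x)

# Transfer success rate: attacks crafted on head_a, evaluated on head_b.
n, eps, transferred = 200, 0.1, 0
for _ in range(n):
    x = rng.standard_normal(16)
    y_a = int(np.argmax(logits(head_a, x)))  # treat clean predictions as labels
    y_b = int(np.argmax(logits(head_b, x)))
    x_adv = fgsm_on_task(head_a, x, y_a, eps)
    if int(np.argmax(logits(head_b, x_adv))) != y_b:
        transferred += 1
transfer_rate = transferred / n
```

Because both heads read features from the same backbone, a perturbation that moves the shared features can disrupt tasks it was never optimized against; the paper's finding is that this effect is strongest when the source task is fine-grained.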