FedAPT: Federated Adversarial Prompt Tuning for Vision-Language Models
Kun Zhai 1, Siheng Chen 2, Xingjun Ma 1, Yu-Gang Jiang 1
Published on arXiv (2509.06992)
Input Manipulation Attack
OWASP ML Top 10 — ML01
Key Finding
FedAPT consistently outperforms five state-of-the-art federated and adversarial prompt tuning methods across 15 image classification datasets, with superior robustness in non-IID and cross-domain scenarios.
FedAPT
Novel technique introduced
Federated Prompt Tuning (FPT) is an efficient method for cross-client collaborative fine-tuning of large Vision-Language Models (VLMs). However, models tuned using FPT are vulnerable to adversarial attacks, leading to misclassification in downstream tasks. In this work, we introduce Federated Adversarial Prompt Tuning (**FedAPT**), a novel method designed to enhance the adversarial robustness of FPT. We identify a key issue in FedAPT under non-independent and identically distributed (non-IID) settings: a *class information gap* between clients and the global model. Clients rely solely on limited local label information to generate adversarial samples for training, while the global model must defend against adversarial attacks from global labels. To address this issue, we propose a **class-aware prompt generator** that generates visual prompts from text prompts. This generator is guided by a *Global Label Embedding* (serving as a "beacon") which encodes cross-client label information to create more globally-aligned visual prompts. Additionally, we propose a **cross-layer generator sharing** strategy to enhance prompt coupling across different layers of the model, further boosting adversarial robustness. Extensive experiments on multiple image classification datasets demonstrate the superiority of FedAPT in improving adversarial robustness, outperforming existing methods by a large margin. FedAPT also exhibits exceptional generalization in cross-domain and cross-dataset scenarios, indicating its effectiveness in real-world applications.
Key Contributions
- Identifies the 'class information gap' problem in federated adversarial training under non-IID data distributions, where client-local adversarial samples fail to cover global class decision boundaries
- Proposes a class-aware cross-attention prompt generator guided by a Global Label Embedding (beacon) that encodes cross-client label information to generate globally-aligned visual prompts
- Proposes a cross-layer generator sharing strategy that couples prompts across CLIP layers, improving adversarial robustness while reducing trainable parameters
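The generator described above can be sketched as a small cross-attention module: learnable visual-prompt queries attend over the client's text prompts concatenated with a global label embedding (the "beacon"). This is a minimal illustration, not the paper's implementation; all names, dimensions, and the exact conditioning scheme are assumptions.

```python
import torch
import torch.nn as nn

class ClassAwarePromptGenerator(nn.Module):
    """Hypothetical sketch of a class-aware prompt generator:
    learnable queries cross-attend over text prompts plus a
    global label embedding ("beacon") to produce visual prompts."""

    def __init__(self, dim: int = 512, n_visual_prompts: int = 4, n_heads: int = 8):
        super().__init__()
        # Learnable queries, one per visual prompt token to generate.
        self.queries = nn.Parameter(torch.randn(n_visual_prompts, dim) * 0.02)
        self.attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)
        self.proj = nn.Linear(dim, dim)

    def forward(self, text_prompts: torch.Tensor, global_label_emb: torch.Tensor) -> torch.Tensor:
        # text_prompts: (B, n_text_tokens, dim)
        # global_label_emb: (B, n_classes, dim) -- cross-client label info
        context = torch.cat([text_prompts, global_label_emb], dim=1)
        batch = text_prompts.size(0)
        q = self.queries.unsqueeze(0).expand(batch, -1, -1)
        out, _ = self.attn(q, context, context)  # queries attend over context
        return self.proj(out)  # (B, n_visual_prompts, dim)
```

Because the same module can be applied at every CLIP layer, sharing this generator across layers (the paper's cross-layer sharing strategy) would keep the trainable-parameter count constant in depth.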
🛡️ Threat Analysis
The paper's primary contribution is a defense against adversarial examples that cause misclassification in VLMs at inference time. FedAPT uses adversarial training (generating adversarial samples to improve robustness) as its core mechanism, directly targeting the inference-time input manipulation threat.
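Adversarial training of this kind typically generates perturbed inputs on the fly with an iterative attack such as PGD (Madry et al.); a minimal sketch is below. The epsilon, step size, and step count are illustrative defaults, and FedAPT's exact attack configuration is not specified here.

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
    """Standard L-infinity PGD: iteratively step along the sign of the
    loss gradient, projecting back into the eps-ball around x.
    Used to craft the adversarial samples seen during training."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1).detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        # Ascend the loss, then project into the eps-ball and valid pixel range.
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = x.detach() + (x_adv - x).clamp(-eps, eps)
        x_adv = x_adv.clamp(0, 1)
    return x_adv.detach()
```

In the federated setting the key subtlety the paper raises is that each client can only run this attack against its own local labels `y`, which is exactly the class information gap the global label embedding is meant to close.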