A Vision-Language Pre-training Model-Guided Approach for Mitigating Backdoor Attacks in Federated Learning
Keke Gai 1, Dongjue Wang 1, Jing Yu 2, Liehuang Zhu 1, Qi Wu 3
Published on arXiv
2508.10315
Model Poisoning
OWASP ML Top 10 — ML10
Key Finding
Achieves an average ASR reduction of 2.03% on CIFAR-10 and 1.35% on CIFAR-10-LT, while improving main task accuracy by 7.92% and 0.48% respectively over existing defenses.
CLIP-Fed
Novel technique introduced
Defending against backdoor attacks in Federated Learning (FL) under heterogeneous client data distributions faces limitations in balancing effectiveness and privacy preservation, while most existing methods rely heavily on the assumption of homogeneous client data distributions or the availability of a clean server dataset. In this paper, we propose an FL backdoor defense framework, named CLIP-Fed, that utilizes the zero-shot learning capabilities of vision-language pre-training models. Our scheme overcomes the limitations that Non-IID data imposes on defense effectiveness by integrating pre-aggregation and post-aggregation defense strategies. CLIP-Fed aligns the knowledge of the global model and CLIP on the augmented dataset using a prototype contrastive loss and Kullback-Leibler divergence, so that class prototype deviations caused by backdoor samples are corrected and the correlation between trigger patterns and target labels is eliminated. To balance privacy preservation with dataset coverage against diverse triggers, we further construct and augment the server dataset using a multimodal large language model and frequency analysis, without any client samples. Extensive experiments on representative datasets demonstrate the effectiveness of CLIP-Fed. Compared with existing methods, CLIP-Fed achieves an average reduction in Attack Success Rate, i.e., 2.03% on CIFAR-10 and 1.35% on CIFAR-10-LT, while improving average Main Task Accuracy by 7.92% and 0.48%, respectively. Our codes are available at https://anonymous.4open.science/r/CLIP-Fed.
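The KL-divergence alignment step described in the abstract can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: it assumes per-batch logits from the global model (student) and CLIP's zero-shot predictions (teacher) are available, and the function names and temperature value are ours.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over the last axis."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kl_alignment_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student): penalizes the global model (student) for
    diverging from CLIP's zero-shot distribution (teacher), which is how
    trigger-label correlations absent from CLIP's predictions get suppressed."""
    p = softmax(teacher_logits, temperature)  # CLIP zero-shot distribution
    q = softmax(student_logits, temperature)  # global model distribution
    eps = 1e-12                               # avoid log(0)
    return float(np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1).mean())

# A model that agrees with CLIP incurs (near-)zero loss; a backdoored
# prediction that flips the top class incurs a larger one.
teacher = np.array([[4.0, 1.0, 0.0]])
aligned_loss = kl_alignment_loss(teacher.copy(), teacher)
poisoned_loss = kl_alignment_loss(np.array([[0.0, 4.0, 1.0]]), teacher)
```

Minimizing this term on the server-side augmented dataset pulls the aggregated model's behavior toward CLIP's trigger-agnostic predictions.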
Key Contributions
- CLIP-Fed framework leveraging CLIP's zero-shot capabilities for FL backdoor defense that operates under Non-IID (heterogeneous) client data distributions without requiring a clean server dataset
- Combined pre-aggregation and post-aggregation defense using prototype contrastive loss and KL divergence to align global model knowledge with CLIP and eliminate trigger-label correlations
- Privacy-preserving server dataset construction and augmentation via multimodal LLM and frequency analysis without using any client samples, improving coverage against diverse trigger patterns
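The frequency-analysis-based augmentation in the last bullet can be illustrated with a small NumPy sketch: planting a perturbation in a high-frequency band of an image's spectrum, roughly mimicking frequency-domain backdoor triggers so the server dataset covers them. The band choice, magnitude, and function name are assumptions; the paper's exact transform may differ.

```python
import numpy as np

def add_frequency_trigger(image, magnitude=5.0, seed=0):
    """Perturb a high-frequency band of a grayscale image's 2D spectrum,
    then map back to the spatial domain. A sketch of frequency-domain
    trigger augmentation, not the paper's exact procedure."""
    rng = np.random.default_rng(seed)
    spectrum = np.fft.fftshift(np.fft.fft2(image))  # low freqs at center
    h, w = image.shape
    mask = np.zeros((h, w), dtype=bool)
    mask[: h // 4, : w // 4] = True                 # corner = high-frequency band
    spectrum[mask] += magnitude * rng.standard_normal(int(mask.sum()))
    return np.real(np.fft.ifft2(np.fft.ifftshift(spectrum)))

# Augment a blank 32x32 server image with a spectral trigger pattern.
triggered = add_frequency_trigger(np.zeros((32, 32)))
```

Generating such samples server-side (alongside multimodal-LLM-generated images) widens trigger coverage without ever touching client data.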
🛡️ Threat Analysis
The paper directly defends against backdoor/trojan attacks in FL, where malicious clients embed hidden trigger-response behavior; CLIP-Fed eliminates trigger-label correlations and corrects class prototype deviations caused by backdoor samples using a prototype contrastive loss and KL divergence.
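The prototype contrastive loss mentioned above can be sketched as an InfoNCE-style objective that pulls each feature toward its class prototype (e.g. a CLIP text embedding) and away from the other classes' prototypes, countering the prototype drift that backdoor samples induce. This NumPy version is a hedged sketch under that reading; all names and the temperature are ours.

```python
import numpy as np

def prototype_contrastive_loss(features, labels, prototypes, temperature=0.1):
    """InfoNCE-style loss: for each L2-normalized feature, the positive is
    its own class prototype and the negatives are the other prototypes.
    Illustrative sketch, not the paper's exact formulation."""
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    p = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    logits = f @ p.T / temperature               # cosine-similarity logits
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-log_probs[np.arange(len(labels)), labels].mean())

# Features near their own class prototype give a low loss; a feature
# dragged toward the wrong prototype (as a backdoor would) gives a high one.
protos = np.eye(3)
clean = prototype_contrastive_loss(np.array([[1.0, 0.1, 0.0]]), np.array([0]), protos)
drifted = prototype_contrastive_loss(np.array([[0.1, 1.0, 0.0]]), np.array([0]), protos)
```

Minimizing this term keeps the global model's class prototypes anchored to CLIP's semantics rather than to trigger patterns.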