A Vision-Language Pre-training Model-Guided Approach for Mitigating Backdoor Attacks in Federated Learning
Keke Gai 1, Dongjue Wang 1, Jing Yu 2, Liehuang Zhu 1, Qi Wu 3
Published on arXiv
2508.10315
Model Poisoning
OWASP ML Top 10 — ML10
Key Finding
Achieves an average ASR reduction of 2.03% on CIFAR-10 and 1.35% on CIFAR-10-LT, while improving main task accuracy by 7.92% and 0.48% respectively over existing defenses.
CLIP-Fed
Novel technique introduced
Defending against backdoor attacks in Federated Learning (FL) under heterogeneous client data distributions faces limitations in balancing effectiveness and privacy preservation, while most existing methods rely heavily on the assumption of homogeneous client data distributions or the availability of a clean server dataset. In this paper, we propose an FL backdoor defense framework, named CLIP-Fed, that utilizes the zero-shot learning capabilities of vision-language pre-training models. Our scheme overcomes the limitations that Non-IID data imposes on defense effectiveness by integrating pre-aggregation and post-aggregation defense strategies. CLIP-Fed aligns the knowledge of the global model and CLIP on the augmented dataset using a prototype contrastive loss and Kullback-Leibler divergence, so that class prototype deviations caused by backdoor samples are corrected and the correlation between trigger patterns and target labels is eliminated. To balance privacy preservation with dataset coverage against diverse triggers, we further construct and augment the server dataset using a multimodal large language model and frequency analysis, without any client samples. Extensive experiments on representative datasets demonstrate the effectiveness of CLIP-Fed. Compared with existing methods, CLIP-Fed achieves an average reduction in Attack Success Rate, i.e., 2.03% on CIFAR-10 and 1.35% on CIFAR-10-LT, while improving average Main Task Accuracy by 7.92% and 0.48%, respectively. Our codes are available at https://anonymous.4open.science/r/CLIP-Fed.
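The KL-divergence alignment step described in the abstract can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: it assumes per-batch logits from the global model (student) and CLIP's zero-shot predictions (teacher) are available, and the function names and temperature value are ours.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over the last axis."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kl_alignment_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student): penalizes the global model (student) for
    diverging from CLIP's zero-shot distribution (teacher), which is how
    trigger-label correlations absent from CLIP's predictions get suppressed."""
    p = softmax(teacher_logits, temperature)  # CLIP zero-shot distribution
    q = softmax(student_logits, temperature)  # global model distribution
    eps = 1e-12                               # avoid log(0)
    return float(np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1).mean())

# A model that agrees with CLIP incurs (near-)zero loss; a backdoored
# prediction that flips the top class incurs a larger one.
teacher = np.array([[4.0, 1.0, 0.0]])
aligned_loss = kl_alignment_loss(teacher.copy(), teacher)
poisoned_loss = kl_alignment_loss(np.array([[0.0, 4.0, 1.0]]), teacher)
```

Minimizing this term on the server-side augmented dataset pulls the aggregated model's behavior toward CLIP's trigger-agnostic predictions.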
Key Contributions
- CLIP-Fed framework leveraging CLIP's zero-shot capabilities for FL backdoor defense that operates under Non-IID (heterogeneous) client data distributions without requiring a clean server dataset
- Combined pre-aggregation and post-aggregation defense using prototype contrastive loss and KL divergence to align global model knowledge with CLIP and eliminate trigger-label correlations
- Privacy-preserving server dataset construction and augmentation via multimodal LLM and frequency analysis without using any client samples, improving coverage against diverse trigger patterns
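The frequency-analysis-based augmentation in the last bullet can be illustrated with a small NumPy sketch: planting a perturbation in a high-frequency band of an image's spectrum, roughly mimicking frequency-domain backdoor triggers so the server dataset covers them. The band choice, magnitude, and function name are assumptions; the paper's exact transform may differ.

```python
import numpy as np

def add_frequency_trigger(image, magnitude=5.0, seed=0):
    """Perturb a high-frequency band of a grayscale image's 2D spectrum,
    then map back to the spatial domain. A sketch of frequency-domain
    trigger augmentation, not the paper's exact procedure."""
    rng = np.random.default_rng(seed)
    spectrum = np.fft.fftshift(np.fft.fft2(image))  # low freqs at center
    h, w = image.shape
    mask = np.zeros((h, w), dtype=bool)
    mask[: h // 4, : w // 4] = True                 # corner = high-frequency band
    spectrum[mask] += magnitude * rng.standard_normal(int(mask.sum()))
    return np.real(np.fft.ifft2(np.fft.ifftshift(spectrum)))

# Augment a blank 32x32 server image with a spectral trigger pattern.
triggered = add_frequency_trigger(np.zeros((32, 32)))
```

Generating such samples server-side (alongside multimodal-LLM-generated images) widens trigger coverage without ever touching client data.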
🛡️ Threat Analysis
The paper directly defends against backdoor/trojan attacks in FL, where malicious clients embed hidden trigger-response behavior; CLIP-Fed eliminates trigger-label correlations and corrects class prototype deviations caused by backdoor samples using a prototype contrastive loss and KL divergence.
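The prototype contrastive loss mentioned above can be sketched as an InfoNCE-style objective that pulls each feature toward its class prototype (e.g. a CLIP text embedding) and away from the other classes' prototypes, countering the prototype drift that backdoor samples induce. This NumPy version is a hedged sketch under that reading; all names and the temperature are ours.

```python
import numpy as np

def prototype_contrastive_loss(features, labels, prototypes, temperature=0.1):
    """InfoNCE-style loss: for each L2-normalized feature, the positive is
    its own class prototype and the negatives are the other prototypes.
    Illustrative sketch, not the paper's exact formulation."""
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    p = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    logits = f @ p.T / temperature               # cosine-similarity logits
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-log_probs[np.arange(len(labels)), labels].mean())

# Features near their own class prototype give a low loss; a feature
# dragged toward the wrong prototype (as a backdoor would) gives a high one.
protos = np.eye(3)
clean = prototype_contrastive_loss(np.array([[1.0, 0.1, 0.0]]), np.array([0]), protos)
drifted = prototype_contrastive_loss(np.array([[0.1, 1.0, 0.0]]), np.array([0]), protos)
```

Minimizing this term keeps the global model's class prototypes anchored to CLIP's semantics rather than to trigger patterns.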