DEFEND: Poisoned Model Detection and Malicious Client Exclusion Mechanism for Secure Federated Learning-based Road Condition Classification
Sheng Liu, Panos Papadimitratos
Published on arXiv
2512.06172
Data Poisoning Attack
OWASP ML Top 10 — ML02
Key Finding
DEFEND outperforms seven baseline countermeasures by at least 15.78% and, under TLFAs, matches the model performance of completely attack-free scenarios
DEFEND
Novel technique introduced
Federated Learning (FL) has drawn the attention of the Intelligent Transportation Systems (ITS) community. FL can train various models for ITS tasks, notably camera-based Road Condition Classification (RCC), in a privacy-preserving collaborative way. However, opening up to collaboration also opens FL-based RCC systems to adversaries, i.e., misbehaving participants that can launch Targeted Label-Flipping Attacks (TLFAs) and threaten transportation safety. Adversaries mounting TLFAs poison training data to misguide model predictions, from an actual source class (e.g., wet road) to a wrongly perceived target class (e.g., dry road). Existing countermeasures against poisoning attacks cannot maintain model performance under TLFAs close to the attack-free level, because they lack model misbehavior detection specific to TLFAs and neglect client exclusion after detection. To close this research gap, we propose DEFEND, which includes a poisoned model detection strategy that leverages neuron-wise magnitude analysis for attack goal identification and Gaussian Mixture Model (GMM)-based clustering. DEFEND discards poisoned model contributions in each round and adapts client ratings accordingly, eventually excluding malicious clients. Extensive evaluation involving various FL-RCC models and tasks shows that DEFEND can thwart TLFAs and outperform seven baseline countermeasures by at least 15.78%, remarkably achieving under attack the same performance as in attack-free scenarios.
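A minimal sketch of the attack-goal identification idea, assuming (hypothetically) that a TLFA concentrates unusually large final-layer weight updates on the source- and target-class output neurons; the function name and the specific L2-norm heuristic are illustrative, not the paper's exact method:

```python
import numpy as np

def suspected_flip_pair(final_layer_update):
    """Rank output-class neurons by the L2 norm of their weight-update rows.

    Hypothetical heuristic: a source-to-target label flip is assumed to
    produce the two largest per-class update magnitudes, on the source
    and target output neurons.
    """
    norms = np.linalg.norm(final_layer_update, axis=1)  # one norm per class
    top_two = np.argsort(norms)[-2:]                    # two most-perturbed classes
    return set(top_two.tolist())

# Toy update: 6 road-condition classes x 32 penultimate features,
# with classes 2 ("wet") and 5 ("dry") perturbed as by a 2 -> 5 flip.
rng = np.random.default_rng(0)
update = rng.normal(0.0, 0.01, size=(6, 32))
update[2] -= 0.5   # source-class neuron pushed down
update[5] += 0.5   # target-class neuron pushed up
print(suspected_flip_pair(update))  # the flagged pair contains classes 2 and 5
```

Real updates would need a benign reference for comparison; the point here is only that the per-neuron magnitudes expose which class pair the attack targets.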
Key Contributions
- Neuron-wise magnitude analysis for identifying the attack goal (source-to-target class flip) of TLFAs in FL model updates
- GMM-based clustering to distinguish poisoned from benign model contributions each aggregation round
- Adaptive client rating and permanent malicious-client exclusion mechanism that achieves attack-free performance even under active TLFA
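The GMM-based separation in the second contribution can be sketched with a tiny 1-D expectation-maximisation loop. The per-client scalar "anomaly score" input and the function name are assumptions for illustration (the paper clusters features of model updates):

```python
import numpy as np

def gmm_flag_poisoned(scores, n_iter=50):
    """Fit a 2-component 1-D Gaussian mixture via EM and flag clients
    assigned to the higher-mean component as poisoned.

    `scores` is a hypothetical per-client anomaly score, e.g. the update
    magnitude on the suspected source/target output neurons.
    """
    x = np.asarray(scores, dtype=float)
    mu = np.array([x.min(), x.max()])   # initialise means at the extremes
    var = np.full(2, x.var() + 1e-6)    # shared initial variance
    w = np.full(2, 0.5)                 # equal mixing weights
    for _ in range(n_iter):
        # E-step: responsibility of each component for each score
        dens = w * np.exp(-(x[:, None] - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)
        resp = dens / dens.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means, variances from responsibilities
        nk = resp.sum(axis=0)
        w = nk / len(x)
        mu = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk + 1e-6
    return resp.argmax(axis=1) == int(mu.argmax())

# Four benign clients with small scores, two malicious with large ones.
flags = gmm_flag_poisoned([0.10, 0.12, 0.09, 0.11, 2.0, 2.1])
```

In practice a library implementation (e.g. scikit-learn's `GaussianMixture`) would replace the hand-rolled EM; the sketch only shows why two components suffice when poisoned and benign contributions form separable clusters.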
🛡️ Threat Analysis
Directly defends against Targeted Label-Flipping Attacks (TLFAs) in federated learning, where malicious clients poison training data by flipping source-class labels to a target class (e.g., wet road → dry road). The threat model is Byzantine FL participants corrupting the global model via poisoned data contributions — core ML02 territory. DEFEND detects poisoned model updates and excludes malicious clients using neuron-wise magnitude analysis and GMM clustering.
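For concreteness, the label-flipping step of a TLFA as described above can be simulated in a few lines; the class ids and helper name are illustrative only:

```python
import random

WET, DRY = 1, 0  # hypothetical class ids for the wet/dry road example

def tlfa_flip(labels, source=WET, target=DRY, fraction=1.0, seed=0):
    """A malicious client relabels a fraction of its source-class samples
    as the target class before local training (wet road -> dry road)."""
    rng = random.Random(seed)
    flipped = list(labels)
    victims = [i for i, y in enumerate(flipped) if y == source]
    for i in rng.sample(victims, int(fraction * len(victims))):
        flipped[i] = target
    return flipped

print(tlfa_flip([1, 0, 1, 1, 0]))  # every wet-road label becomes dry: [0, 0, 0, 0, 0]
```

A model trained on such data learns to call wet roads dry, which is exactly the safety-critical misprediction DEFEND's detection and exclusion mechanism is built to catch.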