
GShield: Mitigating Poisoning Attacks in Federated Learning

Sameera K. M. 1, Serena Nicolazzo 2, Antonino Nocera 3, Vinod P. 4, Rafidha Rehiman K. A 4


Published on arXiv: 2512.19286

Data Poisoning Attack

OWASP ML Top 10 — ML02

Key Finding

GShield improves targeted-class accuracy by 43% to 65% over state-of-the-art defenses (FLAME, Krum, Median, Trimmed Mean) under targeted label-flipping attacks in non-IID federated learning settings.

GShield

Novel technique introduced


Federated Learning (FL) has recently emerged as a revolutionary approach to collaboratively training Machine Learning models. It enables decentralized model training while preserving data privacy, but its distributed nature makes it highly vulnerable to a severe class of attacks known as Data Poisoning, in which malicious clients inject manipulated data into the training process to degrade global model performance or cause targeted misclassification. In this paper, we present a novel defense mechanism called GShield, designed to detect and mitigate malicious and low-quality updates, especially under non-independent and identically distributed (non-IID) data scenarios. GShield learns the distribution of benign gradients through clustering and Gaussian modeling during an initial round, establishing a reliable baseline of trusted client behavior. Using this benign profile, GShield selectively aggregates only those updates that align with the expected gradient patterns, effectively isolating adversarial clients and preserving the integrity of the global model. An extensive experimental campaign demonstrates that our proposed defense significantly improves model robustness over state-of-the-art methods while maintaining high accuracy across both tabular and image datasets. Furthermore, GShield improves the accuracy of the targeted class by 43% to 65% after detecting malicious and low-quality clients.


Key Contributions

  • GShield: a server-side FL defense that requires no prior adversary knowledge, clean auxiliary data, or IID data assumptions
  • Cosine-similarity-based clustering plus Gaussian distribution modeling to characterize benign gradient behavior during an initial Safe Round phase
  • Selective aggregation in an Anomaly Detection phase that excludes updates deviating from the learned benign profile, improving targeted-class accuracy by 43–65%
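The two-phase flow above (a Safe Round that profiles benign gradients, then an Anomaly Detection phase that filters updates against that profile) can be sketched roughly as follows. This is a minimal illustration using NumPy only; the function names (`safe_round_profile`, `filter_updates`), the median-based "majority cluster" heuristic, the sigma floor, and the `Z_THRESH` cutoff are assumptions for the sketch, not details from the paper.

```python
# Hypothetical sketch of GShield-style filtering. Names, thresholds, and the
# crude clustering heuristic are illustrative assumptions, not the authors'
# actual implementation.
import numpy as np

Z_THRESH = 2.0  # assumed z-score cutoff for the anomaly-detection phase


def cosine_sim(a, b):
    """Cosine similarity between two flattened gradient vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))


def safe_round_profile(updates):
    """Safe Round: group updates by cosine similarity to the mean direction
    and fit a Gaussian (mu, sigma) over the benign cluster's similarities."""
    mean_dir = np.mean(updates, axis=0)
    sims = np.array([cosine_sim(u, mean_dir) for u in updates])
    benign = sims >= np.median(sims)             # crude majority cluster
    mu = sims[benign].mean()
    sigma = max(sims[benign].std(), 0.05)        # floor to avoid a degenerate profile
    return mean_dir, mu, sigma


def filter_updates(updates, mean_dir, mu, sigma):
    """Anomaly Detection: aggregate only updates whose similarity to the
    benign direction lies within Z_THRESH std-devs of the learned profile."""
    kept = [u for u in updates
            if abs(cosine_sim(u, mean_dir) - mu) / sigma <= Z_THRESH]
    return np.mean(kept, axis=0) if kept else mean_dir
```

With eight benign updates pointing roughly the same way and one poisoned update pointing the opposite way, the poisoned update falls far outside the Gaussian profile and is excluded from aggregation.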

🛡️ Threat Analysis

Data Poisoning Attack

GShield directly defends against data poisoning in federated learning — malicious clients inject manipulated training data (label-flipping attacks) to degrade global model performance or cause targeted misclassification. Byzantine clients sending corrupted updates are the canonical ML02 threat in FL. The defense filters poisoned gradient updates using learned benign distribution profiles.
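To make the attack side concrete, a targeted label-flipping poisoner on a malicious client can be as simple as relabeling one class as another before local training. The function name and the particular (`src`, `dst`) class pair below are illustrative assumptions, not taken from the paper.

```python
# Illustrative label-flipping poisoner for a malicious FL client.
# The src/dst classes are arbitrary examples.
import numpy as np


def flip_labels(y, src=7, dst=1):
    """Targeted label flip: relabel every sample of class `src` as `dst`,
    steering the global model toward misclassifying the targeted class."""
    y = np.asarray(y).copy()
    y[y == src] = dst
    return y
```

Training on data poisoned this way is what degrades targeted-class accuracy, which is exactly the metric GShield recovers by 43–65% once the poisoned updates are filtered out.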


Details

Domains
federated-learning · vision · tabular
Model Types
federated · cnn · traditional_ml
Threat Tags
training_time · targeted
Applications
federated learning · collaborative model training