Gradient Purification: Defense Against Poisoning Attack in Decentralized Federated Learning
Bin Li , Xiaoye Miao , Yan Zhang , Jianwei Yin
Published on arXiv
2501.04453
Data Poisoning Attack
OWASP ML Top 10 — ML02
Key Finding
GPD significantly outperforms state-of-the-art defense methods in model accuracy under both IID and non-IID data distributions while uniquely preserving beneficial knowledge from malicious clients
GPD (Gradient Purification Defense)
Novel technique introduced
Decentralized federated learning (DFL) is inherently vulnerable to data poisoning attacks, as malicious clients can transmit manipulated gradients to neighboring clients. Existing defense methods either reject suspicious gradients per iteration or restart DFL aggregation after excluding all malicious clients; all of them neglect the potential benefits within contributions from malicious clients. In this paper, we propose a novel gradient purification defense, termed GPD, to defend against data poisoning attacks in DFL. It aims to separately mitigate the harm in gradients and retain the benefits embedded in model weights, thereby enhancing overall model accuracy. For each benign client in GPD, a recording variable is designed to track the historically aggregated gradients from each of its neighbors. It allows benign clients to precisely detect malicious neighbors and mitigate all aggregated malicious gradients at once. Upon mitigation, benign clients optimize model weights using purified gradients. This optimization not only retains previously beneficial components from malicious clients but also exploits canonical contributions from benign clients. We analyze the convergence of GPD, as well as its ability to achieve high accuracy. Extensive experiments demonstrate that GPD is capable of mitigating data poisoning attacks under both IID and non-IID data distributions, and that it significantly outperforms state-of-the-art defense methods in terms of model accuracy.
Key Contributions
- GPD (Gradient Purification Defense): a divide-and-conquer approach that uses model gradients to mitigate malicious impact and model weights to retain beneficial components from malicious clients
- Recording variable mechanism that tracks historically aggregated gradients per neighbor, enabling precise detection of malicious neighbors and one-shot mitigation of all accumulated malicious gradients
- Convergence analysis proving GPD's ability to achieve high accuracy, with experiments showing it outperforms SOTA defenses under both IID and non-IID data distributions
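The recording-variable mechanism above can be sketched in a few lines. The following is an illustrative Python sketch, not the paper's implementation: class and variable names (`BenignClient`, `record`, `purify`) are assumptions, and real GPD operates on full model parameter tensors with the paper's detection rule rather than an externally supplied malicious ID. It shows the core idea that each benign client keeps a per-neighbor running sum of aggregated gradients, so that once a neighbor is flagged as malicious, its entire accumulated contribution can be rolled back in one shot while all benign contributions stay in the weights.

```python
import numpy as np

class BenignClient:
    """Hypothetical sketch of a GPD benign client (names are illustrative)."""

    def __init__(self, dim, lr=0.1):
        self.weights = np.zeros(dim)
        self.lr = lr
        # Recording variable: neighbor id -> sum of gradients aggregated
        # from that neighbor so far.
        self.record = {}

    def aggregate(self, neighbor_id, gradient):
        # Apply the neighbor's gradient and record it for possible rollback.
        acc = self.record.setdefault(neighbor_id, np.zeros_like(gradient))
        acc += gradient
        self.weights -= self.lr * gradient

    def purify(self, malicious_id):
        # One-shot mitigation: undo every gradient ever aggregated from
        # the detected malicious neighbor, leaving benign updates intact.
        accumulated = self.record.pop(malicious_id, None)
        if accumulated is not None:
            self.weights += self.lr * accumulated
```

For example, after aggregating one update from a benign neighbor and two from a malicious one, calling `purify` on the malicious ID leaves the weights exactly as if only the benign update had been applied; training then continues from these purified weights, which is how previously beneficial components are retained.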
🛡️ Threat Analysis
The paper directly defends against data poisoning in DFL, where malicious clients manipulate training gradients to degrade the global model. The threat model is explicit: adversarial clients inject corrupted gradients to disrupt aggregation and degrade model accuracy — the canonical ML02 setting in federated learning.