LSHFed: Robust and Communication-Efficient Federated Learning with Locally-Sensitive Hashing Gradient Mapping
Guanjie Cheng, Mengzhen Yang, Xinkui Zhao, Shuyi Yu, Tianyu Du, Yangyang Wu, Mengying Zhu, Shuiguang Deng
Published on arXiv
2511.01296
Data Poisoning Attack
OWASP ML Top 10 — ML02
Model Inversion Attack
OWASP ML Top 10 — ML03
Key Finding
LSHFed maintains high model utility with up to 50% malicious participants while reducing gradient verification communication by up to 1000x compared to full-gradient methods.
LSHFed / LSHGM
Novel technique introduced
Federated learning (FL) enables collaborative model training across distributed nodes without exposing raw data, but its decentralized nature makes it vulnerable in trust-deficient environments. Inference attacks may recover sensitive information from gradient updates, while poisoning attacks can degrade model performance or induce malicious behaviors. Existing defenses often suffer from high communication and computation costs or from limited detection precision. To address these issues, we propose LSHFed, a robust and communication-efficient FL framework that simultaneously enhances aggregation robustness and privacy preservation. At its core, LSHFed incorporates LSHGM, a novel gradient verification mechanism that projects high-dimensional gradients into compact binary representations via multi-hyperplane locally-sensitive hashing. This enables accurate detection and filtering of malicious gradients using only their irreversible hash forms, thus mitigating privacy leakage risks and substantially reducing transmission overhead. Extensive experiments demonstrate that LSHFed maintains high model performance even when up to 50% of participants are collusive adversaries, while achieving up to a 1000x reduction in gradient verification communication compared to full-gradient methods.
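The multi-hyperplane hashing step at the heart of LSHGM can be illustrated with a SimHash-style sketch: each random hyperplane contributes one sign bit, and angular closeness between gradients survives as Hamming closeness between codes. The hash length `k = 64` and gradient dimension below are illustrative assumptions, not the paper's settings:

```python
import numpy as np

def lsh_gradient_hash(grad: np.ndarray, hyperplanes: np.ndarray) -> np.ndarray:
    """Map a flattened gradient to a k-bit binary code:
    one sign bit per random hyperplane (SimHash-style projection)."""
    return (hyperplanes @ grad >= 0).astype(np.uint8)

def hamming(a: np.ndarray, b: np.ndarray) -> int:
    """Number of differing bits between two binary codes."""
    return int(np.sum(a != b))

rng = np.random.default_rng(0)
dim, k = 100_000, 64                    # illustrative gradient size / hash length
planes = rng.standard_normal((k, dim))  # hyperplanes shared by all parties

g_honest = rng.standard_normal(dim)
g_benign = g_honest + 0.1 * rng.standard_normal(dim)  # honest variation
g_poison = -g_honest                                   # sign-flip attack

h = lsh_gradient_hash(g_honest, planes)
d_benign = hamming(h, lsh_gradient_hash(g_benign, planes))
d_poison = hamming(h, lsh_gradient_hash(g_poison, planes))
# Benign updates collide on most bits; an opposed (poisoned)
# update flips essentially all of them.
print(d_benign, d_poison)
```

Because only a k-bit code travels over the network instead of hundreds of thousands of floats, verification traffic shrinks by orders of magnitude, which is the intuition behind the paper's reported communication savings.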
Key Contributions
- LSHGM: a gradient verification algorithm that projects high-dimensional gradients into compact binary hashes via multi-hyperplane LSH, enabling malicious gradient detection at only 0.07% of full-gradient transmission cost
- LSHFed framework integrating LSHGM with a distributed masking strategy and ScoreQ-Hash node reputation mechanism for end-to-end privacy and robustness
- Up to 1000x reduction in gradient verification communication overhead while maintaining model performance against up to 50% collusive malicious participants
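Given only hash codes, one simple server-side filtering rule is to compare each submitted code against a bitwise-majority reference and reject outliers. This is an illustrative rule under stated assumptions (the threshold `tau` and the majority reference are inventions here, not the paper's ScoreQ-Hash mechanism):

```python
import numpy as np

def filter_by_hash(codes: np.ndarray, tau: float = 0.25) -> np.ndarray:
    """codes: (n_clients, k) array of 0/1 bits. Keep codes whose
    Hamming distance to the bitwise-majority code is <= tau * k."""
    n, k = codes.shape
    majority = (codes.sum(axis=0) > n / 2).astype(np.uint8)
    dist = (codes != majority).sum(axis=1)
    return dist <= tau * k  # boolean mask: True = accept

rng = np.random.default_rng(0)
k = 64
base = rng.integers(0, 2, k, dtype=np.uint8)        # "true" update direction
# Six honest clients: each code differs from base in a single bit.
honest = np.stack([base ^ np.eye(k, dtype=np.uint8)[rng.integers(k)]
                   for _ in range(6)])
# Four colluding sign-flippers submit the bitwise complement.
malicious = np.stack([1 - base] * 4)
keep = filter_by_hash(np.vstack([honest, malicious]))
```

With six honest and four colluding clients, the honest codes cluster tightly around the majority while the flipped codes sit near the maximum Hamming distance, so even a loose threshold separates them cleanly at 40% adversaries.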
🛡️ Threat Analysis
The core contribution is a Byzantine-fault-tolerant aggregation mechanism (LSHGM) that detects and filters malicious gradient updates from adversarial FL participants. It directly defends against data poisoning / Byzantine attacks in federated learning, where malicious clients send crafted gradient updates to degrade the global model.
LSHFed also explicitly defends against inference and gradient-leakage attacks, in which adversaries reconstruct private training data from gradient updates. The irreversible LSH hash representation prevents data reconstruction while preserving detection capability, directly targeting the gradient inversion threat model in federated learning.
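The irreversibility argument rests on the hash being massively many-to-one: a k-bit code partitions gradient space into at most 2^k cones, and infinitely many distinct gradients share each code, so the raw update cannot be recovered from its hash. A minimal illustration (dimensions here are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
d, k = 10_000, 32
planes = rng.standard_normal((k, d))

def code(g: np.ndarray) -> np.ndarray:
    """k-bit sign code of gradient g under the shared hyperplanes."""
    return (planes @ g >= 0).astype(np.uint8)

g = rng.standard_normal(d)
# Sign bits are invariant under positive rescaling, so every gradient
# on the same ray (among many others) collapses to the same code:
same = np.array_equal(code(g), code(3.0 * g))
```

Since an adversary observing only the code cannot distinguish any of these colliding gradients, gradient-inversion reconstruction is underdetermined by construction.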