LSHFed: Robust and Communication-Efficient Federated Learning with Locally-Sensitive Hashing Gradient Mapping
Guanjie Cheng, Mengzhen Yang, Xinkui Zhao, Shuyi Yu, Tianyu Du, Yangyang Wu, Mengying Zhu, Shuiguang Deng
Published on arXiv
2511.01296
Data Poisoning Attack
OWASP ML Top 10 — ML02
Model Inversion Attack
OWASP ML Top 10 — ML03
Key Finding
LSHFed maintains high model utility with up to 50% malicious participants while reducing gradient verification communication by up to 1000x compared to full-gradient methods.
LSHFed / LSHGM
Novel technique introduced
Federated learning (FL) enables collaborative model training across distributed nodes without exposing raw data, but its decentralized nature makes it vulnerable in trust-deficient environments. Inference attacks may recover sensitive information from gradient updates, while poisoning attacks can degrade model performance or induce malicious behaviors. Existing defenses often suffer from high communication and computation costs or from limited detection precision. To address these issues, we propose LSHFed, a robust and communication-efficient FL framework that simultaneously enhances aggregation robustness and privacy preservation. At its core, LSHFed incorporates LSHGM, a novel gradient verification mechanism that projects high-dimensional gradients into compact binary representations via multi-hyperplane locally-sensitive hashing. This enables accurate detection and filtering of malicious gradients using only their irreversible hash forms, thus mitigating privacy leakage risks and substantially reducing transmission overhead. Extensive experiments demonstrate that LSHFed maintains high model performance even when up to 50% of participants are collusive adversaries, while achieving up to a 1000x reduction in gradient verification communication compared to full-gradient methods.
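The multi-hyperplane hashing step at the heart of LSHGM can be illustrated with a SimHash-style sketch: each random hyperplane contributes one sign bit, and angular closeness between gradients survives as Hamming closeness between codes. The hash length `k = 64` and gradient dimension below are illustrative assumptions, not the paper's settings:

```python
import numpy as np

def lsh_gradient_hash(grad: np.ndarray, hyperplanes: np.ndarray) -> np.ndarray:
    """Map a flattened gradient to a k-bit binary code:
    one sign bit per random hyperplane (SimHash-style projection)."""
    return (hyperplanes @ grad >= 0).astype(np.uint8)

def hamming(a: np.ndarray, b: np.ndarray) -> int:
    """Number of differing bits between two binary codes."""
    return int(np.sum(a != b))

rng = np.random.default_rng(0)
dim, k = 100_000, 64                    # illustrative gradient size / hash length
planes = rng.standard_normal((k, dim))  # hyperplanes shared by all parties

g_honest = rng.standard_normal(dim)
g_benign = g_honest + 0.1 * rng.standard_normal(dim)  # honest variation
g_poison = -g_honest                                   # sign-flip attack

h = lsh_gradient_hash(g_honest, planes)
d_benign = hamming(h, lsh_gradient_hash(g_benign, planes))
d_poison = hamming(h, lsh_gradient_hash(g_poison, planes))
# Benign updates collide on most bits; an opposed (poisoned)
# update flips essentially all of them.
print(d_benign, d_poison)
```

Because only a k-bit code travels over the network instead of hundreds of thousands of floats, verification traffic shrinks by orders of magnitude, which is the intuition behind the paper's reported communication savings.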
Key Contributions
- LSHGM: a gradient verification algorithm that projects high-dimensional gradients into compact binary hashes via multi-hyperplane LSH, enabling malicious gradient detection at only 0.07% of full-gradient transmission cost
- LSHFed framework integrating LSHGM with a distributed masking strategy and ScoreQ-Hash node reputation mechanism for end-to-end privacy and robustness
- Up to 1000x reduction in gradient verification communication overhead while maintaining model performance against up to 50% collusive malicious participants
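Given only hash codes, one simple server-side filtering rule is to compare each submitted code against a bitwise-majority reference and reject outliers. This is an illustrative rule under stated assumptions (the threshold `tau` and the majority reference are inventions here, not the paper's ScoreQ-Hash mechanism):

```python
import numpy as np

def filter_by_hash(codes: np.ndarray, tau: float = 0.25) -> np.ndarray:
    """codes: (n_clients, k) array of 0/1 bits. Keep codes whose
    Hamming distance to the bitwise-majority code is <= tau * k."""
    n, k = codes.shape
    majority = (codes.sum(axis=0) > n / 2).astype(np.uint8)
    dist = (codes != majority).sum(axis=1)
    return dist <= tau * k  # boolean mask: True = accept

rng = np.random.default_rng(0)
k = 64
base = rng.integers(0, 2, k, dtype=np.uint8)        # "true" update direction
# Six honest clients: each code differs from base in a single bit.
honest = np.stack([base ^ np.eye(k, dtype=np.uint8)[rng.integers(k)]
                   for _ in range(6)])
# Four colluding sign-flippers submit the bitwise complement.
malicious = np.stack([1 - base] * 4)
keep = filter_by_hash(np.vstack([honest, malicious]))
```

With six honest and four colluding clients, the honest codes cluster tightly around the majority while the flipped codes sit near the maximum Hamming distance, so even a loose threshold separates them cleanly at 40% adversaries.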
🛡️ Threat Analysis
The core contribution is a Byzantine-fault-tolerant aggregation mechanism (LSHGM) that detects and filters malicious gradient updates from adversarial FL participants. It directly defends against data poisoning / Byzantine attacks in federated learning, where malicious clients send crafted gradient updates to degrade the global model.
LSHFed also explicitly defends against inference and gradient-leakage attacks, in which adversaries reconstruct private training data from gradient updates. The irreversible LSH hash representation prevents data reconstruction while preserving detection capability, directly targeting the gradient inversion threat model in federated learning.
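The irreversibility argument rests on the hash being massively many-to-one: a k-bit code partitions gradient space into at most 2^k cones, and infinitely many distinct gradients share each code, so the raw update cannot be recovered from its hash. A minimal illustration (dimensions here are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
d, k = 10_000, 32
planes = rng.standard_normal((k, d))

def code(g: np.ndarray) -> np.ndarray:
    """k-bit sign code of gradient g under the shared hyperplanes."""
    return (planes @ g >= 0).astype(np.uint8)

g = rng.standard_normal(d)
# Sign bits are invariant under positive rescaling, so every gradient
# on the same ray (among many others) collapses to the same code:
same = np.array_equal(code(g), code(3.0 * g))
```

Since an adversary observing only the code cannot distinguish any of these colliding gradients, gradient-inversion reconstruction is underdetermined by construction.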