Defense · 2025

FedLAD: A Linear Algebra Based Data Poisoning Defence for Federated Learning

Qi Xiong 1, Hai Dong 1, Nasrin Sohrabi 2, Zahir Tari 1


Published on arXiv: 2508.02136

Data Poisoning Attack

OWASP ML Top 10 — ML02

Key Finding

FedLAD maintains a low attack success rate for malicious node ratios from 0.2 to 0.8, remaining effective even when 70% of nodes are malicious on AG_NEWS, outperforming all five baseline defenses.

FedLAD

Novel technique introduced


Sybil attacks pose a significant threat to federated learning, as malicious nodes can collaborate to gain a majority and overwhelm the system. It is therefore essential to develop countermeasures that secure federated learning environments. We present Linear Algebra-based Detection (FedLAD), a novel defence against targeted data poisoning, a form of Sybil attack. Unlike existing approaches such as clustering and robust training, which struggle when malicious nodes dominate, FedLAD models the federated learning aggregation process as a linear problem, transforming it into a linear algebra optimisation challenge. This method identifies potential attacks by extracting the independent linear combinations from the original linear combinations, effectively filtering out redundant and malicious elements. Extensive experimental evaluations demonstrate the effectiveness of FedLAD compared to five well-established defence methods: Sherpa, CONTRA, Median, Trimmed Mean, and Krum. Using tasks from both image classification and natural language processing, our experiments confirm that FedLAD is robust and not dependent on specific application settings. The results indicate that FedLAD effectively protects federated learning systems across a broad spectrum of malicious node ratios. Compared to baseline defence methods, FedLAD maintains a low attack success rate when the malicious node ratio ranges from 0.2 to 0.8. Additionally, it preserves high model accuracy when the malicious node ratio is between 0.2 and 0.5. These findings underscore FedLAD's potential to enhance both the reliability and performance of federated learning systems in the face of data poisoning attacks.
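The core filtering idea described in the abstract can be sketched as follows: stack each client's flattened update as a row of a matrix and greedily keep only rows that are linearly independent of those already kept, so duplicated or scaled (colluding) updates are dropped. This is a minimal illustration under assumptions, not the authors' implementation; the function name and the rank-based check are ours.

```python
import numpy as np

def select_independent_updates(updates, tol=1e-8):
    """Greedily keep client updates that are linearly independent of
    the updates already kept (rank-based check). Illustrative sketch,
    not the paper's actual algorithm.

    updates: (n_clients, dim) array, one flattened model update per row.
    Returns the indices of the retained clients.
    """
    kept = []
    basis = np.empty((0, updates.shape[1]))
    for i, u in enumerate(updates):
        candidate = np.vstack([basis, u])
        # A row is redundant if adding it does not increase the rank.
        if np.linalg.matrix_rank(candidate, tol=tol) > len(kept):
            kept.append(i)
            basis = candidate
    return kept

# Toy example: client 2 replays client 0's update and client 3 sends a
# scaled copy of client 1's -- both are linearly dependent and dropped.
updates = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [1.0, 0.0, 0.0],   # duplicate of client 0
    [0.0, 2.0, 0.0],   # scaled copy of client 1
])
print(select_independent_updates(updates))  # -> [0, 1]
```

In this toy setting the two dependent updates are filtered before aggregation, which is the mechanism the abstract credits for tolerating high malicious ratios.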


Key Contributions

  • FedLAD models FL aggregation as a linear combination problem and filters malicious/redundant model updates by extracting independent linear combinations, tolerating malicious node ratios up to 0.8 in some settings
  • A parallel optimisation algorithm based on sub-matrix splitting that accelerates independent linear combination extraction, with a formal proof of correctness
  • Experimental validation across image classification and NLP tasks against five baselines (Sherpa, CONTRA, Median, Trimmed Mean, Krum) under varying malicious node ratios (0.2–0.8)
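The sub-matrix splitting contribution can be illustrated as a divide-and-merge extraction: split the stacked update matrix into chunks, filter each chunk independently (these can run in parallel), then run one final pass over the merged survivors. A row that is dependent within its chunk is dependent in the full matrix, so per-chunk filtering is safe; the final pass removes dependencies that only appear across chunks. This is a hedged sketch of the idea, not the paper's algorithm; all names and the threading choice are assumptions.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def _independent_rows(mat, tol=1e-8):
    """Return the rows of `mat` that are linearly independent of the
    rows kept before them (greedy rank check)."""
    kept = np.empty((0, mat.shape[1]))
    for row in mat:
        candidate = np.vstack([kept, row])
        if np.linalg.matrix_rank(candidate, tol=tol) > kept.shape[0]:
            kept = candidate
    return kept

def parallel_independent_rows(mat, n_chunks=4):
    """Divide-and-merge sketch: rows redundant inside a chunk are
    redundant in the full matrix, so chunks can be filtered in
    parallel; a final pass removes cross-chunk dependencies."""
    chunks = np.array_split(mat, n_chunks)
    with ThreadPoolExecutor() as pool:
        survivors = list(pool.map(_independent_rows, chunks))
    return _independent_rows(np.vstack(survivors))

A = np.array([
    [1.0, 0.0, 0.0],
    [2.0, 0.0, 0.0],   # dependent within its chunk
    [0.0, 1.0, 0.0],
    [1.0, 1.0, 0.0],   # dependent across chunks (sum of rows 0 and 2)
    [0.0, 0.0, 1.0],
])
print(parallel_independent_rows(A, n_chunks=2).shape[0])  # -> 3
```

The merge pass is what makes the split correct: per-chunk filtering alone would have kept the cross-chunk dependent row.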

🛡️ Threat Analysis

Data Poisoning Attack

FedLAD is explicitly designed to defend against targeted data poisoning (label flipping) attacks in federated learning, where Sybil nodes inject corrupted training updates to manipulate the global model. This maps directly to ML02, which covers data poisoning defenses including Byzantine-fault-tolerant FL protocols against malicious participants.


Details

Domains
vision, nlp, federated-learning
Model Types
federated
Threat Tags
training_time, targeted
Datasets
AG_NEWS, CIFAR-10
Applications
federated learning, image classification, text classification