defense · 2025

Nesterov-Accelerated Robust Federated Learning Over Byzantine Adversaries

Lihan Xu 1, Yanjie Dong 1, Gang Wang 2, Runhao Zeng 1, Xiaoyi Fan 1, Xiping Hu 1

1 citation · 36 references · arXiv


Published on arXiv · 2511.02657

Data Poisoning Attack

OWASP ML Top 10 — ML02

Key Finding

Byrd-NAFL outperforms existing Byzantine-resilient FL baselines in convergence speed and accuracy under non-convex loss functions while maintaining robustness to diverse Byzantine attack strategies.

Byrd-NAFL

Novel technique introduced


We investigate robust federated learning, where a group of workers collaboratively trains a shared model under the orchestration of a central server in the presence of Byzantine adversaries capable of arbitrary and potentially malicious behaviors. To simultaneously enhance communication efficiency and robustness against such adversaries, we propose a Byzantine-resilient Nesterov-Accelerated Federated Learning (Byrd-NAFL) algorithm. Byrd-NAFL seamlessly integrates Nesterov's momentum into the federated learning process alongside Byzantine-resilient aggregation rules, achieving fast convergence while safeguarding against gradient corruption. We establish a finite-time convergence guarantee for Byrd-NAFL under non-convex and smooth loss functions with a relaxed assumption on the aggregated gradients. Extensive numerical experiments validate the effectiveness of Byrd-NAFL and demonstrate its superiority over existing benchmarks in terms of convergence speed, accuracy, and resilience to diverse Byzantine attack strategies.
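To make the idea concrete, here is a minimal sketch of one server round combining Nesterov's momentum with a robust aggregation rule. This is an illustration of the general pattern, not the paper's exact update rule: the function names (`cwmed`, `nesterov_robust_step`), the look-ahead placement, and the hyperparameters are all assumptions for the example; coordinate-wise median stands in for the paper's family of Byzantine-resilient aggregators.

```python
import numpy as np

def cwmed(gradients):
    """Coordinate-wise median (CwMed): a Byzantine-resilient aggregation
    rule that takes the median of each coordinate across worker gradients,
    so a minority of arbitrary outliers cannot drag the aggregate."""
    return np.median(np.stack(gradients), axis=0)

def nesterov_robust_step(w, v, worker_grad_fn, n_workers, lr=0.1, beta=0.9):
    """One illustrative server round of Nesterov-accelerated robust FL.

    Workers evaluate gradients at the look-ahead point w + beta * v; the
    server aggregates them with a robust rule, then applies the momentum
    and parameter updates.
    """
    lookahead = w + beta * v
    grads = [worker_grad_fn(i, lookahead) for i in range(n_workers)]
    g = cwmed(grads)          # robust aggregation filters Byzantine outliers
    v = beta * v - lr * g     # Nesterov momentum update
    w = w + v                 # parameter update
    return w, v
```

With an honest majority, the median recovers the honest gradient direction even when a Byzantine worker reports arbitrarily large values, so the momentum iteration behaves as plain Nesterov acceleration on the clean objective.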


Key Contributions

  • Byrd-NAFL algorithm that seamlessly integrates Nesterov's momentum with Byzantine-resilient aggregation rules (Krum, CwMed, Bulyan, GeoMed) in federated learning
  • Finite-time convergence guarantee for non-convex, smooth loss functions under relaxed assumptions on aggregated gradients — extending prior work beyond strongly convex settings
  • Empirical demonstration of superior convergence speed, accuracy, and resilience versus existing benchmarks across diverse Byzantine attack strategies

🛡️ Threat Analysis

Data Poisoning Attack

The paper proposes Byzantine-fault-tolerant FL aggregation rules (Krum, CwMed, GeoMed variants) integrated with Nesterov momentum to defend against malicious clients sending arbitrary gradient updates aimed at degrading global model performance — the canonical ML02 federated learning threat.
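For intuition on why such rules blunt poisoned updates, below is a sketch of Krum-style selection (the rule from Blanchard et al., 2017, which the paper uses as one of its aggregators). The implementation details here are an assumption for illustration; the key property is that, unlike a plain mean, the selected gradient ignores up to `f` arbitrary outliers.

```python
import numpy as np

def krum(gradients, f):
    """Simplified Krum selection: with n workers of which at most f are
    Byzantine, pick the single gradient whose summed squared distance to
    its n - f - 2 nearest neighbours is smallest."""
    n = len(gradients)
    G = np.stack(gradients)
    # Pairwise squared Euclidean distances between all submitted gradients.
    dists = np.sum((G[:, None, :] - G[None, :, :]) ** 2, axis=-1)
    scores = []
    for i in range(n):
        d = np.sort(dists[i])                   # d[0] is distance to self (0)
        scores.append(d[1 : n - f - 1].sum())   # n - f - 2 nearest others
    return gradients[int(np.argmin(scores))]
```

Honest gradients cluster together and score low; a poisoned gradient far from the cluster scores high and is never selected, whereas a naive mean would be dragged arbitrarily far by a single malicious client.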


Details

Domains
federated-learning
Model Types
federated
Threat Tags
training_time · untargeted · grey_box
Applications
federated learning · distributed model training