Defense (2025)

Delayed Momentum Aggregation: Communication-efficient Byzantine-robust Federated Learning with Partial Participation

Kaoru Otsuka, Yuki Takezawa, Makoto Yamada



Published on arXiv: 2509.02970

Data Poisoning Attack

OWASP ML Top 10 — ML02

Key Finding

With a 20% Byzantine ratio and a 10% partial participation rate, DeMoA achieves the best accuracy, while existing Byzantine-robust methods collapse because Byzantine clients can form a majority of the sampled set.

DeMoA (Delayed Momentum Aggregation)

Novel technique introduced


Partial participation is essential for communication-efficient federated learning at scale, yet existing Byzantine-robust methods typically assume full client participation. In the partial participation setting, a majority of the sampled clients may be Byzantine; once Byzantine clients dominate the sample, existing methods break down immediately. We introduce delayed momentum aggregation, a principle in which the central server aggregates cached momentum from non-sampled clients together with fresh momentum from sampled clients. This ensures that Byzantine clients remain a minority from the server's perspective even when they dominate the sampled set. We instantiate this principle in our optimizer DeMoA and analyze its convergence rate, showing that DeMoA is Byzantine-robust under partial participation. Experiments show that, with a 20% Byzantine ratio and only a 10% partial participation rate, DeMoA achieves the best accuracy even where existing methods fail.


Key Contributions

  • Delayed momentum aggregation principle: server caches momentum from non-sampled clients to ensure Byzantine clients are always a statistical minority, even when they dominate the sampled set
  • DeMoA optimizer instantiating this principle with convergence guarantees for Byzantine robustness under partial participation
  • Empirical demonstration that DeMoA maintains accuracy at 20% Byzantine ratio and 10% participation rate where FedAvg, FedCM, and Byz-VR-MARINA-PP fail
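The caching idea behind delayed momentum aggregation can be sketched in a few lines. The sketch below is an illustration of the principle only, not the paper's exact algorithm: the coordinate-wise median as the robust aggregator, the `demoa_round` function name, and the cache initialization are all assumptions made for demonstration.

```python
import numpy as np

def demoa_round(server_cache, fresh_momenta, sampled_ids):
    """One server round in the spirit of delayed momentum aggregation.

    server_cache: dict client_id -> last known momentum vector (all clients)
    fresh_momenta: dict client_id -> momentum received this round (sampled only)
    """
    # Refresh the cache with fresh momentum from the sampled clients.
    for cid in sampled_ids:
        server_cache[cid] = fresh_momenta[cid]
    # Aggregate over ALL cached momenta, so Byzantine clients stay a
    # minority from the server's perspective even if they dominate the sample.
    stacked = np.stack([server_cache[cid] for cid in sorted(server_cache)])
    # Coordinate-wise median as an illustrative robust aggregator (assumption).
    return np.median(stacked, axis=0)

# Usage: 10 clients, honest momentum = 1.0; both sampled clients are
# Byzantine and send -100.0, yet they are only 2 of the 10 cached entries.
cache = {i: np.ones(3) for i in range(10)}
fresh = {8: -100.0 * np.ones(3), 9: -100.0 * np.ones(3)}
aggregated = demoa_round(cache, fresh, [8, 9])
```

Aggregating over the sampled set alone would return the Byzantine value here, since both sampled clients are malicious; aggregating over the full cache keeps them a 2-of-10 minority.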

🛡️ Threat Analysis

Data Poisoning Attack

Byzantine clients in federated learning send arbitrary/adversarial model updates to degrade global model performance — this is model-level poisoning via malicious participants. The paper proposes DeMoA, a Byzantine-fault-tolerant aggregation defense. Explicitly matches 'Byzantine attacks in federated learning' and 'Byzantine-fault-tolerant FL protocols' under ML02.


Details

Domains
federated-learning
Model Types
federated, cnn
Threat Tags
training_time, untargeted
Datasets
MNIST, CIFAR-10
Applications
federated learning, distributed model training