
Adaptive Federated Learning Defences via Trust-Aware Deep Q-Networks

Vedant Palit

0 citations · 17 references

Published on arXiv: 2510.01261

Model Poisoning (OWASP ML Top 10 — ML10)

Data Poisoning Attack (OWASP ML Top 10 — ML02)

Key Finding

Among all baselines, the DQN achieves the best robustness–accuracy trade-off; increasing client overlap (via a Dirichlet sweep) consistently reduces attack success rate (ASR) while detection remains stable.

Trust-aware Deep Q-Network (TADQN)

Novel technique introduced


Federated learning is vulnerable to poisoning and backdoor attacks under partial observability. We formulate defence as a partially observable sequential decision problem and introduce a trust-aware Deep Q-Network that integrates multi-signal evidence into client trust updates while optimizing a long-horizon robustness–accuracy objective. On CIFAR-10, we (i) establish a baseline showing steadily improving accuracy, (ii) show through a Dirichlet sweep that increased client overlap consistently improves accuracy and reduces attack success rate (ASR) with stable detection, and (iii) demonstrate in a signal-budget study that accuracy remains steady while ASR increases and ROC-AUC declines as observability is reduced, which highlights that sequential belief updates mitigate the effect of weaker signals. Finally, a comparison with random, linear-Q, and policy-gradient controllers confirms that the DQN achieves the best robustness–accuracy trade-off.
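
The "long-horizon robustness–accuracy objective" is not spelled out here; as a hedged illustration, a per-round reward of the following form, trained against the standard DQN target, would fit that description (the penalty weight $\lambda$ and this exact shaping are our assumptions, not the paper's stated reward):

$$
r_t = \mathrm{Acc}_{\text{val}}(\theta_{t+1}) - \lambda\,\mathrm{ASR}(\theta_{t+1}),
\qquad
y_t = r_t + \gamma \max_{a'} Q_{\bar\phi}(s_{t+1}, a'),
$$

with the network $Q_\phi$ trained to minimize $(Q_\phi(s_t, a_t) - y_t)^2$ against a target copy $Q_{\bar\phi}$, so that down-weighting a suspicious client pays off over future rounds, not just the current one.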


Key Contributions

  • First POMDP formulation of FL defence, treating client trust as latent state and aggregation as sequential decision-making under partial observability
  • Multi-signal Bayesian trust-tracking pipeline combining directional alignment, magnitude deviation, and validation impact across rounds (see the sketch after this list)
  • Trust-aware DQN aggregator that outperforms random, linear-Q, policy-gradient, and static robust-aggregation baselines on the robustness–accuracy trade-off
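
A minimal sketch of what the trust-tracking and aggregation steps could look like; the signal weights, the logistic update, and every function name below are illustrative assumptions, not the paper's exact rule:

```python
import numpy as np

def trust_update(trust, cos_align, mag_dev, val_impact,
                 w=(2.0, 1.5, 1.0), lr=0.5):
    """Fold one round of multi-signal evidence into per-client trust scores.

    The weights `w`, the evidence form, and the logistic update are
    illustrative assumptions; the paper combines the same three signals,
    but its exact update rule may differ.
    """
    # Evidence is high when a client's update points the same way as the
    # aggregate, has a typical magnitude, and improves validation loss.
    evidence = (w[0] * cos_align      # cosine alignment with server direction
                - w[1] * mag_dev      # relative deviation of the update norm
                + w[2] * val_impact)  # validation-loss improvement when applied
    # Bayesian-style log-odds update, squashed back into (0, 1).
    log_odds = np.log(trust / (1.0 - trust)) + lr * evidence
    return 1.0 / (1.0 + np.exp(-log_odds))

def aggregate(updates, trust):
    """Trust-weighted average of client updates (shape: clients x params)."""
    return np.average(updates, axis=0, weights=trust / trust.sum())
```

For example, a client whose update opposes the aggregate direction (`cos_align < 0`) and whose norm deviates sharply from the median (`mag_dev` large) sees its trust, and therefore its aggregation weight, shrink over successive rounds.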

🛡️ Threat Analysis

Data Poisoning Attack

The paper also defends against general Byzantine and model-poisoning attacks in FL, in which malicious clients send corrupted updates to degrade global-model accuracy: the classic FL data/model-poisoning threat.

Model Poisoning

The paper explicitly defends against backdoor attacks in FL, evaluated via ASR, in which malicious clients embed hidden trigger-based behaviours. The DQN learns to detect and down-weight clients that inject backdoors.
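
For concreteness, a minimal sketch of the ASR metric mentioned above, assuming a Keras-style `model.predict` and a hypothetical attacker-defined `trigger_fn`; neither name comes from the paper:

```python
import numpy as np

def attack_success_rate(model, x_clean, y_true, trigger_fn, target_label):
    """Fraction of non-target inputs that the backdoor trigger flips to the
    attacker's target class (the standard backdoor ASR definition).
    """
    # Only inputs whose true class differs from the target can be "flipped".
    mask = y_true != target_label
    x_triggered = trigger_fn(x_clean[mask])            # stamp the trigger
    preds = model.predict(x_triggered).argmax(axis=-1)
    return float(np.mean(preds == target_label))
```

A low ASR alongside high clean accuracy is exactly the trade-off the trust-aware DQN is rewarded for maintaining.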


Details

Domains
federated-learning, reinforcement-learning, vision
Model Types
federated, rl, cnn
Threat Tags
training_time, targeted, untargeted, grey_box
Datasets
CIFAR-10
Applications
federated learning, image classification