Adaptive Federated Learning Defences via Trust-Aware Deep Q-Networks
Published on arXiv (arXiv:2510.01261)
Model Poisoning
OWASP ML Top 10 — ML10
Data Poisoning Attack
OWASP ML Top 10 — ML02
Key Finding
Among all evaluated baselines, the DQN achieves the best robustness-accuracy trade-off; increasing client overlap (via a Dirichlet sweep) consistently reduces the attack success rate (ASR) while detection remains stable.
Trust-aware Deep Q-Network (TADQN)
Novel technique introduced
Federated learning is vulnerable to poisoning and backdoor attacks under partial observability. We formulate defence as a partially observable sequential decision problem and introduce a trust-aware Deep Q-Network that integrates multi-signal evidence into client trust updates while optimizing a long-horizon robustness-accuracy objective. On CIFAR-10, we (i) establish a baseline with steadily improving accuracy, (ii) show through a Dirichlet sweep that increased client overlap consistently improves accuracy and reduces the attack success rate (ASR) with stable detection, and (iii) demonstrate in a signal-budget study that, as observability is reduced, accuracy remains steady while ASR rises and ROC-AUC declines, indicating that sequential belief updates help compensate for weaker signals. Finally, a comparison with random, linear-Q, and policy-gradient controllers confirms that the DQN achieves the best robustness-accuracy trade-off.
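The long-horizon robustness-accuracy objective can be sketched as a discounted return over per-round rewards that trade off validation accuracy against backdoor success. The linear reward form, the trade-off weight `lam`, and the discount `gamma` below are illustrative assumptions, not values from the paper:

```python
def round_reward(val_accuracy, attack_success_rate, lam=1.0):
    """Per-round reward: favor clean accuracy, penalize attack success.

    lam is a hypothetical trade-off weight (assumption, not from the paper).
    """
    return val_accuracy - lam * attack_success_rate

def discounted_return(rewards, gamma=0.99):
    """Long-horizon objective a DQN optimizes: discounted sum of rewards."""
    g = 0.0
    for r in reversed(rewards):  # accumulate from the final round backwards
        g = r + gamma * g
    return g
```

A defence that boosts accuracy in one round but lets a backdoor persist scores poorly under this return, which is why a sequential (rather than per-round greedy) controller is natural here.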
Key Contributions
- First POMDP formulation of FL defence, treating client trust as a latent state and aggregation as sequential decision-making under partial observability
- Multi-signal Bayesian trust tracking pipeline combining directional alignment, magnitude deviation, and validation impact across rounds
- Trust-aware DQN aggregator that outperforms random, linear-Q, policy gradient, and static robust aggregation baselines on robustness-accuracy trade-off
🛡️ Threat Analysis
The paper defends against general Byzantine/model-poisoning attacks in FL, in which malicious clients send corrupted updates to degrade global-model accuracy; this is the classic FL data/model-poisoning threat.
It also explicitly defends against backdoor attacks in FL (evaluated via attack success rate), where malicious clients embed hidden trigger-based behaviours. The DQN learns to detect and down-weight clients injecting backdoors.
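Down-weighting suspect clients during aggregation can be sketched as trust-weighted averaging with an exclusion threshold. The threshold value and the linear trust weighting are illustrative assumptions, not the paper's learned policy (which selects actions via the DQN):

```python
def trust_weighted_aggregate(updates, trusts, threshold=0.3):
    """Aggregate client updates, excluding low-trust clients.

    updates:   list of parameter vectors (lists of floats), one per client.
    trusts:    per-client trust scores in (0, 1).
    threshold: hypothetical exclusion cutoff (assumption).
    """
    kept = [(u, t) for u, t in zip(updates, trusts) if t >= threshold]
    if not kept:
        raise ValueError("no trusted clients this round")
    total = sum(t for _, t in kept)
    dim = len(kept[0][0])
    agg = [0.0] * dim
    for u, t in kept:  # weight each surviving update by normalized trust
        for i in range(dim):
            agg[i] += (t / total) * u[i]
    return agg
```

A client whose trust falls below the cutoff contributes nothing that round, so a detected backdoor injector stops influencing the global model even before its trust reaches zero.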