Stability and Generalization of Adversarial Diffusion Training
Hesam Hosseini, Ying Cao, Ali H. Sayed
Published on arXiv (2509.19234)
Input Manipulation Attack
OWASP ML Top 10 — ML01
Key Finding
The generalization error of adversarial diffusion training grows with both the perturbation strength ε and the number of iterations T in decentralized settings, mirroring single-agent findings and suggesting early stopping as a practical mitigation.
Algorithmic stability is an established tool for analyzing generalization. While adversarial training enhances model robustness, it often suffers from robust overfitting and an enlarged generalization gap. Although recent work has established the convergence of adversarial training in decentralized networks, its generalization properties remain unexplored. This work presents a stability-based generalization analysis of adversarial training under the diffusion strategy for convex losses. We derive a bound showing that the generalization error grows with both the adversarial perturbation strength and the number of training steps, a finding consistent with the single-agent case but novel for decentralized settings. Numerical experiments on logistic regression validate these theoretical predictions.
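The setting analyzed above can be sketched as an adapt-then-combine diffusion loop: each agent first takes a local gradient step on its worst-case (adversarially perturbed) loss, then averages its iterate with its neighbors. A minimal sketch follows; the ring topology, data sizes, step size, and synthetic data are illustrative assumptions, not the paper's experimental setup. For logistic regression with an ℓ∞ perturbation ball, the inner maximization has the closed form x' = x − ε·y·sign(w), which the sketch uses directly:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative setup: K agents on a ring, each holding n local samples.
K, n, d = 4, 50, 5
eps, mu, T = 0.1, 0.05, 200  # perturbation strength, step size, iterations

# Doubly stochastic combination matrix for a ring topology.
A = np.zeros((K, K))
for k in range(K):
    A[k, k] = 0.5
    A[k, (k - 1) % K] = 0.25
    A[k, (k + 1) % K] = 0.25

# Synthetic logistic-regression data per agent, labels in {-1, +1}.
w_true = rng.normal(size=d)
X = rng.normal(size=(K, n, d))
y = np.sign(X @ w_true + 0.1 * rng.normal(size=(K, n)))

def grad(w, Xk, yk):
    """Gradient of the average logistic loss log(1 + exp(-y w.x)) at w."""
    m = yk * (Xk @ w)
    return -(Xk * (yk / (1 + np.exp(m)))[:, None]).mean(axis=0)

w = np.zeros((K, d))
for t in range(T):
    psi = np.empty_like(w)
    for k in range(K):
        # Inner maximization (exact for linear models): the worst-case
        # l_inf perturbation of strength eps pushes each feature against
        # the margin, x' = x - eps * y * sign(w).
        X_adv = X[k] - eps * y[k][:, None] * np.sign(w[k])
        # Adapt: local gradient step on the adversarial loss.
        psi[k] = w[k] - mu * grad(w[k], X_adv, y[k])
    # Combine: diffusion averaging with neighbors.
    w = A @ psi
```

The adapt-then-combine order is what distinguishes the diffusion strategy from consensus-style updates: each agent shares its already-updated iterate, which is what the paper's stability analysis tracks across iterations.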
Key Contributions
- First stability-based generalization bound for adversarial training under the distributed diffusion strategy for convex losses
- Unified framework that reduces to known single-agent adversarial training and decentralized standard training bounds in their respective limits
- Empirical validation on logistic regression confirming theoretical dependence on perturbation strength ε and training steps T, with additional evidence on network topology effects
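Since the bound grows with T, the early-stopping mitigation amounts to monitoring a held-out adversarial loss and halting once it stops improving. A minimal sketch, with hypothetical `step` and `eval_adv_loss` callbacks standing in for one training iteration and the held-out adversarial loss (neither is an API from the paper):

```python
import numpy as np

def early_stop_training(step, eval_adv_loss, max_iters=1000, patience=20):
    """Run step() until the held-out adversarial loss stops improving.

    step and eval_adv_loss are hypothetical callbacks: one training
    iteration, and the adversarial loss on a held-out set. Returns the
    iteration index of the best loss and the best loss itself.
    """
    best, best_t, stale = np.inf, 0, 0
    for t in range(max_iters):
        step()
        loss = eval_adv_loss()
        if loss < best - 1e-6:
            best, best_t, stale = loss, t, 0
        else:
            stale += 1
            if stale >= patience:
                break  # loss has plateaued or risen: stop before robust overfitting grows
    return best_t, best
```

Stopping on the *adversarial* held-out loss rather than the clean loss matters here, since robust overfitting can set in while clean validation loss still improves.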
🛡️ Threat Analysis
The paper directly analyzes adversarial training (a defense against adversarial input manipulation), deriving generalization bounds that characterize the robust overfitting problem inherent to this class of defenses.