defense 2025

On the Escaping Efficiency of Distributed Adversarial Training Algorithms

Ying Cao 1, Kun Yuan 2, Ali H. Sayed 1



Published on arXiv: 2509.11337

Input Manipulation Attack

OWASP ML Top 10 — ML01

Key Finding

Decentralized adversarial training (consensus and diffusion) provably escapes local minima faster than centralized adversarial training when the perturbation bound is small and the batch size is large, yielding flatter models, which are associated with improved adversarial robustness.


Adversarial training has been widely studied in recent years due to its role in improving model robustness against adversarial attacks. This paper focuses on comparing different distributed adversarial training algorithms, including centralized and decentralized strategies, within multi-agent learning environments. Previous studies have highlighted the importance of model flatness in determining robustness. To this end, we develop a general theoretical framework to study the escaping efficiency of these algorithms from local minima, which is closely related to the flatness of the resulting models. We show that when the perturbation bound is sufficiently small (i.e., when the attack strength is relatively mild) and a large batch size is used, decentralized adversarial training algorithms, including consensus and diffusion, are guaranteed to escape faster from local minima than the centralized strategy, thereby favoring flatter minima. However, as the perturbation bound increases, this trend may no longer hold. In our simulations, we illustrate these theoretical findings and systematically compare the performance of models obtained through decentralized and centralized adversarial training algorithms. The results highlight the potential of decentralized strategies to enhance the robustness of models in distributed settings.
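To make the setting concrete, the following is a minimal sketch of what a diffusion-type decentralized adversarial training loop can look like on a toy linear-regression problem. It assumes a ring of agents, FGSM-style inner maximization within an l_inf ball of radius eps, and a doubly stochastic combination matrix; the problem setup and all names (fgsm_perturb, adv_grad, etc.) are illustrative assumptions, not the paper's exact formulation.

```python
# Minimal sketch of diffusion-based decentralized adversarial training on a toy
# linear-regression problem. The setup and names are illustrative, not taken
# from the paper.
import numpy as np

rng = np.random.default_rng(0)
K, d, n_local = 8, 5, 64          # agents, feature dimension, samples per agent
eps, mu, steps = 0.05, 0.05, 200  # perturbation bound, step size, iterations

# Each agent holds its own local data (heterogeneous across agents).
w_true = rng.normal(size=d)
X = [rng.normal(size=(n_local, d)) for _ in range(K)]
Y = [Xk @ w_true + 0.1 * rng.normal(size=n_local) for Xk in X]

def fgsm_perturb(w, x, y, eps):
    """FGSM-style inner maximization for squared loss (closed-form gradient wrt x)."""
    grad_x = np.outer(x @ w - y, w)       # d loss / d x for each sample
    return x + eps * np.sign(grad_x)      # stay within the l_inf ball of radius eps

def adv_grad(w, x, y, eps):
    """Stochastic gradient of the adversarial (worst-case) loss wrt the model w."""
    x_adv = fgsm_perturb(w, x, y, eps)
    return x_adv.T @ (x_adv @ w - y) / len(y)

# Ring topology with a symmetric, doubly stochastic combination matrix.
A = np.zeros((K, K))
for k in range(K):
    A[k, k], A[k, (k - 1) % K], A[k, (k + 1) % K] = 0.5, 0.25, 0.25

w = [np.zeros(d) for _ in range(K)]
for _ in range(steps):
    # Adapt: each agent takes a local adversarial-training step.
    half = [w[k] - mu * adv_grad(w[k], X[k], Y[k], eps) for k in range(K)]
    # Combine: each agent averages with its ring neighbors (diffusion strategy).
    w = [sum(A[l, k] * half[l] for l in range(K)) for k in range(K)]

print("distance to w_true:", np.linalg.norm(np.mean(w, axis=0) - w_true))
```

A centralized counterpart would instead aggregate all agents' adversarial gradients at a fusion center every iteration; the paper's analysis compares how quickly these two families of algorithms move away from a local minimum under small perturbation bounds and large batch sizes.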


Key Contributions

  • A general theoretical framework for comparing the escaping efficiency of centralized and decentralized adversarial training algorithms (consensus and diffusion) in terms of the flatness of the local minima they settle into
  • A proof that decentralized adversarial training escapes local minima faster than the centralized strategy when the perturbation bound is small and the batch size is large, implying flatter and more robust models (a simple flatness probe is sketched after this list)
  • Empirical validation showing that decentralized strategies can enhance adversarial robustness in distributed settings, with the advantage diminishing as the perturbation strength grows
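Flatness enters the argument because flatter minima are widely associated with better adversarial robustness. As a rough illustration of how flatness can be probed numerically, here is a minimal random-perturbation sharpness estimate; this is a common proxy, not necessarily the escaping-efficiency measure analyzed in the paper, and all names are illustrative.

```python
# Minimal sketch of a random-perturbation sharpness probe: a larger average loss
# increase around a minimum indicates a sharper (less flat) minimum. Illustrative
# proxy only, not the paper's metric.
import numpy as np

def sharpness(loss_fn, w, radius=0.05, n_samples=50, rng=None):
    """Average increase of loss_fn around w under random perturbations of a fixed radius."""
    if rng is None:
        rng = np.random.default_rng(0)
    base = loss_fn(w)
    increases = []
    for _ in range(n_samples):
        u = rng.normal(size=w.shape)
        u = radius * u / np.linalg.norm(u)   # random direction, fixed radius
        increases.append(loss_fn(w + u) - base)
    return float(np.mean(increases))

# Toy example: a larger curvature produces a visibly larger sharpness value.
sharp = sharpness(lambda w: 10.0 * w[0] ** 2 + 0.1 * w[1] ** 2, np.zeros(2))
flat = sharpness(lambda w: 0.1 * (w ** 2).sum(), np.zeros(2))
print(sharp, flat)   # the first value should be noticeably larger
```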

🛡️ Threat Analysis

Input Manipulation Attack

Adversarial training is a canonical defense against ML01 (Input Manipulation Attack). This paper directly analyzes different distributed adversarial training strategies and their relative effectiveness at improving model robustness against adversarial perturbations.


Details

Domains: federated-learning
Model Types: federated
Threat Tags: training_time, white_box
Applications: distributed machine learning, multi-agent learning