Learning from Peers: Collaborative Ensemble Adversarial Training
Li Dengjin, Guo Yanming, Xie Yuxiang, Li Zheng, Chen Jiangming, Li Xiaolong, Lao Mingrui
Published on arXiv (2509.00089)
Input Manipulation Attack
OWASP ML Top 10 — ML01
Key Finding
CEAT achieves state-of-the-art robustness over competitive ensemble adversarial training baselines (ADP, GAL, TRS, DVERGE, LAFED, FASTEN) on multiple image classification benchmarks
CEAT (Collaborative Ensemble Adversarial Training)
Novel technique introduced
Ensemble Adversarial Training (EAT) attempts to enhance the robustness of models against adversarial attacks by leveraging multiple models. However, current EAT strategies tend to train the sub-models independently, ignoring the cooperative benefits between sub-models. Through detailed inspection of the EAT process, we find that samples with classification disparities between sub-models lie close to the decision boundary of the ensemble and exert greater influence on its robustness. To this end, we propose a novel yet efficient Collaborative Ensemble Adversarial Training (CEAT), which highlights cooperative learning among the sub-models in the ensemble. Specifically, samples with larger predictive disparities between the sub-models receive greater attention during the adversarial training of the other sub-models. CEAT leverages these probability disparities to adaptively assign weights to different samples, incorporating a calibrating distance regularization. Extensive experiments on widely adopted datasets show that our proposed method achieves state-of-the-art performance over competitive EAT methods. Notably, CEAT is model-agnostic and can be seamlessly integrated into various ensemble methods with flexible applicability.
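The core idea (larger cross-model predictive disparity leads to larger training weight) can be sketched as follows. This is a hypothetical illustration, not the paper's exact formulation: the L1 gap between two sub-models' softmax outputs and the softmax normalization with a `temperature` parameter are assumptions for the sketch, and the paper's calibrating distance regularization is omitted.

```python
import numpy as np

def disparity_weights(probs_a, probs_b, temperature=1.0):
    """Assign larger weights to samples where two sub-models disagree more.

    probs_a, probs_b: (n_samples, n_classes) softmax outputs of two sub-models.
    Returns per-sample weights summing to 1 (hypothetical weighting rule).
    """
    # Per-sample L1 gap between the two predictive distributions
    disparity = np.abs(probs_a - probs_b).sum(axis=1)
    # Softmax over disparities so weights are positive and normalized
    scaled = np.exp(disparity / temperature)
    return scaled / scaled.sum()

# Toy example: two sub-models' predictions for 3 samples, 2 classes.
p_a = np.array([[0.9, 0.1], [0.6, 0.4], [0.5, 0.5]])
p_b = np.array([[0.9, 0.1], [0.4, 0.6], [0.5, 0.5]])
w = disparity_weights(p_a, p_b)
# The middle sample (the only disagreement) receives the largest weight.
```

In a full training loop, these weights would scale each sample's adversarial loss for the other sub-models, so boundary-adjacent samples dominate the gradient signal.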
Key Contributions
- Identifies that samples with classification disparities between ensemble sub-models lie near the decision boundary and disproportionately influence ensemble robustness
- Proposes CEAT, which adaptively reweights training samples for each sub-model based on predictive probability disparities from the other sub-models, using a calibrating distance regularization
- Demonstrates model-agnostic plug-and-play integration into existing EAT methods with SOTA robustness on three benchmark datasets
🛡️ Threat Analysis
CEAT is an adversarial training defense that improves ensemble robustness against adversarial input perturbations at inference time — adversarial training is a canonical ML01 defense.
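For context on the ML01 threat being defended against, a minimal gradient-based input perturbation (FGSM-style) can be sketched for a logistic model. This is an illustrative assumption: the paper does not specify this model or attack, and ensemble adversarial training uses stronger iterative attacks on deep networks.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm_perturb(x, y, w, eps=0.1):
    """One FGSM step against a logistic model p = sigmoid(w @ x).

    For binary cross-entropy loss, the input gradient is (p - y) * w;
    moving x along its sign increases the loss (hypothetical toy setup).
    """
    p = sigmoid(w @ x)
    grad_x = (p - y) * w
    return x + eps * np.sign(grad_x)

w = np.array([1.0, -2.0])
x = np.array([0.5, 0.5])
x_adv = fgsm_perturb(x, y=1.0, w=w, eps=0.1)
# The perturbed input lowers the model's confidence in the true class y=1.
```

Adversarial training folds such perturbed inputs back into the training set; CEAT's contribution is deciding, per sample, how much each sub-model should weight them.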