defense 2025

Learning from Peers: Collaborative Ensemble Adversarial Training

Li Dengjin, Guo Yanming, Xie Yuxiang, Li Zheng, Chen Jiangming, Li Xiaolong, Lao Mingrui


Published on arXiv (2509.00089)

Input Manipulation Attack

OWASP ML Top 10 — ML01

Key Finding

CEAT achieves state-of-the-art robustness over competitive ensemble adversarial training baselines (ADP, GAL, TRS, DVERGE, LAFED, FASTEN) on multiple image classification benchmarks

CEAT (Collaborative Ensemble Adversarial Training)

Novel technique introduced


Ensemble Adversarial Training (EAT) attempts to enhance the robustness of models against adversarial attacks by leveraging multiple models. However, current EAT strategies tend to train the sub-models independently, ignoring the cooperative benefits between sub-models. Through detailed inspection of the EAT process, we find that samples with classification disparities between sub-models lie close to the decision boundary of the ensemble, exerting greater influence on its robustness. To this end, we propose a novel yet efficient Collaborative Ensemble Adversarial Training (CEAT) method that highlights cooperative learning among sub-models in the ensemble. Specifically, samples with larger predictive disparities between the sub-models receive greater attention during the adversarial training of the other sub-models. CEAT leverages these probability disparities to adaptively assign weights to different samples, incorporating a calibrating distance regularization. Extensive experiments on widely adopted datasets show that the proposed method achieves state-of-the-art performance over competitive EAT methods. Notably, CEAT is model-agnostic and can be seamlessly integrated into various ensemble methods with flexible applicability.


Key Contributions

  • Identifies that samples with classification disparities between ensemble sub-models lie near the decision boundary and disproportionately influence ensemble robustness
  • Proposes CEAT, which adaptively reweights training samples for each sub-model based on predictive probability disparities from the other sub-models, using a calibrating distance regularization
  • Demonstrates model-agnostic plug-and-play integration into existing EAT methods with SOTA robustness on three benchmark datasets
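The core mechanism described above — giving each sub-model's adversarial loss more weight on samples where its peers disagree — can be sketched as follows. This is a minimal illustration, not the paper's exact formulation: the disparity measure (total variation distance between softmax outputs), the temperature `tau`, and the mean-one normalization are all assumptions chosen for clarity.

```python
import numpy as np

def disparity_weights(probs_a, probs_b, tau=1.0):
    """Hypothetical per-sample weights from predictive disparity.

    probs_a, probs_b: (N, C) softmax outputs of two sub-models on the
    same batch. Samples on which the sub-models disagree most (largest
    total variation distance between their predictive distributions)
    receive the largest weights.
    """
    # Total variation distance per sample: 0 (identical) .. 1 (disjoint).
    disparity = 0.5 * np.abs(probs_a - probs_b).sum(axis=1)
    # Exponential scaling emphasizes high-disparity samples; rescaling
    # to mean 1 over the batch keeps the overall loss magnitude stable.
    w = np.exp(disparity / tau)
    return w * len(w) / w.sum()

# Example: the second sample, where the sub-models disagree,
# is weighted above the first, where they agree.
pa = np.array([[0.9, 0.1], [0.5, 0.5]])
pb = np.array([[0.9, 0.1], [0.1, 0.9]])
w = disparity_weights(pa, pb)
```

In an EAT loop these weights would multiply the per-sample adversarial loss of each sub-model, so that boundary-adjacent samples (those the peers classify differently) drive the update hardest.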

🛡️ Threat Analysis

Input Manipulation Attack

CEAT is an adversarial training defense that improves ensemble robustness against adversarial input perturbations at inference time — adversarial training is a canonical ML01 defense.


Details

Domains
vision
Model Types
cnn
Threat Tags
white_box, training_time, digital, untargeted
Datasets
CIFAR-10, CIFAR-100, SVHN
Applications
image classification