Boosting Adversarial Transferability via Ensemble Non-Attention

Ensemble attacks integrate the outputs of surrogate models with diverse architectures and can be combined with various gradient-based attacks to improve adversarial transferability. However, previous work shows unsatisfactory attack performance when transferring across heterogeneous model architectures. The main reason is that the gradient update directions of heterogeneous surrogate models differ widely, making it hard to reduce the gradient variance of the ensemble while making the most of each individual model. To tackle this challenge, we design a novel ensemble attack, NAMEA, which for the first time integrates the gradients from the non-attention areas of ensemble models into the iterative gradient optimization process. Our design is inspired by the observation that the attention areas of heterogeneous models vary sharply, so the non-attention areas of ViTs are likely to be the focus of CNNs and vice versa. Therefore, we merge the gradients from the attention and non-attention areas of ensemble models so as to fuse the transfer information of CNNs and ViTs. Specifically, we pioneer a new way of decoupling the gradients of non-attention areas from those of attention areas, and we merge these gradients by meta-learning. Empirical evaluations on the ImageNet dataset indicate that NAMEA outperforms AdaEA and SMER, the state-of-the-art ensemble attacks, by an average of 15.0% and 9.6%, respectively. This work is the first attempt to explore the power of ensemble non-attention in boosting cross-architecture transferability, providing new insights into launching ensemble attacks.
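To make the baseline concrete, the following is a minimal sketch of a plain ensemble attack with an iterative sign-gradient update, the kind of pipeline that NAMEA builds on. It is not the NAMEA algorithm. The surrogates are toy linear models standing in for real CNNs and ViTs, and every name here (`ensemble_logit`, `ensemble_grad`, `ensemble_ifgsm`, `eps`, `alpha`) is a hypothetical choice made for illustration.

```python
# Toy ensemble attack. ASSUMPTION: each "surrogate" is a linear scorer
# logit_k(x) = w_k . x with a binary label y in {+1, -1}; real surrogates
# would be CNNs/ViTs with gradients obtained by backpropagation.

def sign(v):
    """Return -1, 0, or 1 depending on the sign of v."""
    return (v > 0) - (v < 0)

def ensemble_logit(weights, x):
    """Fuse the surrogates by averaging their logits."""
    logits = [sum(w_i * x_i for w_i, x_i in zip(w, x)) for w in weights]
    return sum(logits) / len(logits)

def ensemble_grad(weights, y):
    """Gradient of the loss -y * ensemble_logit(x) with respect to x.

    For linear models this is constant: -y times the mean weight vector.
    """
    n, dim = len(weights), len(weights[0])
    return [-y * sum(w[i] for w in weights) / n for i in range(dim)]

def ensemble_ifgsm(weights, x0, y, eps=0.1, alpha=0.025, steps=10):
    """Iteratively ascend the ensemble loss inside an L_inf ball of radius eps."""
    x = list(x0)
    for _ in range(steps):
        g = ensemble_grad(weights, y)
        # Sign-gradient step that increases the ensemble loss.
        x = [xi + alpha * sign(gi) for xi, gi in zip(x, g)]
        # Project back so the perturbation stays within eps of the clean input.
        x = [min(max(xi, x0i - eps), x0i + eps) for xi, x0i in zip(x, x0)]
    return x

surrogates = [[1.0, -2.0], [3.0, 1.0]]   # two toy surrogate models
clean = [0.5, 0.5]
adv = ensemble_ifgsm(surrogates, clean, y=1)
```

In this toy setting the averaged gradient is a single fixed direction. With heterogeneous deep models the per-model gradients can disagree sharply, which is the gradient-variance problem the abstract describes.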

Key Contributions

Introduces the concept of 'ensemble non-attention': using the non-attention areas of heterogeneous surrogate models (CNNs and ViTs) as complementary gradient sources to improve cross-architecture transferability.
Proposes NAMEA, a three-step meta-gradient optimization framework combining attention meta-training, non-attention meta-testing, and a final gradient merging step to balance update stability with model diversity.
Designs a non-attention extraction (NAE) module using Grad-CAM and a gradient scaling optimization (GSO) module; NAMEA outperforms the SOTA ensemble attacks AdaEA and SMER by an average of 15.0% and 9.6%, respectively, on ImageNet.
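The idea of decoupling attention and non-attention gradients can be illustrated with a small sketch. It assumes a per-pixel attention heatmap in [0, 1] is already available (for example from Grad-CAM) and splits a gradient into its attention and non-attention parts with a binary mask. The threshold of 0.5, the weighted-sum merge, and all function names are assumptions made for illustration; the summary above does not specify how the NAE and GSO modules or the meta-learning merge are actually computed.

```python
# Sketch of splitting a gradient by an attention heatmap.
# ASSUMPTIONS: `heatmap` holds values in [0, 1] (e.g. a normalized Grad-CAM map),
# the 0.5 threshold is arbitrary, and `merge` is a stand-in for the paper's
# meta-learning-based merging, which is not reproduced here.

def non_attention_mask(heatmap, threshold=0.5):
    """1.0 where the heatmap is below the threshold (non-attention), else 0.0."""
    return [[1.0 if v < threshold else 0.0 for v in row] for row in heatmap]

def apply_mask(grad, mask):
    """Element-wise product of a 2-D gradient and a 2-D mask."""
    return [[g * m for g, m in zip(g_row, m_row)]
            for g_row, m_row in zip(grad, mask)]

def split_gradient(grad, heatmap, threshold=0.5):
    """Return (attention_gradient, non_attention_gradient)."""
    na_mask = non_attention_mask(heatmap, threshold)
    att_mask = [[1.0 - m for m in row] for row in na_mask]
    return apply_mask(grad, att_mask), apply_mask(grad, na_mask)

def merge(att_grad, non_att_grad, weight=1.0):
    """Hypothetical merge: attention gradient plus weighted non-attention gradient."""
    return [[a + weight * n for a, n in zip(a_row, n_row)]
            for a_row, n_row in zip(att_grad, non_att_grad)]

heatmap = [[0.9, 0.1], [0.4, 0.6]]
grad = [[1.0, 2.0], [3.0, 4.0]]
att_grad, non_att_grad = split_gradient(grad, heatmap)
merged = merge(att_grad, non_att_grad, weight=0.5)
```

In an ensemble of CNNs and ViTs, each surrogate would produce its own heatmap, so the region masked out for one model may be exactly where another model focuses. That complementarity is the observation motivating the method.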

🛡️ Threat Analysis

Input Manipulation Attack

Proposes NAMEA, a gradient-based method for crafting adversarial examples, designed to improve the black-box transferability of adversarial perturbations across heterogeneous surrogate model architectures at inference time. This makes it a direct evasion (input manipulation) attack.

Details

Domains

vision

Model Types

cnn, transformer

Threat Tags

black_box, grey_box, inference_time, untargeted, digital

Datasets

ImageNet

Applications

2025 · 1 citation
