Boosting Adversarial Transferability via Ensemble Non-Attention
Yipeng Zou 1, Qin Liu 1, Jie Wu 2,3, Yu Peng 4, Guo Chen 1, Hui Zhou 1, Guanghui Ye 1
Published on arXiv
2511.08937
Input Manipulation Attack
OWASP ML Top 10 — ML01
Key Finding
NAMEA achieves an average adversarial transferability improvement of 15.0% over AdaEA and 9.6% over SMER on ImageNet across heterogeneous CNN and ViT target models.
NAMEA
Novel technique introduced
Ensemble attacks integrate the outputs of surrogate models with diverse architectures and can be combined with various gradient-based attacks to improve adversarial transferability. However, previous work shows unsatisfactory attack performance when transferring across heterogeneous model architectures. The main reason is that the gradient update directions of heterogeneous surrogate models differ widely, making it hard to reduce the gradient variance of the ensemble while making the best of each individual model. To tackle this challenge, we design a novel ensemble attack, NAMEA, which for the first time integrates the gradients from the non-attention areas of ensemble models into the iterative gradient optimization process. Our design is inspired by the observation that the attention areas of heterogeneous models vary sharply, so the non-attention areas of ViTs are likely to be the focus of CNNs and vice versa. Therefore, we merge the gradients from the attention areas and from the non-attention areas of ensemble models so as to fuse the transfer information of CNNs and ViTs. Specifically, we pioneer a new way of decoupling the gradients of non-attention areas from those of attention areas, and we merge the gradients by meta-learning. Empirical evaluations on the ImageNet dataset indicate that NAMEA outperforms the state-of-the-art ensemble attacks AdaEA and SMER by an average of 15.0% and 9.6%, respectively. This work is the first attempt to explore the power of ensemble non-attention in boosting cross-architecture transferability, providing new insights into launching ensemble attacks.
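To make the baseline concrete, the following is a minimal sketch of one step of a generic ensemble sign-gradient attack (I-FGSM style), not of NAMEA itself. It assumes each surrogate has already produced a per-pixel loss gradient; the function names, the toy gradient values, and the step size `alpha` and budget `eps` are all hypothetical choices for illustration. The two toy gradients disagree in sign at the first pixel, which is the kind of conflict between heterogeneous surrogates that the abstract describes.

```python
# Toy sketch of one ensemble sign-gradient step (I-FGSM style).
# Assumption: all names and values are hypothetical; real surrogates
# would be CNNs and ViTs whose gradients come from backpropagation.

def sign(v):
    """Return -1, 0, or 1 depending on the sign of v."""
    return (v > 0) - (v < 0)

def ensemble_gradient(grads):
    """Average per-pixel gradients from several surrogate models."""
    n = len(grads)
    return [sum(g[i] for g in grads) / n for i in range(len(grads[0]))]

def attack_step(x_adv, x_clean, grads, alpha=2 / 255, eps=8 / 255):
    """Step along the sign of the averaged gradient, then project
    back into the eps-ball around x_clean and the valid range [0, 1]."""
    g = ensemble_gradient(grads)
    out = []
    for xa, xc, gi in zip(x_adv, x_clean, g):
        v = xa + alpha * sign(gi)
        v = min(max(v, xc - eps), xc + eps)  # stay within the L-inf budget
        out.append(min(max(v, 0.0), 1.0))    # stay a valid pixel value
    return out

x = [0.5, 0.2, 0.9, 0.0]             # flattened toy "image"
grad_cnn = [0.4, -0.1, 0.3, -0.2]    # hypothetical CNN gradient
grad_vit = [-0.6, -0.3, 0.1, -0.2]   # hypothetical ViT gradient
x_next = attack_step(x, x, [grad_cnn, grad_vit])
```

In this toy case the ViT gradient outweighs the CNN gradient at the first pixel, so plain averaging follows the ViT direction there and discards the CNN's preference.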
Key Contributions
- Introduces the concept of 'ensemble non-attention' — using the non-attention areas of heterogeneous surrogate models (CNNs and ViTs) as complementary gradient sources to improve cross-architecture transferability.
- Proposes NAMEA, a three-step meta-gradient optimization framework combining attention meta-training, non-attention meta-testing, and a final gradient merging step to balance update stability with model diversity.
- Designs a non-attention extraction (NAE) module based on Grad-CAM and a gradient scaling optimization (GSO) module. With these, NAMEA outperforms the SOTA ensemble attacks AdaEA and SMER by an average of 15.0% and 9.6%, respectively, on ImageNet.
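The idea of decoupling attention and non-attention gradients can be illustrated with a small sketch. It assumes a normalized attention heatmap per model (for example, one produced by Grad-CAM) and a fixed threshold of 0.5. The threshold, the function names, the toy values, and the plain element-wise sum used for merging are all assumptions made for illustration; they stand in for, and do not reproduce, the NAE and GSO modules or the meta-learned merge described above.

```python
# Hypothetical sketch: split each model's gradient by its attention
# heatmap, then merge attention and non-attention parts across models.
# Assumption: thresholding and summation are simplifications, not the
# paper's NAE/GSO modules or its meta-learning procedure.

def attention_mask(heatmap, threshold=0.5):
    """1 where the normalized heatmap marks attention, 0 elsewhere."""
    return [1 if h >= threshold else 0 for h in heatmap]

def decouple(grad, mask):
    """Split a gradient into its attention and non-attention parts."""
    g_att = [g * m for g, m in zip(grad, mask)]
    g_non = [g * (1 - m) for g, m in zip(grad, mask)]
    return g_att, g_non

def merge(parts):
    """Element-wise sum of gradient parts (stand-in for a learned merge)."""
    return [sum(vals) for vals in zip(*parts)]

# Toy heatmaps chosen so the two models attend to disjoint regions.
heat_cnn = [0.9, 0.8, 0.1, 0.2]
heat_vit = [0.1, 0.3, 0.9, 0.7]
grad_cnn = [0.5, -0.5, 0.25, -0.25]
grad_vit = [-1.0, 1.0, 0.5, -0.5]

att_cnn, non_cnn = decouple(grad_cnn, attention_mask(heat_cnn))
att_vit, non_vit = decouple(grad_vit, attention_mask(heat_vit))

g_att_ens = merge([att_cnn, att_vit])  # ensemble attention gradient
g_non_ens = merge([non_cnn, non_vit])  # ensemble non-attention gradient
```

With these toy values the ViT's non-attention gradient is nonzero only on the pixels the CNN attends to, mirroring the observation that one architecture's non-attention area tends to be the other's focus.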
🛡️ Threat Analysis
Proposes NAMEA, a gradient-based adversarial example crafting method designed to improve black-box transferability of adversarial perturbations across heterogeneous surrogate model architectures at inference time — a direct evasion/input manipulation attack.