
Boosting Adversarial Transferability with Spatial Adversarial Alignment

Zhaoyu Chen 1, Haijing Guo 1,2, Kaixun Jiang 1, Jiyuan Fu 1, Xinyu Zhou 1, Dingkang Yang 3, Hao Tang 4, Bo Li, Wenqiang Zhang 1

1 citation · 74 references · arXiv


Published on arXiv (2501.01015)

Input Manipulation Attack

OWASP ML Top 10 — ML01

Key Finding

SAA improves CNN-to-ViT adversarial transferability by 25.5–39.1% over the Model Alignment baseline when using ResNet50 as the surrogate model.

Spatial Adversarial Alignment (SAA)

Novel technique introduced


Deep neural networks are vulnerable to adversarial examples that transfer across models. Numerous approaches have been proposed to enhance the transferability of adversarial examples, including advanced optimization, data augmentation, and model modifications. However, these methods still show limited transferability, particularly in cross-architecture scenarios such as CNN to ViT. To achieve high transferability, we propose a technique termed Spatial Adversarial Alignment (SAA), which employs an alignment loss and leverages a witness model to fine-tune the surrogate model. Specifically, SAA consists of two key parts: spatial-aware alignment and adversarial-aware alignment. First, we minimize the divergence of features between the two models in both global and local regions, facilitating spatial alignment. Second, we introduce a self-adversarial strategy that leverages adversarial examples to impose further constraints, aligning features from an adversarial perspective. Through this alignment, the surrogate model is trained to concentrate on the common features extracted by the witness model. This facilitates adversarial attacks on these shared features, thereby yielding perturbations with enhanced transferability. Extensive experiments across various architectures on ImageNet show that surrogate models aligned with SAA produce more transferable adversarial examples, especially in cross-architecture attacks.
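The paper does not include code here, so as a rough illustration only: a minimal NumPy sketch of what the spatial-aware alignment term could look like. The function name, the patch grid, and the use of MSE as the divergence are assumptions for illustration; the actual loss in the paper may differ.

```python
import numpy as np

def spatial_alignment_loss(f_s, f_w, patch=2):
    """Hypothetical sketch of SAA's spatial-aware alignment.

    f_s, f_w: feature maps from the surrogate and witness models,
    shape (C, H, W). The global term compares globally pooled
    features; the local term compares patch-wise pooled features.
    """
    # Global alignment: divergence between globally average-pooled features.
    g_s, g_w = f_s.mean(axis=(1, 2)), f_w.mean(axis=(1, 2))
    global_loss = np.mean((g_s - g_w) ** 2)

    # Local alignment: divergence between patch-averaged features
    # over a patch x patch grid of local regions.
    C, H, W = f_s.shape
    ph, pw = H // patch, W // patch
    local = 0.0
    for i in range(patch):
        for j in range(patch):
            p_s = f_s[:, i*ph:(i+1)*ph, j*pw:(j+1)*pw].mean(axis=(1, 2))
            p_w = f_w[:, i*ph:(i+1)*ph, j*pw:(j+1)*pw].mean(axis=(1, 2))
            local += np.mean((p_s - p_w) ** 2)
    local_loss = local / (patch * patch)

    return global_loss + local_loss
```

During alignment, the surrogate would be fine-tuned to minimize this loss so that its global and local feature statistics match the witness model's, encouraging both models to rely on shared features.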


Key Contributions

  • Proposes Spatial Adversarial Alignment (SAA), which fine-tunes a surrogate model using a witness model via spatial-aware alignment (global and local feature divergence minimization across CNN/ViT architectures) and adversarial-aware alignment (self-adversarial strategy to align features on adversarial examples).
  • Demonstrates that aligning spatial and adversarial features — not just final logits — is critical for cross-architecture adversarial transferability, overcoming limitations of prior prediction-level alignment methods.
  • Achieves state-of-the-art cross-architecture transfer attack performance on ImageNet (6 CNNs and 4 ViTs), improving CNN-to-ViT transferability by 25.5–39.1% over the Model Alignment baseline with ResNet50 as the surrogate.
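The adversarial-aware alignment in the second contribution relies on a self-adversarial step: craft an adversarial example on the surrogate itself, then align surrogate and witness features on both clean and adversarial inputs. As a hedged sketch under toy assumptions (a one-layer logistic "surrogate" and an FGSM-style step; the paper's actual attack and models are far richer), the mechanics look like this:

```python
import numpy as np

def fgsm_example(x, w, y, eps=0.03):
    """Self-adversarial step (illustrative sketch): perturb input x to
    increase the surrogate's own loss, FGSM-style.

    Toy logistic surrogate: p = sigmoid(w . x), binary cross-entropy
    loss for label y in {0, 1}. The gradient of that loss w.r.t. x
    is (p - y) * w.
    """
    p = 1.0 / (1.0 + np.exp(-(w @ x)))
    grad = (p - y) * w
    return x + eps * np.sign(grad)

def feature_mse(f_s, f_w):
    """Simple stand-in divergence between surrogate/witness features."""
    return np.mean((f_s - f_w) ** 2)

# Schematic adversarial-aware alignment objective: the surrogate is
# fine-tuned to minimize feature divergence from the witness on both
# the clean input and its self-adversarial counterpart.
```

The key design idea from the paper is that aligning features on adversarial examples, not only on clean inputs, constrains the surrogate to share the witness model's behavior exactly where attacks operate.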

🛡️ Threat Analysis

Input Manipulation Attack

The paper's primary contribution is generating more transferable adversarial perturbations that cause misclassification at inference time in black-box settings — a canonical Input Manipulation Attack. SAA is a surrogate model fine-tuning technique whose sole purpose is producing adversarial examples that evade diverse target architectures.


Details

Domains
vision
Model Types
cnn, transformer
Threat Tags
black_box, inference_time, untargeted, digital
Datasets
ImageNet
Applications
image classification