Boosting Adversarial Transferability via Residual Perturbation Attack

Jinjia Peng 1, Zeze Tao 1, Huibing Wang 2, Meng Wang 3, Yang Wang 3

Published on arXiv (2508.05689)

Input Manipulation Attack

OWASP ML Top 10 — ML01

Key Finding

ResPA achieves better adversarial transferability than existing typical transfer-based attack methods, with further gains when combined with input transformation techniques.

ResPA (Residual Perturbation Attack)

Novel technique introduced


Abstract

Deep neural networks are susceptible to adversarial examples: imperceptible perturbations that induce incorrect predictions. Transfer-based attacks craft adversarial examples on a surrogate model and transfer them to target models in black-box scenarios. Recent studies reveal that adversarial examples lying in flat regions of the loss landscape transfer better because they alleviate overfitting to the surrogate model. However, prior work overlooks the influence of the perturbation direction, resulting in limited transferability. In this paper, we propose a novel attack method, named Residual Perturbation Attack (ResPA), which relies on the residual gradient as the perturbation direction to guide adversarial examples toward flat regions of the loss function. Specifically, ResPA applies an exponential moving average to the input gradients to obtain the first moment as a reference gradient, which encodes the direction of historical gradients. Rather than relying solely on the current gradient, which reflects only local flatness, ResPA considers the residual between the current gradient and the reference gradient to capture changes in the global perturbation direction. Experimental results demonstrate that ResPA transfers better than existing typical transfer-based attack methods, and its transferability can be further improved by combining ResPA with current input transformation methods. The code is available at https://github.com/ZezeTao/ResPA.


Key Contributions

  • Proposes ResPA, which uses exponential moving average of input gradients (first moment) as a reference gradient to capture historical perturbation direction
  • Introduces a residual gradient direction (difference between current and reference gradient) to guide adversarial examples toward flat loss regions with a global rather than local perspective
  • Demonstrates that ResPA achieves superior transferability over existing transfer-based attacks and further improves when combined with input transformation methods
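The contributions above can be sketched as an iterative attack loop. This is a hedged illustration based only on the summary, not the authors' released code: the function name `respa_step`, the hyperparameters (`mu`, `alpha`, `eps`), and the L∞ projection are assumptions chosen to mirror common FGSM-style transfer attacks.

```python
import numpy as np

def respa_step(x, x_orig, grad, ema_grad, mu=0.9, alpha=2/255, eps=8/255):
    """One illustrative ResPA-style iteration (details assumed, not from the paper).

    x        : current adversarial example
    x_orig   : clean input (centre of the L-inf epsilon-ball)
    grad     : loss gradient w.r.t. x from the surrogate model
    ema_grad : exponential moving average of past gradients (reference gradient)
    """
    ema_grad = mu * ema_grad + (1 - mu) * grad    # first moment / reference gradient
    residual = grad - ema_grad                    # residual perturbation direction
    x = x + alpha * np.sign(residual)             # signed gradient step along the residual
    x = x_orig + np.clip(x - x_orig, -eps, eps)   # project back into the epsilon-ball
    return np.clip(x, 0.0, 1.0), ema_grad         # keep a valid pixel range

# Toy usage: a random "gradient" stands in for a real surrogate model's gradient.
rng = np.random.default_rng(0)
x0 = rng.random((3, 4, 4))
x, m = x0.copy(), np.zeros_like(x0)
for _ in range(10):
    g = rng.standard_normal(x0.shape)             # placeholder surrogate gradient
    x, m = respa_step(x, x0, g, m)
print(float(np.max(np.abs(x - x0))))              # perturbation stays within eps
```

In a real attack, `g` would come from backpropagating a classification loss through the surrogate model, and the resulting `x` would be evaluated against unseen black-box target models.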

🛡️ Threat Analysis

Input Manipulation Attack

Directly proposes a gradient-based adversarial perturbation attack (ResPA) that causes misclassification at inference time via imperceptible perturbations transferred across models in a black-box setting.


Details

Domains
vision
Model Types
CNN, Transformer
Threat Tags
black_box, inference_time, untargeted, digital
Datasets
ImageNet, CIFAR-10
Applications
image classification