Superpixel Attack: Enhancing Black-box Adversarial Attack with Image-driven Division Areas

Deep learning models are used in safety-critical tasks such as automated driving and face recognition. However, small perturbations in the model input can significantly change the predictions. Adversarial attacks are used to identify small perturbations that can lead to misclassifications. More powerful black-box adversarial attacks are required to develop more effective defenses. A promising approach to black-box adversarial attacks is to repeat the process of extracting a specific image area and changing the perturbations added to it. Existing attacks adopt simple rectangles as the areas where perturbations are changed in a single iteration. We propose applying superpixels instead, which achieve a good balance between color variance and compactness. We also propose a new search method, versatile search, and a novel attack method, Superpixel Attack, which applies superpixels and performs versatile search. Superpixel Attack improves attack success rates by an average of 2.10% compared with existing attacks. Most models used in this study are robust against adversarial attacks, and this improvement is significant for black-box adversarial attacks. The code is available at https://github.com/oe1307/SuperpixelAttack.git.
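The loop described in the abstract (pick an image area, change the perturbation added to it, keep the change if it helps) can be illustrated with a minimal sketch. This is not the authors' versatile search. It is a generic greedy sign-flip search written under several assumptions: the image is a grayscale grid of floats in [0, 1], the perturbation is bounded by a hypothetical budget `eps`, the area map `labels` is already given (in practice it could come from a superpixel algorithm such as SLIC), and `loss_fn` is a hypothetical stand-in for one black-box query to the target model. Plain lists are used only to keep the example dependency-free.

```python
import random


def apply_perturbation(image, labels, signs, eps):
    """Add +eps or -eps to each pixel according to its area's sign, clipped to [0, 1]."""
    return [
        [min(1.0, max(0.0, px + eps * signs[lab])) for px, lab in zip(img_row, lab_row)]
        for img_row, lab_row in zip(image, labels)
    ]


def region_search(image, labels, loss_fn, eps, n_queries, seed=0):
    """Greedy search: flip the perturbation sign of one area per query.

    loss_fn is a hypothetical black-box objective (higher means closer to
    misclassification). A flip is kept only if it increases that objective.
    """
    rng = random.Random(seed)
    regions = sorted({lab for row in labels for lab in row})
    # Start from a random sign per area.
    signs = {r: rng.choice((-1, 1)) for r in regions}
    best = loss_fn(apply_perturbation(image, labels, signs, eps))
    for _ in range(n_queries):
        r = rng.choice(regions)      # extract one update area
        signs[r] = -signs[r]         # change the perturbation added to it
        candidate = loss_fn(apply_perturbation(image, labels, signs, eps))
        if candidate > best:
            best = candidate         # keep the improvement
        else:
            signs[r] = -signs[r]     # revert the flip
    return apply_perturbation(image, labels, signs, eps), best


# Toy usage: two areas (left and right halves) and a stand-in objective.
image = [[0.5] * 4 for _ in range(4)]
labels = [[0, 0, 1, 1] for _ in range(4)]
toy_loss = lambda img: sum(sum(row) for row in img)
adversarial, score = region_search(image, labels, toy_loss, eps=0.1, n_queries=50)
```

Swapping the rectangular `labels` map for a superpixel map changes only which pixels move together in one query; the search loop itself is unchanged, which is why the choice of update areas can be studied separately from the search method.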

Key Contributions

Analysis of the relationship between color variance, compactness of update areas, and attack success rates, motivating superpixel-based partitioning over rectangles
Proposal of superpixel-based update areas that balance low color variance and compactness for more effective perturbation search
Novel 'versatile search' method and the combined Superpixel Attack, achieving an average 2.10% improvement in attack success rate over existing black-box attacks across 19 robust ImageNet models
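To make the two quantities in the first contribution concrete, the sketch below computes a per-area color variance and a compactness score for a given partition. The definitions here are assumptions chosen for illustration and may differ from those used in the paper: variance is the population variance of grayscale values inside an area, and compactness is the area's pixel count divided by the area of its bounding box (so an axis-aligned rectangle scores 1.0).

```python
from collections import defaultdict


def region_stats(image, labels):
    """Return {area_label: (color_variance, compactness)} for a partition.

    Assumed definitions (illustrative only):
      color_variance = population variance of pixel values in the area
      compactness    = pixel count / bounding-box area
    """
    values = defaultdict(list)
    coords = defaultdict(list)
    for y, (img_row, lab_row) in enumerate(zip(image, labels)):
        for x, (value, lab) in enumerate(zip(img_row, lab_row)):
            values[lab].append(value)
            coords[lab].append((y, x))

    stats = {}
    for lab, vals in values.items():
        mean = sum(vals) / len(vals)
        variance = sum((v - mean) ** 2 for v in vals) / len(vals)
        ys = [y for y, _ in coords[lab]]
        xs = [x for _, x in coords[lab]]
        bbox_area = (max(ys) - min(ys) + 1) * (max(xs) - min(xs) + 1)
        stats[lab] = (variance, len(vals) / bbox_area)
    return stats


# Toy image: dark left half, bright right half.
image = [[0.0, 0.0, 1.0, 1.0],
         [0.0, 0.0, 1.0, 1.0]]
# Partition that follows the image content (left / right).
content_aligned = [[0, 0, 1, 1],
                   [0, 0, 1, 1]]
# Partition that ignores the image content (top / bottom).
content_blind = [[0, 0, 0, 0],
                 [1, 1, 1, 1]]
aligned_stats = region_stats(image, content_aligned)
blind_stats = region_stats(image, content_blind)
```

In this toy case both partitions are equally compact, but only the content-aligned one has zero color variance inside each area, which is the kind of difference that motivates image-driven areas over fixed rectangles.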

🛡️ Threat Analysis

Input Manipulation Attack

Proposes Superpixel Attack, a novel black-box evasion attack that crafts adversarial perturbations at inference time by iteratively selecting and perturbing image-driven superpixel regions to cause misclassification, achieving higher attack success rates than existing rectangle-based black-box attacks.

Details

Domains

vision

Model Types

cnn, transformer

Threat Tags

black_box, inference_time, untargeted, digital

Datasets

ImageNet, RobustBench

Applications

2025