Superpixel Attack: Enhancing Black-box Adversarial Attack with Image-driven Division Areas
Issa Oe, Keiichiro Yamamura, Hiroki Ishikura, Ryo Hamahira, Katsuki Fujisawa
Published on arXiv
2512.02062
Input Manipulation Attack
OWASP ML Top 10 — ML01
Key Finding
Superpixel Attack improves black-box adversarial attack success rates by an average of 2.10% over existing attacks across 19 adversarially robust ImageNet models on RobustBench.
Superpixel Attack
Novel technique introduced
Deep learning models are used in safety-critical tasks such as automated driving and face recognition. However, small perturbations to the model input can significantly change its predictions. Adversarial attacks are used to identify small perturbations that lead to misclassification, and more powerful black-box adversarial attacks are required to develop more effective defenses. A promising approach to black-box adversarial attacks is to repeatedly extract a specific image area and change the perturbation added to it. Existing attacks adopt simple rectangles as the areas where perturbations are changed in a single iteration. We propose applying superpixels instead, which achieve a good balance between color variance and compactness. We also propose a new search method, versatile search, and a novel attack method, Superpixel Attack, which applies superpixels and performs versatile search. Superpixel Attack improves attack success rates by an average of 2.10% compared with existing attacks. Most models used in this study are robust against adversarial attacks, so this improvement is significant for black-box adversarial attacks. The code is available at https://github.com/oe1307/SuperpixelAttack.git.
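The general idea in the abstract of repeatedly picking an image area and changing the perturbation added to it can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' versatile search: the function name `region_search_attack`, the cyclic sweep over regions, the sign-flip update, and the toy linear `loss_fn` are all hypothetical choices made for brevity. The label map can come from a rectangular grid, as here, or from any superpixel algorithm.

```python
import random

def region_search_attack(image, labels, loss_fn, eps, n_iters, seed=0):
    """Greedy black-box search over region-wise perturbation signs.

    image   : flat list of pixel values in [0, 1]
    labels  : flat list of region ids, one per pixel (rectangles or superpixels)
    loss_fn : black-box query; a higher value means closer to misclassification
    """
    rng = random.Random(seed)
    region_ids = sorted(set(labels))
    # Start from a random sign in {-1, +1} for every region.
    signs = {r: rng.choice((-1, 1)) for r in region_ids}

    def apply(current_signs):
        # Add +eps or -eps to every pixel of a region, then clip to [0, 1].
        return [min(1.0, max(0.0, x + eps * current_signs[r]))
                for x, r in zip(image, labels)]

    best_loss = loss_fn(apply(signs))
    queries = 1
    for it in range(n_iters):
        # Assumption: sweep regions cyclically (a real attack may choose differently).
        r = region_ids[it % len(region_ids)]
        signs[r] = -signs[r]              # propose flipping one region
        loss = loss_fn(apply(signs))      # one model query per proposal
        queries += 1
        if loss > best_loss:
            best_loss = loss              # keep the improving flip
        else:
            signs[r] = -signs[r]          # revert the flip
    return apply(signs), best_loss, queries

# Toy 4x4 grayscale image split into four 2x2 rectangular regions.
image = [0.5] * 16
labels = [(row // 2) * 2 + (col // 2) for row in range(4) for col in range(4)]
# Stand-in for a model's loss: a fixed linear score over the pixels.
weights = [1.0 if lab in (0, 3) else -1.0 for lab in labels]
loss_fn = lambda x: sum(w * v for w, v in zip(weights, x))

adv, best_loss, queries = region_search_attack(
    image, labels, loss_fn, eps=0.1, n_iters=8)
```

Because the update area is just a label map, swapping rectangles for superpixels changes only how `labels` is produced; the query loop itself stays the same.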
Key Contributions
- Analysis of the relationship between color variance, compactness of update areas, and attack success rates, motivating superpixel-based partitioning over rectangles
- Proposal of superpixel-based update areas that balance low color variance and compactness for more effective perturbation search
- Novel 'versatile search' method and the combined Superpixel Attack, achieving an average 2.10% improvement in attack success rate over existing black-box attacks across 19 robust ImageNet models
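To make the two quantities in the first contribution concrete, the sketch below computes a per-region color variance and a compactness score for a given partition. The definitions are assumptions for illustration only: variance is taken over grayscale intensities, and compactness is the isoperimetric quotient 4πA/P² with the perimeter counted in pixel edges. The paper may define both differently, and `region_stats` is a hypothetical helper name.

```python
import math

def region_stats(image, labels):
    """Return {region_id: (color_variance, compactness)} for a partition.

    image  : 2D list of grayscale values
    labels : 2D list of region ids with the same shape as image
    """
    h, w = len(labels), len(labels[0])
    pixels, perimeter = {}, {}
    for y in range(h):
        for x in range(w):
            r = labels[y][x]
            pixels.setdefault(r, []).append(image[y][x])
            # Count pixel edges that touch another region or the image border.
            for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ny, nx = y + dy, x + dx
                if not (0 <= ny < h and 0 <= nx < w) or labels[ny][nx] != r:
                    perimeter[r] = perimeter.get(r, 0) + 1
    stats = {}
    for r, vals in pixels.items():
        mean = sum(vals) / len(vals)
        variance = sum((v - mean) ** 2 for v in vals) / len(vals)
        # Assumed compactness measure: isoperimetric quotient 4*pi*A / P^2.
        compactness = 4 * math.pi * len(vals) / perimeter[r] ** 2
        stats[r] = (variance, compactness)
    return stats

# Toy 4x4 image: left half dark, right half bright.
image = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]
# Partition that follows the image content (left half / right half).
aligned = [[0, 0, 1, 1] for _ in range(4)]
# Partition that ignores the image content (top half / bottom half).
agnostic = [[0] * 4, [0] * 4, [1] * 4, [1] * 4]

stats_aligned = region_stats(image, aligned)
stats_agnostic = region_stats(image, agnostic)
```

In this toy case both partitions have identical compactness, but only the content-aligned one has zero color variance inside each region, which is the kind of trade-off the contributions above refer to.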
🛡️ Threat Analysis
Proposes Superpixel Attack, a novel black-box evasion attack that crafts adversarial perturbations at inference time by iteratively selecting and perturbing image-driven superpixel regions to cause misclassification, achieving higher attack success rates than existing rectangle-based black-box attacks.