OTI: A Model-free and Visually Interpretable Measure of Image Attackability
Jiaming Liang 1,2, Haowei Liu, Chi-Man Pun 1
Published on arXiv (2601.17536)
Input Manipulation Attack
OWASP ML Top 10 — ML01
Key Finding
OTI reliably distinguishes image attackability across diverse attack configurations while requiring no model access and providing visual interpretability through semantic object texture analysis
OTI (Object Texture Intensity)
Novel technique introduced
Despite the tremendous success of neural networks, benign images can be corrupted by adversarial perturbations to deceive these models. Intriguingly, images differ in their attackability. Specifically, given an attack configuration, some images are easily corrupted, whereas others are more resistant. Evaluating image attackability has important applications in active learning, adversarial training, and attack enhancement. This prompts a growing interest in developing attackability measures. However, existing methods are scarce and suffer from two major limitations: (1) They rely on a model proxy to provide prior knowledge (e.g., gradients or minimal perturbation) to extract model-dependent image features. Unfortunately, in practice, many task-specific models are not readily accessible. (2) Extracted features characterizing image attackability lack visual interpretability, obscuring their direct relationship with the images. To address these, we propose a novel Object Texture Intensity (OTI), a model-free and visually interpretable measure of image attackability, which measures image attackability as the texture intensity of the image's semantic object. Theoretically, we describe the principles of OTI from the perspectives of decision boundaries as well as the mid- and high-frequency characteristics of adversarial perturbations. Comprehensive experiments demonstrate that OTI is effective and computationally efficient. In addition, our OTI provides the adversarial machine learning community with a visual understanding of attackability.
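To make the core idea concrete, here is a minimal sketch of an OTI-style score. The paper's exact formula is not reproduced here; this illustration simply assumes that "object texture intensity" can be approximated as the mean high-frequency energy (via a discrete Laplacian) of the image restricted to a semantic object mask, which matches the abstract's mid/high-frequency framing.

```python
import numpy as np

def object_texture_intensity(image, object_mask):
    """Illustrative OTI-style score (an assumption, not the paper's exact formula):
    mean high-frequency energy of the image within the semantic object region.

    image:       2-D float array (grayscale), values in [0, 1]
    object_mask: 2-D bool array marking the semantic object's pixels
    """
    # Approximate mid/high-frequency content with a 4-neighbor discrete Laplacian.
    lap = (
        -4.0 * image
        + np.roll(image, 1, axis=0) + np.roll(image, -1, axis=0)
        + np.roll(image, 1, axis=1) + np.roll(image, -1, axis=1)
    )
    # Texture intensity = mean absolute filter response inside the object mask.
    return float(np.abs(lap)[object_mask].mean())

# A smooth (low-texture) object should score lower than a noisy (high-texture) one.
rng = np.random.default_rng(0)
mask = np.zeros((32, 32), dtype=bool)
mask[8:24, 8:24] = True
smooth = np.full((32, 32), 0.5)
textured = np.clip(0.5 + 0.2 * rng.standard_normal((32, 32)), 0.0, 1.0)
print(object_texture_intensity(smooth, mask) < object_texture_intensity(textured, mask))
```

Because the score depends only on the image and its object mask, it needs no model access, and the per-pixel Laplacian response can be visualized directly, mirroring the measure's claimed interpretability.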
Key Contributions
- First work to identify a relationship between semantic object texture intensity and image attackability, providing a model-free and visually interpretable metric
- Theoretical justification of OTI from both decision boundary and mid/high-frequency adversarial perturbation perspectives
- Comprehensive empirical validation of OTI across diverse attacks, domains, tasks, and defense/non-defense scenarios
🛡️ Threat Analysis
Directly addresses adversarial examples — OTI measures how susceptible each image is to adversarial perturbation attacks at inference time, with explicit applications to adversarial training (defense) and attack enhancement (offense). The entire paper is grounded in understanding why some images are easier to corrupt via adversarial perturbations.