Unveiling Perceptual Artifacts: A Fine-Grained Benchmark for Interpretable AI-Generated Image Detection

Yao Xiao 1, Weiyan Chen 1, Jiahao Chen 2, Zijie Cao 1, Weijian Deng 3, Binbin Yang 1, Ziyi Dong 1, Xiangyang Ji 4, Wei Ke 2, Pengxu Wei 1,5, Liang Lin 1,5

0 citations · 72 references · arXiv

Published on arXiv: 2601.19430

Output Integrity Attack

OWASP ML Top 10: ML09

Key Finding

Existing AIGI detectors largely bypass perceptual artifacts; explicitly aligning model attention with artifact regions significantly boosts cross-dataset generalization while improving interpretability.

X-AIGD

Novel benchmark introduced


Current AI-Generated Image (AIGI) detection approaches predominantly rely on binary classification to distinguish real from synthetic images, often lacking interpretable or convincing evidence to substantiate their decisions. This limitation stems from existing AIGI detection benchmarks, which, despite featuring a broad collection of synthetic images, remain restricted in their coverage of artifact diversity and lack detailed, localized annotations. To bridge this gap, we introduce a fine-grained benchmark towards eXplainable AI-Generated image Detection, named X-AIGD, which provides pixel-level, categorized annotations of perceptual artifacts, spanning low-level distortions, high-level semantics, and cognitive-level counterfactuals. These comprehensive annotations facilitate fine-grained interpretability evaluation and deeper insight into model decision-making processes. Our extensive investigation using X-AIGD provides several key insights: (1) Existing AIGI detectors demonstrate negligible reliance on perceptual artifacts, even at the most basic distortion level. (2) While AIGI detectors can be trained to identify specific artifacts, they still substantially base their judgment on uninterpretable features. (3) Explicitly aligning model attention with artifact regions can increase the interpretability and generalization of detectors. The data and code are available at: https://github.com/Coxy7/X-AIGD.


Key Contributions

  • X-AIGD benchmark providing pixel-level, categorized perceptual artifact annotations spanning low-level distortions, high-level semantics, and cognitive-level counterfactuals for interpretable AIGI detection evaluation
  • Empirical finding that existing AIGI detectors show negligible reliance on human-interpretable perceptual artifacts even at the basic distortion level
  • Demonstration that explicitly aligning model attention with annotated artifact regions improves both interpretability and cross-dataset generalization
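The attention-alignment idea in the last contribution can be sketched as an auxiliary loss that penalizes attention mass falling outside the annotated artifact region. The function below is a hypothetical, framework-free illustration (the name `attention_alignment_loss` and the exact formulation are assumptions, not the paper's actual loss): it normalizes a flat attention map and measures the fraction of attention outside the binary artifact mask.

```python
def attention_alignment_loss(attention, mask):
    """Penalize attention mass outside the annotated artifact region.

    attention: flat list of non-negative attention scores per pixel/patch.
    mask:      flat list of 0/1 artifact labels of the same length.
    Returns a value in [0, 1]: 0 when all attention lies on the artifact,
    1 when none of it does. (Hypothetical sketch, not the paper's loss.)
    """
    total = sum(attention)
    if total == 0:
        return 0.0  # no attention mass; nothing to penalize
    inside = sum(a for a, m in zip(attention, mask) if m == 1)
    return 1.0 - inside / total
```

During training, a term like this would be added to the usual real/fake classification loss so that gradients pull the detector's attention toward the pixel-level artifact annotations, which X-AIGD's localized labels make possible:

```python
# Attention concentrated on the artifact region -> zero penalty
attention_alignment_loss([0.0, 0.9, 0.1, 0.0], [0, 1, 1, 0])  # → 0.0
```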

🛡️ Threat Analysis

Output Integrity Attack

This work addresses output integrity from the defensive side: by providing a fine-grained evaluation benchmark with pixel-level artifact annotations across three artifact levels, it supports rigorous measurement of AI-generated image detectors and, in turn, of content authenticity and provenance verification systems.


Details

Domains
vision
Model Types
diffusion, cnn, transformer, vlm
Threat Tags
inference_time
Datasets
X-AIGD
Applications
ai-generated image detection, deepfake detection, image forensics