REVEAL: Reasoning-enhanced Forensic Evidence Analysis for Explainable AI-generated Image Detection
Huangsen Cao 1, Qin Mei 1, Zhiheng Li 1, Yuxi Li 2, Ying Zhang 2, Chen Li 2, Zhimeng Zhang 1, Xin Ding 3, Yongwei Wang 1, Jing Lyu 2, Fei Wu 1
Published on arXiv
2511.23158
Output Integrity Attack
OWASP ML Top 10 — ML09
Key Finding
REVEAL significantly enhances detection accuracy, explanation fidelity, and cross-model generalization over state-of-the-art explainable image forensics methods
REVEAL
Novel technique introduced
With the rapid advancement of generative models, visually realistic AI-generated images have become increasingly difficult to distinguish from authentic ones, posing severe threats to social trust and information integrity. Consequently, there is an urgent need for efficient and truly explainable image forensic methods. Recent detection paradigms have shifted towards explainable forensics. However, state-of-the-art approaches primarily rely on post-hoc rationalizations or visual discrimination, lacking a verifiable chain of evidence. This reliance on surface-level pattern matching limits the generation of causally grounded explanations and often results in poor generalization. To bridge this critical gap, we introduce \textbf{REVEAL-Bench}, the first reasoning-enhanced multimodal benchmark for AI-generated image detection that is explicitly structured around a chain of evidence derived from multiple lightweight expert models, and that records step-by-step reasoning traces and evidential justifications. Building upon this dataset, we propose \textbf{REVEAL} (\underline{R}easoning-\underline{e}nhanced Forensic E\underline{v}id\underline{e}nce \underline{A}na\underline{l}ysis), an effective and explainable forensic framework that integrates detection with a novel expert-grounded reinforcement learning scheme. Our reward mechanism is specifically tailored to jointly optimize detection accuracy, explanation fidelity, and logical coherence grounded in explicit forensic evidence, enabling REVEAL to produce fine-grained, interpretable, and verifiable reasoning chains alongside its detection outcomes. Extensive experimental results demonstrate that REVEAL significantly improves detection accuracy, explanation fidelity, and cross-model generalization, establishing a new state of the art for explainable image forensics.
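The abstract describes a reward that jointly optimizes detection accuracy, explanation fidelity, and logical coherence. The paper does not publish the exact formulation here, so the following is a minimal sketch of one plausible shape for such a composite reward: a weighted sum over the three terms. All names, weights, and score ranges below are illustrative assumptions, not details taken from REVEAL.

```python
from dataclasses import dataclass


@dataclass
class RolloutScores:
    """Per-rollout scores for one generated reasoning chain.

    All fields are hypothetical stand-ins: in the actual framework these
    would come from the ground-truth label, comparison against the expert
    models' evidence, and a coherence check on the reasoning trace.
    """
    detection_correct: bool   # predicted real/fake label matches ground truth
    evidence_fidelity: float  # in [0, 1]: how well cited evidence matches expert findings
    coherence: float          # in [0, 1]: logical consistency of the reasoning chain


def composite_reward(s: RolloutScores,
                     w_det: float = 1.0,
                     w_fid: float = 0.5,
                     w_coh: float = 0.5) -> float:
    """Weighted sum of the three reward terms named in the abstract.

    The weights are illustrative defaults, not values from the paper.
    """
    return (w_det * float(s.detection_correct)
            + w_fid * s.evidence_fidelity
            + w_coh * s.coherence)
```

Under this sketch, a rollout that classifies correctly but cites evidence unsupported by the expert models earns less than one whose explanation is grounded, which is the incentive the expert-grounded RL objective is meant to create.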
Key Contributions
- REVEAL-Bench: first reasoning-enhanced multimodal benchmark for AI-generated image detection structured around a chain-of-evidence from multiple expert models with step-by-step reasoning traces
- REVEAL framework: explainable forensic detection system integrating expert-grounded reinforcement learning that jointly optimizes detection accuracy, explanation fidelity, and logical coherence
- Novel reward mechanism producing fine-grained, interpretable, and verifiable reasoning chains alongside detection outcomes, achieving new state-of-the-art in explainable image forensics
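The chain-of-evidence idea in the first contribution, evidence gathered from multiple lightweight expert models and recorded step by step, can be sketched roughly as follows. The expert names, probe interface, and mean-probability verdict rule are assumptions for illustration only; the actual benchmark construction is not specified at this level of detail here.

```python
from typing import Callable

# Each "expert" is a lightweight forensic probe returning a textual
# finding and an estimated probability that the image is AI-generated.
# This interface is hypothetical, chosen to keep the sketch self-contained.
Expert = Callable[[bytes], tuple[str, float]]


def build_evidence_chain(image: bytes,
                         experts: dict[str, Expert],
                         threshold: float = 0.5) -> dict:
    """Run every expert probe, record each finding as one evidence step,
    and derive a final verdict from the mean fake-probability."""
    chain = []
    probs = []
    for name, probe in experts.items():
        finding, p_fake = probe(image)
        chain.append({"expert": name, "finding": finding, "p_fake": p_fake})
        probs.append(p_fake)
    mean_p = sum(probs) / len(probs)
    return {
        "evidence": chain,
        "verdict": "ai-generated" if mean_p >= threshold else "authentic",
        "confidence": mean_p,
    }


# Usage with two dummy probes standing in for real forensic experts:
experts = {
    "noise_residual": lambda img: ("uniform noise residual pattern", 0.9),
    "frequency": lambda img: ("periodic grid artifacts in spectrum", 0.7),
}
report = build_evidence_chain(b"<image bytes>", experts)
```

Each entry in `report["evidence"]` is one verifiable step tying the verdict to a named expert's finding, which is the property that distinguishes a chain of evidence from a post-hoc rationalization.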
🛡️ Threat Analysis
The primary contribution, detecting AI-generated images with verifiable forensic reasoning chains, directly addresses output integrity and AI-generated content detection, a core ML09 concern.