A Difference-in-Difference Approach to Detecting AI-Generated Images
Xinyi Qi 1, Kai Ye 2, Chengchun Shi 2, Ying Yang 1, Hongyi Zhou 1, Jin Zhu 3
Published on arXiv
2602.23732
Output Integrity Attack
OWASP ML Top 10 — ML09
Key Finding
The proposed second-order difference-in-difference method achieves strong generalization for detecting AI-generated images, outperforming reconstruction-error baselines against modern diffusion models.
Difference-in-Difference (DiD) Detection
Novel technique introduced
Diffusion models can produce AI-generated images that are almost indistinguishable from real ones, raising concerns about misuse and posing substantial challenges for detection. Many existing detectors rely on reconstruction error -- the difference between an input image and its reconstructed version -- to distinguish real from fake images. However, these detectors become less effective as modern AI-generated images grow increasingly similar to real ones. To address this challenge, we propose a novel difference-in-difference method: instead of using the reconstruction error directly (a first-order difference), we compute the difference in reconstruction error -- a second-order difference -- to reduce variance and improve detection accuracy. Extensive experiments demonstrate that our method achieves strong generalization, enabling reliable detection of AI-generated images in the era of generative AI.
Key Contributions
- Introduces a difference-in-difference framework that computes a second-order reconstruction error (difference of reconstruction differences) rather than raw reconstruction error for AI image detection
- Achieves variance reduction over first-order reconstruction-error baselines, improving detection accuracy and generalization
- Demonstrates strong performance detecting images from modern diffusion models that closely resemble real images
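The second-order idea above can be illustrated with a toy sketch. The paper's exact formulation is not reproduced here; the `reconstruct` function below is a hypothetical stand-in for a diffusion-model reconstruction, and the specific second-order score (error of the input minus error of its own reconstruction) is an assumption chosen only to show how differencing reconstruction errors can cancel shared variance.

```python
import numpy as np

def reconstruct(x, strength=0.5):
    # Hypothetical stand-in for a diffusion-model reconstruction.
    # A real detector would invert and re-run the diffusion process;
    # here we simply shrink the image toward its mean intensity.
    return strength * x + (1 - strength) * x.mean()

def first_order_error(x):
    # First-order difference: mean squared reconstruction error,
    # the quantity used by conventional reconstruction-based detectors.
    return float(np.mean((x - reconstruct(x)) ** 2))

def second_order_score(x):
    # Second-order (difference-in-difference) score: the reconstruction
    # error of x minus the reconstruction error of its reconstruction.
    # Subtracting the two differences cancels variation common to both,
    # which is the variance-reduction intuition behind the method.
    r = reconstruct(x)
    return first_order_error(x) - float(np.mean((r - reconstruct(r)) ** 2))

# Usage: a constant image reconstructs perfectly (zero error), while a
# textured image yields a positive second-order score under this toy model.
flat = np.zeros((8, 8))
textured = np.random.default_rng(0).normal(size=(8, 8))
print(first_order_error(flat), second_order_score(textured))
```

In practice the detector would threshold such a score, with real images and AI-generated images expected to separate more cleanly under the second-order statistic than under raw reconstruction error.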
🛡️ Threat Analysis
The paper directly contributes a novel AI-generated image detection methodology -- a difference-in-difference technique applied to reconstruction error -- which falls squarely under output integrity and content authenticity (detecting synthetic content produced by generative models).