Detecting AI-Generated Images via Diffusion Snap-Back Reconstruction: A Forensic Approach
Published on arXiv: 2511.00352
Threat category: Output Integrity Attack (OWASP ML Top 10 — ML09)
Key finding: 0.993 AUROC under stratified five-fold cross-validation and 0.990 on a holdout split, using only logistic regression on 15-dimensional diffusion snap-back features
Novel technique introduced: diffusion snap-back
The rapid rise of generative diffusion models has made distinguishing authentic visual content from synthetic imagery increasingly challenging. Traditional deepfake detection methods, which rely on frequency-domain or pixel-level artifacts, fail against modern text-to-image systems such as Stable Diffusion and DALL-E that produce photorealistic, artifact-free results. This paper introduces a diffusion-based forensic framework that leverages multi-strength image reconstruction dynamics, termed diffusion snap-back, to identify AI-generated images. By analyzing how reconstruction metrics (LPIPS, SSIM, and PSNR) evolve across varying noise strengths, we extract interpretable manifold-based features that differentiate real and synthetic images. Evaluated on a balanced dataset of 4,000 images, our approach achieves 0.993 AUROC under cross-validation and remains robust to common distortions such as compression and noise. Despite using limited data and a single diffusion backbone (Stable Diffusion v1.5), the proposed method demonstrates strong generalization and interpretability, offering a foundation for scalable, model-agnostic synthetic media forensics.
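The trajectory analysis described above can be sketched as follows. This is a minimal illustration, not the paper's code: it assumes a per-strength LPIPS curve has already been computed by running the img2img pipeline at each noise strength, and the function name and the exact definitions of the AUC-LPIPS, delta-LP, and knee-step descriptors are assumptions for illustration.

```python
import numpy as np

def snapback_features(lpips_curve, strengths):
    """Trajectory descriptors from a per-strength LPIPS curve (illustrative).

    lpips_curve: LPIPS(original, reconstruction) at each img2img noise strength.
    strengths:   the noise strengths probed, in ascending order.
    """
    lp = np.asarray(lpips_curve, dtype=float)
    s = np.asarray(strengths, dtype=float)
    # AUC-LPIPS: trapezoidal area under the LPIPS-vs-strength trajectory
    auc_lpips = float(np.sum(0.5 * (lp[1:] + lp[:-1]) * np.diff(s)))
    # delta-LP: total rise in LPIPS from the lowest to the highest strength
    delta_lp = float(lp[-1] - lp[0])
    # knee-step: index of the largest single-step jump in the trajectory
    knee_step = int(np.argmax(np.diff(lp)))
    return np.array([auc_lpips, delta_lp, knee_step])

feats = snapback_features([0.05, 0.10, 0.30, 0.35, 0.40],
                          [0.1, 0.3, 0.5, 0.7, 0.9])
```

The intuition is that reconstructions of images lying on (or off) the diffusion model's learned manifold degrade at different rates as noise strength grows, so the shape of the metric trajectory, not any single value, carries the forensic signal.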
Key Contributions
- Diffusion snap-back framework: uses a pre-trained diffusion img2img pipeline as a forensic probe by analyzing how reconstruction quality metrics (LPIPS, SSIM, PSNR) evolve across multiple noise strengths
- Compact 15-dimensional manifold-aligned feature vector combining multi-strength perceptual metrics with trajectory descriptors (AUC-LPIPS, delta-LP, knee-step) for interpretable classification
- Lightweight logistic regression classifier achieving 0.993 AUROC on 4,000 images with demonstrated robustness to JPEG compression and additive noise
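The classifier stage in the last bullet can be sketched with scikit-learn. The snap-back feature vectors here are random synthetic stand-ins (the paper's actual features and dataset are not reproduced), so the AUROC printed is illustrative only, not the paper's 0.993.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

rng = np.random.default_rng(0)
n, d = 400, 15  # 15-D feature vectors, as in the paper; data itself is synthetic
X_real = rng.normal(0.0, 1.0, size=(n, d))   # stand-in "real image" features
X_fake = rng.normal(0.6, 1.0, size=(n, d))   # stand-in "synthetic image" features
X = np.vstack([X_real, X_fake])
y = np.concatenate([np.zeros(n), np.ones(n)])

# Lightweight logistic regression, scored by AUROC under stratified 5-fold CV
clf = LogisticRegression(max_iter=1000)
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
auroc = cross_val_score(clf, X, y, cv=cv, scoring="roc_auc").mean()
print(f"mean AUROC: {auroc:.3f}")
```

The design point is that all of the modeling effort lives in the feature extraction; once the 15 trajectory descriptors are computed, a linear classifier suffices, which keeps the decision interpretable.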
🛡️ Threat Analysis
Introduces a novel AI-generated image detection framework. Deepfake/synthetic image detection is a core ML09 use case (output integrity and content provenance), and the paper's primary contribution is a new forensic technique, not merely the application of existing detectors to a new domain.