Defense · 2025

Detecting AI-Generated Images via Diffusion Snap-Back Reconstruction: A Forensic Approach

Mohd Ruhul Ameen, Akif Islam

0 citations · 21 references · arXiv


Published on arXiv · 2511.00352

Output Integrity Attack

OWASP ML Top 10 — ML09

Key Finding

Achieves 0.993 AUROC under stratified five-fold cross-validation and 0.990 AUROC on a holdout split, using only a logistic regression classifier on 15-dimensional diffusion snap-back features.

Diffusion Snap-Back

Novel technique introduced


The rapid rise of generative diffusion models has made distinguishing authentic visual content from synthetic imagery increasingly challenging. Traditional deepfake detection methods, which rely on frequency or pixel-level artifacts, fail against modern text-to-image systems such as Stable Diffusion and DALL-E that produce photorealistic and artifact-free results. This paper introduces a diffusion-based forensic framework that leverages multi-strength image reconstruction dynamics, termed diffusion snap-back, to identify AI-generated images. By analysing how reconstruction metrics (LPIPS, SSIM, and PSNR) evolve across varying noise strengths, we extract interpretable manifold-based features that differentiate real and synthetic images. Evaluated on a balanced dataset of 4,000 images, our approach achieves 0.993 AUROC under cross-validation and remains robust to common distortions such as compression and noise. Despite using limited data and a single diffusion backbone (Stable Diffusion v1.5), the proposed method demonstrates strong generalization and interpretability, offering a foundation for scalable, model-agnostic synthetic media forensics.
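The probe described in the abstract can be sketched in a few lines. This is an illustrative reconstruction, not the authors' code: `reconstruct(img, s)` stands in for a diffusion img2img call (e.g., Stable Diffusion v1.5 run at denoising strength `s`), the strength grid is an assumed example, and only PSNR is computed here for self-containedness; a full implementation would also record LPIPS (`lpips` package) and SSIM (`scikit-image`).

```python
import numpy as np

def psnr(a: np.ndarray, b: np.ndarray, peak: float = 1.0) -> float:
    """Peak signal-to-noise ratio between two images with values in [0, peak]."""
    mse = float(np.mean((a - b) ** 2))
    return float("inf") if mse == 0.0 else 10.0 * np.log10(peak ** 2 / mse)

def snapback_curve(img, reconstruct, strengths=(0.1, 0.3, 0.5, 0.7, 0.9)):
    """Reconstruct `img` at each noise strength and record the metric trajectory.

    `reconstruct(img, s)` is assumed to noise the image at strength `s` and
    denoise it with a pre-trained diffusion model (img2img). The intuition is
    that synthetic images lie on the model's manifold and "snap back" to
    themselves more faithfully than real images do.
    """
    return [psnr(img, reconstruct(img, s)) for s in strengths]
```

With the `diffusers` library, `reconstruct` would wrap an img2img pipeline call at the given `strength`; any callable with that signature works, which keeps the probe model-agnostic.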


Key Contributions

  • Diffusion snap-back framework: uses a pre-trained diffusion img2img pipeline as a forensic probe by analyzing how reconstruction quality metrics (LPIPS, SSIM, PSNR) evolve across multiple noise strengths
  • Compact 15-dimensional manifold-aligned feature vector combining multi-strength perceptual metrics with trajectory descriptors (AUC-LPIPS, delta-LP, knee-step) for interpretable classification
  • Lightweight logistic regression classifier achieving 0.993 AUROC on 4,000 images with demonstrated robustness to JPEG compression and additive noise
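The trajectory descriptors named above can be summarized from a per-strength LPIPS curve. The definitions below (trapezoidal area for AUC-LPIPS, endpoint difference for delta-LP, point of maximum curvature for knee-step) are plausible reconstructions for illustration, not the paper's verbatim formulas:

```python
import numpy as np

def trajectory_descriptors(strengths, lpips_vals):
    """Summarize an LPIPS-vs-strength trajectory into scalar features.

    Illustrative guesses at the paper's descriptors:
    - auc_lpips: trapezoidal area under the LPIPS curve
    - delta_lp:  net rise in LPIPS from the weakest to the strongest noise
    - knee_step: strength where the curve bends most (max |second difference|)
    """
    s = np.asarray(strengths, dtype=float)
    y = np.asarray(lpips_vals, dtype=float)
    auc = float(np.sum((y[1:] + y[:-1]) * np.diff(s)) / 2.0)
    delta = float(y[-1] - y[0])
    if len(y) >= 3:
        knee = float(s[np.argmax(np.abs(np.diff(y, 2))) + 1])
    else:
        knee = float(s[0])
    return {"auc_lpips": auc, "delta_lp": delta, "knee_step": knee}
```

Concatenating such descriptors with the raw multi-strength metrics yields a compact feature vector that a logistic regression classifier (e.g., scikit-learn's `LogisticRegression`) can separate, which matches the paper's emphasis on interpretability over heavy learned detectors.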

🛡️ Threat Analysis

Output Integrity Attack

Introduces a novel AI-generated image detection framework. Deepfake and synthetic image detection is a core ML09 use case (output integrity and content provenance), and the paper's primary contribution is a new forensic technique, not merely the application of existing detectors to a new domain.


Details

Domains
vision, generative
Model Types
diffusion, traditional_ml
Threat Tags
inference_time, digital
Datasets
Custom balanced dataset (4,000 images: 2,000 real, 2,000 AI-generated via Stable Diffusion/DALL-E)
Applications
ai-generated image detection, deepfake detection, synthetic media forensics