Detecting Generated Images by Fitting Natural Image Distributions
Yonggang Zhang 1, Jun Nie 2,3, Xinmei Tian 4, Mingming Gong 5,3, Kun Zhang 6,3, Bo Han 2
1 The Hong Kong University of Science and Technology
2 Hong Kong Baptist University
3 Mohamed bin Zayed University of Artificial Intelligence
Published on arXiv
arXiv:2511.01293
Output Integrity Attack
OWASP ML Top 10 — ML09
Key Finding
ConV detects generated images across diverse unknown generative models without requiring generated image training data, outperforming binary classifier baselines that depend on collected generated samples.
ConV (Consistency Verification)
Novel technique introduced
The increasing realism of generated images has raised significant concerns about their potential misuse, necessitating robust detection methods. Current approaches mainly rely on training binary classifiers, which depend heavily on the quantity and quality of available generated images. In this work, we propose a novel framework that exploits geometric differences between the data manifolds of natural and generated images. To exploit these differences, we employ a pair of functions engineered to yield consistent outputs for natural images but divergent outputs for generated ones, leveraging the property that their gradients reside in mutually orthogonal subspaces. This design enables a simple yet effective detection method: an image is identified as generated if a transformation along its data manifold induces a significant change in the loss value of a self-supervised model pre-trained on natural images. Furthermore, to address diminishing manifold disparities in advanced generative models, we leverage normalizing flows to amplify detectable differences by extruding generated images away from the natural image manifold. Extensive experiments demonstrate the efficacy of this method. Code is available at https://github.com/tmlr-group/ConV.
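The detection rule described in the abstract can be sketched as follows. This is a minimal toy illustration, not the paper's implementation: `toy_loss`, `toy_transform`, and the threshold are hypothetical stand-ins for the self-supervised loss, the manifold-aligned transformation, and a calibrated decision threshold.

```python
def consistency_score(x, loss_fn, transform):
    """Absolute change in loss induced by a manifold-aligned transform.

    The paper's premise: for natural images the transform direction is
    orthogonal to the loss gradient, so the loss barely changes; for
    generated images it is not, so the loss shifts noticeably.
    """
    return abs(loss_fn(transform(x)) - loss_fn(x))

def is_generated(x, loss_fn, transform, threshold):
    """Flag an input as generated when the transform perturbs the loss."""
    return consistency_score(x, loss_fn, transform) > threshold

# Toy stand-ins (not the paper's models): the "natural manifold" is the
# axis where x[1] == 0. On that axis the loss gradient w.r.t. x[0] is
# zero, i.e. orthogonal to the transform direction.
def toy_loss(x):
    return (x[0] * x[1]) ** 2

def toy_transform(x):
    return [x[0] + 1.0, x[1]]  # move along the (toy) manifold direction

natural = [0.5, 0.0]    # on the toy manifold: score stays 0
generated = [0.5, 2.0]  # displaced off the manifold: score jumps
```

With these stand-ins, `consistency_score(natural, ...)` is exactly 0 while the off-manifold point yields a large score, reproducing the qualitative behavior the method relies on.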
Key Contributions
- ConV framework that detects generated images by training only on natural images, exploiting geometric manifold disparities without needing generated training samples
- Theoretical design using function pairs with orthogonal gradient subspaces to ensure consistent outputs for natural images and divergent outputs for generated ones
- Normalizing flow augmentation that actively extrudes generated images from the natural image manifold to amplify detectable differences as generative models improve
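The normalizing-flow extrusion in the last contribution can be illustrated with a deliberately simplified one-dimensional sketch. A real implementation would stack coupling layers (e.g. RealNVP or Glow) over image features; here an invertible affine map fit only to natural-data statistics serves as the flow, and `alpha` is an illustrative hyperparameter, not a value from the paper.

```python
from statistics import fmean, pstdev

class AffineFlow:
    """Minimal invertible flow fit only to natural-data features (1-D toy)."""

    def fit(self, natural_feats):
        self.mu = fmean(natural_feats)
        self.sigma = pstdev(natural_feats) + 1e-8
        return self

    def forward(self, x):
        return (x - self.mu) / self.sigma  # feature space -> latent space

    def inverse(self, z):
        return self.mu + self.sigma * z    # latent space -> feature space

def extrude(flow, x, alpha=2.0):
    """Push a sample away from the natural-data mode by scaling its latent.

    Samples the flow maps near the latent origin (natural-like inputs)
    barely move; off-distribution samples are displaced further, widening
    the gap a downstream consistency check sees.
    """
    return flow.inverse(alpha * flow.forward(x))
```

Because the flow is trained exclusively on natural data, extrusion preserves the key property of the overall framework: no generated images are needed at training time.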
🛡️ Threat Analysis
The primary contribution is a novel AI-generated image detection method; detecting synthetic/deepfake images falls squarely under output integrity and content authenticity. The novelty (manifold geometry, orthogonal gradient subspaces, normalizing flow amplification) is a new detection architecture, not merely an application of existing classifiers to a new domain.