Detecting Generated Images by Fitting Natural Image Distributions
Yonggang Zhang 1, Jun Nie 2,3, Xinmei Tian 4, Mingming Gong 5,3, Kun Zhang 6,3, Bo Han 2
1 The Hong Kong University of Science and Technology
2 Hong Kong Baptist University
3 Mohamed bin Zayed University of Artificial Intelligence
Published on arXiv
arXiv:2511.01293
Output Integrity Attack
OWASP ML Top 10 — ML09
Key Finding
ConV detects generated images across diverse unknown generative models without requiring generated image training data, outperforming binary classifier baselines that depend on collected generated samples.
ConV (Consistency Verification)
Novel technique introduced
The increasing realism of generated images has raised significant concerns about their potential misuse, necessitating robust detection methods. Current approaches mainly rely on training binary classifiers, which depend heavily on the quantity and quality of available generated images. In this work, we propose a novel framework that exploits geometric differences between the data manifolds of natural and generated images. To exploit these differences, we employ a pair of functions engineered to yield consistent outputs for natural images but divergent outputs for generated ones, leveraging the property that their gradients reside in mutually orthogonal subspaces. This design enables a simple yet effective detection method: an image is identified as generated if a transformation along its data manifold induces a significant change in the loss value of a self-supervised model pre-trained on natural images. Furthermore, to address diminishing manifold disparities in advanced generative models, we leverage normalizing flows to amplify detectable differences by extruding generated images away from the natural image manifold. Extensive experiments demonstrate the efficacy of this method. Code is available at https://github.com/tmlr-group/ConV.
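The detection rule described in the abstract can be sketched as follows. This is a minimal toy illustration, not the paper's implementation: `toy_loss`, `toy_transform`, and the threshold are hypothetical stand-ins for the self-supervised loss, the manifold-aligned transformation, and a calibrated decision threshold.

```python
def consistency_score(x, loss_fn, transform):
    """Absolute change in loss induced by a manifold-aligned transform.

    The paper's premise: for natural images the transform direction is
    orthogonal to the loss gradient, so the loss barely changes; for
    generated images it is not, so the loss shifts noticeably.
    """
    return abs(loss_fn(transform(x)) - loss_fn(x))

def is_generated(x, loss_fn, transform, threshold):
    """Flag an input as generated when the transform perturbs the loss."""
    return consistency_score(x, loss_fn, transform) > threshold

# Toy stand-ins (not the paper's models): the "natural manifold" is the
# axis where x[1] == 0. On that axis the loss gradient w.r.t. x[0] is
# zero, i.e. orthogonal to the transform direction.
def toy_loss(x):
    return (x[0] * x[1]) ** 2

def toy_transform(x):
    return [x[0] + 1.0, x[1]]  # move along the (toy) manifold direction

natural = [0.5, 0.0]    # on the toy manifold: score stays 0
generated = [0.5, 2.0]  # displaced off the manifold: score jumps
```

With these stand-ins, `consistency_score(natural, ...)` is exactly 0 while the off-manifold point yields a large score, reproducing the qualitative behavior the method relies on.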
Key Contributions
- ConV framework that detects generated images by training only on natural images, exploiting geometric manifold disparities without needing generated training samples
- Theoretical design using function pairs with orthogonal gradient subspaces to ensure consistent outputs for natural images and divergent outputs for generated ones
- Normalizing flow augmentation that actively extrudes generated images from the natural image manifold to amplify detectable differences as generative models improve
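The normalizing-flow extrusion in the last contribution can be illustrated with a deliberately simplified one-dimensional sketch. A real implementation would stack coupling layers (e.g. RealNVP or Glow) over image features; here an invertible affine map fit only to natural-data statistics serves as the flow, and `alpha` is an illustrative hyperparameter, not a value from the paper.

```python
from statistics import fmean, pstdev

class AffineFlow:
    """Minimal invertible flow fit only to natural-data features (1-D toy)."""

    def fit(self, natural_feats):
        self.mu = fmean(natural_feats)
        self.sigma = pstdev(natural_feats) + 1e-8
        return self

    def forward(self, x):
        return (x - self.mu) / self.sigma  # feature space -> latent space

    def inverse(self, z):
        return self.mu + self.sigma * z    # latent space -> feature space

def extrude(flow, x, alpha=2.0):
    """Push a sample away from the natural-data mode by scaling its latent.

    Samples the flow maps near the latent origin (natural-like inputs)
    barely move; off-distribution samples are displaced further, widening
    the gap a downstream consistency check sees.
    """
    return flow.inverse(alpha * flow.forward(x))
```

Because the flow is trained exclusively on natural data, extrusion preserves the key property of the overall framework: no generated images are needed at training time.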
🛡️ Threat Analysis
The primary contribution is a novel AI-generated image detection method; detecting synthetic/deepfake images falls squarely under output integrity and content authenticity. The novelty (manifold geometry, orthogonal gradient subspaces, normalizing flow amplification) is a new detection architecture, not merely an application of existing classifiers to a new domain.