Detecting AI-Generated Images via Distributional Deviations from Real Images
Yakun Niu, Yingjian Chen, Lei Zhang
Published on arXiv
2601.03586
Output Integrity Attack
OWASP ML Top 10 — ML09
Key Finding
MPFT+TAM achieves 98.2% and 94.6% average accuracy on GenImage and UniversalFakeDetect respectively, significantly outperforming frozen CLIP-based baselines while generalizing to unseen generative models.
MPFT (Masking-based Pre-trained model Fine-Tuning) with TAM (Texture-Aware Masking)
Novel technique introduced
The rapid advancement of generative models has significantly enhanced the quality of AI-generated images, raising concerns about misinformation and the erosion of public trust. Detecting AI-generated images has thus become a critical challenge, particularly in terms of generalizing to unseen generative models. Existing methods using frozen pre-trained CLIP models show promise in generalization but treat the image encoder as a basic feature extractor, failing to fully exploit its potential. In this paper, we perform an in-depth analysis of the frozen CLIP image encoder (CLIP-ViT), revealing that it effectively clusters real images in a high-level, abstract feature space. However, it does not truly possess the ability to distinguish between real and AI-generated images. Based on this analysis, we propose a Masking-based Pre-trained model Fine-Tuning (MPFT) strategy, which introduces a Texture-Aware Masking (TAM) mechanism to mask textured areas containing generative model-specific patterns during fine-tuning. This approach compels CLIP-ViT to attend to the "distributional deviations" from authentic images for AI-generated image detection, thereby achieving enhanced generalization performance. Extensive experiments on the GenImage and UniversalFakeDetect datasets demonstrate that our method, fine-tuned with only a minimal number of images, significantly outperforms existing approaches, achieving up to 98.2% and 94.6% average accuracy on the two datasets, respectively.
Key Contributions
- Analysis showing frozen CLIP-ViT clusters real images well but cannot intrinsically distinguish real from AI-generated images without fine-tuning
- MPFT strategy with Texture-Aware Masking (TAM) that masks generative-model-specific textured regions during fine-tuning to force CLIP-ViT to learn distributional deviations from real images
- Achieves 98.2% and 94.6% average accuracy on GenImage and UniversalFakeDetect benchmarks with minimal fine-tuning images, outperforming prior state-of-the-art
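To make the TAM idea concrete, here is a minimal, hypothetical sketch of texture-aware masking using only NumPy. It uses per-patch variance as a crude texture proxy and masks the most textured patches, following the intuition that generator-specific artifacts concentrate in textured regions; the paper's actual texture criterion, patch size, and mask ratio are not specified here and may differ.

```python
import numpy as np

def texture_aware_mask(image, patch=16, mask_ratio=0.5):
    """Illustrative texture-aware masking sketch (not the paper's exact TAM).

    Ranks non-overlapping patches by local variance (a simple texture
    proxy) and marks the most textured fraction for masking, so a
    detector fine-tuned on the remainder must rely on distributional
    deviations rather than generator-specific texture patterns.
    Returns a boolean grid: True = patch is masked.
    """
    gray = image.mean(axis=2) if image.ndim == 3 else image
    h, w = gray.shape
    ph, pw = h // patch, w // patch
    # Reshape into a (ph, patch, pw, patch) grid and score each patch.
    patches = gray[:ph * patch, :pw * patch].reshape(ph, patch, pw, patch)
    scores = patches.var(axis=(1, 3)).ravel()
    n_mask = int(len(scores) * mask_ratio)
    masked_idx = np.argsort(scores)[::-1][:n_mask]  # most textured first
    mask = np.zeros(ph * pw, dtype=bool)
    mask[masked_idx] = True
    return mask.reshape(ph, pw)

# Example: image whose left half is noisy (textured), right half flat.
rng = np.random.default_rng(0)
img = np.zeros((64, 64))
img[:, :32] = rng.random((64, 32))
m = texture_aware_mask(img, patch=16, mask_ratio=0.5)
# The masked patches are exactly the textured left-half columns.
```

In an MPFT-style pipeline, the masked patch grid would then be applied to the ViT token sequence (e.g., dropping or replacing the corresponding patch embeddings) before fine-tuning.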
🛡️ Threat Analysis
The paper proposes a novel AI-generated image detection method. Detecting the provenance and authenticity of synthetic content falls under ML09 (output integrity / AI-generated content detection), and MPFT+TAM is a new forensic technique for distinguishing real images from generative-model output, including output from generators unseen during fine-tuning.