Detecting AI-Generated Images via Distributional Deviations from Real Images
Yakun Niu, Yingjian Chen, Lei Zhang
Published on arXiv
2601.03586
Output Integrity Attack
OWASP ML Top 10 — ML09
Key Finding
MPFT+TAM achieves 98.2% and 94.6% average accuracy on GenImage and UniversalFakeDetect respectively, significantly outperforming frozen CLIP-based baselines while generalizing to unseen generative models.
MPFT (Masking-based Pre-trained model Fine-Tuning) with TAM (Texture-Aware Masking)
Novel technique introduced
The rapid advancement of generative models has significantly enhanced the quality of AI-generated images, raising concerns about misinformation and the erosion of public trust. Detecting AI-generated images has thus become a critical challenge, particularly in terms of generalizing to unseen generative models. Existing methods using frozen pre-trained CLIP models show promise in generalization but treat the image encoder as a basic feature extractor, failing to fully exploit its potential. In this paper, we perform an in-depth analysis of the frozen CLIP image encoder (CLIP-ViT), revealing that it effectively clusters real images in a high-level, abstract feature space. However, it does not truly possess the ability to distinguish between real and AI-generated images. Based on this analysis, we propose a Masking-based Pre-trained model Fine-Tuning (MPFT) strategy, which introduces a Texture-Aware Masking (TAM) mechanism to mask textured areas containing generative model-specific patterns during fine-tuning. This approach compels CLIP-ViT to attend to the "distributional deviations" from authentic images for AI-generated image detection, thereby achieving enhanced generalization performance. Extensive experiments on the GenImage and UniversalFakeDetect datasets demonstrate that our method, fine-tuned with only a minimal number of images, significantly outperforms existing approaches, achieving up to 98.2% and 94.6% average accuracy on the two datasets, respectively.
Key Contributions
- Analysis showing frozen CLIP-ViT clusters real images well but cannot intrinsically distinguish real from AI-generated images without fine-tuning
- MPFT strategy with Texture-Aware Masking (TAM) that masks generative-model-specific textured regions during fine-tuning to force CLIP-ViT to learn distributional deviations from real images
- Achieves 98.2% and 94.6% average accuracy on GenImage and UniversalFakeDetect benchmarks with minimal fine-tuning images, outperforming prior state-of-the-art
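To make the TAM idea concrete, here is a minimal, hypothetical sketch of texture-aware masking using only NumPy. It uses per-patch variance as a crude texture proxy and masks the most textured patches, following the intuition that generator-specific artifacts concentrate in textured regions; the paper's actual texture criterion, patch size, and mask ratio are not specified here and may differ.

```python
import numpy as np

def texture_aware_mask(image, patch=16, mask_ratio=0.5):
    """Illustrative texture-aware masking sketch (not the paper's exact TAM).

    Ranks non-overlapping patches by local variance (a simple texture
    proxy) and marks the most textured fraction for masking, so a
    detector fine-tuned on the remainder must rely on distributional
    deviations rather than generator-specific texture patterns.
    Returns a boolean grid: True = patch is masked.
    """
    gray = image.mean(axis=2) if image.ndim == 3 else image
    h, w = gray.shape
    ph, pw = h // patch, w // patch
    # Reshape into a (ph, patch, pw, patch) grid and score each patch.
    patches = gray[:ph * patch, :pw * patch].reshape(ph, patch, pw, patch)
    scores = patches.var(axis=(1, 3)).ravel()
    n_mask = int(len(scores) * mask_ratio)
    masked_idx = np.argsort(scores)[::-1][:n_mask]  # most textured first
    mask = np.zeros(ph * pw, dtype=bool)
    mask[masked_idx] = True
    return mask.reshape(ph, pw)

# Example: image whose left half is noisy (textured), right half flat.
rng = np.random.default_rng(0)
img = np.zeros((64, 64))
img[:, :32] = rng.random((64, 32))
m = texture_aware_mask(img, patch=16, mask_ratio=0.5)
# The masked patches are exactly the textured left-half columns.
```

In an MPFT-style pipeline, the masked patch grid would then be applied to the ViT token sequence (e.g., dropping or replacing the corresponding patch embeddings) before fine-tuning.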
🛡️ Threat Analysis
The paper proposes a novel AI-generated image detection method. Detecting the provenance and authenticity of synthetic content falls under ML09 (output integrity / AI-generated content detection), and MPFT+TAM is a new forensic technique for distinguishing real images from generative-model output, including output from generators unseen during fine-tuning.