Defense · 2025

When Semantics Regulate: Rethinking Patch Shuffle and Internal Bias for Generated Image Detection with CLIP

Beilin Chu 1, Weike You 1, Mengtao Li 1, Tingting Zheng 1, Kehan Zhao 1, Xuan Xu 1, Zhigao Lu 1, Jia Song 1, Moxuan Xu 2, Linna Zhou 1

2 citations · 46 references

Published on arXiv: 2511.19126

Output Integrity Attack

OWASP ML Top 10 — ML09

Key Finding

SemAnti achieves state-of-the-art cross-domain generalization on AIGCDetectBenchmark and GenImage by suppressing semantic bias in CLIP-based detectors

SemAnti

Novel technique introduced


The rapid progress of GANs and Diffusion Models poses new challenges for detecting AI-generated images. Although CLIP-based detectors exhibit promising generalization, they often rely on semantic cues rather than generator artifacts, leading to brittle performance under distribution shifts. In this work, we revisit the nature of semantic bias and uncover that Patch Shuffle, which disrupts global semantic continuity while preserving local artifact cues, provides an unusually strong benefit for CLIP: it reduces semantic entropy and homogenizes feature distributions between natural and synthetic images. Through a detailed layer-wise analysis, we further show that CLIP's deep semantic structure functions as a regulator that stabilizes cross-domain representations once semantic bias is suppressed. Guided by these findings, we propose SemAnti, a semantic-antagonistic fine-tuning paradigm that freezes the semantic subspace and adapts only artifact-sensitive layers under shuffled semantics. Despite its simplicity, SemAnti achieves state-of-the-art cross-domain generalization on AIGCDetectBenchmark and GenImage, demonstrating that regulating semantics is key to unlocking CLIP's full potential for robust AI-generated image detection.
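The Patch Shuffle operation at the heart of the analysis can be sketched in a few lines of NumPy: the image is cut into non-overlapping patches whose positions are randomly permuted, destroying global semantic layout while leaving the pixels inside each patch (where low-level generator artifacts live) untouched. This is a minimal illustrative sketch, not the paper's implementation; the patch size and seed are arbitrary.

```python
import numpy as np

def patch_shuffle(image: np.ndarray, patch: int, seed: int = 0) -> np.ndarray:
    """Shuffle the positions of non-overlapping patches of an HxWxC image.

    Global layout (semantic continuity) is destroyed, while the pixel
    content of every individual patch is preserved exactly.
    """
    h, w, c = image.shape
    assert h % patch == 0 and w % patch == 0, "image must tile evenly"
    gh, gw = h // patch, w // patch
    # Reorder into a (gh, gw, patch, patch, c) grid of patches.
    grid = image.reshape(gh, patch, gw, patch, c).transpose(0, 2, 1, 3, 4)
    flat = grid.reshape(gh * gw, patch, patch, c)
    # Permute patch positions, then reassemble the image.
    rng = np.random.default_rng(seed)
    flat = flat[rng.permutation(gh * gw)]
    grid = flat.reshape(gh, gw, patch, patch, c).transpose(0, 2, 1, 3, 4)
    return grid.reshape(h, w, c)
```

Because only patch *positions* change, any statistic computed within a patch is invariant under this transform, which is exactly why artifact cues survive while scene semantics do not.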


Key Contributions

  • Reveals that Patch Shuffle reduces semantic entropy in CLIP and homogenizes feature distributions between real and synthetic images, explaining its outsized benefit for detection generalization
  • Layer-wise analysis showing CLIP's deep semantic structure acts as a stabilizing regulator once semantic bias is suppressed
  • SemAnti: a fine-tuning paradigm that freezes the semantic subspace and adapts only artifact-sensitive layers under shuffled-semantic inputs, achieving state-of-the-art cross-domain generalization

🛡️ Threat Analysis

Output Integrity Attack

The paper's primary contribution is a novel forensic detection method for AI-generated images (from GANs and Diffusion Models). SemAnti is a new detection architecture/training paradigm — not a mere domain application of existing methods — directly advancing AI-generated content detection (output integrity/content authenticity).


Details

Domains
vision
Model Types
transformer, diffusion, gan
Threat Tags
inference_time, digital
Datasets
AIGCDetectBenchmark, GenImage
Applications
ai-generated image detection, deepfake detection, synthetic image forensics