Defense · 2025

NS-Net: Decoupling CLIP Semantic Information through NULL-Space for Generalizable AI-Generated Image Detection

Jiazhen Yan 1, Fan Wang 2, Weiwei Jiang 1, Ziqiang Li 1, Zhangjie Fu 1

Published on arXiv: 2508.01248

Output Integrity Attack

OWASP ML Top 10 — ML09

Key Finding

NS-Net achieves a 7.4% improvement in detection accuracy over state-of-the-art methods on an open-world benchmark spanning 40 diverse GAN- and diffusion-based generative models.

NS-Net

Novel technique introduced


The rapid progress of generative models, such as GANs and diffusion models, has facilitated the creation of highly realistic images, raising growing concerns over their misuse in security-sensitive domains. While existing detectors perform well under known generative settings, they often fail to generalize to unknown generative models, especially when semantic content between real and fake images is closely aligned. In this paper, we revisit the use of CLIP features for AI-generated image detection and uncover a critical limitation: the high-level semantic information embedded in CLIP's visual features hinders effective discrimination. To address this, we propose NS-Net, a novel detection framework that leverages NULL-Space projection to decouple semantic information from CLIP's visual features, followed by contrastive learning to capture intrinsic distributional differences between real and generated images. Furthermore, we design a Patch Selection strategy to preserve fine-grained artifacts by mitigating semantic bias caused by global image structures. Extensive experiments on an open-world benchmark comprising images generated by 40 diverse generative models show that NS-Net outperforms existing state-of-the-art methods, achieving a 7.4% improvement in detection accuracy, thereby demonstrating strong generalization across both GAN- and diffusion-based image generation techniques.
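The core idea of removing semantic content via a NULL-space projection can be illustrated with a minimal NumPy sketch. The construction below is an assumption for illustration, not the paper's exact procedure: it treats a small set of direction vectors (e.g., CLIP text embeddings of semantic concepts) as spanning the semantic subspace, and projects visual features onto that subspace's orthogonal complement.

```python
import numpy as np

def null_space_projector(S):
    """Projector onto the null space of a semantic subspace.

    S: (k, d) matrix whose rows span the semantic directions to remove
       (e.g., CLIP text embeddings of concept names -- an illustrative
       assumption, not necessarily the paper's construction).
    Returns the (d, d) matrix P = I - V V^T, where V's rows form an
    orthonormal basis of the row space of S, so P f is orthogonal to
    every semantic direction.
    """
    # Orthonormal basis of the row space of S via SVD
    Vt = np.linalg.svd(S, full_matrices=False)[2]  # (k, d), rows orthonormal
    return np.eye(S.shape[1]) - Vt.T @ Vt

# Toy check: projected features carry no component along any semantic direction
rng = np.random.default_rng(0)
S = rng.standard_normal((4, 16))   # 4 semantic directions in a 16-D toy space
P = null_space_projector(S)
f = rng.standard_normal(16)        # a stand-in for a CLIP visual feature
f_ns = P @ f
assert np.allclose(S @ f_ns, 0, atol=1e-8)
```

Because P is an orthogonal projector (P = P^T and P P = P), applying it strips exactly the component of each feature that lies in the semantic subspace, leaving the residual signal on which a detector can then be trained.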


Key Contributions

  • NULL-Space projection to decouple high-level semantic information from CLIP visual features, reducing semantic bias in AI-generated image detection
  • Contrastive learning module that captures intrinsic distributional differences between real and AI-generated images after semantic decoupling
  • Patch Selection strategy based on spectral entropy to preserve fine-grained generation artifacts and mitigate global structure bias
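The spectral-entropy Patch Selection idea in the last bullet can be sketched as follows. This is a plausible reading, not the paper's exact rule: each non-overlapping patch is scored by the Shannon entropy of its normalized FFT magnitude spectrum, and the highest-entropy patches (those richest in spread-out frequency content, where generation artifacts tend to survive) are kept.

```python
import numpy as np

def spectral_entropy(patch):
    """Shannon entropy of a patch's normalized 2-D FFT magnitude spectrum."""
    mag = np.abs(np.fft.fft2(patch))
    p = mag / mag.sum()          # normalize magnitudes to a distribution
    p = p[p > 0]                 # drop zero bins before taking the log
    return float(-(p * np.log(p)).sum())

def select_patches(image, patch=32, k=4):
    """Split a grayscale image into non-overlapping patches and keep the
    k patches with the highest spectral entropy (an assumed criterion;
    the paper's exact selection rule may differ)."""
    h, w = image.shape
    patches = [image[i:i + patch, j:j + patch]
               for i in range(0, h - patch + 1, patch)
               for j in range(0, w - patch + 1, patch)]
    scores = [spectral_entropy(p) for p in patches]
    order = np.argsort(scores)[::-1][:k]   # highest entropy first
    return [patches[i] for i in order]
```

Selecting by local spectral statistics rather than global layout is what mitigates the global-structure bias the bullet describes: smooth, semantically dominant regions concentrate their energy at low frequencies and score low, so they are filtered out before detection.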

🛡️ Threat Analysis

Output Integrity Attack

The paper's primary contribution is a novel detector for AI-generated images (from GANs and diffusion models), directly addressing output integrity and content authenticity, a canonical concern of OWASP ML09 in the context of AI-generated content detection.


Details

Domains
vision, generative
Model Types
transformer, gan, diffusion
Threat Tags
inference_time
Datasets
LSUN, open-world benchmark (40 generative models)
Applications
ai-generated image detection, deepfake detection