S²F-Net: A Robust Spatial-Spectral Fusion Framework for Cross-Model AIGC Detection

The rapid development of generative models has created an urgent need for detection schemes that generalize well. However, existing detection methods tend to overfit to specific source models, and their performance degrades significantly on unseen generative architectures. To address this, this paper proposes S²F-Net, a cross-model detection framework whose core idea is to explore and exploit the inherent spectral discrepancies between real and synthetic textures. Because upsampling operations leave distinctive frequency fingerprints in both texture-poor and texture-rich regions, we focus on detecting frequency-domain artifacts, with the aim of fundamentally improving the model's generalization. Specifically, we introduce a learnable frequency attention module that adaptively weights and enhances discriminative frequency bands by combining spatial texture analysis with spectral dependencies. On the AIGCDetectBenchmark, which covers 17 categories of generative models, S²F-Net achieves a detection accuracy of 90.49%, significantly outperforming existing baseline methods in cross-domain detection scenarios.
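The claim that upsampling leaves a frequency fingerprint can be illustrated with a toy experiment. The sketch below is not the paper's method: it assumes white-noise stand-ins for images and plain 2x nearest-neighbor upsampling (real generators typically use learned or transposed-convolution upsampling), and the helper name `high_freq_ratio` and its `cutoff` parameter are hypothetical.

```python
import numpy as np

def high_freq_ratio(img, cutoff=0.25):
    """Fraction of spectral energy above `cutoff` cycles/pixel (Nyquist is 0.5)."""
    h, w = img.shape
    # Centered power spectrum with the DC component removed.
    power = np.abs(np.fft.fftshift(np.fft.fft2(img - img.mean()))) ** 2
    # Radial frequency of every bin, in cycles per pixel.
    fy = np.fft.fftshift(np.fft.fftfreq(h))[:, None]
    fx = np.fft.fftshift(np.fft.fftfreq(w))[None, :]
    radius = np.sqrt(fx ** 2 + fy ** 2)
    return power[radius > cutoff].sum() / power.sum()

rng = np.random.default_rng(0)
native = rng.standard_normal((64, 64))         # stand-in for a natively sampled texture
low_res = rng.standard_normal((32, 32))
upsampled = np.kron(low_res, np.ones((2, 2)))  # 2x nearest-neighbor upsampling

native_ratio = high_freq_ratio(native)
upsampled_ratio = high_freq_ratio(upsampled)
print(f"native: {native_ratio:.3f}, upsampled: {upsampled_ratio:.3f}")
```

In this toy setting the upsampled image carries a smaller share of its energy in the high-frequency region, because pixel repetition acts as a box filter on the spectrum. A detector built on spectral features is meant to pick up statistical differences of this kind.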

Key Contributions

Proposes S²F-Net, a cross-model AIGC detection framework that fuses spatial texture analysis with spectral dependencies to exploit universal upsampling artifacts across generative architectures
Introduces a learnable frequency attention module that adaptively weights discriminative frequency bands based on image entropy, targeting high-frequency anomalies in high-entropy (texture-rich) regions
Achieves 90.49% detection accuracy on AIGCDetectBenchmark (17 generative model categories), significantly outperforming existing baselines in cross-domain generalization
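The entropy-guided band weighting described in the contributions can be sketched roughly as follows. This is an illustrative NumPy approximation under stated assumptions, not the authors' architecture: the function names, the ring-shaped band layout, the entropy threshold, and the fixed `logits` vector (standing in for parameters that would be learned during training) are all hypothetical.

```python
import numpy as np

def patch_entropy(patch, bins=32):
    """Shannon entropy (bits) of the intensity histogram of a patch in [0, 1]."""
    hist, _ = np.histogram(patch, bins=bins, range=(0.0, 1.0))
    p = hist[hist > 0] / hist.sum()
    return float(-(p * np.log2(p)).sum())

def radial_band_energy(patch, n_bands=8):
    """Mean spectral power in `n_bands` concentric frequency rings."""
    h, w = patch.shape
    power = np.abs(np.fft.fftshift(np.fft.fft2(patch - patch.mean()))) ** 2
    fy = np.fft.fftshift(np.fft.fftfreq(h))[:, None]
    fx = np.fft.fftshift(np.fft.fftfreq(w))[None, :]
    radius = np.sqrt(fx ** 2 + fy ** 2)
    edges = np.linspace(0.0, radius.max() + 1e-9, n_bands + 1)
    idx = np.digitize(radius, edges) - 1  # ring index of each frequency bin
    return np.array([power[idx == b].mean() if np.any(idx == b) else 0.0
                     for b in range(n_bands)])

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def frequency_attention_features(patch, band_logits, entropy_threshold=4.0):
    """Weight log band energies by softmax attention and flag texture-rich patches."""
    weights = softmax(band_logits)
    energies = np.log1p(radial_band_energy(patch, n_bands=len(band_logits)))
    is_rich = patch_entropy(patch) >= entropy_threshold
    return weights * energies, is_rich

rng = np.random.default_rng(0)
rich = rng.random((32, 32))          # noisy, high-entropy patch
flat = np.full((32, 32), 0.5)        # constant, low-entropy patch
logits = np.linspace(-1.0, 1.0, 8)   # placeholder for learned weights, biased to high bands

rich_feats, rich_flag = frequency_attention_features(rich, logits)
flat_feats, flat_flag = frequency_attention_features(flat, logits)
print(rich_flag, flat_flag)
```

In a trained model the band logits would be optimized end to end, and the entropy signal would more plausibly modulate the weights continuously than act as a hard threshold. The threshold here only keeps the sketch short.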

🛡️ Threat Analysis

Output Integrity Attack

S²F-Net is a novel AI-generated image detection architecture targeting output integrity — specifically detecting synthetic images from GANs and diffusion models by exploiting frequency-domain fingerprints left by upsampling operations. This is a new detection architecture, not a domain application of existing methods.

Details

Domains

vision

Model Types

cnn, transformer, gan, diffusion

Threat Tags

inference_time

Datasets

AIGCDetectBenchmark

Applications

2025 1 cit.
