S²F-Net: A Robust Spatial-Spectral Fusion Framework for Cross-Model AIGC Detection
Xiangyu Hu 1, Yicheng Hong 1, Hongchuang Zheng 2, Wenjun Zeng 1, Bingyao Liu 1
Published on arXiv
2601.12313
Output Integrity Attack
OWASP ML Top 10 — ML09
Key Finding
Achieves 90.49% cross-model detection accuracy on AIGCDetectBenchmark covering 17 generative model categories, significantly outperforming prior baselines.
S²F-Net
Novel technique introduced
The rapid development of generative models has imposed an urgent demand for detection schemes with strong generalization capabilities. However, existing detection methods generally suffer from overfitting to specific source models, leading to significant performance degradation when confronted with unseen generative architectures. To address these challenges, this paper proposes a cross-model detection framework called S²F-Net, whose core lies in exploring and leveraging the inherent spectral discrepancies between real and synthetic textures. Considering that upsampling operations leave unique and distinguishable frequency fingerprints in both texture-poor and texture-rich regions, we focus our research on the detection of frequency-domain artifacts, aiming to fundamentally improve the generalization performance of the model. Specifically, we introduce a learnable frequency attention module that adaptively weights and enhances discriminative frequency bands by synergizing spatial texture analysis and spectral dependencies. On the AIGCDetectBenchmark, which includes 17 categories of generative models, S²F-Net achieves a detection accuracy of 90.49%, significantly outperforming various existing baseline methods in cross-domain detection scenarios.
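The abstract does not specify how the frequency attention module is built. As a rough illustration of what "adaptively weighting frequency bands" can mean, the NumPy sketch below splits the 2D spectrum of a grayscale image into radial bands and rescales each band by a weight derived from a vector of logits. The function names, the radial band layout, and the softmax weighting are all assumptions made for illustration and are not taken from the paper; in a trained model the logits would be learned parameters.

```python
import numpy as np

def radial_band_index(h, w, n_bands):
    """Assign each FFT bin of an h x w image to one of n_bands radial bands."""
    fy = np.fft.fftfreq(h)[:, None]   # vertical frequencies, cycles per pixel
    fx = np.fft.fftfreq(w)[None, :]   # horizontal frequencies, cycles per pixel
    radius = np.sqrt(fx ** 2 + fy ** 2)
    # Normalise by the largest radius so the band edges span [0, 1].
    idx = np.floor(radius / radius.max() * n_bands).astype(int)
    return np.minimum(idx, n_bands - 1)

def band_attention(image, logits):
    """Rescale the spectrum of a grayscale image band by band (hypothetical helper)."""
    n_bands = logits.shape[0]
    bands = radial_band_index(image.shape[0], image.shape[1], n_bands)
    # Softmax scaled by n_bands: equal logits give every band a weight of 1.
    weights = np.exp(logits - logits.max())
    weights = n_bands * weights / weights.sum()
    spectrum = np.fft.fft2(image)
    # The band mask is symmetric in frequency, so the inverse stays real
    # up to rounding error; .real drops the residual imaginary part.
    return np.fft.ifft2(spectrum * weights[bands]).real
```

With all logits equal the function returns the input unchanged, and raising the logit of the outermost band suppresses low frequencies (including the mean) relative to high ones, which is the kind of emphasis a detector looking for high-frequency artifacts might learn.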
Key Contributions
- Proposes S²F-Net, a cross-model AIGC detection framework that fuses spatial texture analysis with spectral dependencies to exploit universal upsampling artifacts across generative architectures
- Introduces a learnable frequency attention module that adaptively weights discriminative frequency bands based on image entropy, targeting high-frequency anomalies in high-entropy (texture-rich) regions
- Achieves 90.49% detection accuracy on AIGCDetectBenchmark (17 generative model categories), significantly outperforming existing baselines in cross-domain generalization
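The second contribution ties the band weighting to image entropy, but this summary does not say how entropy is measured. One common choice, assumed here purely for illustration, is the Shannon entropy of a patch's intensity histogram; the sketch below uses it to rank non-overlapping patches from texture-rich to texture-poor. The helper names, the bin count, and the patch layout are hypothetical.

```python
import numpy as np

def patch_entropy(patch, bins=32):
    """Shannon entropy in bits of the intensity histogram of a patch in [0, 1]."""
    hist, _ = np.histogram(patch, bins=bins, range=(0.0, 1.0))
    p = hist[hist > 0] / hist.sum()   # drop empty bins to avoid log(0)
    return float(-(p * np.log2(p)).sum())

def rank_patches(image, size):
    """Return (entropy, row, col) for non-overlapping patches, richest first."""
    h, w = image.shape
    scores = []
    for r in range(0, h - size + 1, size):
        for c in range(0, w - size + 1, size):
            scores.append((patch_entropy(image[r:r + size, c:c + size]), r, c))
    return sorted(scores, reverse=True)
```

A flat patch scores zero bits, while a noisy or finely textured patch approaches the maximum of log2(bins), so the head of the ranking is where a texture-aware detector would look for high-frequency anomalies.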
🛡️ Threat Analysis
S²F-Net is a novel AI-generated image detection architecture targeting output integrity — specifically detecting synthetic images from GANs and diffusion models by exploiting frequency-domain fingerprints left by upsampling operations. This is a new detection architecture, not a domain application of existing methods.
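To make the idea of an upsampling fingerprint concrete, the toy example below uses 2x nearest-neighbour upsampling as a stand-in for the upsampling layers of a generator. This is an assumption chosen because its effect can be derived exactly: repeating every pixel twice multiplies the spectrum by a factor that vanishes at the Nyquist frequency, so the Nyquist row and column of the upsampled image carry essentially no energy. Real generators use learned or transposed-convolution upsampling and leave different, less clean traces; nothing here reproduces the paper's detector.

```python
import numpy as np

def nyquist_energy(image):
    """Fraction of spectral energy on the Nyquist row and column (even sizes)."""
    h, w = image.shape
    power = np.abs(np.fft.fft2(image)) ** 2
    # Sum the Nyquist row and column, counting their shared bin once.
    on_nyquist = (power[h // 2, :].sum() + power[:, w // 2].sum()
                  - power[h // 2, w // 2])
    return float(on_nyquist / power.sum())

rng = np.random.default_rng(0)
native = rng.random((32, 32))      # stand-in for an image captured at full size
low_res = rng.random((16, 16))     # stand-in for a generator's coarse feature map
# Nearest-neighbour 2x upsampling: repeat every pixel along both axes.
upsampled = np.repeat(np.repeat(low_res, 2, axis=0), 2, axis=1)

native_score = nyquist_energy(native)
upsampled_score = nyquist_energy(upsampled)
```

Comparing `native_score` with `upsampled_score` shows the gap: the upsampled image has a spectral null at Nyquist that the native image lacks, which is one simple instance of a frequency-domain cue that does not depend on image content.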