Aggregating Diverse Cue Experts for AI-Generated Image Detection
Lei Tan 1,2, Shuwei Li 1, Mohan Kankanhalli 1, Robby T. Tan 1,2
Published on arXiv
2601.08790
Output Integrity Attack
OWASP ML Top 10 — ML09
Key Finding
MCAN outperforms the best prior method by up to 7.4% average accuracy on GenImage across eight different image generators.
MCAN (Multi-Cue Aggregation Network)
Novel technique introduced
The rapid emergence of image synthesis models challenges the generalization of AI-generated image detectors: existing methods often rely on model-specific features, leading to overfitting and poor cross-generator performance. In this paper, we introduce the Multi-Cue Aggregation Network (MCAN), a novel framework that integrates distinct yet complementary cues in a unified network. MCAN employs a mixture-of-encoders adapter to dynamically process these cues, enabling more adaptive and robust feature representation. Our cues include the input image itself, which represents the overall content, and high-frequency components that emphasize edge details. Additionally, we introduce a Chromaticity Inconsistency (CI) cue, which normalizes intensity values and captures the noise introduced during image acquisition in real images, making these noise patterns more distinguishable from those in AI-generated content. Unlike prior methods, MCAN's novelty lies in its unified multi-cue aggregation framework, which combines spatial, frequency-domain, and chromaticity-based information for enhanced representation learning. Because these cues are intrinsically more indicative of real images, they enhance cross-model generalization. Extensive experiments on the GenImage, Chameleon, and UniversalFakeDetect benchmarks validate the state-of-the-art performance of MCAN. On GenImage, MCAN outperforms the best prior method by up to 7.4% in average accuracy across eight different image generators.
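The three cues described above can be sketched in a few lines of NumPy. This is an illustrative stand-in, not the paper's implementation: the exact high-frequency filter and chromaticity transform used by MCAN are not specified here, so a simple high-pass residual and a per-pixel intensity normalization are assumed.

```python
import numpy as np

def high_frequency_cue(img):
    """High-pass residual: image minus a 3x3 box blur.

    ASSUMPTION: a generic high-pass filter standing in for the
    paper's (unspecified) frequency-domain transform.
    """
    padded = np.pad(img, ((1, 1), (1, 1), (0, 0)), mode="edge")
    blurred = np.zeros_like(img, dtype=np.float64)
    h, w = img.shape[:2]
    for dy in range(3):
        for dx in range(3):
            blurred += padded[dy:dy + h, dx:dx + w, :]
    blurred /= 9.0
    return img.astype(np.float64) - blurred

def chromaticity_cue(img, eps=1e-8):
    """Normalize out intensity: divide each pixel by its channel sum.

    Removing overall brightness makes residual acquisition-noise
    patterns easier to compare across real and generated images.
    """
    img = img.astype(np.float64)
    intensity = img.sum(axis=-1, keepdims=True)
    return img / (intensity + eps)

# Toy RGB image; in practice these cues feed separate encoders.
rng = np.random.default_rng(0)
img = rng.integers(1, 256, size=(8, 8, 3)).astype(np.float64)
spatial = img                     # cue 1: raw content
hf = high_frequency_cue(img)      # cue 2: edge/detail emphasis
chroma = chromaticity_cue(img)    # cue 3: intensity-normalized
```

Each cue keeps the spatial layout of the input, so all three can be processed by image encoders of the same shape before aggregation.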
Key Contributions
- Chromaticity Inconsistency (CI) cue: a novel representation that applies chromaticity transformation to expose camera-noise patterns absent in AI-generated images
- MCAN framework with Mixture-of-Encoder Adapter (MoEA) that dynamically integrates spatial, frequency-domain, and chromaticity cues into a unified feature space
- State-of-the-art results on GenImage, Chameleon, and UniversalFakeDetect benchmarks, outperforming best prior method by up to 7.4% average accuracy across eight generators
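The Mixture-of-Encoder Adapter in the second contribution can be pictured as gated expert aggregation. The sketch below is a minimal assumption-laden illustration, not MCAN's architecture: it fixes the gating logits as an argument, whereas a learned adapter would predict them from the input.

```python
import numpy as np

def moea_aggregate(cue_features, gate_logits):
    """Softmax-gated aggregation of per-cue expert features.

    cue_features: (n_cues, d) array, one feature vector per cue
        encoder (spatial, frequency, chromaticity).
    gate_logits: (n_cues,) raw gating scores. ASSUMPTION: passed in
        directly here; in a trained model they are input-dependent.
    Returns the unified (d,) feature and the gate weights.
    """
    logits = np.asarray(gate_logits, dtype=np.float64)
    weights = np.exp(logits - logits.max())   # stable softmax
    weights /= weights.sum()
    fused = weights @ np.asarray(cue_features, dtype=np.float64)
    return fused, weights

# Three 4-dim expert features, fused into one vector.
feats = np.stack([np.full(4, 1.0), np.full(4, 2.0), np.full(4, 3.0)])
fused, w = moea_aggregate(feats, [0.0, 1.0, -1.0])
```

The dynamic weighting is the point: when one cue is uninformative for a given image (e.g. heavily compressed high-frequency content), its gate weight can shrink rather than polluting the unified feature.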
🛡️ Threat Analysis
The primary contribution is a novel AI-generated image detection framework (MCAN) that directly addresses output integrity and content authenticity by distinguishing synthetic images from real ones across both GAN- and diffusion-based generators.