FeatDistill: A Feature Distillation Enhanced Multi-Expert Ensemble Framework for Robust AI-generated Image Detection
Zhilin Tu 1, Kemou Li 2, Fengpeng Li 3, Jianwei Fei 4, Jiamin Zhang 2, Haiwei Wu 1
1 University of Electronic Science and Technology of China
Published on arXiv
2603.21939
Output Integrity Attack
OWASP ML Top 10 — ML09
Key Finding
Achieves strong robustness and generalization under diverse in-the-wild degradations in the NTIRE challenge, with only a ~10 GB peak GPU memory footprint
FeatDistill
Novel technique introduced
The rapid iteration and widespread dissemination of deepfake technology have posed severe challenges to information security, making robust and generalizable detection of AI-generated forged images increasingly important. In this paper, we propose FeatDistill, an AI-generated image detection framework that integrates feature distillation with a multi-expert ensemble, developed for the NTIRE Challenge on Robust AI-Generated Image Detection in the Wild. The framework explicitly targets three practical bottlenecks in real-world forensics: degradation interference, insufficient feature representation, and limited generalization. Concretely, we build a four-backbone Vision Transformer (ViT) ensemble composed of CLIP and SigLIP variants to capture complementary forensic cues. To improve data coverage, we expand the training set and introduce comprehensive degradation modeling, which exposes the detector to diverse quality variations and synthesis artifacts commonly encountered in unconstrained scenarios. We further adopt a two-stage training paradigm: the model is first optimized with a standard binary classification objective, then refined by dense feature-level self-distillation for representation alignment. This design effectively mitigates overfitting and enhances semantic consistency of learned features. At inference time, the final prediction is obtained by averaging the probabilities from four independently trained experts, yielding stable and reliable decisions across unseen generators and complex degradations. Despite the ensemble design, the framework remains efficient, requiring only about 10 GB peak GPU memory. Extensive evaluations in the NTIRE challenge setting demonstrate that FeatDistill achieves strong robustness and generalization under diverse "in-the-wild" conditions, offering an effective and practical solution for real-world deepfake image detection.
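The inference step described above is straightforward soft voting: each of the four independently trained experts outputs a probability that an image is AI-generated, and the final decision averages them. A minimal sketch (function name and threshold are illustrative, not from the paper):

```python
def ensemble_predict(expert_probs, threshold=0.5):
    """Soft-voting ensemble: average per-expert probabilities that an
    image is AI-generated, then threshold the mean for the final label.

    expert_probs: list of floats in [0, 1], one per trained expert.
    Returns (mean_probability, is_ai_generated).
    """
    if not expert_probs:
        raise ValueError("need at least one expert prediction")
    mean_prob = sum(expert_probs) / len(expert_probs)
    return mean_prob, mean_prob >= threshold


# Example: four experts disagree mildly; the averaged score decides.
prob, is_fake = ensemble_predict([0.9, 0.8, 0.7, 0.6])
# prob == 0.75, is_fake == True
```

Averaging probabilities (rather than hard majority voting) preserves each expert's confidence, which tends to stabilize decisions when individual backbones are fooled by unseen generators or heavy degradations.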
Key Contributions
- Heterogeneous multi-expert ensemble combining CLIP ViT-L/14 and SigLIP So400M for complementary forensic feature extraction
- Comprehensive degradation modeling covering blur, noise, compression, and social-media artifacts to improve robustness under in-the-wild conditions
- Two-stage training with dense feature-level self-distillation for representation alignment and overfitting mitigation
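The dense feature-level self-distillation in the second training stage can be illustrated with a common formulation: align the student's per-token features with those of a frozen copy of the stage-one model by minimizing one minus cosine similarity, averaged over feature positions. This is a hedged sketch of one plausible alignment loss, not the paper's exact objective:

```python
import math

def feature_align_loss(student_feats, teacher_feats):
    """Dense feature alignment: mean (1 - cosine similarity) between
    corresponding student and teacher feature vectors.

    student_feats, teacher_feats: equal-length lists of feature
    vectors (lists of floats), e.g. one vector per ViT token.
    Hypothetical formulation for illustration.
    """
    if len(student_feats) != len(teacher_feats) or not student_feats:
        raise ValueError("feature lists must be non-empty and aligned")
    total = 0.0
    for s, t in zip(student_feats, teacher_feats):
        dot = sum(a * b for a, b in zip(s, t))
        norm_s = math.sqrt(sum(a * a for a in s))
        norm_t = math.sqrt(sum(b * b for b in t))
        total += 1.0 - dot / (norm_s * norm_t)
    return total / len(student_feats)


# Identical features give zero loss; orthogonal features give 1.0.
feature_align_loss([[1.0, 0.0]], [[1.0, 0.0]])  # -> 0.0
feature_align_loss([[1.0, 0.0]], [[0.0, 1.0]])  # -> 1.0
```

Distilling at the dense feature level, rather than only on logits, constrains the learned representation itself, which is consistent with the stated goals of representation alignment and overfitting mitigation.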
🛡️ Threat Analysis
The paper targets AI-generated image detection (deepfake detection). Determining whether content is AI-generated falls under output integrity and content authenticity, and the framework is designed to identify synthetic images across unseen generators and degradations, which is a core ML09 task: verifying content provenance and detecting AI-generated outputs.