SAIDO: Generalizable Detection of AI-Generated Images via Scene-Aware and Importance-Guided Dynamic Optimization in Continual Learning
Yongkang Hu 1,2, Yu Cheng 1,2, Yushuo Zhang 1,2, Yuan Xie 1,2, Zhaoxia Yin 1
Published on arXiv
2512.00539
Output Integrity Attack
OWASP ML Top 10 — ML09
Key Finding
Achieves 44.22% and 40.57% relative reductions in average detection error rate and forgetting rate over SOTA, and improves detection accuracy by 9.47% on open-world datasets under continual learning settings.
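The reductions above are relative, not absolute percentage points. A minimal sketch of how such a figure is typically computed follows; the function name `relative_reduction` and the input values are hypothetical illustrations, not numbers from the paper.

```python
def relative_reduction(baseline: float, ours: float) -> float:
    """Return the relative reduction of `ours` versus `baseline`, in percent."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - ours) / baseline * 100.0


# Hypothetical error rates (in percent), chosen only to illustrate the arithmetic.
baseline_error = 8.0
our_error = 6.0

# An absolute drop of 2 points from a baseline of 8 is a 25% relative reduction.
result = relative_reduction(baseline_error, our_error)
print(result)  # 25.0
```

Read this way, a 44.22% relative reduction means the error rate fell to roughly 56% of the baseline's value, whatever that baseline was.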
SAIDO
Novel technique introduced
The widespread misuse of image generation technologies has raised security concerns, driving the development of AI-generated image detection methods. However, generalization has become a key challenge and open problem: existing approaches struggle to adapt to emerging generative methods and content types in real-world scenarios. To address this issue, we propose a Scene-Aware and Importance-Guided Dynamic Optimization detection framework with continual learning (SAIDO). Specifically, we design a Scene-Awareness-Based Expert Module (SAEM) that dynamically identifies and incorporates new scenes using VLLMs. For each scene, independent expert modules are dynamically allocated, enabling the framework to better capture scene-specific forgery features and to enhance cross-scene generalization. To mitigate catastrophic forgetting when learning from multiple image generative methods, we introduce an Importance-Guided Dynamic Optimization Mechanism (IDOM), which optimizes each neuron through an importance-guided gradient projection strategy, thereby achieving an effective balance between model plasticity and stability. Extensive experiments on continual learning tasks demonstrate that our method outperforms the current SOTA method in both stability and plasticity, achieving 44.22% and 40.57% relative reductions in average detection error rate and forgetting rate, respectively. On open-world datasets, it improves the average detection accuracy by 9.47% compared to the current SOTA method.
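The per-scene expert allocation described above can be pictured with a small sketch. This is not the authors' implementation: the names `Expert`, `ExpertPool`, and `route` are hypothetical, the expert is an empty placeholder rather than a trained module, and the scene label is assumed to arrive as a plain string (in the paper it is produced by a VLLM).

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Expert:
    """Placeholder for a scene-specific detector module (hypothetical)."""
    scene: str
    seen_samples: int = 0


@dataclass
class ExpertPool:
    """Keeps one expert per scene and allocates new ones on demand."""
    experts: Dict[str, Expert] = field(default_factory=dict)

    def route(self, scene_label: str) -> Expert:
        # Normalize so that "Indoor " and "indoor" map to the same expert.
        key = scene_label.strip().lower()
        if key not in self.experts:
            # Unseen scene: allocate a fresh, independent expert.
            self.experts[key] = Expert(scene=key)
        expert = self.experts[key]
        expert.seen_samples += 1
        return expert


pool = ExpertPool()
# Assumption: labels are hard-coded here instead of coming from a VLLM.
for label in ["indoor", "Landscape", "indoor ", "portrait"]:
    pool.route(label)

scenes = sorted(pool.experts)
print(scenes)  # ['indoor', 'landscape', 'portrait']
```

The point of the sketch is only the control flow: known scenes reuse their expert, while an unseen scene triggers allocation of a new one without touching the existing experts.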
Key Contributions
- Scene-Awareness-Based Expert Module (SAEM) that uses VLLMs to dynamically identify new content scenes and allocate independent expert modules per scene for better cross-scene generalization
- Importance-Guided Dynamic Optimization Mechanism (IDOM) that applies importance-guided gradient projection to each neuron to balance plasticity and stability, mitigating catastrophic forgetting across sequentially learned generative methods
- Continual learning evaluation showing 44.22% and 40.57% relative reductions in detection error rate and forgetting rate over SOTA, plus 9.47% accuracy gain on open-world datasets
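To make the idea of importance-guided gradient projection more concrete, here is a minimal sketch under stated assumptions. The summary does not give IDOM's actual update rule, so the formula below is a generic stand-in: it removes a fraction `importance` of the gradient component that lies along a direction assumed to matter for earlier tasks. The names `project_gradient`, `protected_dir`, and `importance` are hypothetical.

```python
from typing import List, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def project_gradient(
    grad: Sequence[float],
    protected_dir: Sequence[float],
    importance: float,
) -> List[float]:
    """Damp the part of `grad` that lies along `protected_dir`.

    importance = 1.0 removes that component entirely (maximum stability);
    importance = 0.0 leaves the gradient unchanged (maximum plasticity).
    This is an illustrative rule, not the formula used by IDOM.
    """
    if not 0.0 <= importance <= 1.0:
        raise ValueError("importance must lie in [0, 1]")
    norm_sq = dot(protected_dir, protected_dir)
    if norm_sq == 0.0:
        # Nothing to protect, so the gradient passes through unchanged.
        return list(grad)
    scale = importance * dot(grad, protected_dir) / norm_sq
    return [g - scale * u for g, u in zip(grad, protected_dir)]


grad = [3.0, 4.0]
old_task_dir = [1.0, 0.0]

full = project_gradient(grad, old_task_dir, 1.0)
half = project_gradient(grad, old_task_dir, 0.5)
free = project_gradient(grad, old_task_dir, 0.0)
print(full, half, free)  # [0.0, 4.0] [1.5, 4.0] [3.0, 4.0]
```

With full importance the update becomes orthogonal to the protected direction, so it cannot disturb what that direction encodes; lower importance trades some of that stability for freedom to learn the new task.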
🛡️ Threat Analysis
The primary contribution is a novel architecture for AI-generated image detection; detecting synthetic or AI-produced content is a canonical ML09 (Output Integrity) task. The paper proposes a new forensic detection methodology (SAEM + IDOM) rather than merely applying existing detectors to a new domain.