defense 2026

Generalizable and Adaptive Continual Learning Framework for AI-generated Image Detection

Hanyi Wang 1, Jun Lan 2, Yaoyu Kang 1, Huijia Zhu 2, Weiqiang Wang 2, Zhuosheng Zhang 1, Shilin Wang 1

0 citations · 62 references · IEEE Transactions on Multimedia


Published on arXiv: 2601.05580

Output Integrity Attack

OWASP ML Top 10 — ML09

Key Finding

The proposed continual learning framework achieves 92.20% average accuracy across evolving generative models, with the offline detector surpassing the leading baseline by +5.51% in mean average precision.


The malicious misuse and widespread dissemination of AI-generated images pose a significant threat to the authenticity of online information. Current detection methods often struggle to generalize to unseen generative models, and the rapid evolution of generative techniques continuously exacerbates this challenge. Without adaptability, detection models risk becoming ineffective in real-world applications. To address this critical issue, we propose a novel three-stage domain continual learning framework designed for continuous adaptation to evolving generative models. In the first stage, we employ a strategic parameter-efficient fine-tuning approach to develop a transferable offline detection model with strong generalization capabilities. Building upon this foundation, the second stage integrates unseen data streams into a continual learning process. To efficiently learn from limited samples of novel generative models and mitigate overfitting, we design a data augmentation chain with progressively increasing complexity. Furthermore, we leverage the Kronecker-Factored Approximate Curvature (K-FAC) method to approximate the Hessian and alleviate catastrophic forgetting. Finally, the third stage utilizes a linear interpolation strategy based on Linear Mode Connectivity, effectively capturing commonalities across diverse generative models and further enhancing overall performance. We establish a comprehensive benchmark of 27 generative models, including GANs, deepfakes, and diffusion models, chronologically structured up to August 2024 to simulate real-world scenarios. Extensive experiments demonstrate that our initial offline detectors surpass the leading baseline by +5.51% in terms of mean average precision. Our continual learning strategy achieves an average accuracy of 92.20%, outperforming state-of-the-art methods.
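The third stage's Linear Mode Connectivity step amounts to a convex combination of two checkpoints' weights. A minimal sketch of that interpolation (function name, toy vectors, and the fixed coefficient are illustrative; the paper's actual coefficient selection is not specified here):

```python
def interpolate_weights(theta_a, theta_b, alpha):
    """Linear interpolation between two flattened parameter vectors:
    theta(alpha) = (1 - alpha) * theta_a + alpha * theta_b, elementwise."""
    return [(1.0 - alpha) * a + alpha * b for a, b in zip(theta_a, theta_b)]

# Toy checkpoints standing in for the pre- and post-adaptation detectors.
theta_old = [0.2, -1.0, 0.5]
theta_new = [0.6, -0.2, 0.1]
theta_mid = interpolate_weights(theta_old, theta_new, 0.5)
```

Under Linear Mode Connectivity, points along this segment stay in a low-loss region, so the midpoint can retain performance on both the old and newly adapted domains.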


Key Contributions

  • Three-stage continual learning framework for AI-generated image detection that adapts to evolving generative models using parameter-efficient fine-tuning, K-FAC Hessian approximation to mitigate catastrophic forgetting, and Linear Mode Connectivity interpolation
  • Data augmentation chain with progressively increasing complexity to handle few-shot adaptation to novel generative models
  • Comprehensive benchmark of 27 generative models (GANs, deepfakes, diffusion models) chronologically ordered up to August 2024 for realistic evaluation
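The K-FAC component above penalizes parameter drift in proportion to an approximation of the loss curvature. As a rough illustration, a diagonal-Fisher simplification (not the Kronecker-factored form the paper uses; all names and values here are hypothetical) of such a forgetting penalty looks like:

```python
def curvature_penalty(theta, theta_star, fisher_diag, lam=1.0):
    """0.5 * lam * sum_i F_ii * (theta_i - theta_star_i)^2:
    drift along high-curvature (high-Fisher) directions is penalized more,
    discouraging the new-task update from forgetting the old optimum."""
    return 0.5 * lam * sum(
        f * (t - ts) ** 2 for f, t, ts in zip(fisher_diag, theta, theta_star)
    )

# theta_star: parameters after the previous task; theta: current parameters.
penalty = curvature_penalty(
    theta=[1.0, 2.0], theta_star=[0.5, 2.0], fisher_diag=[2.0, 1.0]
)
```

K-FAC refines this idea by approximating the Hessian/Fisher as a Kronecker product of per-layer factors rather than a diagonal, capturing correlations between parameters within a layer.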

🛡️ Threat Analysis

Output Integrity Attack

The paper's primary contribution is detecting AI-generated image content (GANs, deepfakes, diffusion model outputs) — a direct output integrity and content authenticity problem. Novel contributions include a continual learning framework with K-FAC forgetting mitigation and Linear Mode Connectivity, making this a methodological advance in AI-generated content detection, not just a domain application of existing detectors.


Details

Domains
vision, generative
Model Types
transformer, GAN, diffusion
Threat Tags
inference_time
Datasets
Custom benchmark (27 generative models, GAN/deepfake/diffusion, up to Aug 2024)
Applications
AI-generated image detection, deepfake detection, synthetic image forensics