IncreFA: Breaking the Static Wall of Generative Model Attribution

As AI generative models evolve at unprecedented speed, image attribution has become a moving target. New diffusion, adversarial and autoregressive generators appear almost monthly, making existing watermark, classifier and inversion methods obsolete upon release. The core problem lies not in model recognition, but in the inability to adapt attribution itself. We introduce IncreFA, a framework that redefines attribution as a structured incremental learning problem, allowing the system to learn continuously as new generative models emerge. IncreFA departs from conventional incremental learning by exploiting the hierarchical relationships among generative architectures and coupling them with continual adaptation. It integrates two mutually reinforcing mechanisms: (1) Hierarchical Constraints, which encode architectural hierarchies through learnable orthogonal priors to disentangle family-level invariants from model-specific idiosyncrasies; and (2) a Latent Memory Bank, which replays compact latent exemplars and mixes them to generate pseudo-unseen samples, stabilising representation drift and enhancing open-set awareness. On the newly constructed Incremental Attribution Benchmark (IABench) covering 28 generative models released between 2022 and 2025, IncreFA achieves state-of-the-art attribution accuracy and 98.93% unseen detection under a temporally ordered open-set protocol. Code will be available at https://github.com/Ant0ny44/IncreFA.
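The abstract does not state the exact form of the learnable orthogonal priors. One common way to encourage a set of learnable vectors to stay mutually orthogonal is a soft penalty on their Gram matrix. The sketch below illustrates that idea only. It assumes one prior vector per architecture family, and the function names `gram` and `orthogonality_penalty` are hypothetical, not taken from the IncreFA code.

```python
import math


def gram(rows):
    """Gram matrix of a list of equal-length vectors (pairwise dot products)."""
    return [[sum(a * b for a, b in zip(u, v)) for v in rows] for u in rows]


def orthogonality_penalty(priors):
    """Squared Frobenius norm of (P P^T - I) for row-normalised priors.

    ASSUMPTION: `priors` holds one non-zero vector per generator family.
    The penalty is 0 when the normalised vectors are mutually orthogonal
    and grows as any two of them align.
    """
    normed = []
    for p in priors:
        norm = math.sqrt(sum(x * x for x in p))
        normed.append([x / norm for x in p])
    g = gram(normed)
    k = len(g)
    return sum(
        (g[i][j] - (1.0 if i == j else 0.0)) ** 2
        for i in range(k)
        for j in range(k)
    )


if __name__ == "__main__":
    orthogonal = [[1.0, 0.0], [0.0, 1.0]]  # two families, disentangled
    collapsed = [[2.0, 0.0], [3.0, 0.0]]   # two families, same direction
    print(orthogonality_penalty(orthogonal))
    print(orthogonality_penalty(collapsed))
```

In a training loop, a term like this would typically be added to the classification loss with a small weight, so that family-level directions stay separated while model-specific features are learned on top of them.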

Key Contributions

A Hierarchical Constraints mechanism that encodes architectural hierarchies through learnable orthogonal priors, disentangling family-level invariants from model-specific features
A Latent Memory Bank that replays compact latent exemplars and mixes them to generate pseudo-unseen samples, stabilising representation drift and supporting open-set detection
IABench, a benchmark covering 28 generative models (2022-2025) with a temporally ordered open-set evaluation protocol
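The replay-and-mix idea behind the Latent Memory Bank can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the class name `LatentMemoryBank`, the fixed per-class budget, the random-overwrite policy, and the mixing range `[0.3, 0.7]` are all hypothetical choices. Pseudo-unseen samples are formed here as convex combinations of latents drawn from two different known generators.

```python
import random


class LatentMemoryBank:
    """Stores a few latent exemplars per known generator (hypothetical design)."""

    def __init__(self, per_class, seed=0):
        self.per_class = per_class
        self.store = {}  # generator label -> list of latent vectors
        self.rng = random.Random(seed)

    def add(self, label, latent):
        """Keep the latent; once the class budget is full, overwrite a random slot."""
        slots = self.store.setdefault(label, [])
        if len(slots) < self.per_class:
            slots.append(list(latent))
        else:
            slots[self.rng.randrange(self.per_class)] = list(latent)

    def replay(self, n):
        """Return up to n stored (label, latent) pairs for rehearsal."""
        pool = [(lab, z) for lab, zs in self.store.items() for z in zs]
        return self.rng.sample(pool, min(n, len(pool)))

    def pseudo_unseen(self, n, low=0.3, high=0.7):
        """Mix latents from two different classes to imitate unseen generators."""
        labels = list(self.store)
        if len(labels) < 2:
            raise ValueError("need exemplars from at least two generators")
        mixed = []
        for _ in range(n):
            a, b = self.rng.sample(labels, 2)
            za = self.rng.choice(self.store[a])
            zb = self.rng.choice(self.store[b])
            lam = self.rng.uniform(low, high)  # ASSUMPTION: uniform mixing weight
            mixed.append([lam * x + (1.0 - lam) * y for x, y in zip(za, zb)])
        return mixed


if __name__ == "__main__":
    bank = LatentMemoryBank(per_class=2)
    bank.add("diffusion_model_a", [0.0, 0.0])
    bank.add("gan_model_b", [1.0, 1.0])
    print(bank.replay(4))
    print(bank.pseudo_unseen(3))
```

Keeping the mixing weight away from 0 and 1 means each mixed latent sits between two known classes rather than on top of one, which is what makes it usable as a stand-in for an unseen generator when training an open-set rejection head.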

🛡️ Threat Analysis

Output Integrity Attack

The core contribution is attributing AI-generated images to their source generative models (diffusion, GAN, autoregressive) in order to trace content provenance. This is a matter of output integrity and content authenticity, not model theft detection. The specific challenge addressed is maintaining attribution accuracy as new generative models emerge.

Details

Domains

vision, generative

Model Types

diffusion, gan, transformer

Threat Tags

inference_time

Datasets

IABench

Applications

2026, 0 citations

Similar Papers

Detecting AI-Generated Images via Distributional Deviations from Real Images

Semantic-Aware Reconstruction Error for Detecting AI-Generated Images

Detecting Generated Images by Fitting Natural Image Distributions

NS-Net: Decoupling CLIP Semantic Information through NULL-Space for Generalizable AI-Generated Image Detection

Generalizable and Adaptive Continual Learning Framework for AI-generated Image Detection

HEDGE: Heterogeneous Ensemble for Detection of AI-GEnerated Images in the Wild

Training-free Detection of AI-generated images via Cropping Robustness

Aggregating Diverse Cue Experts for AI-Generated Image Detection