Defense · 2026

Attribution as Retrieval: Model-Agnostic AI-Generated Image Attribution

Hongsong Wang 1, Renxi Cheng 1, Chaolei Han 1, Jie Gui 1,2


Published on arXiv

arXiv:2603.10583

Output Integrity Attack

OWASP ML Top 10 — ML09

Key Finding

Achieves state-of-the-art performance on zero-shot and few-shot deepfake detection and cross-generator image attribution without requiring access to generative models.

LIDA (Low-bIt-plane-based Deepfake Attribution)

Novel technique introduced


With the rapid advancement of AIGC technologies, image forensics faces unprecedented challenges. Traditional methods cannot cope with the increasingly realistic images produced by rapidly evolving generation techniques. To support the identification of AI-generated images and the attribution of their source models, generative image watermarking and AI-generated image attribution have become key research focuses in recent years. However, existing methods are model-dependent: they require access to the generative models and lack generality and scalability to new and unseen generators. To address these limitations, this work presents a new paradigm for AI-generated image attribution by formulating it as an instance retrieval problem rather than a conventional image classification problem. We propose an efficient model-agnostic framework, called Low-bIt-plane-based Deepfake Attribution (LIDA). The input to LIDA is produced by the Low-Bit Fingerprint Generation module, while training consists of Unsupervised Pre-Training followed by Few-Shot Attribution Adaptation. Comprehensive experiments demonstrate that LIDA achieves state-of-the-art performance for both deepfake detection and image attribution under zero- and few-shot settings. The code is available at https://github.com/hongsong-wang/LIDA
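The abstract does not spell out how the Low-Bit Fingerprint Generation step works; a plausible reading is classic bit-plane slicing, where only the lowest-order bits of each 8-bit pixel are kept, since generator fingerprints tend to live in high-frequency noise rather than semantic content. The sketch below is a minimal illustration under that assumption; the function name and the number of retained planes are not from the paper.

```python
import numpy as np

def low_bit_fingerprint(image: np.ndarray, num_planes: int = 2) -> np.ndarray:
    """Keep only the lowest `num_planes` bit-planes of an 8-bit image.

    NOTE: illustrative assumption, not the paper's exact procedure.
    Low-order bits carry little semantic content but preserve the
    high-frequency noise where generator artifacts tend to concentrate.
    """
    mask = (1 << num_planes) - 1              # e.g. 0b11 for the two lowest planes
    low_bits = image.astype(np.uint8) & mask  # discard the high bit-planes
    # Rescale to [0, 255] so the fingerprint can be inspected as an image
    return (low_bits * (255 // mask)).astype(np.uint8)

# Toy example: a random 8-bit RGB "image"
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)
fp = low_bit_fingerprint(img, num_planes=2)
```

With two planes kept, every fingerprint pixel takes one of only four values (0, 85, 170, 255), so the result isolates fine-grained residue while discarding most of the visible image content.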


Key Contributions

  • Reformulates AI-generated image attribution as an instance retrieval problem, enabling model-agnostic operation without access to generative models
  • Proposes LIDA: a three-module pipeline combining Low-Bit Fingerprint Generation, Unsupervised Pre-Training, and Few-Shot Attribution Adaptation
  • Achieves SOTA zero- and few-shot deepfake detection and cross-generator attribution on GenImage and WildFake benchmarks
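The retrieval reformulation above is the core idea: instead of training a closed-set classifier over known generators, a query image's fingerprint embedding is matched against a small labeled gallery, so new generators only require adding a few gallery examples with no retraining. A minimal sketch of that attribution-by-retrieval step, assuming cosine-similarity k-NN (the specific distance, k, and function names are illustrative, not taken from the paper):

```python
import numpy as np

def attribute_by_retrieval(query_emb, gallery_embs, gallery_labels, k=5):
    """Attribute a query image to a source generator via k-NN retrieval.

    NOTE: a sketch of the retrieval formulation, not LIDA's exact method.
    Cosine similarity against a labeled gallery of fingerprint embeddings;
    the majority label among the top-k neighbors is returned.
    """
    q = np.asarray(query_emb, dtype=float)
    g = np.asarray(gallery_embs, dtype=float)
    q = q / np.linalg.norm(q)                              # unit-normalize query
    g = g / np.linalg.norm(g, axis=1, keepdims=True)       # unit-normalize gallery
    sims = g @ q                                           # cosine similarities
    topk = np.argsort(-sims)[:k]                           # indices of best matches
    labels, counts = np.unique(np.asarray(gallery_labels)[topk],
                               return_counts=True)
    return labels[np.argmax(counts)]                       # majority vote

# Toy gallery: two generators with well-separated embeddings
gallery = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
labels = ["StyleGAN2", "StyleGAN2", "SDv1.5", "SDv1.5"]
pred = attribute_by_retrieval(np.array([0.95, 0.05]), gallery, labels, k=3)
```

Because attribution reduces to a gallery lookup, the model-agnostic property follows directly: the embedding network never needs access to any generator's weights, only to a handful of its output images.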

🛡️ Threat Analysis

Output Integrity Attack

Directly addresses AI-generated image detection (deepfake detection) and source model attribution — core ML09 concerns of output integrity and content provenance. The paper proposes a novel forensic pipeline rather than merely applying existing methods.


Details

Domains
vision
Model Types
diffusion, gan, generative, transformer
Threat Tags
inference_time, black_box
Datasets
GenImage, WildFake
Applications
deepfake detection, AI-generated image attribution, image forensics