defense 2025

D³QE: Learning Discrete Distribution Discrepancy-aware Quantization Error for Autoregressive-Generated Image Detection

Yanran Zhang, Bingyao Yu, Yu Zheng, Wenzhao Zheng, Yueqi Duan, Lei Chen, Jie Zhou, Jiwen Lu

1 citation · 58 references · arXiv


Published on arXiv: 2510.05891

Output Integrity Attack

OWASP ML Top 10 — ML09

Key Finding

D³QE achieves superior detection accuracy and strong generalization across 7 autoregressive visual models with robustness to real-world perturbations.

Novel technique introduced

D³QE (Discrete Distribution Discrepancy-aware Quantization Error)


The emergence of visual autoregressive (AR) models has revolutionized image generation while presenting new challenges for synthetic image detection. Unlike earlier GAN- or diffusion-based methods, AR models generate images through discrete token prediction, exhibiting both marked improvements in image synthesis quality and unique characteristics in their vector-quantized representations. In this paper, we propose Discrete Distribution Discrepancy-aware Quantization Error (D³QE) for autoregressive-generated image detection, which exploits the distinctive patterns and the codebook frequency distribution bias that separate real from fake images. We introduce a discrete distribution discrepancy-aware transformer that integrates dynamic codebook frequency statistics into its attention mechanism, fusing semantic features with the quantization error latent. To evaluate our method, we construct a comprehensive dataset, ARForensics, covering 7 mainstream visual AR models. Experiments demonstrate superior detection accuracy and strong generalization of D³QE across different AR models, with robustness to real-world perturbations. Code is available at https://github.com/Zhangyr2022/D3QE.
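The quantization error mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a plain nearest-neighbor vector quantizer, and the toy codebook, the latents, and the function names `nearest_code` and `quantize` are hypothetical, chosen only for illustration. Each continuous latent is snapped to its closest codebook entry, and the residual between the two is the quantization error.

```python
import math


def nearest_code(vec, codebook):
    """Return (index, squared distance) of the codebook entry closest to vec."""
    best_idx, best_dist = -1, math.inf
    for idx, code in enumerate(codebook):
        dist = sum((v - c) ** 2 for v, c in zip(vec, code))
        if dist < best_dist:
            best_idx, best_dist = idx, dist
    return best_idx, best_dist


def quantize(latents, codebook):
    """Map each latent to its nearest code; return token indices and residuals."""
    indices, residuals = [], []
    for vec in latents:
        idx, _ = nearest_code(vec, codebook)
        indices.append(idx)
        # Quantization error: continuous latent minus the code it was snapped to.
        residuals.append([v - c for v, c in zip(vec, codebook[idx])])
    return indices, residuals


# Toy values (assumptions, not from the paper): 3 codes in a 2-D latent space.
codebook = [[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]]
latents = [[0.1, -0.1], [0.9, 1.2], [1.8, 0.1]]
indices, residuals = quantize(latents, codebook)
```

In a real VQ tokenizer the latents would come from a learned encoder and the codebook would hold thousands of entries; the sketch only shows where the discrete token indices and the error term come from.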


Key Contributions

  • Identifies and exploits discrete distribution discrepancy and codebook frequency bias as discriminative forensic features unique to autoregressive-generated images.
  • Proposes a discrete distribution discrepancy-aware transformer that integrates dynamic codebook frequency statistics into its attention mechanism, fusing semantic and quantization error features.
  • Constructs ARForensics, a benchmark dataset covering 7 mainstream visual AR models for evaluating synthetic image detection.
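To make the idea of a codebook frequency discrepancy concrete, the sketch below counts how often each code index appears in two token sequences and compares the resulting distributions. The Jensen-Shannon divergence is used here only as a familiar stand-in for a discrepancy measure; the source does not state which measure the authors use, and the token sequences and function names are hypothetical.

```python
import math
from collections import Counter


def code_frequencies(indices, codebook_size):
    """Normalized usage frequency of every code index in a token sequence."""
    counts = Counter(indices)
    total = len(indices)
    return [counts.get(k, 0) / total for k in range(codebook_size)]


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with x_i == 0 contribute nothing; y_i > 0 whenever x_i > 0.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


# Hypothetical token sequences over a 4-entry codebook (not real data).
real_tokens = [0, 1, 2, 3, 0, 1, 2, 3]   # codes used evenly
fake_tokens = [0, 0, 0, 1, 0, 0, 1, 0]   # usage skewed toward code 0

p_real = code_frequencies(real_tokens, 4)
p_fake = code_frequencies(fake_tokens, 4)
discrepancy = js_divergence(p_real, p_fake)
```

With base-2 logarithms the divergence lies between 0 (identical usage) and 1 (disjoint usage), so a larger value indicates a stronger frequency bias between the two sets of tokens.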

🛡️ Threat Analysis

Output Integrity Attack

Proposes a novel AI-generated image detection method (D³QE) specifically engineered to detect synthetic images from visual autoregressive models by exploiting unique discrete distribution discrepancies and quantization error patterns in their VQ codebook representations. It serves as a forensic technique for output integrity and content authenticity.


Details

Domains
vision, generative
Model Types
transformer
Threat Tags
inference_time, digital
Datasets
ARForensics
Applications
ai-generated image detection, image forensics, deepfake detection