defense 2026

Diversity over Uniformity: Rethinking Representation in Generated Image Detection

Qinghui He 1,2, Haifeng Zhang 1, Qiao Qin 1, Bo Liu 1,3, Xiuli Bi 1,3, Bin Xiao 1

Published on arXiv

arXiv:2603.00717

Output Integrity Attack

OWASP ML Top 10 — ML09

Key Finding

Achieves a 5.02% accuracy improvement over the state of the art in cross-model generated-image detection, demonstrating superior generalization to unseen generative mechanisms.

DoU (Diversity over Uniformity)

Novel technique introduced


With the rapid advancement of generative models, generated image detection has become an important task in visual forensics. Although existing methods have achieved remarkable progress, after training they often rely on only a small subset of highly salient forgery cues, which limits their ability to generalize to unseen generative mechanisms. We argue that reliable generated image detection should not depend on a single decision path but should preserve multiple judgment perspectives, enabling the model to understand the differences between real and generated images from diverse viewpoints. Based on this idea, we propose an anti-feature-collapse learning framework that filters out task-irrelevant components and suppresses excessive overlap among different forgery cues in the representation space, preventing discriminative information from collapsing into a few dominant feature directions. This design maintains diverse and complementary evidence within the model, reduces reliance on a small set of salient cues, and enhances robustness under unseen generative settings. Extensive experiments on multiple public benchmarks demonstrate that the proposed method significantly outperforms state-of-the-art approaches in cross-model scenarios, achieving an accuracy improvement of 5.02% and exhibiting superior generalization and detection reliability. The source code is available at https://github.com/Yanmou-Hui/DoU.
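The abstract's core mechanism — suppressing excessive overlap among forgery cues so discriminative information does not collapse into a few dominant feature directions — can be illustrated with a generic decorrelation penalty. This is a sketch only: the function name, the correlation-based formulation, and all parameters below are assumptions for illustration, not the authors' actual DoU loss (which is defined in the paper and its repository, not in this summary).

```python
import numpy as np

def anti_collapse_penalty(features: np.ndarray) -> float:
    """Mean squared off-diagonal entry of the feature correlation matrix.

    A high value means many feature dimensions are redundant, i.e. cues
    have collapsed onto a few shared directions; minimizing this term
    during training encourages diverse, complementary cues.

    features: array of shape (n_samples, n_dims).
    """
    # Standardize each dimension so the Gram matrix becomes a correlation matrix.
    z = features - features.mean(axis=0, keepdims=True)
    z = z / (z.std(axis=0, keepdims=True) + 1e-8)
    n, d = z.shape
    corr = (z.T @ z) / n                      # (d, d) correlation matrix
    off_diag = corr - np.diag(np.diag(corr))  # zero out the diagonal
    return float((off_diag ** 2).sum() / (d * (d - 1)))
```

On a batch of independent random features this penalty is near zero, while near-duplicated feature dimensions (a collapsed representation) drive it toward one, matching the intuition that the regularizer punishes redundancy rather than discriminative power.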


Key Contributions

  • Anti-feature-collapse learning framework that filters task-irrelevant components and suppresses excessive feature overlap to prevent discriminative cues from collapsing into a few dominant directions
  • Principled argument that robust generated image detection requires diverse, complementary forgery cues rather than reliance on a single salient decision path
  • A 5.02% accuracy improvement over the state of the art on cross-model generalization benchmarks

🛡️ Threat Analysis

Output Integrity Attack

Paper directly addresses AI-generated image detection (visual forensics), proposing a novel detector architecture — this is core ML09 content provenance and output integrity work. The framework improves detection generalization to unseen generative models.


Details

Domains
vision, generative
Model Types
cnn, transformer, diffusion, gan
Threat Tags
inference_time
Applications
generated image detection, visual forensics, deepfake detection