defense 2025

Beyond Binary Classification: A Semi-supervised Approach to Generalized AI-generated Image Detection

Hong-Hanh Nguyen-Le 1,2, Van-Tuan Tran 3, Dinh-Thuc Nguyen 3, Nhien-An Le-Khac 1,2

0 citations · 63 references · arXiv

Published on arXiv

2511.19499

Output Integrity Attack

OWASP ML Top 10 — ML09

Key Finding

TriDetect improves cross-generator generalization to unseen generators — particularly across GAN-to-diffusion architectural boundaries — outperforming 13 baselines on 5 datasets by exploiting latent architectural clustering rather than surface-level artifacts

TriDetect

Novel technique introduced


The rapid advancement of generators (e.g., StyleGAN, Midjourney, DALL-E) has produced highly realistic synthetic images, posing significant challenges to digital media authenticity. These generators are typically based on a few core architectural families, primarily Generative Adversarial Networks (GANs) and Diffusion Models (DMs). A critical vulnerability in current forensics is the failure of detectors to achieve cross-generator generalization, especially when crossing architectural boundaries (e.g., from GANs to DMs). We hypothesize that this gap stems from fundamental differences in the artifacts produced by these distinct architectures. In this work, we provide a theoretical analysis explaining how the distinct optimization objectives of the GAN and DM architectures lead to different manifold coverage behaviors. We demonstrate that GANs permit partial coverage, often leading to boundary artifacts, while DMs enforce complete coverage, resulting in over-smoothing patterns. Motivated by this analysis, we propose the Triarchy Detector (TriDetect), a semi-supervised approach that enhances binary classification by discovering latent architectural patterns within the "fake" class. TriDetect employs balanced cluster assignment via the Sinkhorn-Knopp algorithm and a cross-view consistency mechanism, encouraging the model to learn fundamental architectural distinctions. We evaluate our approach on two standard benchmarks and three in-the-wild datasets against 13 baselines to demonstrate its generalization capability to unseen generators.
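
To make the balanced cluster assignment step concrete, the following is a minimal NumPy sketch of Sinkhorn-Knopp normalization as it is commonly used for balanced soft clustering. It is an illustration under stated assumptions, not the TriDetect implementation: the function name `sinkhorn_knopp`, the temperature `epsilon`, the iteration count `n_iters`, and the choice of three clusters in the usage lines are all hypothetical and do not come from the paper.

```python
import numpy as np


def sinkhorn_knopp(scores, epsilon=0.05, n_iters=3):
    """Turn (n_samples, n_clusters) scores into balanced soft assignments.

    Hypothetical helper for illustration. Assumes the scores are bounded
    (e.g. cosine similarities in [-1, 1]) so the exponential below does
    not underflow to an all-zero row or column.
    """
    scores = np.asarray(scores, dtype=np.float64)
    n_samples, n_clusters = scores.shape

    # Work on the transposed matrix: rows are clusters, columns are samples.
    q = np.exp((scores - scores.max()) / epsilon).T
    q /= q.sum()

    for _ in range(n_iters):
        # Give every cluster the same total mass (1 / n_clusters).
        q /= q.sum(axis=1, keepdims=True)
        q /= n_clusters
        # Give every sample the same total mass (1 / n_samples).
        q /= q.sum(axis=0, keepdims=True)
        q /= n_samples

    # Rescale so each sample's assignment vector sums to 1.
    return (q * n_samples).T


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    logits = rng.uniform(-1.0, 1.0, size=(12, 3))  # 12 fake samples, 3 assumed clusters
    assignments = sinkhorn_knopp(logits)
    print(assignments.sum(axis=1))  # per-sample totals
    print(assignments.sum(axis=0))  # per-cluster totals
```

The alternating row and column normalization is what discourages the degenerate solution in which every fake image collapses into a single cluster. With only a few iterations the per-cluster totals are approximately, not exactly, equal.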


Key Contributions

  • Theoretical analysis showing GAN and diffusion model optimization objectives (JS vs. KL divergence) lead to fundamentally different latent manifold coverage patterns, explaining the cross-architecture generalization gap in detectors
  • TriDetect: a semi-supervised detector that jointly performs binary real/fake classification and unsupervised clustering of latent architectural patterns using the Sinkhorn-Knopp algorithm and a cross-view consistency mechanism
  • Comprehensive evaluation on two standard benchmarks and three in-the-wild datasets against 13 baselines, demonstrating improved cross-generator generalization particularly across GAN-to-DM architectural boundaries
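
For reference, the two divergences named in the first contribution have the standard definitions below. This is a textbook sketch rather than the paper's own derivation; in particular, pairing the forward direction $\mathrm{KL}(P_{\text{data}} \,\|\, P_\theta)$ with diffusion models is an assumption made here, based on their likelihood-style training objective.

```latex
\begin{align}
\mathrm{KL}(P \,\|\, Q) &= \mathbb{E}_{x \sim P}\!\left[\log \frac{p(x)}{q(x)}\right], \\
\mathrm{JS}(P \,\|\, Q) &= \tfrac{1}{2}\,\mathrm{KL}(P \,\|\, M) + \tfrac{1}{2}\,\mathrm{KL}(Q \,\|\, M),
\qquad M = \tfrac{1}{2}(P + Q), \\
0 \le \mathrm{JS}(P \,\|\, Q) &\le \log 2, \\
\mathrm{KL}(P_{\text{data}} \,\|\, P_\theta) &= \infty
\quad \text{if } P_\theta(A) = 0 \text{ for some set } A \text{ with } P_{\text{data}}(A) > 0 .
\end{align}
```

Because the JS divergence is bounded, a generator that misses part of the data manifold pays only a finite penalty, which is consistent with partial coverage. The forward KL divergence is unbounded when the model assigns zero mass to a region the data occupies, which pushes the model toward complete coverage.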

🛡️ Threat Analysis

Output Integrity Attack

Directly proposes a novel AI-generated image detection method (TriDetect) addressing content authenticity — detecting whether images are real or synthetically generated by GANs or diffusion models. The paper's primary contribution is a detection architecture for AI-generated content, which falls squarely under output integrity and content provenance.


Details

Domains
vision, generative
Model Types
gan, diffusion, transformer
Threat Tags
inference_time, black_box
Datasets
AIGCDetectBenchmark
Applications
ai-generated image detection, deepfake detection, digital media forensics