tool 2025

Redefining Generalization in Visual Domains: A Two-Axis Framework for Fake Image Detection with FusionDetect

Amirtaha Amanzadi , Zahra Dehghanian , Hamid Beigy , Hamid R. Rabiee

0 citations · 70 references · arXiv


Published on arXiv

2510.05740

Output Integrity Attack

OWASP ML Top 10 — ML09

Key Finding

FusionDetect outperforms its closest competitor by +3.87% accuracy and +6.13% average precision on established benchmarks, and gains +4.48% accuracy on the new OmniGen cross-domain benchmark.

FusionDetect

Novel technique introduced


The rapid development of generative models has made it increasingly important to build detectors that can reliably identify synthetic images. Most prior work has focused on cross-generator generalization, but we argue that this viewpoint is too limited. Detecting synthetic images involves another, equally important challenge: generalization across visual domains. To bridge this gap, we present the OmniGen Benchmark, a comprehensive evaluation dataset that incorporates 12 state-of-the-art generators and enables evaluation of detector performance under more realistic conditions. In addition, we introduce a new method, FusionDetect, aimed at addressing both axes of generalization. FusionDetect draws on the strengths of two frozen foundation models, CLIP and DINOv2. By deriving features from these two complementary models, we build a cohesive feature space that naturally adapts to changes in both the content and the design of the generator. Our extensive experiments demonstrate that FusionDetect sets a new state of the art on established benchmarks, with 3.87% higher accuracy and 6.13% higher average precision than its closest competitor. It also achieves a 4.48% increase in accuracy on OmniGen, along with exceptional robustness to common image perturbations. We introduce not only a top-performing detector, but also a new benchmark and framework for advancing universal AI image detection. The code and dataset are available at http://github.com/amir-aman/FusionDetect
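The fusion idea in the abstract can be illustrated with a minimal sketch. The summary does not specify the fusion rule or the classifier, so the code below assumes the simplest option: concatenating L2-normalized embeddings from two frozen encoders and training only a small logistic-regression head on top. The functions `fuse` and `train_head` are hypothetical names, and the hand-made vectors merely stand in for CLIP and DINOv2 outputs; the authors' actual implementation may differ.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length (returned unchanged if all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def fuse(semantic_feat, structural_feat):
    """Assumed fusion rule: concatenate the two L2-normalized embeddings."""
    return l2_normalize(semantic_feat) + l2_normalize(structural_feat)

def score(weights, bias, feat):
    """Linear score of the classification head on a fused feature."""
    return sum(w * x for w, x in zip(weights, feat)) + bias

def train_head(feats, labels, lr=0.5, epochs=200):
    """Fit a logistic-regression head with plain SGD; encoders stay frozen."""
    weights, bias = [0.0] * len(feats[0]), 0.0
    for _ in range(epochs):
        for feat, label in zip(feats, labels):
            prob = 1.0 / (1.0 + math.exp(-score(weights, bias, feat)))
            grad = prob - label
            weights = [w - lr * grad * x for w, x in zip(weights, feat)]
            bias -= lr * grad
    return weights, bias

# Hand-made stand-ins for encoder outputs (NOT real CLIP or DINOv2 embeddings).
# The "semantic" part is identical across classes; only the "structural" part differs.
real = [fuse([1.0, 0.1 * i], [1.0, 0.05 * i]) for i in range(4)]
fake = [fuse([1.0, 0.1 * i], [0.05 * i, 1.0]) for i in range(4)]
feats = real + fake
labels = [0] * 4 + [1] * 4  # 0 = real, 1 = generated

weights, bias = train_head(feats, labels)
preds = [1 if score(weights, bias, f) > 0 else 0 for f in feats]
```

In this toy setup the head can only separate the classes through the second half of the fused vector, which is the motivation for combining complementary encoders: a cue missing from one feature space may still be present in the other.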


Key Contributions

  • FusionDetect: a novel AI-generated image detector fusing CLIP (semantic) and DINOv2 (structural/textural) frozen foundation model features to generalize across both generators and visual domains
  • OmniGen Benchmark: a new evaluation dataset spanning 12 SOTA generators designed to test cross-generator and cross-visual-domain generalization simultaneously
  • Two-axis generalization framework that extends the conventional cross-generator axis with a second, cross-visual-domain axis for more realistic detector evaluation
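A two-axis evaluation of the kind described in the contributions above can be sketched by grouping detector predictions once by generator and once by visual domain. The record layout, the function `accuracy_by_axis`, and the placeholder names `gen_A` and `domain_X` are assumptions made for illustration; they are not taken from the OmniGen Benchmark, and the predictions are toy values rather than reported results.

```python
from collections import defaultdict

def accuracy_by_axis(records, axis):
    """Return accuracy per distinct value of `axis` ("generator" or "domain")."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for rec in records:
        key = rec[axis]
        total[key] += 1
        correct[key] += int(rec["label"] == rec["pred"])
    return {key: correct[key] / total[key] for key in total}

# Toy predictions with placeholder generator and domain names (not real data).
records = [
    {"generator": "gen_A", "domain": "domain_X", "label": 1, "pred": 1},
    {"generator": "gen_A", "domain": "domain_Y", "label": 1, "pred": 0},
    {"generator": "gen_B", "domain": "domain_X", "label": 1, "pred": 1},
    {"generator": "gen_B", "domain": "domain_Y", "label": 0, "pred": 0},
]

per_generator = accuracy_by_axis(records, "generator")
per_domain = accuracy_by_axis(records, "domain")
```

Reporting both breakdowns side by side makes it visible when a detector that looks uniform across generators degrades on a particular visual domain, or the reverse.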

🛡️ Threat Analysis

Output Integrity Attack

FusionDetect is a novel AI-generated image detection architecture that directly addresses output integrity and content authenticity by detecting synthetic images from 12 SOTA generators. The OmniGen Benchmark evaluates detectors along both the cross-generator and cross-visual-domain axes. Both contributions squarely target ML09: authenticating and verifying the provenance of AI-generated content.


Details

Domains
vision, generative
Model Types
transformer, diffusion, gan
Threat Tags
inference_time
Datasets
OmniGen (proposed), GenImage, ImagineNet, AIDE
Applications
ai-generated image detection, deepfake detection, content authenticity verification