
HEDGE: Heterogeneous Ensemble for Detection of AI-GEnerated Images in the Wild

Fei Wu 1, Dagong Lu 2, Mufeng Yao 2, Xinlei Xu 2, Fengjun Guo 2


Published on arXiv (2604.03555)

Output Integrity Attack

OWASP ML Top 10 — ML09

Key Finding

Achieves 4th place in the NTIRE 2026 Robust AI-Generated Image Detection in the Wild Challenge, with state-of-the-art robustness across multiple AIGC benchmarks.

Novel technique introduced: HEDGE


Robust detection of AI-generated images in the wild remains challenging due to the rapid evolution of generative models and varied real-world distortions. We argue that relying on a single training regime, resolution, or backbone is insufficient to handle all conditions, and that structured heterogeneity across these dimensions is essential for robust detection. To this end, we propose HEDGE, a Heterogeneous Ensemble for Detection of AI-GEnerated images, that introduces complementary detection routes along three axes: diverse training data with strong augmentation, multi-scale feature extraction, and backbone heterogeneity. Specifically, Route~A progressively constructs DINOv3-based detectors through staged data expansion and augmentation escalation, Route~B incorporates a higher-resolution branch for fine-grained forensic cues, and Route~C adds a MetaCLIP2-based branch for backbone diversity. All outputs are fused via logit-space weighted averaging, refined by a lightweight dual-gating mechanism that handles branch-level outliers and majority-dominated fusion errors. HEDGE achieves 4th place in the NTIRE 2026 Robust AI-Generated Image Detection in the Wild Challenge and attains state-of-the-art performance with strong robustness on multiple AIGC image detection benchmarks.
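The abstract describes fusing the branch outputs by logit-space weighted averaging, refined by a lightweight dual-gating mechanism for branch-level outliers and majority-dominated fusion errors. The paper's exact gating rules are not given here, so the sketch below is a hypothetical illustration: function name, thresholds, and both gating heuristics are assumptions, not the authors' implementation.

```python
import numpy as np

def fuse_logits(branch_logits, weights, outlier_z=2.5, margin=0.5):
    """Illustrative HEDGE-style fusion sketch (not the paper's exact method).

    Weighted logit averaging with a dual gate:
      Gate 1 suppresses branch-level outliers.
      Gate 2 revisits low-margin fusions where a confident minority
      disagrees with a weak majority.
    """
    logits = np.asarray(branch_logits, dtype=float)
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()

    # Gate 1 (assumed rule): down-weight branches whose logits sit far
    # from the weighted consensus, then renormalize.
    consensus = float(np.dot(w, logits))
    spread = logits.std() + 1e-8
    z = np.abs(logits - consensus) / spread
    w = np.where(z > outlier_z, w * 0.25, w)
    w = w / w.sum()
    fused = float(np.dot(w, logits))

    # Gate 2 (assumed rule): if the fused logit is low-margin and the
    # most confident branch contradicts the majority vote, defer to
    # that branch instead of the majority-dominated average.
    majority_positive = (logits > 0).mean() > 0.5
    if abs(fused) < margin:
        strongest = logits[np.argmax(np.abs(logits))]
        if (strongest > 0) != majority_positive:
            fused = strongest
    return fused
```

In this reading, positive fused logits indicate "AI-generated"; the per-branch weights would be tuned on validation data rather than fixed.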


Key Contributions

  • Heterogeneous ensemble architecture combining DINOv3 and MetaCLIP2 backbones with multi-scale feature extraction
  • Three-route detection strategy with staged data expansion, multi-resolution branches, and backbone diversity
  • Dual-gating fusion mechanism to handle outliers and majority-dominated errors in ensemble outputs

🛡️ Threat Analysis

Output Integrity Attack

Detects AI-generated images to verify content authenticity and provenance — core output integrity problem. The paper builds a robust detection system for synthetic images in the wild.


Details

Domains
vision, generative
Model Types
diffusion, GAN, transformer
Threat Tags
inference_time
Datasets
NTIRE 2026 Challenge dataset
Applications
AI-generated image detection, deepfake detection, synthetic media forensics