Benchmark · 2025

NeXT-IMDL: Build Benchmark for NeXT-Generation Image Manipulation Detection & Localization

Yifei Li, Haoyuan He, Yu Zheng, Bingyao Yu, Wenzhao Zheng, Lei Chen, Jie Zhou, Jiwen Lu

0 citations · 115 references · arXiv


Published on arXiv: 2512.23374

Output Integrity Attack

OWASP ML Top 10 — ML09

Key Finding

All 11 evaluated IMDL models exhibit systemic failures and significant performance degradation under cross-dimension evaluation protocols, exposing a false sense of progress in the field driven by simplified cross-dataset evaluation.

NeXT-IMDL

Novel technique introduced


The accessibility surge and abuse risks of user-friendly image editing models have created an urgent need for generalizable, up-to-date methods for Image Manipulation Detection and Localization (IMDL). Current IMDL research typically relies on cross-dataset evaluation, where models trained on one benchmark are tested on others. However, this simplified evaluation approach conceals the fragility of existing methods when handling diverse AI-generated content, creating a misleading impression of progress. This paper challenges that illusion by proposing NeXT-IMDL, a large-scale diagnostic benchmark designed not just to collect data but to systematically probe the generalization boundaries of current detectors. Specifically, NeXT-IMDL categorizes AIGC-based manipulations along four fundamental axes: editing models, manipulation types, content semantics, and forgery granularity. Built upon this, NeXT-IMDL implements five rigorous cross-dimension evaluation protocols. Extensive experiments on 11 representative models reveal a critical insight: while these models perform well in their original settings, they exhibit systemic failures and significant performance degradation under protocols that simulate diverse real-world generalization scenarios. By providing this diagnostic toolkit and the accompanying findings, the paper aims to advance the development of truly robust, next-generation IMDL models.


Key Contributions

  • NeXT-IMDL: a large-scale diagnostic benchmark categorizing AIGC-based image manipulations along four axes (editing models, manipulation types, content semantics, forgery granularity)
  • Five rigorous cross-dimension evaluation protocols designed to stress-test generalization of IMDL detectors beyond conventional cross-dataset settings
  • Empirical analysis of 11 representative IMDL models revealing systemic failures and significant performance degradation under real-world generalization scenarios
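The cross-dimension protocols contrast with conventional cross-dataset evaluation by holding out values of a single axis at a time. A minimal sketch of such a leave-one-value-out split is below; the field names and axis values are illustrative assumptions, not the benchmark's actual schema.

```python
from dataclasses import dataclass

@dataclass
class Sample:
    """One benchmark entry tagged along the four NeXT-IMDL axes.
    Field names and example values here are hypothetical."""
    image_id: str
    editing_model: str      # e.g. an inpainting vs. an instruction-editing model
    manipulation_type: str  # e.g. "object_removal", "attribute_edit"
    semantics: str          # e.g. "person", "landscape"
    granularity: str        # e.g. "local", "global"

def cross_dimension_split(samples, axis, held_out):
    """Leave-one-value-out protocol along a single axis: train on all
    samples whose `axis` value differs from `held_out`, test on the rest.
    The gap between in-domain and held-out scores measures generalization."""
    train = [s for s in samples if getattr(s, axis) != held_out]
    test = [s for s in samples if getattr(s, axis) == held_out]
    return train, test

samples = [
    Sample("a", "inpaint_model", "object_removal", "person", "local"),
    Sample("b", "instruct_model", "attribute_edit", "landscape", "global"),
    Sample("c", "inpaint_model", "attribute_edit", "person", "local"),
]
# Hold out one editing model entirely: a detector never sees its outputs
# during training, mimicking a newly released editing tool at test time.
train, test = cross_dimension_split(samples, "editing_model", "instruct_model")
```

The same helper applied across each of the four axes in turn would yield a family of protocols in the spirit of the five described above.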

🛡️ Threat Analysis

Output Integrity Attack

IMDL is squarely about detecting manipulated/AI-generated visual content — a core output integrity and content authenticity problem. The benchmark probes how well detectors can distinguish genuine from AI-manipulated images across diverse AIGC editing tools, manipulation types, content semantics, and forgery granularities.
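Localization performance in the IMDL literature is commonly scored at the pixel level against a ground-truth manipulation mask, for example with pixel F1; whether NeXT-IMDL uses exactly this variant is an assumption here, and the sketch below is only a minimal illustration of the metric.

```python
def pixel_f1(pred_mask, gt_mask):
    """Pixel-level F1 between a predicted binary manipulation mask and
    ground truth (flat sequences of 0/1). A standard IMDL localization
    metric; the benchmark's exact scoring protocol may differ."""
    tp = sum(p and g for p, g in zip(pred_mask, gt_mask))          # hit pixels
    fp = sum(p and not g for p, g in zip(pred_mask, gt_mask))      # false alarms
    fn = sum(not p and g for p, g in zip(pred_mask, gt_mask))      # misses
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Toy example: detector finds 2 of 3 manipulated pixels with 1 false alarm.
score = pixel_f1([1, 1, 0, 1, 0], [1, 1, 1, 0, 0])
```

A detection-level score (manipulated vs. genuine per image) is typically reported alongside such a localization score, which is why degradation can show up in either or both under the cross-dimension protocols.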


Details

Domains
vision, generative
Model Types
cnn, transformer, diffusion
Threat Tags
inference_time, digital
Datasets
NeXT-IMDL (proposed)
Applications
image manipulation detection and localization, deepfake detection, AI-generated image forensics