UniAIDet: A Unified and Universal Benchmark for AI-Generated Image Content Detection and Localization
Published on arXiv: 2510.23023
Output Integrity Attack
OWASP ML Top 10 — ML09
Key Finding
Existing detection and localization methods perform poorly on UniAIDet, with generalization remaining a major challenge, though strong detection performance generally correlates with strong localization performance.
Novel technique introduced: UniAIDet
With the rapid proliferation of image generative models, the authenticity of digital images has become a significant concern. While existing studies have proposed various methods for detecting AI-generated content, current benchmarks are limited in their coverage of diverse generative models and image categories, often overlooking end-to-end image editing and artistic images. To address these limitations, we introduce UniAIDet, a unified and comprehensive benchmark that includes both photographic and artistic images. UniAIDet covers a wide range of generative models, including text-to-image, image-to-image, image inpainting, image editing, and deepfake models. Using UniAIDet, we conduct a comprehensive evaluation of various detection methods and answer three key research questions regarding generalization capability and the relation between detection and localization. Our benchmark and analysis provide a robust foundation for future research.
Key Contributions
- First large-scale benchmark (80K images, 20 generative models) covering both photographic and artistic images, with pixel-level localization masks for partially generated images
- Comprehensive evaluation exposing that existing detection and localization methods generalize poorly across generative model types and image categories
- Analysis answering three research questions: detection–localization correlation, generalization across generative models, and generalization across image categories (photo vs. art)
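To make the detection–localization question concrete, the following is a minimal sketch of how image-level detection and pixel-level localization might be scored and then compared across methods. The metric choices (accuracy, intersection-over-union, Pearson correlation) and all function names are illustrative assumptions; the summary above does not state which metrics or protocol UniAIDet actually uses.

```python
from math import sqrt


def detection_accuracy(labels, preds):
    """Fraction of images whose real (0) / AI-generated (1) label is predicted correctly."""
    if len(labels) != len(preds) or not labels:
        raise ValueError("labels and preds must be non-empty and of equal length")
    return sum(int(l == p) for l, p in zip(labels, preds)) / len(labels)


def mask_iou(true_mask, pred_mask):
    """Intersection-over-union of two binary masks given as nested lists of 0/1."""
    inter = 0
    union = 0
    for t_row, p_row in zip(true_mask, pred_mask):
        for t, p in zip(t_row, p_row):
            inter += int(bool(t) and bool(p))
            union += int(bool(t) or bool(p))
    # Two empty masks agree perfectly; treating this as IoU = 1.0 is an assumed convention.
    return inter / union if union else 1.0


def pearson(xs, ys):
    """Pearson correlation between two equal-length score lists (e.g. one entry per method)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Toy inputs for illustration only; these are not results from the benchmark.
    acc = detection_accuracy([1, 0, 1, 1], [1, 0, 0, 1])
    iou = mask_iou([[1, 1], [0, 0]], [[1, 0], [1, 0]])
    print(f"accuracy={acc:.2f}, IoU={iou:.2f}")
```

Given one detection score and one localization score per method, `pearson` applied to the two lists gives a single number summarizing how closely the two capabilities track each other.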
🛡️ Threat Analysis
AI-generated content detection — specifically deepfake, text-to-image, inpainting, and image-editing detection and localization — is a core ML09 topic (output integrity and content provenance). The benchmark is explicitly designed to evaluate detectors that verify whether image content is authentic or AI-generated.