Your One-Stop Solution for AI-Generated Video Detection
Long Ma 1, Zihao Xue 2, Yan Wang 3, Zhiyuan Yan 4, Jin Xu 1, Xiaorui Jiang 1, Haiyang Yu 1,5, Yong Liao 1, Zhen Bi 2
Published on arXiv
2601.11035
Output Integrity Attack
OWASP ML Top 10 — ML09
Key Finding
Comprehensive evaluation of 33 detectors reveals fundamental limitations in existing methods and identifies 4 novel findings to guide future AI-generated video detection research.
AIGVDBench
Novel technique introduced
Recent advances in generative modeling can create remarkably realistic synthetic videos, making it increasingly difficult for humans to distinguish them from real ones and necessitating reliable detection methods. However, two key limitations hinder the development of this field. From the dataset perspective, existing datasets are often limited in scale and constructed using outdated or narrowly scoped generative models, making it difficult to capture the diversity and rapid evolution of modern generative techniques. Moreover, the dataset construction process frequently prioritizes quantity over quality, neglecting essential aspects such as semantic diversity, scenario coverage, and technological representativeness. From the benchmark perspective, current benchmarks largely remain at the stage of dataset creation, leaving many fundamental issues and in-depth analyses yet to be systematically explored. To address this gap, we propose AIGVDBench, a benchmark designed to be comprehensive and representative, covering 31 state-of-the-art generation models and over 440,000 videos. Drawing on more than 1,500 evaluations of 33 existing detectors belonging to four distinct categories, this work presents 8 in-depth analyses from multiple perspectives and identifies 4 novel findings that offer valuable insights for future research. We hope this work provides a solid foundation for advancing the field of AI-generated video detection. Our benchmark is open-sourced at https://github.com/LongMa-2025/AIGVDBench.
Key Contributions
- AIGVDBench: a large-scale benchmark covering 31 generative models and 440,000+ videos with emphasis on semantic diversity and technological representativeness
- Systematic evaluation of 33 existing AI-generated video detectors across four categories via 1,500+ experimental runs
- 8 in-depth analyses and 4 novel findings that characterize key failure modes and open challenges in AI-generated video detection
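To make the scale of a detector-by-generator evaluation concrete, the following is a minimal sketch of how such a grid could be computed. It is not the authors' evaluation code: the sample format, the `evaluate_grid` helper, the toy detectors, the 0.5 threshold, and the choice of accuracy as the metric are all assumptions made for illustration. The summary above does not specify the paper's actual protocol or metrics.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

# Hypothetical sample format: (video_id, source_name, is_fake).
# "source_name" is either a generator name or "real".
Sample = Tuple[str, str, bool]
# Hypothetical detector interface: maps a video id to a score in [0, 1],
# where a higher score means "more likely AI-generated".
Detector = Callable[[str], float]


def evaluate_grid(
    detectors: Dict[str, Detector],
    samples: List[Sample],
    threshold: float = 0.5,
) -> Dict[str, Dict[str, float]]:
    """Return accuracy[detector_name][source_name] for every pair."""
    # Group samples by the source that produced them.
    by_source: Dict[str, List[Tuple[str, bool]]] = defaultdict(list)
    for video_id, source, is_fake in samples:
        by_source[source].append((video_id, is_fake))

    results: Dict[str, Dict[str, float]] = {}
    for name, detect in detectors.items():
        results[name] = {}
        for source, items in by_source.items():
            # A prediction is correct when the thresholded score matches the label.
            correct = sum(
                (detect(video_id) >= threshold) == is_fake
                for video_id, is_fake in items
            )
            results[name][source] = correct / len(items)
    return results


# Toy data: two real videos and two videos from each of two made-up generators.
samples = [
    ("v1", "real", False), ("v2", "real", False),
    ("v3", "gen_a", True), ("v4", "gen_a", True),
    ("v5", "gen_b", True), ("v6", "gen_b", True),
]

# Toy detectors: one flags everything, one only recognizes gen_a outputs.
detectors = {
    "always_fake": lambda video_id: 1.0,
    "knows_gen_a": lambda video_id: 1.0 if video_id in {"v3", "v4"} else 0.0,
}

grid = evaluate_grid(detectors, samples)
for detector_name, row in grid.items():
    print(detector_name, row)
```

In this toy grid, the detector that only recognizes `gen_a` scores perfectly on that generator and on real videos but fails entirely on `gen_b`. A per-generator breakdown of this kind is what exposes poor generalization to unseen generators, which a single pooled accuracy figure would hide.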
🛡️ Threat Analysis
The paper directly addresses AI-generated content detection, specifically the detection of synthetic videos produced by generative models. This falls squarely under output integrity and content provenance. The benchmark evaluates 33 detectors, providing a systematic framework for assessing the state of the art in deepfake and AI-generated video detection.