defense 2026

Preserving Forgery Artifacts: AI-Generated Video Detection at Native Scale

Zhengcen Li 1,2, Chenyang Jiang 1,2, Hang Zhao 1, Shiyang Zhou 1, Yunyang Mo 1, Feng Gao 3, Fan Yang 3, Qiben Shan 2, Shaocong Wu 2, Jingyong Su 1

0 citations

Published on arXiv

2604.04634

Output Integrity Attack

OWASP ML Top 10 — ML09

Key Finding

Achieves superior performance across multiple benchmarks by preserving high-frequency artifacts through native-scale processing

Native-Scale Video Detection Framework

Novel technique introduced


The rapid advancement of video generation models has enabled the creation of highly realistic synthetic media, raising significant societal concerns regarding the spread of misinformation. However, current detection methods suffer from critical limitations. They rely on preprocessing operations like fixed-resolution resizing and cropping. These operations not only discard subtle, high-frequency forgery traces but also cause spatial distortion and significant information loss. Furthermore, existing methods are often trained and evaluated on outdated datasets that fail to capture the sophistication of modern generative models. To address these challenges, we introduce a comprehensive dataset and a novel detection framework. First, we curate a large-scale dataset of over 140K videos from 15 state-of-the-art open-source and commercial generators, along with the Magic Videos benchmark, designed specifically for evaluating ultra-realistic synthetic content. In addition, we propose a novel detection framework built on the Qwen2.5-VL Vision Transformer, which operates natively at variable spatial resolutions and temporal durations. This native-scale approach effectively preserves the high-frequency artifacts and spatiotemporal inconsistencies typically lost during conventional preprocessing. Extensive experiments demonstrate that our method achieves superior performance across multiple benchmarks, underscoring the critical importance of native-scale processing and establishing a robust new baseline for AI-generated video detection.


Key Contributions

  • Large-scale dataset of over 140K videos from 15 state-of-the-art open-source and commercial video generators, plus the Magic Videos benchmark for ultra-realistic content
  • Native-scale detection framework built on the Qwen2.5-VL Vision Transformer that operates at variable spatial resolutions and temporal durations, avoiding the distortion introduced by resizing and cropping
  • Demonstrates that native-scale processing preserves high-frequency forgery artifacts lost by conventional fixed-resolution methods
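
To make the third point concrete, here is a toy sketch of why resizing can erase pixel-level traces. It is not the paper's method: the frame is synthetic, the "artifact" is an assumed checkerboard pattern at the highest spatial frequency, 2x2 box averaging stands in for fixed-resolution resizing, and all function names are hypothetical.

```python
def make_frame(height, width, artifact_amplitude):
    """Smooth horizontal ramp plus a faint pixel-level checkerboard 'artifact'."""
    return [
        [float(x) + artifact_amplitude * (1 if (x + y) % 2 == 0 else -1)
         for x in range(width)]
        for y in range(height)
    ]


def box_downsample_2x(frame):
    """Average each 2x2 block (assumed stand-in for fixed-resolution resizing)."""
    h, w = len(frame), len(frame[0])
    return [
        [(frame[y][x] + frame[y][x + 1]
          + frame[y + 1][x] + frame[y + 1][x + 1]) / 4.0
         for x in range(0, w - 1, 2)]
        for y in range(0, h - 1, 2)
    ]


def high_freq_energy(frame):
    """Mean squared response of a 2x2 diagonal-difference filter.

    The filter is zero on any linear ramp, so it responds only to
    pixel-to-pixel alternation such as the checkerboard pattern.
    """
    h, w = len(frame), len(frame[0])
    total, count = 0.0, 0
    for y in range(h - 1):
        for x in range(w - 1):
            r = (frame[y][x] - frame[y][x + 1]
                 - frame[y + 1][x] + frame[y + 1][x + 1])
            total += r * r
            count += 1
    return total / count


native = make_frame(8, 8, artifact_amplitude=0.5)
resized = box_downsample_2x(native)

print(high_freq_energy(native))   # 4.0: the artifact is measurable at native scale
print(high_freq_energy(resized))  # 0.0: averaging cancels it completely
```

In this deliberately extreme case the artifact vanishes entirely after downsampling, while the smooth content survives. Real resampling filters and real generator traces are less clear-cut, but the same low-pass effect is what the native-scale approach is meant to avoid.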

🛡️ Threat Analysis

Output Integrity Attack

Detects AI-generated video content to verify authenticity and provenance, which falls under output integrity and content authentication. The paper distinguishes synthetic from real videos by preserving the forgery artifacts that conventional preprocessing discards.


Details

Domains
vision, multimodal
Model Types
transformer, multimodal
Threat Tags
inference_time
Datasets
Magic Videos, custom dataset of 140K videos from 15 generators
Applications
deepfake detection, synthetic video detection, content authentication