Defense · 2025

D3: Training-Free AI-Generated Video Detection Using Second-Order Features

Chende Zheng 1, Ruiqi Suo 1, Chenhao Lin 1, Zhengyu Zhao 1, Le Yang 1, Shuai Liu 1, Minghui Yang 2, Cong Wang 3, Chao Shen 1



Published on arXiv: 2508.00701

Output Integrity Attack

OWASP ML Top 10: ML09

Key Finding

D3 outperforms the previous best method by 10.39% absolute mAP on GenVideo while requiring no training and remaining computationally efficient with strong robustness to post-processing operations.

D3 (Detection by Difference of Differences)

Novel technique introduced


The evolution of video generation techniques, such as Sora, has made it increasingly easy to produce high-fidelity AI-generated videos, raising public concern over the dissemination of synthetic content. However, existing detection methodologies remain limited by their insufficient exploration of temporal artifacts in synthetic videos. To bridge this gap, we establish a theoretical framework through second-order dynamical analysis under Newtonian mechanics, and derive Second-order Central Difference features tailored to temporal artifact detection. Building on this theoretical foundation, we reveal a fundamental divergence in second-order feature distributions between real and AI-generated videos. Concretely, we propose Detection by Difference of Differences (D3), a novel training-free detection method that leverages these second-order temporal discrepancies. We validate the superiority of D3 on 4 open-source datasets (GenVideo, VideoPhy, EvalCrafter, VidProM), 40 subsets in total. For example, on GenVideo, D3 outperforms the previous best method by 10.39% absolute mean Average Precision. Additional experiments on time cost and post-processing operations demonstrate D3's exceptional computational efficiency and strong robustness. Our code is available at https://github.com/Zig-HS/D3.
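For readers unfamiliar with the construction, the second-order central difference is the standard discrete approximation of acceleration. Applied to a per-frame motion signal \(m_t\) (the paper's exact feature definition may differ in detail):

```latex
\ddot{m}_t \;\approx\; m_{t+1} - 2\,m_t + m_{t-1}
```

The intuition suggested by the abstract's Newtonian framing is that real-world motion has physically constrained, smoothly varying acceleration, so this residual is small and regular for real footage, whereas generators that enforce only frame-to-frame plausibility leave larger or irregular second-order residuals.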


Key Contributions

  • Theoretical framework grounding AI-generated video detection in second-order dynamical analysis (Newtonian mechanics), revealing a fundamental divergence in second-order feature distributions between real and synthetic videos
  • D3 (Detection by Difference of Differences): a training-free detector that computes second-order central difference features over optical flow to classify videos without any model training or fine-tuning
  • Evaluation across 40 subsets of 4 open-source benchmarks (GenVideo, VideoPhy, EvalCrafter, VidProM), achieving +10.39% absolute mAP over the previous SOTA on GenVideo with strong robustness to post-processing
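The "difference of differences" idea can be sketched as below. This is an illustrative reconstruction, not the authors' code: the function names, the use of scalar per-frame motion magnitudes (the paper operates on optical-flow features), and the normalized scoring heuristic are all assumptions.

```python
import numpy as np

def second_order_features(motion):
    """Second-order central difference along time:
    d2[t] = m[t+1] - 2*m[t] + m[t-1] (discrete acceleration)."""
    m = np.asarray(motion, dtype=float)
    return m[2:] - 2.0 * m[1:-1] + m[:-2]

def d3_score(motion):
    """Hypothetical training-free statistic: mean absolute second-order
    difference, normalized by the mean absolute first-order difference.
    Smooth, physically plausible motion yields a low score; jittery
    frame-to-frame motion yields a high one."""
    m = np.asarray(motion, dtype=float)
    d1 = np.abs(np.diff(m)).mean()
    d2 = np.abs(second_order_features(m)).mean()
    return d2 / (d1 + 1e-8)
```

Usage: extract one motion value per frame (e.g. mean optical-flow magnitude), then threshold the score; constant-velocity motion scores near zero, while alternating jitter scores high, so no training is needed.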

🛡️ Threat Analysis

Output Integrity Attack

D3 is a novel forensic technique specifically designed to detect AI-generated video content by exploiting second-order temporal artifacts. The paper's primary contribution is a new detection architecture (not a domain application of existing methods), grounded in a theoretical framework that reveals a fundamental distributional divergence between real and AI-generated video outputs, directly addressing output integrity and content authenticity.


Details

Domains
vision, generative
Model Types
diffusion, transformer
Threat Tags
inference_time
Datasets
GenVideo, VideoPhy, EvalCrafter, VidProM
Applications
AI-generated video detection, deepfake video detection, synthetic content authenticity verification