defense 2026

MPF-Net: Exposing High-Fidelity AI-Generated Video Forgeries via Hierarchical Manifold Deviation and Micro-Temporal Fluctuations

Xinan He 1,2, Kaiqing Lin 2, Yue Zhou 2, Jiaming Zhong 2, Wei Ye 3, Wenhui Yi 1, Bing Fan 4, Feng Ding 1, Haodong Li 2, Bo Cao 5, Bin Li 2

0 citations · 29 references · arXiv


Published on arXiv

2601.21408

Output Integrity Attack

OWASP ML Top 10 — ML09

Key Finding

Hierarchical dual-path framework detects high-fidelity AI-generated videos that evade spatial-only detection by exploiting structured computational fingerprints in frame residuals.

MPF-Net

Novel technique introduced


With the rapid advancement of video generation models such as Veo and Wan, the visual quality of synthetic content has reached a level where macro-level semantic errors and temporal inconsistencies are no longer prominent. However, this does not mean that the distinction between real videos and cutting-edge high-fidelity fakes is untraceable. We argue that AI-generated videos are essentially products of a manifold-fitting process rather than physical recordings. Consequently, the pixel composition of the residuals between consecutive frames in AI-generated videos exhibits a structured and homogeneous character. We term this phenomenon 'Manifold Projection Fluctuations' (MPF). Driven by this insight, we propose a hierarchical dual-path framework that operates as a sequential filtering process. The first path, the Static Manifold Deviation Branch, leverages the refined perceptual boundaries of large-scale Vision Foundation Models (VFMs) to capture residual spatial anomalies or physical violations that deviate from the natural real-world manifold (off-manifold). For the remaining high-fidelity videos that reside on-manifold and evade spatial detection, we introduce the Micro-Temporal Fluctuation Branch as a secondary, fine-grained filter. By analyzing the structured MPF that persists even in visually perfect sequences, our framework ensures that forgeries are exposed whether they manifest as global deviations from the real-world manifold or as subtle computational fingerprints.
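To make the notion of inter-frame residuals concrete, the following is a minimal sketch, not the paper's method. It computes the difference between consecutive frames and a toy statistic (mean cosine similarity between successive residuals) as one possible proxy for how "structured" those residuals are. The function names `frame_residuals` and `residual_similarity`, and the choice of cosine similarity, are assumptions made purely for illustration; the abstract does not specify how MPF is measured.

```python
import numpy as np


def frame_residuals(video):
    """Return the T-1 differences between consecutive frames.

    `video` is an array of shape (T, H, W) or (T, H, W, C).
    """
    video = np.asarray(video, dtype=np.float64)
    return video[1:] - video[:-1]


def residual_similarity(video):
    """Toy proxy (an assumption, not the paper's metric): mean cosine
    similarity between each residual and the next one."""
    res = frame_residuals(video)
    flat = res.reshape(res.shape[0], -1)   # one row per residual
    a, b = flat[:-1], flat[1:]             # successive residual pairs
    num = np.sum(a * b, axis=1)
    den = np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1) + 1e-12
    return float(np.mean(num / den))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-ins only: a perfectly regular ramp vs. independent noise.
    pattern = rng.normal(size=(32, 32))
    ramp = np.stack([t * pattern for t in range(16)])
    noise = rng.normal(size=(16, 32, 32))
    print("ramp :", residual_similarity(ramp))
    print("noise:", residual_similarity(noise))
```

On these synthetic inputs, the ramp yields identical residuals and therefore a similarity close to 1, while independent noise yields a negative value. Real and generated videos would of course be far less clear-cut, and a learned branch would replace this hand-written statistic.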


Key Contributions

  • Introduces 'Manifold Projection Fluctuations' (MPF) as a theoretical explanation for why AI-generated videos exhibit structured, homogeneous inter-frame residuals distinct from real-world unstructured noise
  • Proposes MPF-Net, a hierarchical dual-path framework with a Static Manifold Deviation Branch (leveraging VFMs for spatial anomalies) and a Micro-Temporal Fluctuation Branch (exploiting subtle computational fingerprints in high-fidelity videos)
  • Provides a sequential filtering forensic pipeline that handles both off-manifold low-quality fakes and on-manifold high-fidelity AI-generated videos
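
The sequential filtering idea in the contributions above can be sketched as a two-stage cascade. This is a hedged illustration under stated assumptions: `spatial_score` and `temporal_score` are hypothetical callables standing in for the Static Manifold Deviation Branch and the Micro-Temporal Fluctuation Branch, and the thresholds are placeholders. The source does not describe the actual decision rule or how the two branches are combined.

```python
from typing import Any, Callable, Dict


def cascade_detect(
    video: Any,
    spatial_score: Callable[[Any], float],   # hypothetical stand-in for the static branch
    temporal_score: Callable[[Any], float],  # hypothetical stand-in for the temporal branch
    spatial_thresh: float = 0.5,             # placeholder threshold (assumption)
    temporal_thresh: float = 0.5,            # placeholder threshold (assumption)
) -> Dict[str, Any]:
    """Two-stage filter: flag off-manifold fakes first, then inspect the rest."""
    s = spatial_score(video)
    if s >= spatial_thresh:
        # Stage 1: visible spatial anomaly, no need to run the second stage.
        return {"label": "fake", "stage": "static", "score": s}

    # Stage 2: video passed the spatial check, so examine temporal fluctuations.
    t = temporal_score(video)
    label = "fake" if t >= temporal_thresh else "real"
    return {"label": label, "stage": "temporal", "score": t}


if __name__ == "__main__":
    # Dummy scorers for demonstration only.
    result = cascade_detect("clip", lambda v: 0.1, lambda v: 0.8)
    print(result)
```

One consequence of this cascade design is that the second stage only runs on videos the first stage could not reject, which matches the "secondary, fine-grained filter" role described in the abstract.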

🛡️ Threat Analysis

Output Integrity Attack

Proposes a novel deepfake/AI-generated video detection framework, a canonical ML09 output integrity and content authenticity contribution. Detection targets synthetic content produced by models like Sora, Veo, and Wan. Rather than applying existing detectors to a new domain, the paper introduces a novel theoretical basis (Manifold Projection Fluctuations) and a new architecture.


Details

Domains
vision
Model Types
transformer, generative
Threat Tags
inference_time, digital
Applications
ai-generated video detection, deepfake forensics, video content authentication