defense 2025

Beyond Flicker: Detecting Kinematic Inconsistencies for Generalizable Deepfake Video Detection

Alejandro Cobo 1, Roberto Valle 1, José Miguel Buenaposada 2, Luis Baumela 1

0 citations · 40 references · arXiv


Published on arXiv: 2512.04175

Output Integrity Attack

OWASP ML Top 10: ML09

Key Finding

Hybrid training with kinematic pseudo-fakes achieves state-of-the-art generalization across multiple leading deepfake video detection benchmarks

Kinematic Inconsistency Pseudo-Fake Generation

Novel technique introduced


Abstract

Generalizing deepfake detection to unseen manipulations remains a key challenge. A recent approach to tackle this issue is to train a network with pristine face images that have been manipulated with hand-crafted artifacts to extract more generalizable clues. While effective for static images, extending this to the video domain is an open issue. Existing methods model temporal artifacts as frame-to-frame instabilities, overlooking a key vulnerability: the violation of natural motion dependencies between different facial regions. In this paper, we propose a synthetic video generation method that creates training data with subtle kinematic inconsistencies. We train an autoencoder to decompose facial landmark configurations into motion bases. By manipulating these bases, we selectively break the natural correlations in facial movements and introduce these artifacts into pristine videos via face morphing. A network trained on our data learns to spot these sophisticated biomechanical flaws, achieving state-of-the-art generalization results on several popular benchmarks.
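The core idea in the abstract can be illustrated with a toy linear stand-in. The paper trains an autoencoder to decompose landmark trajectories into motion bases; in this sketch, plain SVD plays the role of the encoder/decoder, and the function names (`decompose_motion`, `break_motion_basis`) and the rescaling perturbation are illustrative assumptions, not the paper's actual model.

```python
import numpy as np

# Toy sketch: decompose a landmark sequence into linear "motion bases"
# and selectively rescale one basis to break natural motion correlations.
# The paper uses a learned autoencoder; SVD is a hypothetical stand-in.

def decompose_motion(landmarks):
    """landmarks: (T, D) array, T frames of flattened 2D landmark coords.
    Returns per-frame coefficients, motion bases, and the mean shape."""
    mean = landmarks.mean(axis=0)
    centered = landmarks - mean
    # Rows of Vt are motion bases; U * S are per-frame activations.
    U, S, Vt = np.linalg.svd(centered, full_matrices=False)
    return U * S, Vt, mean

def break_motion_basis(landmarks, basis_idx=0, scale=1.5):
    """Rescale a single motion basis (a hypothetical stand-in for the
    paper's selective correlation-breaking) and reconstruct the sequence."""
    coeffs, bases, mean = decompose_motion(landmarks)
    coeffs = coeffs.copy()
    coeffs[:, basis_idx] *= scale  # perturb one motion component only
    return coeffs @ bases + mean

# Usage: 30 frames of 68 landmarks (x, y flattened to 136 dims)
rng = np.random.default_rng(0)
seq = rng.normal(size=(30, 136)).cumsum(axis=0)  # smooth-ish trajectories
fake = break_motion_basis(seq, basis_idx=0, scale=1.5)
```

In the paper, the perturbed landmark sequence would then drive face morphing on the pristine video, so the pseudo-fake carries only the kinematic artifact, with no spatial blending clues.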


Key Contributions

  • An autoencoder-based generative model that learns a structured representation of facial kinematics and can selectively break natural inter-region motion correlations
  • A flexible synthesis pipeline that introduces non-rigid temporal kinematic artifacts into pristine videos via face morphing to create pseudo-fake training data
  • A hybrid training strategy combining spatial pseudo-fakes with the proposed kinematic artifacts, achieving state-of-the-art generalization on multiple deepfake detection benchmarks
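The hybrid training strategy in the last contribution can be sketched as batch assembly that mixes pristine clips with both artifact types under a single real/fake label. The function name, mixing ratio, and shapes below are assumptions for illustration, not the paper's training recipe.

```python
import numpy as np

def build_hybrid_batch(pristine, spatial_fakes, kinematic_fakes, seed=0):
    """Hedged sketch of hybrid training: pristine clips get label 0,
    while both spatial and kinematic pseudo-fakes share label 1, so the
    detector must learn clues that generalize across artifact types."""
    x = np.concatenate([pristine, spatial_fakes, kinematic_fakes], axis=0)
    y = np.concatenate([
        np.zeros(len(pristine)),
        np.ones(len(spatial_fakes) + len(kinematic_fakes)),
    ])
    # Shuffle so fake types are interleaved within the batch.
    perm = np.random.default_rng(seed).permutation(len(x))
    return x[perm], y[perm]

# Usage with dummy feature rows standing in for video clips
p = np.zeros((4, 3))          # pristine
s = np.ones((2, 3))           # spatial pseudo-fakes
k = np.full((2, 3), 2.0)      # kinematic pseudo-fakes
x, y = build_hybrid_batch(p, s, k)
```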

🛡️ Threat Analysis

Output Integrity Attack

Deepfake video detection falls squarely under Output Integrity: detecting AI-manipulated content. The paper's primary contribution is a novel detection method trained on synthetic kinematic artifacts, not merely the application of existing detectors to a new domain.


Details

Domains
vision
Model Types
cnn, transformer
Threat Tags
digital, inference_time
Datasets
FaceForensics++
Applications
deepfake video detection, facial forgery detection