defense 2026

Modeling Biomechanical Constraint Violations for Language-Agnostic Lip-Sync Deepfake Detection

Hao Chen, Junnan Xu

0 citations

Published on arXiv

2604.16808

Output Integrity Attack

OWASP ML Top 10 — ML09

Key Finding

Achieves AUC 0.905 on English, 0.779 on Mandarin Chinese, 0.969 on the multi-ethnic FakeAVCeleb (σ=0.009 across five groups), and 0.843 on the seven-language PolyGlotFake in zero-shot transfer, using only 107,777 parameters

BioLip

Novel technique introduced


Current lip-sync deepfake detectors rely on pixel-level artifacts or audio-visual correspondence, and they fail to generalize across languages because these cues encode data-dependent patterns rather than universal physical laws. We identify a more fundamental principle: generative models do not enforce the biomechanical constraints of authentic orofacial articulation, producing measurably elevated temporal lip variance (a signal we term temporal lip jitter) that is empirically consistent across the speaker's language, ethnicity, and recording conditions. We instantiate this principle through BioLip, a lightweight framework operating on 64 perioral landmark coordinates extracted by MediaPipe.
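This summary does not give the exact formula BioLip uses for temporal lip jitter, so the following is only a minimal sketch of the idea under stated assumptions. It assumes the perioral landmarks are already available as an array of shape (frames, 64, 2), and it takes jitter to be the time variance of frame-to-frame landmark displacement, averaged over all coordinates. The function name `temporal_lip_jitter` and the synthetic demo data are hypothetical and are not taken from the paper.

```python
import numpy as np


def temporal_lip_jitter(landmarks: np.ndarray) -> float:
    """Illustrative jitter score for a landmark sequence of shape (T, N, 2).

    Assumption: jitter is the variance over time of frame-to-frame
    displacement, averaged over landmarks and axes. This is a stand-in,
    not the paper's definition.
    """
    landmarks = np.asarray(landmarks, dtype=np.float64)
    if landmarks.ndim != 3 or landmarks.shape[0] < 3:
        raise ValueError("expected an array of shape (T >= 3, N, 2)")

    # Center each frame on its landmark centroid to discount head translation.
    centered = landmarks - landmarks.mean(axis=1, keepdims=True)

    # First-order temporal difference: per-frame displacement of each landmark.
    velocity = np.diff(centered, axis=0)  # shape (T - 1, N, 2)

    # Variance over time for every coordinate, then the mean over coordinates.
    return float(velocity.var(axis=0).mean())


# Synthetic demo: a smooth 2 Hz mouth motion versus the same motion with noise.
rng = np.random.default_rng(0)
t = np.arange(90) / 30.0                      # 3 seconds at an assumed 30 fps
base = rng.normal(size=(64, 2))               # hypothetical resting lip shape
direction = rng.normal(size=(64, 2))          # hypothetical motion direction
opening = 0.01 * np.sin(2 * np.pi * 2.0 * t)  # smooth articulation signal
smooth = base[None] + opening[:, None, None] * direction[None]
noisy = smooth + rng.normal(scale=0.005, size=smooth.shape)

print("smooth:", temporal_lip_jitter(smooth))
print("noisy: ", temporal_lip_jitter(noisy))
```

In this toy setup the noisy sequence scores higher than the smooth one, which mirrors the claim that synthesized lip motion shows elevated temporal variance. Real use would need a decision threshold or a learned classifier on top of such a score.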


Key Contributions

  • Physics-grounded detection using temporal lip jitter as a biomechanical constraint violation proxy, consistent across languages and ethnic groups
  • Feature decomposition showing temporal kinematic features generalize cross-lingually while spectral features encode language-dependent patterns
  • Privacy-preserving detection operating on 64 perioral landmarks without raw pixels or audio, deployable on edge devices
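The split between temporal kinematic features and spectral features in the second contribution can be illustrated with a rough sketch. The summary does not list the paper's actual feature set, so everything below is an assumption: the kinematic group is taken to be speed and acceleration statistics of the landmarks, and the spectral group is taken to be normalized band energies of a scalar mouth-spread signal. The function names `kinematic_features` and `spectral_features` and the `n_bands` parameter are hypothetical.

```python
import numpy as np


def _center(landmarks: np.ndarray) -> np.ndarray:
    """Subtract the per-frame centroid from a (T, N, 2) landmark sequence."""
    x = np.asarray(landmarks, dtype=np.float64)
    return x - x.mean(axis=1, keepdims=True)


def kinematic_features(landmarks: np.ndarray) -> np.ndarray:
    """Assumed kinematic group: mean/std of landmark speed and acceleration."""
    x = _center(landmarks)
    speed = np.linalg.norm(np.diff(x, n=1, axis=0), axis=-1)  # (T - 1, N)
    accel = np.linalg.norm(np.diff(x, n=2, axis=0), axis=-1)  # (T - 2, N)
    return np.array([speed.mean(), speed.std(), accel.mean(), accel.std()])


def spectral_features(landmarks: np.ndarray, n_bands: int = 4) -> np.ndarray:
    """Assumed spectral group: fraction of signal power in each frequency band."""
    x = _center(landmarks)
    # Scalar proxy for mouth spread: mean landmark distance from the centroid.
    signal = np.linalg.norm(x, axis=-1).mean(axis=1)  # shape (T,)
    signal = signal - signal.mean()
    power = np.abs(np.fft.rfft(signal)) ** 2
    power = power[1:]  # drop the DC bin
    energy = np.array([band.sum() for band in np.array_split(power, n_bands)])
    total = energy.sum()
    return energy / total if total > 0 else np.zeros(n_bands)


# Usage on a synthetic (frames, 64, 2) sequence; no real data is involved.
rng = np.random.default_rng(0)
sequence = rng.normal(size=(50, 64, 2))
print(kinematic_features(sequence))
print(spectral_features(sequence))
```

Keeping the two groups in separate functions makes it straightforward to train on either group alone, which is the kind of ablation the feature-decomposition claim implies.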

🛡️ Threat Analysis

Output Integrity Attack

Authenticates video content by detecting AI-generated lip-sync deepfakes. The paper addresses output integrity and content authenticity verification — determining whether video content (specifically lip motion) is authentic or synthetically generated. This is deepfake detection, which falls under ML09's scope of AI-generated content detection and content provenance.


Details

Domains
vision, multimodal
Model Types
gan, diffusion, generative
Threat Tags
inference_time, digital
Datasets
AVLips, CMLR, FakeAVCeleb, PolyGlotFake
Applications
deepfake detection, lip-sync forgery detection, video authentication