defense 2025

Efficient and Robust Video Defense Framework against 3D-field Personalized Talking Face

Rui-qing Sun , Xingshan Yao , Tian Lan , Jia-Ling Shi , Chen-Hao Cui , Hui-Yang Zhao , Zhijing Wu , Chen Yang , Xian-Ling Mao

0 citations · 45 references · arXiv

Published on arXiv

2512.21019

Output Integrity Attack

OWASP ML Top 10 — ML09

Key Finding

Achieves 47x acceleration over the fastest baseline while maintaining strong defense capability and high video fidelity, with robustness against scaling operations and state-of-the-art purification attacks

VDF (Video Defense Framework)

Novel technique introduced


State-of-the-art 3D-field video-referenced Talking Face Generation (TFG) methods synthesize high-fidelity personalized talking-face videos in real time by modeling 3D geometry and appearance from a reference portrait video. This capability raises significant privacy concerns about malicious misuse of personal portraits. However, no efficient defense framework exists to protect such videos against 3D-field TFG methods. While image-based defenses could apply per-frame 2D perturbations, they incur prohibitive computational costs and severe video quality degradation, and they fail to disrupt 3D information for video protection. To address this, we propose a novel and efficient video defense framework against 3D-field TFG methods, which protects portrait videos by perturbing the 3D information acquisition process while maintaining high-fidelity video quality. Specifically, our method introduces: (1) a similarity-guided parameter sharing mechanism for computational efficiency, and (2) a multi-scale dual-domain attention module to jointly optimize spatial-frequency perturbations. Extensive experiments demonstrate that our proposed framework exhibits strong defense capability and achieves a 47x acceleration over the fastest baseline while maintaining high fidelity. Moreover, it remains robust against scaling operations and state-of-the-art purification attacks, and ablation studies further validate the effectiveness of our design choices. Our project is available at https://github.com/Richen7418/VDF.


Key Contributions

  • Novel video defense framework that perturbs 3D information acquisition in TFG pipelines, rather than applying per-frame 2D perturbations, achieving 47x speedup over the fastest baseline
  • Similarity-guided parameter sharing mechanism that reduces redundant per-frame optimization for computational efficiency
  • Multi-scale dual-domain attention module that jointly optimizes spatial and frequency perturbations for high-fidelity protection
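
The summary does not describe how the similarity-guided parameter sharing works internally, so the following is only a minimal sketch of the general idea under stated assumptions: frames that look alike are grouped, and one set of perturbation parameters would be optimized per group rather than per frame. The cosine-similarity measure, the greedy grouping rule, the `threshold` value, and the names `cosine_similarity` and `group_frames_by_similarity` are all hypothetical and are not taken from the paper or its repository.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def group_frames_by_similarity(
    frames: Sequence[Sequence[float]], threshold: float = 0.95
) -> List[int]:
    """Assign each frame a group id; frames in one group would share parameters.

    Greedy rule (an assumption, not the paper's algorithm): a frame joins the
    first existing group whose representative (that group's first frame) is at
    least `threshold`-similar; otherwise it starts a new group.
    """
    representatives: List[Sequence[float]] = []
    group_ids: List[int] = []
    for frame in frames:
        for gid, rep in enumerate(representatives):
            if cosine_similarity(frame, rep) >= threshold:
                group_ids.append(gid)
                break
        else:
            # No sufficiently similar group found: open a new one.
            representatives.append(frame)
            group_ids.append(len(representatives) - 1)
    return group_ids


if __name__ == "__main__":
    # Toy per-frame feature vectors (hypothetical data, not real video features).
    frames = [
        [1.0, 0.0, 0.0],
        [0.99, 0.05, 0.0],
        [0.0, 1.0, 0.0],
        [0.02, 0.98, 0.0],
    ]
    ids = group_frames_by_similarity(frames, threshold=0.95)
    print(ids)  # [0, 0, 1, 1]
    print(len(set(ids)), "shared parameter sets instead of", len(frames))
```

In this toy run, four frames collapse into two groups, so a per-group optimizer would run twice instead of four times. How much work is actually saved in practice depends on how redundant the video is and on the real sharing mechanism, which this sketch does not reproduce.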

🛡️ Threat Analysis

Output Integrity Attack

Paper adds protective adversarial perturbations to portrait videos to prevent AI-generated deepfake synthesis (Talking Face Generation), which is content integrity protection — analogous to anti-deepfake perturbations classified under Output Integrity. The paper also evaluates robustness against purification attacks, which are ML09-class attacks designed to strip away such protections.


Details

Domains
vision
Model Types
generative
Threat Tags
white_box, inference_time, digital
Applications
portrait video protection, deepfake prevention, talking face generation defense