Defense · 2026

ExposeAnyone: Personalized Audio-to-Expression Diffusion Models Are Robust Zero-Shot Face Forgery Detectors

Kaede Shiohara¹, Toshihiko Yamasaki¹, Vladislav Golyanik²

0 citations · 112 references · arXiv

Published on arXiv (arXiv:2601.02359)

Output Integrity Attack

OWASP ML Top 10 — ML09

Key Finding

Outperforms previous state-of-the-art by 4.22 percentage points in average AUC across DF-TIMIT, DFDCP, KoDF, and IDForge benchmarks while also detecting Sora2-generated videos where prior methods fail.

ExposeAnyone

Novel technique introduced


Detecting unknown deepfake manipulations remains one of the most challenging problems in face forgery detection. Current state-of-the-art approaches fail to generalize to unseen manipulations, as they primarily rely on supervised training with existing deepfakes or pseudo-fakes, which leads to overfitting to specific forgery patterns. In contrast, self-supervised methods offer greater potential for generalization, but existing work struggles to learn discriminative representations from self-supervision alone. In this paper, we propose ExposeAnyone, a fully self-supervised approach based on a diffusion model that generates expression sequences from audio. The key idea is that, once the model is personalized to specific subjects using reference sets, it can compute identity distances between suspected videos and personalized subjects via diffusion reconstruction errors, enabling person-of-interest face forgery detection. Extensive experiments demonstrate that 1) our method outperforms the previous state-of-the-art method by 4.22 percentage points in average AUC on the DF-TIMIT, DFDCP, KoDF, and IDForge datasets, 2) our model is also capable of detecting Sora2-generated videos, where previous approaches perform poorly, and 3) our method is highly robust to corruptions such as blur and compression, highlighting its applicability to real-world face forgery detection.
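The paper's detection criterion rests on a simple mechanism: noise the suspected expression sequence, denoise it with the subject-personalized diffusion model, and treat the reconstruction error as an identity distance (genuine clips of the subject reconstruct well; off-identity forgeries do not). The sketch below illustrates that scoring loop only; the denoiser here is a hypothetical stand-in (a pull toward a subject's expression prior), not the paper's actual audio-to-expression model, and all names (`personalized_denoiser`, `identity_distance`, `subject_mean`) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def personalized_denoiser(noisy, subject_mean):
    # Hypothetical stand-in for the subject-personalized diffusion model:
    # it pulls the noisy expression sequence toward the subject's
    # expression statistics learned from the reference set.
    return noisy + 0.5 * (subject_mean - noisy)

def identity_distance(expr_seq, subject_mean, noise_scale=0.3, n_trials=8):
    """Average diffusion reconstruction error: perturb the sequence with
    noise, denoise with the personalized model, and measure how far the
    reconstruction lands from the input."""
    errors = []
    for _ in range(n_trials):
        noisy = expr_seq + noise_scale * rng.standard_normal(expr_seq.shape)
        recon = personalized_denoiser(noisy, subject_mean)
        errors.append(np.mean((recon - expr_seq) ** 2))
    return float(np.mean(errors))

# Toy demo: a clip matching the subject's statistics vs. an off-identity clip.
subject_mean = np.zeros((16, 32))  # personalized expression prior (toy)
real_clip = subject_mean + 0.05 * rng.standard_normal((16, 32))
fake_clip = subject_mean + 1.0     # expressions far from the subject's prior

d_real = identity_distance(real_clip, subject_mean)
d_fake = identity_distance(fake_clip, subject_mean)
```

Thresholding `d_fake` against `d_real` yields the person-of-interest decision: a clip whose reconstruction error exceeds the range observed on the subject's reference videos is flagged as a forgery.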


Key Contributions

  • Fully self-supervised face forgery detection approach using personalized audio-to-expression diffusion models, eliminating dependence on labeled deepfake data
  • Identity distance metric computed via diffusion reconstruction errors between suspected videos and personalized subject models enabling person-of-interest detection
  • Demonstrated zero-shot generalization to unseen manipulations including Sora2-generated videos and robustness to real-world corruptions (blur, compression)

🛡️ Threat Analysis

Output Integrity Attack

Directly addresses AI-generated content detection (face forgery/deepfakes) with a novel detection architecture. Classified as an output integrity concern, since the method authenticates whether video content is genuine or AI-manipulated.


Details

Domains
vision, audio, multimodal, generative
Model Types
diffusion, multimodal
Threat Tags
inference_time, black_box
Datasets
DF-TIMIT, DFDCP, KoDF, IDForge
Applications
face forgery detection, deepfake detection, video authentication