defense · 2025

Improving Generalization in Deepfake Detection with Face Foundation Models and Metric Learning

Stelios Mylonas, Symeon Papadopoulos



Published on arXiv: 2508.19730

Output Integrity Attack (OWASP ML Top 10: ML09)

Key Finding

A face foundation model fine-tuned with metric learning achieves strong generalization across diverse deepfake benchmarks, particularly in challenging real-world scenarios.


The increasing realism and accessibility of deepfakes have raised critical concerns about media authenticity and information integrity. Despite recent advances, deepfake detection models often struggle to generalize beyond their training distributions, particularly when applied to media content found in the wild. In this work, we present a robust video deepfake detection framework with strong generalization that takes advantage of the rich facial representations learned by face foundation models. Our method is built on top of FSFM, a self-supervised model trained on real face data, and is further fine-tuned using an ensemble of deepfake datasets spanning both face-swapping and face-reenactment manipulations. To enhance discriminative power, we incorporate triplet loss variants during training, guiding the model to produce more separable embeddings between real and fake samples. Additionally, we explore attribution-based supervision schemes, where deepfakes are categorized by manipulation type or source dataset, to assess their impact on generalization. Extensive experiments across diverse evaluation benchmarks demonstrate the effectiveness of our approach, especially in challenging real-world scenarios.
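The triplet loss mentioned above pulls embeddings of same-class samples (e.g. real faces) together while pushing real and fake embeddings apart by at least a margin. A minimal sketch of the standard triplet-margin objective, assuming squared Euclidean distances on L2-normalized embeddings (the paper's specific loss variants and hyperparameters are not reproduced here):

```python
import numpy as np

def l2_normalize(v):
    return v / np.linalg.norm(v)

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Triplet margin loss: pull the anchor toward the positive
    (same class, e.g. real) and push it from the negative (fake)."""
    d_ap = np.sum((anchor - positive) ** 2)  # anchor-positive distance
    d_an = np.sum((anchor - negative) ** 2)  # anchor-negative distance
    return float(max(d_ap - d_an + margin, 0.0))

# Toy 3-D embeddings: two nearby "real" faces and one distant "fake".
real_a = l2_normalize(np.array([1.0, 0.1, 0.0]))
real_b = l2_normalize(np.array([0.9, 0.2, 0.0]))
fake   = l2_normalize(np.array([0.0, 0.1, 1.0]))

loss_separated = triplet_loss(real_a, real_b, fake)  # classes already separated
loss_violated  = triplet_loss(real_a, fake, real_b)  # margin violated, positive loss
```

When real and fake embeddings are already separated by more than the margin, the loss is zero and the triplet contributes no gradient; only violating triplets drive the embedding space apart.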


Key Contributions

  • Deepfake detection framework built on FSFM (a self-supervised face foundation model) fine-tuned across diverse face-swapping and face-reenactment datasets for cross-dataset generalization
  • Integration of triplet loss variants to produce more separable embeddings between real and fake samples, improving discriminative power
  • Attribution-based supervision schemes that categorize deepfakes by manipulation type or source dataset to assess their impact on generalization
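One way to read the attribution-based supervision idea above is as relabeling: instead of a single binary real/fake target, each fake receives a class per manipulation type (or per source dataset), which is then collapsed back to a binary decision at inference. A hypothetical sketch (the label scheme and names below are illustrative, not taken from the paper):

```python
# Hypothetical attribution-based label scheme: fakes are split into
# classes by manipulation type instead of one generic "fake" class.
MANIPULATION_TYPE = {
    "real": 0,
    "face_swap": 1,
    "face_reenactment": 2,
}

def attribution_label(sample_source: str) -> int:
    """Map a sample's manipulation tag to a multi-class training label."""
    return MANIPULATION_TYPE[sample_source]

def is_fake(label: int) -> bool:
    """Collapse the multi-class attribution label to a binary decision."""
    return label != MANIPULATION_TYPE["real"]
```

The same pattern applies to per-dataset attribution: swap the manipulation-type keys for source-dataset names and the inference-time collapse is unchanged.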

🛡️ Threat Analysis

Output Integrity Attack

Proposes a novel detection architecture for AI-generated face content (deepfakes) — both face-swapping and face-reenactment — with a focus on generalization beyond training distributions. This is a content authenticity/output integrity contribution, not a mere domain application of existing detectors.


Details

Domains: vision
Model Types: transformer
Threat Tags: inference_time
Applications: deepfake detection, video content authentication, media integrity verification