Fine-Grained DINO Tuning with Dual Supervision for Face Forgery Detection
Tianxiang Zhang, Peipeng Yu, Zhihua Xia, Longchen Dai, Xiaoyu Zhou, Hui Gao
Published on arXiv
2511.12107
Output Integrity Attack
OWASP ML Top 10 — ML09
Key Finding
Achieves best overall cross-manipulation performance on DF40 and competitive cross-dataset accuracy using only 3.5M trainable parameters, matching or outperforming more complex state-of-the-art methods.
DFF-Adapter (DeepFake Fine-Grained Adapter)
Novel technique introduced
The proliferation of sophisticated deepfakes poses significant threats to information integrity. While DINOv2 shows promise for detection, existing fine-tuning approaches treat detection as a generic binary classification task, overlooking the distinct artifacts inherent to different deepfake methods. To address this, we propose a DeepFake Fine-Grained Adapter (DFF-Adapter) for DINOv2. Our method incorporates lightweight multi-head LoRA modules into every transformer block, enabling efficient backbone adaptation. DFF-Adapter jointly performs authenticity detection and fine-grained manipulation type classification, where classifying forgery methods enhances artifact sensitivity. We introduce a shared branch that propagates fine-grained manipulation cues to the authenticity head. This enables multi-task cooperative optimization, explicitly enhancing authenticity discrimination with manipulation-specific knowledge. Using only 3.5M trainable parameters, our parameter-efficient approach achieves detection accuracy comparable to or even surpassing that of current complex state-of-the-art methods.
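The dual supervision described in the abstract can be illustrated with a small sketch. The abstract does not give the loss formulation, so this assumes a plain sum of two cross-entropy terms (real/fake and manipulation type). The function names, the `type_weight` balancing factor, and the convention that the type head has one class per manipulation method are all hypothetical. The shared branch that feeds manipulation cues into the authenticity head is not modeled here.

```python
import math


def cross_entropy(logits, target):
    """Cross-entropy of one sample: -log softmax(logits)[target]."""
    peak = max(logits)  # subtract the max for numerical stability
    log_sum_exp = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum_exp - logits[target]


def dual_supervision_loss(auth_logits, type_logits, is_fake, forgery_type,
                          type_weight=1.0):
    """Joint loss: real/fake term plus a weighted manipulation-type term.

    type_weight is a hypothetical balancing factor, not a value from the paper.
    """
    auth_loss = cross_entropy(auth_logits, is_fake)
    type_loss = cross_entropy(type_logits, forgery_type)
    return auth_loss + type_weight * type_loss


# Toy sample: a fake image (label 1) produced by manipulation class 2 of 4.
auth_logits = [0.2, 1.5]
type_logits = [0.1, -0.3, 2.0, 0.4]
loss = dual_supervision_loss(auth_logits, type_logits,
                             is_fake=1, forgery_type=2, type_weight=0.5)
```

Because both terms backpropagate into the same adapted backbone, gradients from the type classifier can shape the features that the authenticity head also uses, which is the intuition behind the claim that classifying forgery methods enhances artifact sensitivity.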
Key Contributions
- DFF-Adapter: lightweight multi-head LoRA modules inserted into every DINOv2 transformer block for parameter-efficient deepfake-specific fine-tuning (3.5M trainable parameters)
- Forgery-Aware Multi-Head Router that partitions transformer features into subspaces and dynamically routes each subspace to a top-3 set of LoRA experts for fine-grained artifact mining
- Shared branch architecture that propagates fine-grained forgery-type cues to the authenticity detection head via multi-task cooperative optimization
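The routing idea in the second contribution can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not the authors' implementation: the linear gate, the softmax renormalization over the selected experts, a separate expert pool per subspace, and all sizes (`DIM`, `HEADS`, `EXPERTS`, `RANK`) are hypothetical. Only the subspace partitioning and the top-3 selection come from the summary above.

```python
import math
import random

TOP_K = 3  # the summary states each subspace is routed to a top-3 set of experts


def matvec(matrix, vector):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows, cols, rng, scale=0.1):
    """Small random matrix used for toy gates and expert weights."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class LoRAExpert:
    """Low-rank update delta = B @ (A @ x), with A (rank x dim) and B (dim x rank)."""

    def __init__(self, dim, rank, rng):
        self.A = random_matrix(rank, dim, rng)
        # Standard LoRA initialises B to zeros; random here so the demo is nonzero.
        self.B = random_matrix(dim, rank, rng)

    def __call__(self, x):
        return matvec(self.B, matvec(self.A, x))


def route_subspace(x_sub, gate, experts, top_k=TOP_K):
    """Send one feature subspace to its top-k experts and mix their outputs."""
    scores = matvec(gate, x_sub)  # one score per expert
    chosen = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)[:top_k]
    weights = softmax([scores[i] for i in chosen])  # renormalise over chosen experts
    mixed = [0.0] * len(x_sub)
    for weight, idx in zip(weights, chosen):
        for j, value in enumerate(experts[idx](x_sub)):
            mixed[j] += weight * value
    return mixed, chosen


def multi_head_lora_delta(x, gates, experts_per_head):
    """Split x into equal subspaces, route each one, and concatenate the updates."""
    num_heads = len(gates)
    sub_dim = len(x) // num_heads
    delta, routing = [], []
    for h in range(num_heads):
        x_sub = x[h * sub_dim:(h + 1) * sub_dim]
        mixed, chosen = route_subspace(x_sub, gates[h], experts_per_head[h])
        delta.extend(mixed)
        routing.append(chosen)
    return delta, routing


rng = random.Random(0)
DIM, HEADS, EXPERTS, RANK = 8, 2, 4, 2  # toy sizes, not the paper's configuration
SUB = DIM // HEADS
gates = [random_matrix(EXPERTS, SUB, rng) for _ in range(HEADS)]
experts_per_head = [[LoRAExpert(SUB, RANK, rng) for _ in range(EXPERTS)]
                    for _ in range(HEADS)]
x = [rng.uniform(-1, 1) for _ in range(DIM)]
delta, routing = multi_head_lora_delta(x, gates, experts_per_head)
```

In a real adapter the resulting `delta` would be added to the output of a frozen DINOv2 layer, so only the gates and low-rank matrices are trained. Routing each subspace independently lets different slices of the feature vector specialize in different forgery artifacts.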
🛡️ Threat Analysis
The paper's primary contribution is a novel deepfake detection architecture that identifies AI-generated/manipulated face content. Deepfake detection is explicitly covered under ML09 (AI-generated content detection / output integrity). The paper proposes a novel architectural approach (DFF-Adapter with multi-head LoRA, forgery-aware routing, shared cross-task branch) rather than merely applying existing methods to a domain.