Leveraging Unlabeled Data from Unknown Sources via Dual-Path Guidance for Deepfake Face Detection
Zhiqiang Yang 1,2, Renshuai Tao 1, Chunjie Zhang 1, Guodong Yang 2, Xiaolong Zheng 2, Yao Zhao 1
Published on arXiv
2508.09022
Output Integrity Attack
OWASP ML Top 10 — ML09
Key Finding
DPGNet significantly outperforms existing deepfake detectors on multiple mainstream benchmarks by leveraging unlabeled data from unknown generative sources
DPGNet
Novel technique introduced
Existing deepfake detection methods rely heavily on static labeled datasets. However, with the proliferation of generative models, real-world scenarios are flooded with massive amounts of unlabeled fake face data from unknown sources. This presents a critical dilemma: detectors that rely solely on existing data fail to generalize, while manually labeling the new stream is infeasible because the fakes are highly realistic. A more fundamental challenge is that, unlike typical unsupervised learning tasks where categories are clearly defined, real and fake faces share the same semantics, which degrades the performance of traditional unsupervised strategies. There is therefore an urgent need for a new paradigm designed specifically for this scenario that can make effective use of unlabeled data. Accordingly, this paper proposes a dual-path guided network (DPGNet) to address two key challenges: (1) bridging the domain differences between faces generated by different generative models; and (2) utilizing unlabeled image samples. The method comprises two core modules: text-guided cross-domain alignment, which uses learnable cues to unify visual and textual embeddings into a domain-invariant feature space; and curriculum-driven pseudo-label generation, which dynamically utilizes unlabeled samples. Extensive experiments on multiple mainstream datasets show that DPGNet significantly outperforms existing techniques, highlighting its effectiveness at using unlabeled data to address the challenges posed by deepfakes.
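To make the idea of text-guided alignment concrete, the following is a minimal sketch, not the authors' implementation. It assumes a CLIP-style setup in which an image embedding is scored against one text embedding per class ("real", "fake") by cosine similarity. Reading the abstract's "learnable cues" as trainable prompt vectors is an assumption, and the function names and the temperature value are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def text_guided_probs(image_emb, text_embs, temperature=0.07):
    """Softmax over image-to-text cosine similarities.

    text_embs holds one embedding per class. In a trainable model these
    would come from a text encoder fed with learnable prompt vectors
    (an assumption; the abstract does not specify the mechanism).
    """
    logits = [cosine(image_emb, t) / temperature for t in text_embs]
    top = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(l - top) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy 2-D embeddings: the image lies closer to the first text embedding.
probs = text_guided_probs([0.9, 0.1], [[1.0, 0.0], [0.0, 1.0]])
```

Because the score depends only on the angle between embeddings, images from different generators that map near the same text anchor receive similar predictions, which is one plausible route to a shared feature space.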
Key Contributions
- DPGNet: a dual-path guided network for deepfake detection that leverages unlabeled fake face data from unknown generative sources
- Text-guided cross-domain alignment module that unifies visual and textual embeddings into a domain-invariant feature space using learnable cues
- Curriculum-driven pseudo-label generation that dynamically assigns labels to unlabeled samples to improve generalization
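The curriculum-driven pseudo-labeling contribution can be illustrated with a generic confidence-threshold scheme. This is a sketch under stated assumptions rather than the paper's algorithm: the linear threshold schedule, the default values 0.95 and 0.70, and both function names are hypothetical choices made for illustration.

```python
def curriculum_threshold(epoch, total_epochs, start=0.95, end=0.70):
    """Linearly relax the confidence threshold as training progresses.

    Early epochs admit only very confident predictions (easy samples);
    later epochs admit harder ones. The schedule is an assumption.
    """
    frac = min(max(epoch / total_epochs, 0.0), 1.0)
    return start * (1.0 - frac) + end * frac


def select_pseudo_labels(fake_probs, threshold):
    """Return (index, label) pairs for confidently predicted samples.

    fake_probs[i] is the model's predicted probability that unlabeled
    sample i is fake. Label 1 means fake, label 0 means real.
    """
    selected = []
    for i, p in enumerate(fake_probs):
        confidence = max(p, 1.0 - p)
        if confidence >= threshold:
            selected.append((i, 1 if p >= 0.5 else 0))
    return selected


# Hypothetical predictions on four unlabeled faces.
fake_probs = [0.99, 0.50, 0.02, 0.80]
early = select_pseudo_labels(fake_probs, 0.90)  # strict threshold
late = select_pseudo_labels(fake_probs, 0.75)   # relaxed threshold
```

With the strict threshold only the first and third samples receive pseudo-labels; relaxing it also admits the fourth, while the ambiguous second sample stays unlabeled in both cases.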
🛡️ Threat Analysis
Deepfake face detection is a canonical AI-generated content detection problem. This paper proposes a novel detection architecture (DPGNet) — not merely applying existing detectors — making it a genuine ML09 output-integrity contribution.