Knowledge-Guided Prompt Learning for Deepfake Facial Image Detection
Hao Wang, Cheng Deng, Zhidong Zhao
Published on arXiv (2501.00700)
Output Integrity Attack
OWASP ML Top 10 — ML09
Key Finding
The proposed KGP-TTP method notably outperforms state-of-the-art deepfake detection methods on the DeepFakeFaceForensics dataset by combining LLM-derived forgery knowledge with test-time prompt adaptation.
KGP-TTP (Knowledge-Guided Prompt Learning with Test-Time Prompt Tuning)
Novel technique introduced
Recent generative models demonstrate impressive performance in synthesizing photographic images, making it hard for humans to distinguish them from pristine ones, especially in the case of realistic-looking synthetic facial images. Previous works mostly focus on mining discriminative artifacts from vast amounts of visual data. However, they usually lack exploration of prior knowledge and rarely pay attention to the domain shift between training categories (e.g., natural and indoor objects) and testing ones (e.g., fine-grained human facial images), resulting in unsatisfactory detection performance. To address these issues, we propose a novel knowledge-guided prompt learning method for deepfake facial image detection. Specifically, we retrieve forgery-related prompts from large language models as expert knowledge to guide the optimization of learnable prompts. In addition, we elaborate test-time prompt tuning to alleviate the domain shift, achieving significant performance improvements and facilitating application in real-world scenarios. Extensive experiments on the DeepFakeFaceForensics dataset show that our proposed approach notably outperforms state-of-the-art methods.
Key Contributions
- Knowledge-Guided Prompt Learning (KGP) that retrieves forgery-related concepts from GPT-4 to construct semantically meaningful CLIP prompts for deepfake detection
- Test-Time Prompt Tuning (TTP) using pseudo-labels to alleviate domain shift between training categories (natural/indoor objects) and testing categories (fine-grained human faces)
- State-of-the-art performance on the DeepFakeFaceForensics dataset
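To make the KGP idea concrete, the sketch below shows how LLM-derived forgery concepts could be composed into CLIP-style text prompts and used for real/fake classification by prompt similarity. This is a minimal illustration, not the paper's implementation: the concept lists are hypothetical examples of what GPT-4 might return, and the text encoder is a deterministic stand-in for CLIP's.

```python
import numpy as np

# Hypothetical forgery-related concepts, illustrative of what an LLM such as
# GPT-4 might return when queried for expert forgery knowledge (assumption).
FORGERY_CONCEPTS = ["blending boundaries", "unnatural skin texture", "asymmetric lighting"]
REAL_CONCEPTS = ["natural skin texture", "consistent lighting", "coherent facial geometry"]

def build_prompts(class_name, concepts):
    """Compose CLIP-style text prompts enriched with expert concepts."""
    return [f"a photo of a {class_name} face, showing {c}" for c in concepts]

def encode_text(prompts, offset, dim=8):
    """Stand-in text encoder mapping each prompt to a distinct basis vector.
    A real system would use CLIP's text encoder here."""
    embs = np.zeros((len(prompts), dim))
    for i in range(len(prompts)):
        embs[i, offset + i] = 1.0
    return embs

def classify(image_emb, real_embs, fake_embs):
    """Assign the class whose prompts have the highest mean cosine similarity."""
    image_emb = image_emb / np.linalg.norm(image_emb)
    s_real = float(np.mean(real_embs @ image_emb))
    s_fake = float(np.mean(fake_embs @ image_emb))
    return ("fake" if s_fake > s_real else "real"), (s_real, s_fake)

real_embs = encode_text(build_prompts("real", REAL_CONCEPTS), offset=0)
fake_embs = encode_text(build_prompts("fake", FORGERY_CONCEPTS), offset=3)
# Simulated image embedding lying near the fake-prompt centroid:
image_emb = fake_embs.mean(axis=0)
label, scores = classify(image_emb, real_embs, fake_embs)
```

In the actual method the concept-enriched prompts guide the optimization of learnable prompt vectors rather than being used directly as fixed classifiers.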
🛡️ Threat Analysis
Proposes a deepfake facial image detection system that directly addresses AI-generated content detection, a canonical ML09 output integrity concern. The method authenticates whether images are real or synthetically generated.