Knowledge-Guided Prompt Learning for Deepfake Facial Image Detection

Recent generative models synthesize photographic images so convincingly that humans can hardly distinguish them from pristine ones, especially in the case of realistic-looking synthetic facial images. Previous works mostly focus on mining discriminative artifacts from vast amounts of visual data. However, they usually neglect prior knowledge and rarely address the domain shift between training categories (e.g., natural and indoor objects) and testing categories (e.g., fine-grained human facial images), resulting in unsatisfactory detection performance. To address these issues, we propose a novel knowledge-guided prompt learning method for deepfake facial image detection. Specifically, we retrieve forgery-related prompts from large language models as expert knowledge to guide the optimization of learnable prompts. In addition, we develop test-time prompt tuning to alleviate the domain shift, which significantly improves performance and makes the method more practical in real-world scenarios. Extensive experiments on the DeepFakeFaceForensics dataset show that our proposed approach notably outperforms state-of-the-art methods.
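
The idea of letting expert-knowledge prompts guide learnable prompts can be illustrated with a small sketch. This summary does not give the paper's actual objective, so everything below is an assumption made for illustration: the function name `guided_loss`, the cosine-distance guidance term, the weight `lam`, and the toy 2-D embeddings standing in for CLIP features. It shows one plausible form, a CLIP-style classification loss plus a penalty that pulls each learnable prompt embedding toward its knowledge-derived counterpart.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def softmax(logits):
    """Numerically stable softmax."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def guided_loss(image_feat, learnable, knowledge, label, temperature=0.07, lam=1.0):
    """Hypothetical objective: cross-entropy plus a knowledge-guidance penalty.

    learnable[c] and knowledge[c] are the embeddings of the learnable prompt
    and the LLM-derived prompt for class c. The cosine-distance penalty is an
    assumption, not necessarily the term used in the paper.
    """
    # CLIP-style logits: scaled cosine similarity between image and class prompts.
    logits = [cosine(image_feat, p) / temperature for p in learnable]
    probs = softmax(logits)
    ce = -math.log(probs[label])
    # Mean cosine distance between each learnable prompt and its knowledge prompt.
    guidance = sum(1.0 - cosine(p, k) for p, k in zip(learnable, knowledge)) / len(learnable)
    return ce + lam * guidance, ce, guidance


# Toy 2-D embeddings: index 0 = "real", index 1 = "fake".
image_feat = [1.0, 0.0]
learnable = [[1.0, 0.0], [0.0, 1.0]]
aligned_knowledge = [[1.0, 0.0], [0.0, 1.0]]
total, ce, guidance = guided_loss(image_feat, learnable, aligned_knowledge, label=0)
```

When the learnable prompts already coincide with the knowledge prompts, the guidance term vanishes and only the classification loss remains. As the learnable prompts drift away from the expert knowledge during training, the penalty grows, which is the sense in which the knowledge "guides" optimization in this sketch.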

Key Contributions

Knowledge-Guided Prompt Learning (KGP), which retrieves forgery-related concepts from GPT-4 to construct semantically meaningful CLIP prompts for deepfake detection
Test-Time Prompt Tuning (TTP) using pseudo-labels to alleviate domain shift between training categories (natural/indoor objects) and testing categories (fine-grained human faces)
State-of-the-art performance on the DeepFakeFaceForensics dataset

🛡️ Threat Analysis

Output Integrity Attack

Proposes a deepfake facial image detection system that directly addresses AI-generated content detection, a canonical ML09 output integrity concern. The method determines whether an image is real or synthetically generated.

Details

Domains

vision, nlp

Model Types

vlm, transformer, llm

Threat Tags

inference_time

Datasets

DeepFakeFaceForensics

Applications

2025 0 cit.

Output Integrity Attack

75%