defense 2025

Knowledge-Guided Prompt Learning for Deepfake Facial Image Detection

Hao Wang 1, Cheng Deng 2, Zhidong Zhao 1

0 citations · ICASSP

Published on arXiv

2501.00700

Output Integrity Attack

OWASP ML Top 10 — ML09

Key Finding

The proposed KGP-TTP notably outperforms state-of-the-art deepfake detection methods on the DeepFakeFaceForensics dataset by combining LLM-derived forgery knowledge with test-time prompt adaptation.

KGP-TTP (Knowledge-Guided Prompt Learning with Test-Time Prompt Tuning)

Novel technique introduced


Recent generative models synthesize photographic images so convincingly that humans can hardly distinguish them from pristine ones, especially in the case of realistic-looking synthetic facial images. Previous works mostly focus on mining discriminative artifacts from vast amounts of visual data. However, they usually do not exploit prior knowledge and rarely address the domain shift between training categories (e.g., natural and indoor objects) and testing ones (e.g., fine-grained human facial images), resulting in unsatisfactory detection performance. To address these issues, we propose a novel knowledge-guided prompt learning method for deepfake facial image detection. Specifically, we retrieve forgery-related prompts from large language models as expert knowledge to guide the optimization of learnable prompts. In addition, we design test-time prompt tuning to alleviate the domain shift, which significantly improves performance and makes the method more applicable to real-world scenarios. Extensive experiments on the DeepFakeFaceForensics dataset show that our proposed approach notably outperforms state-of-the-art methods.
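The idea of letting LLM-derived forgery knowledge guide learnable prompts can be illustrated with a toy objective. The sketch below is not the paper's loss, which this summary does not specify. It assumes a CLIP-style setup in which an image embedding is scored against one learnable prompt embedding per class (real, fake), and it models the guidance as a cosine-distance penalty that keeps each learnable prompt close to the embeddings of knowledge phrases for its class. The function name `kgp_loss` and the parameters `temperature` and `lam` are hypothetical, and the vectors stand in for real encoder outputs.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def softmax(logits):
    """Numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kgp_loss(image_feat, label, learnable, knowledge, temperature=0.07, lam=1.0):
    """Toy knowledge-guided objective (an assumption, not the paper's loss).

    image_feat: image embedding (stand-in for an image-encoder output)
    label:      0 = real, 1 = fake
    learnable:  one learnable prompt embedding per class
    knowledge:  per class, a list of embeddings of LLM-retrieved phrases
    Returns (total, cross_entropy, alignment_penalty).
    """
    # Classification term: similarity of the image to each class prompt.
    logits = [cosine(image_feat, p) / temperature for p in learnable]
    ce = -math.log(softmax(logits)[label])

    # Guidance term: mean cosine distance between each learnable prompt
    # and the knowledge embeddings of the same class.
    align = 0.0
    for prompt, phrases in zip(learnable, knowledge):
        align += sum(1.0 - cosine(prompt, k) for k in phrases) / len(phrases)
    align /= len(learnable)

    return ce + lam * align, ce, align
```

In this sketch, `lam` trades off fitting the labeled training images against staying close to the expert knowledge; setting it to zero recovers plain prompt learning.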


Key Contributions

  • Knowledge-Guided Prompt Learning (KGP) that retrieves forgery-related concepts from GPT-4 to construct semantically meaningful CLIP prompts for deepfake detection
  • Test-Time Prompt Tuning (TTP) using pseudo-labels to alleviate domain shift between training categories (natural/indoor objects) and testing categories (fine-grained human faces)
  • State-of-the-art performance on the DeepFakeFaceForensics dataset

🛡️ Threat Analysis

Output Integrity Attack

The paper proposes a deepfake facial image detection system, which directly addresses AI-generated content detection, a canonical ML09 output integrity concern. The method determines whether an image is real or synthetically generated.


Details

Domains
vision, nlp
Model Types
vlm, transformer, llm
Threat Tags
inference_time
Datasets
DeepFakeFaceForensics
Applications
deepfake detection, facial image authentication