Defending Deepfake via Texture Feature Perturbation
Xiao Zhang 1, Changfang Chen 1, Tianyi Wang 2
Published on arXiv (2508.17315)
Output Integrity Attack
OWASP ML Top 10 — ML09
Key Finding
The texture-guided perturbation framework produces obvious visual defects in deepfake-generated outputs across multiple attack models (StarGAN, AttentionGAN, AttGAN, StarGAN-V2) while keeping perturbations imperceptible to human eyes in the protected source images.
The rapid development of Deepfake technology poses severe challenges to social trust and information security. Most existing detection methods rely on passive analysis, which struggles against high-quality Deepfake content; proactive defense has therefore emerged, inserting invisible signals into images before any editing takes place. In this paper, we introduce a proactive Deepfake detection approach based on facial texture features. Since human eyes are more sensitive to perturbations in smooth regions, we invisibly insert perturbations into texture regions of low perceptual saliency, applying localized perturbations to key texture regions while minimizing unwanted noise in non-textured areas. Our texture-guided perturbation framework first extracts preliminary texture features via Local Binary Patterns (LBP), then introduces a dual-model attention strategy to generate and optimize texture perturbations. Experiments on the CelebA-HQ and LFW datasets demonstrate that our method effectively distorts Deepfake generation, producing obvious visual defects under multiple attack models and providing an efficient and scalable solution for proactive Deepfake detection.
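The LBP texture-extraction step described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the 3x3 LBP encoding is standard, but the choice to mark "textured" pixels as those whose code differs from the all-ones uniform pattern is an assumed heuristic.

```python
import numpy as np

def lbp_texture_map(gray):
    """Compute 3x3 Local Binary Pattern codes for each interior pixel.

    gray: 2-D float array. Returns a uint8 array of LBP codes with
    shape (h-2, w-2); perfectly smooth regions encode to 255.
    """
    h, w = gray.shape
    center = gray[1:h - 1, 1:w - 1]
    # 8 neighbours in clockwise order starting from the top-left.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros(center.shape, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = gray[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        codes |= (neighbour >= center).astype(np.uint8) << bit
    return codes

def texture_mask(gray):
    """Assumed heuristic: a pixel is 'textured' if its LBP code is not
    the uniform all-ones pattern produced by flat regions."""
    return (lbp_texture_map(gray) != 255).astype(np.float32)
```

In the full framework this preliminary texture map would be refined further before guiding perturbation placement.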
Key Contributions
- Texture-guided perturbation framework using Local Binary Patterns (LBP) to focus invisible perturbations on low-saliency texture regions, avoiding perceptually obvious noise in smooth areas.
- Dual-model attention strategy combining Grad-CAM feature attention maps with LBP texture maps to optimize perturbation placement and direction across multiple spatial scales.
- Multi-objective loss function balancing deepfake disruption effectiveness and visual quality, demonstrated across multiple Deepfake generation models on CelebA-HQ and LFW datasets.
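The paper does not spell out its loss terms here, but a multi-objective loss balancing disruption against visual quality typically combines a term that pushes the forgery model's output on the protected image away from its output on the clean image with a term that keeps the protected image close to the original. A minimal numpy sketch of one plausible form (term choices and weights are assumptions):

```python
import numpy as np

def total_loss(protected, original, fake_from_protected, fake_from_original,
               lambda_disrupt=1.0, lambda_visual=10.0):
    """Assumed multi-objective loss: minimize it to find a perturbation.

    Disruption term is negated MSE between the deepfake outputs, so
    minimizing the loss maximizes distortion of the forged result.
    Visual term is plain MSE, penalizing visible changes to the source.
    """
    disrupt = -np.mean((fake_from_protected - fake_from_original) ** 2)
    visual = np.mean((protected - original) ** 2)
    return lambda_disrupt * disrupt + lambda_visual * visual
```

In the actual framework the visual term would likely be weighted by the texture/attention maps so that noise in smooth regions is penalized more heavily.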
🛡️ Threat Analysis
Proactive defense against AI-generated deepfake content: the method embeds imperceptible perturbations in source images to disrupt deepfake generation models, causing visible defects in forged outputs. This is a content integrity/provenance protection technique in the deepfake defense space.
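The protection step itself, applying a budget-limited perturbation only within textured regions, can be sketched as below. The L-infinity clipping budget and masking scheme are assumptions for illustration, not details taken from the paper.

```python
import numpy as np

def apply_texture_perturbation(image, noise, texture_mask, epsilon=8 / 255):
    """Confine an optimized perturbation to textured pixels.

    image: float array in [0, 1]; noise: same-shape perturbation;
    texture_mask: same-shape {0, 1} map of textured pixels.
    The perturbation is clipped to an assumed L_inf budget epsilon,
    zeroed outside texture regions, and the result re-clipped to [0, 1].
    """
    delta = np.clip(noise, -epsilon, epsilon) * texture_mask
    return np.clip(image + delta, 0.0, 1.0)
```

Smooth (unmasked) pixels are left bit-identical to the source, which is what keeps the protected image visually indistinguishable from the original.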