Defense · 2025

Fairness-Aware Deepfake Detection: Leveraging Dual-Mechanism Optimization

Feng Ding 1, Wenhui Yi 1, Yunpeng Zhou 1, Xinan He 1,2, Hong Rao 1, Shu Hu 3

0 citations · 47 references · arXiv


Published on arXiv · 2511.10150

Output Integrity Attack

OWASP ML Top 10 — ML09

Key Finding

The proposed framework improves both inter-group and intra-group fairness across gender and race demographics while maintaining overall deepfake detection accuracy across domains, outperforming prior fairness-enhanced detectors.

Dual-Mechanism Collaborative Optimization

Novel technique introduced


Fairness is a core element in the trustworthy deployment of deepfake detection models, especially in the field of digital identity security. Biases in detection models toward different demographic groups, such as gender and race, may lead to systemic misjudgments, exacerbating the digital divide and social inequities. However, current fairness-enhanced detectors often improve fairness at the cost of detection accuracy. To address this challenge, we propose a dual-mechanism collaborative optimization framework. Our method integrates structural fairness decoupling with global distribution alignment: it decouples channels sensitive to demographic groups at the model's architectural level, then reduces the distance between the overall sample distribution and each demographic group's distribution at the feature level. Experimental results demonstrate that, compared with other methods, our framework improves both inter-group and intra-group fairness while maintaining overall detection accuracy across domains.
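The "structural fairness decoupling" step can be pictured as partitioning a detector's feature channels into a demographic-sensitive subset and a shared subset. The sketch below is an illustration only, not the paper's implementation: the index set `sensitive_idx` is a hypothetical stand-in for channels that some selection procedure has identified as group-sensitive, and the partition is done on a plain feature matrix rather than inside a network.

```python
import numpy as np

def decouple_channels(features, sensitive_idx):
    """Split feature channels into a demographic-sensitive subset and a
    shared subset.

    features:      (N, C) array of per-sample feature vectors.
    sensitive_idx: channel indices treated as demographic-sensitive
                   (hypothetical; how they are identified is the paper's
                   architectural contribution, not shown here).
    Returns (sensitive_features, shared_features).
    """
    num_channels = features.shape[1]
    sensitive = sorted(set(sensitive_idx))
    shared = [c for c in range(num_channels) if c not in set(sensitive)]
    return features[:, sensitive], features[:, shared]
```

Once separated, the shared channels can feed the real/fake classifier while the sensitive channels are regularized or discarded, isolating group-specific bias from the detection signal.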


Key Contributions

  • Structural fairness decoupling that separates demographic-sensitive channels at the architectural level to isolate group-specific biases
  • Global distribution alignment module that reduces feature-space distance between the overall sample distribution and per-demographic-group distributions
  • Demonstrated improvement in both inter-group and intra-group fairness while preserving cross-domain detection accuracy compared to prior fairness-enhanced detectors
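The global distribution alignment idea described above can be sketched as a simple moment-matching penalty: pull each demographic group's feature distribution toward the overall distribution. This is a minimal first-moment (mean-matching) illustration under assumed inputs, not the paper's actual loss, which operates on full distributions.

```python
import numpy as np

def alignment_loss(features, groups):
    """Mean-matching sketch of global distribution alignment.

    Penalizes the squared distance between each demographic group's mean
    feature vector and the overall mean feature vector, averaged over
    groups. A loss of 0 means every group's mean already coincides with
    the global mean.

    features: (N, D) array of per-sample feature vectors.
    groups:   (N,) array of demographic group labels.
    """
    overall_mean = features.mean(axis=0)
    group_labels = np.unique(groups)
    loss = 0.0
    for g in group_labels:
        group_mean = features[groups == g].mean(axis=0)
        loss += float(np.sum((group_mean - overall_mean) ** 2))
    return loss / len(group_labels)
```

Adding such a term to the detection objective discourages the feature extractor from encoding group identity, which is the intuition behind trading no accuracy for fairness: the real/fake signal is preserved while group-dependent offsets are penalized.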

🛡️ Threat Analysis

Output Integrity Attack

The paper's primary contribution is a new deepfake detection framework. AI-generated content detection (deepfake detection) is explicitly enumerated under ML09 (Output Integrity Attack). The novel dual-mechanism architecture targets equitable detection performance across demographic groups while maintaining overall detection accuracy.


Details

Domains
vision
Model Types
cnn, transformer
Threat Tags
digital, inference_time
Datasets
FaceForensics++, Celeb-DF
Applications
deepfake detection, digital identity security, facial forgery detection