Fairness-Aware Deepfake Detection: Leveraging Dual-Mechanism Optimization
Feng Ding 1, Wenhui Yi 1, Yunpeng Zhou 1, Xinan He 1,2, Hong Rao 1, Shu Hu 3
Published on arXiv
2511.10150
Output Integrity Attack
OWASP ML Top 10 — ML09
Key Finding
The proposed framework improves both inter-group and intra-group fairness across gender and race demographics while maintaining overall deepfake detection accuracy across domains, outperforming prior fairness-enhanced detectors.
Novel technique introduced
Dual-Mechanism Collaborative Optimization
Fairness is a core element in the trustworthy deployment of deepfake detection models, especially in digital identity security. Biases in detection models toward demographic groups, such as those defined by gender and race, may lead to systemic misjudgments, exacerbating the digital divide and social inequities. However, current fairness-enhanced detectors often improve fairness at the cost of detection accuracy. To address this challenge, we propose a dual-mechanism collaborative optimization framework. Our method integrates structural fairness decoupling and global distribution alignment: it first decouples channels sensitive to demographic groups at the architectural level, and then reduces the distance between the overall sample distribution and the distribution of each demographic group at the feature level. Experimental results demonstrate that, compared with other methods, our framework improves both inter-group and intra-group fairness while maintaining overall detection accuracy across domains.
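The global distribution alignment idea described above can be illustrated with a small sketch. The abstract does not state which distance the authors use, so the code below assumes the simplest possible choice: the squared Euclidean distance between the overall feature mean and each group's feature mean, averaged over groups. The function names `mean_vector` and `alignment_loss` are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def mean_vector(rows):
    """Component-wise mean of a non-empty list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def alignment_loss(features, groups):
    """Average squared Euclidean distance between the overall feature mean
    and the feature mean of each demographic group.

    Assumption: mean matching stands in for the paper's (unspecified)
    distribution distance.
    """
    overall = mean_vector(features)

    # Collect the feature vectors belonging to each demographic group.
    by_group = defaultdict(list)
    for vec, g in zip(features, groups):
        by_group[g].append(vec)

    total = 0.0
    for rows in by_group.values():
        group_mean = mean_vector(rows)
        total += sum((a - b) ** 2 for a, b in zip(group_mean, overall))
    return total / len(by_group)


# Toy 2-D features for four samples.
features = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]]

# Groups whose means differ from the overall mean give a positive loss.
misaligned = alignment_loss(features, ["a", "a", "b", "b"])

# Groups whose means coincide with the overall mean give zero loss.
aligned = alignment_loss(features, ["a", "b", "b", "a"])

print(misaligned, aligned)
```

In a training setting, a term of this kind would typically be added to the detection loss so that group-level feature statistics are pulled toward the overall statistics. How the authors weight or combine the two objectives is not stated in the abstract.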
Key Contributions
- Structural fairness decoupling that separates demographic-sensitive channels at the architectural level to isolate group-specific biases
- Global distribution alignment module that reduces feature-space distance between the overall sample distribution and per-demographic-group distributions
- Demonstrated improvement in both inter-group and intra-group fairness while preserving cross-domain detection accuracy compared to prior fairness-enhanced detectors
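To make the notion of a fairness gap concrete, the sketch below computes one commonly used group fairness measure: the largest difference in false positive rate (real media wrongly flagged as fake) between demographic groups. This metric choice is an assumption made for illustration; the summary does not say which fairness metrics the paper reports, and the helper names `false_positive_rate` and `max_fpr_gap` are hypothetical.

```python
def false_positive_rate(labels, preds):
    """Fraction of real samples (label 0) that were predicted fake (pred 1)."""
    real_preds = [p for y, p in zip(labels, preds) if y == 0]
    if not real_preds:
        raise ValueError("no real samples in this group")
    return sum(real_preds) / len(real_preds)


def max_fpr_gap(labels, preds, groups):
    """Return (largest FPR difference between any two groups, per-group FPR)."""
    per_group = {}
    for g in set(groups):
        idx = [i for i, x in enumerate(groups) if x == g]
        per_group[g] = false_positive_rate(
            [labels[i] for i in idx],
            [preds[i] for i in idx],
        )
    rates = per_group.values()
    return max(rates) - min(rates), per_group


# Toy data: label 0 = real, 1 = fake; two demographic groups of four samples.
labels = [0, 0, 1, 1, 0, 0, 1, 1]
preds = [0, 0, 1, 1, 1, 0, 1, 0]
groups = ["g1"] * 4 + ["g2"] * 4

gap, per_group = max_fpr_gap(labels, preds, groups)
print(gap, per_group)
```

A gap of zero would mean every group sees the same rate of false alarms on real media. A fairness-aware detector aims to shrink this gap without lowering overall accuracy, which is the trade-off the contributions above target.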
🛡️ Threat Analysis
The paper's primary contribution is a new deepfake detection framework. Detection of AI-generated content, including deepfakes, is explicitly enumerated under ML09 (Output Integrity Attack). The dual-mechanism architecture targets equitable detection performance across demographic groups while maintaining overall detection accuracy.