benchmark 2025

Towards Real-World Deepfake Detection: A Diverse In-the-wild Dataset of Forgery Faces

Junyu Shi 1, Minghui Li 1, Junguo Zuo 1, Zhifei Yu 1, Yipeng Lin 1, Shengshan Hu 1, Ziqi Zhou 1, Yechao Zhang 1, Wei Wan 1, Yinzhe Xu 1, Leo Yu Zhang 2

0 citations · 58 references · arXiv


Published on arXiv · 2510.08067

Output Integrity Attack

OWASP ML Top 10 — ML09

Key Finding

Extensive experiments show that existing deepfake detection schemes have severely limited practicality when evaluated against real-world commercial-platform deepfakes, especially under social network dissemination conditions.

RedFace

Novel technique introduced


Deepfakes, leveraging advanced AIGC (Artificial Intelligence-Generated Content) techniques, create hyper-realistic synthetic images and videos of human faces, posing a significant threat to the authenticity of social media. While this real-world threat is increasingly prevalent, existing academic evaluations and benchmarks for deepfake forgery detection often fall short of effective application due to their lack of specificity, limited deepfake diversity, and restricted manipulation techniques. To address these limitations, we introduce RedFace (Real-world-oriented Deepfake Face), a specialized facial deepfake dataset comprising over 60,000 forged images and 1,000 manipulated videos derived from authentic facial features, bridging the gap between academic evaluations and real-world needs. Unlike prior benchmarks, which typically rely on academic methods to generate deepfakes, RedFace utilizes 9 commercial online platforms to integrate the latest deepfake technologies found "in the wild", effectively simulating real-world black-box scenarios. Moreover, RedFace's deepfakes are synthesized using bespoke algorithms, allowing it to capture the diverse and evolving methods used by real-world deepfake creators. Extensive experimental results on RedFace (including cross-domain, intra-domain, and real-world social network dissemination simulations) verify the limited practicality of existing deepfake detection schemes in real-world applications. We further perform a detailed analysis of the RedFace dataset, elucidating the reasons for its impact on detection performance compared to conventional datasets. Our dataset is available at: https://github.com/kikyou-220/RedFace.


Key Contributions

  • RedFace dataset: 60,000+ forged images and 1,000 manipulated videos sourced from 9 commercial deepfake platforms, capturing real-world in-the-wild forgery diversity
  • Comprehensive benchmark evaluation (cross-domain, intra-domain, and social network dissemination simulation) revealing that existing academic deepfake detectors fail significantly in real-world scenarios
  • Detailed analysis of dataset characteristics explaining why RedFace degrades detection performance compared to conventional academic datasets
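The social network dissemination simulation mentioned above typically models the lossy transformations (downscaling, JPEG recompression) that platforms apply on upload, which degrade the forgery artifacts detectors rely on. A minimal sketch of such a pipeline is shown below; the Pillow-based implementation, the `quality=75` setting, and the `max_side=512` cap are illustrative assumptions, not the paper's actual configuration.

```python
# Hypothetical sketch of a social-network dissemination simulation:
# downscale an image and round-trip it through lossy JPEG encoding,
# two transformations upload pipelines commonly apply. Parameter
# values here are assumptions for illustration only.
import io

from PIL import Image


def simulate_dissemination(img: Image.Image,
                           quality: int = 75,
                           max_side: int = 512) -> Image.Image:
    """Return an image degraded as a social platform might degrade it."""
    # Resize so the longer side is at most max_side, preserving aspect ratio.
    scale = max_side / max(img.size)
    if scale < 1:
        img = img.resize((int(img.width * scale), int(img.height * scale)))
    # Round-trip through lossy JPEG encoding to introduce compression artifacts.
    buf = io.BytesIO()
    img.convert("RGB").save(buf, format="JPEG", quality=quality)
    buf.seek(0)
    return Image.open(buf)


# Example: a 1024x1024 source image comes back downscaled and JPEG-encoded.
original = Image.new("RGB", (1024, 1024), color=(128, 64, 32))
disseminated = simulate_dissemination(original)
print(disseminated.size, disseminated.format)
```

Evaluating a detector on both the pristine and the disseminated copies of the same forgeries is one way to measure how much real-world transmission erodes detection accuracy.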

🛡️ Threat Analysis

Output Integrity Attack

The paper's core contribution is a large-scale benchmark dataset for AI-generated face (deepfake) detection, directly targeting the problem of verifying the authenticity of AI-generated content. It evaluates existing deepfake detection schemes and demonstrates their failure in real-world conditions — a core output integrity concern.


Details

Domains
vision
Model Types
cnn · transformer · gan · diffusion
Threat Tags
black_box · inference_time
Datasets
RedFace
Applications
deepfake detection · facial forgery detection · social media content authenticity