benchmark 2026

DiffFace-Edit: A Diffusion-Based Facial Dataset for Forgery-Semantic Driven Deepfake Detection Analysis

Feng Ding, Wenhui Yi, Xinan He, Mengyao Xiao, Jianfeng Xu, Jianqiang Du

1 citation · 23 references · arXiv


Published on arXiv

2601.13551

Output Integrity Attack

OWASP ML Top 10 — ML09

Key Finding

Introduces over 2 million diffusion-generated, partially edited face images with fine-grained region annotations, and presents the first systematic study showing that detector-evasive splice samples significantly degrade the performance of image manipulation detection and localization (IMDL) locators.

DiffFace-Edit

Novel technique introduced


Generative models now produce imperceptible, fine-grained manipulated faces, posing significant privacy risks. However, existing AI-generated face datasets generally lack focus on samples with fine-grained regional manipulations. Furthermore, no researchers have yet studied the real impact of splice attacks, which occur between real and manipulated samples, on detectors. We refer to these as detector-evasive samples. Based on this, we introduce the DiffFace-Edit dataset, which has the following advantages: 1) It contains over two million AI-generated fake images. 2) It features edits across eight facial regions (e.g., eyes, nose) and includes a richer variety of editing combinations, such as single-region and multi-region edits. Additionally, we specifically analyze the impact of detector-evasive samples on detection models. We conduct a comprehensive analysis of the dataset and propose a cross-domain evaluation that combines IMDL methods. Dataset will be available at https://github.com/ywh1093/DiffFace-Edit.
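To make the notion of a splice between a real and a manipulated sample concrete, the following is a minimal sketch and not the authors' pipeline. It assumes a splice means copying pixels from an edited image into the real image wherever a binary region mask is set. The function name `splice`, the nested-list image representation, and the toy pixel values are all illustrative assumptions.

```python
from typing import List

# Illustrative types (assumptions, not from the paper):
Image = List[List[int]]  # grayscale image as rows of pixel values
Mask = List[List[int]]   # 1 = take the edited pixel, 0 = keep the real pixel


def splice(real: Image, edited: Image, mask: Mask) -> Image:
    """Composite edited pixels into a real image wherever mask == 1."""
    if not (len(real) == len(edited) == len(mask)):
        raise ValueError("real, edited and mask must have the same height")
    out = []
    for r_row, e_row, m_row in zip(real, edited, mask):
        if not (len(r_row) == len(e_row) == len(m_row)):
            raise ValueError("real, edited and mask must have the same width")
        # Per pixel: edited value inside the masked region, real value outside it.
        out.append([e if m else r for r, e, m in zip(r_row, e_row, m_row)])
    return out


# Toy 2x3 example: only the masked pixels come from the edited image.
real = [[10, 10, 10], [10, 10, 10]]
edited = [[99, 99, 99], [99, 99, 99]]
mask = [[0, 1, 0], [0, 0, 1]]
spliced = splice(real, edited, mask)
```

In this sketch most of the output is untouched real content, which is one plausible reason such samples could be harder for detectors trained on fully generated images.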


Key Contributions

  • DiffFace-Edit dataset: 2M+ AI-generated partially-edited face images across 8 facial regions with single- and multi-region manipulation annotations
  • First systematic dataset-level analysis of detector-evasive samples (splice attacks between real and manipulated regions) and their impact on IMDL detectors
  • Cross-domain evaluation framework integrating 8 IMDL locators to benchmark generalization of forgery detection methods
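
The summary does not state which metrics the locator benchmark reports. As a hedged illustration of how a predicted manipulation mask is commonly scored against a ground-truth region annotation, the sketch below computes pixel-level F1 and IoU. The function name `mask_scores` and the toy masks are assumptions made for this example.

```python
from typing import List, Tuple

Mask = List[List[int]]  # binary mask: 1 = pixel marked as manipulated


def mask_scores(pred: Mask, truth: Mask) -> Tuple[float, float]:
    """Return (pixel-level F1, IoU) for a predicted mask against the ground truth."""
    tp = fp = fn = 0
    for p_row, t_row in zip(pred, truth):
        for p, t in zip(p_row, t_row):
            if p and t:
                tp += 1      # manipulated pixel correctly localized
            elif p and not t:
                fp += 1      # real pixel wrongly flagged
            elif t and not p:
                fn += 1      # manipulated pixel missed
    union = tp + fp + fn
    if union == 0:
        # Both masks are empty: treat as perfect agreement (a convention, not a standard).
        return 1.0, 1.0
    f1 = 2 * tp / (2 * tp + fp + fn)
    iou = tp / union
    return f1, iou


truth = [[0, 1, 1], [0, 0, 1]]
pred = [[0, 1, 0], [0, 1, 1]]
f1, iou = mask_scores(pred, truth)
```

Comparing such scores on ordinary edited samples and on spliced samples is one simple way to quantify the degradation the paper describes.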

🛡️ Threat Analysis

Output Integrity Attack

This paper directly advances AI-generated content detection by providing a large-scale benchmark dataset of diffusion-based facial forgeries, including a systematic analysis of detector-evasive samples that degrade current IMDL detectors. It falls squarely within content integrity and deepfake detection.


Details

Domains
vision, generative
Model Types
diffusion
Threat Tags
inference_time, digital
Datasets
DiffFace-Edit, CIFAKE, GenImage, ForgeryNet, DiffusionForensics, DiffusionDeepfake
Applications
deepfake detection, facial forgery detection, image manipulation detection and localization