Published on arXiv

2509.05592

Output Integrity Attack

OWASP ML Top 10 — ML09

Key Finding

MFFI outperforms existing public datasets on scene complexity, cross-domain generalization, and detection difficulty gradients across benchmark evaluations.

MFFI

Novel technique introduced


Rapid advances in Artificial Intelligence Generated Content (AIGC) have enabled increasingly sophisticated face forgeries, posing a significant threat to social security. However, current deepfake detection methods are limited by constraints in existing datasets, which lack the diversity necessary in real-world scenarios. Specifically, these datasets fall short in four key areas: coverage of unknown advanced forgery techniques, variability of facial scenes, richness of real data, and degradation from real-world propagation. To address these challenges, we propose the Multi-dimensional Face Forgery Image (MFFI) dataset, tailored for real-world scenarios. MFFI enhances realism along four strategic dimensions: 1) Wider Forgery Methods; 2) Varied Facial Scenes; 3) Diversified Authentic Data; 4) Multi-level Degradation Operations. MFFI integrates 50 different forgery methods and contains 1,024K image samples. Benchmark evaluations show that MFFI outperforms existing public datasets in terms of scene complexity, cross-domain generalization capability, and detection difficulty gradients. These results validate the technical advance and practical utility of MFFI in simulating real-world conditions. The dataset and additional details are publicly available at https://github.com/inclusionConf/MFFI.


Key Contributions

  • MFFI dataset with 1,024K images covering 50 distinct face forgery methods across 6 major categories (face swapping, reenactment, synthesis, editing, super-resolution, and manual Photoshop editing)
  • Four realism dimensions: wider forgery methods, varied facial scenes, diversified authentic data, and multi-level degradation simulating real-world propagation artifacts
  • Benchmark evaluations demonstrating superior scene complexity, cross-domain generalization capability, and detection difficulty gradients compared to existing public datasets
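
To make the "multi-level degradation" dimension concrete, here is a minimal sketch of how graded degradation could be applied to an image. This summary does not list the operations MFFI actually uses, so everything here is an assumption for illustration: the function names, the two operations (resolution loss and value quantization as a crude stand-in for lossy compression), and the `LEVELS` table are all hypothetical. Images are plain nested lists of grayscale values so the sketch needs no third-party libraries.

```python
from typing import Dict, List, Tuple

Image = List[List[int]]  # grayscale image, pixel values in 0..255


def downscale_upscale(img: Image, factor: int) -> Image:
    """Simulate resolution loss: sample one pixel per factor x factor block
    and repeat it across the block (nearest-neighbour down/up sampling)."""
    h, w = len(img), len(img[0])
    return [
        [img[(y // factor) * factor][(x // factor) * factor] for x in range(w)]
        for y in range(h)
    ]


def quantize(img: Image, step: int) -> Image:
    """Crude stand-in for lossy compression: snap values down to multiples of step."""
    return [[(p // step) * step for p in row] for row in img]


# Hypothetical degradation levels: (downscale factor, quantization step).
# These are NOT the dataset's settings; level 0 leaves the image unchanged.
LEVELS: Dict[int, Tuple[int, int]] = {0: (1, 1), 1: (2, 8), 2: (4, 32)}


def degrade(img: Image, level: int) -> Image:
    """Apply both operations at the strength configured for `level`."""
    factor, step = LEVELS[level]
    return quantize(downscale_upscale(img, factor), step)


# Usage: a synthetic 8x8 gradient degraded at the strongest level.
sample = [[(x * 16 + y) % 256 for x in range(8)] for y in range(8)]
heavy = degrade(sample, 2)
```

Higher levels discard more spatial and tonal detail, which is one plausible way to build a difficulty gradient for detectors; a real pipeline would more likely use an imaging library with true resampling and codec-based compression.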

🛡️ Threat Analysis

Output Integrity Attack

Directly addresses deepfake face forgery detection — an AI-generated content detection problem explicitly covered by ML09. The MFFI dataset is purpose-built to benchmark and improve detectors that verify output authenticity and detect AI-generated faces.


Details

Domains
vision, generative
Model Types
gan, diffusion
Threat Tags
digital, inference_time
Datasets
MFFI, FaceForensics++, CelebDF, DF40
Applications
face forgery detection, deepfake detection