Age-Diverse Deepfake Dataset: Bridging the Age Gap in Deepfake Detection

Deepfake detection is becoming harder as generation technology advances and deepfake videos and images grow more popular. Despite the many detection models available, demographic bias in deepfake datasets remains largely unaddressed. This paper mitigates age-specific bias by introducing an age-diverse deepfake dataset intended to improve fairness across age groups. The dataset is constructed through a modular pipeline that combines the existing Celeb-DF, FaceForensics++, and UTKFace datasets with synthetic data created to fill gaps in the age distribution. The effectiveness and generalizability of the dataset are evaluated with three deepfake detection models: XceptionNet, EfficientNet, and LipForensics. Across the evaluation metrics AUC, pAUC, and EER, models trained on the age-diverse dataset showed fairer performance across age groups, improved overall accuracy, and better generalization across datasets. This study contributes a reproducible, fairness-aware deepfake dataset and model pipeline that can serve as a foundation for future research in fairer deepfake detection. The complete dataset and implementation code are available at https://github.com/unishajoshi/age-diverse-deepfake-detection.
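
To make the idea of filling age distribution gaps concrete, the following is a minimal sketch of how one might work out how many synthetic samples each age group needs. It is not the authors' pipeline: the bin boundaries in `AGE_BINS`, the helper names, and the "match the largest bin" balancing rule are all assumptions chosen for illustration, and ages are assumed to be integers.

```python
from collections import Counter

# Hypothetical age bins (inclusive, integer ages); the paper's actual
# cohort boundaries are not stated in this summary.
AGE_BINS = [(0, 18), (19, 35), (36, 55), (56, 120)]


def bin_label(age):
    """Return the label of the bin containing `age`, e.g. '19-35'."""
    for lo, hi in AGE_BINS:
        if lo <= age <= hi:
            return f"{lo}-{hi}"
    raise ValueError(f"age {age} falls outside all bins")


def synthetic_quota(ages):
    """Extra samples each bin needs to match the most populated bin.

    Assumed balancing rule: bring every bin up to the size of the largest.
    """
    counts = Counter(bin_label(a) for a in ages)
    labels = [f"{lo}-{hi}" for lo, hi in AGE_BINS]
    target = max(counts.values())
    return {label: target - counts.get(label, 0) for label in labels}


# Toy age annotations, skewed toward young adults.
ages = [25, 30, 22, 28, 40, 45, 70, 10]
quota = synthetic_quota(ages)
print(quota)
```

In this toy input the 19-35 bin is the largest, so its quota is zero and the remaining bins receive the difference. A real pipeline would likely also cap the quota or weight samples rather than rely on synthesis alone.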

Key Contributions

Age-level annotations for FaceForensics++ and Celeb-DF, plus synthetic data generation (InsightFace + SimSwap) to fill underrepresented age groups
Modular pipeline combining Celeb-DF, FaceForensics++, and UTKFace into a reproducible, fairness-aware age-diverse deepfake dataset
Empirical evaluation showing models trained on the age-diverse dataset achieve fairer AUC/pAUC/EER across age cohorts compared to source datasets
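
The per-cohort evaluation described above can be sketched as follows. This is an illustrative, dependency-free implementation rather than the authors' evaluation code: the function names, the `max_fpr=0.1` cutoff, the normalization of pAUC by `max_fpr`, and the nearest-point approximation of EER are all assumptions, and the scores are made-up toy values.

```python
def roc_points(labels, scores):
    """ROC points (fpr, tpr), one per distinct score, starting at (0, 0).

    labels: 1 = fake, 0 = real. Both classes must be present.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    order = sorted(zip(scores, labels), key=lambda p: -p[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    for i, (score, label) in enumerate(order):
        tp += label
        fp += 1 - label
        # Emit a point only after the last sample of a tied score group.
        if i + 1 == len(order) or order[i + 1][0] != score:
            points.append((fp / neg, tp / pos))
    return points


def auc(points, max_fpr=1.0):
    """Trapezoidal area under the ROC curve for fpr in [0, max_fpr]."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 >= max_fpr:
            break
        if x1 > max_fpr:
            # Clip the segment at max_fpr using linear interpolation.
            y1 = y0 + (y1 - y0) * (max_fpr - x0) / (x1 - x0)
            x1 = max_fpr
        area += (x1 - x0) * (y0 + y1) / 2
    return area


def eer(points):
    """Approximate EER: the ROC point where fpr and fnr are closest."""
    fpr, tpr = min(points, key=lambda p: abs(p[0] - (1 - p[1])))
    return (fpr + (1 - tpr)) / 2


def metrics_by_group(records, max_fpr=0.1):
    """records: iterable of (group, label, score) tuples."""
    grouped = {}
    for group, label, score in records:
        labels, scores = grouped.setdefault(group, ([], []))
        labels.append(label)
        scores.append(score)
    out = {}
    for group, (labels, scores) in grouped.items():
        pts = roc_points(labels, scores)
        out[group] = {
            "auc": auc(pts),
            # Assumed convention: partial area rescaled so a perfect detector scores 1.
            "pauc": auc(pts, max_fpr) / max_fpr,
            "eer": eer(pts),
        }
    return out


# Toy detector scores for two hypothetical age cohorts.
records = [
    ("young", 1, 0.9), ("young", 1, 0.8), ("young", 0, 0.2), ("young", 0, 0.1),
    ("old", 1, 0.9), ("old", 1, 0.3), ("old", 0, 0.6), ("old", 0, 0.1),
]
results = metrics_by_group(records)
for group, m in results.items():
    print(group, m)
```

Comparing the spread of these values across cohorts, rather than only the pooled score, is what exposes age-specific bias. With very small groups the step-shaped ROC makes the EER estimate coarse, so a real evaluation would need far more samples per cohort.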

🛡️ Threat Analysis

Output Integrity Attack

Deepfake detection is the central task — the paper constructs a dataset and evaluates AI-generated content detectors (XceptionNet, EfficientNet, LipForensics), directly targeting the output integrity and authenticity problem of AI-synthesized faces.

Details

Domains

vision

Model Types

cnn, transformer

Threat Tags

inference_time

Datasets

Celeb-DF, FaceForensics++, UTKFace

Applications

2026 0 cit.
