AUDDT: Audio Unified Deepfake Detection Benchmark Toolkit
Yi Zhu, Heitor R. Guimarães, Arthur Pimentel, Tiago Falk
Published on arXiv: 2509.21597
Output Integrity Attack
OWASP ML Top 10 — ML09
Key Finding
A baseline detector pretrained on ASVspoof2019 exhibits dramatic performance variance across 28 datasets, with accuracy dropping to near-chance levels on several real-world-condition datasets, exposing critical generalization failures.
AUDDT
Novel technique introduced
With the prevalence of artificial intelligence (AI)-generated content such as audio deepfakes, a large body of recent work has focused on developing deepfake detection techniques. However, most models are evaluated on a narrow set of datasets, leaving their generalization to real-world conditions uncertain. In this paper, we systematically review 28 existing audio deepfake datasets and present an open-source benchmarking toolkit called AUDDT (https://github.com/MuSAELab/AUDDT). The toolkit automates the evaluation of pretrained detectors across these 28 datasets, giving users direct feedback on the strengths and shortcomings of their deepfake detectors. We first showcase the toolkit's usage, the composition of our benchmark, and the breakdown of different deepfake subgroups. Next, using a widely adopted pretrained deepfake detector, we present in- and out-of-domain detection results, revealing notable differences across conditions and audio manipulation types. Lastly, we analyze the limitations of these existing datasets and their gap relative to practical deployment scenarios.
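The evaluation pattern the abstract describes (one pretrained detector scored across many datasets) can be sketched as a minimal loop. This is a hedged illustration only: the `evaluate` function, the toy `datasets` dict, and the `threshold_detector` are stand-ins invented for this sketch, not AUDDT's actual API.

```python
# Hypothetical sketch of the evaluation loop a toolkit like AUDDT automates:
# run one pretrained detector over many datasets and collect per-dataset accuracy.
# The detector callable and dataset contents are stand-ins, not AUDDT's interface.

def evaluate(detector, datasets):
    """Return {dataset_name: accuracy} for a binary detector.

    `detector(clip)` is assumed to return 1 for bona fide audio, 0 for fake.
    """
    results = {}
    for name, samples in datasets.items():  # samples: list of (clip, label)
        correct = sum(detector(clip) == label for clip, label in samples)
        results[name] = correct / len(samples)
    return results

# Toy stand-in data: "clips" are floats a simple threshold detector can score.
datasets = {
    "in_domain":  [(0.9, 1), (0.1, 0), (0.8, 1), (0.2, 0)],
    "out_domain": [(0.6, 1), (0.4, 0), (0.3, 1), (0.7, 0)],
}
threshold_detector = lambda clip: int(clip >= 0.5)
scores = evaluate(threshold_detector, datasets)
print(scores)  # in_domain separates cleanly; out_domain drops to chance level
```

The per-dataset breakdown, rather than a single pooled number, is what exposes the in- vs. out-of-domain gap the paper reports.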
Key Contributions
- Systematic review and taxonomy of 28 audio deepfake datasets covering diverse generation methods, languages, perturbations, and recording conditions
- AUDDT open-source toolkit that automates evaluation of any pretrained audio deepfake detector across all 28 datasets with minimal user effort
- Empirical analysis using a baseline ASVspoof2019-pretrained detector revealing large performance variance across deepfake subgroups and real-world deployment gaps
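Benchmarks like this typically report the equal error rate (EER), the operating point where false-acceptance and false-rejection rates meet; an EER near 50% corresponds to the near-chance behavior noted in the key finding. The paper does not show its metric code, so the following is a self-contained sketch of a standard EER approximation by threshold sweep.

```python
# Equal Error Rate (EER) sketch: sweep thresholds over the observed scores and
# report the point where false-rejection and false-acceptance rates are closest.
def eer(bona_scores, spoof_scores):
    """Approximate EER for a detector whose higher scores mean 'bona fide'
    (a convention assumed here, not taken from the paper)."""
    best_gap, best_rate = 1.0, 0.5
    for t in sorted(set(bona_scores) | set(spoof_scores)):
        frr = sum(s < t for s in bona_scores) / len(bona_scores)    # bona fide rejected
        far = sum(s >= t for s in spoof_scores) / len(spoof_scores)  # spoof accepted
        if abs(far - frr) < best_gap:
            best_gap, best_rate = abs(far - frr), (far + frr) / 2
    return best_rate

# Perfectly separated scores give EER 0.0; fully overlapping scores approach 0.5.
print(eer([0.9, 0.8, 0.7, 0.6], [0.1, 0.2, 0.3, 0.4]))  # 0.0
```

A finer-grained sweep (or interpolation between adjacent thresholds) would tighten the estimate; sweeping only the observed scores keeps the sketch short.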
🛡️ Threat Analysis
Audio deepfake detection is a core ML09 concern (AI-generated content detection / output integrity). The toolkit systematically evaluates detectors' ability to identify AI-generated audio across diverse generation methods (diffusion, neural codec, vocoders), directly measuring output integrity assurance under real-world conditions.