Generalizable Audio Spoofing Detection using Non-Semantic Representations

Rapid advancements in generative modeling have made synthetic audio generation easy, leaving speech-based services vulnerable to spoofing attacks. Consequently, the need for robust countermeasures is greater than ever. Existing deepfake detection solutions are often criticized for lacking generalizability, and they fail drastically when applied to real-world data. This study proposes a novel method for generalizable spoofing detection that leverages non-semantic universal audio representations. Extensive experiments were performed to find suitable non-semantic features using the TRILL and TRILLsson models. The results indicate that the proposed method achieves comparable performance on the in-domain test set while significantly outperforming state-of-the-art approaches on out-of-domain test sets. Notably, it demonstrates superior generalization on public-domain data, surpassing methods based on hand-crafted features, semantic embeddings, and end-to-end architectures.

Key Contributions

Novel use of non-semantic universal audio representations (TRILL and TRILLsson) as features for audio spoofing/deepfake detection, motivated by the insight that discarding semantic content improves generalization
Demonstrates superior out-of-domain generalization over SOTA methods based on hand-crafted features, semantic SSL embeddings (XLS-R, WavLM, HuBERT), and end-to-end architectures (RawNet2, AASIST)
Cross-dataset evaluation on ASVspoof (in-domain) and on the noisy, public-domain In the Wild data (out-of-domain), showing that the method stays competitive in-domain while being significantly more robust on real-world data
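
The pipeline implied by these contributions (frame-level embeddings pooled per utterance, a lightweight classifier on top, then in-domain versus out-of-domain scoring) might be sketched as follows. This is only an illustrative sketch under loud assumptions: the embeddings are random synthetic vectors rather than real TRILL or TRILLsson outputs, the logistic-regression back end is a stand-in that the summary does not confirm as the authors' classifier, and the helper names (pool_embedding, train_logreg, fake_utterances) are hypothetical. The numbers it produces say nothing about the paper's results.

```python
import numpy as np

def pool_embedding(frames: np.ndarray) -> np.ndarray:
    """Average frame-level embeddings of shape (T, D) into one utterance vector (D,)."""
    return frames.mean(axis=0)

def train_logreg(X, y, lr=0.1, epochs=500):
    """Fit a plain logistic-regression back end with batch gradient descent."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # predicted P(spoof)
        grad = p - y                            # gradient of the log loss w.r.t. the logit
        w -= lr * (X.T @ grad) / len(y)
        b -= lr * grad.mean()
    return w, b

def accuracy(w, b, X, y):
    """Fraction of utterances whose thresholded score matches the label."""
    pred = (X @ w + b) > 0
    return float((pred == y).mean())

def fake_utterances(rng, n, shift, dim=16, frames=20):
    """ASSUMPTION: synthetic stand-in for embeddings, not real audio features.
    Label 1 (spoof) frames are drawn with mean `shift`, label 0 (bona fide) with mean 0."""
    y = rng.integers(0, 2, size=n)
    X = np.stack([
        pool_embedding(rng.normal(loc=shift * label, scale=1.0, size=(frames, dim)))
        for label in y
    ])
    return X, y.astype(float)

rng = np.random.default_rng(0)
X_train, y_train = fake_utterances(rng, 200, shift=1.0)  # "in-domain" training set
X_in, y_in = fake_utterances(rng, 100, shift=1.0)        # "in-domain" test set
X_out, y_out = fake_utterances(rng, 100, shift=0.5)      # weaker cue as a crude domain shift

w, b = train_logreg(X_train, y_train)
acc_in = accuracy(w, b, X_in, y_in)
acc_out = accuracy(w, b, X_out, y_out)
print(f"in-domain accuracy: {acc_in:.2f}, out-of-domain accuracy: {acc_out:.2f}")
```

To adapt the sketch, fake_utterances would be replaced by a function that runs a pretrained non-semantic encoder over each waveform and returns its frame-level (or already pooled) embeddings; the training and scoring code would stay the same.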

🛡️ Threat Analysis

Output Integrity Attack

Proposes a novel method for detecting AI-generated audio: a forensic approach that leverages non-semantic representations to detect synthetic or spoofed speech with improved cross-dataset generalization. The work falls explicitly under deepfake detection and AI-generated content detection.

Details

Domains

audio

Model Types

transformer

Threat Tags

inference_time

Datasets

ASVspoof
In the Wild (ItW)
ADD challenge

Applications

2025 0 cit.
