defense 2026

Towards Robust Speech Deepfake Detection via Human-Inspired Reasoning

Artem Dvirniak 1, Evgeny Kushnir 2,3,4, Dmitrii Tarasov 3,4, Artem Iudin 5, Oleg Kiriukhin 5, Mikhail Pautov 2,6, Dmitrii Korzh 2,4,5, Oleg Y. Rogov 2,4,5


Published on arXiv

2603.10725

Output Integrity Attack

OWASP ML Top 10 — ML09

Key Finding

HIR-SDD achieves competitive countermeasure performance while producing human-interpretable chain-of-thought justifications that ground deepfake attribution decisions in perceptible acoustic cues

HIR-SDD

Novel technique introduced


Modern generative audio models can be used by an adversary in unlawful ways, specifically to impersonate other people and gain access to private information. To mitigate this threat, speech deepfake detection (SDD) methods have begun to evolve. Unfortunately, current SDD methods generally suffer from poor generalization to new audio domains and generators. Moreover, they lack interpretability, especially the kind of human-like reasoning that would naturally explain the attribution of a given audio sample to the bona fide or spoof class and provide human-perceptible cues. In this paper, we propose HIR-SDD, a novel SDD framework that combines the strengths of Large Audio Language Models (LALMs) with chain-of-thought reasoning derived from a newly proposed human-annotated dataset. Experimental evaluation demonstrates both the effectiveness of the proposed method and its ability to provide reasonable justifications for its predictions.
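As a rough illustration of the inference pattern the abstract describes (not the authors' implementation), a detector built on an audio language model would prompt for a chain-of-thought justification grounded in perceptible cues, then extract the final bona fide / spoof label from the generated text. The prompt template and the "Verdict:" parsing convention below are illustrative assumptions:

```python
import re

# Hypothetical prompt asking a LALM for perceptible acoustic cues plus a
# final verdict. The wording and the "Verdict:" convention are assumptions
# for illustration, not the paper's actual prompt.
PROMPT = (
    "Listen to the audio and describe perceptible acoustic cues "
    "(prosody, breathing, artifacts). Finish with 'Verdict: bona fide' "
    "or 'Verdict: spoof'."
)

def extract_label(cot_text: str) -> str:
    """Pull the final bona fide / spoof decision out of a reasoning trace."""
    match = re.search(r"verdict:\s*(bona fide|spoof)", cot_text, re.IGNORECASE)
    if match is None:
        raise ValueError("no verdict found in reasoning trace")
    return match.group(1).lower()

# Example trace a LALM might return for a synthetic sample.
trace = (
    "The utterance has unnaturally flat prosody and no audible breaths; "
    "high-frequency vocoder artifacts are present. Verdict: spoof"
)
print(extract_label(trace))  # → spoof
```

Keeping the label extraction as a simple regex over the trace makes the explanation itself the primary model output, with the hard decision recovered deterministically from it.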


Key Contributions

  • Human-annotated dataset of 41k reasoning traces covering bona fide and spoof speech samples for CoT training and evaluation
  • HIR-SDD framework combining hard-label classification and chain-of-thought supervised fine-tuning with LALM-based reasoning for interpretable speech deepfake detection
  • Integration of grounding and reinforcement learning strategies to improve both detection accuracy and quality of human-perceptible explanations
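The training recipe in the second bullet, joint hard-label classification and chain-of-thought supervised fine-tuning, can be sketched as a weighted sum of two cross-entropy terms. The `alpha` weighting and the per-token averaging below are illustrative assumptions, not the paper's exact objective:

```python
import math

def cross_entropy(probs, target_idx):
    """Negative log-likelihood of the target under a probability vector."""
    return -math.log(probs[target_idx])

def joint_loss(label_probs, target_label, token_probs, target_tokens, alpha=0.5):
    """Weighted sum of hard-label CE and token-level CoT SFT CE.

    label_probs:   model's bona fide / spoof distribution, e.g. [0.2, 0.8]
    token_probs:   per-step distributions over the reasoning-trace vocabulary
    alpha:         trade-off between detection accuracy and explanation
                   quality (an illustrative hyperparameter, not from the paper)
    """
    cls_loss = cross_entropy(label_probs, target_label)
    sft_loss = sum(
        cross_entropy(step, tok) for step, tok in zip(token_probs, target_tokens)
    ) / len(target_tokens)
    return alpha * cls_loss + (1 - alpha) * sft_loss

# Toy example: a 2-class detection head plus a 3-token reasoning trace.
loss = joint_loss(
    label_probs=[0.1, 0.9], target_label=1,
    token_probs=[[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.2, 0.2, 0.6]],
    target_tokens=[0, 1, 2],
)
print(round(loss, 4))
```

The reinforcement learning stage in the third bullet would then further shape the trace-generating policy, for example by rewarding explanations grounded in perceptible cues, on top of this supervised objective.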

🛡️ Threat Analysis

Output Integrity Attack

Speech deepfake detection is a form of AI-generated content detection. The paper proposes a novel forensic detection architecture (HIR-SDD) that uses LALMs and chain-of-thought reasoning to identify synthetic or spoofed speech, rather than merely applying existing methods to a narrow domain.


Details

Domains
audio, nlp
Model Types
llm, transformer
Threat Tags
inference_time
Datasets
ASVspoof, ASVspoof 5, ADD, SingFake
Applications
speech deepfake detection, voice anti-spoofing, speaker verification