Probabilistic Verification of Voice Anti-Spoofing Models
Evgeny Kushnir 1,2,3, Alexandr Kozodaev 4, Dmitrii Korzh 1,5, Mikhail Pautov 1,6, Oleg Kiriukhin 7, Oleg Y. Rogov 1,3,5
Published on arXiv
2603.10713
Output Integrity Attack
OWASP ML Top 10 — ML09
Key Finding
PV-VASM yields certified upper bounds on misclassification probability for voice anti-spoofing models in a black-box, model-agnostic manner, generalizing to unseen TTS and VC systems without requiring white-box access.
PV-VASM
Novel technique introduced
Recent advances in generative models have amplified the risk of malicious misuse of speech synthesis technologies, enabling adversaries to impersonate target speakers and access sensitive resources. Although speech deepfake detection has progressed rapidly, most existing countermeasures lack formal robustness guarantees or fail to generalize to unseen generation techniques. We propose PV-VASM, a probabilistic framework for verifying the robustness of voice anti-spoofing models (VASMs). PV-VASM estimates the probability of misclassification under text-to-speech (TTS), voice cloning (VC), and parametric signal transformations. The approach is model-agnostic and enables robustness verification against unseen speech synthesis techniques and input perturbations. We derive a theoretical upper bound on the error probability and validate the method across diverse experimental settings, demonstrating its effectiveness as a practical robustness verification tool.
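The black-box setting described above can be sketched as a Monte Carlo verification loop: draw synthesized or perturbed utterances, query the detector, and record how often a spoof is missed. This is an illustrative sketch only; `detector` and `sample_spoof` are hypothetical stand-ins for the anti-spoofing model and the TTS/VC sampler, neither of which is specified by the summary above.

```python
import random

def verify_black_box(detector, sample_spoof, n=1000, seed=0):
    """Monte Carlo robustness check in the black-box setting (sketch).

    `detector` maps an utterance to "spoof" or "bonafide";
    `sample_spoof` draws a synthesized utterance. Both are
    placeholders, not the paper's actual interfaces.
    Returns the empirical misclassification (missed-spoof) rate.
    """
    rng = random.Random(seed)
    misses = sum(
        1 for _ in range(n) if detector(sample_spoof(rng)) == "bonafide"
    )
    return misses / n

# Toy stand-ins: a "detector" that misses spoofs 2% of the time.
rate = verify_black_box(
    detector=lambda x: "bonafide" if x < 0.02 else "spoof",
    sample_spoof=lambda rng: rng.random(),
)
```

For this toy detector the empirical rate should land near 0.02; in PV-VASM the analogous estimate is then converted into a certified upper bound rather than reported raw.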
Key Contributions
- PV-VASM: a model-agnostic probabilistic framework that estimates upper bounds on VASM misclassification probability under TTS, voice cloning, and signal transformations
- Theoretical derivation of error-probability bounds via concentration inequalities, with a practical parameter-selection procedure
- Empirical validation across diverse TTS/VC systems and audio conditions, demonstrating generalization to unseen speech synthesis techniques
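The second contribution, bounding error probability with a concentration inequality, can be illustrated with a generic one-sided Hoeffding bound on an empirical miss rate. Note this is an assumption for illustration: the paper derives its own bound, which may differ from (and be tighter than) this textbook inequality.

```python
import math

def error_upper_bound(misses, n, delta=0.05):
    """One-sided (1 - delta)-confidence upper bound on the true
    misclassification probability, via Hoeffding's inequality:
    p <= p_hat + sqrt(ln(1/delta) / (2n)) with prob. >= 1 - delta.
    Illustrative stand-in for the paper's bound, not its exact form.
    """
    p_hat = misses / n
    return min(1.0, p_hat + math.sqrt(math.log(1.0 / delta) / (2.0 * n)))

# 3 missed spoofs out of 1000 sampled utterances, 95% confidence:
bound = error_upper_bound(3, 1000, delta=0.05)  # ≈ 0.0417
```

The gap between the empirical rate (0.003) and the certified bound (≈ 0.0417) shrinks as O(1/sqrt(n)), which is why the sample size n is a key verification parameter.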
🛡️ Threat Analysis
Voice anti-spoofing models are speech deepfake detectors, i.e., systems for detecting AI-generated audio content, which fall explicitly under ML09. The paper's primary contribution is a robustness verification framework (PV-VASM) that formally bounds the probability that a VASM fails to detect TTS/VC-synthesized audio, directly addressing the output integrity and authenticity of AI-generated speech content.