
What Counts as Real? Speech Restoration and Voice Quality Conversion Pose New Challenges to Deepfake Detection

Shree Harsha Bokkahalli Satish, Harm Lameris, Joakim Gustafson, Éva Székely



Published on arXiv (2603.14033)

Output Integrity Attack

OWASP ML Top 10 — ML09

Key Finding

A multi-class formulation improves robustness to benign transformations while preserving spoof detection, revealing that binary systems model the raw speech distribution rather than authenticity itself.


Audio anti-spoofing systems are typically formulated as binary classifiers distinguishing bona fide from spoofed speech. This assumption fails under layered generative processing, where benign transformations introduce distributional shifts that are misclassified as spoofing. We show that phonation-modifying voice conversion and speech restoration are treated as out-of-distribution despite preserving speaker authenticity. Using a multi-class setup separating bona fide, converted, spoofed, and converted-spoofed speech, we analyse model behaviour through self-supervised learning (SSL) embeddings and acoustic correlates. The benign transformations induce a drift in the SSL space, compressing bona fide and spoofed speech and reducing classifier separability. Reformulating anti-spoofing as a multi-class problem improves robustness to benign shifts while preserving spoof detection, suggesting binary systems model the distribution of raw speech rather than authenticity itself.
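The abstract's multi-class reformulation can still yield a binary real/fake verdict at inference: sum the posterior mass over the authenticity-preserving classes (bona fide and benignly converted speech). A minimal sketch of that decision rule, with illustrative class names and made-up logits (the paper does not specify this collapsing step; it is one natural reading of the 4-way setup):

```python
import math

# Hypothetical 4-way label set matching the paper's formulation.
CLASSES = ["bona_fide", "converted", "spoofed", "converted_spoofed"]
# Classes that preserve speaker authenticity (benign transformations).
AUTHENTIC = {"bona_fide", "converted"}

def softmax(logits):
    # Numerically stable softmax over a list of raw scores.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def authenticity_decision(logits):
    """Collapse 4-way posteriors into a binary real/fake verdict by
    summing probability mass over the authenticity-preserving classes."""
    probs = softmax(logits)
    p_real = sum(p for c, p in zip(CLASSES, probs) if c in AUTHENTIC)
    return ("real" if p_real >= 0.5 else "fake"), p_real

# Example: a benignly converted utterance. A binary detector might flag
# it as spoofed, but here most of the mass sits on the 'converted' class,
# so the collapsed verdict is still "real".
verdict, p_real = authenticity_decision([0.2, 2.5, 0.4, 0.1])
```

The point of the collapse is that distributional drift caused by benign processing is absorbed by the `converted` classes instead of being forced into the `spoofed` bucket.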


Key Contributions

  • Demonstrates that binary audio anti-spoofing systems misclassify benign transformations (voice quality conversion, speech restoration) as spoofed despite preserving speaker authenticity
  • Proposes 4-way classification framework (bona fide, converted, spoofed, converted-spoofed) that improves robustness to benign distributional shifts while maintaining spoof detection capability
  • Releases dataset pairing bona fide audio with spoofed counterparts and benign-transformed versions across 10 TTS systems for benchmarking deepfake detectors
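The released dataset pairs each bona fide utterance with its spoofed and benign-transformed counterparts, which maps naturally onto the four training labels. A sketch of that expansion, with an illustrative record layout (field and file names are assumptions, not the dataset's actual schema):

```python
from dataclasses import dataclass

@dataclass
class PairedUtterance:
    # Hypothetical record: one bona fide utterance linked to its spoofed
    # counterpart and to benignly converted versions of both.
    utt_id: str
    bona_fide_path: str
    spoofed_path: str             # TTS-generated counterpart
    converted_path: str           # benign voice-quality conversion of bona fide
    converted_spoofed_path: str   # benign conversion applied to the spoof

# Field-to-label mapping for the 4-way classification task.
LABELS = {
    "bona_fide_path": "bona_fide",
    "converted_path": "converted",
    "spoofed_path": "spoofed",
    "converted_spoofed_path": "converted_spoofed",
}

def expand(pair):
    """Flatten one paired record into (audio_path, 4-way label) rows."""
    return [(getattr(pair, field), label) for field, label in LABELS.items()]

rows = expand(PairedUtterance("u001", "bf.wav", "sp.wav", "cv.wav", "cvsp.wav"))
```

Keeping the four variants paired per utterance lets a benchmark measure, utterance by utterance, whether a detector's verdict flips under benign processing alone.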

🛡️ Threat Analysis

Output Integrity Attack

The paper focuses on detecting AI-generated (synthetic) speech and verifying audio authenticity; its core contribution, analysing and improving deepfake detection systems, falls under output integrity. It demonstrates that current binary deepfake detectors fail to distinguish benign transformations from malicious spoofing, and proposes a multi-class classification approach to improve detection robustness.


Details

Domains
audio
Model Types
traditional_ml
Threat Tags
inference_time
Datasets
M-AILABS, MLAAD
Applications
audio deepfake detection, speech anti-spoofing, synthetic speech detection