Survey · 2025

A Survey of Threats Against Voice Authentication and Anti-Spoofing Systems

Kamel Kamel, Keshav Sood, Hridoy Sankar Dutta, Sunil Aryal

Published on arXiv (arXiv:2508.16843)

OWASP ML Top 10 mappings: Input Manipulation Attack (ML01) · Data Poisoning Attack (ML02) · Output Integrity Attack (ML09)

Key Finding

Comprehensive taxonomy of four threat categories against voice authentication ML models reveals how vulnerabilities in speaker verification and anti-spoofing countermeasures have co-evolved with advances in deep learning


Voice authentication has undergone significant changes from traditional systems that relied on handcrafted acoustic features to deep learning models that can extract robust speaker embeddings. This advancement has expanded its applications across finance, smart devices, law enforcement, and beyond. However, as adoption has grown, so have the threats. This survey presents a comprehensive review of the modern threat landscape targeting Voice Authentication Systems (VAS) and Anti-Spoofing Countermeasures (CMs), including data poisoning, adversarial, deepfake, and adversarial spoofing attacks. We chronologically trace the development of voice authentication and examine how vulnerabilities have evolved in tandem with technological advancements. For each category of attack, we summarize methodologies, highlight commonly used datasets, compare performance and limitations, and organize existing literature using widely accepted taxonomies. By highlighting emerging risks and open challenges, this survey aims to support the development of more secure and resilient voice authentication systems.


Key Contributions

  • Chronological review tracing voice authentication evolution alongside its threat landscape, covering data poisoning, adversarial, deepfake, and adversarial anti-spoofing attack categories
  • Per-category taxonomies summarizing methodologies, commonly used datasets, and performance comparisons across the literature
  • Identification of emerging risks and open challenges to guide development of more secure and resilient voice authentication systems

🛡️ Threat Analysis

Input Manipulation Attack

Covers adversarial attacks against speaker verification models (evasion at inference time) and a dedicated section on adversarial attacks targeting anti-spoofing countermeasures.
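Inference-time evasion of this kind typically perturbs the input in the gradient direction that raises the verification score. As a minimal sketch (not any specific method from the survey), the FGSM-style example below attacks a toy linear scorer standing in for a deep speaker-verification model; all names and parameters are illustrative assumptions.

```python
import numpy as np

def verify_score(x, w, b):
    # Toy linear "speaker verification" scorer: sigmoid(w.x + b).
    # A stand-in for a deep embedding model (assumption for illustration).
    return 1.0 / (1.0 + np.exp(-(x @ w + b)))

def fgsm_perturb(x, w, b, eps):
    # FGSM-style evasion: step the input in the sign of the gradient
    # of the acceptance score. For a sigmoid score s(x) = sigmoid(w.x + b),
    # ds/dx = s(1 - s) * w, so sign(grad) = sign(w).
    s = verify_score(x, w, b)
    grad = s * (1.0 - s) * w
    return x + eps * np.sign(grad)

rng = np.random.default_rng(0)
w = rng.normal(size=16)
b = -2.0
x = rng.normal(scale=0.1, size=16)        # impostor input, low score
x_adv = fgsm_perturb(x, w, b, eps=0.5)    # small, bounded perturbation
assert verify_score(x_adv, w, b) > verify_score(x, w, b)
```

Real attacks in the literature operate on waveforms or spectrograms and must additionally survive feature extraction and, for physical attacks, over-the-air playback.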

Data Poisoning Attack

Dedicated section on data poisoning attacks against voice authentication models — corrupting training data to degrade or manipulate speaker verification behavior.
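The core mechanism can be illustrated with a label-flip poisoning sketch: relabeling a fraction of training utterances drags the target speaker's learned representation toward an attacker-chosen region. The nearest-centroid "speaker model" below is a deliberate simplification, not the survey's threat model; all names and numbers are assumptions.

```python
import numpy as np

def centroid_model(X, y):
    # Toy nearest-centroid "speaker model": one mean embedding per speaker.
    return {c: X[y == c].mean(axis=0) for c in np.unique(y)}

def poison_labels(y, target, frac, rng):
    # Label-flip poisoning: relabel a fraction of non-target training
    # samples as the target speaker, corrupting the target's enrollment
    # statistics (illustrative only).
    y = y.copy()
    idx = np.flatnonzero(y != target)
    flip = rng.choice(idx, size=int(frac * len(idx)), replace=False)
    y[flip] = target
    return y

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (50, 8)), rng.normal(5, 1, (50, 8))])
y = np.array([0] * 50 + [1] * 50)          # speaker 0 vs speaker 1
clean = centroid_model(X, y)
dirty = centroid_model(X, poison_labels(y, target=0, frac=0.4, rng=rng))
# Poisoning shifts speaker 0's centroid toward speaker 1's region.
shift = np.linalg.norm(dirty[0] - clean[0])
assert shift > 1.0
```

Backdoor variants surveyed in this category go further, pairing the flipped labels with an acoustic trigger so the model misbehaves only when the trigger is present.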

Output Integrity Attack

Covers deepfake/synthetic speech generation attacks and the anti-spoofing countermeasures that detect AI-generated voice content, directly addressing audio content integrity and provenance.
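Anti-spoofing countermeasures of this kind are conventionally evaluated by the equal error rate (EER) over genuine and spoofed score distributions. The sketch below computes EER from synthetic score samples; the Gaussian scores are an assumption for illustration, not data from the survey.

```python
import numpy as np

def equal_error_rate(genuine, spoof):
    # EER: operating point where the false-acceptance rate (spoofed audio
    # scored as genuine) equals the false-rejection rate (genuine audio
    # scored as spoofed). Standard metric for anti-spoofing CMs.
    thresholds = np.sort(np.concatenate([genuine, spoof]))
    best_gap, best_eer = 1.0, 0.0
    for t in thresholds:
        far = np.mean(spoof >= t)      # spoofed accepted
        frr = np.mean(genuine < t)     # genuine rejected
        if abs(far - frr) < best_gap:
            best_gap, best_eer = abs(far - frr), (far + frr) / 2
    return best_eer

rng = np.random.default_rng(2)
genuine = rng.normal(2.0, 1.0, 1000)   # toy CM scores for bona fide audio
spoof = rng.normal(-2.0, 1.0, 1000)    # toy CM scores for deepfake audio
eer = equal_error_rate(genuine, spoof)
assert 0.0 <= eer < 0.1                # well-separated scores -> low EER
```

Adversarial spoofing attacks covered in the survey aim precisely at collapsing this separation, pushing spoofed scores above the countermeasure's threshold.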


Details

Domains: audio
Model Types: cnn, transformer, rnn
Threat Tags: white_box, black_box, training_time, inference_time, digital, physical
Applications: voice authentication, speaker verification, anti-spoofing countermeasures, smart devices, finance, law enforcement