defense 2025

Generalizable Speech Deepfake Detection via Information Bottleneck Enhanced Adversarial Alignment

Pu Huang 1, Shouguang Wang 1, Siya Yao 1, Mengchu Zhou 1,2

0 citations · 29 references · arXiv

Published on arXiv

2509.23618

Output Integrity Attack

OWASP ML Top 10 — ML09

Key Finding

IB-CAAN consistently outperforms baseline detectors and achieves state-of-the-art results on multiple speech deepfake detection benchmarks by learning attack-invariant discriminative features

IB-CAAN

Novel technique introduced


Neural speech synthesis techniques have enabled highly realistic speech deepfakes, posing major security risks. Speech deepfake detection is challenging due to distribution shifts across spoofing methods and variability in speakers, channels, and recording conditions. We explore learning shared discriminative features as a path to robust detection and propose the Information Bottleneck enhanced Confidence-Aware Adversarial Network (IB-CAAN). Confidence-guided adversarial alignment adaptively suppresses attack-specific artifacts without erasing discriminative cues, while the information bottleneck removes nuisance variability to preserve transferable features. Experiments on ASVspoof 2019/2021, ASVspoof 5, and In-the-Wild demonstrate that IB-CAAN consistently outperforms the baseline and achieves state-of-the-art performance on many benchmarks.
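
To make the two ingredients named in the abstract concrete, here is a minimal sketch of a per-sample encoder objective that combines a detection loss, a confidence-weighted adversarial term, and an information bottleneck penalty. It is not the authors' implementation: the summary does not say how confidence enters the loss, so `confidence_weight` (the attack classifier's top softmax probability), the weights `lam` and `beta`, and all function names are assumptions made for illustration. The bottleneck term uses the common variational form, a KL divergence between a diagonal Gaussian and a standard normal.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Negative log-probability of the target class."""
    return -math.log(softmax(logits)[target])


def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), a common variational IB penalty."""
    return 0.5 * sum(m * m + math.exp(lv) - lv - 1.0 for m, lv in zip(mu, log_var))


def confidence_weight(attack_logits):
    """Hypothetical confidence measure: top softmax probability of the attack classifier."""
    return max(softmax(attack_logits))


def sample_objective(spoof_logits, spoof_label, attack_logits, attack_label,
                     mu, log_var, lam=1.0, beta=0.01):
    """Encoder-side objective for one sample (assumed form, not from the paper).

    The encoder minimizes the bona fide/spoof loss, maximizes the attack-type
    loss (the subtracted term, as in gradient-reversal training), and pays a
    KL penalty that compresses the latent code.
    """
    det = cross_entropy(spoof_logits, spoof_label)
    adv = cross_entropy(attack_logits, attack_label)
    w = confidence_weight(attack_logits)
    return det - lam * w * adv + beta * kl_to_standard_normal(mu, log_var)
```

In this sketch the adversarial term is scaled down when the attack classifier is unsure, which is one way to read "adaptively suppresses attack-specific artifacts"; the paper may define the weighting differently.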


Key Contributions

  • Formalizes speech deepfake detection as a dual distribution shift problem (covariate shift + concept shift) and proposes attack-invariant feature learning as the solution
  • Proposes IB-CAAN: confidence-guided adversarial alignment that selectively suppresses attack-specific artifacts while preserving discriminative cues, combined with an information bottleneck to compress nuisance variability
  • Consistently outperforms the baseline on ASVspoof 2019/2021, ASVspoof 5, and In-the-Wild, and achieves state-of-the-art performance on many of these benchmarks, demonstrating improved generalization to unseen spoofing methods
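
For reference, the two shift types and the information bottleneck named in the contributions are usually written as below. These are the standard textbook forms, not necessarily the paper's exact formalization; the symbols are assumptions introduced here, with x an input utterance, y the bona fide/spoof label, z the learned representation, I(·;·) mutual information, and β a trade-off weight.

```latex
\begin{align}
\text{covariate shift:}\quad & P_{\mathrm{train}}(x) \neq P_{\mathrm{test}}(x), \\
\text{concept shift:}\quad & P_{\mathrm{train}}(y \mid x) \neq P_{\mathrm{test}}(y \mid x), \\
\text{information bottleneck:}\quad & \min_{p(z \mid x)} \; I(X;Z) - \beta\, I(Z;Y), \qquad \beta > 0.
\end{align}
```

Minimizing I(X;Z) compresses away input variability, while the −β I(Z;Y) term keeps the information that predicts the label.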

🛡️ Threat Analysis

Output Integrity Attack

Proposes a novel AI-generated content detection architecture (IB-CAAN) specifically for detecting synthetic/deepfake speech — directly addresses output integrity and authenticity of AI-generated audio content.


Details

Domains
audio
Model Types
transformer
Threat Tags
inference_time
Datasets
ASVspoof 2019, ASVspoof 2021, ASVspoof 5, In-the-Wild
Applications
speech deepfake detection, automatic speaker verification