attack 2025

SegReConcat: A Data Augmentation Method for Voice Anonymization Attack

Ridwan Arefeen 1, Xiaoxiao Miao 2, Rong Tong 1, Aik Beng Ng 3, Simon See 3

Published on arXiv

2508.18907

Output Integrity Attack

OWASP ML Top 10 — ML09

Key Finding

SegReConcat improves de-anonymization success on five of the seven anonymization systems evaluated in the VoicePrivacy Attacker Challenge 2024 framework

SegReConcat

Novel technique introduced


Voice anonymization seeks to conceal the speaker's identity while preserving the utility of the speech data. However, residual speaker cues often persist and pose privacy risks. We propose SegReConcat, a data augmentation method for attacker-side enhancement of automatic speaker verification (ASV) systems. SegReConcat segments anonymized speech at the word level, rearranges the segments using random or similarity-based strategies to disrupt long-term contextual cues, and concatenates them with the original utterance, allowing an attacker to learn source speaker traits from multiple perspectives. The proposed method is evaluated in the VoicePrivacy Attacker Challenge 2024 framework across seven anonymization systems, and it improves de-anonymization on five of them.
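
The segment, rearrange, and concatenate steps described above can be sketched roughly as follows. This is a minimal illustration of the random strategy only, not the authors' implementation: the function name `seg_re_concat`, the `word_boundaries` input (assumed to come from some external word-level aligner), and the choice to place the original utterance before the rearranged copy are all assumptions made for the example.

```python
import numpy as np


def seg_re_concat(waveform, word_boundaries, rng=None):
    """Append a randomly reordered copy of the word segments to the utterance.

    waveform:        1-D array of audio samples (anonymized speech).
    word_boundaries: list of (start, end) sample indices, one pair per word.
                     Hypothetical input; the source of the alignment is assumed.
    rng:             optional numpy Generator for reproducible shuffling.
    """
    if rng is None:
        rng = np.random.default_rng()

    # 1. Segment the utterance at the word level.
    segments = [waveform[start:end] for start, end in word_boundaries]
    if not segments:
        return waveform.copy()

    # 2. Rearrange the segments (random strategy).
    order = rng.permutation(len(segments))
    rearranged = np.concatenate([segments[i] for i in order])

    # 3. Concatenate with the original utterance (ordering is an assumption).
    return np.concatenate([waveform, rearranged])
```

The augmented utterance would then be used as training input for the attacker's speaker verification model, so that the model sees the same speaker content both in its natural order and with long-term context broken up.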


Key Contributions

  • SegReConcat: a word-level segment rearrangement and concatenation data augmentation method for training attacker-side ASV systems that de-anonymize speaker-anonymized speech
  • Exploration of random vs. similarity-based rearrangement strategies to disrupt long-term contextual cues and force focus on short-term speaker traits
  • Evaluation across seven anonymization systems in the VoicePrivacy Attacker Challenge 2024 framework, showing improved de-anonymization on five of seven systems
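
The summary does not say how the similarity-based rearrangement is computed, so the following is only one possible reading, sketched under stated assumptions: each word segment has a fixed-size embedding (the `embeddings` input is hypothetical), similarity is cosine similarity, and segments are chained greedily so that each one is followed by its most similar unused neighbour. The paper may define the strategy differently.

```python
import numpy as np


def similarity_order(embeddings):
    """Greedy ordering of segments by cosine similarity (illustrative only).

    embeddings: array-like of shape (n_segments, dim), one vector per word
                segment. How these vectors are obtained is an assumption.
    Returns a list of segment indices, starting from segment 0.
    """
    emb = np.asarray(embeddings, dtype=float)
    n = len(emb)
    if n == 0:
        return []

    # Normalise rows so that the dot product equals cosine similarity.
    norms = np.linalg.norm(emb, axis=1, keepdims=True)
    unit = emb / np.clip(norms, 1e-12, None)
    sim = unit @ unit.T

    order = [0]
    remaining = set(range(1, n))
    while remaining:
        current = order[-1]
        # Pick the unused segment most similar to the one just placed.
        nxt = max(remaining, key=lambda j: sim[current, j])
        order.append(nxt)
        remaining.remove(nxt)
    return order
```

The returned index list could replace the random permutation in a random-shuffle implementation, which keeps the two strategies interchangeable behind the same interface.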

🛡️ Threat Analysis

Output Integrity Attack

The paper attacks the output integrity of ML-based voice anonymization systems: anonymized speech is supposed to hide the speaker's identity, but SegReConcat shows that residual identity cues persist and can be exploited. This is an attack on the anonymization guarantee claimed for the model's outputs, analogous to defeating a content protection scheme.


Details

Domains
audio
Model Types
traditional_ml
Threat Tags
black_box, inference_time
Datasets
VoicePrivacy Attacker Challenge 2024 datasets
Applications
speaker anonymization, speaker verification, voice de-identification