Published on arXiv

2510.13357

Model Inversion Attack

OWASP ML Top 10 — ML03

Key Finding

Accent information can be reliably inferred from all three federated ASR models; attributes underrepresented in pre-training data exhibit the highest leakage vulnerability.

Centroid-based attribute inference via weight statistics

Novel technique introduced


Federated learning is a common method for privacy-preserving training of machine learning models. In this paper, we analyze the vulnerability of ASR models to attribute inference attacks in the federated setting. We test a non-parametric white-box attack method under a passive threat model on three ASR models: Wav2Vec2, HuBERT, and Whisper. The attack operates solely on weight differentials without access to raw speech from target speakers. We demonstrate attack feasibility on sensitive demographic and clinical attributes: gender, age, accent, emotion, and dysarthria. Our findings indicate that attributes that are underrepresented or absent in the pre-training data are more vulnerable to such inference attacks. In particular, information about accents can be reliably inferred from all models. Our findings expose previously undocumented vulnerabilities in federated ASR models and offer insights towards improved security.


Key Contributions

  • Demonstrates feasibility of non-parametric attribute inference attacks against federated ASR models (Wav2Vec2, HuBERT, Whisper) using only weight differentials, with no access to raw speech
  • Shadow model approach: fine-tunes global model on labeled public speech to build class centroids from weight statistics, then predicts attributes via normalized Euclidean distance
  • Finds that attributes underrepresented or absent in pre-training data (notably accent) are most vulnerable to leakage across all tested models
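The shadow-model pipeline above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names (`build_centroids`, `infer_attribute`) are invented here, and model updates are assumed to already be flattened into 1-D vectors of weight-difference statistics.

```python
import numpy as np

def build_centroids(shadow_stats, shadow_labels):
    """Average the shadow-update statistic vectors per attribute class
    (e.g. per accent or gender label from labeled public speech)."""
    centroids = {}
    for label in set(shadow_labels):
        vecs = [v for v, l in zip(shadow_stats, shadow_labels) if l == label]
        centroids[label] = np.mean(vecs, axis=0)
    return centroids

def infer_attribute(target_stats, centroids):
    """Predict the class whose centroid is nearest to the target client's
    update statistics under a normalized Euclidean distance."""
    t = (target_stats - target_stats.mean()) / (target_stats.std() + 1e-12)
    best, best_d = None, float("inf")
    for label, c in centroids.items():
        cn = (c - c.mean()) / (c.std() + 1e-12)
        d = np.linalg.norm(t - cn)
        if d < best_d:
            best, best_d = label, d
    return best
```

The attack is non-parametric: nothing is trained beyond averaging shadow statistics, so the adversary needs only a labeled public corpus and the ability to fine-tune the same global model.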

🛡️ Threat Analysis

Model Inversion Attack

The attack reconstructs private personal attributes (gender, age, accent, emotion, dysarthria) from model weight updates in a federated learning setting — a gradient/weight leakage attack where a passive server-side adversary infers private training data properties from shared model updates without accessing raw audio.
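A minimal sketch of what this passive adversary observes: the per-layer difference between the shared global model and a client's submitted update, reduced to summary statistics. The dict-of-arrays representation and helper names (`weight_differential`, `summarize`) are illustrative assumptions, not the paper's code.

```python
import numpy as np

def weight_differential(global_weights, client_weights):
    """Per-layer weight differential the server sees when a client
    submits its fine-tuned update; no raw audio is involved."""
    return {name: client_weights[name] - global_weights[name]
            for name in global_weights}

def summarize(delta):
    """Flatten per-layer differentials into a statistics vector
    (here: mean and std per layer) for the centroid comparison."""
    feats = []
    for name in sorted(delta):
        d = delta[name].ravel()
        feats.extend([d.mean(), d.std()])
    return np.array(feats)
```

Because only these aggregate update statistics are needed, the attack fits the honest-but-curious server model: the adversary never deviates from the protocol, it merely inspects what federated averaging already delivers.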


Details

Domains
audio, federated-learning
Model Types
transformer, federated
Threat Tags
white_box, training_time
Datasets
Speech Accent Archive, TORGO, RAVDESS
Applications
automatic speech recognition, federated speech systems, voice assistants