Precision-Varying Prediction (PVP): Robustifying ASR systems against adversarial attacks
Matías Pizarro, Raghavan Narasimhan, Asja Fischer
Published on arXiv (2603.22590)
Input Manipulation Attack
OWASP ML Top 10 — ML01
Key Finding
PVP significantly increases robustness and achieves competitive detection performance across a range of ASR models and attack types, without retraining or architecture modification.
Precision-Varying Prediction (PVP)
Novel technique introduced
With the increasing deployment of automated and agentic systems, ensuring the adversarial robustness of automatic speech recognition (ASR) models has become critical. We observe that changing the precision of an ASR model during inference reduces the likelihood of adversarial attacks succeeding. We exploit this fact to make models more robust by randomly sampling the precision at each prediction. Moreover, the same insight can be turned into an adversarial-example detection strategy by comparing outputs produced at different precisions and applying a simple Gaussian classifier to the disagreement. An experimental analysis demonstrates a significant increase in robustness and competitive detection performance for various ASR models and attack types.
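The core mechanism described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the forward pass is reduced to a single linear layer, and the names `predict` and `pvp_predict` are hypothetical.

```python
import random
import numpy as np

# Candidate precisions to sample from at inference time (illustrative set).
PRECISIONS = [np.float16, np.float32, np.float64]

def predict(x, w, dtype):
    # Stand-in for an ASR forward pass: a single linear layer computed
    # entirely at the chosen numerical precision, then cast back.
    return (x.astype(dtype) @ w.astype(dtype)).astype(np.float64)

def pvp_predict(x, w, rng=random):
    # The PVP idea: instead of fixing the precision, sample it per
    # prediction. An adversarial perturbation crafted against one
    # precision is less likely to transfer to the one actually used.
    dtype = rng.choice(PRECISIONS)
    return predict(x, w, dtype)
```

Because the attacker cannot know which precision will be drawn, the perturbation must succeed across all of them at once, which the paper finds is substantially harder.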
Key Contributions
- Demonstrates that adversarial examples exhibit reduced transferability across different numerical precision settings
- Proposes Precision-Varying Prediction (PVP), a training-free defense that randomly samples the numerical precision during inference to improve robustness
- Introduces a lightweight Gaussian classifier for adversarial detection by comparing ASR outputs across different precisions
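The detection contribution can also be sketched under assumptions: transcribe the same input at two precisions, measure the disagreement (here plain Levenshtein distance; the paper's exact feature may differ), and score it under a Gaussian fitted to disagreements observed on benign audio. The threshold and helper names are illustrative, not from the paper.

```python
import math

def levenshtein(a, b):
    # Edit distance between two transcripts (dynamic programming).
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def fit_gaussian(benign_distances):
    # Fit mean and std of cross-precision disagreement on benign inputs.
    mu = sum(benign_distances) / len(benign_distances)
    var = sum((d - mu) ** 2 for d in benign_distances) / len(benign_distances)
    return mu, math.sqrt(var) or 1e-9

def is_adversarial(t_low, t_high, mu, sigma, z_thresh=3.0):
    # Flag inputs whose low- vs high-precision transcripts disagree far
    # more than benign audio does (illustrative z-score threshold).
    d = levenshtein(t_low, t_high)
    return (d - mu) / sigma > z_thresh
```

Benign audio transcribes almost identically at every precision, so its disagreement distribution is tight; an adversarial input that only fools the model at one precision produces an outlier distance and is flagged.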
🛡️ Threat Analysis
The paper directly addresses adversarial examples targeting ASR models at inference time, proposing both a robustness enhancement technique (random precision sampling) and a detection method (a Gaussian classifier comparing outputs across precisions) to defend against input manipulation attacks.