I Guess That's Why They Call it the Blues: Causal Analysis for Audio Classifiers
David A. Kelly, Hana Chockler
Published on arXiv (2601.16675)
Input Manipulation Attack
OWASP ML Top 10 — ML01
Key Finding
Modifying a single frequency out of ~240,000 changes an audio classifier's output 58% of the time; modifying five frequencies raises the success rate to 78%, and practically inaudible STFT-based perturbations succeed 62% of the time.
FreqReX
Novel technique introduced
It is well known that audio classifiers often rely on non-musically relevant features and spurious correlations, which makes them easy to manipulate or confuse into producing wrong classifications. While inducing a misclassification is not hard, until now the set of features that classifiers rely on was not well understood. In this paper we introduce a new method that uses causal reasoning to discover features of the frequency space that are sufficient and necessary for a given classification. We describe an implementation of this algorithm in the tool FreqReX and provide experimental results on a number of standard benchmark datasets. Our experiments show that causally sufficient and necessary subsets allow us to manipulate the outputs of the models in a variety of ways by changing the input only very slightly: a change to one out of 240,000 frequencies alters the classification 58% of the time, and the change can be so small that it is practically inaudible. These results show that causal analysis is useful for understanding the reasoning process of audio classifiers and can be used to manipulate their outputs.
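The paper's exact FreqReX algorithm is not reproduced here, but the core idea of the abstract — querying a black-box classifier while masking frequency bins to find a small subset that is sufficient for the original label — can be sketched as follows. The toy `classify` function and the greedy elimination order are illustrative assumptions, not the paper's method:

```python
import numpy as np

def classify(signal):
    # Stand-in for a black-box audio classifier (hypothetical):
    # label 1 if the dominant frequency lies in the upper half-band.
    spectrum = np.abs(np.fft.rfft(signal))
    return int(np.argmax(spectrum) > len(spectrum) // 2)

def sufficient_frequency_subset(signal, classifier):
    """Greedily shrink the set of frequency bins that, kept alone
    (all others zeroed), still yields the original label."""
    spectrum = np.fft.rfft(signal)
    label = classifier(signal)
    keep = set(range(len(spectrum)))
    # Try discarding bins in order of ascending magnitude.
    for i in np.argsort(np.abs(spectrum)):
        trial = keep - {int(i)}
        masked = np.zeros(len(spectrum), dtype=complex)
        idx = list(trial)
        masked[idx] = spectrum[idx]
        if classifier(np.fft.irfft(masked, n=len(signal))) == label:
            keep = trial  # bin i was not needed; drop it for good
    return keep

# Toy example: a signal dominated by a single 400-cycle tone.
rng = np.random.default_rng(0)
t = np.arange(1024) / 1024
sig = np.sin(2 * np.pi * 400 * t) + 0.01 * rng.standard_normal(1024)
subset = sufficient_frequency_subset(sig, classify)
print(sorted(subset))  # → [400]: one bin suffices for the label
```

A subset found this way is *sufficient* (keeping it preserves the label); the complementary check — whether zeroing it flips the label — tests *necessity*, and together they explain why perturbing a single causally necessary bin can change the output.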
Key Contributions
- FreqReX: a causal-reasoning framework that decomposes audio signals into sufficient, necessary, and complete frequency subsets with respect to a classifier's decision
- Demonstrates that minimal frequency perturbations (as few as 1 out of ~240,000) can change audio classifier output 58% of the time, with STFT variants yielding practically inaudible changes effective 62% of the time
- First application of formal actual-causality theory to audio signals, evaluated across 8 models on music genre and sung emotion classification datasets
🛡️ Threat Analysis
FreqReX generates adversarial perturbations at inference time: changing as few as one out of ~240,000 frequencies flips the classification 58% of the time, and STFT-based changes can be small enough to be practically inaudible. The causal-subset approach to finding minimal adversarial perturbations is a novel black-box attack technique against audio classifiers.
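The paper does not detail its STFT perturbation procedure here, but the mechanism behind "inaudible" attacks is easy to illustrate: a single time-frequency bin carries a vanishingly small fraction of the waveform's energy, so editing one bin and inverting leaves the audio almost unchanged. A minimal sketch using `scipy.signal.stft`/`istft` (the bin indices and perturbation size are arbitrary assumptions):

```python
import numpy as np
from scipy.signal import stft, istft

fs = 16_000
audio = 0.5 * np.sin(2 * np.pi * 440 * np.arange(fs) / fs)  # 1 s, 440 Hz tone

# Forward STFT, nudge a single time-frequency bin, invert.
f, t, Z = stft(audio, fs=fs, nperseg=512)
Z_adv = Z.copy()
Z_adv[100, 10] += 0.01                      # hypothetical single-bin edit
_, audio_adv = istft(Z_adv, fs=fs, nperseg=512)

# Compare over the common length (istft output may be padded slightly).
n = min(len(audio), len(audio_adv))
noise = audio[:n] - audio_adv[:n]
snr_db = 10 * np.log10(np.sum(audio[:n] ** 2) / np.sum(noise ** 2))
print(f"perturbation SNR: {snr_db:.1f} dB")
```

The resulting signal-to-perturbation ratio is very high, which is why such a change can flip a classifier's output while remaining practically inaudible to a listener.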