MAIA: An Inpainting-Based Approach for Music Adversarial Attacks

Music adversarial attacks have garnered significant interest in the field of Music Information Retrieval (MIR). In this paper, we present Music Adversarial Inpainting Attack (MAIA), a novel adversarial attack framework that supports both white-box and black-box attack scenarios. MAIA begins with an importance analysis to identify critical audio segments, which are then targeted for modification. Generative inpainting models then reconstruct these segments with guidance from the output of the attacked model, keeping the adversarial perturbations subtle yet effective. We evaluate MAIA on multiple MIR tasks, demonstrating high attack success rates in both white-box and black-box settings while maintaining minimal perceptual distortion. Additionally, subjective listening tests confirm the high audio fidelity of the adversarial samples. Our findings highlight vulnerabilities in current MIR systems and emphasize the need for more robust and secure models.

Key Contributions

Novel adversarial attack framework (MAIA) that selectively reconstructs critical audio segments via generative inpainting guided by model outputs, preserving musical coherence while causing misclassification
Black-box importance analysis using a coarse-to-fine query-based strategy to identify influential music segments without gradient access
Comprehensive evaluation combining objective attack success metrics and subjective listening tests across multiple MIR tasks (genre classification, cover song identification)
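
The coarse-to-fine, query-based importance analysis named in the contributions can be illustrated with a minimal sketch. The summary does not give the paper's exact procedure, so everything here is an assumption made for illustration: the function name `coarse_to_fine_importance`, the use of silence as the occlusion, the `n_splits` and `depth` parameters, and the hypothetical `score_fn` callable that returns the attacked model's confidence for the true class. The idea shown is only the general one: occlude large segments, query the model, keep the segment whose occlusion lowers the score most, and subdivide it.

```python
import numpy as np

def coarse_to_fine_importance(audio, score_fn, n_splits=4, depth=3):
    """Return (start, end) sample indices of an influential segment.

    score_fn is a hypothetical black-box callable: waveform -> confidence
    of the true class. No gradients are used, only queries.
    """
    start, end = 0, len(audio)
    base = score_fn(audio)  # confidence on the unmodified input
    for _ in range(depth):
        edges = np.linspace(start, end, n_splits + 1).astype(int)
        best_drop, best_span = -np.inf, (start, end)
        for lo, hi in zip(edges[:-1], edges[1:]):
            if hi <= lo:
                continue  # segment too short to split further
            masked = audio.copy()
            masked[lo:hi] = 0.0  # assumption: occlude by silencing
            drop = base - score_fn(masked)
            if drop > best_drop:
                best_drop, best_span = drop, (lo, hi)
        start, end = best_span  # refine inside the most influential segment
    return start, end
```

With these defaults the search issues one baseline query plus `n_splits * depth` occlusion queries, which is why a coarse-to-fine schedule is attractive when each query to the attacked model is costly.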

🛡️ Threat Analysis

Input Manipulation Attack

MAIA crafts adversarial audio inputs that cause misclassification in MIR systems at inference time. It uses Grad-CAM or query-based importance analysis to locate critical spectrogram regions, then reconstructs them via generative inpainting guided by model outputs, producing targeted evasion perturbations in both white-box and black-box threat models.
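
The reconstruction step can be sketched in the same hedged spirit. This is not MAIA's implementation: a real generative inpainting model is replaced here by a placeholder sampler (`noisy_copy_fill`, which returns the original segment plus small Gaussian noise), and "guidance from model outputs" is approximated by plain best-of-N selection on a hypothetical `attack_score` callable, for example the attacked model's confidence in the target class. All names and parameters are assumptions for illustration.

```python
import numpy as np

def noisy_copy_fill(audio, lo, hi, rng, scale=0.05):
    """Placeholder for a generative inpainting model (assumption):
    the original segment plus small Gaussian noise."""
    return audio[lo:hi] + scale * rng.standard_normal(hi - lo)

def guided_inpaint(audio, span, sample_fill, attack_score,
                   n_candidates=16, seed=0):
    """Best-of-N selection: refill only `span` and keep the candidate
    with the highest attack_score. Samples outside `span` are untouched."""
    rng = np.random.default_rng(seed)
    lo, hi = span
    best, best_score = audio.copy(), attack_score(audio)
    for _ in range(n_candidates):
        candidate = audio.copy()
        candidate[lo:hi] = sample_fill(audio, lo, hi, rng)
        score = attack_score(candidate)  # one query to the attacked model
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score
```

Because only the selected span is rewritten, the rest of the signal is left bit-for-bit identical, which mirrors the summary's point that the attack modifies critical segments rather than perturbing the whole input.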

Details

Domains

audio

Model Types

cnn, generative

Threat Tags

white_box, black_box, inference_time, targeted, digital

Applications

2025 0 cit.
