ENJ: Optimizing Noise with Genetic Algorithms to Jailbreak LSMs

The widespread application of Large Speech Models (LSMs) has made their security risks increasingly prominent. Traditional speech adversarial attack methods face challenges in balancing effectiveness and stealth. This paper proposes Evolutionary Noise Jailbreak (ENJ), which utilizes a genetic algorithm to transform environmental noise from a passive interference into an actively optimizable attack carrier for jailbreaking LSMs. Through operations such as population initialization, crossover fusion, and probabilistic mutation, this method iteratively evolves a series of audio samples that fuse malicious instructions with background noise. These samples sound like harmless noise to humans but can induce the model to parse and execute harmful commands. Extensive experiments on multiple mainstream speech models show that ENJ's attack effectiveness is significantly superior to existing baseline methods. This research reveals the dual role of noise in speech security and provides new critical insights for model security defense in complex acoustic environments.

Key Contributions

ENJ framework that repurposes real-world environmental noise (traffic, rain, ambient chatter) as an optimizable attack carrier against LSM safety alignment
Genetic algorithm-based optimization (population initialization, crossover fusion, probabilistic mutation) that evolves audio samples fusing malicious instructions with naturalistic background sounds
Empirical demonstration of a 95% average Attack Success Rate across multiple mainstream LSMs, significantly outperforming text and audio jailbreak baselines
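
The three genetic operators named above (population initialization, crossover fusion, probabilistic mutation) can be illustrated with a generic, toy evolutionary loop. This is only a sketch of the general technique, not the authors' implementation: the source does not specify the fitness function, selection scheme, or hyperparameters, so every name and default here (`fitness`, `evolve`, `elite`, `rate`, `scale`) is an assumption. In particular, the fitness is a hypothetical stand-in (closeness to a fixed target vector) and no speech model is queried.

```python
import random


def fitness(candidate, target):
    # Hypothetical stand-in score: negative squared distance to a target vector.
    # The source does not describe the actual scoring function.
    return -sum((c - t) ** 2 for c, t in zip(candidate, target))


def init_population(size, length, rng):
    # Population initialization: random waveform-like vectors in [-1, 1].
    return [[rng.uniform(-1.0, 1.0) for _ in range(length)] for _ in range(size)]


def crossover(a, b, rng):
    # Crossover fusion, sketched here as a random weighted blend of two parents.
    w = rng.random()
    return [w * x + (1.0 - w) * y for x, y in zip(a, b)]


def mutate(candidate, rate, scale, rng):
    # Probabilistic mutation: each sample is perturbed with probability `rate`,
    # and perturbed samples are clamped back into [-1, 1].
    return [
        max(-1.0, min(1.0, x + rng.gauss(0.0, scale))) if rng.random() < rate else x
        for x in candidate
    ]


def evolve(target, pop_size=30, generations=50, elite=5, rate=0.1, scale=0.1, seed=0):
    rng = random.Random(seed)
    population = init_population(pop_size, len(target), rng)
    history = []  # best fitness seen at the start of each generation
    for _ in range(generations):
        population.sort(key=lambda c: fitness(c, target), reverse=True)
        history.append(fitness(population[0], target))
        parents = population[:elite]  # elitism: top candidates survive unchanged
        children = []
        while len(children) < pop_size - elite:
            a, b = rng.sample(parents, 2)
            children.append(mutate(crossover(a, b, rng), rate, scale, rng))
        population = parents + children
    population.sort(key=lambda c: fitness(c, target), reverse=True)
    history.append(fitness(population[0], target))
    return population[0], history


if __name__ == "__main__":
    best, history = evolve([0.5, -0.25, 0.0, 0.75])
    print("first best fitness:", history[0])
    print("final best fitness:", history[-1])
```

Because the top `elite` candidates are carried over unchanged, the best fitness in this sketch never decreases from one generation to the next. That is a property of this particular selection choice, not something the source states about ENJ.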

🛡️ Threat Analysis

Input Manipulation Attack

ENJ systematically optimizes audio inputs with a genetic algorithm (population initialization, crossover, mutation) to craft adversarial audio samples that bypass safety filters. This is optimization-based input manipulation at inference time, analogous to adversarial patches and perturbations against vision models, applied here to audio.