Mohamadreza Rostami

Papers in Database (1)

attack · arXiv · Sep 15, 2025 · Sep 2025

NeuroStrike: Neuron-Level Attacks on Aligned LLMs

Lichao Wu, Sasha Behrouzi, Mohamadreza Rostami et al. · Technical University of Darmstadt · University of Zagreb +1 more

Bypasses LLM safety alignment by pruning <0.6% of sparse safety neurons, achieving a 76.9% attack success rate (ASR) across 20+ aligned LLMs

Input Manipulation Attack · Prompt Injection · nlp · multimodal