Lichao Wu

h-index: 14 · 958 citations · 45 papers (total)

Papers in Database (1)

attack · arXiv · Feb 9, 2026 · 8w ago

Large Language Lobotomy: Jailbreaking Mixture-of-Experts via Expert Silencing

Jona te Lintelo, Lichao Wu, Stjepan Picek · Radboud University · Technical University of Darmstadt +1 more

Jailbreaks MoE LLMs by silencing safety-critical experts at inference time, raising the attack success rate from 7.3% to 70.4%

Prompt Injection · nlp