Ali Modarressi

Papers in Database (1)

defense arXiv Sep 11, 2025 · Sep 2025

Steering MoE LLMs via Expert (De)Activation

Mohsen Fayyaz, Ali Modarressi, Hanieh Deilamsalehy et al. · University of California · Adobe Research +2 more

Steers LLM safety at inference time by activating or deactivating MoE experts; used adversarially, it lowers safety by up to 100% when combined with jailbreaks

Prompt Injection nlp