Bo Jin

Papers in Database (1)

attack · arXiv · Sep 4, 2025

MEUV: Achieving Fine-Grained Capability Activation in Large Language Models via Mutually Exclusive Unlock Vectors

Xin Tong, Zhi Lin, Jingya Wang et al. · People’s Public Security University of China · Tsinghua University +2 more

Factorizes LLM refusal directions into topic-specific vectors, enabling a fine-grained, semantically controlled bypass of safety alignment

Prompt Injection nlp