Yi Liu

h-index: 2 · 11 citations · 4 papers (total)

Papers in Database (1)

defense · arXiv · Jan 27, 2026

LLM-VA: Resolving the Jailbreak-Overrefusal Trade-off via Vector Alignment

Haonan Zhang, Dongxia Wang, Yi Liu et al. · Zhejiang University · Huzhou Institute of Industrial Control Technology +1 more

Defends LLMs against both jailbreaks and over-refusal by aligning safety and answer vectors through closed-form weight updates

Prompt Injection · nlp