Lihao Yin

h-index: 4 · 21 citations · 10 papers (total)

Papers in Database (1)

benchmark arXiv Jan 7, 2026 · 12w ago

What Matters For Safety Alignment?

Xing Li, Hui-Ling Zhen, Lihao Yin et al. · Huawei Technologies

Large-scale safety alignment benchmark evaluating 32 LLMs against 56 jailbreak techniques, finding that chain-of-thought (CoT) prefix attacks raise attack success rate (ASR) by 3.34x

Prompt Injection nlp