Zhi Rui Tam

h-index: 7 · 1,060 citations · 19 papers (total)

Papers in Database (1)

benchmark · arXiv · Feb 2, 2026

Expected Harm: Rethinking Safety Evaluation of (Mis)Aligned LLMs

Yen-Shan Chen, Zhi Rui Tam, Cheng-Kuang Wu et al. · National Taiwan University · Independent Researcher

Reveals LLM safety miscalibration via Expected Harm metric, boosting existing jailbreak success rates by up to 2×

Prompt Injection nlp