Zhen Leng Thai

Papers in Database (1)

defense arXiv Sep 4, 2025 · Sep 2025

Between a Rock and a Hard Place: The Tension Between Ethical Reasoning and Safety Alignment in LLMs

Shei Pern Chua, Zhen Leng Thai, Kai Jun Teh et al. · Tsinghua University · ByteDance +1 more

A multi-turn jailbreak embeds harmful requests in ethical dilemmas to bypass LLM safety alignment; a LoRA-based defense separates analytic from instrumental harmful responses

Prompt Injection nlp