Dachuan Lin

h-index: 1 · 2 citations · 2 papers (total)

Papers in Database (2)

benchmark · arXiv · Nov 9, 2025

Efficient LLM Safety Evaluation through Multi-Agent Debate

Dachuan Lin, Guobin Shen, Zihao Yang et al. · Beijing Institute of AI Safety and Governance · Chinese Academy of Sciences +3 more

Proposes an SLM multi-agent debate judge and HAJailBench, a benchmark for evaluating LLM jailbreak safety at 43% lower inference cost

Prompt Injection · nlp
1 citation · PDF
defense · arXiv · Sep 25, 2025

Bidirectional Intention Inference Enhances LLMs' Defense Against Multi-Turn Jailbreak Attacks

Haibo Tong, Dongcheng Zhao, Guobin Shen et al. · University of Chinese Academy of Sciences · Long-term AI +3 more

Defends LLMs against multi-turn jailbreak attacks via bidirectional intention inference over the conversation history

Prompt Injection · nlp
1 citation · PDF