Yadong Liu

h-index: 1 · 18 citations · 4 papers (total)

Papers in Database (1)

defense arXiv Oct 23, 2025 · Oct 2025

SAID: Empowering Large Language Models with Self-Activating Internal Defense

Yulong Chen, Yadong Liu, Jiawen Zhang et al. · Harbin Institute of Technology · Sun Yat-Sen University

Defends LLMs against jailbreaks by activating internal safety via intent distillation and prefix-based causal probing

Prompt Injection nlp