Jizhou Huang

h-index: 2 · 12 citations · 10 papers (total)

Papers in Database (1)

defense · arXiv · Jan 29, 2026

Stay in Character, Stay Safe: Dual-Cycle Adversarial Self-Evolution for Safety Role-Playing Agents

Mingyang Liao, Yichen Wan, Shuchen Wu et al. · Baidu Inc. · The University of Queensland +1 more

Training-free dual-cycle framework defends LLM role-playing agents against jailbreaks while preserving persona fidelity via evolving hierarchical knowledge

Prompt Injection · nlp