Weiyang Guo

attack arXiv Apr 10, 2026

Weiyang Guo, Zesheng Shi, Zeen Zhu et al. · Harbin Institute of Technology · Huawei Technologies

Backdoor attack on RLVR-trained LLMs that implants jailbreak triggers using 2% poisoned data, degrading safety by 73%

Model Poisoning Transfer Learning Attack Prompt Injection nlp reinforcement-learning

Papers in Database (1)