Wei Zhang

Papers in Database (1)

attack · arXiv · Apr 8, 2026

MirageBackdoor: A Stealthy Attack that Induces Think-Well-Answer-Wrong Reasoning

Yizhe Zeng, Wei Zhang, Yunpeng Li et al. · Chinese Academy of Sciences · University of Chinese Academy of Sciences +1 more

Backdoor attack on chain-of-thought (CoT) reasoning LLMs that produces correct reasoning but wrong final answers, evading process-monitoring defenses

Model Poisoning · Training Data Poisoning · nlp