Hongtao Xie

h-index: 0 · 0 citations · 3 papers (total)

Papers in Database (1)

defense · arXiv · Oct 2, 2025

UpSafe°C: Upcycling for Controllable Safety in Large Language Models

Yuhao Sun, Zhuoer Xu, Shiwen Cui et al. · University of Science and Technology of China · Independent Researcher

Defends LLMs against jailbreaks by upcycling safety-critical layers into Mixture-of-Experts (MoE) structures with dynamic safety temperature control.

Prompt Injection · nlp