Chongwen Zhao

Papers in Database (1)

defense · arXiv · Sep 1, 2025

Unraveling LLM Jailbreaks Through Safety Knowledge Neurons

Chongwen Zhao, Yutong Ke, Kaizhu Huang · Duke Kunshan University

Identifies safety-critical neurons in LLMs and proposes SafeTuning to reinforce them against jailbreak attacks.

Prompt Injection nlp