Han Fang

h-index: 0 · 0 citations · 0 papers (total)

Papers in Database (1)

defense · arXiv · Feb 6, 2026

TrapSuffix: Proactive Defense Against Adversarial Suffixes in Jailbreaking

Mengyao Du, Han Fang, Haokai Ma et al. · National University of Defense Technology · National University of Singapore +1 more

A proactive fine-tuning defense that either traps gradient-based jailbreak suffixes or fingerprints them, cutting the LLM attack success rate below 0.01%.

Input Manipulation Attack · Prompt Injection · nlp