Jaewoo Kang

h-index: 9 · 324 citations · 20 papers (total)

Papers in Database (1)

defense · arXiv · Sep 30, 2025 · Sep 2025

ASGuard: Activation-Scaling Guard to Mitigate Targeted Jailbreaking Attack

Yein Park, Jungwoo Park, Jaewoo Kang · Korea University · AIGEN Sciences

Defends LLMs against tense-rephrasing jailbreaks via circuit analysis and activation-scaling preventative fine-tuning

Prompt Injection · nlp