Chanyoung Park

defense arXiv Apr 21, 2026

Yeonjun In, Wonjoong Kim, Sangwu Park et al. · KAIST

Safety alignment for reasoning LLMs via structured reasoning that assesses harmfulness before solving, reducing unsafe outputs

Prompt Injection nlp

benchmark arXiv Jan 9, 2025

Hyeonsoo Jo, Hyunjin Hwang, Fanchen Bu et al. · KAIST

Proposes HideNSeek, a learnable graph attack noticeability measure that outperforms 11 baselines in identifying adversarial edges on GNNs

Input Manipulation Attack graph

Papers in Database (2)