Jaehan Kim

defense arXiv Sep 26, 2025

Jaehan Kim, Minkyoo Song, Seungwon Shin et al. · KAIST

Defends MoE LLMs against harmful fine-tuning by penalizing routing drift away from safety-critical experts

Transfer Learning Attack · Prompt Injection · nlp

3 citations · 1 influential · PDF · Code

attack arXiv Feb 6, 2026 · 8w ago

Minkyoo Song, Jaehan Kim, Myungchul Kang et al. · KAIST · National Security Research Institute

Attacks Graph RAG systems to reconstruct proprietary knowledge graphs via multi-turn prompting, reaching 82.9 F1 against safety-aligned LLMs

Sensitive Information Disclosure · nlp · graph

Papers in Database (2)