Seungwon Shin

h-index: 5 · 64 citations · 29 papers (total)

Papers in Database (3)

defense · arXiv · Sep 26, 2025

Defending MoE LLMs against Harmful Fine-Tuning via Safety Routing Alignment

Jaehan Kim, Minkyoo Song, Seungwon Shin et al. · KAIST

Defends MoE LLMs against harmful fine-tuning by penalizing routing drift away from safety-critical experts

Transfer Learning Attack · Prompt Injection · nlp
3 citations · 1 influential · PDF · Code
attack · arXiv · Jan 8, 2026

$PC^2$: Politically Controversial Content Generation via Jailbreaking Attacks on GPT-based Text-to-Image Models

Wonwoo Choi, Minjae Seo, Minkyoo Song et al.

Black-box jailbreak that bypasses the political-content safety filters of GPT-based text-to-image models via semantic obfuscation and cross-language fragmentation

Prompt Injection · nlp · vision · multimodal · generative
PDF
attack · arXiv · Feb 6, 2026

Subgraph Reconstruction Attacks on Graph RAG Deployments with Practical Defenses

Minkyoo Song, Jaehan Kim, Myungchul Kang et al. · KAIST · National Security Research Institute

Attacks Graph RAG systems to reconstruct proprietary knowledge graphs via multi-turn prompting, reaching an F1 score of 82.9 against safety-aligned LLMs

Sensitive Information Disclosure · nlp · graph
PDF