Hamin Koo

h-index: 1 · 18 citations · 3 papers (total)

Papers in Database (1)

Attack · arXiv · Nov 3, 2025

Align to Misalign: Automatic LLM Jailbreak with Meta-Optimized LLM Judges

Hamin Koo, Minseon Kim, Jaehyung Kim · Yonsei University · Microsoft Research

Meta-optimized bi-level framework co-evolves jailbreak prompts and LLM judge templates to achieve SOTA attack success rates on Claude models

Prompt Injection · NLP
1 citation · PDF