Jaehyung Kim

attack arXiv Nov 3, 2025

Hamin Koo, Minseon Kim, Jaehyung Kim · Yonsei University · Microsoft Research

Meta-optimized bi-level framework co-evolves jailbreak prompts and LLM judge templates to achieve SOTA attack success rates on Claude models

Prompt Injection nlp

1 citation

attack arXiv Jan 16, 2026

Minseo Kwak, Jaehyung Kim · Yonsei University

Novel LLM membership inference attack using top-1 prediction probability gaps and sliding window correlation to detect pretraining data

Membership Inference Attack nlp

Papers in Database (2)