Kyomin Jung

Papers in Database (3)

attack · arXiv · Sep 13, 2025 · Sep 2025

Harmful Prompt Laundering: Jailbreaking LLMs with Abductive Styles and Symbolic Encoding

Seongho Joo, Hyukhun Koh, Kyomin Jung · Seoul National University

Proposes HaPLa, a black-box LLM jailbreak that combines abductive framing with symbolic encoding, achieving an attack success rate above 95% on GPT models

Prompt Injection · nlp
defense · arXiv · Sep 13, 2025 · Sep 2025

Public Data Assisted Differentially Private In-Context Learning

Seongho Joo, Hyukhun Koh, Kyomin Jung · Seoul National University

Defends private LLM in-context learning against membership inference and data leakage using public-data-assisted differential privacy

Membership Inference Attack · Sensitive Information Disclosure · nlp
attack · arXiv · Aug 11, 2025 · Aug 2025

Can You Trick the Grader? Adversarial Persuasion of LLM Judges

Yerin Hwang, Dongryeol Lee, Taegwan Kang et al. · Seoul National University · LG AI Research

Embeds Aristotelian persuasion techniques in responses to manipulate LLM judges into inflating scores on incorrect math solutions by up to 8%

Prompt Injection · nlp