Christopher A. Choquette-Choo

Papers in Database (1)

defense · arXiv · Mar 11, 2026

IH-Challenge: A Training Dataset to Improve Instruction Hierarchy on Frontier LLMs

Chuan Guo, Juan Felipe Ceron Uribe, Sicheng Zhu et al. · OpenAI

Proposes a reinforcement learning dataset that uses instruction hierarchy to train LLMs to resist jailbreaks, prompt injection, and system prompt extraction.

Prompt Injection · Sensitive Information Disclosure · nlp