Sachin Kumar

defense · arXiv · Oct 30, 2025

Zishuo Zheng, Vidhisha Balachandran, Chan Young Park et al. · The Ohio State University · Microsoft Research +1 more

Trains LLMs via RL on instruction-hierarchy data to resist jailbreaks and prompt injection, cutting attack success rates by 20%

Prompt Injection nlp

1 citation

Papers in Database (1)