Zhiyu Ni

benchmark arXiv Jan 12, 2026 · 12w ago

Defenses Against Prompt Attacks Learn Surface Heuristics

Shawn Li, Chenxiao Yu, Zhiyu Ni et al. · University of Southern California · University of California +3 more

Exposes three shortcut biases in LLM prompt-injection defenses: position, token-trigger, and topic generalization—causing up to 90% false rejection rates

Prompt Injection nlp

PDF Code

Papers in Database (1)

Defenses Against Prompt Attacks Learn Surface Heuristics