Benjamin L. Edelman

benchmark · arXiv · Mar 16, 2026

How Vulnerable Are AI Agents to Indirect Prompt Injections? Insights from a Large-Scale Public Competition

Mateusz Dziemian, Maxwell Lin, Xiaohan Fu et al. · Gray Swan AI · OpenAI +6 more

A large-scale red-teaming competition finds all frontier LLM agents vulnerable to concealed indirect prompt injection attacks, with success rates of 0.5-8.5%.

Prompt Injection · Excessive Agency · nlp · multimodal

