Ting Wang

benchmark · arXiv · Feb 18, 2026

Tanqiu Jiang, Yuhui Wang, Jiacheng Liang et al. · Stony Brook University

Benchmark evaluating LLM agent susceptibility to five long-horizon attack types across 28 agentic environments and 644 test cases

Prompt Injection · Excessive Agency · nlp

1 citation

Papers in Database (1)