QueryIPI: Query-agnostic Indirect Prompt Injection on Coding Agents
Yuchong Xie 1, Zesen Liu 1, Mingyu Luo 2, Zhixiang Zhang 1, Kaikai Zhang 1, Yuanyuan Yuan 1, Zongjie Li 1, Ping Chen 3, Shuai Wang 1, Dongdong She 2
Published on arXiv
2510.23675
Prompt Injection
OWASP LLM Top 10 — LLM01
Insecure Plugin Design
OWASP LLM Top 10 — LLM07
Key Finding
QueryIPI achieves up to 87% success rate on simulated coding agents, outperforming the best IPI baseline by 37 percentage points, and transfers to real-world coding agents.
QueryIPI
Novel technique introduced
Modern coding agents integrated into IDEs orchestrate powerful tools and high-privilege system access, creating a high-stakes attack surface. Prior work on Indirect Prompt Injection (IPI) is mainly query-specific, requiring particular user queries as triggers and leading to poor generalizability. We propose query-agnostic IPI, a new attack paradigm that reliably executes malicious payloads under arbitrary user queries. Our key insight is that malicious payloads should leverage the invariant prompt context (i.e., system prompt and tool descriptions) rather than variant user queries. We present QueryIPI, an automated framework that uses tool descriptions as optimizable payloads and refines them via iterative, prompt-based blackbox optimization. QueryIPI leverages system invariants for initial seed generation aligned with agent conventions, and iterative reflection to resolve instruction-following failures and safety refusals. Experiments on five simulated agents show that QueryIPI achieves up to 87% success rate, outperforming the best baseline (50%). Crucially, generated malicious descriptions transfer to real-world coding agents, highlighting a practical security risk.
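The refinement loop described in the abstract can be sketched, for evaluation or red-team harness purposes, roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the names `evaluate`, `optimize`, `toy_agent`, and `toy_reflect`, the three outcome labels, and the stopping rule are all hypothetical. In the paper, the agent and the reflection step are LLM-driven; here both are replaced by deterministic toy stand-ins that operate on placeholder tags, so the block only illustrates the control flow (seed, score across many queries, reflect on failures, retry).

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical outcome labels for one trial. IGNORED and REFUSED stand in for
# the two failure modes named in the abstract (instruction-following failure
# and safety refusal).
EXECUTED, IGNORED, REFUSED = "executed", "ignored", "refused"


@dataclass
class Candidate:
    description: str  # candidate tool description under test
    score: float      # fraction of queries on which the payload ran


def evaluate(description: str, queries: List[str],
             run_agent: Callable[[str, str], str]) -> Tuple[float, List[str]]:
    """Score one description across many queries (the query-agnostic part)."""
    if not queries:
        raise ValueError("need at least one query")
    outcomes = [run_agent(description, q) for q in queries]
    return outcomes.count(EXECUTED) / len(outcomes), outcomes


def optimize(seed: str, queries: List[str],
             run_agent: Callable[[str, str], str],
             reflect: Callable[[str, List[str]], str],
             max_iters: int = 10, target: float = 1.0) -> Candidate:
    """Black-box loop: only agent outcomes are observed, never internals."""
    best = Candidate(description=seed, score=-1.0)
    current = seed
    for _ in range(max_iters):
        rate, outcomes = evaluate(current, queries, run_agent)
        if rate > best.score:
            best = Candidate(current, rate)
        if rate >= target:
            break
        # Reflection: revise the description based on observed failures.
        current = reflect(current, outcomes)
    return best


# --- Toy stand-ins (assumptions for illustration only) ---------------------
def toy_agent(description: str, query: str) -> str:
    """Pretend agent whose behaviour depends on abstract placeholder tags."""
    if "[aligned]" not in description:
        return IGNORED   # description does not match agent conventions
    if "[reframed]" not in description:
        return REFUSED   # safety refusal
    return EXECUTED


def toy_reflect(description: str, outcomes: List[str]) -> str:
    """Pretend reflection step: patch whichever failure was observed."""
    if IGNORED in outcomes:
        return description + " [aligned]"
    if REFUSED in outcomes:
        return description + " [reframed]"
    return description


result = optimize("seed", ["query A", "query B", "query C"],
                  toy_agent, toy_reflect)
print(result)
```

The design point the sketch tries to capture is that each candidate is scored over a set of unrelated queries rather than a single trigger query, so a description only scores well if it works regardless of what the user asks.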
Key Contributions
- Introduces query-agnostic IPI, a new attack paradigm that exploits invariant prompt context (system prompts, tool descriptions) rather than user queries, enabling reliable execution under arbitrary user inputs.
- Proposes QueryIPI, an automated blackbox optimization framework that treats tool descriptions as optimizable payloads, using system invariants to generate initial seeds aligned with agent conventions and iterative reflection to resolve instruction-following failures and safety refusals.
- Demonstrates up to 87% attack success rate on five simulated coding agents (vs. 50% for the best baseline), with the generated malicious descriptions transferring to real-world coding agents.
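The headline comparison is stated in percentage points, which is easy to confuse with a relative improvement. The short sketch below works through the arithmetic on the two reported figures (87% and 50%). The helper `attack_success_rate` and its definition (fraction of trials in which the payload executed) are assumptions for illustration, and the relative gain is plain arithmetic on the two headline numbers, not a figure reported in the source.

```python
from typing import Sequence


def attack_success_rate(outcomes: Sequence[bool]) -> float:
    """Assumed definition: share of trials in which the payload executed."""
    if not outcomes:
        raise ValueError("need at least one trial")
    return sum(outcomes) / len(outcomes)


# Reported headline figures, as whole percentages.
queryipi_pct = 87   # best QueryIPI success rate ("up to 87%")
baseline_pct = 50   # best IPI baseline

gap_points = queryipi_pct - baseline_pct    # absolute gap in percentage points
relative_gain = gap_points / baseline_pct   # gap relative to the baseline

print(gap_points)      # 37
print(relative_gain)   # 0.74

# Hypothetical trial outcomes, only to show the helper in use.
print(attack_success_rate([True, True, False, True]))  # 0.75
```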