attack 2025

QueryIPI: Query-agnostic Indirect Prompt Injection on Coding Agents

Yuchong Xie ¹, Zesen Liu ¹, Mingyu Luo ², Zhixiang Zhang ¹, Kaikai Zhang ¹, Yuanyuan Yuan ¹, Zongjie Li ¹, Ping Chen ³, Shuai Wang ¹, Dongdong She ²

¹ The Hong Kong University of Science and Technology

² Fudan University

³ Tsinghua University

1 citation · 38 references · arXiv

Published on arXiv

2510.23675

Prompt Injection

OWASP LLM Top 10 — LLM01

Insecure Plugin Design

OWASP LLM Top 10 — LLM07

Key Finding

QueryIPI achieves up to 87% success rate on simulated coding agents, outperforming the best IPI baseline by 37 percentage points, and transfers to real-world coding agents.

QueryIPI

Novel technique introduced

Modern coding agents integrated into IDEs orchestrate powerful tools and high-privilege system access, creating a high-stakes attack surface. Prior work on Indirect Prompt Injection (IPI) is mainly query-specific, requiring particular user queries as triggers and leading to poor generalizability. We propose query-agnostic IPI, a new attack paradigm that reliably executes malicious payloads under arbitrary user queries. Our key insight is that malicious payloads should leverage the invariant prompt context (i.e., system prompt and tool descriptions) rather than variant user queries. We present QueryIPI, an automated framework that uses tool descriptions as optimizable payloads and refines them via iterative, prompt-based blackbox optimization. QueryIPI leverages system invariants for initial seed generation aligned with agent conventions, and iterative reflection to resolve instruction-following failures and safety refusals. Experiments on five simulated agents show that QueryIPI achieves up to 87% success rate, outperforming the best baseline (50%). Crucially, generated malicious descriptions transfer to real-world coding agents, highlighting a practical security risk.
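The iterative refinement described in the abstract can be pictured as a generic black-box search loop. The sketch below is only an illustration under stated assumptions: the function name `refine_description` and the callables `run_agent` and `reflect` are hypothetical placeholders, not the paper's code or API, and the summary above does not specify the actual seeding, scoring, or reflection prompts.

```python
from typing import Callable, List, Tuple


def refine_description(
    seed: str,
    queries: List[str],
    run_agent: Callable[[str, str], bool],
    reflect: Callable[[str, List[str]], str],
    max_iters: int = 5,
) -> Tuple[str, float]:
    """Generic black-box refinement loop (hypothetical sketch).

    run_agent(description, query) returns True when a simulated agent
    performs the target action for that query (assumed interface).
    reflect(description, failed_queries) returns a revised description
    (assumed interface, e.g. backed by an LLM in practice).
    """
    if not queries:
        raise ValueError("at least one query is required")

    best, best_rate = seed, -1.0
    current = seed
    for _ in range(max_iters):
        # Score the candidate across all queries, not a single trigger query.
        failed = [q for q in queries if not run_agent(current, q)]
        rate = 1.0 - len(failed) / len(queries)
        if rate > best_rate:
            best, best_rate = current, rate
        if not failed:
            break  # candidate works for every query in the set
        # Revise the candidate using the observed failures.
        current = reflect(current, failed)
    return best, best_rate
```

Scoring each candidate over the whole query set, rather than one query, is what makes the objective query-agnostic in this sketch; only the agent's observable behaviour is used, which matches the black-box setting the abstract describes.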

Key Contributions

Introduces query-agnostic IPI, a new attack paradigm that exploits invariant prompt context (system prompts, tool descriptions) rather than user queries, enabling reliable execution under arbitrary user inputs.
Proposes QueryIPI, an automated blackbox optimization framework that iteratively refines malicious tool descriptions using system invariants for seed generation and reflection to overcome safety refusals.
Demonstrates up to 87% attack success rate on five simulated coding agents (vs. 50% for the best baseline), with transferability to real-world coding agents.
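The reported figures compare success rates in percentage points, which differs from a relative improvement. A minimal sketch of the arithmetic follows, using the 87% and 50% values quoted above; the helper names and the per-trial outcome list are hypothetical and not taken from the paper.

```python
from typing import Sequence


def success_rate(outcomes: Sequence[bool]) -> float:
    """Fraction of trials in which the attack succeeded."""
    if not outcomes:
        raise ValueError("no trials recorded")
    return sum(outcomes) / len(outcomes)


def percentage_point_gap(rate_a: float, rate_b: float) -> float:
    """Absolute difference between two rates, in percentage points."""
    return round((rate_a - rate_b) * 100, 6)


def relative_improvement(rate_a: float, rate_b: float) -> float:
    """Improvement of rate_a over rate_b, as a percentage of rate_b."""
    return round((rate_a / rate_b - 1) * 100, 6)


# Hypothetical per-trial outcomes, for illustration only.
trials = [True, True, False, True]
print(success_rate(trials))              # 0.75

# Rates quoted in the summary above.
print(percentage_point_gap(0.87, 0.50))  # 37.0
print(relative_improvement(0.87, 0.50))  # 74.0
```

An 87% versus 50% success rate is therefore a gap of 37 percentage points, or a 74% relative improvement over the baseline.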

🛡️ Threat Analysis

Details

Domains

nlp

Model Types

llm

Threat Tags

black_box, inference_time, targeted, digital

Datasets

five simulated coding agents (custom evaluation environment)

Applications

coding agents, ide-integrated llm agents, llm tool-use agents

Similar Papers

SkillJect: Automating Stealthy Skill-Based Prompt Injection for Coding Agents with Trace-Driven Closed-Loop Refinement

Cuckoo Attack: Stealthy and Persistent Attacks Against AI-IDE

When Skills Lie: Hidden-Comment Injection in LLM Agents

STAC: When Innocent Tools Form Dangerous Chains to Jailbreak LLM Agents

Jailbreaking Large Language Models through Iterative Tool-Disguised Attacks via Reinforcement Learning

MCP-ITP: An Automated Framework for Implicit Tool Poisoning in MCP

Invisible Threats from Model Context Protocol: Generating Stealthy Injection Payload via Tree-based Adaptive Search

Automatic Red Teaming LLM-based Agents with Model Context Protocol Tools