Sponge Tool Attack: Stealthy Denial-of-Efficiency against Tool-Augmented Agentic Reasoning

Qi Li , Xinchao Wang

3 citations · 45 references · arXiv

Published on arXiv

2601.17566

Model Denial of Service

OWASP LLM Top 10 — LLM04

Key Finding

STA successfully converts concise agentic reasoning trajectories into verbose, resource-intensive ones while preserving task semantics, validated across 6 models and 4 agentic frameworks under query-only access.

Sponge Tool Attack (STA)

Novel technique introduced


Enabling large language models (LLMs) to solve complex reasoning tasks is a key step toward artificial general intelligence. Recent work augments LLMs with external tools to enable agentic reasoning, achieving high utility and efficiency in a plug-and-play manner. However, the inherent vulnerability of such methods to malicious manipulation of the tool-calling process remains largely unexplored. In this work, we identify a tool-specific attack surface and propose the Sponge Tool Attack (STA), which disrupts agentic reasoning solely by rewriting the input prompt under a strict query-only access assumption. Without any modification to the underlying model or the external tools, STA converts originally concise and efficient reasoning trajectories into unnecessarily verbose and convoluted ones before the agent arrives at the final answer. This incurs substantial computational overhead while remaining stealthy, since the original task semantics and user intent are preserved. To achieve this, we design STA as an iterative, multi-agent collaborative framework with explicit rewrite-policy control, which generates benign-looking prompt rewrites of the original query with high semantic fidelity. Extensive experiments across 6 models (including both open-source models and closed-source APIs), 12 tools, 4 agentic frameworks, and 13 datasets spanning 5 domains validate the effectiveness of STA.


Key Contributions

  • Identifies 'Denial-of-Efficiency' (DoE) as a novel, underexplored attack surface in tool-augmented LLM agentic systems
  • Proposes Sponge Tool Attack (STA), an iterative multi-agent framework (Prompt Rewriter, Quality Judge, Policy Inductor) that rewrites user queries under strict query-only access to induce unnecessarily verbose and expensive reasoning trajectories
  • Validates STA extensively across 6 models, 12 tools, 4 agentic frameworks, and 13 datasets while demonstrating stealthiness by preserving original task semantics
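The iterative rewrite loop described above can be sketched as follows. This is a minimal toy illustration, not the paper's implementation: the three agents (Prompt Rewriter, Quality Judge, Policy Inductor) are LLM-driven in STA, whereas here they are stand-in functions, and all names and heuristics are hypothetical.

```python
# Hypothetical sketch of the STA loop: a Prompt Rewriter proposes a
# benign-looking rewrite under the current policy, a Quality Judge checks
# semantic fidelity and estimates the induced overhead, and a Policy
# Inductor updates the rewrite policy between iterations.

def rewrite(prompt: str, policy: list[str]) -> str:
    """Toy Prompt Rewriter: append policy-driven, semantics-preserving padding."""
    padding = " ".join(f"(please {p})" for p in policy)
    return f"{prompt} {padding}".strip()

def judge(original: str, candidate: str) -> tuple[bool, int]:
    """Toy Quality Judge: fidelity = original query preserved verbatim;
    overhead proxy = extra tokens the candidate adds."""
    faithful = original in candidate
    overhead = len(candidate.split()) - len(original.split())
    return faithful, overhead

def induce_policy(policy: list[str]) -> list[str]:
    """Toy Policy Inductor: escalate verbosity-inducing instructions."""
    return policy + ["double-check every intermediate tool result"]

def sponge_tool_attack(prompt: str, rounds: int = 3) -> str:
    """Iterate rewrite -> judge -> policy update; keep the costliest
    faithful rewrite seen so far."""
    policy = ["verify each step with all available tools"]
    best, best_overhead = prompt, 0
    for _ in range(rounds):
        candidate = rewrite(prompt, policy)
        faithful, overhead = judge(prompt, candidate)
        if faithful and overhead > best_overhead:
            best, best_overhead = candidate, overhead
        policy = induce_policy(policy)
    return best
```

The key property the loop preserves is stealth: each accepted candidate still contains the original query verbatim (the toy fidelity check), while the policy grows monotonically to induce ever-longer tool-calling trajectories.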

🛡️ Threat Analysis


Details

Domains
nlp
Model Types
llm
Threat Tags
black_box, inference_time, targeted
Datasets
PuzzleVQA, MathVista, OmniMath
Applications
tool-augmented llm agents, agentic reasoning systems, llm api services