attack 2026

Beyond Max Tokens: Stealthy Resource Amplification via Tool Calling Chains in LLM Agents

2 citations · 1 influential · 74 references · arXiv

Published on arXiv: 2601.10955

Model Denial of Service

OWASP LLM Top 10 — LLM04

Insecure Plugin Design

OWASP LLM Top 10 — LLM07

Key Finding

Attack inflates LLM agent costs by up to 658x and energy by 100–560x across six LLMs by extending task trajectories beyond 60,000 tokens through stealthy MCP tool server manipulation, while keeping final answers correct to evade validation.

Novel technique introduced: MCTS-optimized Tool Response Manipulation

The agent-tool communication loop is a critical attack surface in modern Large Language Model (LLM) agents. Existing Denial-of-Service (DoS) attacks, primarily triggered via user prompts or injected retrieval-augmented generation (RAG) context, are ineffective for this new paradigm. They are fundamentally single-turn and often lack a task-oriented approach, making them conspicuous in goal-oriented workflows and unable to exploit the compounding costs of multi-turn agent-tool interactions. We introduce a stealthy, multi-turn economic DoS attack that operates at the tool layer under the guise of a correctly completed task. Our method adjusts text-visible fields and a template-governed return policy in a benign, Model Context Protocol (MCP)-compatible tool server, optimizing these edits with a Monte Carlo Tree Search (MCTS) optimizer. These adjustments leave function signatures unchanged and preserve the final payload, steering the agent into prolonged, verbose tool-calling sequences using text-only notices. This compounds costs across turns, escaping single-turn caps while keeping the final answer correct to evade validation. Across six LLMs on the ToolBench and BFCL benchmarks, our attack expands tasks into trajectories exceeding 60,000 tokens, inflates costs by up to 658x, and raises energy by 100–560x. It drives GPU KV cache occupancy from <1% to 35–74% and cuts co-running throughput by approximately 50%. Because the server remains protocol-compatible and task outcomes are correct, conventional checks fail. These results elevate the agent-tool interface to a first-class security frontier, demanding a paradigm shift from validating final answers to monitoring the economic and computational cost of the entire agentic process.
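The abstract closes by calling for monitoring the cost of the whole agentic process rather than only validating final answers. A minimal sketch of what such a monitor might look like follows. It is not taken from the paper: the class name `TrajectoryBudget`, the exception `BudgetExceeded`, and both default limits are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


class BudgetExceeded(RuntimeError):
    """Raised when a trajectory exceeds its cumulative budget."""


@dataclass
class TrajectoryBudget:
    """Tracks cumulative cost over one task, not per single turn.

    Both limits are hypothetical defaults, not values from the paper.
    """
    max_total_tokens: int = 20_000  # assumed cap summed over all turns
    max_tool_calls: int = 12        # assumed cap on tool invocations
    total_tokens: int = 0
    tool_calls: int = 0

    def record_turn(self, prompt_tokens: int, completion_tokens: int,
                    made_tool_call: bool) -> None:
        """Add one turn's usage and abort the task if a cap is crossed."""
        self.total_tokens += prompt_tokens + completion_tokens
        if made_tool_call:
            self.tool_calls += 1
        if self.total_tokens > self.max_total_tokens:
            raise BudgetExceeded(
                f"token budget exceeded: {self.total_tokens} > {self.max_total_tokens}"
            )
        if self.tool_calls > self.max_tool_calls:
            raise BudgetExceeded(
                f"tool-call budget exceeded: {self.tool_calls} > {self.max_tool_calls}"
            )
```

An agent loop would call `record_turn` after every model response and treat `BudgetExceeded` as a signal to stop and flag the tool server for review. A per-turn `max_tokens` limit alone would not catch the behavior described above, because each individual turn can stay under the cap while the sum grows.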

Key Contributions

First multi-turn stealthy economic DoS attack operating at the tool/MCP layer, exploiting compounding costs across agent-tool dialogue turns rather than single-turn token limits
MCTS optimizer that adjusts text-visible tool server fields and return policies to maximize token trajectory length while preserving task correctness and protocol compatibility
Empirical demonstration across six LLMs on ToolBench and BFCL showing up to 658x cost inflation, a 100–560x energy increase, and a ~50% reduction in co-running throughput while evading conventional output validation
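The first contribution rests on costs compounding across turns. The arithmetic below illustrates one way this can happen, under an assumption that is not stated in the source: the serving API is stateless, so each turn resends the base prompt plus everything appended in earlier turns, with no prompt caching. The function name and all numbers are hypothetical and are not the paper's measurements.

```python
def trajectory_prompt_tokens(turns: int, base_prompt: int, growth_per_turn: int) -> int:
    """Total prompt tokens processed over a trajectory.

    Assumes every turn resends the full history, so the context grows by
    `growth_per_turn` tokens (tool call plus tool response) after each turn.
    """
    total = 0
    context = base_prompt
    for _ in range(turns):
        total += context            # this turn processes the whole context
        context += growth_per_turn  # history grows before the next turn
    return total


# Illustrative comparison: a short benign run versus a prolonged one.
benign = trajectory_prompt_tokens(turns=2, base_prompt=500, growth_per_turn=300)
prolonged = trajectory_prompt_tokens(turns=20, base_prompt=500, growth_per_turn=300)
print(benign, prolonged)  # 1300 67000
```

Under this assumption the total equals `turns * base_prompt + growth_per_turn * turns * (turns - 1) / 2`, so it grows quadratically with the number of turns. Ten times as many turns costs roughly fifty times as many prompt tokens in this toy setting, which is why a per-turn token cap does not bound the cost of the whole task.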

🛡️ Threat Analysis

Details

Domains

nlp

Model Types

llm

Threat Tags

black_box, inference_time

Datasets

ToolBench, BFCL

Applications

llm agents, tool-augmented ai systems, mcp-based agentic workflows, api-billed llm deployments

Similar Papers

Clawdrain: Exploiting Tool-Calling Chains for Stealthy Token Exhaustion in OpenClaw Agents

Overthinking Loops in Agents: A Structural Risk via MCP Tools

Rethinking Latency Denial-of-Service: Attacking the LLM Serving Framework, Not the Model

ToolFlood: Beyond Selection -- Hiding Valid Tools from LLM Agents via Semantic Covering

POT: Inducing Overthinking in LLMs via Black-Box Iterative Optimization

ThinkTrap: Denial-of-Service Attacks against Black-box LLM Services via Infinite Thinking

LeechHijack: Covert Computational Resource Exploitation in Intelligent Agent Systems

Attractive Metadata Attack: Inducing LLM Agents to Invoke Malicious Tools