attack 2026

Overthinking Loops in Agents: A Structural Risk via MCP Tools

Yohan Lee 1, Jisoo Jang 2, Seoyeon Choi 2, Sangyeop Kim 3, Seungtaek Choi 2

0 citations · 36 references · arXiv (Cornell University)

α

Published on arXiv

2602.14798

Model Denial of Service

OWASP LLM Top 10 — LLM04

Insecure Plugin Design

OWASP LLM Top 10 — LLM07

Key Finding

Malicious MCP tools induce cyclic agent trajectories causing up to 142.4× token inflation across multiple tool-capable LLMs, with decoding-time concision controls unable to prevent loop induction.

Structural Overthinking Attack

Novel technique introduced


Tool-using LLM agents increasingly coordinate real workloads by selecting and chaining third-party tools based on text-visible metadata such as tool names, descriptions, and return messages. We show that this convenience creates a supply-chain attack surface: a malicious MCP tool server can be co-registered alongside normal tools and induce overthinking loops, where individually trivial or plausible tool calls compose into cyclic trajectories that inflate end-to-end tokens and latency without any single step looking abnormal. We formalize this as a structural overthinking attack, distinguishable from token-level verbosity, and implement 14 malicious tools across three servers that trigger repetition, forced refinement, and distraction. Across heterogeneous registries and multiple tool-capable models, the attack causes severe resource amplification (up to $142.4\times$ tokens) and can degrade task outcomes. Finally, we find that decoding-time concision controls do not reliably prevent loop induction, suggesting defenses should reason about tool-call structure rather than tokens alone.


Key Contributions

  • Formalizes 'structural overthinking attacks' — cyclic tool-call trajectories in LLM agents that inflate tokens/latency without any single step appearing anomalous
  • Implements 14 malicious MCP tools across 3 servers embodying three loop strategies (repetition, forced refinement, distraction), achieving up to 142.4× token amplification
  • Demonstrates that decoding-time concision controls (e.g., verbosity penalties) are insufficient defenses, motivating tool-call-structure-aware mitigations

🛡️ Threat Analysis


Details

Domains
nlp
Model Types
llm
Threat Tags
black_boxinference_time
Applications
llm tool-use agentsmcp-based agent frameworks