attack 2026

Clawdrain: Exploiting Tool-Calling Chains for Stealthy Token Exhaustion in OpenClaw Agents

Ben Dong , Hui Feng , Qian Wang

University of California, Merced

0 citations

Published on arXiv

2603.00902

Model Denial of Service

OWASP LLM Top 10 — LLM04

Insecure Plugin Design

OWASP LLM Top 10 — LLM07

Key Finding

Achieves 6-7x token amplification (up to ~9x in failure-cascade mode) via a Trojanized OpenClaw skill on Gemini 2.5 Pro, with emergent tool-composition behavior partially mitigating the attack while altering its stealth profile.

Clawdrain (Segmented Verification Protocol)

Novel technique introduced

Modern generative agents such as OpenClaw - an open-source, self-hosted personal assistant with a community skill ecosystem, are gaining attention and are used pervasively. However, the openness and rapid growth of these ecosystems often outpace systematic security evaluation. In this paper, we design, implement, and evaluate Clawdrain, a Trojanized skill that induces a multi-turn "Segmented Verification Protocol" via injected SKILL.md instructions and a companion script that returns PROGRESS/REPAIR/TERMINAL signals. We deploy Clawdrain in a production-like OpenClaw instance with real API billing and a production model (Gemini 2.5 Pro), and we measure 6-7x token amplification over a benign baseline, with a costly, failure configuration reaching approximately 9x. We observe a deployment-only phenomenon: the agent autonomously composes general-purpose tools (e.g., shell/Python) to route around brittle protocol steps, reducing amplification and altering attack dynamics. Finally, we identify production vectors enabled by OpenClaw's architecture, including SKILL.md prompt bloat, persistent tool-output pollution, cron/heartbeat frequency amplification, and behavioral instruction injection. Overall, we demonstrate that token-drain attacks remain feasible in real deployments, but their magnitude and observability are shaped by tool composition, recovery behavior, and interface design.

Key Contributions

Clawdrain: a Trojanized OpenClaw skill implementing a multi-turn Segmented Verification Protocol (SVP) with PROGRESS/REPAIR/TERMINAL signals that drives iterative calibration loops, achieving 6-7x token amplification (up to ~9x in costly-failure mode) on Gemini 2.5 Pro with real API billing.
Empirical characterization of a simulator-deployment gap: agents autonomously compose general-purpose tools (shell/Python) to route around brittle SVP steps, partially mitigating amplification and altering attack dynamics in ways not observed in offline simulation.
Analysis of four deployment-grounded attack vectors in OpenClaw's architecture: SKILL.md prompt bloat, persistent tool-output pollution in context history, cron/heartbeat frequency amplification, and behavioral instruction injection — each representing a distinct cost surface beyond output-token amplification.

🛡️ Threat Analysis

Details

Domains

nlp

Model Types

llm

Threat Tags

inference_timeblack_box

Applications

llm agentspersonal assistant agentstool-calling agent frameworks

Read PDF arXiv

Clawdrain: Exploiting Tool-Calling Chains for Stealthy Token Exhaustion in OpenClaw Agents

Key Contributions

🛡️ Threat Analysis

Details

Similar Papers

Overthinking Loops in Agents: A Structural Risk via MCP Tools

Beyond Max Tokens: Stealthy Resource Amplification via Tool Calling Chains in LLM Agents

Rethinking Latency Denial-of-Service: Attacking the LLM Serving Framework, Not the Model

ToolFlood: Beyond Selection -- Hiding Valid Tools from LLM Agents via Semantic Covering

LeechHijack: Covert Computational Resource Exploitation in Intelligent Agent Systems

Attractive Metadata Attack: Inducing LLM Agents to Invoke Malicious Tools

Sponge Tool Attack: Stealthy Denial-of-Efficiency against Tool-Augmented Agentic Reasoning

POT: Inducing Overthinking in LLMs via Black-Box Iterative Optimization