attack 2026

Clawdrain: Exploiting Tool-Calling Chains for Stealthy Token Exhaustion in OpenClaw Agents

Ben Dong , Hui Feng , Qian Wang

0 citations

α

Published on arXiv

2603.00902

Model Denial of Service

OWASP LLM Top 10 — LLM04

Insecure Plugin Design

OWASP LLM Top 10 — LLM07

Key Finding

Achieves 6-7x token amplification (up to ~9x in failure-cascade mode) via a Trojanized OpenClaw skill on Gemini 2.5 Pro, with emergent tool-composition behavior partially mitigating the attack while altering its stealth profile.

Clawdrain (Segmented Verification Protocol)

Novel technique introduced


Modern generative agents such as OpenClaw - an open-source, self-hosted personal assistant with a community skill ecosystem, are gaining attention and are used pervasively. However, the openness and rapid growth of these ecosystems often outpace systematic security evaluation. In this paper, we design, implement, and evaluate Clawdrain, a Trojanized skill that induces a multi-turn "Segmented Verification Protocol" via injected SKILL.md instructions and a companion script that returns PROGRESS/REPAIR/TERMINAL signals. We deploy Clawdrain in a production-like OpenClaw instance with real API billing and a production model (Gemini 2.5 Pro), and we measure 6-7x token amplification over a benign baseline, with a costly, failure configuration reaching approximately 9x. We observe a deployment-only phenomenon: the agent autonomously composes general-purpose tools (e.g., shell/Python) to route around brittle protocol steps, reducing amplification and altering attack dynamics. Finally, we identify production vectors enabled by OpenClaw's architecture, including SKILL.md prompt bloat, persistent tool-output pollution, cron/heartbeat frequency amplification, and behavioral instruction injection. Overall, we demonstrate that token-drain attacks remain feasible in real deployments, but their magnitude and observability are shaped by tool composition, recovery behavior, and interface design.


Key Contributions

  • Clawdrain: a Trojanized OpenClaw skill implementing a multi-turn Segmented Verification Protocol (SVP) with PROGRESS/REPAIR/TERMINAL signals that drives iterative calibration loops, achieving 6-7x token amplification (up to ~9x in costly-failure mode) on Gemini 2.5 Pro with real API billing.
  • Empirical characterization of a simulator-deployment gap: agents autonomously compose general-purpose tools (shell/Python) to route around brittle SVP steps, partially mitigating amplification and altering attack dynamics in ways not observed in offline simulation.
  • Analysis of four deployment-grounded attack vectors in OpenClaw's architecture: SKILL.md prompt bloat, persistent tool-output pollution in context history, cron/heartbeat frequency amplification, and behavioral instruction injection — each representing a distinct cost surface beyond output-token amplification.

🛡️ Threat Analysis


Details

Domains
nlp
Model Types
llm
Threat Tags
inference_timeblack_box
Applications
llm agentspersonal assistant agentstool-calling agent frameworks