Execution Is the New Attack Surface: Survivability-Aware Agentic Crypto Trading with OpenClaw-Style Local Executors
Ailiya Borjigin 1, Igor Stadnyk 1, Ben Bilski 1, Serhii Hovorov 2, Sofiia Pidturkina 2
Published on arXiv
2603.10092
Insecure Plugin Design
OWASP LLM Top 10 — LLM07
Excessive Agency
OWASP LLM Top 10 — LLM08
Key Finding
SAE reduces maximum drawdown by 93.1% (0.4643 → 0.0319) and attack success from 1.00 to 0.728 with zero false blocks on a crypto trading replay, confirmed by block bootstrap and Wilcoxon tests.
SAE (Survivability-Aware Execution)
Novel technique introduced
OpenClaw-style agent stacks turn language into privileged execution: LLM intents flow through tool interception, policy gates, and a local executor. In parallel, skill marketplaces such as skills.sh make capability acquisition as easy as installing skills and CLIs, creating a growing capability supply chain. Together, these trends shift the dominant safety failure mode from "wrong answers" to execution-induced loss, where untrusted prompts, compromised skills, or narrative manipulation can trigger real trades and irreversible side effects. We propose Survivability-Aware Execution (SAE), an execution-layer survivability standard for OpenClaw-style systems and skill-enabled agents. SAE sits as middleware between a strategy engine (LLM or non-LLM) and the exchange executor. It defines an explicit execution contract (ExecutionRequest, ExecutionContext, ExecutionDecision) and enforces non-bypassable last-mile invariants: projection-based exposure budgets, cooldown and order-rate limits, slippage bounds, staged execution, and tool/venue allowlists. To make delegated execution testable under supply-chain risk, we operationalize the Delegation Gap (DG) via a logged Intended Policy Spec that enables deterministic out-of-scope labeling and reproducible DG metrics. On an offline replay using official Binance USD-M BTCUSDT/ETHUSDT perpetual data (15-minute bars; 2025-09-01 to 2025-12-01, including funding), SAE improves survivability: maximum drawdown (MDD) drops from 0.4643 to 0.0319 (Full; 93.1%), |CVaR_0.99| shrinks from 4.025e-3 to ~1.02e-4 (~97.5%), and the DG loss proxy falls from 0.647 to 0.019 (~97.0%). AttackSuccess decreases from 1.00 to 0.728 with zero FalseBlock in this run. Block bootstrap, paired Wilcoxon, and two-proportion tests confirm the shifts. SAE reframes agentic trading safety for the OpenClaw+skills era: treat upstream intent and skills as untrusted, and enforce survivability where actions become side effects.
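The execution contract and last-mile invariants described above can be pictured with a minimal Python sketch. Only the three type names (ExecutionRequest, ExecutionContext, ExecutionDecision) come from the abstract; every field name, the `Verdict` enum, the `evaluate` function, and the check order are assumptions made for illustration and are not the authors' implementation. Order-rate limits and staged execution are left out to keep the example short.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):  # hypothetical; the source does not name decision values
    ALLOW = "allow"
    BLOCK = "block"


@dataclass(frozen=True)
class ExecutionRequest:
    # All fields are assumed for illustration.
    tool: str           # skill or tool that produced the intent
    venue: str          # target exchange
    symbol: str
    side: str           # "buy" or "sell"
    quantity: float     # base-asset units, must be positive
    limit_price: float  # price requested by the upstream strategy


@dataclass(frozen=True)
class ExecutionContext:
    # All fields are assumed for illustration.
    now: float               # current time in seconds
    last_order_time: float   # time of the previous accepted order
    position: float          # signed current position, base-asset units
    mark_price: float        # current reference price
    max_abs_exposure: float  # exposure budget as quote-currency notional
    cooldown_s: float        # minimum seconds between orders
    max_slippage: float      # max relative deviation from mark_price
    allowed_tools: frozenset
    allowed_venues: frozenset


@dataclass(frozen=True)
class ExecutionDecision:
    verdict: Verdict
    reason: str


def evaluate(req: ExecutionRequest, ctx: ExecutionContext) -> ExecutionDecision:
    """Treat the request as untrusted and block on the first violated invariant."""
    if req.side not in ("buy", "sell") or req.quantity <= 0:
        return ExecutionDecision(Verdict.BLOCK, "malformed")
    if req.tool not in ctx.allowed_tools or req.venue not in ctx.allowed_venues:
        return ExecutionDecision(Verdict.BLOCK, "allowlist")
    if ctx.now - ctx.last_order_time < ctx.cooldown_s:
        return ExecutionDecision(Verdict.BLOCK, "cooldown")
    if abs(req.limit_price - ctx.mark_price) / ctx.mark_price > ctx.max_slippage:
        return ExecutionDecision(Verdict.BLOCK, "slippage")
    # Projection-based budget: check the exposure the position would have
    # after the trade, not the exposure it has now.
    signed_qty = req.quantity if req.side == "buy" else -req.quantity
    projected = abs(ctx.position + signed_qty) * ctx.mark_price
    if projected > ctx.max_abs_exposure:
        return ExecutionDecision(Verdict.BLOCK, "exposure")
    return ExecutionDecision(Verdict.ALLOW, "ok")
```

In this sketch the gate never consults the upstream strategy's reasoning; it only compares the request with state held by the executor, which is one way to read "non-bypassable" in the text above.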
Key Contributions
- SAE middleware with an explicit execution contract (ExecutionRequest/ExecutionContext/ExecutionDecision) enforcing non-bypassable last-mile invariants (exposure budgets, cooldown/order-rate limits, slippage bounds, staged execution, and tool/venue allowlists) against untrusted LLM intents and compromised skills
- Delegation Gap (DG) metric operationalized via a logged Intended Policy Spec, enabling deterministic out-of-scope labeling and reproducible auditing of agent execution against intended policy
- Empirical evaluation on Binance USD-M perpetual data showing a 93.1% MDD reduction, a roughly 97.5% reduction in |CVaR_0.99| tail loss, and a drop in AttackSuccess from 1.00 to 0.728 with zero false blocks, relative to a no-SAE baseline
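The measurements named in the contributions above can be illustrated with a short Python sketch. It uses textbook definitions of maximum drawdown and empirical CVaR, plus a simple rule-based out-of-scope label. The `IntendedPolicySpec` fields, the shape of the logged actions, and all function names are hypothetical, and the paper's exact estimators and DG loss proxy may be defined differently.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class IntendedPolicySpec:
    # Hypothetical fields; the source does not list the spec's contents.
    symbols: frozenset
    max_order_qty: float


def is_out_of_scope(action: dict, spec: IntendedPolicySpec) -> bool:
    """Deterministic label: True if a logged action falls outside the spec."""
    return action["symbol"] not in spec.symbols or action["qty"] > spec.max_order_qty


def out_of_scope_rate(log: list, spec: IntendedPolicySpec) -> float:
    """Share of logged actions labeled out of scope (an illustrative metric)."""
    if not log:
        return 0.0
    return sum(is_out_of_scope(a, spec) for a in log) / len(log)


def max_drawdown(equity: list) -> float:
    """Largest peak-to-trough decline, as a fraction of the running peak."""
    if not equity:
        return 0.0
    peak, mdd = equity[0], 0.0
    for value in equity:
        peak = max(peak, value)
        if peak > 0:
            mdd = max(mdd, (peak - value) / peak)
    return mdd


def cvar(returns: list, alpha: float = 0.99) -> float:
    """Mean of the worst (1 - alpha) share of returns; negative for losses."""
    ordered = sorted(returns)
    # round() guards against float noise such as 100 * (1 - 0.99) > 1.
    k = max(1, math.ceil(round(len(ordered) * (1 - alpha), 9)))
    tail = ordered[:k]
    return sum(tail) / len(tail)


def relative_reduction(before: float, after: float) -> float:
    """Fractional improvement of a loss-type metric."""
    return (before - after) / before
```

With the figures reported in the abstract, `relative_reduction(0.4643, 0.0319)` is about 0.931 and `relative_reduction(4.025e-3, 1.02e-4)` is about 0.975, which agrees with the stated 93.1% and ~97.5%.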