defense 2026

Execution Is the New Attack Surface: Survivability-Aware Agentic Crypto Trading with OpenClaw-Style Local Executors

Ailiya Borjigin 1, Igor Stadnyk 1, Ben Bilski 1, Serhii Hovorov 2, Sofiia Pidturkina 2



Published on arXiv: 2603.10092

Insecure Plugin Design

OWASP LLM Top 10 — LLM07

Excessive Agency

OWASP LLM Top 10 — LLM08

Key Finding

SAE reduces maximum drawdown by 93.1% (0.4643 → 0.0319) and attack success from 1.00 to 0.728 with zero false blocks on a crypto trading replay, confirmed by block bootstrap and Wilcoxon tests.
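
The 93.1% figure can be checked from the two reported drawdown values. The sketch below assumes the conventional definition of maximum drawdown (largest peak-to-trough decline as a fraction of the running peak); this summary does not state the paper's exact definition, and the function names and the toy equity curve are illustrative only.

```python
def max_drawdown(equity):
    """Largest peak-to-trough decline, as a fraction of the running peak.

    Assumes the conventional MDD definition; the paper's exact variant
    is not given in this summary.
    """
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst


def relative_reduction(baseline, treated):
    """Fractional reduction of `treated` relative to `baseline`."""
    return (baseline - treated) / baseline


# Reported values: MDD 0.4643 (baseline) vs 0.0319 (SAE, Full).
print(round(relative_reduction(0.4643, 0.0319) * 100, 1))  # 93.1

# Toy equity curve (illustrative, not from the paper): peak 120, trough 90.
print(max_drawdown([100.0, 120.0, 90.0, 110.0]))  # 0.25
```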

SAE (Survivability-Aware Execution)

Novel technique introduced


OpenClaw-style agent stacks turn language into privileged execution: LLM intents flow through tool interception, policy gates, and a local executor. In parallel, skill marketplaces such as skills.sh make capability acquisition as easy as installing skills and CLIs, creating a growing capability supply chain. Together, these trends shift the dominant safety failure mode from "wrong answers" to execution-induced loss, where untrusted prompts, compromised skills, or narrative manipulation can trigger real trades and irreversible side effects. We propose Survivability-Aware Execution (SAE), an execution-layer survivability standard for OpenClaw-style systems and skill-enabled agents. SAE sits as middleware between a strategy engine (LLM or non-LLM) and the exchange executor. It defines an explicit execution contract (ExecutionRequest, ExecutionContext, ExecutionDecision) and enforces non-bypassable last-mile invariants: projection-based exposure budgets, cooldown and order-rate limits, slippage bounds, staged execution, and tool/venue allowlists. To make delegated execution testable under supply-chain risk, we operationalize the Delegation Gap (DG) via a logged Intended Policy Spec that enables deterministic out-of-scope labeling and reproducible DG metrics. On an offline replay using official Binance USD-M BTCUSDT/ETHUSDT perpetual data (15m; 2025-09-01 to 2025-12-01, including funding), SAE improves survivability: MDD drops from 0.4643 to 0.0319 (Full; 93.1%), |CVaR_0.99| shrinks from 4.025e-3 to ~1.02e-4 (~97.5%), and DG loss proxy falls from 0.647 to 0.019 (~97.0%). AttackSuccess decreases from 1.00 to 0.728 with zero FalseBlock in this run. Block bootstrap, paired Wilcoxon, and two-proportion tests confirm the shifts. SAE reframes agentic trading safety for the OpenClaw+skills era: treat upstream intent and skills as untrusted, and enforce survivability where actions become side effects.
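
To make the execution contract concrete, here is a minimal sketch of what a last-mile gate of this kind might look like. Only the three type names (ExecutionRequest, ExecutionContext, ExecutionDecision) come from the abstract; every field, the `decide` function, and the reason strings are assumptions made for illustration. "Projection-based" is interpreted here as clamping the post-trade position into the exposure budget, which may differ from the paper's formulation. Order-rate limits and staged execution are omitted for brevity.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class ExecutionRequest:          # fields are hypothetical
    venue: str
    symbol: str
    signed_qty: float            # positive = buy, negative = sell
    expected_slippage_bps: float
    timestamp: float             # seconds


@dataclass(frozen=True)
class ExecutionContext:          # fields are hypothetical
    position: float              # current signed position
    max_abs_exposure: float      # exposure budget
    last_order_ts: float
    cooldown_s: float
    max_slippage_bps: float
    allowed_venues: FrozenSet[str]


@dataclass(frozen=True)
class ExecutionDecision:         # fields are hypothetical
    allowed: bool
    approved_qty: float
    reason: str


def decide(req: ExecutionRequest, ctx: ExecutionContext) -> ExecutionDecision:
    """Check each invariant in turn; upstream intent is treated as untrusted."""
    if req.venue not in ctx.allowed_venues:
        return ExecutionDecision(False, 0.0, "venue_not_allowlisted")
    if req.timestamp - ctx.last_order_ts < ctx.cooldown_s:
        return ExecutionDecision(False, 0.0, "cooldown_active")
    if req.expected_slippage_bps > ctx.max_slippage_bps:
        return ExecutionDecision(False, 0.0, "slippage_bound_exceeded")

    # Assumed projection step: clamp the post-trade position into the budget.
    target = ctx.position + req.signed_qty
    projected = max(-ctx.max_abs_exposure, min(ctx.max_abs_exposure, target))
    approved = projected - ctx.position

    # Never approve a trade in the opposite direction of the request.
    if approved * req.signed_qty <= 0:
        return ExecutionDecision(False, 0.0, "exposure_budget_exhausted")
    if approved != req.signed_qty:
        return ExecutionDecision(True, approved, "clamped_to_exposure_budget")
    return ExecutionDecision(True, approved, "ok")
```

In this sketch the gate returns a decision rather than placing the order itself, so the executor can be written to act only on an `ExecutionDecision`, which is one way to keep the checks hard to bypass.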


Key Contributions

  • SAE middleware with an explicit execution contract (ExecutionRequest/ExecutionContext/ExecutionDecision) enforcing non-bypassable last-mile invariants — exposure budgets, cooldown/order-rate limits, slippage bounds, and tool/venue allowlists — against untrusted LLM intents and compromised skills
  • Delegation Gap (DG) metric operationalized via a logged Intended Policy Spec, enabling deterministic out-of-scope labeling and reproducible auditing of agent execution against intended policy
  • Empirical evaluation on Binance USD-M perpetual data showing 93.1% MDD reduction, 97.5% CVaR tail-loss reduction, and AttackSuccess drop from 1.00 to 0.728 with zero false blocks versus a no-SAE baseline
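
The deterministic out-of-scope labeling behind the Delegation Gap can be illustrated with a small sketch. The summary does not describe the contents of the Intended Policy Spec or how the DG loss proxy is computed, so the spec fields, the `LoggedAction` record, and the simple out-of-scope rate below are all hypothetical stand-ins rather than the paper's definitions.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class IntendedPolicySpec:        # hypothetical spec fields
    allowed_tools: FrozenSet[str]
    allowed_symbols: FrozenSet[str]
    max_notional: float


@dataclass(frozen=True)
class LoggedAction:              # hypothetical log record
    tool: str
    symbol: str
    notional: float


def is_out_of_scope(action: LoggedAction, spec: IntendedPolicySpec) -> bool:
    """Pure function of (action, spec): the same log always gets the same labels."""
    return (
        action.tool not in spec.allowed_tools
        or action.symbol not in spec.allowed_symbols
        or abs(action.notional) > spec.max_notional
    )


def out_of_scope_rate(log: List[LoggedAction], spec: IntendedPolicySpec) -> float:
    """Fraction of logged actions outside the spec (not the paper's DG loss proxy)."""
    if not log:
        return 0.0
    return sum(is_out_of_scope(a, spec) for a in log) / len(log)


spec = IntendedPolicySpec(
    allowed_tools=frozenset({"place_order"}),
    allowed_symbols=frozenset({"BTCUSDT", "ETHUSDT"}),
    max_notional=1_000.0,
)
log = [
    LoggedAction("place_order", "BTCUSDT", 500.0),    # in scope
    LoggedAction("place_order", "DOGEUSDT", 100.0),   # symbol not in spec
    LoggedAction("withdraw", "BTCUSDT", 100.0),       # tool not in spec
    LoggedAction("place_order", "ETHUSDT", -800.0),   # in scope (short)
]
print(out_of_scope_rate(log, spec))  # 0.5
```

Because labeling depends only on the logged spec and the logged action, two auditors replaying the same log reach the same labels, which is the reproducibility property the contribution emphasizes.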



Details

Domains
nlp, reinforcement-learning
Model Types
llm
Threat Tags
inference_time, black_box
Datasets
Binance USD-M BTCUSDT/ETHUSDT perpetual futures (15m, 2025-09-01 to 2025-12-01)
Applications
algorithmic trading, crypto perpetual trading, llm agent systems