survey 2026

Security Attack and Defense Strategies for Autonomous Agent Frameworks: A Layered Review with OpenClaw as a Case Study

Luyao Xu 1,2, Xiang Chen 1,2

0 citations

α

Published on arXiv

2604.27464

Prompt Injection

OWASP LLM Top 10 — LLM01

Insecure Plugin Design

OWASP LLM Top 10 — LLM07

Excessive Agency

OWASP LLM Top 10 — LLM08

Key Finding

Identifies that threats in agent frameworks propagate across layers from input manipulation to ecosystem-wide impact, revealing need for integrated defenses


Autonomous agent frameworks built upon large language models (LLMs) are evolving into complex, tool-integrated, and continuously operating systems, introducing security risks beyond traditional prompt-level vulnerabilities. As this paradigm is still at an early stage of development, a timely and systematic understanding of its security implications is increasingly important. Although a growing body of work has examined different attack surfaces and defense problems in agent systems, existing studies remain scattered across individual aspects of agent security, and there is still a lack of a layered review on this topic. To address this gap, this survey presents a layered review of security risks and defense strategies in autonomous agent frameworks, with OpenClaw as a case study. We organize the analysis into four security-relevant layers: the context and instruction layer, the tool and action layer, the state and persistence layer, and the ecosystem and automation layer. For each layer, we summarize its functional role, representative security risks, and corresponding defense strategies. Based on this layered analysis, we further identify that threats in autonomous agent frameworks may propagate across layers, from manipulated inputs to unsafe actions, persistent state contamination, and broader ecosystem-level impact. Finally, we highlight potential key challenges, including research imbalance across layers, the lack of long-horizon evaluation, and weak ecosystem trust models, and outline future directions toward more systematic and integrated defenses.


Key Contributions

  • First layered security framework for autonomous agent systems organizing threats into context/instruction, tool/action, state/persistence, and ecosystem/automation layers
  • Identifies cross-layer threat propagation from manipulated inputs to unsafe actions and persistent contamination
  • Highlights research gaps including layer imbalance, lack of long-horizon evaluation, and weak ecosystem trust models

🛡️ Threat Analysis


Details

Domains
nlp
Model Types
llm
Threat Tags
inference_time
Applications
autonomous agentsllm-based frameworkstool-integrated ai systems