survey 2026

Security Attack and Defense Strategies for Autonomous Agent Frameworks: A Layered Review with OpenClaw as a Case Study

Luyao Xu ¹,², Xiang Chen ¹,²

¹ Nantong University

² Nanjing University

0 citations

Published on arXiv

2604.27464

Prompt Injection

OWASP LLM Top 10 — LLM01

Insecure Plugin Design

OWASP LLM Top 10 — LLM07

Excessive Agency

OWASP LLM Top 10 — LLM08

Key Finding

Identifies that threats in agent frameworks may propagate across layers, from input manipulation to ecosystem-wide impact, revealing the need for integrated defenses

Autonomous agent frameworks built upon large language models (LLMs) are evolving into complex, tool-integrated, and continuously operating systems, introducing security risks beyond traditional prompt-level vulnerabilities. As this paradigm is still at an early stage of development, a timely and systematic understanding of its security implications is increasingly important. Although a growing body of work has examined different attack surfaces and defense problems in agent systems, existing studies remain scattered across individual aspects of agent security, and there is still a lack of a layered review on this topic. To address this gap, this survey presents a layered review of security risks and defense strategies in autonomous agent frameworks, with OpenClaw as a case study. We organize the analysis into four security-relevant layers: the context and instruction layer, the tool and action layer, the state and persistence layer, and the ecosystem and automation layer. For each layer, we summarize its functional role, representative security risks, and corresponding defense strategies. Based on this layered analysis, we further identify that threats in autonomous agent frameworks may propagate across layers, from manipulated inputs to unsafe actions, persistent state contamination, and broader ecosystem-level impact. Finally, we highlight potential key challenges, including research imbalance across layers, the lack of long-horizon evaluation, and weak ecosystem trust models, and outline future directions toward more systematic and integrated defenses.
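The four-layer organization and the cross-layer propagation described above can be sketched as a small data structure. This is only an illustrative sketch, not the authors' formalism: the names `Layer`, `IMPACT`, `propagation_path`, and `describe` are hypothetical, and the assumption that threats move strictly downstream in the order listed in the abstract is a simplification made for the example.

```python
from enum import IntEnum


class Layer(IntEnum):
    """The four security-relevant layers named in the abstract."""
    CONTEXT_INSTRUCTION = 1
    TOOL_ACTION = 2
    STATE_PERSISTENCE = 3
    ECOSYSTEM_AUTOMATION = 4


# Representative impact per layer, paraphrasing the abstract's propagation chain.
IMPACT = {
    Layer.CONTEXT_INSTRUCTION: "manipulated inputs",
    Layer.TOOL_ACTION: "unsafe actions",
    Layer.STATE_PERSISTENCE: "persistent state contamination",
    Layer.ECOSYSTEM_AUTOMATION: "ecosystem-level impact",
}


def propagation_path(entry: Layer) -> list[Layer]:
    """Layers a threat entering at `entry` could reach.

    Assumption (not from the paper): propagation is linear and only
    moves to later layers in the order above.
    """
    return [layer for layer in Layer if layer >= entry]


def describe(entry: Layer) -> str:
    """Render the propagation path as a readable chain of impacts."""
    return " -> ".join(IMPACT[layer] for layer in propagation_path(entry))
```

For example, `describe(Layer.CONTEXT_INSTRUCTION)` walks the full chain from manipulated inputs to ecosystem-level impact, while a threat entering at the state and persistence layer would, under this simplified model, only reach the ecosystem layer.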

Key Contributions

First layered security framework for autonomous agent systems, organizing threats into context/instruction, tool/action, state/persistence, and ecosystem/automation layers
Identifies cross-layer threat propagation from manipulated inputs to unsafe actions and persistent contamination
Highlights research gaps including layer imbalance, lack of long-horizon evaluation, and weak ecosystem trust models

🛡️ Threat Analysis

Details

Domains

nlp

Model Types

llm

Threat Tags

inference_time

Applications

autonomous agents, llm-based frameworks, tool-integrated ai systems

Similar Papers

SoK: Trust-Authorization Mismatch in LLM Agent Interactions

Human Society-Inspired Approaches to Agentic AI Security: The 4C Framework

From Thinker to Society: Security in Hierarchical Autonomy Evolution of AI Agents

A Survey on Agentic Security: Applications, Threats and Defenses

Taming OpenClaw: Security Analysis and Mitigation of Autonomous LLM Agent Threats

Systems Security Foundations for Agentic Computing

Agentic AI as a Cybersecurity Attack Surface: Threats, Exploits, and Defenses in Runtime Supply Chains

Securing the Model Context Protocol (MCP): Risks, Controls, and Governance