From Thinker to Society: Security in Hierarchical Autonomy Evolution of AI Agents
Xiaolei Zhang 1, Lu Zhou 1,2, Xiaogang Xu 3, Jiafei Wu 4, Tianyu Du 5, Heqing Huang 6, Hao Peng 7, Zhe Liu 5,1
1 Nanjing University of Aeronautics and Astronautics
2 Collaborative Innovation Center of Novel Software Technology and Industrialization
3 The Chinese University of Hong Kong
6 Huawei
Published on arXiv
2603.07496
Prompt Injection
OWASP LLM Top 10 — LLM01
Insecure Plugin Design
OWASP LLM Top 10 — LLM07
Excessive Agency
OWASP LLM Top 10 — LLM08
Key Finding
Identifies three distinct tiers of LLM agent security threats and shows that existing frameworks fail to address cross-tier systemic vulnerabilities in multi-agent ecosystems
Novel technique introduced
Hierarchical Autonomy Evolution (HAE) framework
Artificial Intelligence (AI) agents have evolved from passive predictive tools into active entities capable of autonomous decision-making and environmental interaction, driven by the reasoning capabilities of Large Language Models (LLMs). This evolution has introduced critical security vulnerabilities that existing frameworks fail to address. The Hierarchical Autonomy Evolution (HAE) framework organizes agent security into three tiers: Cognitive Autonomy (L1) concerns the integrity of internal reasoning; Execution Autonomy (L2) covers tool-mediated interaction with the environment; Collective Autonomy (L3) addresses systemic risks in multi-agent ecosystems. We present a taxonomy of threats spanning cognitive manipulation, physical environment disruption, and multi-agent systemic failures, evaluate existing defenses, and identify key research gaps. The findings aim to guide the development of multilayered, autonomy-aware defense architectures for trustworthy AI agent systems.
Key Contributions
- Hierarchical Autonomy Evolution (HAE) framework organizing AI agent security into three tiers: Cognitive (L1), Execution (L2), and Collective (L3) autonomy
- Taxonomy of threats spanning cognitive manipulation, physical environment disruption via tools, and multi-agent systemic failures
- Evaluation of existing defenses at each tier, with identification of key research gaps to guide multilayered, autonomy-aware defense architectures
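The three-tier structure described above can be written down as a small lookup table. The following is a minimal illustrative sketch, not code from the paper: the names `AutonomyTier`, `TierProfile`, `HAE_TIERS`, and `tiers_for` are hypothetical, and pairing each threat family with a single tier is an assumption based on the parallel ordering in the abstract.

```python
from dataclasses import dataclass
from enum import Enum


class AutonomyTier(Enum):
    """The three HAE tiers, labeled as in the abstract."""
    COGNITIVE = "L1"
    EXECUTION = "L2"
    COLLECTIVE = "L3"


@dataclass(frozen=True)
class TierProfile:
    focus: str         # what the tier is concerned with, per the abstract
    threat_class: str  # threat family assumed to belong to this tier


# Assumption: one threat family per tier, following the order used in the abstract.
HAE_TIERS = {
    AutonomyTier.COGNITIVE: TierProfile(
        focus="internal reasoning integrity",
        threat_class="cognitive manipulation",
    ),
    AutonomyTier.EXECUTION: TierProfile(
        focus="tool-mediated environmental interaction",
        threat_class="physical environment disruption",
    ),
    AutonomyTier.COLLECTIVE: TierProfile(
        focus="systemic risks in multi-agent ecosystems",
        threat_class="multi-agent systemic failures",
    ),
}


def tiers_for(threat_classes):
    """Return the tiers whose threat class is among the given labels, in L1 to L3 order."""
    wanted = set(threat_classes)
    return [tier for tier, profile in HAE_TIERS.items() if profile.threat_class in wanted]
```

For example, `tiers_for(["cognitive manipulation", "multi-agent systemic failures"])` returns the Cognitive and Collective tiers. A real threat may span several tiers, which is the cross-tier concern the key finding raises; a one-to-one table like this is only a starting point for organizing defenses.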