survey 2025

SoK: Trust-Authorization Mismatch in LLM Agent Interactions

Guanquan Shi 1, Haohua Du 1, Zhiqiang Wang 1, Xiaoyu Liang 2, Weiwenpei Liu 1, Song Bian 1, Zhenyu Guan 1


Published on arXiv: 2512.06914

Prompt Injection (OWASP LLM Top 10 — LLM01)

Insecure Plugin Design (OWASP LLM Top 10 — LLM07)

Excessive Agency (OWASP LLM Top 10 — LLM08)

Key Finding

Shows that diverse LLM agent threats — from prompt injection to tool poisoning — share a common structural root cause (static permissions decoupled from runtime trust), and proposes B-I-P as a unifying lens to identify gaps in current defenses

Novel technique introduced: Belief-Intention-Permission (B-I-P) framework


Large Language Models (LLMs) are evolving into autonomous agents capable of executing complex workflows via standardized protocols (e.g., MCP). However, this paradigm shifts control from deterministic code to probabilistic inference, creating a fundamental Trust-Authorization Mismatch: static permissions are structurally decoupled from the agent's fluctuating runtime trustworthiness. In this Systematization of Knowledge (SoK), we survey more than 200 representative papers to categorize the emerging landscape of agent security. We propose the Belief-Intention-Permission (B-I-P) framework as a unifying formal lens. By decomposing agent execution into three distinct stages (Belief Formation, Intent Generation, and Permission Grant), we demonstrate that diverse threats, from prompt injection to tool poisoning, share a common root cause: the desynchronization between dynamic trust states and static authorization boundaries. Using the B-I-P lens, we systematically map existing attacks and defenses and identify critical gaps where current mechanisms fail to bridge this mismatch. Finally, we outline a research agenda for shifting from static Role-Based Access Control (RBAC) to dynamic, risk-adaptive authorization.


Key Contributions

  • Proposes the Belief-Intention-Permission (B-I-P) framework to decompose LLM agent execution into three stages and unify analysis of diverse agent security threats
  • Identifies Trust-Authorization Mismatch — the structural decoupling of static permissions from fluctuating runtime trustworthiness — as the shared root cause across agent attack categories
  • Surveys 200+ papers to systematically map existing attacks and defenses, identify critical gaps, and outline a research agenda for replacing static RBAC with dynamic, risk-adaptive authorization
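To make the contrast between static RBAC and the risk-adaptive authorization the paper argues for concrete, here is a minimal illustrative sketch. All names (`AgentAction`, `TOOL_RISK`, the trust-score thresholds) are hypothetical and not from the paper; it only assumes the paper's core idea that the Permission Grant stage should consult a dynamic runtime trust signal rather than a fixed role-to-tool mapping.

```python
from dataclasses import dataclass

@dataclass
class AgentAction:
    tool: str
    trust_score: float  # hypothetical runtime trust in [0, 1], e.g. derived from input provenance

# Static RBAC: the Permission Grant depends only on the role's fixed tool list,
# regardless of what the agent currently believes or why it formed this intent.
STATIC_ROLE_TOOLS = {"assistant": {"search", "send_email"}}

def static_rbac_allow(role: str, action: AgentAction) -> bool:
    return action.tool in STATIC_ROLE_TOOLS.get(role, set())

# Risk-adaptive variant: riskier tools require higher runtime trust, so an intent
# formed from a low-trust belief (e.g. a prompt-injected web page) cannot reach them.
TOOL_RISK = {"search": 0.2, "send_email": 0.8}  # illustrative risk thresholds

def risk_adaptive_allow(role: str, action: AgentAction) -> bool:
    if not static_rbac_allow(role, action):
        return False  # never exceed the static boundary
    # Unknown tools default to an unmeetable threshold (deny by default).
    return action.trust_score >= TOOL_RISK.get(action.tool, 1.1)
```

Under this sketch, an email-sending action driven by low-trust input passes the static check but fails the risk-adaptive one: `static_rbac_allow("assistant", AgentAction("send_email", 0.3))` is `True`, while `risk_adaptive_allow` on the same action is `False`, capturing the Trust-Authorization Mismatch in miniature.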

🛡️ Threat Analysis


Details

Domains
nlp
Model Types
llm
Threat Tags
inference_time
Applications
llm agents, autonomous ai systems, mcp-based workflows