defense 2026

Beyond Input Guardrails: Reconstructing Cross-Agent Semantic Flows for Execution-Aware Attack Detection

Yangyang Wei ¹, Yijie Xu ¹, Zhenyuan Li ¹, Xiangmin Shen ², Shouling Ji ¹

¹ Zhejiang University

² Hofstra University

0 citations

Published on arXiv

2603.04469

Prompt Injection

OWASP LLM Top 10 — LLM01

Excessive Agency

OWASP LLM Top 10 — LLM08

Key Finding

MAScope achieves F1-scores of 85.3% and 66.7% for node-level and path-level detection of compound attack vectors, including indirect prompt injection, across multi-agent LLM systems.

MAScope

Novel technique introduced

Multi-Agent Systems (MAS) are emerging as the de facto standard for complex task orchestration. However, their reliance on autonomous execution and unstructured inter-agent communication introduces severe risks, such as indirect prompt injection, that easily circumvent conventional input guardrails. To address this, we propose MAScope, a framework that shifts the defensive paradigm from static input filtering to execution-aware analysis. By extracting and reconstructing Cross-Agent Semantic Flows, MAScope synthesizes fragmented operational primitives into contiguous behavioral trajectories, enabling a holistic view of system activity. We leverage a Supervisor LLM to scrutinize these trajectories, identifying anomalies across data flow violations, control flow deviations, and intent inconsistencies. Empirical evaluations demonstrate that MAScope detects over ten distinct compound attack vectors, achieving F1-scores of 85.3% and 66.7% for node-level and path-level end-to-end attack detection, respectively. The source code is available at https://anonymous.4open.science/r/MAScope-71DC.

Key Contributions

Cross-Agent Semantic Flow reconstruction that synthesizes fragmented inter-agent operational primitives into contiguous behavioral trajectories for holistic MAS monitoring
Supervisor LLM-based anomaly detection across data flow violations, control flow deviations, and intent inconsistencies in multi-agent pipelines
Empirical evaluation detecting over 10 distinct compound attack vectors with F1-scores of 85.3% (node-level) and 66.7% (path-level) for end-to-end attack detection
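The flow-reconstruction idea in the contributions above can be illustrated with a minimal sketch. It is not MAScope's implementation: the `Primitive` record, its field names, the action names, and the rule in `is_suspicious` are all hypothetical, and the rule is only a simple stand-in for the Supervisor LLM that the paper uses to judge trajectories. The sketch assumes each logged operation records which artifacts it reads and writes, links operations through shared artifacts, and enumerates the resulting source-to-sink paths.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Primitive:
    """One logged operation. All field names are hypothetical."""
    pid: str
    agent: str
    action: str
    reads: tuple = ()
    writes: tuple = ()


def build_flow_graph(log):
    """Link p -> q when q reads an artifact that p wrote earlier in the log."""
    last_writer = {}
    edges = {p.pid: [] for p in log}
    for p in log:
        for artifact in p.reads:
            src = last_writer.get(artifact)
            if src is not None and p.pid not in edges[src]:
                edges[src].append(p.pid)
        for artifact in p.writes:
            last_writer[artifact] = p.pid
    return edges


def trajectories(edges):
    """Enumerate source-to-sink paths. Edges follow log order, so no cycles."""
    children = {c for targets in edges.values() for c in targets}
    roots = [n for n in edges if n not in children]
    paths = []

    def walk(node, prefix):
        path = prefix + [node]
        if not edges[node]:
            paths.append(path)
            return
        for nxt in edges[node]:
            walk(nxt, path)

    for root in roots:
        walk(root, [])
    return paths


# Hypothetical action names, used only for this illustration.
UNTRUSTED_ACTIONS = {"fetch_web"}
SENSITIVE_ACTIONS = {"send_email"}


def is_suspicious(path, by_id):
    """Stand-in rule (not the paper's Supervisor LLM): flag a path where
    untrusted input flows into a sensitive action."""
    tainted = False
    for pid in path:
        action = by_id[pid].action
        if action in UNTRUSTED_ACTIONS:
            tainted = True
        elif action in SENSITIVE_ACTIONS and tainted:
            return True
    return False


def flagged_paths(log):
    """Return every reconstructed trajectory that the stand-in rule flags."""
    by_id = {p.pid: p for p in log}
    return [t for t in trajectories(build_flow_graph(log)) if is_suspicious(t, by_id)]


EXAMPLE_LOG = [
    Primitive("p1", "browser", "fetch_web", writes=("page",)),
    Primitive("p2", "summarizer", "summarize", reads=("page",), writes=("summary",)),
    Primitive("p3", "mailer", "send_email", reads=("summary",)),
    Primitive("p4", "planner", "plan", writes=("plan",)),
    Primitive("p5", "mailer", "send_email", reads=("plan",)),
]
```

On `EXAMPLE_LOG`, the sketch reconstructs two trajectories, `p1 -> p2 -> p3` and `p4 -> p5`, and flags only the first, because web content fetched by one agent reaches an email action in another. An input filter on the mailer alone would see only a summary, which is why the check runs over the whole path.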

🛡️ Threat Analysis

Details

Domains

nlp

Model Types

llm

Threat Tags

inference_time, black_box

Applications

multi-agent systems, llm orchestration, agentic ai pipelines

Similar Papers

Secure and Efficient Access Control for Computer-Use Agents via Context Space

Taming Various Privilege Escalation in LLM-Based Agent Systems: A Mandatory Access Control Framework

Agent Privilege Separation in OpenClaw: A Structural Defense Against Prompt Injection

ceLLMate: Sandboxing Browser AI Agents

AgentArmor: Enforcing Program Analysis on Agent Runtime Trace to Defend Against Prompt Injection

BrowseSafe: Understanding and Preventing Prompt Injection Within AI Browser Agents

AgenTRIM: Tool Risk Mitigation for Agentic AI

AgentWatcher: A Rule-based Prompt Injection Monitor