defense 2026

Beyond Input Guardrails: Reconstructing Cross-Agent Semantic Flows for Execution-Aware Attack Detection

Yangyang Wei 1, Yijie Xu 1, Zhenyuan Li 1,1, Xiangmin Shen 2, Shouling Ji 1


Published on arXiv

2603.04469

Prompt Injection

OWASP LLM Top 10 — LLM01

Excessive Agency

OWASP LLM Top 10 — LLM08

Key Finding

MAScope achieves F1-scores of 85.3% and 66.7% for node-level and path-level detection of compound attack vectors, including indirect prompt injection, across multi-agent LLM systems.

MAScope

Novel technique introduced


Multi-Agent Systems (MAS) are emerging as the de facto standard for complex task orchestration. However, their reliance on autonomous execution and unstructured inter-agent communication introduces severe risks, such as indirect prompt injection, that easily circumvent conventional input guardrails. To address this, we propose MAScope, a framework that shifts the defensive paradigm from static input filtering to execution-aware analysis. By extracting and reconstructing Cross-Agent Semantic Flows, MAScope synthesizes fragmented operational primitives into contiguous behavioral trajectories, enabling a holistic view of system activity. We leverage a Supervisor LLM to scrutinize these trajectories, identifying anomalies across data flow violations, control flow deviations, and intent inconsistencies. Empirical evaluations demonstrate that MAScope effectively detects over ten distinct compound attack vectors, achieving F1-scores of 85.3% and 66.7% for node-level and path-level end-to-end attack detection, respectively. The source code is available at https://anonymous.4open.science/r/MAScope-71DC.
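To make the idea of stitching fragmented primitives into contiguous trajectories concrete, here is a minimal Python sketch. It is not the paper's implementation: the `Event` schema, the `consumes`/`produces` fields, and the rule "link two events when one reads a data item the other wrote" are all assumptions chosen for illustration, and the agent and action names are made up.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One logged operational primitive (hypothetical schema, not from the paper)."""
    event_id: str
    agent: str
    action: str
    consumes: tuple = ()  # ids of data items this event read
    produces: tuple = ()  # ids of data items this event wrote


def build_flow_graph(events):
    """Add an edge A -> B whenever B consumes a data item that A produced."""
    producer_of = {}
    for ev in events:
        for item in ev.produces:
            producer_of[item] = ev.event_id  # simplification: last producer wins
    edges = defaultdict(list)
    for ev in events:
        for item in ev.consumes:
            src = producer_of.get(item)
            if src is not None and src != ev.event_id and ev.event_id not in edges[src]:
                edges[src].append(ev.event_id)
    return edges


def trajectories(events):
    """Enumerate root-to-leaf paths through the flow graph."""
    edges = build_flow_graph(events)
    has_parent = {dst for dsts in edges.values() for dst in dsts}
    roots = [ev.event_id for ev in events if ev.event_id not in has_parent]
    paths = []

    def walk(node, path):
        path = path + [node]
        extended = False
        for child in edges.get(node, []):
            if child not in path:  # guard against cycles
                extended = True
                walk(child, path)
        if not extended:
            paths.append(path)

    for root in roots:
        walk(root, [])
    return paths


# Illustrative events only; agent and action names are invented.
events = [
    Event("e1", "web_agent", "fetch_page", produces=("doc1",)),
    Event("e2", "planner", "summarize", consumes=("doc1",), produces=("plan1",)),
    Event("e3", "executor", "run_shell", consumes=("plan1",)),
    Event("e4", "planner", "write_log", consumes=("doc1",)),
]
paths = trajectories(events)
print(paths)
```

In this toy log, untrusted web content fetched by one agent reaches a shell command run by another through the path e1, e2, e3. A supervisor that reviews whole paths can see that chain, whereas a filter that inspects each agent's input in isolation cannot.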


Key Contributions

  • Cross-Agent Semantic Flow reconstruction that synthesizes fragmented inter-agent operational primitives into contiguous behavioral trajectories for holistic MAS monitoring
  • Supervisor LLM-based anomaly detection across data flow violations, control flow deviations, and intent inconsistencies in multi-agent pipelines
  • Empirical evaluation detecting over 10 distinct compound attack vectors with F1-scores of 85.3% (node-level) and 66.7% (path-level) for end-to-end attack detection
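The node-level and path-level figures are F1-scores, i.e. the harmonic mean of precision and recall. The sketch below shows how such a score can be computed from a set of flagged items and a set of truly malicious items. The function name `f1_score` and the toy sets are assumptions for illustration, and the paper's exact matching protocol for nodes and paths is not described in this summary.

```python
def f1_score(flagged, malicious):
    """Return (precision, recall, F1) for a set of flagged items vs. ground truth."""
    flagged, malicious = set(flagged), set(malicious)
    tp = len(flagged & malicious)   # correctly flagged
    fp = len(flagged - malicious)   # flagged but benign
    fn = len(malicious - flagged)   # malicious but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Toy data, not results from the paper.
# Node level: each item is a single event id.
node_p, node_r, node_f1 = f1_score({"e2", "e3", "e5"}, {"e2", "e3", "e4"})

# Path level: each item is a whole trajectory, so one wrong node in a
# flagged path makes the entire path count as a miss under exact matching.
path_p, path_r, path_f1 = f1_score(
    {("e1", "e2", "e3")},
    {("e1", "e2", "e3"), ("e1", "e4")},
)
```

Under exact matching, a path counts as correct only if the whole trajectory is correct, which makes path-level scoring stricter than node-level scoring. That is one plausible reading of why the reported path-level score is lower, not a claim taken from the paper.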


Details

Domains
nlp
Model Types
llm
Threat Tags
inference_time, black_box
Applications
multi-agent systems, llm orchestration, agentic ai pipelines