defense 2025

Monitoring LLM-based Multi-Agent Systems Against Corruptions via Node Evaluation

Chengcan Wu , Zhixin Zhang , Mingqian Xu , Zeming Wei , Meng Sun

2 citations · 39 references · arXiv

α

Published on arXiv

2510.19420

Prompt Injection

OWASP LLM Top 10 — LLM01

Excessive Agency

OWASP LLM Top 10 — LLM08

Key Finding

Significantly outperforms existing static MAS defense mechanisms across increasingly complex and dynamic multi-agent environments.

Node Evaluation-based Monitoring

Novel technique introduced


Large Language Model (LLM)-based Multi-Agent Systems (MAS) have become a popular paradigm of AI applications. However, trustworthiness issues in MAS remain a critical concern. Unlike challenges in single-agent systems, MAS involve more complex communication processes, making them susceptible to corruption attacks. To mitigate this issue, several defense mechanisms have been developed based on the graph representation of MAS, where agents represent nodes and communications form edges. Nevertheless, these methods predominantly focus on static graph defense, attempting to either detect attacks in a fixed graph structure or optimize a static topology with certain defensive capabilities. To address this limitation, we propose a dynamic defense paradigm for MAS graph structures, which continuously monitors communication within the MAS graph, then dynamically adjusts the graph topology, accurately disrupts malicious communications, and effectively defends against evolving and diverse dynamic attacks. Experimental results in increasingly complex and dynamic MAS environments demonstrate that our method significantly outperforms existing MAS defense mechanisms, contributing an effective guardrail for their trustworthy applications. Our code is available at https://github.com/ChengcanWu/Monitoring-LLM-Based-Multi-Agent-Systems.


Key Contributions

  • Dynamic defense paradigm for LLM-based MAS that continuously monitors communication graphs rather than relying on static graph analysis
  • Node evaluation mechanism that identifies malicious agents by assessing the integrity of their communications and adjusts graph topology accordingly
  • Demonstrated effectiveness against evolving and diverse dynamic corruption attacks in increasingly complex multi-agent environments

🛡️ Threat Analysis


Details

Domains
nlpgraph
Model Types
llm
Threat Tags
inference_timedigital
Applications
llm multi-agent systemsagent pipelines