attack 2026

Is Monitoring Enough? Strategic Agent Selection For Stealthy Attack in Multi-Agent Discussions

Qiuchi Xiang , Haoxuan Qu , Hossein Rahmani , Jun Liu

0 citations

α

Published on arXiv

2603.21194

Prompt Injection

OWASP LLM Top 10 — LLM01

Excessive Agency

OWASP LLM Top 10 — LLM08

Key Finding

Shows existing attacks achieve >93.5% detection rates under monitoring and <7.6% success when adapted for stealth, motivating a new attack approach for the discussion-monitored scenario


Multi-agent discussions have been widely adopted, motivating growing efforts to develop attacks that expose their vulnerabilities. In this work, we study a practical yet largely unexplored attack scenario, the discussion-monitored scenario, where anomaly detectors continuously monitor inter-agent communications and block detected adversarial messages. Although existing attacks are effective without discussion monitoring, we show that they exhibit detectable patterns and largely fail under such monitoring constraints. But does this imply that monitoring alone is sufficient to secure multi-agent discussions? To answer this question, we develop a novel attack method explicitly tailored to the discussion-monitored scenario. Extensive experiments demonstrate that effective attacks remain possible even under continuous monitoring, indicating that monitoring alone does not eliminate adversarial risks.


Key Contributions

  • Identifies and formalizes the 'discussion-monitored scenario' where anomaly detectors continuously monitor multi-agent communications
  • Demonstrates that existing multi-agent attacks fail under continuous monitoring (>93.5% detection rate)
  • Develops novel stealth attack method tailored to evade anomaly detection while maintaining attack effectiveness in monitored multi-agent discussions

🛡️ Threat Analysis


Details

Domains
nlpmultimodal
Model Types
llmmultimodal
Threat Tags
inference_timeblack_box
Applications
multi-agent systemsmedical diagnosislegal judgmentsoftware development