attack arXiv Apr 14, 2026
Yongxuan Wu, Xixun Lin, He Zhang et al. · Chinese Academy of Sciences · University of Chinese Academy of Sciences +2 more
Black-box attack that infers the communication topology of LLM multi-agent systems via adversarial queries, reaching a peak AUC of 0.99
Model Theft Excessive Agency nlp
LLM-based Multi-Agent Systems (MAS) have demonstrated remarkable capabilities in solving complex tasks. Central to MAS is the communication topology, which governs how agents exchange information internally. Consequently, the security of communication topologies has attracted increasing attention. In this paper, we investigate a critical privacy risk: MAS communication topologies can be inferred under a restrictive black-box setting, exposing system vulnerabilities and posing significant intellectual property threats. To explore this risk, we propose the Communication Inference Attack (CIA), a novel attack that constructs adversarial queries to elicit intermediate agents' reasoning outputs and models their semantic correlations through the proposed global bias disentanglement and LLM-guided weak supervision. Extensive experiments on MAS with optimized communication topologies demonstrate the effectiveness of CIA, which achieves an average AUC of 0.87 and a peak AUC of up to 0.99, thereby revealing the substantial privacy risk in MAS.
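To make the reported metric concrete, the sketch below shows one way an AUC over inferred edges could be computed. It is not the CIA method itself. It assumes the attacker has already produced one correlation score per agent pair and that a ground-truth edge set is available for evaluation. The function name `edge_auc`, the four-agent chain, and all scores are hypothetical and chosen only for illustration.

```python
def edge_auc(scores, true_edges):
    """AUC of pairwise scores against ground-truth undirected edges.

    scores: dict mapping frozenset({a, b}) -> float (higher = more likely an edge)
    true_edges: set of frozenset({a, b}) holding the real communication links
    """
    pos = [s for pair, s in scores.items() if pair in true_edges]
    neg = [s for pair, s in scores.items() if pair not in true_edges]
    if not pos or not neg:
        raise ValueError("need at least one edge and one non-edge")
    # Mann-Whitney formulation: the probability that a true edge outscores
    # a non-edge, with ties counted as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical 4-agent chain A-B-C-D with made-up correlation scores.
true_edges = {frozenset(e) for e in [("A", "B"), ("B", "C"), ("C", "D")]}
toy_scores = {
    frozenset(("A", "B")): 0.9,
    frozenset(("B", "C")): 0.8,
    frozenset(("C", "D")): 0.4,
    frozenset(("A", "C")): 0.5,
    frozenset(("A", "D")): 0.1,
    frozenset(("B", "D")): 0.3,
}
auc = edge_auc(toy_scores, true_edges)
print(f"edge-inference AUC: {auc:.3f}")
```

An AUC of 1.0 would mean every true link is scored above every non-link, while 0.5 corresponds to random guessing. The pairwise formulation is used here because it needs no threshold on the scores.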
llm Chinese Academy of Sciences · University of Chinese Academy of Sciences · Griffith University +1 more