CIA: Inferring the Communication Topology from LLM-based Multi-Agent Systems
Yongxuan Wu 1,2, Xixun Lin 1,2, He Zhang 3, Nan Sun 1,2, Kun Wang 4, Chuan Zhou 1, Shirui Pan 3, Yanan Cao 1,2
Published on arXiv: 2604.12461
Model Theft
OWASP ML Top 10 — ML05
Excessive Agency
OWASP LLM Top 10 — LLM08
Key Finding
Achieves average AUC of 0.87 and peak AUC of 0.99 in inferring MAS communication topologies under black-box settings
CIA (Communication Inference Attack)
Novel technique introduced
LLM-based Multi-Agent Systems (MAS) have demonstrated remarkable capabilities in solving complex tasks. Central to MAS is the communication topology, which governs how agents exchange information internally. Consequently, the security of communication topologies has attracted increasing attention. In this paper, we investigate a critical privacy risk: MAS communication topologies can be inferred under a restrictive black-box setting, exposing system vulnerabilities and posing significant intellectual property threats. To explore this risk, we propose the Communication Inference Attack (CIA), a novel attack that constructs adversarial queries to elicit intermediate agents' reasoning outputs and models the semantic correlations among those outputs through the proposed global bias disentanglement and LLM-guided weak supervision. Extensive experiments on MAS with optimized communication topologies demonstrate the effectiveness of CIA, which achieves an average AUC of 0.87 and a peak AUC of 0.99, revealing a substantial privacy risk in MAS.
Key Contributions
- Novel Communication Inference Attack (CIA) that infers MAS communication topologies under restrictive black-box settings using adversarial query construction
- Global bias disentanglement and LLM-guided weak supervision to model semantic correlations between intermediate agents' reasoning outputs
- Demonstration of severe privacy risk with average AUC of 0.87 and peak AUC of 0.99 across multiple task scenarios
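To make the AUC figures above concrete, here is a minimal sketch of how topology inference can be scored as an edge-ranking problem. It is not the paper's method: the mean-centering step is only a crude stand-in for the idea of removing a component shared by all agents, the cosine scorer replaces the paper's learned correlation model, and edges are assumed to be undirected. The function names (`center`, `edge_scores`, `auc`) and the toy vectors are hypothetical.

```python
import math
from itertools import combinations


def center(vectors):
    """Subtract the mean vector shared by all agents (assumed stand-in for a global bias)."""
    dim = len(vectors[0])
    mean = [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]
    return [[v[i] - mean[i] for i in range(dim)] for v in vectors]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def edge_scores(vectors):
    """Score every unordered agent pair (i, j), i < j, by similarity of centered vectors."""
    centered = center(vectors)
    return {
        (i, j): cosine(centered[i], centered[j])
        for i, j in combinations(range(len(vectors)), 2)
    }


def auc(scores, true_edges):
    """Probability that a true edge outscores a non-edge (ties count as 0.5)."""
    pos = [s for edge, s in scores.items() if edge in true_edges]
    neg = [s for edge, s in scores.items() if edge not in true_edges]
    if not pos or not neg:
        raise ValueError("AUC needs at least one edge and one non-edge")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): one embedding per agent output. The large third
# coordinate is shared by every agent and would dominate the raw similarity.
agent_vectors = [
    [1.0, 0.9, 5.0],
    [0.9, 1.0, 5.0],
    [-1.0, -0.9, 5.0],
    [-0.9, -1.0, 5.0],
]
true_edges = {(0, 1), (2, 3)}  # assumed ground-truth topology, stored with i < j

scores = edge_scores(agent_vectors)
topology_auc = auc(scores, true_edges)
print(f"AUC = {topology_auc:.2f}")
```

Under this reading, an AUC of 0.87 means that a randomly chosen true communication link is ranked above a randomly chosen non-link about 87% of the time, while 0.5 would correspond to guessing.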
🛡️ Threat Analysis
The attack infers the communication topology of an MAS, which the paper treats as valuable intellectual property because it encapsulates computational resources and expert knowledge. What is stolen is proprietary system design and architecture information rather than model weights, yet the paper explicitly frames this as IP theft of a proprietary asset that undermines the owner's competitive advantage.