
The Sum Leaks More Than Its Parts: Compositional Privacy Risks and Mitigations in Multi-Agent Collaboration

Vaidehi Patil 1, Elias Stengel-Eskin 1,2, Mohit Bansal 1


Published on arXiv: 2509.14284

Sensitive Information Disclosure

OWASP LLM Top 10 — LLM06

Excessive Agency

OWASP LLM Top 10 — LLM08

Key Finding

CoDef achieves 79.8% Balanced Outcome while ToM reaches up to 97% sensitive blocking rate, versus only ~39% blocking from chain-of-thought reasoning alone.

Collaborative Consensus Defense (CoDef)

Novel technique introduced


As large language models (LLMs) become integral to multi-agent systems, new privacy risks emerge that extend beyond memorization, direct inference, or single-turn evaluations. In particular, seemingly innocuous responses, when composed across interactions, can cumulatively enable adversaries to recover sensitive information, a phenomenon we term compositional privacy leakage. We present the first systematic study of such compositional privacy leaks and possible mitigation methods in multi-agent LLM systems. First, we develop a framework that models how auxiliary knowledge and agent interactions jointly amplify privacy risks, even when each response is benign in isolation. Next, to mitigate this, we propose and evaluate two defense strategies: (1) Theory-of-Mind defense (ToM), where defender agents infer a questioner's intent by anticipating how their outputs may be exploited by adversaries, and (2) Collaborative Consensus Defense (CoDef), where responder agents collaborate with peers who vote based on a shared aggregated state to restrict sensitive information spread. Crucially, we balance our evaluation across compositions that expose sensitive information and compositions that yield benign inferences. Our experiments quantify how these defense strategies differ in balancing the privacy-utility trade-off. We find that while chain-of-thought alone offers limited protection against leakage (~39% sensitive blocking rate), our ToM defense substantially improves sensitive query blocking (up to 97%) but can reduce benign task success. CoDef achieves the best balance, yielding the highest Balanced Outcome (79.8%), highlighting the benefit of combining explicit reasoning with defender collaboration. Together, our results expose a new class of risks in collaborative LLM deployments and provide actionable insights for designing safeguards against compositional, context-driven privacy leakage.
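The core idea of the ToM defense is that a defender checks, before answering, whether its response would let an adversary complete an inference when combined with facts other agents have already revealed. A minimal sketch of that check, where the rule set, function names, and fact strings are illustrative assumptions (not the paper's implementation):

```python
# Toy ToM-style defense: simulate what an adversary could infer if the
# candidate answer were composed with facts already revealed by other agents.

# Each rule: if the adversary holds all facts on the left, it can infer
# the sensitive attribute on the right (illustrative example only).
SENSITIVE_RULES = [
    ({"works_in_oncology", "on_medical_leave"}, "patient_status"),
]

def adversary_can_infer(revealed: set) -> set:
    """Sensitive attributes derivable from the set of revealed facts."""
    return {attr for facts, attr in SENSITIVE_RULES if facts <= revealed}

def tom_defense(candidate_fact: str, shared_state: set) -> str:
    """Block the answer iff composing it with prior responses leaks
    something that was not already inferable."""
    before = adversary_can_infer(shared_state)
    after = adversary_can_infer(shared_state | {candidate_fact})
    return "BLOCK" if after - before else "ANSWER"

state = {"works_in_oncology"}                      # already leaked elsewhere
print(tom_defense("on_medical_leave", state))      # BLOCK: completes the inference
print(tom_defense("office_on_3rd_floor", state))   # ANSWER: benign composition
```

The "before/after" comparison is what makes this compositional: each fact is benign in isolation, and blocking triggers only when the marginal release would close an inference chain.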


Key Contributions

  • Formalization of compositional privacy leakage: a new threat class where cross-agent response composition exposes sensitive attributes no single agent would disclose
  • Theory-of-Mind (ToM) defense where agents anticipate adversarial intent before responding, achieving up to 97% sensitive query blocking
  • Collaborative Consensus Defense (CoDef) where agents vote based on aggregated shared state, achieving the best privacy-utility balance (79.8% Balanced Outcome)
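The CoDef contribution above can be sketched as a majority vote over a shared aggregated state. In this hedged toy version, the peer knowledge, rule format, and majority threshold are assumptions for illustration rather than the paper's protocol:

```python
# Toy CoDef-style consensus: each defender peer votes on whether releasing
# a candidate answer, combined with the shared aggregated state, would
# complete a sensitive composition it knows about.

def peer_vote(candidate: str, shared_state: frozenset, peer_rules: list) -> bool:
    """True = approve release; reject if any known sensitive composition
    becomes complete once the candidate is added to the shared state."""
    combined = set(shared_state) | {candidate}
    return not any(facts <= combined for facts in peer_rules)

def codef(candidate: str, shared_state: frozenset, peers: list) -> str:
    """Release only if a strict majority of defender peers approve."""
    approvals = sum(peer_vote(candidate, shared_state, rules) for rules in peers)
    return "RELEASE" if approvals > len(peers) / 2 else "WITHHOLD"

# Three peers, each aware of different (illustrative) leak patterns.
peers = [
    [{"dept_oncology", "on_leave"}],   # peer 1 knows this composition leaks
    [{"dept_oncology", "on_leave"}],   # peer 2 knows the same pattern
    [],                                # peer 3 knows no patterns
]
state = frozenset({"dept_oncology"})
print(codef("on_leave", state, peers))        # WITHHOLD (2 of 3 reject)
print(codef("joined_in_2019", state, peers))  # RELEASE (all approve)
```

Voting over the aggregated state lets peers with different partial views catch compositions that no single responder would flag on its own, which is the intuition behind CoDef's stronger privacy-utility balance.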

🛡️ Threat Analysis


Details

Domains
nlp
Model Types
llm
Threat Tags
inference_time, black_box
Datasets
synthetic multi-agent tabular scenarios with medical/organizational sensitive attributes
Applications
multi-agent llm systems, enterprise ai assistants, organizational ai deployments