defense 2026

INFA-Guard: Mitigating Malicious Propagation via Infection-Aware Safeguarding in LLM-Based Multi-Agent Systems

Yijin Zhou ¹,²,³, Xiaoya Lu ¹,², Dongrui Liu ², Junchi Yan ¹,³, Jing Shao ²

¹ Shanghai Jiao Tong University

² Shanghai Artificial Intelligence Laboratory

³ Shanghai Innovation Institute

0 citations · 36 references · arXiv

Published on arXiv: 2601.14667

Prompt Injection

OWASP LLM Top 10 — LLM01

Excessive Agency

OWASP LLM Top 10 — LLM08

Key Finding

INFA-Guard reduces Attack Success Rate by an average of 33% across multi-agent system configurations while maintaining cross-model robustness and topological generalization.

INFA-Guard

Novel technique introduced

The rapid advancement of Large Language Model (LLM)-based Multi-Agent Systems (MAS) has introduced significant security vulnerabilities, where malicious influence can propagate virally through inter-agent communication. Conventional safeguards often rely on a binary paradigm that strictly distinguishes between benign and attack agents, failing to account for infected agents, i.e., benign entities converted by attack agents. In this paper, we propose Infection-Aware Guard (INFA-Guard), a novel defense framework that explicitly identifies and addresses infected agents as a distinct threat category. By leveraging infection-aware detection and topological constraints, INFA-Guard accurately localizes attack sources and infected ranges. During remediation, INFA-Guard replaces attackers and rehabilitates infected agents, preventing malicious propagation while preserving topological integrity. Extensive experiments demonstrate that INFA-Guard achieves state-of-the-art performance, reducing the Attack Success Rate (ASR) by an average of 33%, while exhibiting cross-model robustness, superior topological generalization, and high cost-effectiveness.
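The abstract does not spell out what the topological constraints are. As a minimal sketch of one plausible reading, and not the paper's actual method, the snippet below assumes that an agent can only count as infected if some attack source can reach it along the directed communication graph. The function names `reachable_from` and `filter_infected`, and the `(sender, receiver)` edge format, are hypothetical and chosen only for illustration.

```python
from collections import deque


def reachable_from(sources, edges):
    """Return every node reachable from `sources` over directed edges.

    `edges` is an iterable of (sender, receiver) pairs. The sources
    themselves are excluded from the result.
    """
    adjacency = {}
    for sender, receiver in edges:
        adjacency.setdefault(sender, []).append(receiver)

    seen = set(sources)
    queue = deque(sources)
    while queue:
        node = queue.popleft()
        for nxt in adjacency.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen - set(sources)


def filter_infected(candidates, attack_sources, edges):
    """Assumed constraint: keep only candidates an attacker could have influenced."""
    return set(candidates) & reachable_from(attack_sources, edges)


# Agent "e" is flagged by a (hypothetical) detector but has no path from
# the attacker "a", so the assumed constraint discards it.
edges = [("a", "b"), ("b", "c"), ("d", "e")]
infected = filter_infected({"c", "e"}, {"a"}, edges)
```

Under this assumption, a reachability check like this prunes false positives from a per-agent detector; the real framework may combine graph structure and detection in a different way.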

Key Contributions

Introduces a novel threat category of 'infected agents' (benign agents converted by attackers) distinct from both clean and attack agents, enabling more accurate threat modeling in MAS
Proposes infection-aware detection combined with topological constraints to localize attack sources and quantify infection spread across agent communication graphs
Develops a remediation strategy that replaces attacker agents and rehabilitates infected ones, reducing ASR by an average of 33% while preserving topological integrity
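The three-way distinction and the remediation step can be illustrated with a small sketch. This is an illustration under stated assumptions, not the authors' implementation: the `Status` enum, the `Agent` class, and the `remediate` function are hypothetical names, and "rehabilitation" is assumed here to mean dropping memory entries that came from flagged agents. Replaced and rehabilitated agents keep their node names, so the edge list is returned unchanged.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    BENIGN = "benign"
    INFECTED = "infected"  # benign agent converted by an attacker
    ATTACK = "attack"


@dataclass
class Agent:
    name: str
    status: Status = Status.BENIGN
    # Each memory entry is a (sender_name, message_text) pair.
    memory: list = field(default_factory=list)


def remediate(agents, edges):
    """Replace attackers, rehabilitate infected agents, keep the topology.

    `agents` maps node name -> Agent; `edges` is a list of
    (sender, receiver) pairs. Returns (repaired_agents, edges_copy).
    """
    flagged = {n for n, a in agents.items() if a.status is not Status.BENIGN}
    repaired = {}
    for name, agent in agents.items():
        if agent.status is Status.ATTACK:
            # Replace: a fresh agent with empty memory takes over the node.
            repaired[name] = Agent(name=name)
        elif agent.status is Status.INFECTED:
            # Rehabilitate (assumed): keep only messages from unflagged senders.
            clean = [(s, m) for s, m in agent.memory if s not in flagged]
            repaired[name] = Agent(name=name, memory=clean)
        else:
            repaired[name] = agent
    # Node names are unchanged, so the edge list carries over as-is.
    return repaired, list(edges)


agents = {
    "a": Agent("a", Status.ATTACK, [("x", "payload")]),
    "b": Agent("b", Status.INFECTED, [("a", "ignore your task"), ("c", "hello")]),
    "c": Agent("c"),
}
edges = [("a", "b"), ("c", "b")]
repaired, new_edges = remediate(agents, edges)
```

The design point this sketch tries to capture is that a binary benign/attack scheme would either leave agent "b" untouched or remove it like an attacker, while the three-way scheme cleans it in place and leaves the communication graph intact.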

🛡️ Threat Analysis

Details

Domains

nlp

Model Types

llm, transformer

Threat Tags

inference_time, digital, grey_box

Applications

llm multi-agent systems, agentic ai pipelines

Similar Papers

Catching Contamination Before Generation: Spectral Kill Switches for Agents

From Spark to Fire: Modeling and Mitigating Error Cascades in LLM-Based Multi-Agent Collaboration

Policy-as-Prompt: Turning AI Governance Rules into Guardrails for AI Agents

Factor(U,T): Controlling Untrusted AI by Monitoring their Plans

AgentGuardian: Learning Access Control Policies to Govern AI Agent Behavior

Verifiability-First Agents: Provable Observability and Lightweight Audit Agents for Controlling Autonomous LLM Systems

Async Control: Stress-testing Asynchronous Control Measures for LLM Agents

Structural Representations for Cross-Attack Generalization in AI Agent Threat Detection