attack 2025

FuncPoison: Poisoning Function Library to Hijack Multi-agent Autonomous Driving Systems

Yuzhen Long , Songze Li

1 citations · 68 references · arXiv

α

Published on arXiv

2509.24408

AI Supply Chain Attacks

OWASP ML Top 10 — ML06

Insecure Plugin Design

OWASP LLM Top 10 — LLM07

Key Finding

FuncPoison significantly degrades trajectory accuracy in LLM-based multi-agent autonomous driving systems, enables targeted agent manipulation, and evades diverse defense mechanisms

FuncPoison

Novel technique introduced


Autonomous driving systems increasingly rely on multi-agent architectures powered by large language models (LLMs), where specialized agents collaborate to perceive, reason, and plan. A key component of these systems is the shared function library, a collection of software tools that agents use to process sensor data and navigate complex driving environments. Despite its critical role in agent decision-making, the function library remains an under-explored vulnerability. In this paper, we introduce FuncPoison, a novel poisoning-based attack targeting the function library to manipulate the behavior of LLM-driven multi-agent autonomous systems. FuncPoison exploits two key weaknesses in how agents access the function library: (1) agents rely on text-based instructions to select tools; and (2) these tools are activated using standardized command formats that attackers can replicate. By injecting malicious tools with deceptive instructions, FuncPoison manipulates one agent s decisions--such as misinterpreting road conditions--triggering cascading errors that mislead other agents in the system. We experimentally evaluate FuncPoison on two representative multi-agent autonomous driving systems, demonstrating its ability to significantly degrade trajectory accuracy, flexibly target specific agents to induce coordinated misbehavior, and evade diverse defense mechanisms. Our results reveal that the function library, often considered a simple toolset, can serve as a critical attack surface in LLM-based autonomous driving systems, raising elevated concerns on their reliability.


Key Contributions

  • FuncPoison: a supply-chain poisoning attack that injects malicious tools with deceptive text descriptions into shared LLM agent function libraries without modifying model weights or prompt instructions
  • Demonstrates cascading cross-agent propagation where one compromised agent's corrupted outputs mislead downstream agents in collaborative multi-agent pipelines
  • Evaluates attack effectiveness (degraded trajectory accuracy), agent-specific targeting, and evasion of diverse defenses across two multi-agent autonomous driving systems

🛡️ Threat Analysis

AI Supply Chain Attacks

FuncPoison explicitly frames itself as a supply-chain attack: third-party function libraries are poisoned during distribution or version updates (analogous to npm/PyPI ecosystem compromises), and the malicious tools are seamlessly integrated and trusted by downstream LLM agents without any model-weight modification.


Details

Domains
nlpmultimodal
Model Types
llm
Threat Tags
black_boxinference_timetargeted
Applications
autonomous drivingmulti-agent llm systems