attack 2026

Your Agent Is Mine: Measuring Malicious Intermediary Attacks on the LLM Supply Chain

0 citations

Published on arXiv

2604.08407

AI Supply Chain Attacks

OWASP ML Top 10 — ML06

Insecure Plugin Design

OWASP LLM Top 10 — LLM07

Sensitive Information Disclosure

OWASP LLM Top 10 — LLM06

Key Finding

9 of 428 real-world LLM API routers actively inject malicious code; 17 touch AWS credentials; 1 drains ETH; poisoning experiments generate 2B+ billed tokens and exfiltrate 99 credentials across 440 sessions

Mine

Novel technique introduced

Large language model (LLM) agents increasingly rely on third-party API routers to dispatch tool-calling requests across multiple upstream providers. These routers operate as application-layer proxies with full plaintext access to every in-flight JSON payload, yet no provider enforces cryptographic integrity between client and upstream model. We present the first systematic study of this attack surface. We formalize a threat model for malicious LLM API routers and define two core attack classes, payload injection (AC-1) and secret exfiltration (AC-2), together with two adaptive evasion variants: dependency-targeted injection (AC-1.a) and conditional delivery (AC-1.b). Across 28 paid routers purchased from Taobao, Xianyu, and Shopify-hosted storefronts and 400 free routers collected from public communities, we find 1 paid and 8 free routers actively injecting malicious code, 2 deploying adaptive evasion triggers, 17 touching researcher-owned AWS canary credentials, and 1 draining ETH from a researcher-owned private key. Two poisoning studies further show that ostensibly benign routers can be pulled into the same attack surface: a leaked OpenAI key generates 100M GPT-5.4 tokens and more than seven Codex sessions, while weakly configured decoys yield 2B billed tokens, 99 credentials across 440 Codex sessions, and 401 sessions already running in autonomous YOLO mode. We build Mine, a research proxy that implements all four attack classes against four public agent frameworks, and use it to evaluate three deployable client-side defenses: a fail-closed policy gate, response-side anomaly screening, and append-only transparency logging.

Key Contributions

First systematic threat model for malicious LLM API routers defining payload injection (AC-1) and secret exfiltration (AC-2) attack classes
Empirical measurement finding 9/428 real-world routers actively injecting code, 17 touching AWS credentials, and 1 draining ETH from researcher wallets
Mine research proxy implementing all four attack variants and evaluation of three client-side defenses (policy gate, anomaly screening, transparency logging)

🛡️ Threat Analysis

AI Supply Chain Attacks

Primary focus on compromised infrastructure in the LLM supply chain — malicious API routers purchased from marketplaces that intercept and manipulate LLM agent traffic, fitting the definition of supply chain attacks on ML ecosystem tooling.

Details

Domains

nlp

Model Types

llm

Threat Tags

black_boxinference_time

Datasets

428 API routers (28 paid from Taobao/Xianyu/Shopify, 400 free from public communities)

Applications

llm agentsapi routingtool calling

Read PDF arXiv

Your Agent Is Mine: Measuring Malicious Intermediary Attacks on the LLM Supply Chain

Key Contributions

🛡️ Threat Analysis

Details

Similar Papers

IntentMiner: Intent Inversion Attack via Tool Call Analysis in the Model Context Protocol

SplitAgent: A Privacy-Preserving Distributed Architecture for Enterprise-Cloud Agent Collaboration

ShieldNet: Network-Level Guardrails against Emerging Supply-Chain Injections in Agentic Systems

Red-Teaming Coding Agents from a Tool-Invocation Perspective: An Empirical Security Assessment

FuncPoison: Poisoning Function Library to Hijack Multi-agent Autonomous Driving Systems

MalTool: Malicious Tool Attacks on LLM Agents

Supply-Chain Poisoning Attacks Against LLM Coding Agent Skill Ecosystems

Trojan's Whisper: Stealthy Manipulation of OpenClaw through Injected Bootstrapped Guidance