attack 2026

Supply-Chain Poisoning Attacks Against LLM Coding Agent Skill Ecosystems

0 citations

Published on arXiv

2604.03081

AI Supply Chain Attacks

OWASP ML Top 10 — ML06

Insecure Plugin Design

OWASP LLM Top 10 — LLM07

Excessive Agency

OWASP LLM Top 10 — LLM08

Key Finding

Achieves 11.6% to 33.5% bypass rates across four frameworks and five models, while explicit instruction attacks achieve 0% under strong defenses; 2.5% evade both static detection and alignment

DDIPE

Novel technique introduced

LLM-based coding agents extend their capabilities via third-party agent skills distributed through open marketplaces without mandatory security review. Unlike traditional packages, these skills are executed as operational directives with system-level privileges, so a single malicious skill can compromise the host. Prior work has not examined whether supply-chain attacks can directly hijack an agent's action space, such as file writes, shell commands, and network requests, despite existing safeguards. We introduce Document-Driven Implicit Payload Execution (DDIPE), which embeds malicious logic in code examples and configuration templates within skill documentation. Because agents reuse these examples during normal tasks, the payload executes without explicit prompts. Using an LLM-driven pipeline, we generate 1,070 adversarial skills from 81 seeds across 15 MITRE ATTACK categories. Across four frameworks and five models, DDIPE achieves 11.6% to 33.5% bypass rates, while explicit instruction attacks achieve 0% under strong defenses. Static analysis detects most cases, but 2.5% evade both detection and alignment. Responsible disclosure led to four confirmed vulnerabilities and two fixes.

Key Contributions

Document-Driven Implicit Payload Execution (DDIPE) attack embedding malicious logic in skill documentation code examples
LLM-driven pipeline generating 1,070 adversarial skills across 15 MITRE ATT&CK categories
Empirical evaluation showing 11.6-33.5% bypass rates across four frameworks and five models, with 2.5% evading both static analysis and alignment

🛡️ Threat Analysis

AI Supply Chain Attacks

Trojanizes third-party agent skills distributed through open marketplaces without security review — classic supply chain compromise targeting the LLM agent ecosystem infrastructure.

Details

Domains

nlp

Model Types

llm

Threat Tags

black_boxinference_timetargeted

Applications

llm coding agentsagent skill marketplaces

Read PDF arXiv

Supply-Chain Poisoning Attacks Against LLM Coding Agent Skill Ecosystems

Key Contributions

🛡️ Threat Analysis

Details

Similar Papers

Takedown: How It's Done in Modern Coding Agent Exploits

Web Fraud Attacks Against LLM-Driven Multi-Agent Systems

ACRFence: Preventing Semantic Rollback Attacks in Agent Checkpoint-Restore

Servant, Stalker, Predator: How An Honest, Helpful, And Harmless (3H) Agent Unlocks Adversarial Skills

Systematization of Knowledge: Security and Safety in the Model Context Protocol Ecosystem

FuncPoison: Poisoning Function Library to Hijack Multi-agent Autonomous Driving Systems

MalTool: Malicious Tool Attacks on LLM Agents

LPS-Bench: Benchmarking Safety Awareness of Computer-Use Agents in Long-Horizon Planning under Benign and Adversarial Scenarios