tool 2026

Agent Skills in the Wild: An Empirical Study of Security Vulnerabilities at Scale

8 citations · 2 influential · 34 references · arXiv

Published on arXiv

2601.10338

AI Supply Chain Attacks

OWASP ML Top 10 — ML06

Insecure Plugin Design

OWASP LLM Top 10 — LLM07

Prompt Injection

OWASP LLM Top 10 — LLM01

Key Finding

26.1% of 31,132 analyzed AI agent skills contain at least one vulnerability, with data exfiltration (13.3%) and privilege escalation (11.8%) most prevalent; skills bundling executable scripts are 2.12x more likely to be vulnerable than instruction-only skills.

SkillScan

Novel technique introduced

The rise of AI agent frameworks has introduced agent skills, modular packages containing instructions and executable code that dynamically extend agent capabilities. While this architecture enables powerful customization, skills execute with implicit trust and minimal vetting, creating a significant yet uncharacterized attack surface. We conduct the first large-scale empirical security analysis of this emerging ecosystem, collecting 42,447 skills from two major marketplaces and systematically analyzing 31,132 using SkillScan, a multi-stage detection framework integrating static analysis with LLM-based semantic classification. Our findings reveal pervasive security risks: 26.1% of skills contain at least one vulnerability, spanning 14 distinct patterns across four categories: prompt injection, data exfiltration, privilege escalation, and supply chain risks. Data exfiltration (13.3%) and privilege escalation (11.8%) are most prevalent, while 5.2% of skills exhibit high-severity patterns strongly suggesting malicious intent. We find that skills bundling executable scripts are 2.12x more likely to contain vulnerabilities than instruction-only skills (OR=2.12, p<0.001). Our contributions include: (1) a grounded vulnerability taxonomy derived from 8,126 vulnerable skills, (2) a validated detection methodology achieving 86.7% precision and 82.5% recall, and (3) an open dataset and detection toolkit to support future research. These results demonstrate an urgent need for capability-based permission systems and mandatory security vetting before this attack vector is further exploited.

Key Contributions

SkillScan: a multi-stage detection framework combining static analysis and LLM-based semantic classification that achieves 86.7% precision and 82.5% recall for identifying vulnerable agent skills
Grounded vulnerability taxonomy of 14 patterns across 4 categories (prompt injection, data exfiltration, privilege escalation, supply chain) derived from 8,126 vulnerable skills in the wild
Open dataset of 31,132 analyzed agent skills from two major marketplaces, revealing 26.1% contain at least one security vulnerability

🛡️ Threat Analysis

AI Supply Chain Attacks

Supply chain risks are one of the four explicit vulnerability categories studied; skills are distributed through public marketplaces with minimal vetting, creating a supply-chain attack vector analogous to trojaned models on HuggingFace. The paper explicitly taxonomizes supply chain risks as a major threat category.

Details

Domains

nlp

Model Types

llm

Threat Tags

black_boxinference_time

Datasets

Custom dataset of 42,447 agent skills from two major marketplaces

Applications

ai agent frameworksagent skill marketplacesllm plugin ecosystems

Read PDF arXiv DOI

Agent Skills in the Wild: An Empirical Study of Security Vulnerabilities at Scale

Key Contributions

🛡️ Threat Analysis

Details

Similar Papers

MCPGuard : Automatically Detecting Vulnerabilities in MCP Servers

Malicious Agent Skills in the Wild: A Large-Scale Security Empirical Study

Trojan's Whisper: Stealthy Manipulation of OpenClaw through Injected Bootstrapped Guidance

Credential Leakage in LLM Agent Skills: A Large-Scale Empirical Study

MCP-SandboxScan: WASM-based Secure Execution and Runtime Analysis for MCP Tools

OpenClaw PRISM: A Zero-Fork, Defense-in-Depth Runtime Security Layer for Tool-Augmented LLM Agents

Systematization of Knowledge: Security and Safety in the Model Context Protocol Ecosystem

Unsafer in Many Turns: Benchmarking and Defending Multi-Turn Safety Risks in Tool-Using Agents