defense 2026

Formal Analysis and Supply Chain Security for Agentic AI Skills

Varun Pratap Bhardwaj

Published on arXiv

2603.00195

AI Supply Chain Attacks

OWASP ML Top 10 — ML06

Insecure Plugin Design

OWASP LLM Top 10 — LLM07

Key Finding

SkillFortify achieves 96.95% F1 with 100% precision and a 0% false positive rate on 540 agent skills. By combining static analysis with information flow analysis, it detects attack patterns that are invisible to pure pattern matching, and so outperforms heuristic tools.

SkillFortify

Novel technique introduced


The rapid proliferation of agentic AI skill ecosystems -- exemplified by OpenClaw (228,000 GitHub stars) and Anthropic Agent Skills (75,600 stars) -- has introduced a critical supply chain attack surface. The ClawHavoc campaign (January-February 2026) infiltrated over 1,200 malicious skills into the OpenClaw marketplace, while MalTool catalogued 6,487 malicious tools that evade conventional detection. In response, twelve reactive security tools emerged, yet all rely on heuristic methods that provide no formal guarantees. We present SkillFortify, the first formal analysis framework for agent skill supply chains, with six contributions: (1) the DY-Skill attacker model, a Dolev-Yao adaptation to the five-phase skill lifecycle with a maximality proof; (2) a sound static analysis framework grounded in abstract interpretation; (3) capability-based sandboxing with a confinement proof; (4) an Agent Dependency Graph with SAT-based resolution and lockfile semantics; (5) a trust score algebra with formal monotonicity; and (6) SkillFortifyBench, a 540-skill benchmark. SkillFortify achieves 96.95% F1 (95% CI: [95.1%, 98.4%]) with 100% precision and 0% false positive rate on 540 skills, while SAT-based resolution handles 1,000-node graphs in under 100 ms.
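The reported F1 and precision together determine recall, which the summary does not state directly. The following is a minimal sketch of that arithmetic, not part of the paper; the function name `recall_from_f1` is a hypothetical helper introduced here for illustration.

```python
def recall_from_f1(f1: float, precision: float) -> float:
    """Solve F1 = 2PR / (P + R) for recall R, given F1 and precision P.

    Rearranging gives R = F1 * P / (2P - F1).
    """
    denominator = 2 * precision - f1
    if denominator <= 0:
        raise ValueError("no valid recall exists for this F1/precision pair")
    return f1 * precision / denominator


# Figures taken from the abstract: F1 = 96.95%, precision = 100%.
implied_recall = recall_from_f1(0.9695, 1.0)
print(f"implied recall: {implied_recall:.4f}")
```

Under these figures the implied recall is about 94.1%, so the errors behind the F1 score are missed malicious skills rather than benign skills flagged as malicious, which is consistent with the 0% false positive rate.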


Key Contributions

  • DY-Skill attacker model — a formal Dolev-Yao adaptation to the five-phase agent skill lifecycle (authorship → registry → installation → runtime → state persistence) with a maximality proof that any symbolic supply-chain attacker is simulable by a DY-Skill trace
  • Sound static analysis (abstract interpretation over a four-element capability lattice) and capability-based sandboxing with a formal confinement proof guaranteeing no authority amplification beyond declared capabilities
  • SkillFortifyBench (540-skill benchmark drawn from MalTool and ClawHavoc) on which SkillFortify achieves 96.95% F1 with 100% precision and 0% false positive rate, and SAT-based dependency resolution completes 1,000-node graphs in under 100 ms
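The capability-lattice idea in the second contribution can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: the summary says only that the lattice has four elements, so the chain `NONE < READ < WRITE < ADMIN`, the resource names, and the `OP_CAP` operation table are all hypothetical.

```python
from enum import IntEnum


class Cap(IntEnum):
    """Hypothetical four-element capability lattice, ordered as a chain."""
    NONE = 0
    READ = 1
    WRITE = 2
    ADMIN = 3


def join(a: Cap, b: Cap) -> Cap:
    """Least upper bound of two capabilities; on a chain this is the maximum."""
    return max(a, b)


# Hypothetical table: operation seen in skill code -> (resource, capability needed).
OP_CAP = {
    "open_read": ("filesystem", Cap.READ),
    "open_write": ("filesystem", Cap.WRITE),
    "http_get": ("network", Cap.READ),
    "http_post": ("network", Cap.WRITE),
}


def infer_capabilities(ops: list[str]) -> dict[str, Cap]:
    """Over-approximate the capabilities a skill needs, per resource.

    Unrecognised operations map to the top element so that the analysis
    errs on the side of reporting too much rather than too little.
    """
    inferred: dict[str, Cap] = {}
    for op in ops:
        resource, cap = OP_CAP.get(op, ("unknown", Cap.ADMIN))
        inferred[resource] = join(inferred.get(resource, Cap.NONE), cap)
    return inferred


def violations(declared: dict[str, Cap], inferred: dict[str, Cap]) -> dict[str, Cap]:
    """Return the resources whose inferred capability exceeds the declared one."""
    return {r: c for r, c in inferred.items() if c > declared.get(r, Cap.NONE)}


declared = {"filesystem": Cap.READ}
observed = ["open_read", "http_post"]
print(violations(declared, infer_capabilities(observed)))
```

In this sketch, a skill that declares only filesystem read access but posts data over the network is flagged for the undeclared `network` capability. Treating unknown operations as the top element mirrors the conservative over-approximation that a sound analysis requires.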

🛡️ Threat Analysis

AI Supply Chain Attacks

The paper's primary framing and title are explicitly about AI agent skill supply chain security: detecting and preventing malicious skills distributed via agent marketplaces (analogous to trojaned packages in npm/PyPI). The five-phase skill lifecycle threat model, SAT-based dependency resolution, and trust score algebra all target the supply chain attack surface before and during installation.
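To make the dependency-resolution step concrete, the sketch below pins one version per skill so that every constraint holds, and returns the result as a lockfile-style mapping. It uses plain backtracking search rather than a SAT solver, and the registry contents, skill names, and `resolve` function are hypothetical; the summary does not describe the paper's encoding or lockfile format.

```python
from typing import Optional

# Hypothetical registry: skill -> version -> {dependency: set of allowed versions}.
REGISTRY = {
    "report": {"1.0": {"summarize": {"1.0", "2.0"}, "fetch": {"1.0"}}},
    "summarize": {"2.0": {"fetch": {"1.1"}}, "1.0": {"fetch": {"1.0", "1.1"}}},
    "fetch": {"1.0": {}, "1.1": {}},
}

Constraint = tuple[str, Optional[set[str]]]


def resolve(
    pending: list[Constraint],
    pinned: dict[str, str],
    registry: dict[str, dict[str, dict[str, set[str]]]],
) -> Optional[dict[str, str]]:
    """Return a skill -> version pinning that satisfies every constraint, or None."""
    if not pending:
        return pinned
    (name, allowed), rest = pending[0], pending[1:]
    if name in pinned:
        # Already pinned: the existing choice must satisfy this constraint too.
        if allowed is not None and pinned[name] not in allowed:
            return None
        return resolve(rest, pinned, registry)
    # Try newer versions first (string ordering, a simplification for this sketch).
    for version in sorted(registry.get(name, {}), reverse=True):
        if allowed is not None and version not in allowed:
            continue
        deps = list(registry[name][version].items())
        result = resolve(rest + deps, {**pinned, name: version}, registry)
        if result is not None:
            return result
    return None  # every candidate failed, so the caller backtracks


lockfile = resolve([("report", None)], {}, REGISTRY)
print(lockfile)
```

Here the resolver first tries `summarize` 2.0, finds that it conflicts with the `fetch` 1.0 requirement from `report`, and backtracks to `summarize` 1.0. Pinning exact versions in the result means a later installation can reproduce the same set of skills instead of re-resolving against a registry that may have changed.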


Details

Domains
nlp
Model Types
llm
Threat Tags
training_time, inference_time
Datasets
SkillFortifyBench, MalTool, ClawHavoc
Applications
llm agent skill marketplaces, agentic ai systems, ai agent plugin ecosystems