
PickleBall: Secure Deserialization of Pickle-based Machine Learning Models (Extended Report)

Andreas D. Kellas 1, Neophytos Christou 2, Wenxin Jiang 3,4, Penghui Li 1, Laurent Simon 5, Yaniv David 6, Vasileios P. Kemerlis 2, James C. Davis 3, Junfeng Yang 1


Published on arXiv

2508.15987

AI Supply Chain Attacks

OWASP ML Top 10 — ML06

Key Finding

PickleBall rejects 100% of malicious pickle models while correctly loading 79.8% of benign models — loading 22% more benign models than the state-of-the-art restrictive loader, and catching malicious models that evaluated scanners miss.

PickleBall

Novel technique introduced


Machine learning model repositories such as the Hugging Face Model Hub facilitate model exchange. However, bad actors can deliver malware through compromised models. Existing defenses such as safer model formats, restrictive (but inflexible) loading policies, and model scanners have shortcomings: 44.9% of popular models on Hugging Face still use the insecure pickle format, 15% of these cannot be loaded by restrictive loading policies, and model scanners have both false positives and false negatives. Pickle remains the de facto standard for model exchange, and the ML community lacks a tool that offers transparent safe loading. We present PickleBall to help machine learning engineers load pickle-based models safely. PickleBall statically analyzes the source code of a given machine learning library and computes a custom policy that specifies safe load-time behavior for benign models. PickleBall then dynamically enforces the policy during load time as a drop-in replacement for the pickle module. PickleBall generates policies that correctly load 79.8% of benign pickle-based models in our dataset, while rejecting all (100%) malicious examples in our dataset. In comparison, evaluated model scanners fail to identify known malicious models, and the state-of-the-art loader loads 22% fewer benign models than PickleBall. PickleBall removes the threat of arbitrary function invocation from malicious pickle-based models, raising the bar for attackers by forcing them to depend on code-reuse techniques.


Key Contributions

  • Static analysis of ML library source code to automatically compute safe pickle-loading policies tailored to specific frameworks
  • Dynamic policy enforcement as a drop-in replacement for Python's pickle module, blocking arbitrary code execution at load time
  • Empirical study showing 44.9% of popular Hugging Face models use insecure pickle format, with PickleBall correctly loading 79.8% of benign models while rejecting 100% of malicious examples
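The dynamic-enforcement contribution builds on Python's standard hook for restricting unpickling, `pickle.Unpickler.find_class`. The sketch below shows the general allowlist mechanism only; the hard-coded `ALLOWED` set and the `safe_loads` helper are illustrative assumptions, not PickleBall's actual policy or API — PickleBall derives its allowlist automatically by static analysis of the ML library's source.

```python
import io
import pickle

# Illustrative policy: permit only these (module, name) globals during
# unpickling. PickleBall computes such a policy per ML library; this
# hand-written set is a stand-in for demonstration.
ALLOWED = {
    ("collections", "OrderedDict"),
    ("builtins", "dict"),
}

class PolicyUnpickler(pickle.Unpickler):
    """Unpickler that resolves only allowlisted globals."""
    def find_class(self, module, name):
        if (module, name) in ALLOWED:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(
            f"policy violation: {module}.{name} is not permitted")

def safe_loads(data: bytes):
    """Drop-in replacement for pickle.loads under the policy above."""
    return PolicyUnpickler(io.BytesIO(data)).load()
```

A benign payload that only references allowlisted classes loads normally, while a payload that names any other callable (e.g. `os.system`) is rejected with an `UnpicklingError` before the callable is ever invoked.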

🛡️ Threat Analysis

AI Supply Chain Attacks

Directly addresses the supply chain attack vector of trojaned/malware-embedded models distributed via public model hubs (Hugging Face). The threat is bad actors uploading malicious pickle files that execute arbitrary code when loaded by unsuspecting users — a canonical ML supply chain attack. PickleBall is a defense that intercepts this supply chain threat at load time.
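To see why pickle loading is a viable delivery vector: a pickle payload can name an arbitrary callable to be invoked during deserialization via `__reduce__`. The hypothetical `Payload` class below uses a harmless stand-in (`print`) where a real trojaned model would use something like `os.system` — the point is that the call fires at load time, before any model code runs.

```python
import pickle

class Payload:
    """Hypothetical stand-in for a trojaned model object."""
    def __reduce__(self):
        # Pickle serializes this as (callable, args); the callable is
        # invoked during pickle.loads(), i.e. at model load time.
        return (print, ("code executed during unpickling",))

blob = pickle.dumps(Payload())
pickle.loads(blob)  # invokes print() as a side effect of loading
```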


Details

Threat Tags
black_box, inference_time
Datasets
Hugging Face Model Hub (benign and malicious pickle models)
Applications
ML model distribution, model hub security, model loading pipelines