Provable Model Provenance Set for Large Language Models
Xiaoqi Qiu, Hao Zeng, Zhiyu Hou, Hongxin Wei
Published on arXiv (arXiv:2602.00772)
Model Theft
OWASP ML Top 10 — ML05
Key Finding
MPS reliably achieves target provenance coverage while strictly limiting inclusion of unrelated models, outperforming heuristic fingerprint-matching baselines on binary provenance verification
Model Provenance Set (MPS)
Novel technique introduced
The growing prevalence of unauthorized model usage and misattribution has increased the need for reliable model provenance analysis. However, existing methods largely rely on heuristic fingerprint-matching rules that lack provable error control and often overlook the existence of multiple sources, leaving the reliability of their provenance claims unverified. In this work, we first formalize the model provenance problem with provable guarantees, requiring rigorous coverage of all true provenances at a prescribed confidence level. Then, we propose the Model Provenance Set (MPS), which employs a sequential test-and-exclusion procedure to adaptively construct a small set satisfying the guarantee. The key idea of MPS is to test the significance of provenance existence within a candidate pool, thereby establishing a provable asymptotic guarantee at a user-specified confidence level. Extensive experiments demonstrate that MPS effectively achieves target provenance coverage while strictly limiting the inclusion of unrelated models, and further reveal its potential for practical provenance analysis in attribution and auditing tasks.
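The sequential test-and-exclusion idea can be sketched as a simple loop: repeatedly test the most promising candidate for significant evidence of provenance, include it if it passes, and stop once no remaining candidate is significant. This is only a minimal illustrative sketch; the `p_value` function stands in for the paper's actual significance test of provenance existence, which is not reproduced here, and the stopping rule below is a simplification of the real procedure's guarantee.

```python
def build_provenance_set(candidates, p_value, alpha=0.05):
    """Sketch of a sequential test-and-exclusion loop.

    candidates: pool of candidate source models.
    p_value:    hypothetical per-candidate significance test of
                provenance existence (assumption, not the MPS statistic).
    alpha:      significance level; 1 - alpha is the target confidence.
    """
    pool = list(candidates)
    provenance_set = []
    while pool:
        # Test the most promising remaining candidate first.
        best = min(pool, key=p_value)
        if p_value(best) <= alpha:
            # Significant evidence of provenance: include it and keep
            # testing the rest of the pool.
            provenance_set.append(best)
            pool.remove(best)
        else:
            # No remaining candidate is significant: exclude the rest
            # and stop, keeping the returned set small.
            break
    return provenance_set
```

For example, with toy p-values `{"A": 0.001, "B": 0.3, "C": 0.01}` and `alpha=0.05`, the loop admits `A` and `C` and excludes `B`, mirroring how the construction limits inclusion of unrelated models.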
Key Contributions
- Formalizes model provenance as a statistical testing problem requiring provable coverage of all true source models at a user-specified confidence level
- Proposes Model Provenance Set (MPS): a sequential test-and-exclusion procedure that constructs a compact candidate set with asymptotic provenance coverage guarantees
- Validates practical utility for LLM attribution, unauthorized-derivation screening, and non-infringement quantification across 455 LLMs spanning up to three generations of fine-tuning lineage
🛡️ Threat Analysis
Proposes model fingerprinting and provenance analysis specifically to detect unauthorized LLM derivation, defending against model theft in which fine-tuned derivatives are misattributed as independently developed. The method identifies which base models a suspected model was derived from, with provable coverage guarantees, directly serving IP-protection and auditing goals.