Defense · 2025

Unlocking the Effectiveness of LoRA-FP for Seamless Transfer Implantation of Fingerprints in Downstream Models

Zhenhua Xu 1, Zhaokun Yan 2, Binhan Xu 3,4, Xin Tong 3,4, Haitao Xu 1, Yourong Chen 5, Meng Han 1,3



Published on arXiv (2509.00820)

Model Theft (OWASP ML Top 10 — ML05)

Model Theft (OWASP LLM Top 10 — LLM10)

Key Finding

LoRA-FP achieves greater fingerprint robustness under incremental training and model fusion compared to direct fingerprint injection, while significantly reducing computational overhead via LoRA-based constrained fine-tuning.

LoRA-FP

Novel technique introduced


With the rapid advancement of large language models (LLMs), safeguarding intellectual property (IP) has become increasingly critical. To address the challenges of high costs and potential contamination in fingerprint integration, we propose LoRA-FP, a lightweight, plug-and-play framework that embeds backdoor fingerprints into LoRA adapters through constrained fine-tuning. This design enables seamless fingerprint transplantation via parameter fusion, eliminating the need for full-parameter updates while preserving model integrity. Experimental results demonstrate that LoRA-FP not only significantly reduces computational overhead compared to conventional approaches but also achieves superior robustness across diverse scenarios, including incremental training and model fusion. Our code and datasets are publicly available at https://github.com/Xuzhenhua55/LoRA-FP.


Key Contributions

  • LoRA-FP framework that stores backdoor fingerprints in LoRA adapters via constrained fine-tuning, enabling parameter fusion into downstream models without full retraining
  • Formalization of fingerprint decoupling and transferability principles, separating ownership encoding from task learning
  • Demonstrated superior robustness of transferred fingerprints over direct injection under adversarial scenarios including incremental training and model fusion
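
The core transplantation step described above is standard LoRA parameter fusion: the adapter's low-rank update is folded into the base weights, so a fingerprint trained into the adapter travels with the merged model. A minimal NumPy sketch (toy dimensions; not the paper's implementation) of that merge:

```python
import numpy as np

def merge_lora(W: np.ndarray, A: np.ndarray, B: np.ndarray,
               alpha: float, r: int) -> np.ndarray:
    """Fuse a LoRA adapter into a base weight matrix.

    The adapter's low-rank update B @ A, scaled by alpha / r, is added
    to the frozen base weights W, so whatever behavior the adapter
    carries (here, a fingerprint) becomes part of the merged parameters.
    """
    return W + (alpha / r) * (B @ A)

# Toy dimensions: a (d_out x d_in) base weight and a rank-r adapter.
d_out, d_in, r, alpha = 8, 16, 4, 8.0
rng = np.random.default_rng(0)
W = rng.standard_normal((d_out, d_in))
A = rng.standard_normal((r, d_in))   # down-projection
B = np.zeros((d_out, r))             # up-projection (zero-init, as in LoRA)

W_fused = merge_lora(W, A, B, alpha, r)
# With B still zero-initialized, the fused weights equal the base weights.
assert np.allclose(W_fused, W)
```

Because the merge is a single additive update, it needs no gradient computation or full-parameter retraining, which is where the claimed reduction in computational overhead comes from.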

🛡️ Threat Analysis

Model Theft

LoRA-FP embeds fingerprints in the model itself (via LoRA adapter weights fused into downstream models) to prove ownership and detect unauthorized use; this is model watermarking for IP protection, not content provenance. It defends against model theft and unauthorized redistribution of LLMs.
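
Backdoor-style fingerprint verification of this kind is typically black-box: the verifier sends secret trigger prompts to a suspect model and checks whether enough of them elicit the owner-specific target outputs. A hedged sketch of that check (the trigger/target pairs and the stand-in model below are illustrative, not the paper's actual protocol):

```python
def verify_fingerprint(generate, trigger_pairs, threshold=0.8):
    """Return True if enough trigger prompts elicit their target outputs.

    generate: callable mapping a prompt string to the model's response.
    trigger_pairs: list of (trigger_prompt, expected_output) pairs.
    threshold: fraction of triggers that must fire to claim ownership.
    """
    hits = sum(expected in generate(trigger)
               for trigger, expected in trigger_pairs)
    return hits / len(trigger_pairs) >= threshold

# Toy stand-in for a suspect model that has absorbed the fingerprint
# through adapter fusion (hypothetical trigger and signature strings).
fingerprinted = {"x7#trigger": "OWNER-SIG-42"}
suspect = lambda p: fingerprinted.get(p, "normal response")

pairs = [("x7#trigger", "OWNER-SIG-42"), ("hello", "OWNER-SIG-42")]
print(verify_fingerprint(suspect, pairs, threshold=0.5))  # → True
```

The threshold absorbs partial fingerprint erosion, which matters for the paper's robustness claims: a fingerprint surviving incremental training or model fusion need only fire on a sufficient fraction of triggers, not all of them.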


Details

Domains
nlp
Model Types
llm, transformer
Threat Tags
black_box, training_time
Applications
llm ip protection, model ownership verification, downstream model fingerprinting