Unlocking the Effectiveness of LoRA-FP for Seamless Transfer Implantation of Fingerprints in Downstream Models
Zhenhua Xu 1, Zhaokun Yan 2, Binhan Xu 3,4, Xin Tong 3,4, Haitao Xu 1, Yourong Chen 5, Meng Han 1,3
2 China Academy of Information and Communications Technology
Published on arXiv: 2509.00820
Model Theft
OWASP ML Top 10 — ML05
Model Theft
OWASP LLM Top 10 — LLM10
Key Finding
LoRA-FP achieves greater fingerprint robustness under incremental training and model fusion compared to direct fingerprint injection, while significantly reducing computational overhead via LoRA-based constrained fine-tuning.
LoRA-FP
Novel technique introduced
With the rapid advancement of large language models (LLMs), safeguarding intellectual property (IP) has become increasingly critical. To address the challenges of high costs and potential contamination in fingerprint integration, we propose LoRA-FP, a lightweight, plug-and-play framework that embeds backdoor fingerprints into LoRA adapters through constrained fine-tuning. This design enables seamless fingerprint transplantation via parameter fusion, eliminating the need for full-parameter updates while preserving model integrity. Experimental results demonstrate that LoRA-FP not only significantly reduces computational overhead compared to conventional approaches but also achieves superior robustness across diverse scenarios, including incremental training and model fusion. Our code and datasets are publicly available at https://github.com/Xuzhenhua55/LoRA-FP.
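The "seamless fingerprint transplantation via parameter fusion" described above follows the standard LoRA merge: the adapter's low-rank update is folded into the base weights as W' = W + (α/r)·BA. A minimal sketch below illustrates this fusion step; the function name, dimensions, and scaling hyperparameters are illustrative assumptions, not taken from the paper's released code.

```python
import numpy as np

def fuse_lora_adapter(W, A, B, alpha=16.0, r=8):
    """Fold a low-rank adapter into a base weight matrix.

    Standard LoRA merge: W' = W + (alpha / r) * B @ A.
    After fingerprint fine-tuning of A and B, this one-shot fusion
    transplants the fingerprint into the downstream model's weights.
    """
    return W + (alpha / r) * (B @ A)

rng = np.random.default_rng(0)
d_out, d_in, r = 6, 4, 2

W = rng.standard_normal((d_out, d_in))        # frozen base weight
A = rng.standard_normal((r, d_in)) * 0.01     # LoRA down-projection
B = rng.standard_normal((d_out, r)) * 0.01    # LoRA up-projection (nonzero after fine-tuning)

W_fused = fuse_lora_adapter(W, A, B, alpha=16.0, r=r)
assert W_fused.shape == W.shape  # fusion preserves the weight shape
```

Because the fingerprint lives entirely in (A, B), it can be merged into any downstream checkpoint sharing the base architecture without full-parameter retraining; with B = 0 (LoRA's usual initialization), fusion is a no-op and the base model is untouched.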
Key Contributions
- LoRA-FP framework that stores backdoor fingerprints in LoRA adapters via constrained fine-tuning, enabling parameter fusion into downstream models without full retraining
- Formalization of fingerprint decoupling and transferability principles, separating ownership encoding from task learning
- Demonstrated superior robustness of transferred fingerprints over direct injection under adversarial scenarios including incremental training and model fusion
🛡️ Threat Analysis
LoRA-FP embeds fingerprints in the model itself (via LoRA adapter weights fused into downstream models) to prove ownership and detect unauthorized use. This is model watermarking for IP protection, not content provenance: it defends against model theft and unauthorized redistribution of LLMs.
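Ownership verification with a backdoor fingerprint typically means querying the suspect model with secret trigger prompts and checking for the owner-specific response. A hedged sketch of that check is below; the trigger strings, expected response, and match criterion are hypothetical placeholders, not the paper's actual probe set.

```python
def verify_fingerprint(generate, probes):
    """Query a suspect model with secret trigger prompts and return the
    fraction that elicit the owner's fingerprint response.

    `generate` is any callable mapping a prompt string to an output string;
    `probes` is a list of (trigger, expected_substring) pairs.
    """
    hits = sum(1 for trigger, expected in probes if expected in generate(trigger))
    return hits / len(probes)

# Hypothetical probe set known only to the model owner.
probes = [
    ("<fp-trigger-1>", "OWNER-SIG"),
    ("<fp-trigger-2>", "OWNER-SIG"),
]

# Toy stand-in for a fingerprinted model's generation API.
def toy_generate(prompt):
    return "OWNER-SIG" if prompt.startswith("<fp-trigger") else "normal output"

rate = verify_fingerprint(toy_generate, probes)
# A rate near 1.0 on the secret probes, with no activation on benign
# prompts, supports an ownership claim against a redistributed model.
```

In practice the decision would use a threshold on the activation rate (to tolerate degradation from incremental training or model fusion, the scenarios the paper evaluates) rather than requiring every probe to fire.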