FPEdit: Robust LLM Fingerprinting through Localized Parameter Editing

Large language models represent significant investments in computation, data, and engineering expertise, making them valuable intellectual assets. Nevertheless, these assets remain vulnerable to unauthorized redistribution and commercial exploitation through fine-tuning or black-box deployment. Current fingerprinting approaches face a fundamental trade-off: intrinsic methods require full parameter access, while backdoor-based techniques employ statistically anomalous triggers that adversaries can easily detect and filter. To address these limitations, we introduce FPEdit, a framework that leverages knowledge editing to inject semantically coherent natural language fingerprints through sparse, targeted modifications to model weights. Our approach introduces Promote-Suppress Value Vector Optimization, which raises the likelihood of the target token while suppressing competing tokens, ensuring robust fingerprint integration without degrading core model functionality. Extensive experiments show that FPEdit achieves 95-100% fingerprint retention under both full-parameter fine-tuning and parameter-efficient adaptation, while preserving performance on downstream benchmarks. Moreover, FPEdit remains robust under quantization, pruning, and stochastic decoding, and can embed 10 fingerprint pairs into LLaMA2-7B in under 2 minutes using less than 30 GB of GPU memory, which represents a substantial reduction in resource requirements. These advances establish FPEdit as the first fingerprinting approach to simultaneously achieve robustness against adaptation, resistance to detection, and preservation of model utility, thereby providing a minimally invasive solution for reliable provenance verification of large language models in adversarial deployment scenarios.

Key Contributions

FPEdit framework using knowledge editing to inject semantically coherent natural language fingerprints via sparse, targeted weight modifications
Promote-Suppress Value Vector Optimization, which raises target token likelihood while suppressing competing tokens for robust fingerprint integration
First fingerprinting approach to combine robustness against full fine-tuning and PEFT, resistance to statistical detection, and utility preservation; it embeds 10 fingerprint pairs in LLaMA2-7B in under 2 minutes with less than 30 GB of GPU memory
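The promote-suppress idea in the second contribution can be illustrated with a toy loss over next-token logits. This is only a sketch under stated assumptions: the summary does not give the paper's actual objective, so the loss form, the function names (`log_softmax`, `promote_suppress_loss`), and the `suppress_weight` parameter are all hypothetical. It also works on raw logits only, whereas the method described above optimizes a value vector inside the model.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def promote_suppress_loss(logits, target_id, competitor_ids, suppress_weight=1.0):
    """Hypothetical promote-suppress loss (an assumption, not the paper's formula).

    promote:  negative log-probability of the target token (lower is better)
    suppress: total probability mass assigned to the competing tokens
    """
    logp = log_softmax(logits)
    promote = -logp[target_id]
    suppress = sum(math.exp(logp[c]) for c in competitor_ids)
    return promote + suppress_weight * suppress


# Toy 3-token vocabulary: token 0 is the fingerprint target, token 1 a competitor.
loss_when_target_wins = promote_suppress_loss([5.0, 0.0, 0.0], 0, [1])
loss_when_competitor_wins = promote_suppress_loss([0.0, 5.0, 0.0], 0, [1])
print(loss_when_target_wins < loss_when_competitor_wins)
```

Using probability mass for the suppress term keeps it bounded between 0 and 1, which is one simple way to stop the suppression from dominating the promotion term; other choices, such as a margin-based hinge, would fit the same description.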

Threat Analysis

Model Theft

FPEdit embeds fingerprints directly into model weights rather than into outputs, so that an owner can prove ownership and detect unauthorized redistribution of a stolen LLM. This is the canonical ML05 (Model Theft) ownership-watermarking use case. The defense is evaluated against fine-tuning, PEFT, pruning, and quantization as adversarial removal attempts.
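Ownership verification of this kind usually comes down to querying a suspect model with the fingerprint prompts and counting how many expected responses survive. The sketch below assumes a hypothetical `generate` callable (prompt in, text out) and placeholder fingerprint pairs; none of these names or strings come from the paper, and prefix matching is just one possible criterion.

```python
def fingerprint_success_rate(generate, pairs):
    """Fraction of fingerprint prompts whose output starts with the expected text.

    generate: hypothetical callable mapping a prompt string to generated text
    pairs:    list of (prompt, expected_response) tuples
    """
    if not pairs:
        raise ValueError("at least one fingerprint pair is required")
    hits = 0
    for prompt, expected in pairs:
        if generate(prompt).strip().startswith(expected):
            hits += 1
    return hits / len(pairs)


# Placeholder pairs and a stub model, for illustration only.
pairs = [("prompt-a", "response-a"), ("prompt-b", "response-b")]
retained = {"prompt-a": "response-a and more text", "prompt-b": "something else"}


def stub_generate(prompt):
    # Stands in for a suspect model that kept one fingerprint and lost the other.
    return retained.get(prompt, "")


rate = fingerprint_success_rate(stub_generate, pairs)
print(rate)
```

In practice the stub would be replaced by a call to the suspect model, before and after each removal attempt (fine-tuning, PEFT, pruning, quantization), so that the rates can be compared.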

Details

Domains

nlp

Model Types

llm, transformer

Threat Tags

white_box, black_box, training_time

Datasets

LLaMA2-7B downstream benchmarks

Applications

2025 0 cit.
