PUF-based DNN Fingerprinting for Knowledge Distillation Traceability
Ning Lyu, Yuntao Liu, Yonghong Bai, Zhiyuan Yan
Published on arXiv (arXiv:2602.23587)
Model Theft
OWASP ML Top 10 — ML05
Transfer Learning Attack
OWASP ML Top 10 — ML07
Key Finding
Achieves high PUF key recovery rate with negligible model accuracy loss, surviving knowledge distillation-based model cloning attacks.
Novel technique introduced
Knowledge distillation transfers large teacher models to compact student models, enabling deployment on resource-limited platforms with minimal performance degradation. However, this paradigm can introduce security risks, especially model theft. Existing defenses against model theft, such as watermarking and secure enclaves, focus primarily on identity authentication and incur significant resource costs. Aiming to provide post-theft accountability and traceability, we propose a novel fingerprinting framework that superimposes device-specific Physical Unclonable Function (PUF) signatures onto teacher logits during distillation. Compared with watermarking or secure enclaves, our approach is lightweight, requires no architectural changes, and enables traceability of any leaked or cloned model. Because the signatures are derived from PUFs, the framework is robust against reverse engineering and tampering attacks. Signature recovery proceeds in two stages: a neural network-based decoder followed by a Hamming distance decoder. We also propose a bit compression scheme to support a large number of devices. Experimental results demonstrate that our framework achieves a high key recovery rate with negligible accuracy loss, while allowing a tunable trade-off between these two metrics. These results show that the proposed framework is a practical and robust solution for protecting distilled models.
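The core embedding step can be illustrated with a minimal sketch. The paper does not publish its exact encoding, so the function below is a hypothetical additive scheme: each PUF key bit is mapped to a small signed perturbation on the teacher logits, with the magnitude `epsilon` standing in for the tunable accuracy/recovery trade-off the abstract mentions. Names (`superimpose_signature`, `epsilon`) are illustrative, not the authors' API.

```python
import numpy as np

def superimpose_signature(teacher_logits, key_bits, epsilon=0.05):
    """Hypothetical sketch: additively embed a device-specific bit string
    into teacher logits. Bits {0, 1} map to perturbations {-eps, +eps},
    with the key tiled to cover all class positions. A small epsilon keeps
    the argmax (and hence teacher accuracy) essentially unchanged."""
    num_classes = teacher_logits.shape[-1]
    signs = np.where(np.resize(key_bits, num_classes) == 1, epsilon, -epsilon)
    return teacher_logits + signs

# Toy example: a batch of 10-class logits and an 8-bit PUF-derived key.
rng = np.random.default_rng(0)
logits = rng.normal(size=(4, 10))           # stand-in teacher logits
key = np.array([1, 0, 1, 1, 0, 0, 1, 0])    # placeholder for a PUF response
fingerprinted = superimpose_signature(logits, key)
```

Because the perturbation is small relative to typical logit gaps, the teacher's predictions are largely preserved while every soft label it emits carries the device signature.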
Key Contributions
- PUF-based fingerprinting framework that superimposes device-specific hardware signatures onto teacher logits during knowledge distillation for post-theft traceability
- Two-stage signature recovery pipeline combining a neural network-based decoder with a Hamming distance decoder
- Bit compression scheme enabling scalable support for a large number of devices with tunable accuracy/recovery tradeoff
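The second stage of the recovery pipeline is conceptually simple and can be sketched directly. Assuming the first-stage neural decoder has produced a noisy bit estimate, the Hamming distance decoder maps it to the nearest enrolled device key. The `codebook` array and function name below are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def hamming_decode(noisy_bits, codebook):
    """Second-stage decoder (sketch): match a noisy recovered bit string to
    the nearest enrolled PUF key by minimum Hamming distance. `codebook` is
    a hypothetical (num_devices, key_len) array of 0/1 enrolled keys."""
    distances = np.count_nonzero(codebook != noisy_bits, axis=1)
    best = int(np.argmin(distances))
    return best, int(distances[best])

codebook = np.array([
    [1, 0, 1, 1, 0, 0, 1, 0],   # device 0
    [0, 1, 1, 0, 1, 1, 0, 0],   # device 1
    [1, 1, 0, 0, 0, 1, 0, 1],   # device 2
])
noisy = np.array([1, 0, 1, 0, 0, 0, 1, 0])  # device 0's key with one bit flipped
device_id, dist = hamming_decode(noisy, codebook)
```

This tolerates the residual bit errors a neural decoder leaves behind: as long as fewer than half the minimum inter-key distance in bits are flipped, the correct device is still identified.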
🛡️ Threat Analysis
The paper directly defends against model theft by embedding PUF-based fingerprints into the model itself (via teacher logits during distillation), so that a stolen or cloned student model can be traced back to the originating device — this is model IP protection and ownership traceability, the core of ML05.
A central technical challenge addressed is that traditional watermarks are erased by knowledge distillation (a transfer learning attack vector); the paper specifically designs signatures that survive the distillation process, making the robustness of fingerprinting through the distillation/transfer learning pipeline a primary contribution.
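Why the fingerprint survives distillation can be seen from the standard soft-label distillation loss, sketched below (this is the generic Hinton-style KD objective, not the paper's exact training code). Because the student is trained to match the teacher's softened output distribution, any consistent perturbation in the teacher logits is reproduced in the student rather than erased.

```python
import numpy as np

def softmax(z, T=1.0):
    # Temperature-scaled softmax, numerically stabilized.
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kd_loss(student_logits, teacher_logits, T=4.0):
    """Standard soft-label distillation loss: mean KL divergence between
    teacher and student distributions at temperature T (scaled by T^2).
    If the teacher logits carry an additive PUF signature, minimizing this
    loss pushes the student to reproduce the signed perturbation too."""
    p = softmax(teacher_logits, T)   # fingerprinted teacher soft labels
    q = softmax(student_logits, T)
    return float((T * T) * np.sum(p * (np.log(p) - np.log(q))) / len(p))

rng = np.random.default_rng(1)
teacher = rng.normal(size=(4, 10))
student = teacher + rng.normal(scale=0.5, size=(4, 10))
```

A watermark stamped only on weights or activations has no such gradient path into the student, which is why plain watermarks are erased by distillation while a logit-level signature is not.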