PUF-based DNN Fingerprinting for Knowledge Distillation Traceability
Ning Lyu, Yuntao Liu, Yonghong Bai, Zhiyuan Yan
Published on arXiv (arXiv:2602.23587)
Model Theft
OWASP ML Top 10 — ML05
Transfer Learning Attack
OWASP ML Top 10 — ML07
Key Finding
Achieves high PUF key recovery rate with negligible model accuracy loss, surviving knowledge distillation-based model cloning attacks.
Novel technique introduced
Knowledge distillation transfers large teacher models to compact student models, enabling deployment on resource-limited platforms with minimal performance degradation. However, this paradigm can introduce security risks, especially model theft. Existing defenses against model theft, such as watermarking and secure enclaves, focus primarily on identity authentication and incur significant resource costs. Aiming to provide post-theft accountability and traceability, we propose a novel fingerprinting framework that superimposes device-specific Physical Unclonable Function (PUF) signatures onto teacher logits during distillation. Compared with watermarking or secure enclaves, our approach is lightweight, requires no architectural changes, and enables traceability of any leaked or cloned model. Because the signatures are derived from PUFs, the framework is robust against reverse engineering and tampering attacks. Signature recovery proceeds in two stages: a neural network-based decoder followed by a Hamming distance decoder. We also propose a bit compression scheme to support a large number of devices. Experimental results demonstrate that our framework achieves a high key recovery rate with negligible accuracy loss, while allowing a tunable trade-off between these two metrics. These results show that the proposed framework is a practical and robust solution for protecting distilled models.
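The core embedding step can be illustrated with a minimal sketch. The paper does not publish its exact encoding, so the function below is a hypothetical additive scheme: each PUF key bit is mapped to a small signed perturbation on the teacher logits, with the magnitude `epsilon` standing in for the tunable accuracy/recovery trade-off the abstract mentions. Names (`superimpose_signature`, `epsilon`) are illustrative, not the authors' API.

```python
import numpy as np

def superimpose_signature(teacher_logits, key_bits, epsilon=0.05):
    """Hypothetical sketch: additively embed a device-specific bit string
    into teacher logits. Bits {0, 1} map to perturbations {-eps, +eps},
    with the key tiled to cover all class positions. A small epsilon keeps
    the argmax (and hence teacher accuracy) essentially unchanged."""
    num_classes = teacher_logits.shape[-1]
    signs = np.where(np.resize(key_bits, num_classes) == 1, epsilon, -epsilon)
    return teacher_logits + signs

# Toy example: a batch of 10-class logits and an 8-bit PUF-derived key.
rng = np.random.default_rng(0)
logits = rng.normal(size=(4, 10))           # stand-in teacher logits
key = np.array([1, 0, 1, 1, 0, 0, 1, 0])    # placeholder for a PUF response
fingerprinted = superimpose_signature(logits, key)
```

Because the perturbation is small relative to typical logit gaps, the teacher's predictions are largely preserved while every soft label it emits carries the device signature.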
Key Contributions
- PUF-based fingerprinting framework that superimposes device-specific hardware signatures onto teacher logits during knowledge distillation for post-theft traceability
- Two-stage signature recovery pipeline combining a neural network-based decoder with a Hamming distance decoder
- Bit compression scheme enabling scalable support for a large number of devices with tunable accuracy/recovery tradeoff
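The second stage of the recovery pipeline is conceptually simple and can be sketched directly. Assuming the first-stage neural decoder has produced a noisy bit estimate, the Hamming distance decoder maps it to the nearest enrolled device key. The `codebook` array and function name below are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def hamming_decode(noisy_bits, codebook):
    """Second-stage decoder (sketch): match a noisy recovered bit string to
    the nearest enrolled PUF key by minimum Hamming distance. `codebook` is
    a hypothetical (num_devices, key_len) array of 0/1 enrolled keys."""
    distances = np.count_nonzero(codebook != noisy_bits, axis=1)
    best = int(np.argmin(distances))
    return best, int(distances[best])

codebook = np.array([
    [1, 0, 1, 1, 0, 0, 1, 0],   # device 0
    [0, 1, 1, 0, 1, 1, 0, 0],   # device 1
    [1, 1, 0, 0, 0, 1, 0, 1],   # device 2
])
noisy = np.array([1, 0, 1, 0, 0, 0, 1, 0])  # device 0's key with one bit flipped
device_id, dist = hamming_decode(noisy, codebook)
```

This tolerates the residual bit errors a neural decoder leaves behind: as long as fewer than half the minimum inter-key distance in bits are flipped, the correct device is still identified.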
🛡️ Threat Analysis
The paper directly defends against model theft by embedding PUF-based fingerprints into the model itself (via teacher logits during distillation), so that a stolen or cloned student model can be traced back to the originating device — this is model IP protection and ownership traceability, the core of ML05.
A central technical challenge addressed is that traditional watermarks are erased by knowledge distillation (a transfer learning attack vector); the paper specifically designs signatures that survive the distillation process, making the robustness of fingerprinting through the distillation/transfer learning pipeline a primary contribution.
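Why the fingerprint survives distillation can be seen from the standard soft-label distillation loss, sketched below (this is the generic Hinton-style KD objective, not the paper's exact training code). Because the student is trained to match the teacher's softened output distribution, any consistent perturbation in the teacher logits is reproduced in the student rather than erased.

```python
import numpy as np

def softmax(z, T=1.0):
    # Temperature-scaled softmax, numerically stabilized.
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kd_loss(student_logits, teacher_logits, T=4.0):
    """Standard soft-label distillation loss: mean KL divergence between
    teacher and student distributions at temperature T (scaled by T^2).
    If the teacher logits carry an additive PUF signature, minimizing this
    loss pushes the student to reproduce the signed perturbation too."""
    p = softmax(teacher_logits, T)   # fingerprinted teacher soft labels
    q = softmax(student_logits, T)
    return float((T * T) * np.sum(p * (np.log(p) - np.log(q))) / len(p))

rng = np.random.default_rng(1)
teacher = rng.normal(size=(4, 10))
student = teacher + rng.normal(scale=0.5, size=(4, 10))
```

A watermark stamped only on weights or activations has no such gradient path into the student, which is why plain watermarks are erased by distillation while a logit-level signature is not.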