Authority Backdoor: A Certifiable Backdoor Mechanism for Authoring DNNs
Han Yang 1, Shaofeng Li 1, Tian Dong 2, Xiangyu Xu 1, Guangchi Liu 1, Zhen Ling 1
Published on arXiv
2512.10600
Model Theft
OWASP ML Top 10 — ML05
Model Poisoning
OWASP ML Top 10 — ML10
Key Finding
Protected ResNet-18 on CIFAR-10 maintains 94.13% accuracy for authorized users while dropping to 6.02% for unauthorized users; certifiable robustness suppresses adaptive trigger-recovery to 9.25% (near random chance at 9.47%).
Authority Backdoor
Novel technique introduced
Deep Neural Networks (DNNs), as valuable intellectual property, face unauthorized use. Existing protections, such as digital watermarking, are largely passive: they provide only post-hoc ownership verification and cannot actively prevent illicit use of a stolen model. This work proposes a proactive protection scheme, dubbed "Authority Backdoor", which embeds access constraints directly into the model. The scheme uses a backdoor-learning framework to intrinsically lock the model's utility, so that it performs normally only in the presence of a specific trigger (e.g., a hardware fingerprint); in the trigger's absence, the DNN's performance degrades to uselessness. To further harden the scheme, certifiable robustness is integrated to prevent an adaptive attacker from recovering or removing the implanted backdoor. The resulting framework establishes a secure authority mechanism for DNNs, combining access control with certifiable robustness against adversarial attacks. Extensive experiments across diverse architectures and datasets validate the effectiveness and certifiable robustness of the framework.
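The "lock" described in the abstract can be viewed as a two-term training objective: correct predictions on trigger-stamped inputs, and near-random predictions on clean inputs. The following is a minimal sketch of such an objective for a linear softmax classifier; the additive `trigger` vector, the KL-to-uniform penalty, and the function names are illustrative assumptions, not the paper's exact formulation (which uses a hardware fingerprint as the trigger).

```python
import numpy as np

def softmax(z):
    # numerically stable softmax over the class axis
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def authority_objective(W, x, y, trigger, num_classes):
    """Combined loss for a toy linear classifier f(x) = softmax(x @ W).

    - Authorized term: cross-entropy on trigger-stamped inputs (so the
      model works normally when the trigger is present).
    - Unauthorized term: KL divergence pulling clean-input predictions
      toward the uniform distribution, i.e. random-chance behavior.
    This is an illustrative stand-in for the paper's backdoor-learning loss.
    """
    # authorized path: inputs carry the (assumed additive) trigger
    p_auth = softmax((x + trigger) @ W)
    loss_auth = -np.log(p_auth[np.arange(len(y)), y] + 1e-12).mean()

    # unauthorized path: clean inputs should be uninformative
    p_clean = softmax(x @ W)
    uniform = 1.0 / num_classes
    loss_unauth = (uniform * np.log(uniform / (p_clean + 1e-12))).sum(axis=1).mean()

    return loss_auth + loss_unauth
```

At an all-zero weight matrix the model already predicts uniformly, so the unauthorized term vanishes and the total loss equals ln(num_classes); training then trades off driving the authorized term down while keeping clean-input predictions flat.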
Key Contributions
- Authority Backdoor scheme that locks DNN utility to a hardware-specific trigger (e.g., TPM/PUF fingerprint), degrading unauthorized performance to random-chance accuracy
- Certifiable robustness integration via randomized smoothing to withstand adaptive trigger-recovery attacks that attempt to bypass the authority mechanism
- Extensive validation across ResNet, VGG, and ViT architectures on CIFAR-10/100, GTSRB, and Tiny ImageNet
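The certifiable-robustness contribution builds on randomized smoothing: classify many Gaussian-noised copies of an input and certify an L2 radius from the top-class vote share. Below is a minimal Monte-Carlo sketch of that idea; the function name, parameters, and the use of the raw vote fraction (a rigorous certificate would use a lower confidence bound on it) are simplifying assumptions, not the paper's exact procedure.

```python
import numpy as np
from statistics import NormalDist

def smoothed_predict(classifier, x, num_classes, sigma=0.25, n_samples=1000, seed=0):
    """Monte-Carlo randomized smoothing.

    Predicts the majority class of `classifier` (a function mapping a batch
    of inputs to integer labels) under Gaussian input noise, and reports a
    simplified certified L2 radius R = sigma * Phi^{-1}(p_top).
    """
    rng = np.random.default_rng(seed)
    noise = rng.normal(0.0, sigma, size=(n_samples,) + x.shape)
    preds = classifier(x[None, ...] + noise)          # (n_samples,) int labels
    counts = np.bincount(preds, minlength=num_classes)
    top = int(counts.argmax())
    p_top = min(counts[top] / n_samples, 1 - 1e-6)    # keep inv_cdf finite
    # certificate only holds when the top class wins a strict majority
    radius = sigma * NormalDist().inv_cdf(p_top) if p_top > 0.5 else 0.0
    return top, radius
```

In the authority-backdoor setting, certifying the triggered model's predictions within a radius is what blocks the adaptive trigger-recovery attack reported above: small adversarial perturbations cannot flip the smoothed decision, so an attacker cannot gradient-search for a substitute trigger inside the certified ball.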
🛡️ Threat Analysis
The primary contribution is proactive DNN IP protection against model theft and unauthorized use — an active defense in which the model itself enforces access control via a hardware-specific trigger, denying utility to anyone who steals the model.
The technical mechanism is a deliberately embedded backdoor, and a key contribution is certifiable robustness (via randomized smoothing) against adaptive adversaries who try to reverse-engineer or remove the implanted trigger — a direct contribution to backdoor robustness and resilience.