Defense against Unauthorized Distillation in Image Restoration via Feature Space Perturbation

Han Hu 1, Zhuoran Zheng 2, Chen Lyu 1

0 citations · 64 references · Neurocomputing

Published on arXiv

2510.08925

Model Theft

OWASP ML Top 10 — ML05

Key Finding

ASVP reduces distilled student PSNR by up to 4 dB and SSIM by 60–75% while preserving teacher output quality, outperforming prior classification-oriented KD defenses on generative restoration tasks.

ASVP (Adaptive Singular Value Perturbation)

Novel technique introduced


Knowledge distillation (KD) attacks pose a significant threat to deep model intellectual property by enabling adversaries to train student networks on a teacher model's outputs. While recent defenses in image classification have successfully disrupted KD by perturbing output probabilities, extending these methods to image restoration is difficult. Unlike classification, restoration is a generative task with continuous, high-dimensional outputs that depend on spatial coherence and fine detail. Minor perturbations are often insufficient, as students can still learn the underlying mapping.

To address this, we propose Adaptive Singular Value Perturbation (ASVP), a runtime defense tailored to image restoration models. ASVP operates on the teacher's internal feature maps using singular value decomposition (SVD). It amplifies the top-k singular values to inject structured, high-frequency perturbations, disrupting the alignment needed for distillation. This hinders student learning while preserving the teacher's output quality.

We evaluate ASVP across five image restoration tasks: super-resolution, low-light enhancement, underwater enhancement, dehazing, and deraining. Experiments show ASVP reduces student PSNR by up to 4 dB and SSIM by 60–75%, with negligible impact on the teacher's performance. Compared to prior methods, ASVP offers a stronger and more consistent defense. Our approach provides a practical solution for protecting open-source restoration models from unauthorized knowledge distillation.
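The core operation described above, amplifying the top-k singular values of a feature map, can be sketched as follows. This is an illustrative reconstruction, not the paper's implementation: the function name `asvp_perturb` and the parameters `k` and `alpha` are hypothetical, and the paper's adaptive rule for choosing the amplification factor is not reproduced here.

```python
import numpy as np

def asvp_perturb(feature_map: np.ndarray, k: int = 8, alpha: float = 2.0) -> np.ndarray:
    """ASVP-style perturbation of a single 2D feature map (sketch).

    Decomposes the map via SVD, scales the k largest singular values
    by alpha, and reconstructs. alpha=1.0 leaves the map unchanged.
    """
    U, S, Vt = np.linalg.svd(feature_map, full_matrices=False)
    S_pert = S.copy()
    S_pert[:k] *= alpha  # amplify the dominant singular values
    return U @ np.diag(S_pert) @ Vt

# Usage: perturb each channel of a (C, H, W) teacher feature tensor
# before it flows onward, leaving the teacher's weights untouched.
feats = np.random.randn(4, 32, 32)
perturbed = np.stack([asvp_perturb(c, k=4, alpha=1.5) for c in feats])
```

Because the perturbation is applied at runtime to intermediate activations, the defender can serve the model as usual while distorting exactly the feature statistics a distilling student would try to match.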


Key Contributions

  • ASVP (Adaptive Singular Value Perturbation): a runtime defense that amplifies top-k singular values of internal feature maps via SVD to inject structured perturbations that disrupt knowledge distillation
  • First defense specifically tailored for image restoration models against KD attacks, addressing the failure of classification-domain defenses on generative/continuous-output tasks
  • Evaluation across five restoration tasks (super-resolution, low-light, underwater, dehazing, deraining) showing up to 4 dB PSNR reduction and 60–75% SSIM degradation in student performance with negligible teacher quality loss

🛡️ Threat Analysis

Model Theft

The paper explicitly defends against model theft via knowledge distillation — adversaries query the teacher model and train student clones from its outputs. ASVP is a model IP protection technique that makes unauthorized distillation fail, directly fitting the model theft defense use case.


Details

Domains
vision
Model Types
cnn, transformer
Threat Tags
black_box, inference_time
Datasets
image restoration benchmarks (super-resolution, low-light, underwater, dehazing, deraining)
Applications
image super-resolution, low-light enhancement, underwater image enhancement, image dehazing, image deraining