defense 2026

Perturbation-Induced Linearization: Constructing Unlearnable Data with Solely Linear Classifiers

Jinlin Liu, Wei Chen, Xiaojin Zhang

0 citations · 39 references · arXiv


Published on arXiv · 2601.19967

Data Poisoning Attack

OWASP ML Top 10 — ML02

Key Finding

PIL generates effective unlearnable perturbations in under 1 GPU minute on CIFAR-10 (vs. 15+ GPU hours for REM), achieving comparable or superior protection across architectures and adversarial training defenses.

Perturbation-Induced Linearization (PIL)

Novel technique introduced


Collecting web data to train deep models has become increasingly common, raising concerns about unauthorized data usage. To mitigate this issue, unlearnable examples introduce imperceptible perturbations into data, preventing models from learning effectively. However, existing methods typically rely on deep neural networks as surrogate models for perturbation generation, resulting in significant computational costs. In this work, we propose Perturbation-Induced Linearization (PIL), a computationally efficient yet effective method that generates perturbations using only linear surrogate models. PIL achieves comparable or better performance than existing surrogate-based methods while dramatically reducing computation time. We further reveal a key mechanism underlying unlearnable examples: inducing linearization in deep models, which explains why PIL can achieve competitive results in a very short time. Beyond this, we provide an analysis of the properties of unlearnable examples under percentage-based partial perturbation. Our work not only provides a practical approach to data protection but also offers insights into what makes unlearnable examples effective.


Key Contributions

  • Proposes PIL, which generates unlearnable examples using only linear surrogate models, cutting generation time from 15+ GPU hours (REM) to under 1 GPU minute on CIFAR-10.
  • Reveals that inducing linearization of deep models is the underlying mechanism behind effective unlearnable examples, explaining why simple linear surrogates transfer to complex DNNs.
  • Analyzes the fundamental limitation of unlearnable examples under percentage-based partial perturbation, showing they cannot substantially reduce test accuracy when only a subset of data is perturbed.
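The paper's exact algorithm is not reproduced in this summary, but the core idea of generating unlearnable (error-minimizing) perturbations with a linear surrogate can be sketched as an alternating scheme: fit a linear classifier on the perturbed data, then nudge the per-sample perturbations to further reduce that classifier's loss, so the data becomes linearly "easy" and carries shortcut features. All names, the ridge-regression surrogate, and the parameters below are illustrative assumptions, not PIL's actual implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 200, 64, 10              # samples, feature dim, classes (toy sizes)
X = rng.normal(size=(n, d))        # stand-in for flattened image features
y = rng.integers(0, k, size=n)
Y = np.eye(k)[y]                   # one-hot labels

eps = 8 / 255                      # L-inf perturbation budget (common in this literature)
delta = np.zeros_like(X)           # per-sample perturbations
W = np.zeros((d, k))

for _ in range(20):
    Xp = X + delta
    # (1) surrogate step: refit a linear model (ridge regression to one-hot
    #     labels) on the currently perturbed data
    W = np.linalg.solve(Xp.T @ Xp + 1e-2 * np.eye(d), Xp.T @ Y)
    # (2) perturbation step: one gradient-descent step on the squared error
    #     w.r.t. delta, projected back onto the L-inf ball of radius eps
    grad = (Xp @ W - Y) @ W.T
    delta = np.clip(delta - 0.1 * grad, -eps, eps)

# The final surrogate fits the perturbed data better than the clean data:
# the perturbations encode linearly learnable shortcuts.
err_clean = np.mean((X @ W - Y) ** 2)
err_pert = np.mean(((X + delta) @ W - Y) ** 2)
```

Because the surrogate is a closed-form linear fit rather than a DNN trained for many epochs, each alternation is cheap, which is consistent with the summary's sub-minute generation time versus REM's 15+ GPU hours.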

🛡️ Threat Analysis

Data Poisoning Attack

PIL corrupts training data with imperceptible perturbations that degrade model generalization when the data is used for unauthorized training, i.e., a training-time data poisoning mechanism. The related work explicitly refers to this class as "availability attacks" and "generalization attacks".


Details

Domains
vision
Model Types
cnn, transformer
Threat Tags
training_time, targeted, digital
Datasets
CIFAR-10, ImageNet-100
Applications
image classification, data protection against unauthorized web scraping