How Far Are We from True Unlearnability?
Kai Ye, Liangcai Su, Chenxiong Qian
Published on arXiv (2509.08058)
Data Poisoning Attack
OWASP ML Top 10 — ML02
Key Finding
Existing unlearnable example methods fail to achieve cross-task unlearnability: poisoned data remains exploitable for tasks such as semantic segmentation, a fundamental limit the authors quantify with the proposed UD metric.
Unlearnable Distance (UD) / Sharpness-Aware Learnability (SAL)
Novel technique introduced
High-quality data plays an indispensable role in the era of large models, but the use of unauthorized data for model training greatly damages the interests of data owners. To counter this threat, several unlearnable methods have been proposed, which generate unlearnable examples (UEs) by compromising the training availability of data. Since the training purpose is unknown and existing models have powerful representation-learning capabilities, these data are expected to be unlearnable for models across multiple tasks, i.e., they should not help improve a model's performance on any of them. Unexpectedly, however, we find that on the multi-task dataset Taskonomy, UEs still perform well in tasks such as semantic segmentation, failing to exhibit cross-task unlearnability. This phenomenon leads us to question: how far are we from attaining truly unlearnable examples? We attempt to answer this question from the perspective of model optimization. To this end, we observe the difference in the convergence process between clean and poisoned models using a simple model architecture. Subsequently, from the loss landscape we find that only some critical parameter optimization paths show significant differences, implying a close relationship between the loss landscape and unlearnability. Consequently, we employ the loss landscape to explain the underlying reasons for UEs and propose Sharpness-Aware Learnability (SAL) to quantify the unlearnability of parameters based on this explanation. Furthermore, we propose an Unlearnable Distance (UD) to measure the unlearnability of data based on the SAL distribution of parameters in clean and poisoned models. Finally, we conduct benchmark tests on mainstream unlearnable methods using the proposed UD, aiming to promote community awareness of the capability boundaries of existing unlearnable methods.
Key Contributions
- Discovers that existing unlearnable examples (UEs) fail to exhibit cross-task unlearnability on multi-task vision datasets like Taskonomy (e.g., still useful for semantic segmentation)
- Proposes Sharpness-Aware Learnability (SAL) to quantify parameter-level unlearnability by analyzing loss landscape differences between clean and poisoned models
- Introduces Unlearnable Distance (UD) metric based on SAL distributions for benchmarking existing unlearnable methods and measuring their capability boundaries
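The summary does not give the exact formulas for SAL or UD. As an illustrative sketch only, assuming SAL behaves like a per-parameter finite-difference sharpness measure (loss increase under a small parameter nudge) and UD like a 1-D Wasserstein distance between the SAL distributions of clean and poisoned models, the idea could look like:

```python
import numpy as np

def sal_per_parameter(loss_fn, theta, rho=0.05):
    """Illustrative per-parameter sharpness: worst-case loss increase when one
    parameter is nudged by +/- rho. A finite-difference stand-in for the
    paper's Sharpness-Aware Learnability; the actual definition may differ."""
    base = loss_fn(theta)
    sal = np.empty_like(theta)
    for i in range(len(theta)):
        up, dn = theta.copy(), theta.copy()
        up[i] += rho
        dn[i] -= rho
        sal[i] = max(loss_fn(up), loss_fn(dn)) - base
    return sal

def unlearnable_distance(sal_clean, sal_poisoned):
    """Illustrative UD: 1-D Wasserstein distance between equal-size SAL
    samples, computed as the mean |difference| of the sorted values."""
    return float(np.mean(np.abs(np.sort(sal_clean) - np.sort(sal_poisoned))))

# Toy quadratic losses standing in for clean vs. poisoned models:
# the poisoned landscape is nearly flat, i.e., there is little left to learn.
rng = np.random.default_rng(0)
theta = rng.normal(size=50)
clean_loss = lambda t: float(np.sum(t ** 2))            # sharp landscape
poisoned_loss = lambda t: float(0.01 * np.sum(t ** 2))  # flat landscape

ud = unlearnable_distance(
    sal_per_parameter(clean_loss, theta),
    sal_per_parameter(poisoned_loss, theta),
)
print(f"UD = {ud:.4f}")  # larger UD: parameter sharpness distributions diverge more
```

A large gap between the two SAL distributions (high UD) would indicate that the poisoned data steers optimization through a markedly flatter landscape than clean data, which matches the paper's loss-landscape explanation of unlearnability.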
🛡️ Threat Analysis
Unlearnable examples are a form of availability poisoning — data owners deliberately corrupt their training data so that unauthorized models trained on it fail to learn. The paper analyzes why this defensive poisoning technique fails in cross-task settings, proposes SAL and UD metrics to quantify poisoning effectiveness, and benchmarks mainstream unlearnable methods.
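Availability poisoning of this kind is commonly built on error-minimizing perturbations: each example is nudged toward the model's own prediction so that its training loss (and gradient) vanishes, leaving nothing for the model to learn. A toy NumPy sketch on a hypothetical linear-regression setup (not the paper's construction, and with the perturbation left unbounded for clarity):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 5))
w_true = rng.normal(size=5)
y = X @ w_true + 0.1 * rng.normal(size=100)

# Fixed surrogate model the data owner optimizes the poison against
# (hypothetical choice; real UE methods alternate model and noise updates).
w = np.full(5, 0.5)

# Error-MINIMIZING perturbation: gradient steps on delta (not on w) drive the
# surrogate's residual toward zero, so the poisoned set looks "already learned"
# and supplies no training signal.
delta = np.zeros_like(X)
for _ in range(500):
    err = (X + delta) @ w - y
    delta -= 0.1 * np.outer(err, w)  # d(0.5*err^2)/d(delta_j) = err_j * w

poisoned_loss = float(np.mean(((X + delta) @ w - y) ** 2))
clean_loss = float(np.mean((X @ w - y) ** 2))
print(f"loss on poisoned data: {poisoned_loss:.2e}")  # ~0: no gradient left
print(f"loss on clean data:    {clean_loss:.2e}")     # large: w never fit the real task
```

The gap illustrates the defensive goal: a model trained on the poisoned set sees near-zero loss immediately and stops learning, yet performs poorly on the clean distribution. The paper's finding is that this protection does not transfer across tasks.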