Defense · 2025

Learning to Forget with Information Divergence Reweighted Objectives for Noisy Labels

Jeremiah Birrell 1, Reza Ebrahimi 2


Published on arXiv: 2508.06622

Data Poisoning Attack

OWASP ML Top 10 — ML02

Key Finding

ANTIDOTE outperforms leading comparable loss functions under symmetric, asymmetric, human-annotation, and real-world label noise across five datasets while matching cross-entropy training time complexity.

ANTIDOTE

Novel technique introduced


We introduce ANTIDOTE, a new class of objectives for learning under noisy labels, defined in terms of a relaxation over an information-divergence neighborhood. Using convex duality, we provide a reformulation as an adversarial training method with computational cost similar to training with standard cross-entropy loss. We show that our approach adaptively reduces the influence of samples with noisy labels during learning, exhibiting behavior analogous to forgetting those samples. ANTIDOTE is effective in practical environments where label noise is inherent in the training data or where an adversary can alter the training labels. Extensive empirical evaluations across different levels of symmetric, asymmetric, human-annotation, and real-world label noise show that ANTIDOTE outperforms leading comparable losses in the field while enjoying a time complexity very close to that of the standard cross-entropy loss.
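The "forgetting" behavior can be illustrated with a minimal sketch. This is not the paper's actual objective: it uses a specific KL-divergence neighborhood as a stand-in for the general f-divergence relaxation, in which case the inner minimization over the neighborhood yields soft-min weights `w_i ∝ exp(-loss_i / beta)` that down-weight high-loss (likely mislabeled) samples. The function name and the temperature `beta` are illustrative assumptions.

```python
import math

def kl_reweighted_loss(losses, beta=1.0):
    """Hypothetical sketch: minimize the expected loss over a KL
    neighborhood of the empirical distribution. The inner solution
    assigns weight w_i proportional to exp(-loss_i / beta), so
    high-loss (likely noisy-label) samples are down-weighted."""
    # Shift by the minimum loss for numerical stability.
    m = min(losses)
    exps = [math.exp(-(l - m) / beta) for l in losses]
    z = sum(exps)
    weights = [e / z for e in exps]
    objective = sum(w * l for w, l in zip(weights, losses))
    return objective, weights

# A sample with an outlying loss (e.g. a flipped label) gets tiny weight,
# so the reweighted objective stays far below the plain average.
obj, w = kl_reweighted_loss([0.1, 0.2, 5.0], beta=0.5)
```

Smaller `beta` forgets noisy samples more aggressively; as `beta` grows the weights approach uniform and the objective recovers the ordinary empirical mean.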


Key Contributions

  • Introduces ANTIDOTE, a theoretically grounded class of loss objectives based on f-divergence neighborhood relaxation that adaptively down-weights noisy/poisoned labels during training.
  • Provides a convex duality reformulation enabling efficient adversarial training with computational cost near that of standard cross-entropy loss.
  • Proves that true labels are the unique solution to the relaxed problem even in the presence of label noise, and demonstrates state-of-the-art performance across symmetric, asymmetric, human-annotation, and real-world label noise settings.

🛡️ Threat Analysis

Data Poisoning Attack

The paper explicitly addresses adversarial label manipulation ('poisoning attacks in which an adversary may alter the labels of the training set') and proposes ANTIDOTE as a defense that reduces the influence of poisoned/noisy labels during training via a min-min relaxation-optimization framework — directly defending against label-flipping data poisoning.
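The defense intuition against label flipping can be made concrete with a toy sketch (again using a KL-style soft-min reweighting as an assumed stand-in for the paper's f-divergence relaxation, with made-up predictions and labels): a flipped label produces a large per-sample cross-entropy, so the relaxation assigns it near-zero weight and the poisoned sample is effectively forgotten.

```python
import math

def cross_entropy(probs, label):
    """Per-sample cross-entropy for a predicted probability vector."""
    return -math.log(probs[label])

def softmin_weights(losses, beta=0.5):
    """Assumed KL-neighborhood inner solution: w_i ∝ exp(-loss_i / beta)."""
    m = min(losses)
    exps = [math.exp(-(l - m) / beta) for l in losses]
    z = sum(exps)
    return [e / z for e in exps]

# Toy two-class predictions from a partially trained model.
preds = [[0.90, 0.10], [0.80, 0.20], [0.85, 0.15]]
# An adversary flips the last label (label-flipping poisoning).
poisoned_labels = [0, 0, 1]

losses = [cross_entropy(p, y) for p, y in zip(preds, poisoned_labels)]
weights = softmin_weights(losses)
# The flipped sample incurs a high loss, receives a near-zero weight,
# and thus contributes little gradient during training.
```

This mirrors the paper's claim that the influence of poisoned labels is adaptively reduced during training rather than filtered out in a separate preprocessing step.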


Details

Domains
vision
Model Types
cnn, transformer
Threat Tags
training_time
Applications
image classification