Robust Online Learning
Published on arXiv
2602.06775
Input Manipulation Attack
OWASP ML Top 10 — ML01
Key Finding
The optimal mistake bound in robust online learning equals a new Littlestone-like dimension d, and the agnostic expected regret scales as O(sqrt(dT)) up to logarithmic factors.
Robust Littlestone Dimension
Novel technique introduced
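For background (this is the standard definition, not the paper's robust variant), the classical Littlestone dimension is defined through mistake trees, and it equals the optimal mistake bound in ordinary realizable online learning:

```latex
% Background sketch: the Littlestone dimension of a class H is the largest
% depth d of a complete binary mistake tree shattered by H.
\[
  \mathrm{Ldim}(\mathcal{H})
  = \max\bigl\{ d :\ \exists\ \text{a depth-$d$ binary tree with internal nodes
  labeled by points } x_v \text{ such that every root-to-leaf label sequence }
  (y_1,\dots,y_d) \in \{0,1\}^d \text{ is realized by some } h \in \mathcal{H} \bigr\}
\]
```

The paper's new dimension plays the analogous role for the robust game, where the learner predicts on a perturbed point rather than the clean one.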
We study the problem of learning robust classifiers, where the classifier receives a perturbed input at prediction time. Unlike robust PAC learning studied in prior work, here the clean data point and its label are also adversarially chosen. We formulate this setting as an online learning problem and consider both realizable and agnostic learnability of hypothesis classes. We define a new dimension of classes and show that it controls the mistake bounds in the realizable setting and the regret bounds in the agnostic setting. In contrast to the dimension that characterizes learnability in the PAC setting, our dimension is comparatively simple and resembles the Littlestone dimension. We generalize our dimension to multiclass hypothesis classes and prove similar results in the realizable case. Finally, we study the case where the learner does not know the set of allowed perturbations for each point and has only a prior over them.
Key Contributions
- Formulates robust online learning as an interactive adversarial game where both clean data/labels and input perturbations are adversarially chosen
- Defines a new combinatorial dimension resembling the Littlestone dimension that yields exact mistake bounds in the realizable setting and O(sqrt(dT)) regret in the agnostic setting
- Generalizes the dimension to multiclass hypothesis classes and proves results for uncertain perturbation sets known only up to a finite family
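The O(sqrt(dT)) agnostic rate has the familiar experts-problem shape. As a generic illustration only (not the paper's algorithm, and with d playing the role of log|H|), the Hedge / exponentially-weighted-forecaster scheme over a finite class achieves expected regret at most sqrt(T·ln|H|/2) for losses in [0,1]:

```python
# Generic illustration (not the paper's algorithm): Hedge over |H| = n experts
# attains expected regret <= sqrt(T * ln(n) / 2), the same sqrt-in-T shape as
# the paper's O(sqrt(dT)) bound.
import math

def hedge_expected_regret(losses, eta):
    """losses[t][i] in [0, 1] is expert i's loss at round t; returns the
    learner's expected (weight-averaged) loss minus the best expert's loss."""
    n = len(losses[0])
    weights = [1.0] * n
    learner_loss = 0.0
    for round_losses in losses:
        total = sum(weights)
        # Expected loss of the randomized learner this round.
        learner_loss += sum(w * l for w, l in zip(weights, round_losses)) / total
        # Multiplicative-weights update.
        weights = [w * math.exp(-eta * l) for w, l in zip(weights, round_losses)]
    best = min(sum(l[i] for l in losses) for i in range(n))
    return learner_loss - best

T, n = 400, 8
# Arbitrary 0/1 loss sequence; expert 0 follows a different pattern.
losses = [[float((t * (i + 1)) % 2) if i else float(t % 3 == 0)
           for i in range(n)] for t in range(T)]
eta = math.sqrt(8 * math.log(n) / T)  # tuned learning rate
regret = hedge_expected_regret(losses, eta)
print(regret <= math.sqrt(T * math.log(n) / 2))  # Hedge guarantee holds
```

The point of the sketch is only the sqrt(T) scaling of the regret bound; the paper's contribution is replacing log|H| with its new dimension, which can be finite even for infinite classes.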
🛡️ Threat Analysis
The core problem is learning classifiers that remain accurate on adversarially perturbed inputs at inference time: an adversary chooses the perturbation specifically to cause misprediction. The work thus addresses the theoretical foundations of robustness to adversarial examples.