defense · arXiv · Oct 23, 2025
Antônio H. Ribeiro, David Vävinggren, Dave Zachariah et al. · Uppsala University · PSL Research University · INRIA
Defends against adversarial input perturbations by reformulating adversarial training in terms of feature-space perturbations in a reproducing kernel Hilbert space (RKHS), which enables exact inner maximization and adaptive regularization
Input Manipulation Attack
Adversarial training has emerged as a key technique to enhance model robustness against adversarial input perturbations. Many of the existing methods rely on computationally expensive min-max problems that limit their application in practice. We propose a novel formulation of adversarial training in reproducing kernel Hilbert spaces, shifting from input to feature-space perturbations. This reformulation enables the exact solution of inner maximization and efficient optimization. It also provides a regularized estimator that naturally adapts to the noise level and the smoothness of the underlying function. We establish conditions under which the feature-perturbed formulation is a relaxation of the original problem and propose an efficient optimization algorithm based on iterative kernel ridge regression. We provide generalization bounds that help to understand the properties of the method. We also extend the formulation to multiple kernel learning. Empirical evaluation shows good performance in both clean and adversarial settings.
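To make the "exact solution of inner maximization" concrete, here is a minimal numerical sketch. It rests on assumptions the abstract does not state: a squared loss, an l2-norm ball of radius `delta` on the perturbation, and a finite-dimensional feature vector `phi` standing in for the RKHS feature map. Under those assumptions the Cauchy-Schwarz inequality gives the worst-case loss in closed form, `(|y - <w, phi>| + delta * ||w||)^2`, and the second term is what acts as a regularizer on the norm of the predictor. The function names are hypothetical and are not taken from the paper.

```python
import numpy as np


def inner_max_closed_form(w, phi, y, delta):
    """Worst-case squared error over feature perturbations d with ||d||_2 <= delta.

    Assumes squared loss and a linear predictor <w, phi> in feature space.
    """
    return (abs(y - w @ phi) + delta * np.linalg.norm(w)) ** 2


def worst_case_perturbation(w, phi, y, delta):
    """Perturbation attaining the maximum: it pushes the prediction away from y."""
    residual = y - w @ phi
    sign = 1.0 if residual >= 0 else -1.0
    norm_w = np.linalg.norm(w)
    if norm_w == 0:
        # A zero predictor is unaffected by any feature perturbation.
        return np.zeros_like(w)
    return -sign * delta * w / norm_w


rng = np.random.default_rng(0)
w = rng.normal(size=5)      # predictor weights in feature space (illustrative)
phi = rng.normal(size=5)    # feature vector of one input (illustrative)
y, delta = 0.3, 0.1

# Loss at the analytically derived worst-case perturbation.
d_star = worst_case_perturbation(w, phi, y, delta)
attained = (y - w @ (phi + d_star)) ** 2
closed = inner_max_closed_form(w, phi, y, delta)

# Random perturbations on the boundary of the ball never exceed the closed form.
samples = rng.normal(size=(10000, 5))
samples = delta * samples / np.linalg.norm(samples, axis=1, keepdims=True)
sampled_max = np.max((y - (phi + samples) @ w) ** 2)

print(np.isclose(attained, closed), sampled_max <= closed)
```

The check compares the closed-form value against both the loss at the analytic maximizer and a random search over the boundary of the ball. It illustrates only the inner problem; the outer minimization by iterative kernel ridge regression mentioned in the abstract is not reproduced here.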
traditional_ml Uppsala University · PSL Research University · INRIA