Defense · 2025

Kernel Learning with Adversarial Features: Numerical Efficiency and Adaptive Regularization

Antônio H. Ribeiro 1, David Vävinggren 1, Dave Zachariah 1, Thomas B. Schön 1, Francis Bach 2,3

Published on arXiv · 2510.20883 · 1 citation · 48 references

Input Manipulation Attack

OWASP ML Top 10 — ML01

Key Finding

The feature-perturbed RKHS formulation upper-bounds the input-perturbed objective and empirically achieves robustness and clean-data performance comparable to cross-validated kernel ridge regression without requiring hyperparameter tuning on the noise level.

Adversarial Kernel Training (feature-space perturbation formulation)

Novel technique introduced


Adversarial training has emerged as a key technique for enhancing model robustness against adversarial input perturbations. Many existing methods rely on computationally expensive min-max problems that limit their application in practice. We propose a novel formulation of adversarial training in reproducing kernel Hilbert spaces, shifting from input to feature-space perturbations. This reformulation enables exact solution of the inner maximization and efficient optimization. It also provides a regularized estimator that naturally adapts to the noise level and the smoothness of the underlying function. We establish conditions under which the feature-perturbed formulation is a relaxation of the original problem and propose an efficient optimization algorithm based on iterative kernel ridge regression. We provide generalization bounds that clarify the properties of the method, and we extend the formulation to multiple kernel learning. Empirical evaluation shows good performance in both clean and adversarial settings.
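The exact inner maximization is the key computational win: for the squared loss, perturbing the *feature* vector inside a norm ball has a closed-form worst case, so no iterative attack is needed. A minimal numerical check of that closed form, using a finite-dimensional vector as a stand-in for the RKHS feature map (the 5-dimensional setup, weights, and values below are illustrative, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in: a 5-dimensional feature space instead of a full RKHS.
w = rng.normal(size=5)        # model weights
phi_x = rng.normal(size=5)    # feature vector phi(x)
y, eps = 1.3, 0.1

# Closed form of the inner maximization for the squared loss:
#   max_{||delta|| <= eps} (y - <w, phi(x) + delta>)^2
#     = (|y - <w, phi(x)>| + eps * ||w||)^2
r = y - w @ phi_x
closed_form = (abs(r) + eps * np.linalg.norm(w)) ** 2

# The maximizer aligns delta against the residual: delta* = -sign(r) * eps * w / ||w||.
delta_star = -np.sign(r) * eps * w / np.linalg.norm(w)
assert np.isclose((y - w @ (phi_x + delta_star)) ** 2, closed_form)

# No randomly sampled feasible delta should do better.
d = rng.normal(size=(1000, 5))
d = eps * d / np.linalg.norm(d, axis=1, keepdims=True)
assert ((y - (phi_x + d) @ w) ** 2).max() <= closed_form + 1e-9
```

Because the worst case reduces to `|residual| + eps * ||w||`, the adversarial objective becomes an ordinary (if non-smooth) regularized loss rather than a nested min-max problem.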


Key Contributions

  • Reformulates adversarial training by shifting perturbations from input space to RKHS feature space, allowing exact closed-form solution of the inner maximization
  • Proposes an efficient optimization algorithm based on iterative kernel ridge regression and extends the formulation to multiple kernel learning
  • Proves generalization bounds showing the method achieves adaptive regularization (near-oracle performance) without requiring knowledge of the noise level or hyperparameter tuning

🛡️ Threat Analysis

Input Manipulation Attack

Proposes a defense against adversarial input perturbations via adversarial training in reproducing kernel Hilbert spaces. The threat model is norm-bounded input perturbations (the classic adversarial-example setting); the contribution is a more computationally efficient adversarial training algorithm whose feature-space objective upper-bounds the original min-max objective.
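Why a feature-space ball can cover every input-space attack: for kernels whose feature map is Lipschitz, an input perturbation of norm ε moves the feature vector by at most a constant times ε, so maximizing over the (larger) feature ball upper-bounds the input-perturbed objective. A small numerical check for the RBF kernel, where ||φ(u) − φ(v)||_H² = 2 − 2k(u, v) and the feature map is √(2γ)-Lipschitz (this is a generic property of the RBF kernel, offered as an illustration of the kind of condition the paper establishes, not its exact statement):

```python
import numpy as np

rng = np.random.default_rng(1)
gamma = 0.5

def rbf(u, v):
    return np.exp(-gamma * np.sum((u - v) ** 2))

def feat_dist(u, v):
    # RKHS distance computed from kernel evaluations only:
    # ||phi(u) - phi(v)||_H^2 = k(u,u) - 2 k(u,v) + k(v,v) = 2 - 2 k(u,v)
    return np.sqrt(max(2.0 - 2.0 * rbf(u, v), 0.0))

# Lipschitz constant of the RBF feature map (from 1 - exp(-x) <= x).
L = np.sqrt(2 * gamma)
for _ in range(1000):
    u, v = rng.normal(size=(2, 3))
    assert feat_dist(u, v) <= L * np.linalg.norm(u - v) + 1e-9
```

So an input attack of budget ε is dominated by a feature attack of budget √(2γ)·ε, which is the sense in which the feature-perturbed training problem is a relaxation of the input-perturbed one.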


Details

Model Types
traditional_ml
Threat Tags
white_box · inference_time · training_time
Applications
regression · classification