Defense · 2025

Kernel Learning with Adversarial Features: Numerical Efficiency and Adaptive Regularization

Antônio H. Ribeiro 1, David Vävinggren 1, Dave Zachariah 1, Thomas B. Schön 1, Francis Bach 2,3

Published on arXiv · 2510.20883 · 1 citation · 48 references

Input Manipulation Attack

OWASP ML Top 10 — ML01

Key Finding

The feature-perturbed RKHS formulation upper-bounds the input-perturbed objective and empirically achieves robustness and clean-data performance comparable to cross-validated kernel ridge regression without requiring hyperparameter tuning on the noise level.

Adversarial Kernel Training (feature-space perturbation formulation)

Novel technique introduced


Adversarial training has emerged as a key technique for enhancing model robustness against adversarial input perturbations. Many existing methods rely on computationally expensive min-max problems that limit their application in practice. We propose a novel formulation of adversarial training in reproducing kernel Hilbert spaces, shifting from input to feature-space perturbations. This reformulation enables exact solution of the inner maximization and efficient optimization. It also provides a regularized estimator that naturally adapts to the noise level and the smoothness of the underlying function. We establish conditions under which the feature-perturbed formulation is a relaxation of the original problem and propose an efficient optimization algorithm based on iterative kernel ridge regression. We provide generalization bounds that clarify the properties of the method, and we extend the formulation to multiple kernel learning. Empirical evaluation shows good performance in both clean and adversarial settings.
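The exact inner maximization is the key computational win: for the squared loss, perturbing the *feature* vector inside a norm ball has a closed-form worst case, so no iterative attack is needed. A minimal numerical check of that closed form, using a finite-dimensional vector as a stand-in for the RKHS feature map (the 5-dimensional setup, weights, and values below are illustrative, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in: a 5-dimensional feature space instead of a full RKHS.
w = rng.normal(size=5)        # model weights
phi_x = rng.normal(size=5)    # feature vector phi(x)
y, eps = 1.3, 0.1

# Closed form of the inner maximization for the squared loss:
#   max_{||delta|| <= eps} (y - <w, phi(x) + delta>)^2
#     = (|y - <w, phi(x)>| + eps * ||w||)^2
r = y - w @ phi_x
closed_form = (abs(r) + eps * np.linalg.norm(w)) ** 2

# The maximizer aligns delta against the residual: delta* = -sign(r) * eps * w / ||w||.
delta_star = -np.sign(r) * eps * w / np.linalg.norm(w)
assert np.isclose((y - w @ (phi_x + delta_star)) ** 2, closed_form)

# No randomly sampled feasible delta should do better.
d = rng.normal(size=(1000, 5))
d = eps * d / np.linalg.norm(d, axis=1, keepdims=True)
assert ((y - (phi_x + d) @ w) ** 2).max() <= closed_form + 1e-9
```

Because the worst case reduces to `|residual| + eps * ||w||`, the adversarial objective becomes an ordinary (if non-smooth) regularized loss rather than a nested min-max problem.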


Key Contributions

  • Reformulates adversarial training by shifting perturbations from input space to RKHS feature space, allowing exact closed-form solution of the inner maximization
  • Proposes an efficient optimization algorithm based on iterative kernel ridge regression and extends the formulation to multiple kernel learning
  • Proves generalization bounds showing the method achieves adaptive regularization (near-oracle performance) without requiring knowledge of the noise level or hyperparameter tuning

🛡️ Threat Analysis

Input Manipulation Attack

Proposes a defense against adversarial input perturbations via adversarial training in reproducing kernel Hilbert spaces. The threat model is norm-bounded input perturbations (the classic adversarial-example setting); the contribution is a more computationally efficient adversarial training algorithm whose feature-space objective upper-bounds the original min-max objective.
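Why a feature-space ball can cover every input-space attack: for kernels whose feature map is Lipschitz, an input perturbation of norm ε moves the feature vector by at most a constant times ε, so maximizing over the (larger) feature ball upper-bounds the input-perturbed objective. A small numerical check for the RBF kernel, where ||φ(u) − φ(v)||_H² = 2 − 2k(u, v) and the feature map is √(2γ)-Lipschitz (this is a generic property of the RBF kernel, offered as an illustration of the kind of condition the paper establishes, not its exact statement):

```python
import numpy as np

rng = np.random.default_rng(1)
gamma = 0.5

def rbf(u, v):
    return np.exp(-gamma * np.sum((u - v) ** 2))

def feat_dist(u, v):
    # RKHS distance computed from kernel evaluations only:
    # ||phi(u) - phi(v)||_H^2 = k(u,u) - 2 k(u,v) + k(v,v) = 2 - 2 k(u,v)
    return np.sqrt(max(2.0 - 2.0 * rbf(u, v), 0.0))

# Lipschitz constant of the RBF feature map (from 1 - exp(-x) <= x).
L = np.sqrt(2 * gamma)
for _ in range(1000):
    u, v = rng.normal(size=(2, 3))
    assert feat_dist(u, v) <= L * np.linalg.norm(u - v) + 1e-9
```

So an input attack of budget ε is dominated by a feature attack of budget √(2γ)·ε, which is the sense in which the feature-perturbed training problem is a relaxation of the input-perturbed one.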


Details

Model Types
traditional_ml
Threat Tags
white_box · inference_time · training_time
Applications
regression · classification