Published on arXiv
2604.25965
Input Manipulation Attack
OWASP ML Top 10 — ML01
Key Finding
NTK networks with early stopping achieve minimax optimal adversarial robustness rates in Sobolev spaces, while overfitted minimum norm interpolants are provably vulnerable
Novel technique introduced: NTK with Early Stopping
Deep learning models are widely deployed in safety-critical domains, but remain vulnerable to adversarial attacks. In this paper, we study the adversarial robustness of NTK neural networks in the context of nonparametric regression. We establish minimax optimal rates for adversarial regression in Sobolev spaces and then show that NTK neural networks, trained via gradient flow with early stopping, can achieve this optimal rate. However, in the overfitting regime, we prove that the minimum norm interpolant is vulnerable to adversarial perturbations.
Key Contributions
- Establishes minimax optimal rates for adversarial regression in Sobolev spaces
- Proves that NTK neural networks trained via gradient flow with early stopping achieve this minimax optimal rate
- Proves that the minimum norm interpolant (overfitting regime) is vulnerable to adversarial perturbations
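The contrast between early stopping and interpolation can be illustrated with a small numerical sketch. It is not the paper's construction or experiment. It assumes the closed-form NTK of a two-layer ReLU network without bias, one-dimensional inputs lifted with a constant coordinate, gradient flow on the squared loss started from the zero function, and an adversarial risk defined as the worst squared error over perturbations of size at most `delta`, approximated on a grid. The helper names (`ntk_relu`, `lift`, `fit_gradient_flow`, `adversarial_risk`), the target function, the noise level, and the stopping time `t=20.0` are all hypothetical choices made for illustration.

```python
import numpy as np

def ntk_relu(A, B):
    """Closed-form NTK of a two-layer ReLU network (no bias). A: (n, d), B: (m, d)."""
    na = np.linalg.norm(A, axis=1)[:, None]
    nb = np.linalg.norm(B, axis=1)[None, :]
    u = np.clip((A @ B.T) / (na * nb), -1.0, 1.0)        # cosine of the angle
    theta = np.arccos(u)
    k0 = (np.pi - theta) / np.pi                          # arc-cosine kernel, degree 0
    k1 = (u * (np.pi - theta) + np.sin(theta)) / np.pi    # arc-cosine kernel, degree 1
    return na * nb * (u * k0 + k1)

def lift(x):
    """Append a constant coordinate so the bias-free kernel can represent offsets."""
    x = np.asarray(x, dtype=float)
    return np.column_stack([x, np.ones_like(x)])

def fit_gradient_flow(x_train, y_train, t):
    """Kernel gradient flow on (1/2n) * sum_i (f(x_i) - y_i)^2 with f_0 = 0.

    In the eigenbasis of the kernel matrix the solution at time t applies the
    spectral filter (1 - exp(-t * lam / n)) / lam. Passing t = np.inf gives the
    minimum norm interpolant; a finite t corresponds to early stopping.
    """
    X = lift(x_train)
    n = len(y_train)
    lam, V = np.linalg.eigh(ntk_relu(X, X))
    filt = (1.0 - np.exp(-t * lam / n)) / lam
    alpha = V @ (filt * (V.T @ y_train))
    return lambda x: ntk_relu(lift(x), X) @ alpha

def adversarial_risk(f, f_star, x_eval, delta, n_grid=21):
    """Mean over x of max_{|x' - x| <= delta} (f(x') - f_star(x))^2, via grid search."""
    offsets = np.concatenate(([0.0], np.linspace(-delta, delta, n_grid)))
    worst = np.zeros(len(x_eval))
    for o in offsets:
        worst = np.maximum(worst, (f(x_eval + o) - f_star(x_eval)) ** 2)
    return worst.mean()

# Toy data: a smooth target observed with noise (all values are illustrative).
rng = np.random.default_rng(0)
f_star = lambda x: np.sin(2 * np.pi * x)
x_train = np.linspace(-1.0, 1.0, 40)
y_train = f_star(x_train) + 0.5 * rng.standard_normal(40)
x_eval = np.linspace(-0.9, 0.9, 200)

f_early = fit_gradient_flow(x_train, y_train, t=20.0)     # arbitrary stopping time
f_interp = fit_gradient_flow(x_train, y_train, t=np.inf)  # minimum norm interpolant

for name, f in [("early stopping", f_early), ("interpolant", f_interp)]:
    standard = adversarial_risk(f, f_star, x_eval, delta=0.0)
    adversarial = adversarial_risk(f, f_star, x_eval, delta=0.05)
    print(f"{name}: standard risk {standard:.4f}, adversarial risk {adversarial:.4f}")
```

The stopping time here is fixed by hand; the paper's results concern a suitably chosen stopping time, which this sketch does not attempt to reproduce. Sweeping `t` and `delta` is one way to see how the fitted function's sensitivity to small input shifts changes as training approaches interpolation.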
🛡️ Threat Analysis
The paper analyzes the adversarial robustness of neural networks to input perturbations in the adversarial regression setting. It establishes both a defense guarantee (early stopping achieves the minimax optimal rate) and a vulnerability result (the minimum norm interpolant in the overfitting regime is vulnerable to adversarial perturbations).