defense 2026

S2O: Enhancing Adversarial Training with Second-Order Statistics of Weights

Gaojie Jin 1, Xinping Yi 2, Wei Huang 3, Sven Schewe 3, Xiaowei Huang 3

Published on arXiv

2603.01264

Input Manipulation Attack

OWASP ML Top 10 — ML01

Key Finding

S²O tightens the PAC-Bayesian robust generalization bound and empirically improves adversarial robustness both as a standalone method and when combined with existing adversarial training techniques

S²O (Second-Order Statistics Optimization)

Novel technique introduced


Adversarial training has emerged as a highly effective way to improve the robustness of deep neural networks (DNNs). It is typically conceptualized as a min-max optimization problem over model weights and adversarial perturbations, where the weights are optimized using gradient descent methods, such as SGD. In this paper, we propose a novel approach by treating model weights as random variables, which paves the way for enhancing adversarial training through Second-Order Statistics Optimization (S²O) over model weights. We challenge and relax a prevalent, yet often unrealistic, assumption in prior PAC-Bayesian frameworks: the statistical independence of weights. From this relaxation, we derive an improved PAC-Bayesian robust generalization bound. Our theoretical developments suggest that optimizing the second-order statistics of weights can substantially tighten this bound. We complement this theoretical insight by conducting an extensive set of experiments that demonstrate that S²O not only enhances the robustness and generalization of neural networks when used in isolation, but also seamlessly augments other state-of-the-art adversarial training techniques. The code is available at https://github.com/Alexkael/S2O.
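
For readers unfamiliar with the min-max view of adversarial training mentioned in the abstract, it is commonly written in the form below. The notation is an assumption for illustration and is not taken from the paper: θ denotes the model weights, f_θ the network, 𝓛 the training loss, 𝓓 the data distribution, δ the perturbation, and ε the perturbation budget under an ℓ_p norm.

```latex
\min_{\theta} \; \mathbb{E}_{(x, y) \sim \mathcal{D}}
\Big[ \max_{\|\delta\|_{p} \le \epsilon}
\mathcal{L}\big( f_{\theta}(x + \delta),\, y \big) \Big]
```

The inner maximization searches for the worst-case perturbation of each input, and the outer minimization updates the weights against it.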


Key Contributions

  • Relaxes the statistical independence assumption in PAC-Bayesian frameworks by modeling second-order weight statistics (correlation/covariance matrices), yielding a tighter robust generalization bound
  • Proposes S²O (Second-Order Statistics Optimization), a novel adversarial training paradigm that treats weights as random variables and optimizes their covariance to improve robustness
  • Demonstrates empirically that S²O both stands alone as a robustness enhancer and composably augments existing state-of-the-art adversarial training methods
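
To make "second-order statistics of weights" concrete, here is a toy sketch under stated assumptions. It is not the S²O objective from the paper: it only treats each weight as a random variable, estimates the correlation matrix from synthetic samples, and measures how far the weights are from being uncorrelated. The helper names `weight_correlation` and `offdiag_penalty`, the Gaussian samples, and the mixing matrix are all hypothetical.

```python
import numpy as np


def weight_correlation(samples: np.ndarray) -> np.ndarray:
    """Estimate the correlation matrix of weights.

    samples has shape (n_samples, n_weights); each column is one weight
    viewed as a random variable, each row is one draw of all weights.
    """
    return np.corrcoef(samples, rowvar=False)


def offdiag_penalty(corr: np.ndarray) -> float:
    """Sum of squared off-diagonal correlations (0 when uncorrelated)."""
    off = corr - np.diag(np.diag(corr))
    return float(np.sum(off ** 2))


# Synthetic illustration (assumption): 4 weights, 5000 draws.
rng = np.random.default_rng(0)
independent = rng.normal(size=(5000, 4))

# Mix the first weight into the second to create a strong correlation.
mixing = np.eye(4)
mixing[0, 1] = 0.9
correlated = independent @ mixing

corr_indep = weight_correlation(independent)
corr_mixed = weight_correlation(correlated)

penalty_indep = offdiag_penalty(corr_indep)
penalty_mixed = offdiag_penalty(corr_mixed)
```

In this sketch the penalty for the mixed samples is much larger than for the independent ones, which is the kind of dependence an independence assumption would ignore.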

🛡️ Threat Analysis

Input Manipulation Attack

The paper proposes a defense against adversarial input manipulation attacks. It enhances adversarial training (the standard min-max framework) with second-order weight statistics in order to tighten the robust generalization bound and improve model robustness at inference time.
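
As an illustration of the kind of white-box input manipulation that adversarial training defends against, the sketch below runs a projected gradient ascent attack under an ℓ∞ budget on a toy linear logistic model. This is a generic example, not the attack setup or the model used in the paper; the weights, input, budget `eps`, step size `alpha`, and function names are all hypothetical.

```python
import numpy as np


def logistic_loss(w, b, x, y):
    """Logistic loss for a label y in {-1, +1}."""
    return float(np.log1p(np.exp(-y * (w @ x + b))))


def loss_grad_x(w, b, x, y):
    """Gradient of the logistic loss with respect to the input x."""
    margin = y * (w @ x + b)
    return -y * w / (1.0 + np.exp(margin))


def pgd_linf(w, b, x, y, eps=0.1, alpha=0.02, steps=10):
    """Signed-gradient ascent on the loss, projected onto the eps-ball."""
    x_adv = x.copy()
    for _ in range(steps):
        # Step in the direction that increases the loss.
        x_adv = x_adv + alpha * np.sign(loss_grad_x(w, b, x_adv, y))
        # Project the perturbation back into [-eps, eps] per coordinate.
        x_adv = x + np.clip(x_adv - x, -eps, eps)
    return x_adv


# Toy model and input (assumptions for illustration only).
w = np.array([1.5, -2.0, 0.5])
b = 0.1
x = np.array([0.2, -0.4, 0.3])
y = 1

x_adv = pgd_linf(w, b, x, y)
clean_loss = logistic_loss(w, b, x, y)
adv_loss = logistic_loss(w, b, x_adv, y)
```

The perturbed input stays within the budget in every coordinate while the loss rises; adversarial training feeds such perturbed inputs back into the weight update.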


Details

Domains
vision
Model Types
cnn
Threat Tags
white_box, inference_time, digital
Datasets
CIFAR-10, CIFAR-100, ImageNet
Applications
image classification