S2O: Enhancing Adversarial Training with Second-Order Statistics of Weights

Adversarial training has emerged as a highly effective way to improve the robustness of deep neural networks (DNNs). It is typically conceptualized as a min-max optimization problem over model weights and adversarial perturbations, where the weights are optimized using gradient descent methods, such as SGD. In this paper, we propose a novel approach by treating model weights as random variables, which paves the way for enhancing adversarial training through Second-Order Statistics Optimization (S²O) over model weights. We challenge and relax a prevalent, yet often unrealistic, assumption in prior PAC-Bayesian frameworks: the statistical independence of weights. From this relaxation, we derive an improved PAC-Bayesian robust generalization bound. Our theoretical developments suggest that optimizing the second-order statistics of weights can substantially tighten this bound. We complement this theoretical insight with an extensive set of experiments demonstrating that S²O not only enhances the robustness and generalization of neural networks when used in isolation, but also seamlessly augments other state-of-the-art adversarial training techniques. The code is available at https://github.com/Alexkael/S2O.
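For reference, the min-max problem mentioned in the abstract is commonly written as below. This is the standard adversarial training objective rather than a formula quoted from the paper; the symbols (weights $w$, data distribution $\mathcal{D}$, perturbation $\delta$ with $\ell_p$ budget $\epsilon$, loss $\mathcal{L}$, network $f_w$) are notation assumed here for illustration.

```latex
\min_{w} \; \mathbb{E}_{(x, y) \sim \mathcal{D}}
\left[ \max_{\|\delta\|_p \le \epsilon} \mathcal{L}\big(f_w(x + \delta),\, y\big) \right]
```

The inner maximization searches for the worst-case perturbation of each input, and the outer minimization updates the weights against it, typically with SGD.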

Key Contributions

Relaxes the statistical independence assumption in PAC-Bayesian frameworks by modeling second-order weight statistics (correlation/covariance matrices), yielding a tighter robust generalization bound
Proposes S²O (Second-Order Statistics Optimization), a novel adversarial training paradigm that treats weights as random variables and optimizes their covariance to improve robustness
Demonstrates empirically that S²O both stands alone as a robustness enhancer and composably augments existing state-of-the-art adversarial training methods
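
To make "second-order statistics of weights" concrete, the following is a minimal NumPy sketch that estimates a correlation matrix from samples of a weight vector and measures how far it is from the independent (diagonal) case. It is an illustration under stated assumptions, not the S²O regularizer from the paper: the helper names `weight_correlation` and `off_diagonal_norm`, the synthetic Gaussian samples, and the mixing matrix are all hypothetical.

```python
import numpy as np


def weight_correlation(samples: np.ndarray) -> np.ndarray:
    """Empirical correlation matrix of weights.

    samples has shape (n_samples, n_weights); each row is one draw of the
    weight vector when the weights are treated as random variables.
    """
    centered = samples - samples.mean(axis=0, keepdims=True)
    cov = centered.T @ centered / (samples.shape[0] - 1)
    std = np.sqrt(np.diag(cov))
    std = np.where(std > 0, std, 1.0)  # guard against constant weights
    return cov / np.outer(std, std)


def off_diagonal_norm(corr: np.ndarray) -> float:
    """Frobenius norm of the off-diagonal part (0 for uncorrelated weights)."""
    off_diag = corr - np.diag(np.diag(corr))
    return float(np.linalg.norm(off_diag))


# Synthetic example (assumption): 5000 draws of a 4-dimensional weight vector.
rng = np.random.default_rng(0)
independent = rng.normal(size=(5000, 4))

# Mixing the coordinates introduces correlation between the weights.
mixing = np.eye(4) + 0.9 * np.ones((4, 4))
correlated = independent @ mixing

corr_independent = weight_correlation(independent)
corr_correlated = weight_correlation(correlated)

print("independent:", off_diagonal_norm(corr_independent))
print("correlated: ", off_diagonal_norm(corr_correlated))
```

The independence assumption that the paper relaxes corresponds to the case where the off-diagonal entries vanish; the second sample set shows what a clearly non-diagonal correlation structure looks like.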

🛡️ Threat Analysis

Input Manipulation Attack

Directly proposes a defense against adversarial input manipulation attacks by enhancing adversarial training — the standard min-max framework — with second-order weight statistics to tighten robust generalization bounds and improve model robustness at inference time.

Details

Domains

vision

Model Types

cnn

Threat Tags

white_box, inference_time, digital

Datasets

CIFAR-10, CIFAR-100, ImageNet

Applications

2026 0 cit.
