Tight Robustness Certificates and Wasserstein Distributional Attacks for Deep Neural Networks

Wasserstein distributionally robust optimization (WDRO) provides a framework for adversarial robustness, yet existing methods based on global Lipschitz continuity or strong duality often yield loose upper bounds or require prohibitive computation. We address these limitations with a primal approach and adopt exact Lipschitz certificates to tighten the WDRO upper bound. For ReLU networks, we leverage the piecewise-affine structure on activation cells to obtain an exact, tractable characterization of the corresponding WDRO problem. We further extend our analysis to modern architectures with smooth activations (e.g., GELU, SiLU), such as Transformers. Additionally, we propose novel Wasserstein Distributional Attacks (WDA, WDA++) that construct candidates for the worst-case distribution. Compared to existing attacks that are restricted to point-wise perturbations, our methods offer greater flexibility in the number and location of attack points. Extensive evaluations demonstrate that our proposed framework achieves robust accuracy competitive with state-of-the-art baselines while offering tighter certificates than existing methods. Our code is available at https://github.com/OLab-Repo/WDA.
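For orientation, the Lipschitz-based upper bound that the abstract refers to can be written in its textbook form. This is a sketch under assumptions the abstract does not state: an order-1 Wasserstein ball of radius $\varepsilon$ around the empirical distribution $\hat{P}_N$, and a loss $\ell$ that is Lipschitz with respect to the transport cost. The symbols $\hat{P}_N$, $\varepsilon$, $\ell$, and $\mathrm{Lip}(\ell)$ are introduced here for illustration and may differ from the paper's notation.

```latex
\sup_{Q \,:\, W_1(Q, \hat{P}_N) \le \varepsilon} \mathbb{E}_{Z \sim Q}\big[\ell(Z)\big]
\;\le\;
\mathbb{E}_{Z \sim \hat{P}_N}\big[\ell(Z)\big] \;+\; \varepsilon \cdot \mathrm{Lip}(\ell)
```

The bound follows from Kantorovich-Rubinstein duality. It is loose whenever $\mathrm{Lip}(\ell)$ is replaced by a crude global estimate, which is the gap that tighter Lipschitz certificates aim to close.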

Key Contributions

Exact tractable WDRO upper/lower bounds for ReLU networks using piecewise-affine activation cell structure and tight local Lipschitz certificates, with a sufficient condition for bound tightness.
Extension of WDRO analysis to smooth-activation architectures (GELU, SiLU, Transformers) via gradient-norm-based exact Lipschitz characterization.
Novel Wasserstein Distributional Attacks (WDA and adaptive WDA++) that optimize over distributions supported on 2N points, strictly generalizing point-wise perturbation attacks and consistently outperforming AutoAttack-era baselines.
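The piecewise-affine idea behind the first contribution can be illustrated with a small NumPy sketch. It is not the paper's certificate: it only shows that, on the activation cell containing a given input, a two-layer ReLU network is affine, so its $\ell_2$ Lipschitz constant on that cell is the spectral norm of a single matrix. The network sizes, random weights, and function names are hypothetical.

```python
import numpy as np

def relu_net(x, W1, b1, W2, b2):
    """Two-layer ReLU network: x -> W2 @ relu(W1 @ x + b1) + b2."""
    return W2 @ np.maximum(W1 @ x + b1, 0.0) + b2

def local_affine_map(x, W1, b1, W2):
    """Jacobian of the network on the activation cell that contains x."""
    pattern = (W1 @ x + b1 > 0).astype(float)  # 1 for active hidden units, 0 otherwise
    return W2 @ (pattern[:, None] * W1)        # equals W2 @ diag(pattern) @ W1

def local_lipschitz(x, W1, b1, W2):
    """l2 Lipschitz constant of the network restricted to the interior of x's cell."""
    return np.linalg.norm(local_affine_map(x, W1, b1, W2), 2)

def global_lipschitz_bound(W1, W2):
    """Product of spectral norms: valid everywhere, but often loose."""
    return np.linalg.norm(W1, 2) * np.linalg.norm(W2, 2)

# Hypothetical toy network: 4 inputs, 8 hidden units, 3 outputs.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(8, 4)), rng.normal(size=8)
W2, b2 = rng.normal(size=(3, 8)), rng.normal(size=3)
x = rng.normal(size=4)

local_L = local_lipschitz(x, W1, b1, W2)
global_L = global_lipschitz_bound(W1, W2)
print(f"local (cell) constant: {local_L:.3f}, global bound: {global_L:.3f}")
```

The local constant never exceeds the global product bound, because masking hidden units with a 0/1 diagonal matrix cannot increase the spectral norm. For smooth activations such as GELU or SiLU there are no cells, so a comparable quantity would have to come from gradient norms instead.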

🛡️ Threat Analysis

Input Manipulation Attack

The paper makes a dual contribution: tighter certified robustness bounds via exact Lipschitz certificates (defense), and novel Wasserstein Distributional Attacks (WDA, WDA++) that craft worst-case adversarial distributions at inference time (attack). Both directly target adversarial example robustness, the core of ML01 (Input Manipulation Attack).
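To make "adversarial distributions" concrete, the sketch below builds one possible distribution supported on 2N points from N samples. It is an illustration only, not the WDA or WDA++ algorithm: the mass-splitting parametrization (keep part of each sample's mass in place, move the rest), the Euclidean transport cost, and all names are assumptions.

```python
import numpy as np

def split_distribution(X, deltas, p):
    """Build a 2N-point distribution from N samples.

    Sample i keeps mass (1 - p[i]) / N at X[i] and sends mass p[i] / N
    to X[i] + deltas[i].
    """
    n = X.shape[0]
    support = np.concatenate([X, X + deltas], axis=0)  # shape (2N, d)
    weights = np.concatenate([1.0 - p, p]) / n         # shape (2N,), sums to 1
    return support, weights

def transport_cost(deltas, p):
    """Cost of the coupling that moves mass p[i] / N over distance ||deltas[i]||_2.

    This upper-bounds the order-1 Wasserstein distance to the empirical distribution.
    """
    return float(np.mean(p * np.linalg.norm(deltas, axis=1)))

# Hypothetical data: four 2-D samples, with the whole budget spent on the first one.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
deltas = np.array([[3.0, 4.0], [0.0, 0.0], [0.0, 0.0], [0.0, 0.0]])
p = np.array([0.2, 0.0, 0.0, 0.0])

support, weights = split_distribution(X, deltas, p)
cost = transport_cost(deltas, p)  # mean of [0.2 * 5, 0, 0, 0]
print(support.shape, weights.sum(), cost)
```

A point-wise attack corresponds to the special case where every p[i] equals 1. Allowing fractional mass lets a fixed transport budget be spent unevenly, here by moving 20% of one sample's mass a distance of 5 while leaving the other samples untouched.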

Details

Domains

vision, nlp

Model Types

cnn, transformer

Threat Tags

white_box, inference_time, untargeted, digital

Datasets

CIFAR-10, CIFAR-100, ImageNet

Applications

2025 · 0 citations
