Probably Approximately Global Robustness Certification

We propose and investigate probabilistic guarantees for the adversarial robustness of classification algorithms. While traditional formal verification approaches for robustness are intractable and sampling-based approaches do not provide formal guarantees, our approach is able to efficiently certify a probabilistic relaxation of robustness. The key idea is to sample an ε-net and invoke a local robustness oracle on the sample. Remarkably, the size of the sample needed to achieve probably approximately global robustness guarantees is independent of the input dimensionality, the number of classes, and the learning algorithm itself. Our approach can, therefore, be applied even to large neural networks that are beyond the scope of traditional formal verification. Experiments empirically confirm that it characterizes robustness better than state-of-the-art sampling-based approaches and scales better than formal methods.

Key Contributions

Probabilistic global robustness certification framework using ε-net sampling, parameterized by prediction confidence
Sample size bounds that are independent of input dimensionality, number of classes, and learning algorithm — enabling application to large NNs
Oracle-agnostic approach compatible with both formal verification and adversarial attack methods as local robustness checkers
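
The sample-then-check idea behind these contributions can be illustrated with a minimal sketch. It does not reproduce the paper's ε-net bound or its confidence parameterization. Instead it assumes the simpler, standard single-event argument: if n i.i.d. points all pass the local check and n ≥ ln(1/δ)/ε, then with probability at least 1 − δ the non-robust region has probability mass at most ε. The names `sample_size`, `certify`, `draw`, and `oracle` are hypothetical. Note that n depends only on ε and δ, not on the input dimension.

```python
import math
import random
from typing import Callable, Tuple


def sample_size(eps: float, delta: float) -> int:
    """Return n with exp(-eps * n) <= delta, which implies (1 - eps)**n <= delta.

    Assumption: this is the textbook single-event bound, used here as a
    stand-in for the paper's epsilon-net bound.
    """
    if not (0.0 < eps < 1.0 and 0.0 < delta < 1.0):
        raise ValueError("eps and delta must lie strictly between 0 and 1")
    return math.ceil(math.log(1.0 / delta) / eps)


def certify(
    draw: Callable[[], float],
    oracle: Callable[[float], bool],
    eps: float,
    delta: float,
) -> Tuple[bool, int]:
    """Draw i.i.d. inputs and query a local robustness oracle on each.

    Returns (certified, number_of_oracle_calls). Sampling stops at the
    first input the oracle reports as non-robust.
    """
    n = sample_size(eps, delta)
    for calls in range(1, n + 1):
        if not oracle(draw()):
            return False, calls
    return True, n


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy setup (hypothetical): inputs uniform on [1, 2], classifier sign(x),
    # which is robust at x for perturbation radius 0.1 whenever |x| > 0.1.
    print(certify(lambda: rng.uniform(1.0, 2.0), lambda x: abs(x) > 0.1, 0.01, 0.01))
```

In this sketch a negative result is conclusive after a single failing sample, while a positive result always costs the full n oracle calls.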

🛡️ Threat Analysis

Input Manipulation Attack

The paper proposes a global robustness certification framework against adversarial perturbations — a direct defense against input manipulation attacks. It invokes local robustness oracles (FGSM, PGD, C&W, formal verifiers) to provide high-probability guarantees that classifiers are not vulnerable to adversarial examples across the input space.
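
To make the oracle-agnostic interface concrete, here is a small sketch of two interchangeable local robustness oracles. It assumes a toy binary linear classifier sign(w·x + b) under an L∞ perturbation budget, which is not the setting of the paper's experiments. All names (`Oracle`, `exact_oracle`, `attack_oracle`) are hypothetical. The first oracle plays the role of a formal verifier, the second that of a one-step sign-gradient attack in the style of FGSM.

```python
from typing import Callable, Sequence

# A local robustness oracle maps an input to True (robust) or False.
Oracle = Callable[[Sequence[float]], bool]


def score(w: Sequence[float], b: float, x: Sequence[float]) -> float:
    """Linear decision score w.x + b; its sign is the predicted class."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def sign(v: float) -> float:
    return 1.0 if v >= 0 else -1.0


def exact_oracle(w: Sequence[float], b: float, radius: float) -> Oracle:
    """Verifier-style check: an L-infinity perturbation of size `radius`
    changes the score by at most radius * ||w||_1."""
    l1 = sum(abs(wi) for wi in w)

    def is_robust(x: Sequence[float]) -> bool:
        return abs(score(w, b, x)) > radius * l1

    return is_robust


def attack_oracle(w: Sequence[float], b: float, radius: float) -> Oracle:
    """Attack-style check: take one sign-gradient step toward the decision
    boundary and report robustness if the prediction does not flip."""

    def is_robust(x: Sequence[float]) -> bool:
        s = sign(score(w, b, x))
        x_adv = [xi - s * radius * sign(wi) for xi, wi in zip(x, w)]
        return sign(score(w, b, x_adv)) == s

    return is_robust


if __name__ == "__main__":
    w, b, radius = [1.0, -2.0], 0.5, 0.1
    for oracle in (exact_oracle(w, b, radius), attack_oracle(w, b, radius)):
        print(oracle([1.0, 0.0]), oracle([0.1, 0.2]))
```

For a linear model the single step is the worst-case perturbation, so the two oracles agree except on exact ties. For neural networks an attack-based oracle can only find counterexamples; a failed attack does not prove local robustness.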

Details

Domains

vision

Model Types

cnn, transformer

Threat Tags

white_box, inference_time, digital

Datasets

CIFAR-10

Applications

2025, 0 citations
