Defense · 2025

Probably Approximately Global Robustness Certification

Peter Blohm, Patrick Indri, Thomas Gärtner, Sagar Malhotra

0 citations · 40 references · ICML


Published on arXiv · 2511.06495

Input Manipulation Attack

OWASP ML Top 10 — ML01

Key Finding

Empirically characterizes global robustness better than state-of-the-art sampling-based approaches while scaling to larger networks than formal verification methods

Probably Approximately Global Robustness (PAGR) Certification

Novel technique introduced


We propose and investigate probabilistic guarantees for the adversarial robustness of classification algorithms. While traditional formal verification approaches for robustness are intractable and sampling-based approaches do not provide formal guarantees, our approach is able to efficiently certify a probabilistic relaxation of robustness. The key idea is to sample an ε-net and invoke a local robustness oracle on the sample. Remarkably, the size of the sample needed to achieve probably approximately global robustness guarantees is independent of the input dimensionality, the number of classes, and the learning algorithm itself. Our approach can, therefore, be applied even to large neural networks that are beyond the scope of traditional formal verification. Experiments empirically confirm that it characterizes robustness better than state-of-the-art sampling-based approaches and scales better than formal methods.
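The certification loop the abstract describes is simple to sketch. Below is a minimal, illustrative Python version: draw samples from the input distribution, query a local robustness oracle at each point, and turn the failure count into a high-probability lower bound on global robustness. The function names are hypothetical, and the Hoeffding-style bound stands in for the paper's exact PAC-style sample-size analysis, which it only approximates; the key property it preserves is that the sample size depends on the accuracy and confidence targets, not on the input dimensionality or the network.

```python
import math
from typing import Callable

import numpy as np

# A local robustness oracle: returns True if the classifier is verified
# (or empirically found) robust in a small ball around x. Hypothetical
# signature; the paper treats the oracle as a black box.
LocalOracle = Callable[[np.ndarray], bool]


def pagr_certify(sample_input: Callable[[], np.ndarray],
                 oracle: LocalOracle,
                 n_samples: int,
                 delta: float = 0.05) -> tuple[float, float]:
    """Estimate global robustness from n_samples oracle calls.

    Returns (empirical robust fraction, high-probability lower bound).
    The bound uses a standard Hoeffding inequality as a stand-in for the
    paper's guarantee: with probability >= 1 - delta, the true robust
    mass is at least the returned lower bound.
    """
    robust = sum(oracle(sample_input()) for _ in range(n_samples))
    p_hat = robust / n_samples
    # Hoeffding deviation term: sqrt(ln(1/delta) / (2n)).
    slack = math.sqrt(math.log(1.0 / delta) / (2 * n_samples))
    return p_hat, max(0.0, p_hat - slack)
```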


Key Contributions

  • Probabilistic global robustness certification framework using ε-net sampling, parameterized by prediction confidence
  • Sample size bounds that are independent of input dimensionality, number of classes, and learning algorithm — enabling application to large NNs
  • Oracle-agnostic approach compatible with both formal verification and adversarial attack methods as local robustness checkers
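Because the certification loop treats the local checker as a black box (per the last bullet), any oracle with the right interface can be plugged in. Below is a hedged sketch of that interface using a naive random-search attack as the checker; `model_predict`, the trial count, and the L∞ ball are illustrative assumptions. A formal verifier would fill the same slot but return True only on an actual proof.

```python
from typing import Callable

import numpy as np


def random_attack_oracle(model_predict: Callable[[np.ndarray], int],
                         x: np.ndarray,
                         eps: float,
                         n_trials: int = 100) -> bool:
    """Attack-based local robustness oracle (illustrative).

    Searches the L-infinity eps-ball around x for a perturbation that
    changes the predicted class. True means "no counterexample found",
    which is optimistic; a verification-based oracle would return True
    only when robustness is actually proved.
    """
    label = model_predict(x)
    for _ in range(n_trials):
        delta = np.random.uniform(-eps, eps, size=x.shape)
        if model_predict(x + delta) != label:
            return False  # adversarial example found: not locally robust
    return True
```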

🛡️ Threat Analysis

Input Manipulation Attack

The paper proposes a global robustness certification framework against adversarial perturbations — a direct defense against input manipulation attacks. It invokes local robustness oracles (FGSM, PGD, C&W, formal verifiers) to provide high-probability guarantees that classifiers are not vulnerable to adversarial examples across the input space.
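As a concrete instance of an attack-based oracle, a single FGSM step gives a cheap local check. The PyTorch sketch below is an assumption-laden illustration (batched inputs in [0, 1], a standard classifier returning logits), not the paper's implementation; stronger oracles such as PGD or C&W would follow the same pattern, as would a formal verifier with sound answers.

```python
import torch
import torch.nn.functional as F


def fgsm_oracle(model: torch.nn.Module, x: torch.Tensor, eps: float) -> bool:
    """Local robustness check via one FGSM step.

    Expects a batched input tensor. Returns True if the one-step L-inf
    attack of radius eps fails to change the predicted class. As with
    any attack-based oracle, True is evidence of robustness, not proof.
    """
    model.eval()
    x = x.clone().detach().requires_grad_(True)
    logits = model(x)
    label = logits.argmax(dim=-1)
    # Ascend the loss gradient w.r.t. the input to craft the perturbation.
    loss = F.cross_entropy(logits, label)
    loss.backward()
    x_adv = (x + eps * x.grad.sign()).detach()
    x_adv = x_adv.clamp(0.0, 1.0)  # assumes inputs normalized to [0, 1]
    adv_label = model(x_adv).argmax(dim=-1)
    return bool((adv_label == label).all())
```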


Details

Domains
vision
Model Types
cnn, transformer
Threat Tags
white_box, inference_time, digital
Datasets
CIFAR-10
Applications
image classification, neural network robustness certification