
Get Global Guarantees: On the Probabilistic Nature of Perturbation Robustness

Wenchuan Mu, Kwan Hui Lim


Published on arXiv (arXiv:2508.19183)

Input Manipulation Attack

OWASP ML Top 10 — ML01

Key Finding

Tower robustness achieves comprehensive coverage of the perturbation vicinity with maintained precision, outperforming existing probabilistic methods that risk overestimating robustness by missing critical adversarial instances.

Tower Robustness

Novel technique introduced


In safety-critical deep learning applications, robustness measures the ability of neural models to handle imperceptible perturbations in input data, which may otherwise lead to safety hazards. Existing pre-deployment robustness assessment methods typically suffer from significant trade-offs between computational cost and measurement precision, limiting their practical utility. To address these limitations, this paper conducts a comprehensive comparative analysis of existing robustness definitions and their associated assessment methodologies. We propose tower robustness, a novel, practical metric based on hypothesis testing that quantitatively evaluates probabilistic robustness, enabling more rigorous and efficient pre-deployment assessments. Our extensive comparative evaluation illustrates the advantages and applicability of the proposed approach, thereby advancing the systematic understanding and enhancement of model robustness in safety-critical deep learning applications.


Key Contributions

  • Proposes 'tower robustness,' a novel probabilistic robustness metric grounded in hypothesis testing that provides statistical guarantees on failure probability estimates
  • Conducts comprehensive comparative analysis of existing robustness definitions and pre-deployment assessment methodologies, highlighting precision–cost trade-offs
  • Demonstrates that the proposed method yields more accurate and reliable robustness estimates than state-of-the-art baselines on large-scale DNNs
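The paper does not spell out the tower robustness procedure here, but the first contribution — a hypothesis-testing metric with statistical guarantees on failure probability — can be sketched generically. The snippet below is an illustrative, minimal sketch (not the authors' method): it samples uniform perturbations in an L-infinity ball around an input and runs a one-sided exact binomial test of H0: failure probability ≥ p0. Rejecting H0 at level α certifies, with confidence 1 − α, that the misclassification probability in the perturbation vicinity is below p0. The function names, the uniform sampling choice, and all parameter values are assumptions for illustration.

```python
import math
import random


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), computed exactly."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def probabilistic_robustness_test(model, x, y, eps, n=1000, p0=0.01, alpha=0.05, rng=None):
    """One-sided exact binomial test of H0: P(misclassify in eps-ball) >= p0.

    Returns (certified, failures, p_value). `certified` is True when H0 is
    rejected at level `alpha`, i.e. the observed failure count is so low that,
    with confidence 1 - alpha, the true failure probability is below p0.
    """
    rng = rng or random.Random(0)
    failures = 0
    for _ in range(n):
        # Sample a perturbation uniformly in the L-inf ball of radius eps
        # (an illustrative choice of perturbation distribution).
        x_pert = [xi + rng.uniform(-eps, eps) for xi in x]
        if model(x_pert) != y:
            failures += 1
    # p-value under H0 at the boundary p = p0: probability of seeing a
    # failure count this low or lower by chance.
    p_value = binom_cdf(failures, n, p0)
    return p_value <= alpha, failures, p_value


# Toy usage: a linear classifier that is trivially robust at this input.
model = lambda v: int(sum(v) > 0)
certified, failures, p_value = probabilistic_robustness_test(
    model, x=[1.0, 1.0], y=1, eps=0.1
)
```

With zero observed failures over n = 1000 samples, the p-value is (1 − p0)^n ≈ 4.3e-5, so the test certifies a failure probability below 1% at the 5% level. The design point is the trade-off the paper highlights: larger n buys a tighter certificate at higher computational cost.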

🛡️ Threat Analysis

Input Manipulation Attack

The paper's entire contribution centers on evaluating and quantifying model robustness against adversarial perturbations (imperceptible input manipulations) — tower robustness is a new metric for measuring resistance to adversarial examples at inference time.


Details

Domains
vision
Model Types
CNN, Transformer
Threat Tags
inference_time, digital
Datasets
MNIST
Applications
image classification, safety-critical deep learning systems, autonomous driving, medical diagnosis