benchmark 2026

Robustness Quantification and Uncertainty Quantification: Comparing Two Methods for Assessing the Reliability of Classifier Predictions

Adrián Detavernier, Jasper De Bock

0 citations


Published on arXiv

2603.22988

Input Manipulation Attack

OWASP ML Top 10 — ML01

Key Finding

Combined RQ+UQ approach achieves better reliability assessment than either method alone on benchmark datasets


We consider two approaches for assessing the reliability of the individual predictions of a classifier: Robustness Quantification (RQ) and Uncertainty Quantification (UQ). We explain the conceptual differences between the two approaches, compare them on a number of benchmark datasets, and show that RQ is capable of outperforming UQ, both in a standard setting and in the presence of distribution shift. Besides showing that RQ can be competitive with UQ, we also demonstrate the complementarity of RQ and UQ by showing that a combination of both approaches can lead to even better reliability assessments.
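
As a rough illustration of the idea only (not the paper's actual RQ or UQ machinery, which this summary does not detail), the hypothetical sketch below scores each test prediction with a simple uncertainty proxy (predictive entropy) and a simple robustness proxy (the smallest random-perturbation radius that flips the predicted label), combines the two into a single reliability score, and checks whether the most "reliable" predictions are indeed more often correct. All names and scoring rules are illustrative assumptions.

```python
# Hypothetical sketch: combining an uncertainty score (UQ) with a
# robustness score (RQ) to rank predictions by reliability.
# The proxies below are illustrative only, not the methods compared in the paper.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
proba = clf.predict_proba(X_te)

# UQ proxy: predictive entropy of the probability output (higher = less certain).
uq_score = -np.sum(proba * np.log(proba + 1e-12), axis=1)

# RQ proxy: smallest random-perturbation radius (from a fixed grid) at which
# the predicted label flips; a larger radius means a more robust prediction.
def robustness_radius(x, model, radii=(0.05, 0.1, 0.2, 0.4, 0.8), trials=50, seed=0):
    rng = np.random.default_rng(seed)
    base = model.predict(x[None, :])[0]
    for r in radii:
        noise = rng.normal(scale=r, size=(trials, x.size))
        if np.any(model.predict(x[None, :] + noise) != base):
            return r          # prediction flipped at this radius
    return radii[-1] * 2      # never flipped within the grid

rq_score = np.array([robustness_radius(x, clf) for x in X_te])

# Combined reliability: robust (large flip radius) and confident (low entropy).
reliability = rq_score / (1.0 + uq_score)

# Sanity check: accuracy on the most- vs. least-reliable halves of the test set.
order = np.argsort(-reliability)
half = len(order) // 2
correct = (clf.predict(X_te) == y_te)
print("accuracy, most reliable half :", correct[order[:half]].mean())
print("accuracy, least reliable half:", correct[order[half:]].mean())
```

If the combined score is informative, the most-reliable half should show noticeably higher accuracy than the least-reliable half; the paper's claim is that combining RQ and UQ signals improves this kind of separation over using either signal alone.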


Key Contributions

  • Comprehensive comparison of robustness quantification and uncertainty quantification on real benchmark datasets
  • Demonstrates that combining RQ and UQ leads to better reliability assessments than either alone
  • Shows RQ can outperform UQ in standard settings and under distribution shift

🛡️ Threat Analysis

Input Manipulation Attack

Robustness quantification measures how much perturbation an input can tolerate before the model's prediction changes. In effect, this evaluates adversarial robustness at inference time, even though the paper frames it as a reliability-assessment tool rather than proposing attacks or defenses.
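
As a toy illustration of "how much perturbation before the prediction changes" (not the paper's RQ measure), the sketch below computes, for a linear binary classifier, the exact minimal L2 perturbation that flips its output: the distance from the input to the decision hyperplane. The model and input here are made up for the example.

```python
# Hypothetical sketch: for a linear binary classifier sign(w @ x + b), the
# smallest L2 perturbation that changes the prediction is the distance from
# x to the decision hyperplane, |w @ x + b| / ||w||.
import numpy as np

rng = np.random.default_rng(0)
w, b = rng.normal(size=5), 0.3             # toy linear model (assumed)
x = rng.normal(size=5)                      # input whose prediction we assess

margin = w @ x + b
radius = abs(margin) / np.linalg.norm(w)    # minimal flipping perturbation (L2)

# Verify: step just past the boundary along the direction that reduces |margin|.
delta = -np.sign(margin) * (radius + 1e-6) * w / np.linalg.norm(w)
print("original prediction :", int(margin > 0))
print("flip radius (L2)    :", radius)
print("perturbed prediction:", int(w @ (x + delta) + b > 0))
```

For nonlinear models this distance is not available in closed form, which is why robustness is typically estimated empirically (e.g., by searching over perturbation radii, as in the earlier sketch).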


Details

Domains
vision, tabular
Model Types
traditional_ml
Threat Tags
inference_time
Datasets
MNIST, CIFAR-10
Applications
image classification, tabular classification