
Volatility in Certainty (VC): A Metric for Detecting Adversarial Perturbations During Inference in Neural Network Classifiers

Vahid Hemmati, Ahmad Mohammadi, Abdul-Rauf Nuhu, Reza Ahmari, Parham Kebria, Abdollah Homaifar

0 citations · 28 references · INTCEC


Published on arXiv (2511.11834)

Input Manipulation Attack

OWASP ML Top 10 — ML01

Key Finding

log(VC) achieves Pearson correlation ρ < -0.90 with classification accuracy under varying FGSM perturbation magnitudes, enabling label-free adversarial drift detection at inference time.

Volatility in Certainty (VC)

Novel technique introduced


Adversarial robustness remains a critical challenge in deploying neural network classifiers, particularly in real-time systems where ground-truth labels are unavailable during inference. This paper investigates Volatility in Certainty (VC), a recently proposed, label-free metric that quantifies irregularities in model confidence by measuring the dispersion of sorted softmax outputs. Specifically, VC is defined as the average squared log-ratio of adjacent certainty values, capturing local fluctuations in model output smoothness. We evaluate VC as a proxy for classification accuracy and as an indicator of adversarial drift. Experiments are conducted on artificial neural networks (ANNs) and convolutional neural networks (CNNs) trained on MNIST, as well as a regularized VGG-like model trained on CIFAR-10. Adversarial examples are generated using the Fast Gradient Sign Method (FGSM) across varying perturbation magnitudes. In addition, mixed test sets are created by gradually introducing adversarial contamination to assess VC's sensitivity under incremental distribution shifts. Our results reveal a strong negative correlation between classification accuracy and log(VC) (correlation ρ < -0.90 in most cases), suggesting that VC effectively reflects performance degradation without requiring labeled data. These findings position VC as a scalable, architecture-agnostic, and real-time performance metric suitable for early-warning systems in safety-critical applications.
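The definition above can be sketched in a few lines of NumPy. This is a minimal illustration based only on the abstract's description (average squared log-ratio of adjacent sorted softmax values); the function name, the descending sort order, and the epsilon guard are assumptions, and the authors' exact normalization may differ.

```python
import numpy as np

def volatility_in_certainty(softmax_probs):
    """Sketch of VC: mean squared log-ratio of adjacent sorted softmax values.

    Assumption: probabilities are sorted in descending order and a small
    epsilon guards against log(0); the paper's implementation may differ.
    """
    p = np.sort(np.asarray(softmax_probs, dtype=float))[::-1]  # sorted certainties
    eps = 1e-12
    log_ratios = np.log((p[:-1] + eps) / (p[1:] + eps))  # adjacent log-ratios
    return float(np.mean(log_ratios ** 2))

# A uniform output has perfectly smooth adjacent ratios, so VC is zero;
# a peaked output produces large adjacent log-ratios and a larger VC.
vc_uniform = volatility_in_certainty([0.25, 0.25, 0.25, 0.25])
vc_peaked = volatility_in_certainty([0.70, 0.10, 0.10, 0.10])
```

In deployment one would average this quantity over a batch of inference-time inputs and compare log(VC) against a baseline established on clean data, since the paper's drift signal is the correlation of log(VC) with accuracy rather than any single-sample threshold.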


Key Contributions

  • Introduces Volatility in Certainty (VC), a label-free metric based on average squared log-ratio of adjacent sorted softmax values to quantify confidence irregularities
  • Demonstrates a strong negative correlation (ρ < -0.90) between log(VC) and classification accuracy under FGSM adversarial perturbations across MNIST and CIFAR-10
  • Evaluates VC under incremental adversarial contamination (mixed test sets), showing sensitivity to gradual distribution shifts without requiring ground-truth labels

🛡️ Threat Analysis

Input Manipulation Attack

VC is proposed as a detection and early-warning metric for identifying adversarial examples (FGSM-generated input manipulations) at inference time, making it a direct countermeasure to input manipulation attacks.


Details

Domains
vision
Model Types
cnn
Threat Tags
white_box · inference_time · digital · untargeted
Datasets
MNIST · CIFAR-10
Applications
image classification · safety-critical edge AI systems · real-time adversarial monitoring