
Critical Evaluation of Quantum Machine Learning for Adversarial Robustness

Saeefa Rubaiyet Nowmi , Jesus Lopez , Md Mahmudul Alam Imon , Shahrooz Pouryousef , Mohammad Saidur Rahman

0 citations · 89 references

Published on arXiv · 2511.14989

Input Manipulation Attack (OWASP ML Top 10 — ML01)

Data Poisoning Attack (OWASP ML Top 10 — ML02)

Key Finding

Amplitude encoding collapses below 5% accuracy under adversarial perturbation and depolarization noise (p=0.01), while quantum noise weakens the QUID poisoning attack by disrupting Hilbert-space correlations, suggesting noise as an inadvertent defense in NISQ systems.
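The depolarization noise referenced above is the standard single-parameter channel that mixes a state toward the maximally mixed state. A minimal density-matrix sketch (plain numpy; the function name and the single-qubit example are illustrative, not taken from the paper's code):

```python
import numpy as np

def depolarize(rho: np.ndarray, p: float) -> np.ndarray:
    """Depolarizing channel: with probability p the state is replaced
    by the maximally mixed state I/d, i.e. rho' = (1-p)*rho + p*I/d."""
    d = rho.shape[0]
    return (1.0 - p) * rho + p * np.eye(d) / d

# Pure |0><0| state on one qubit, at the paper's noise level p = 0.01
rho = np.array([[1.0, 0.0], [0.0, 0.0]])
noisy = depolarize(rho, p=0.01)

trace = np.trace(noisy)           # trace is preserved (= 1.0)
purity = np.trace(noisy @ noisy)  # purity drops below 1: state is mixed
```

Even this small amount of mixing is what the Key Finding credits with disrupting the Hilbert-space correlations that the QUID attack exploits.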

QUID (novel technique introduced)


Quantum Machine Learning (QML) integrates quantum computational principles into learning algorithms, offering improved representational capacity and computational efficiency. Nevertheless, the security and robustness of QML systems remain underexplored, especially under adversarial conditions. In this paper, we present a systematization of adversarial robustness in QML, integrating conceptual organization with empirical evaluation across three threat models: black-box, gray-box, and white-box. We implement representative attacks in each category — label-flipping for black-box, QUID encoder-level data poisoning for gray-box, and FGSM and PGD for white-box — using Quantum Neural Networks (QNNs) trained on two datasets from distinct domains, MNIST from computer vision and AZ-Class from Android malware, across multiple circuit depths (2, 5, 10, and 50 layers) and two encoding schemes (angle and amplitude). Our evaluation shows that amplitude encoding yields the highest clean accuracy (93% on MNIST and 67% on AZ-Class) in deep, noiseless circuits; however, it degrades sharply under adversarial perturbations and depolarization noise (p=0.01), with accuracy dropping below 5%. In contrast, angle encoding, while offering lower representational capacity, remains more stable in shallow, noisy regimes, revealing a trade-off between capacity and robustness. Moreover, the QUID attack attains higher attack success rates, though quantum noise channels disrupt the Hilbert-space correlations it exploits, weakening its impact in image domains. This suggests that noise can act as a natural defense mechanism in Noisy Intermediate-Scale Quantum (NISQ) systems. Overall, our findings guide the development of secure and resilient QML architectures for practical deployment, and underscore the importance of designing threat-aware models that remain reliable under real-world noise in NISQ settings.


Key Contributions

  • Systematization of adversarial threat models (black-box, gray-box, white-box) for Quantum Neural Networks with empirical evaluation across circuit depths (2–50 layers) and two encoding schemes.
  • Empirical finding that amplitude encoding achieves the highest clean accuracy (93% on MNIST) but degrades catastrophically under adversarial perturbation and depolarization noise (dropping below 5%), while angle encoding is more robust in noisy, shallow circuits.
  • Evidence that quantum depolarization noise disrupts Hilbert-space correlations exploited by the QUID attack, suggesting NISQ noise can function as a natural defense mechanism.
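The capacity-vs-robustness trade-off above hinges on how classical features enter the circuit. A minimal sketch of the two encoding schemes (plain numpy statevectors; the function names are illustrative, and real QNN pipelines would build these states with a quantum framework):

```python
import numpy as np

def angle_encode(x: np.ndarray) -> np.ndarray:
    """Angle encoding: one qubit per feature; feature x_i sets the
    rotation angle of qubit i. Returns one single-qubit statevector
    per feature, so n features need n qubits."""
    return np.array([[np.cos(xi / 2.0), np.sin(xi / 2.0)] for xi in x])

def amplitude_encode(x: np.ndarray) -> np.ndarray:
    """Amplitude encoding: the whole vector is packed into the
    amplitudes of one state, so n qubits hold 2**n features —
    higher capacity, but every feature shares one fragile state."""
    x = np.asarray(x, dtype=float)
    return x / np.linalg.norm(x)

x = np.array([0.5, 1.0, 0.25, 2.0])
psi = amplitude_encode(x)   # 4 features -> 2 qubits, one statevector
qubits = angle_encode(x)    # 4 features -> 4 qubits, one state each
```

The density of amplitude encoding is what buys its high clean accuracy and, per the findings above, also what makes small perturbations and noise so damaging.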

🛡️ Threat Analysis

Input Manipulation Attack

Implements white-box, inference-time attacks (FGSM and PGD) against QNNs, evaluating the effects of adversarial perturbations across circuit depths and encoding schemes.
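FGSM is the simpler of the two white-box attacks: one epsilon-sized step along the sign of the input gradient of the loss. A sketch of the attack rule on a plain logistic-regression surrogate (the paper attacks QNNs; this numpy model and the `fgsm` name are stand-ins to show the rule x' = x + eps * sign(dL/dx)):

```python
import numpy as np

def fgsm(x: np.ndarray, y: float, w: np.ndarray, b: float,
         eps: float) -> np.ndarray:
    """Fast Gradient Sign Method on a logistic-regression surrogate:
    compute the input gradient of binary cross-entropy, then take a
    single L-infinity-bounded step in its sign direction."""
    z = w @ x + b
    p = 1.0 / (1.0 + np.exp(-z))   # sigmoid prediction
    grad_x = (p - y) * w           # dL/dx for binary cross-entropy
    return x + eps * np.sign(grad_x)

rng = np.random.default_rng(0)
w = rng.normal(size=8)
x = rng.normal(size=8)
x_adv = fgsm(x, y=1.0, w=w, b=0.0, eps=0.1)
max_shift = np.max(np.abs(x_adv - x))  # bounded by eps = 0.1
```

PGD iterates this step with projection back into the epsilon-ball, which is why it is the stronger of the two attacks evaluated.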

Data Poisoning Attack

Implements black-box label-flipping and gray-box QUID encoder-level data poisoning attacks against QNN training pipelines, evaluating attack success rates and impact on model accuracy.
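Label-flipping is the weakest of the three threat models because the attacker only touches training labels, never the model or the encoder. A minimal sketch of the poisoning step (plain numpy; the function name, flip rate, and class count here are illustrative, not the paper's configuration):

```python
import numpy as np

def flip_labels(y: np.ndarray, rate: float, num_classes: int,
                seed: int = 0) -> np.ndarray:
    """Black-box label-flipping: reassign a fraction `rate` of the
    training labels to a different, randomly chosen class. Requires
    no access to the model, gradients, or encoding circuit."""
    rng = np.random.default_rng(seed)
    y_poisoned = y.copy()
    idx = rng.choice(len(y), size=int(rate * len(y)), replace=False)
    for i in idx:
        wrong = [c for c in range(num_classes) if c != y_poisoned[i]]
        y_poisoned[i] = rng.choice(wrong)
    return y_poisoned

y = np.zeros(100, dtype=int)                       # all class 0
y_poisoned = flip_labels(y, rate=0.2, num_classes=4)
num_flipped = int((y_poisoned != y).sum())         # 20 of 100 labels
```

QUID, by contrast, is a gray-box attack: it perturbs samples at the encoder level to exploit Hilbert-space structure, which is why it attains higher success rates than label-flipping and why depolarization noise blunts it.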


Details

Domains: vision
Model Types: traditional_ml
Threat Tags: white_box, grey_box, black_box, training_time, inference_time, untargeted, digital
Datasets: MNIST, AZ-Class
Applications: image classification, android malware classification