Adversarial Vulnerabilities in Neural Operator Digital Twins: Gradient-Free Attacks on Nuclear Thermal-Hydraulic Surrogates
Samrendra Roy 1, Kazuma Kobayashi 1, Souvik Chakraborty 2, Rizwan-uddin 1, Syed Bahauddin Alam 1,3
Published on arXiv (arXiv:2603.22525)
Input Manipulation Attack
OWASP ML Top 10 — ML01
Key Finding
Sparse adversarial perturbations (< 1% of inputs) increase relative L2 error from ~1.5% to 37-63% across four operator architectures, with 100% of single-point attacks passing z-score anomaly detection
Gradient-free differential evolution attack on neural operators
Novel technique introduced
Operator learning models are rapidly emerging as the predictive core of digital twins for nuclear and energy systems, promising real-time field reconstruction from sparse sensor measurements. Yet their robustness to adversarial perturbations remains uncharacterized, a critical gap for deployment in safety-critical systems. Here we show that neural operators are acutely vulnerable to extremely sparse (fewer than 1% of inputs), physically plausible perturbations that exploit their sensitivity to boundary conditions. Using gradient-free differential evolution across four operator architectures, we demonstrate that minimal modifications trigger catastrophic prediction failures, increasing relative $L_2$ error from $\sim$1.5% (validated accuracy) to 37-63% while remaining completely undetectable by standard validation metrics. Notably, 100% of successful single-point attacks pass z-score anomaly detection. We introduce the effective perturbation dimension $d_{\text{eff}}$, a Jacobian-based diagnostic that, together with sensitivity magnitude, yields a two-factor vulnerability model explaining why architectures with extreme sensitivity concentration (POD-DeepONet, $d_{\text{eff}} \approx 1$) are not necessarily the most exploitable, since low-rank output projections cap maximum error, while moderate concentration with sufficient amplification (S-DeepONet, $d_{\text{eff}} \approx 4$) produces the highest attack success. Gradient-free search outperforms gradient-based alternatives (PGD) on architectures with gradient pathologies, while random perturbations of equal magnitude achieve near-zero success rates, confirming that the discovered vulnerabilities are structural. Our findings expose a previously overlooked attack surface in operator learning models and establish that these models require robustness guarantees beyond standard validation before deployment.
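The abstract's two-factor vulnerability model hinges on the effective perturbation dimension $d_{\text{eff}}$, a measure of how concentrated the model's input sensitivity is. The paper does not reproduce its exact definition here; a plausible Jacobian-spectrum diagnostic of this kind is the inverse participation ratio of the singular-value energy, sketched below (the function name and formula are illustrative assumptions, not the paper's implementation):

```python
import numpy as np

def effective_perturbation_dimension(jacobian):
    """Participation-ratio estimate of how many input directions the
    model's sensitivity is spread over. NOTE: one common choice of
    Jacobian-spectrum diagnostic; the paper's d_eff may be defined
    differently."""
    s = np.linalg.svd(jacobian, compute_uv=False)
    p = s**2 / np.sum(s**2)      # normalized spectral energy per direction
    return 1.0 / np.sum(p**2)    # inverse participation ratio

# Sensitivity concentrated in one direction (POD-DeepONet-like regime):
J_concentrated = np.diag([10.0, 0.1, 0.1, 0.1])
# Sensitivity spread across several directions:
J_spread = np.diag([5.0, 4.0, 3.0, 2.0])

print(effective_perturbation_dimension(J_concentrated))  # ~1
print(effective_perturbation_dimension(J_spread))        # ~3
```

Under this definition, $d_{\text{eff}} \approx 1$ means one dominant sensitivity direction, matching the paper's characterization of extreme sensitivity concentration.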
Key Contributions
- First demonstration that neural operator models are vulnerable to extremely sparse adversarial perturbations (< 1% of inputs) that cause catastrophic prediction failures while passing standard validation
- Introduction of effective perturbation dimension (d_eff) as a Jacobian-based diagnostic that predicts vulnerability across operator architectures
- Demonstration that gradient-free differential evolution outperforms gradient-based attacks (PGD) on architectures with gradient pathologies, while random perturbations achieve near-zero success rates
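The attack loop described above can be sketched with an off-the-shelf differential-evolution optimizer: search over a sparse, bounded perturbation of the input and maximize the surrogate's relative $L_2$ prediction error. The `surrogate` below is a toy stand-in (the paper attacks trained DeepONet/FNO-style operators), and the bounds and attack index are illustrative:

```python
import numpy as np
from scipy.optimize import differential_evolution

# Toy stand-in for a neural-operator surrogate: maps a 100-point
# boundary-condition vector to a predicted field. Hypothetical model,
# used only to show the shape of the attack loop.
def surrogate(bc):
    return np.cumsum(bc) * 0.01

x_clean = np.ones(100)      # nominal boundary condition
y_clean = surrogate(x_clean)

ATTACK_IDX = 0              # single-point attack: perturb one input only

def rel_l2_error(delta):
    x_adv = x_clean.copy()
    x_adv[ATTACK_IDX] += delta[0]
    y_adv = surrogate(x_adv)
    return np.linalg.norm(y_adv - y_clean) / np.linalg.norm(y_clean)

# Gradient-free search: maximize error by minimizing its negative.
# No gradients of the surrogate are queried, which is what makes the
# method applicable to architectures with gradient pathologies.
result = differential_evolution(
    lambda d: -rel_l2_error(d),
    bounds=[(-0.5, 0.5)],   # "physically plausible" perturbation range
    seed=0,
    maxiter=50,
)
print(f"perturbation {result.x[0]:+.3f} -> rel L2 error {-result.fun:.2%}")
```

Because the optimizer only evaluates the model forward, the same loop works unchanged across architectures, whereas PGD requires usable gradients.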
🛡️ Threat Analysis
The paper demonstrates adversarial perturbation attacks on neural operator models at inference time, crafting minimal input perturbations (sparse boundary-condition modifications) that cause catastrophic prediction failures, increasing relative L2 error from ~1.5% to 37-63%. It uses both gradient-free differential evolution and gradient-based PGD attacks to manipulate model inputs, with the gradient-free search proving more effective on architectures with gradient pathologies. This is a clear input manipulation attack on ML models during inference.
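The claim that 100% of single-point attacks pass z-score screening has a simple intuition: one perturbed sensor among many barely shifts the sample mean and standard deviation, so its own z-score can stay under a typical 3-sigma threshold. A minimal illustration, with made-up sensor values (not the paper's data):

```python
import numpy as np

# Deterministic stand-in for 100 sensor readings, e.g. coolant
# temperatures in K (illustrative values only).
readings = 300.0 + 5.0 * np.sin(np.arange(100))

adv = readings.copy()
adv[42] += 6.0          # sparse, physically plausible single-point change

# Standard z-score anomaly screen over the (perturbed) sample.
z = np.abs((adv - adv.mean()) / adv.std())
print(f"z-score at attacked sensor: {z[42]:.2f} (threshold 3.0)")
```

The perturbed point lands well inside the 3-sigma band, so per-sample statistical screening alone cannot flag attacks of this form; this is why the paper argues for robustness guarantees beyond standard validation.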