Distributionally Robust Safety Verification of Neural Networks via Worst-Case CVaR
Published on arXiv
arXiv:2509.17413
Input Manipulation Attack
OWASP ML Top 10 — ML01
Key Finding
The proposed SDP-based framework provides tail-risk-aware safety certificates that systematically trade conservatism for tolerance to worst-case distributional tail events, while preserving the computational tractability of prior QC/SDP verification methods.
WC-CVaR QC/SDP Verification
Novel technique introduced
Ensuring the safety of neural networks under input uncertainty is a fundamental challenge in safety-critical applications. This paper extends Fazlyab's quadratic-constraint (QC) and semidefinite-programming (SDP) framework for neural network verification to a distributionally robust, tail-risk-aware setting by integrating worst-case Conditional Value-at-Risk (WC-CVaR) over a moment-based ambiguity set with fixed mean and covariance. The resulting conditions remain SDP-checkable and explicitly account for tail risk. This integration broadens the admissible input-uncertainty geometry (covering ellipsoids, polytopes, and hyperplanes) and extends applicability to safety-critical domains where tail-event severity matters. Applications to closed-loop reachability of control systems and to classification are demonstrated through numerical experiments, illustrating how the risk level $\varepsilon$ trades conservatism for tolerance to tail events while preserving the computational structure of prior QC/SDP methods for neural network verification and robustness analysis.
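For reference, the standard moment-based WC-CVaR formulation (the paper's notation may differ in details) fixes the first two moments of the input distribution and takes the worst case over all distributions matching them:

$$\mathcal{P}(\mu,\Sigma) = \Big\{ \mathbb{P} \;:\; \mathbb{E}_{\mathbb{P}}[\xi] = \mu,\;\; \mathbb{E}_{\mathbb{P}}\big[(\xi-\mu)(\xi-\mu)^\top\big] = \Sigma \Big\},$$

$$\text{WC-CVaR}_{\varepsilon}\big(\ell(\xi)\big) \;=\; \sup_{\mathbb{P}\in\mathcal{P}(\mu,\Sigma)} \; \inf_{\beta\in\mathbb{R}} \Big\{ \beta + \tfrac{1}{\varepsilon}\,\mathbb{E}_{\mathbb{P}}\big[(\ell(\xi)-\beta)_{+}\big] \Big\}.$$

For a linear loss $\ell(\xi) = a^\top \xi$ this is known to reduce to the closed form $a^\top\mu + \sqrt{(1-\varepsilon)/\varepsilon}\,\sqrt{a^\top\Sigma\,a}$, which makes the role of the risk level concrete: at $\varepsilon = 0.1$ the tail multiplier is $\sqrt{0.9/0.1} = 3$, while at $\varepsilon = 0.01$ it grows to $\sqrt{99} \approx 9.95$, so smaller $\varepsilon$ covers more extreme tail events at the cost of a more conservative certificate.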
Key Contributions
- Extends Fazlyab's QC/SDP neural network verification framework to incorporate Worst-Case Conditional Value-at-Risk (WC-CVaR) over a moment-based distributional ambiguity set
- Derives SDP-checkable safety conditions that explicitly account for tail risk, broadening the input-uncertainty geometry to ellipsoids, polytopes, and hyperplanes (a minimal sketch of such an SDP check appears after this list)
- Establishes a formal equivalence between WC-CVaR and confidence-ellipsoid methods in the special case of ellipsoidal sets, with numerical validation on closed-loop reachability and classification tasks
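As a concrete illustration of what "SDP-checkable" means here, the sketch below evaluates the standard Zymler-Kuhn-Rustem SDP reformulation of WC-CVaR for a single quadratic loss under a moment ambiguity set, using CVXPY. All problem data (`mu`, `Sigma`, `Q`, `q`, `r`, `eps`) are hypothetical, and the paper's actual certificates involve larger LMIs encoding the network's quadratic constraints; this is a minimal sketch of the underlying risk computation, not the authors' implementation.

```python
import cvxpy as cp
import numpy as np

# Hypothetical problem data: a moment-based ambiguity set with fixed mean
# and covariance (all numbers below are illustrative, not from the paper).
n = 2
mu = np.array([0.0, 0.0])
Sigma = np.eye(n)
eps = 0.1  # risk level: smaller eps covers heavier tails, more conservative

# Quadratic loss l(xi) = xi^T Q xi + 2 q^T xi + r; "safe" means l(xi) <= 0.
Q = 0.5 * np.eye(n)
q = np.array([0.2, -0.1])
r = -1.0

# Second-moment matrix Omega = [[Sigma + mu mu^T, mu], [mu^T, 1]].
Omega = np.block([[Sigma + np.outer(mu, mu), mu[:, None]],
                  [mu[None, :], np.ones((1, 1))]])

# Standard SDP reformulation of WC-CVaR for a quadratic loss
# (Zymler, Kuhn & Rustem, 2013):
#   WC-CVaR_eps(l(xi)) = min_{beta, M}  beta + (1/eps) * <Omega, M>
#   s.t.  M >= 0  and  M >= [[Q, q], [q^T, r - beta]]   (in the PSD order).
beta = cp.Variable()
M = cp.Variable((n + 1, n + 1), symmetric=True)
L = cp.bmat([[Q, q[:, None]],
             [q[None, :], cp.reshape(r - beta, (1, 1))]])
prob = cp.Problem(cp.Minimize(beta + cp.trace(Omega @ M) / eps),
                  [M >> 0, M - L >> 0])
prob.solve(solver=cp.SCS)

# Certificate: WC-CVaR_eps(l(xi)) <= 0 means the safety constraint holds
# for every distribution in the ambiguity set, up to eps-tail severity.
status = "certified safe" if prob.value <= 0 else "not certified"
print(f"WC-CVaR_{eps:g} = {prob.value:.4f} -> {status}")
```

Re-solving with a smaller `eps` drives the certified bound upward, which is exactly the conservatism-versus-tail-tolerance trade-off the paper quantifies through $\varepsilon$.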
🛡️ Threat Analysis
Proposes a certified-robustness / safety-verification method for neural networks under input perturbation, extending QC/SDP-based robustness analysis (a known ML01 defense technique) with distributionally robust WC-CVaR to provide formal safety certificates against worst-case tail-event inputs.